Logging the Internet of Things - Connected power plants demand a new paradigm

Logging the Internet of Things

Connected power plants demand a new paradigm

Rocco Gagliardi
by Rocco Gagliardi
time to read: 10 minutes

Keypoints

  • The IoT changes the log paradigm
  • The actual infrastructure will be heavily impacted
  • Some mix of products are emerging
  • The solution design must be carefully assessed

The log is born to inform humans about the status of a specific system, just in case of a problem, as record of what happened. Some times – a good example: xen_netfront: xennet: skb rides the rocket: 21 slots – the message sounds like a joke*: is the human-human communication.

In the wonderful world of SIEM (Security Information and Event Management), the plethora of messages generated by different (part of) systems was interpreted and correlated: processed or – better – pre-processed by machine for the end-human-user.

In the era of the IoT (Internet of Things), we will assist in a basic paradigm change: the message generated by a machine will be used by another machine, not by a human.

The Network is the Computer

I’m not sure what John Cage had in mind with the statement The Network is the computer, but I don’t think he meant IoT. The IoT refer to (a huge number of) interconnected devices that runs with very little human intervention. We install them in our houses, we wear them during our training, part of them are with us during trips and part remains at home, informing us about the location of our goldfish or the status of our fridge.

In the industry, the IoT is used to monitor the status of very complex installations, using thousands of sensors with millions of samples and help saving lot of money. Data generated by these devices (sensors, actuators, etc.) can be used to predict the system behavior and to improve future versions.

Classic vs IoT log

How changes the traditional log compared with the IoT?

Key Classic Log IoT Log
Avg number of logs/[s] low high
Avg length of log [bytes] high low
Validity timerange minutes – days micro seconds – seconds
Log content State transfer Telemetry
Log main scope History Precog
Communication streams Machine ⇒ Human Machine ⇒ Machine

As use case, take a power plant monitoring. Some of the advantages to constantly monitor the operating parameters of a complex machine such, for example, a gas turbine:

Possible consequences:

Extend the idea to other parts of the system, and you will quickly have lot of sensors to monitor:

Parameter Value
Sensors Analog: 30.000, Digital: 20.000
Sampling rate Analog: 50ms, Digital: 1s
Data type Analog: float, Digital: short-int
History 1 – 3 years

The necessary infrastructure to handle such a data-flow coming from so many different sources, diverges from the classical architecture, and must assure:

Emerging solutions

Alongside the classic logging technologies, new specific solutions for the telemetry are being consolidated. No particular standard is adopted, since the data are just k→v tuples. Regarding the communication protocol, almost everyone is based on IP. On top of IP, the choice is mainly between MQTT, XMPP and CoAP.

In the storage area, for telemetry, the choice goes to dedicated databases (Graphite/InfluxDB/Hadoop) and specific software to display charts or build dashboards. To mix new features in old solutions, some parser/extractors may be used to extract performance data and push it to telemetry databases while maintaining the old processes in place.

Communication fundamentals

The key component is the IP protocol, especially the “new” version 6. The battery for many sensors, especially those designed for the IEEE802.15.4, must last for months; so, the device is not online all the time, cannot communicate at high speed or – sometimes – is just out of range. This kind on networks are known as LLNs (Low-power and Lossy Networks).

Following, a very short list of key points for each protocol.

Protocol Pro Contra
MQTT Pushed by IBM. Subscribed services (Many2Many). Two way communication over unreliable nets. NAT is not critical. QoS in place (Fire-and-forget, At-least-once and Exactly-once). Low power, but not for extremely constrained devices. Normally “online” all the time (addressed in MQTT-SN). Long topic names, impractical for 802.15.4 (addressed in MQTT-SN).
CoAP Pushed by CISCO. Primarly a One2One protocol. Resource discovery. Interoperate with HTTP/REST. Sensor is typically a server, so NAT must be designed carefully. Since UDP, no SSL/TLS. DTLS can be used.
XMPP Pushed by CISCO. Real-time. Massive scalability. Security. Not been practical over LLNs. Need for an XML parser.

Products by phases

Following, a summary of products with references; refer to the schema for the interconnections between components. Please remember that the same process can be implemented in very different manners: just use the product most familiar to you.

Phase Components Key to consider
Collect Fluent Bit, Collectd, Telegraf, Beats, rsyslog System type. Framework already present.
Queue Redis, MemCached, RabbitMQ Routing customization. Performance. Delivery assurance.
Parse Fluentd, Logstash, rsyslog, Graylog Input / Output. Message parsing plugins.
Store InfluxDB, Prometeus, Hadoop, Elasticsearch, RDD Speed. Query language. Granularity.
Visualise Kibana, Graylog, Chronograf, Grafana, Thruk, Cacti Authentication / Authorization. Visualitation types. Query language / Transformation functions. Dashboard customization.
Act Graylog, Kapacitor Triggering capabilities. Query language / Transformation functions. Storage. Integration.

Some possible paths for logged data

Summary

Logging and using the enormous amount of data generated by the upcoming IoT infrastructure will be challenging for the whole IT infrastructure: for the network, for the CPUs, and for the software development. A lot of solutions are popping out, all with pros und contras. Depending on what are the primary goals of the project, one (mix) may be better as another.

Footnote

*) This “joke” appeared in our XEN infrastructure just after an upgrade, and wasn’t funny. For more goto Kernel Line Tracing: Linux perf Rides the Rocket.

About the Author

Rocco Gagliardi

Rocco Gagliardi has been working in IT since the 1980s and specialized in IT security in the 1990s. His main focus lies in security frameworks, network routing, firewalling and log management.

Links

You want to bring your logging and monitoring to the next level?

Our experts will get in contact with you!

×
Enhancing Data Understanding

Enhancing Data Understanding

Rocco Gagliardi

Transition to OpenSearch

Transition to OpenSearch

Rocco Gagliardi

Graylog v5

Graylog v5

Rocco Gagliardi

auditd

auditd

Rocco Gagliardi

You want more?

Further articles available here

You need support in such a project?

Our experts will get in contact with you!

You want more?

Further articles available here