Logging the Internet of Things

Connected power plants demand a new paradigm

by Rocco Gagliardi

time to read: 10 minutes

Keypoints

The IoT changes the log paradigm
The existing infrastructure will be heavily impacted
Several product mixes are emerging
The solution design must be carefully assessed

The log was born to inform humans about the status of a specific system: a record of what happened, kept in case of a problem. Sometimes the message sounds like a joke*, as in xen_netfront: xennet: skb rides the rocket: 21 slots. It is human-to-human communication.

In the wonderful world of SIEM (Security Information and Event Management), the plethora of messages generated by different systems, or parts of systems, is interpreted and correlated: processed or, better, pre-processed by a machine for the human end user.

In the era of the IoT (Internet of Things), we will witness a basic paradigm change: the message generated by a machine will be used by another machine, not by a human.

The Network is the Computer

I’m not sure what John Gage had in mind with the statement “The Network is the Computer”, but I don’t think he meant the IoT. The IoT refers to a huge number of interconnected devices that run with very little human intervention. We install them in our houses and wear them during training; some travel with us and some remain at home, informing us about the location of our goldfish or the status of our fridge.

In industry, the IoT is used to monitor the status of very complex installations, using thousands of sensors that produce millions of samples, and it helps save a lot of money. Data generated by these devices (sensors, actuators, etc.) can be used to predict system behavior and to improve future versions.

Classic vs IoT log

How does the IoT log differ from the traditional log?

Key	Classic log	IoT log
Avg number of logs per second	low	high
Avg length of log [bytes]	high	low
Validity time range	minutes – days	microseconds – seconds
Log content	State transfer	Telemetry
Log main scope	History	Prediction
Communication streams	Machine ⇒ Human	Machine ⇒ Machine

As a use case, take power plant monitoring. Constantly monitoring the operating parameters of a complex machine such as a gas turbine has several advantages:

Control of operations within the parameters established by the manufacturer
Identification of possible problems before they lead to failure

Possible consequences:

The manufacturer reduces the warranty costs caused by operational mistakes
Insurance premiums are lower because the risk models are more accurate
The operator can properly plan maintenance and better comply with its SLA

Extend the idea to other parts of the system, and you will quickly have a lot of sensors to monitor:

Parameter	Value
Sensors	Analog: 30,000, Digital: 20,000
Sampling rate	Analog: 50 ms, Digital: 1 s
Data type	Analog: float, Digital: short-int
History	1 – 3 years
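
To get a feeling for the volumes involved, the figures in the table can be turned into a rough estimate. The sketch below is only a back-of-the-envelope calculation under assumptions that are not in the table: a float takes 4 bytes, a short int takes 2 bytes, only the raw payload is counted (no timestamps, keys, or protocol overhead), and a year has 365 days.

```python
# Back-of-the-envelope data volume for the sensor table above.
# Assumptions (not from the article): float = 4 bytes, short int = 2 bytes,
# payload only (no timestamps, keys, protocol overhead), 365-day year.

SECONDS_PER_YEAR = 365 * 24 * 3600
FLOAT_BYTES = 4       # assumed size of one analog sample
SHORT_INT_BYTES = 2   # assumed size of one digital sample


def samples_per_second(sensors: int, period_ms: int) -> float:
    """Each sensor emits one sample every `period_ms` milliseconds."""
    return sensors * 1000 / period_ms


analog_rate = samples_per_second(30_000, 50)      # analog: one sample per 50 ms
digital_rate = samples_per_second(20_000, 1000)   # digital: one sample per 1 s

bytes_per_second = analog_rate * FLOAT_BYTES + digital_rate * SHORT_INT_BYTES
bytes_per_year = bytes_per_second * SECONDS_PER_YEAR

print(f"samples/s: {analog_rate + digital_rate:,.0f}")
print(f"payload:   {bytes_per_second / 1e6:.2f} MB/s")
print(f"per year:  {bytes_per_year / 1e12:.1f} TB")
```

Under these assumptions the plant produces about 620,000 samples per second, roughly 2.4 MB/s of raw payload, or about 77 TB per year before any key, timestamp, or index is stored. With a history of 1 to 3 years, the storage requirement in the next paragraph follows directly.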

The infrastructure needed to handle such a data flow from so many different sources diverges from the classical architecture and must ensure:

Storage capability: Not only space, but also query flexibility and access speed.
Very low latency: Signal validity is trending toward microseconds, so not only the network but every other component must deliver optimal performance.
High computational power: Large amounts of data require CPU power to be interpreted and correlated; from landscape recognition to the median of pH values at different temperatures, high-speed computers are a must.

Emerging solutions

Alongside the classic logging technologies, new solutions specific to telemetry are consolidating. No particular standard has been adopted, since the data are just k→v tuples. As for the communication protocol, almost every solution is based on IP. On top of IP, the choice is mainly between MQTT, XMPP and CoAP.

In the storage area, telemetry calls for dedicated databases (Graphite/InfluxDB/Hadoop) and specific software to display charts or build dashboards. To add new features to old solutions, parsers/extractors may be used to extract performance data and push it to telemetry databases while keeping the old processes in place.
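
As an illustration of how simple such a k→v tuple is on the wire, the sketch below formats one sample as a line of Graphite's plaintext protocol (metric path, value, Unix timestamp, newline). The function name and the metric path are hypothetical, chosen only for this example; the other databases mentioned use their own formats.

```python
import time
from typing import Optional


def to_graphite_line(key: str, value: float, timestamp: Optional[int] = None) -> str:
    """Format one k->v sample as '<metric path> <value> <unix timestamp>\\n'."""
    if not key or any(c.isspace() for c in key):
        raise ValueError("metric path must be non-empty and contain no whitespace")
    if timestamp is None:
        timestamp = int(time.time())  # default to "now" in whole seconds
    return f"{key} {value} {timestamp}\n"


# Hypothetical metric name and reading, for illustration only.
line = to_graphite_line("plant.turbine1.exhaust_temp_c", 612.4, 1_700_000_000)
print(line, end="")
```

A parser/extractor of the kind described above would emit lines like this from existing log messages and send them to the telemetry database (for Graphite, typically over a TCP connection to the Carbon listener), leaving the original log flow untouched.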

Communication fundamentals

The key component is the IP protocol, especially the “new” version 6. The battery of many sensors, especially those designed for IEEE 802.15.4, must last for months; so the device is not online all the time, cannot communicate at high speed or, sometimes, is simply out of range. Networks of this kind are known as LLNs (Low-power and Lossy Networks).

MQTT/MQTT-SN is a publish/subscribe messaging protocol designed for lightweight M2M communications, originally developed by IBM.
XMPP (eXtensible Messaging and Presence Protocol) has its roots in instant messaging and is a contender for mass-scale management of consumer goods.
CoAP (Constrained Application Protocol) over UDP is used for resource-constrained, low-power sensors and devices connected via lossy networks, especially when there is a high number of sensors and devices within the network.
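
The publish/subscribe idea behind MQTT can be sketched without any network at all. The following is an in-process stand-in, not MQTT itself and not any library's API: the class name, topic string, and handlers are all made up for illustration, and topics are matched by exact string only.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List, Tuple

Handler = Callable[[str, bytes], None]


class TinyBroker:
    """In-process stand-in for a publish/subscribe broker (not MQTT itself)."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        """Register a handler for an exact topic name."""
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, payload: bytes) -> int:
        """Deliver the payload to every subscriber of the topic; return the count."""
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(topic, payload)
        return len(handlers)


broker = TinyBroker()
received: List[Tuple[str, str, bytes]] = []

# Two independent consumers of the same (hypothetical) sensor topic.
broker.subscribe("plant/turbine1/temp", lambda t, p: received.append(("dashboard", t, p)))
broker.subscribe("plant/turbine1/temp", lambda t, p: received.append(("alerting", t, p)))

delivered = broker.publish("plant/turbine1/temp", b"612.4")
print(delivered, received)
```

The point of the pattern is that the sensor only knows the topic, never the consumers: a dashboard and an alerting engine can both subscribe without the publisher changing. That decoupling is what makes the many-to-many model attractive for machine-to-machine traffic.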

The following table gives a very short list of key points for each protocol.

Protocol	Pros	Cons
MQTT	Backed by IBM. Publish/subscribe model (many-to-many). Two-way communication over unreliable networks. NAT is not critical. QoS levels available (fire-and-forget, at-least-once and exactly-once).	Low power, but not suited to extremely constrained devices. Devices are normally online all the time (addressed in MQTT-SN). Long topic names are impractical for 802.15.4 (addressed in MQTT-SN).
CoAP	Backed by Cisco. Primarily a one-to-one protocol. Resource discovery. Interoperates with HTTP/REST.	The sensor is typically a server, so NAT must be designed carefully. Runs over UDP, so no SSL/TLS; DTLS can be used instead.
XMPP	Backed by Cisco. Real-time. Massive scalability. Security.	Has not been practical over LLNs. Needs an XML parser.

Products by phases

The following table summarises products by phase, with references; refer to the schema for the interconnections between components. Remember that the same process can be implemented in very different ways: just use the product most familiar to you.

Phase	Components	Key to consider
Collect	Fluent Bit, Collectd, Telegraf, Beats, rsyslog	System type. Framework already present.
Queue	Redis, Memcached, RabbitMQ	Routing customization. Performance. Delivery assurance.
Parse	Fluentd, Logstash, rsyslog, Graylog	Input / Output. Message parsing plugins.
Store	InfluxDB, Prometheus, Hadoop, Elasticsearch, RRD	Speed. Query language. Granularity.
Visualise	Kibana, Graylog, Chronograf, Grafana, Thruk, Cacti	Authentication / Authorization. Visualisation types. Query language / Transformation functions. Dashboard customization.
Act	Graylog, Kapacitor	Triggering capabilities. Query language / Transformation functions. Storage. Integration.
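
How the phases hand data to each other can be shown with a toy, single-process pipeline. This is only a sketch: the readings, key names, and threshold are invented, a standard-library queue stands in for a real message queue, a dictionary stands in for the time-series store, and the Visualise phase is left out.

```python
import queue
from typing import Dict, List, Tuple


def collect() -> List[str]:
    """Collect: pretend raw readings arrive as 'key=value' strings (made-up data)."""
    return ["turbine1.temp=612.4", "turbine1.temp=655.0", "turbine1.rpm=3000"]


def parse(raw: str) -> Tuple[str, float]:
    """Parse: split a raw reading into a (key, numeric value) tuple."""
    key, _, value = raw.partition("=")
    return key, float(value)


def run_pipeline(threshold: float = 650.0) -> Tuple[Dict[str, List[float]], List[str]]:
    buffer: "queue.Queue[str]" = queue.Queue()  # Queue: decouples collector and parser
    store: Dict[str, List[float]] = {}          # Store: one series of values per key
    alerts: List[str] = []                      # Act: triggered notifications

    for raw in collect():
        buffer.put(raw)

    while not buffer.empty():
        key, value = parse(buffer.get())
        store.setdefault(key, []).append(value)
        # Act: hypothetical rule, alert when a temperature exceeds the threshold.
        if key.endswith(".temp") and value > threshold:
            alerts.append(f"{key} above {threshold}: {value}")
    return store, alerts


store, alerts = run_pipeline()
print(store)
print(alerts)
```

In a real deployment each function would be replaced by one of the products in the table, and the criteria in the right-hand column (delivery assurance, query language, triggering capabilities) decide which one fits.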

Some possible paths for logged data

Summary

Logging and using the enormous amount of data generated by the upcoming IoT infrastructure will challenge the whole IT infrastructure: the network, the CPUs, and software development. A lot of solutions are popping up, all with pros and cons. Depending on the primary goals of the project, one solution (or mix) may be better than another.

Footnote

*) This “joke” appeared in our XEN infrastructure just after an upgrade, and wasn’t funny. For more, see Kernel Line Tracing: Linux perf Rides the Rocket.

About the Author

Rocco Gagliardi has been working in IT since the 1980s and specialized in IT security in the 1990s. His main focus lies in security frameworks, network routing, firewalling and log management.
