Logging - Using Timeseries Database

by Rocco Gagliardi
time to read: 9 minutes

Keypoints

This is how you log time series in databases

  • Heavy-index systems are increasingly under pressure, thanks to the explosion of microservices
  • IT operations needs to access logs without necessarily aggregating or finely filtering the data
  • Loki is an excellent complement to Prometheus, synchronizing metrics with logs
  • But if specific data aggregations, counts, or fine filters are needed, Loki is not the right solution

In recent years, the problem of logging has been addressed using systems that allow the rapid extraction of particular data, or of relationships between them, from a huge mass of information. This pushed towards extreme indexing of the data: extracting parts, modifying them, or adding others (for example geo-tagging), all aimed at ever-greater extraction speed. Often, the size of the indexes exceeds that of the data itself.

In recent years, moreover, we have gone from a mostly static network to extremely dynamic networks, with microservices that constantly change both in number and in position. Keeping track of these changes, for debugging or for Business Analysis or Intelligence (BA/BI), has become very complex and expensive, as heavy-index solutions do not scale easily and require high investments.

The problem of keeping track of events happening in applications in a human-readable form is destined to disappear in the near future. In addition to being often useless, the amount of data generated is becoming unmanageable. Solutions in which machines communicate problems to machines that make the necessary adjustments, cutting out human intervention completely, are already in use and their number is increasing. In the meantime, however, the problem exists and needs new solutions.

Without taking anything away from the BA/BI solutions currently in use, the key, especially for the Operations department, lies in the concept of observability: determining how a system is functioning by analyzing its outputs, basically metrics, logs, and traces.

In recent years, Grafana has become the de facto standard for observability, with a specialized tool for each pillar: Prometheus for metrics, Loki for logs, and Tempo for traces.

Heavy or Light Index

As mentioned before, microservice systems produce a huge amount of data. To manage it, we are increasingly moving towards systems with light indexing, as opposed to the currently popular systems that make massive use of indexing.

Loki is a horizontally-scalable, highly-available, multi-tenant log aggregation system inspired by Prometheus. It is designed to be very cost-effective and easy to operate. It does not index the contents of the logs, but rather a set of labels for each log stream.

Is Loki a replacement for ELK, Splunk, or Graylog? No. It is up to you whether to use Loki, ELK, or a combination of both. Some differences between the two system classes are shown in the table below.

Capability                              Heavy Index   Light Index
Write performance                       low           high
Read performance                        high          high
Typical index size (compared to data)   >50%          <5%
Filter capabilities                     high          low
Aggregation capabilities                high          low
Scalability                             low           high
Cost                                    high          low

Take the Apache httpd log as an example: ELK will split a typical httpd log line into roughly 12 fields and index each of them. Loki will index just 3 pieces: Timestamp, Method, and HTTPStatus.
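To make this concrete, here is a minimal Python sketch of what light indexing means for such a line. The sample line, the regex, and the chosen label names are illustrative assumptions, not Promtail's actual pipeline configuration:

    import re
    from datetime import datetime

    # Sample Apache common-log line (illustrative).
    LINE = '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326'

    # A heavy-index system would parse and index every field of this line;
    # a light-index approach keeps only the few pieces it turns into labels.
    pattern = re.compile(r'\[(?P<ts>[^\]]+)\] "(?P<method>\S+) \S+ \S+" (?P<status>\d{3})')
    m = pattern.search(LINE)

    timestamp = datetime.strptime(m.group("ts"), "%d/%b/%Y:%H:%M:%S %z")
    labels = {"method": m.group("method"), "status": m.group("status")}

    print(timestamp, labels)  # the full line itself stays unindexed and ends up in a chunk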

How Loki works

Getting logs with Loki involves two simple components: the Loki server and the Promtail client. Both are written in Go and are therefore extremely easy to deploy, each as a single binary. Loki is based on Cortex, which makes it scalable and multi-tenant.
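The transport between the two components is plain HTTP. As a rough sketch of what Promtail does on your behalf, a log line can also be pushed to Loki's HTTP push API (/loki/api/v1/push) directly; the host, port, and labels below are assumptions:

    import time
    import requests

    LOKI = "http://localhost:3100"  # assumed local Loki instance

    payload = {
        "streams": [
            {
                # The label set; every unique combination of labels defines a stream.
                "stream": {"job": "httpd", "method": "GET", "status": "200"},
                # Values are [timestamp in nanoseconds, log line] pairs.
                "values": [[str(time.time_ns()), 'GET /index.html HTTP/1.0 200 2326']],
            }
        ],
    }

    resp = requests.post(f"{LOKI}/loki/api/v1/push", json=payload, timeout=10)
    resp.raise_for_status()  # Loki answers with 204 No Content on success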

Loki Working Model

Loki does not index the data itself; it indexes and groups log streams. It divides the data into chunks, fed by streams, and stores them on the file system (we used a similar technique for our specialized search engine).

Streams are identified by labels. Labels are key->value pairs that can be extracted from any part of the data. Every unique combination of labels and values defines a stream, and the logs for a stream are batched up, compressed, and stored as chunks. Each chunk is compressed using a configurable algorithm (gzip by default).
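The following toy sketch models this idea in Python. It is an illustration of the concept, not Loki's storage code: entries sharing the same label set end up in the same stream, and each stream's lines are batched and gzip-compressed into a chunk.

    import gzip
    from collections import defaultdict

    # (labels, log line) pairs as they arrive (illustrative data).
    entries = [
        ({"job": "httpd", "method": "GET",  "status": "200"}, "GET /index.html 200"),
        ({"job": "httpd", "method": "GET",  "status": "200"}, "GET /style.css 200"),
        ({"job": "httpd", "method": "POST", "status": "500"}, "POST /login 500"),
    ]

    # Every unique combination of labels and values defines a stream.
    streams = defaultdict(list)
    for labels, line in entries:
        stream_key = tuple(sorted(labels.items()))
        streams[stream_key].append(line)

    # The log lines of a stream are batched and compressed into a chunk (gzip by default).
    chunks = {key: gzip.compress("\n".join(lines).encode()) for key, lines in streams.items()}

    for key, chunk in chunks.items():
        print(dict(key), len(chunk), "bytes compressed")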

So, chunks are the index to Loki's log data: they are used to find the compressed log content, which is stored separately. Once the right chunk is identified in the index, it is decompressed and its content is grepped with a regexp. This may sound weak, but since concurrency can be leveraged, searches are really fast.
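Continuing the toy model above, the read path can be sketched as decompressing the candidate chunks concurrently and grepping them with a regexp; again an illustration of the idea, not Loki's implementation:

    import gzip
    import re
    from concurrent.futures import ThreadPoolExecutor

    # Two toy chunks, each a gzip-compressed batch of log lines (as in the previous sketch).
    chunks = [
        gzip.compress(b"GET /index.html 200\nGET /style.css 200"),
        gzip.compress(b"POST /login 500\nGET /login 200"),
    ]

    def grep_chunk(chunk, pattern):
        # Decompress the chunk and keep only the lines matching the regexp.
        return [l for l in gzip.decompress(chunk).decode().splitlines() if pattern.search(l)]

    pattern = re.compile(r"/login")
    with ThreadPoolExecutor() as pool:  # chunks are searched concurrently
        for hits in pool.map(lambda c: grep_chunk(c, pattern), chunks):
            for line in hits:
                print(line)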

Loki Indexing Model

Note: The index size, and thus the access speed, depends on the number of chunks. The number of chunks depends on the number of streams, which in turn depends on the number of combinations of labels and values. It is therefore clear that, unlike with an ELK, it is possible but not recommended to label (index) IP addresses, for example, since each new IP would create a new stream. This also means that searches for a specific IP over a long period of time do not perform well in Loki.

To obtain excellent search times, it is therefore necessary to choose carefully which labels to extract and use.
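A back-of-the-envelope calculation, with assumed numbers, shows why high-cardinality labels such as client IPs hurt:

    # Assumed cardinalities for illustration only.
    methods, statuses, client_ips = 9, 60, 50_000

    streams_without_ip = methods * statuses               # label only method and status
    streams_with_ip    = methods * statuses * client_ips  # add the client IP as a label

    print(streams_without_ip)  # 540 streams -> few, large, well-compressed chunks
    print(streams_with_ip)     # 27,000,000 streams -> index explosion, tiny chunks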

Loki use case

Loki’s use case is specific: to support the debugging process by quickly filtering the important messages out of the mass generated by the different systems involved in an application. A container environment is an ideal playground for Loki.

The power of Loki shows when it is integrated with Prometheus: using the same labels, it is possible to refer to the same segments of data and compare the metrics with the messages of the various systems.

The creation of dashboards, trends, counts, etc. is not Loki’s strong point. Think of Loki as a power-grep.
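As a sketch of that power-grep workflow, a LogQL query can be sent to Loki's query_range HTTP endpoint (/loki/api/v1/query_range); the host, labels, and filter expression below are assumptions:

    import time
    import requests

    LOKI = "http://localhost:3100"  # assumed local Loki instance
    now = time.time_ns()

    params = {
        # Select the stream by labels, then grep the lines with a filter expression.
        "query": '{job="httpd", method="POST"} |= "login"',
        "start": now - 3600 * 10**9,  # last hour, timestamps in nanoseconds
        "end": now,
        "limit": 100,
    }

    resp = requests.get(f"{LOKI}/loki/api/v1/query_range", params=params, timeout=10)
    resp.raise_for_status()

    for stream in resp.json()["data"]["result"]:
        print(stream["stream"])            # the label set identifying the stream
        for ts, line in stream["values"]:  # matching log lines with their timestamps
            print(" ", line)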

Summary

Every system creates logs: switches, routers, operating systems, web servers, firewalls, services on your Kubernetes clusters, public cloud services, and more. Especially for operators, being able to collect and analyze these logs is crucial. And the growing popularity of microservices, IoT, cybersecurity, and cloud has brought an explosion of new types of log data.

Loki tries a different approach from heavy-index systems like ELK. By avoiding extreme indexing and giving up the possibility of aggregating and finely filtering the collected data, it focuses on extremely fast ingestion of large masses of data and equally fast retrieval of portions of them, while maintaining multi-tenancy and scaling at very low cost.

About the Author

Rocco Gagliardi

Rocco Gagliardi has been working in IT since the 1980s and specialized in IT security in the 1990s. His main focus lies in security frameworks, network routing, firewalling and log management.
