System Log Monitoring

A Useful Minimal Solution

by Tomaso Vasella

on May 04, 2023

time to read: 9 minutes

Keypoints

This is How You Profit from System Log Monitoring

Collecting, recording, and monitoring system events is a basic security measure
Python can be used to programmatically access Systemd's journal directly
Event driven processing of log messages is easy to implement
A minimal solution with simple keyword searching is quite effective

Collecting, recording and monitoring events – especially if they are security-relevant – is a basic security measure that should not be missing on any system. Many operating systems, programming languages and software frameworks offer more or less flexible options for collecting and recording events. Windows contains the Event Log, Unix and Linux typically use a syslog implementation such as syslogd, rsyslog or syslog-ng. Most modern Linux distributions use the logging component of systemd.

Thus, plenty of tools for collecting and recording events are readily available and are active in operating systems, at least in a basic configuration. However, when it comes to analyzing logs, the situation looks a bit different. Although there is a good range of powerful solutions for this purpose, such as Graylog, ELK or Prometheus, most of them require dedicated systems, often considerable resources and a lot of configuration effort. Centralized collection, correlation, and analysis of logs is highly recommended in larger environments, but may be beyond the effort one can or is willing to invest in smaller environments or on individual systems. This article examines a simple way to access systemd’s journal programmatically and to build a minimal but useful notification solution on top of it.

Logging with Journald

Journald is a component of systemd. It is available in many current Linux distributions and is often active as the default logging solution. In contrast to classic syslog solutions, journald writes its logs to local storage in a binary file format. These logs can be examined and searched with the command journalctl. In addition, there is a Python module, python-systemd, that provides direct programmatic access to the journal, among other things. This module is used in the following examples.

Accessing the Journal with python-systemd

To use the Python module python-systemd, its journal submodule must be imported first.

from systemd import journal

The individual entries in the journal are accessed through the class systemd.journal.Reader. Various settings control which entries are returned. For our purpose, only entries with the severity level LOG_INFO or higher (that is, more severe) are considered. In addition, only events recorded since the last start of the system are relevant, because the log monitor should continuously watch for new events and not care about earlier ones.

j = journal.Reader()
j.log_level(journal.LOG_INFO)
j.this_boot()
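The phrase "LOG_INFO or higher" refers to severity, not to the numeric value: syslog priorities run from 0 (emergency) to 7 (debug), so more severe entries carry smaller numbers. As far as we can tell, log_level() treats its argument as an upper bound on the numeric PRIORITY field. The following sketch mimics that selection on plain dictionaries. The helper passes_level and the sample entries are made up for illustration and are not part of python-systemd.

```python
# Syslog priority values (0 = most severe, 7 = least severe).
LOG_EMERG, LOG_ALERT, LOG_CRIT, LOG_ERR = 0, 1, 2, 3
LOG_WARNING, LOG_NOTICE, LOG_INFO, LOG_DEBUG = 4, 5, 6, 7


def passes_level(entry, max_level=LOG_INFO):
    """Hypothetical helper: True if the entry is at least as severe as max_level."""
    priority = entry.get('PRIORITY')
    if priority is None:
        # Assumption: entries without a PRIORITY field are not selected.
        return False
    return int(priority) <= max_level


# Made-up sample entries, shaped like the dictionaries a journal reader yields.
entries = [
    {'PRIORITY': LOG_ERR, 'MESSAGE': 'disk failure'},
    {'PRIORITY': LOG_INFO, 'MESSAGE': 'service started'},
    {'PRIORITY': LOG_DEBUG, 'MESSAGE': 'verbose detail'},
]
selected = [e['MESSAGE'] for e in entries if passes_level(e)]
print(selected)
```

With LOG_INFO as the threshold, debug messages are dropped while informational messages and everything more severe are kept.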

Since we are only interested in the events newly added to the journal, we jump to its end. Jumping to the end of the journal places the read pointer after the last entry, so it is necessary to step back one position to reach the last entry. Whether this is, or was, a bug or merely an ambiguity in the documentation is not clear. Our tests were not conclusive as to whether this is still the case, but the following approach worked well.

j.seek_tail()
j.get_previous()

Event-driven Processing

After programmatic access to the journal has been established, an efficient method is needed to read and process newly arriving messages. An event-driven approach is suitable for this where an event loop waits for incoming messages and executes commands only when a message actually arrives. This is a much more efficient approach than polling for new messages at a high frequency, but it requires appropriate signaling.

Linux provides this possibility with the system calls poll or epoll which can wait for events arriving at a file descriptor. In Python, there is a module called select that makes the functions poll() and epoll() available. Therefore, we can use poll() to observe the file descriptor of the journal, which can be implemented as follows.

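import select

# Register the journal's file descriptor with the event mask requested by the journal
p = select.poll()
journal_fd = j.fileno()
poll_event_mask = j.get_events()
p.register(journal_fd, poll_event_mask)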
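To see the waiting mechanism in isolation, the following sketch uses an ordinary pipe in place of the journal. The pipe is only a stand-in chosen for illustration: its read end plays the role of the descriptor returned by j.fileno(), and select.POLLIN replaces the mask returned by j.get_events(). It assumes a platform where select.poll() is available, such as Linux.

```python
import os
import select

# A pipe stands in for the journal; its read end plays the role of j.fileno().
read_fd, write_fd = os.pipe()

p = select.poll()
p.register(read_fd, select.POLLIN)

# Nothing has been written yet, so a poll with a zero timeout reports no events.
idle = p.poll(0)

# Writing to the pipe makes the read end readable, which wakes up poll().
os.write(write_fd, b'new log line\n')
ready = p.poll(1000)  # timeout in milliseconds

data = b''
for fd, event_mask in ready:
    if event_mask & select.POLLIN:
        data = os.read(fd, 1024)
        print(data.decode().strip())

os.close(read_fd)
os.close(write_fd)
```

Called without a timeout, poll() blocks until an event arrives, which is the behavior a long-running monitor wants: the process consumes no CPU time while the log is quiet.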

Log Monitoring

All that remains is to analyze the individual log messages and react to certain events. The simplest method proved to be searching for keywords such as Error, Fail or Problem and sending an email when they occur. According to the documentation, the function process() must be called after each use of poll(). The C function sd_journal_process() mentioned in the documentation corresponds to the process() method of journal.Reader in the python-systemd module.

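import re

while p.poll():
    # process() must be called after each poll(); only react to newly appended entries
    if j.process() != journal.APPEND:
        continue
    for event in j:
        if re.search('(fail|error|alarm|problem|emerg)', event['MESSAGE'], re.IGNORECASE):
            print(event['MESSAGE'])  # placeholder: react to the event, e.g. send an email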
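The keyword check can also be factored into a small function that is testable without a running journal. The sketch below is one possible shape, not part of the original script: the name has_keyword is hypothetical, and the handling of a missing or bytes-valued MESSAGE field rests on the assumption that not every journal entry carries a cleanly decoded message.

```python
import re

# Compile once; the check may run for every journal entry.
KEYWORDS = re.compile(r'fail|error|alarm|problem|emerg', re.IGNORECASE)


def has_keyword(event):
    """Hypothetical helper: True if the entry's MESSAGE contains a keyword."""
    message = event.get('MESSAGE', '')
    if isinstance(message, bytes):
        # Assumption: messages that are not valid UTF-8 may arrive as bytes.
        message = message.decode('utf-8', errors='replace')
    return KEYWORDS.search(str(message)) is not None


print(has_keyword({'MESSAGE': 'Disk FAILURE on /dev/sda'}))
print(has_keyword({'MESSAGE': b'backup error'}))
print(has_keyword({'MESSAGE': 'Started Daily Cleanup.'}))
print(has_keyword({'_HOSTNAME': 'host1'}))
```

Because the search is case-insensitive and matches substrings, "fail" also catches "failed" and "FAILURE", which keeps the keyword list short.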

False Positives

Event logs can quickly fill with messages that contain the keywords being searched for but must be considered false positives. For example, the logs of the mail server Postfix often contain entries such as SSL_accept error from unknown, and the logs of the IMAP server Dovecot show messages like failed: Connection reset by peer (no auth attempts in 1 secs), which may be caused by automated scanners on the Internet. A mechanism is therefore needed to identify and ignore such unwanted log lines. A simple substring comparison is often sufficient; more complex cases may call for regular expressions.
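For the regular expression variant, a rule can map journal field names to patterns, and an entry counts as a false positive when every pattern of at least one rule matches. The following is a minimal sketch under that assumption. The rule set, the unit names and the function name is_false_positive_re are illustrative only and need to be adapted to the services and messages actually seen on a given system.

```python
import re

# Hypothetical rule set: each rule maps journal field names to regular expressions.
FALSE_POSITIVE_RULES = [
    {'_SYSTEMD_UNIT': r'^postfix',
     'MESSAGE': r'SSL_accept error from unknown'},
    {'_SYSTEMD_UNIT': r'^dovecot\.service$',
     'MESSAGE': r'failed: Connection reset by peer \(no auth attempts in \d+ secs\)'},
]

# Compile the patterns once at start-up.
COMPILED_RULES = [
    {field: re.compile(pattern) for field, pattern in rule.items()}
    for rule in FALSE_POSITIVE_RULES
]


def is_false_positive_re(event):
    """True if every pattern of at least one rule matches the event."""
    return any(
        all(rx.search(str(event.get(field, ''))) for field, rx in rule.items())
        for rule in COMPILED_RULES
    )


# Made-up sample entries for illustration.
noise = {'_SYSTEMD_UNIT': 'dovecot.service',
         'MESSAGE': 'imap-login: Disconnected: read(size=594) failed: '
                    'Connection reset by peer (no auth attempts in 3 secs)'}
real = {'_SYSTEMD_UNIT': 'dovecot.service',
        'MESSAGE': 'Fatal: failed to open mailbox'}

print(is_false_positive_re(noise))
print(is_false_positive_re(real))
```

Compared with a plain substring test, the pattern `\d+ secs` covers any number of seconds, and `^postfix` covers unit names that differ between installations.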

Practical Implementation

With the above considerations, a complete solution could look like the following.

import select
import re
import smtplib
import logging
from email.message import EmailMessage
from systemd import journal

def main():
    j = journal.Reader()
    j.log_level(journal.LOG_INFO)
    j.this_boot()

    j.seek_tail()
    j.get_previous()

    p = select.poll()

    journal_fd = j.fileno()
    poll_event_mask = j.get_events()
    p.register(journal_fd, poll_event_mask)

    while p.poll():
        if j.process() != journal.APPEND:
            continue
        for event in j:
            if re.search('(fail|error|alarm|problem|emerg)', event['MESSAGE'], re.IGNORECASE):
                if not is_false_positive(event):
                    msg = EmailMessage()
                    msg['Subject'] = f'[ALERT] Error detected in {event.get("_SYSTEMD_UNIT") or event.get("SYSLOG_IDENTIFIER", "unknown")} on {event["_HOSTNAME"]}'
                    msg['From'] = 'sender@localsystem'
                    msg['To'] = 'recipient@othersystem'
                    msg.set_content(f'{event["__REALTIME_TIMESTAMP"]} {event["MESSAGE"]}')
                    try:
                        s = smtplib.SMTP('localhost')
                        s.send_message(msg)
                        s.quit()
                    except Exception as e:
                        logger.error(f'Err: could not send mail. Reason: {e}')

def is_false_positive(event):
    false_positives = [
        {'_SYSTEMD_UNIT': 'postfix.service', 'MESSAGE': 'SSL_accept error from unknown'},
        {'_SYSTEMD_UNIT': 'dovecot.service', 'MESSAGE': 'TLS handshaking: SSL_accept() failed'}
    ]

    # if all of the keys and all of the values (values as substrings) of any of the false positives are in event
    return any(all(fp[key] in event.get(key, '') for key in fp) for fp in false_positives)

if __name__ == '__main__':
    logger = logging.getLogger(__name__)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    logger.addHandler(journal.JournalHandler())
    logger.info(f'Starting log monitoring script {__file__}')

    main()
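Journal entries are not guaranteed to carry every field the script reads, so building the mail in a separate function with fallbacks makes that step easier to test. The sketch below is one way to do this. The function name build_alert and the fallback strings are assumptions made for illustration, and the addresses are the placeholders from the script above.

```python
from email.message import EmailMessage


def build_alert(event, sender='sender@localsystem', recipient='recipient@othersystem'):
    """Hypothetical helper: turn a journal entry (a dict) into an alert mail."""
    # Fall back step by step, because not every entry has a unit or an identifier.
    source = (event.get('_SYSTEMD_UNIT')
              or event.get('SYSLOG_IDENTIFIER')
              or 'unknown source')
    host = event.get('_HOSTNAME', 'unknown host')
    timestamp = event.get('__REALTIME_TIMESTAMP', 'unknown time')

    msg = EmailMessage()
    msg['Subject'] = f'[ALERT] Error detected in {source} on {host}'
    msg['From'] = sender
    msg['To'] = recipient
    msg.set_content(f'{timestamp} {event.get("MESSAGE", "")}')
    return msg


# Made-up sample entry without a _SYSTEMD_UNIT field.
sample = {'SYSLOG_IDENTIFIER': 'kernel', '_HOSTNAME': 'host1',
          'MESSAGE': 'I/O error, dev sda'}
alert = build_alert(sample)
print(alert['Subject'])
```

The returned message can be passed to smtplib's send_message() exactly as in the script, so only the construction of the mail changes.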

To run the log monitor permanently as a system service, a systemd unit can be created:

[Unit]
Description=Log monitor
After=network.target

[Service]
Type=simple
ExecStart=/usr/bin/python3 /usr/local/bin/logmon.py
TimeoutStartSec=0
Restart=always
StartLimitInterval=0

[Install]
WantedBy=default.target

In our example, this unit was placed at /etc/systemd/system/logmon.service and then activated:

systemctl enable logmon && systemctl start logmon

Summary

It is evident that the described approach cannot replace a full-featured system monitoring solution, especially in comparison with centralized solutions such as Graylog, Elastic, Prometheus, Zabbix, Nagios or others. However, in terms of the ratio of effort to benefit, the presented solution has proven to be very useful. Rather than refraining from system monitoring altogether, whether for lack of time or because a larger solution seems too extensive, it is better to choose a simple approach that can be implemented with modest means and requires little maintenance.

About the Author

Tomaso Vasella holds a Master's degree in Organic Chemistry from ETH Zürich. He has been working in the cybersecurity field since 1999 and has worked as a consultant, engineer, auditor and business developer. (ORCID 0000-0002-0216-1268)
