Enhancing Data Understanding
Rocco Gagliardi
This article is the first in a series of two. In this part we will discuss how to approach log management and our experience with it. In the second part we’ll look more specifically at the requirements and costs of the various stages as well as provide an overview of different tools.
Often, during our audits, we run into the logging problem: regulations or recommendations that, at some point, require the ability to analyze the data computers send about their operational state. Even conscientious companies with very high levels of security run into trouble, and we see a recurring series of erroneous behaviours.
The problem is not trivial. Even very good tools struggle to collect and analyze a wide range of messages arriving at very high speed from a plethora of different devices and applications, each stream mixing garbage with highly important information. SIEM solutions are currently the state of the art in log collection and analysis. Top (and expensive) products in this category are Q1 Labs and HP ArcSight, but there are also many useful open source tools to start with.
Buying a server, turning it on and letting it do something for a couple of days is often sold as a solution, but this is quite unrealistic. Log management, and security log management as a special case, is a complex project that involves different components, technical and human, and that requires a very high-quality conceptual phase.
Between 2003 and 2008 we developed and deployed SE.LO.R.SY. (SEcurity LOg Reporting SYstem), our customized security log management solution. Several of our clients asked us for a solution to correlate data with the objective of identifying attacks in heterogeneous environments using complex communication channels. Our goal was to create a sufficiently flexible, dedicated, highly secure solution for early adopters, solving very specific customer needs.
We provided the solution (software, DB logic, etc.) without extra fee; they paid us for the conception and the logic of what to log, where, why, and how to report and present it to different internal divisions (legal, compliance, audit, IT, security) and to tiers ranging from technician to risk officer.
Supporting even a small number of logs per second requires a robust and reliable infrastructure: all engines involved (transport, parsing, storage and reporting) must be stable, work together and provide good performance. If one of these key factors is missing, the entire solution may become unusable.
We designed SE.LO.R.SY. with a focus on reporting, i.e. producing results offline, but it is also possible to interact with the collected data and to be alerted in near real time (~1 minute intervals).
Core components used:
Component | Usage | Details
---|---|---
syslog-ng | transport | internally initiated SSH tunnels to collect syslog data from external devices on a DMZ server
Perl/DBI | parsing engine | parsing/normalizing and transactional insert of relational records into three different tier-1 (optionally tier-2) MySQL databases
PHP | reporting engine | reporting/presentation engine based on a proprietary presentation framework
On a single low-end server (i686, 4 GB RAM, RAID-5), SE.LO.R.SY. was able to parse, normalize (extract ~10–15 fields from each log line) and transactionally store about 1K logs/s sustained, without in-memory summarization. At the same time, the creation of daily reports based on tables and graphs (filtering and counting over a base of 2M logs/day) took around 10 minutes per report. Interactivity was tuned by the customer: performance was increased by placing strategic indexes, and the challenge was to balance performance (insert/query) against space for optimal overall usability.
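The parse/normalize/store step can be sketched as follows (a minimal Python sketch using SQLite; the log format, field names and schema are invented for illustration, while the real SE.LO.R.SY. parsers were written in Perl/DBI against MySQL):

```python
import re
import sqlite3

# Hypothetical pattern for a firewall drop line; the real system had
# one parser per source (Checkpoint, Sonicwall, Windows, RACF, ...).
LINE_RE = re.compile(
    r"(?P<ts>\w{3} +\d+ [\d:]+) (?P<host>\S+) fw: "
    r"drop src=(?P<src>\S+) dst=(?P<dst>\S+) port=(?P<port>\d+)"
)

def normalize(line):
    """Extract named fields from a log line, or None if it doesn't parse."""
    m = LINE_RE.match(line)
    return m.groupdict() if m else None

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE logs (ts TEXT, host TEXT, src TEXT, dst TEXT, port INTEGER)")

lines = [
    "Oct  3 12:00:01 gw1 fw: drop src=10.0.0.5 dst=192.0.2.7 port=445",
    "garbage line that does not parse",
]

# Transactional insert: either the whole batch is committed or nothing is.
with db:
    for rec in filter(None, map(normalize, lines)):
        db.execute("INSERT INTO logs VALUES (:ts, :host, :src, :dst, :port)", rec)

print(db.execute("SELECT COUNT(*) FROM logs").fetchone()[0])  # 1
```

The transaction boundary is what keeps insert throughput high without risking half-written batches; the indexing trade-off mentioned above then happens on the resulting table.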
We developed parsers for Checkpoint Firewall, Sonicwall, Windows Server, Citrix, Unix logs and audit, RACF and some others. In the end we had solutions for auditing administrators, AD changes and data access on Windows platforms; escalating if the login chain (no SSO) was corrupted starting from the initial login on the web portal; monitoring Citrix accounts and the core banking application; correlating RACF logins with Unix logins and comparing them to the paper workflow of the granted rights; and many more. It was also possible to deliver customized security reports to different users, ranging from operators up to management.
Designing, engineering, implementing and customizing SE.LO.R.SY. for our different customers was not a simple task, and we learned a lot about log management products from an insider's point of view.
A log is a record of the events occurring within an organization. Logs are composed of log entries; each entry contains information related to a specific event that has occurred. Many logs within an organization contain records related to computer security (generated by firewalls, authentication software, antivirus software, etc.).
Logging and auditing take different approaches to collecting data:
SIM (Security Information Management) covers the collection of events from applications and operating systems and is normally responsible for long-term storage, analysis and reporting of log data.
SEM (Security Event Management) covers the collection and correlation of related events for security benefit and is normally responsible for real-time monitoring, event correlation and alerting.
SIEM (Security Information and Event Management) combines both functions in a single solution.
Successful attacks on systems do not necessarily leave clear evidence of what happened. It is necessary to build a configuration in advance that collects this evidence, both to determine that something anomalous has occurred and to respond appropriately. Log consolidation is fundamental for detecting APTs. Standards require you to log user, application and network activity.
However, they tend to be very vague about how that information gets processed. You can usually get away with dropping in a black box that generates some colourful management reports and be considered compliant. It may not help you find the backdoored system that is calling home, but you have met the standard.
A well-configured logging and audit infrastructure will show evidence of any misconfiguration which might leave the system vulnerable to attack, acting as problem prevention. Routine log analysis is beneficial for identifying security incidents, policy violations, fraudulent activity, and operational problems. Logs are also useful when performing auditing and forensic analysis, supporting internal investigations, establishing baselines, and identifying operational trends and long-term problems.
Auditing has the advantage of being more comprehensive, but the disadvantage of reporting a large amount of information, most of it uninteresting. Logging has the advantage of being compatible with a wide variety of client applications and of reporting only information considered important by each application, but the disadvantage that the reported information is not consistent across applications. Syslog is a standard on many different systems precisely because the MSG field is a free-form text field you can fill with whatever you like; more than 750 logfile formats are currently in use.
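To illustrate the consequence of that free-form MSG field: the same event arrives phrased differently from every application, and each phrasing needs its own normalizer before correlation is possible. A minimal sketch (all three message formats below are invented):

```python
import re

# The same "failed login" event, as three hypothetical applications
# might phrase it in their free-form syslog MSG field.
messages = [
    "sshd[812]: Failed password for root from 203.0.113.9 port 52344",
    "LOGIN FAILURE user=root ip=203.0.113.9",
    "auth: denied root (203.0.113.9)",
]

# One pattern per format: each extracts the same (user, ip) fields.
patterns = [
    re.compile(r"Failed password for (?P<user>\S+) from (?P<ip>\S+)"),
    re.compile(r"LOGIN FAILURE user=(?P<user>\S+) ip=(?P<ip>\S+)"),
    re.compile(r"auth: denied (?P<user>\S+) \((?P<ip>[^)]+)\)"),
]

def normalize(msg):
    """Map any known phrasing onto one canonical event record."""
    for pat in patterns:
        m = pat.search(msg)
        if m:
            return {"event": "login_failure", **m.groupdict()}
    return None  # unknown format: one more parser to write

for msg in messages:
    print(normalize(msg))
```

All three lines collapse into the identical record; every new log source means writing and maintaining another pattern, which is exactly why the missing common log language is so expensive.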
The scope of the infrastructure is the foundation on which to build it: do you want to improve security, or is there a compliance specification you need to adhere to? Declaring the primary goal to be compliance, monitoring, alerting or a combination of them steers the whole project in different directions.
It is very important to define the deliverables that the system must or should produce; it is preferable to start by specifying the name, sections, records, filters, sort order, grouping, counting and audience for each report.
The infrastructure must produce something for different audiences with different needs, must be maintained and must be audited. Identify the teams that will maintain the infrastructure, with particular regard to the sensitivity of the data processed; normally, the people deputed to maintenance do not have the rights to read or modify the processed data.
Define the destination groups: security, administrators, managers, architects; each of them needs to look at the same data from a different perspective. Define the management team and carefully design the permission model to implement. Define the audit team and make sure it regularly reviews the key parts of the system, from generation through transmission up to storage.
Be sure not to under-dimension the resources needed. Give personnel enough time to learn the infrastructure, identify the problems and solve them until a stable system state is reached. A typical pattern: analyze a huge amount of logs, most of them useless; list the reported problems; prioritize them by the number of log entries a single action can remove; act to correct the problem; repeat until the noise is eliminated or the number of superfluous log entries becomes acceptable.
This takes time and may have an unpredictable impact on multiple company resources. Example: a firewall log filled with drops caused by a misconfigured application ⇒ fixing the application may involve other departments and take a long time to be engineered and implemented.
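The prioritization step of the pattern above can be sketched as a simple frequency count: collapse the noise into signatures and fix the biggest producers first (the log lines and the signature rule below are invented for illustration):

```python
import re
from collections import Counter

logs = [
    "fw drop src=10.0.0.5 dst=192.0.2.7 port=445",
    "fw drop src=10.0.0.5 dst=192.0.2.7 port=445",
    "fw drop src=10.0.0.5 dst=192.0.2.7 port=445",
    "disk usage 91% on /var",
    "fw drop src=10.0.0.9 dst=192.0.2.7 port=445",
]

def signature(line):
    """Reduce a line to a coarse signature: strip variable parts
    (addresses, numbers) so identical problems fall into one bucket."""
    return re.sub(r"\d+(\.\d+)*%?", "N", line)

ranking = Counter(signature(l) for l in logs).most_common()
for sig, count in ranking:
    print(count, sig)
# The top signature (the firewall drops, 4 occurrences) marks the
# single fix that removes the most log volume.
```

On real data the counting happens per day in the database, but the principle is the same: one corrective action per signature, ordered by volume.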
Define the requirements to meet and declare the method to use. Use these definitions to make sure the logging sources report what is needed.
Do not just declare "meet PCI-DSS requirements". Specify in detail which part of the regulation you must be compliant with, which information from which source is needed, what should be checked or counted, what constitutes a normal/warning/error state, the generation frequency, who receives the result, how and for how long it should be retained, and what to do in case of a warning or an error. An example of definition:
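A hypothetical definition along these lines, expressed as a data structure (the thresholds, recipients and retention values are invented for illustration; PCI-DSS requirement 10.2.4 does cover invalid logical access attempts):

```python
# Hypothetical requirement definition; every value below is an
# example to be replaced with your own regulation mapping.
requirement = {
    "regulation": "PCI-DSS 10.2.4 (invalid logical access attempts)",
    "source":     "firewall + authentication logs",
    "check":      "count failed logins per account per hour",
    "normal":     "fewer than 5 failures/hour",
    "warning":    "5 or more failures/hour: mail to security team",
    "error":      "20 or more failures/hour: open an incident",
    "frequency":  "hourly",
    "recipient":  "security operations",
    "retention":  "12 months online, 36 months archived",
}

def classify(failed_per_hour):
    """Map a measured count onto the defined state."""
    if failed_per_hour >= 20:
        return "error"
    if failed_per_hour >= 5:
        return "warning"
    return "normal"

print(classify(3), classify(7), classify(25))  # normal warning error
```

Writing the definition down this precisely is what makes it testable: the thresholds, the recipient and the retention period all become things an auditor can verify.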
Whether you buy something from a vendor or start a project internally using open source tools, summarize all requirements in a Request For Proposal (RFP) using very clear, short, punctual questions; even so, you will get nebulous answers from your counterpart.
Describe the functionality, the use cases, which data must be parsed and which should be, and how many Events Per Second (EPS) you expect. Compare with open source products, but consider that they will also have a cost.
During our years of development experience in log management solutions, we learned that log management is probably the single most effective security solution you can deploy. If you really care about the logs, it will give you unrivaled visibility into the inner workings of your network. A logging system can be resource intensive, but it can also provide a very high rate of return.
A common log language is missing, and this makes it really difficult to manage the huge amount of information, so do not expect tool X to solve all your problems. It is important to start with a very good concept; this is the absolute first step, and to do it, it is fundamental to collect as many needs and inputs as possible from different departments. Do not limit the focus to IT requirements alone; carefully plan the resources needed and integrate log management into daily business. Expecting to solve everything in a couple of weeks is naive.