Counting the FW1 Logfile

by Rocco Gagliardi
time to read: 11 minutes

There are many ready-to-use log analysis tools that you can download, unpack and run, each with many great features and a user-friendly interface. In special cases, however, it can be more efficient to extract and manipulate the data with a self-engineered tool, mainly because you gain a deeper understanding of the data and the ability to create very specific, customized output.

Dealing with large logfiles is pretty simple. We will see how to extract information from an exported FireWall-1 logfile and format it for our needs.

Scope and Requirements

Last year, we worked on a project to protect the production servers from the internal users, splitting the internal network into several separate segments and controlling the traffic flowing between them. The idea was to install an open firewall, log all traffic (6 to 8 million log lines per day) to find the connections, decide whether each connection was needed and, if so, implement an allow rule. At the end of the project, it was only necessary to swap the last rule from Allow to Drop in order to block all communication that was not required.

The main problem was to automatically keep track of the analyzed connections, and of the decisions about their validity made by an external process, over the entire six-month duration of the project. We decided to develop some tools to analyze the firewall logs and a database to track and synchronize the implemented rules.

The requirements were defined as follows:

Process

Basically, the analysis process entailed the following steps (a minimal sketch of a daily driver script follows the list):

  1. export the log files of the previous day
  2. extract the connections: action-source-destination-service
  3. import them into the database
  4. correlate the sequences with those already stored: if a sequence is already present, update its counter and adjust its priority (in case a decision is still missing); if not, create a new case to be investigated on a priority basis
  5. produce a daily report on the status of the cases
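
To illustrate how these steps chain together, here is a minimal sketch of such a daily driver; the script names and paths (fw1_export.sh, parse_fw1.pl, db_import.sh, /data/fw1) are hypothetical placeholders, not the original tooling:

#!/usr/bin/perl
# Hypothetical daily driver: export yesterday's log, parse it, hand the CSV to the DB import.
use strict;
use warnings;
use POSIX qw(strftime);

my $day = strftime('%Y%m%d', localtime(time - 86400));   # previous day
my $raw = "/data/fw1/export_$day.txt";                    # exported text logfile
my $csv = "/data/fw1/connections_$day.csv";               # connection rows for the database

system("./fw1_export.sh $day $raw")         == 0 or die "export failed: $?";
system("perl parse_fw1.pl -i $raw -o $csv") == 0 or die "parsing failed: $?";
system("./db_import.sh $csv")               == 0 or die "import failed: $?";

print "Daily extraction for $day completed.\n";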

Getting the Logfile

The FireWall-1 log is stored in binary format in a protected location, and a privileged account is required to start the Check Point log export utility on the management/log server. Normally a firewall engineer does not have access to that location (generally he is a normal user on a production system and only has SmartConsole read access). In this case it is still possible to export the logfile from the SmartConsole Tracker; it takes time and generates a big text file, but it is a good compromise between security and usability.

Interpreting the Logfile

Once exported, we have a lot of lines to work with. By default, the Check Point tools export the following fields:

num;date;time;orig;type;action;alert;i/f_name;i/f_dir;product;src;dst;proto;rule;service;s_port;icmp-type;icmp-code;message_info;xlatesrc;xlatedst;NAT_rulenum;NAT_addtnl_rulenum;xlatedport;xlatesport;th_flags;Attack Info;attack

We use Perl for many reasons, but mainly because it is available by default on most systems. Even though it would have been possible to use a CSV parser, for simplicity we decided to analyze the file with a simple split on ";". Performance is not really an issue on a normal computer.
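
For comparison, the two parsing approaches look roughly like this; the sample line is shortened and hypothetical, and the Text::CSV variant is only a sketch that was not part of the tool:

# A hypothetical, shortened exported line, just for illustration
my $line = 'accept;192.168.2.1;192.168.20.1;tcp;http';

# Simple split on ';' as used in the tool: fast and good enough for a well-formed export
my @fields = split /;/, $line;

# Alternative sketch with a CSV parser (handles quoting and escaping, slightly slower)
use Text::CSV;
my $parser = Text::CSV->new({ sep_char => ';', binary => 1 });
$parser->parse($line) or die "could not parse line";
my @fields_csv = $parser->fields;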

The heart of the extraction and analysis procedure comes down to these few lines of code:

038 my @counts = ('rule','src','dst','proto','service');
...
129 while ( $ln = <INPUT> ) {
130    chomp $ln;
131    @hdr = split (/;/, $ln);
132    for ($i=0; $i<@hdr; $i++) {
133       chomp $hdr[$i];
134       $header->{$hdr[$i]} = $i;
135    }
136    while ( $line = <INPUT> ) {
137       $linecounts++;
...
139       chomp $line;
140       @content = split (/;/, $line);
...
145       foreach $counter ( @counts ) {
146          $stats->{$content[$header->{action}]}->{$counter}->{$content[$header->{$counter}]}++;
147       }
148       $stats->{$content[$header->{action}]}->{cnn}->{$content[$header->{src}]}->{$content[$header->{dst}]}->{$content[$header->{service}]}++;
149       $date = $content[$header->{date}];
150    }
151    last;
152 }

Basically, we read a line, split it into columns on the ";" and get an array of values. We can store all the information we need in relational form using a hash ($stats), which is a very flexible and powerful structure in Perl.

Lines 131-135: To keep the tool robust (the order of the fields might change depending on the version), we prefer not to hard-code the position of each element in the array, but to read the header and refer to every position by its column name.
Lines 145-147: For the basic statistics, we add a key for each action, defined counter and counter value, and increment it by 1 each time the sequence is found.
Line 148: For the connection statistics, we add a key for each action, source IP, destination IP and service, and increment it by 1 each time the sequence is found.

With just two lines of code, once the whole logfile has been parsed, we have in memory a hash with all our statistics on used rules/sources/destinations/protocols/services and a traffic matrix showing who talks to whom:

print Dumper($stats)
$VAR1 = {
   'accept' => {
      'proto' => {
         'tcp' => 4
      },
      'src' => {
         '192.168.2.1' => 3,
         '192.168.3.1' => 1
      },
      'rule' => {
         '1' => 4
      },
      'service' => {
         'http' => 3,
         'https' => 1
      },
      'dst' => {
         '192.168.20.1' => 3,
         '192.168.10.1' => 1
      },
      'cnn' => {
         '192.168.2.1' => {
            '192.168.20.1' => {
               'http' => 3
            }
         },
         '192.168.3.1' => {
            '192.168.10.1' => {
               'https' => 1
            }
         }
      }                         
   }
};
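
Individual figures can then be read directly from the structure; for example (values taken from the dump above):

# Direct lookups into the in-memory statistics
my $http_cnn  = $stats->{accept}{cnn}{'192.168.2.1'}{'192.168.20.1'}{http};   # 3
my $tcp_total = $stats->{accept}{proto}{tcp};                                 # 4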

Produce the Output

As defined in the specifications, it was necessary to generate a table of the daily connections. Producing such output is very simple: basically, we iterate over the hash(es) and print the results, as in the following example:

167 %actions = %$stats;
168 foreach $action (keys %actions) {
169   print OUTPUT "-"x20 ."\n"." Action: $action"."\n"."+"."-"x19 ."\n";
170   foreach $counter ( @counts ) {
171     foreach $count ($stats->{$action}->{$counter}) {
172       %hs = %$count;
173       print OUTPUT "+- $action - $counter" ."\n";
174       foreach $k ( sort { $hs{$b} <=> $hs{$a} } keys %hs ) {
175         printf OUTPUT "%6s => %s\n", $hs{$k}, $k if $hs{$k} > $filter;
176       }
177     }
178   }
179 }

The resulting summary in text format:

--------------------
 Action: accept
+-------------------
+- accept - rule
     4 => 1
+- accept - src
     3 => 192.168.2.1
     1 => 192.168.3.1
+- accept - dst
     3 => 192.168.20.1
     1 => 192.168.10.1
+- accept - proto
     4 => tcp
+- accept - service
     3 => http
     1 => https

The CSV for export:

date,rulenr,accept,drop
20121128,1,4,0
date,rulenr,src,dst,proto,service,accept,drop
20121128,1,192.168.2.1,192.168.20.1,tcp,http,3,0
...
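
The snippet above only walks the per-field counters; the connection rows are produced by walking the nested cnn hash in the same way. The following is only a sketch of how that could look (not the original code; the rule number and protocol columns are left out because the cnn hash, as built above, does not carry them):

# Sketch: emit one CSV row per source/destination/service combination.
# Only the column matching the current action is filled; the original tool merged accept/drop counts.
foreach $action (keys %$stats) {
   foreach $src (keys %{ $stats->{$action}->{cnn} }) {
      foreach $dst (keys %{ $stats->{$action}->{cnn}->{$src} }) {
         foreach $svc (keys %{ $stats->{$action}->{cnn}->{$src}->{$dst} }) {
            $hits = $stats->{$action}->{cnn}->{$src}->{$dst}->{$svc};
            print OUTPUT join(',', $date, $src, $dst, $svc,
                              ($action eq 'accept' ? $hits : 0),
                              ($action eq 'drop'   ? $hits : 0)) . "\n";
         }
      }
   }
}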

Various other functions can be, and have been, added to facilitate the extraction and increase the efficiency of the tool, but with this minimum you can read, interpret and summarize millions of lines of information.

Daily Usage

The daily extraction of nearly 7 million log lines took approx. 1 hour (including export, transfer and parsing); the parsing part alone took approximately 20 minutes.

Once created, the CSV file was imported into an MS Access database used for storage and correlation (about 1 GB in size at the end of the project); through a separate front-end MS Access database it was then possible to analyze, track and report all the required information.

Summary

Once it is clearly and exactly defined what is needed, from where, and how it will be used, the analysis of the information is easier than it may appear at first sight; handling large amounts of data is also relatively simple with the CPU and memory available today and a few lines of simple code.

Sometimes it is better to use technically simple but very flexible mechanisms tailored to an existing process than to adapt a process to a good tool. And, finally, hashes are great!


About the Author

Rocco Gagliardi

Rocco Gagliardi has been working in IT since the 1980s and specialized in IT security in the 1990s. His main focus lies in security frameworks, network routing, firewalling and log management.
