Vulnerability Scanning Data - Analyzing Reliability and Accuracy

by Marc Ruef
on April 02, 2020


Analyzing the Quality of Vulnerability Scans

  • Automated vulnerability scanning software can identify potential vulnerabilities quickly and efficiently
  • But automation goes hand in hand with the risk of false positives
  • Accordingly, results must be examined to ensure that they are reliable and accurate
  • Understanding the relevant tests allows conclusions to be drawn with respect to functionality and quality

Modern IT infrastructure is diverse and fast-moving. It’s tough to keep track of which components could have weaknesses and security gaps. Comprehensive manual testing isn’t possible anymore, which is why automated vulnerability scanning solutions are used for broad-based analyses. The results of such analyses often fail to take individual and complex relationships into account, so their quality varies. This quality must be established before the results can be worked with any further.

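Analyzing how individual checks work makes it possible to estimate the quality to expect from them. A security check, and thus the scanning plugin that implements it, can be based on one of three mechanisms: derivative (the vulnerability is inferred from exposed information such as a version number in a banner), scan (the target is accessed to determine whether a component exists and how it behaves) or exploiting (the vulnerability is actually exploited).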

Manufacturer information in plugins and reports

For certain products, the manufacturer documents how a check works and what quality of results can be expected.

An indication of quality by the manufacturer

Some manufacturers indicate how trustworthy a check is, often referring to this as accuracy or confidence. This provides a simple starting point for assessing the quality.

With a classification such as this, it is always open to question how the quality was measured and arrived at. Manufacturers tend to rate their results better than they really are. The quality level actually achieved, and the way it is communicated, should therefore be verified using random samples. Qualys, for example, tends to report derivative tests with Accuracy: poor.
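The random-sample verification could be sketched as follows. This is a minimal illustration, assuming each sampled finding has been checked manually and recorded as a pair of the manufacturer's accuracy label and a confirmed/not confirmed verdict. The function name and the label "high" are hypothetical; only "poor" appears in the text above.

```python
from collections import defaultdict

def precision_by_label(samples):
    """Share of manually confirmed findings per manufacturer accuracy label.

    samples: iterable of (label, is_real) pairs, where is_real is True
    if manual verification confirmed the reported vulnerability.
    """
    confirmed = defaultdict(int)
    total = defaultdict(int)
    for label, is_real in samples:
        total[label] += 1
        if is_real:
            confirmed[label] += 1
    return {label: confirmed[label] / total[label] for label in total}

# Hypothetical sample of manually verified findings.
sample = [
    ("high", True), ("high", True), ("high", False),
    ("poor", True), ("poor", False),
]
result = precision_by_label(sample)
```

If the observed share for a label is far below what the label suggests, the manufacturer's rating should be treated with caution.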

Notes in the title and description

Purely derivative checks can be identified by their title and description. A check should be assumed to be derivative as soon as its title refers to version numbers or it reports a large number of vulnerabilities at once (derived from a missing patch, for example).

Sometimes, the text description explicitly states how it works. Take 105729 – EOL/Obsolete Software: Nginx Server Detected by Qualys, for example:

The unauthenticated check tries to fetch the version from the version exposed in the Server: tag of a HTTP response

So, it is clear here that this is a derivative plugin that derives the information from the welcome banner as a non-authenticated user.
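What such a derivation from the Server header might look like can be sketched in a few lines. This is an assumption about the general technique, not the actual Qualys implementation, which is not public. The function name and the regular expression are illustrative.

```python
import re

# Product token with optional version at the start of the header value,
# e.g. "nginx/1.14.0 (Ubuntu)".
SERVER_TOKEN = re.compile(
    r"^(?P<product>[A-Za-z][\w.-]*)(?:/(?P<version>\d+(?:\.\d+)*))?"
)

def parse_server_header(raw_response):
    """Return (product, version) from the Server header, or None if absent."""
    head = raw_response.split("\r\n\r\n", 1)[0]
    for line in head.split("\r\n")[1:]:  # skip the status line
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "server":
            match = SERVER_TOKEN.match(value.strip())
            if match:
                return match.group("product"), match.group("version")
            return None
    return None

response = (
    "HTTP/1.1 200 OK\r\n"
    "Server: nginx/1.14.0 (Ubuntu)\r\n"
    "Content-Length: 0\r\n"
    "\r\n"
)
product, version = parse_server_header(response)
```

Nothing beyond a single request is needed for this, which is exactly why the result depends entirely on what the server chooses to expose.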

Checks based on scans often have a title or description that refers to exposed components or, in the case of a web server, to the exposed file paths. Such a finding is usually reported as "File /cgi-bin/foo.cgi found" rather than as a specific vulnerability. Here are some examples from Qualys:

ID Title
10931 VBulletin members2.php cross-site scripting vulnerability
11076 YABB SE Reminder.PHP SQL injection vulnerability
12767 Moodle badges/external.php cross-site scripting vulnerability
86410 Apache web server/server-status information disclosure vulnerability
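These title conventions can be turned into a rough triage heuristic. The sketch below is an assumption for illustration only: the patterns are hand-picked, will misclassify some titles (any word containing a slash counts as a path, for example), and say nothing certain about how a given check really works.

```python
import re

# A version number such as 3.4.0 or 5.5.x, or the word "banner".
VERSION_HINT = re.compile(r"\b\d+(?:\.(?:\d+|x))+\b|\bbanner\b", re.IGNORECASE)
# A URL path segment or a typical script file name.
PATH_HINT = re.compile(
    r"(?:/[\w.-]+)+|\b[\w-]+\.(?:php|cgi|asp|aspx|jsp|pl)\b", re.IGNORECASE
)

def guess_mechanism(title):
    """Guess the mechanism of a check from its title alone."""
    if VERSION_HINT.search(title):
        return "derivative"
    if PATH_HINT.search(title):
        return "scan"
    return "unknown"

titles = [
    "VBulletin members2.php cross-site scripting vulnerability",
    "Apache web server/server-status information disclosure vulnerability",
    "jQuery prior to 3.4.0 cross-site scripting vulnerability",
    "PHP 5.5.x and 5.4.x denial-of-service vulnerability",
]
guesses = [guess_mechanism(t) for t in titles]
```

A guess of this kind is only a starting point and should be confirmed by one of the methods described in the following sections.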

If a check goes one step further and exploits the vulnerability successfully, the results are usually recorded in the report. For cross-site scripting vulnerabilities, the HTML output containing the injected code fragments is shown, while for directory traversal the contents of the files that were read are shown.

Determining functionality

The best way of assessing a check’s basic quality is to determine how it works, both in general and in detail. There are various approaches to this, which are discussed below.

Analyzing code

The simplest, most reliable and most sustainable method is examining a check’s code. This can be done without any problems using open source solutions. The commercial product Nessus from Tenable has been a good example of this for nigh on 20 years.

The Nessus checks are implemented as individual plugins written in NASL (Nessus Attack Scripting Language), a scripting language modeled on C. The plugins can easily be loaded into a text editor, where they can be examined or adapted. Nmap takes a similar approach with its NSE scripts, which are written in Lua and are either included in the standard package or can be developed independently.

Nessus’ core component was also open source for years, but its source code is no longer released for commercial reasons. Some precompiled libraries are also included, and it is not easy to see how they work.

Commercial products such as Qualys offer no such transparency. No sources are available, neither for the core component nor for the individual checks. Active efforts are even made to make analyzing (reverse engineering) the components more difficult.

Deriving a vulnerability

If there is no way of gaining an insight into a check’s sources, the first step is to derive the expected functionality from the vulnerability to be checked.

During this process, it is important to remember that vulnerability scanners want to achieve a result in as simple and straightforward a way as possible. So it’s not unusual for most checks to be derivative or at most scanning checks by nature. Implementing a real exploiting check is complex, error-prone and risky.

Certain protocols, platforms and check classes lend themselves to a purely derivative implementation. Classic protocols such as HTTP, SMTP and FTP are typical examples: they are kept very simple, communicate in plain text and usually reveal the installed software when a connection is established. By evaluating this welcome banner, the product and sometimes even the version can be recognized by means of simple pattern matching.
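A minimal sketch of such pattern matching on welcome banners is shown below. The pattern list is illustrative and far from complete, the function names are hypothetical, and real banners vary between versions and configurations.

```python
import re
import socket

# Illustrative patterns only; a real scanner ships thousands of them.
BANNER_PATTERNS = [
    ("ProFTPD", re.compile(r"^220[ -].*?ProFTPD (?P<version>\d+(?:\.\d+)*[a-z]?)")),
    ("vsFTPd", re.compile(r"^220[ -].*?vsFTPd (?P<version>\d+(?:\.\d+)*)")),
    ("Postfix", re.compile(r"^220[ -].*ESMTP Postfix")),
]

def read_banner(host, port, timeout=5.0):
    """Connect and return the first data the service sends on its own."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        return sock.recv(1024).decode("latin-1").strip()

def identify_banner(banner):
    """Return (product, version) for a known banner, or None."""
    for product, pattern in BANNER_PATTERNS:
        match = pattern.search(banner)
        if match:
            # Patterns without a version group yield None as the version.
            return product, match.groupdict().get("version")
    return None

ftp = identify_banner("220 ProFTPD 1.3.5 Server (Debian) [::ffff:192.0.2.10]")
smtp = identify_banner("220 mail.example.com ESMTP Postfix")
```

In practice, `identify_banner(read_banner(host, 21))` would be the entire check: one connection, one read, one comparison.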

This is why certain vulnerability scanners come with checks that explicitly identify outdated versions and then attach corresponding defects to them. Here are some examples from Qualys:

ID Title
12913 PHP 5.5.x and 5.4.x denial-of-service vulnerability
13481 jQuery prior to 3.4.0 cross-site scripting vulnerability
19000 MySQL banner
87329 Apache HTTP server prior to 2.3.30 multiple vulnerabilities

If the banner of a web server reports a version older than 2.3.30, the check assumes that the vulnerabilities are present.
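The comparison behind such a check can be sketched as follows, assuming the banner has the common "Product/version" form. The function names are hypothetical. Note that versions have to be compared numerically per component, since a plain string comparison would rank 2.10 below 2.9.

```python
import re

def parse_version(text):
    """Turn '2.3.30' into (2, 3, 30) so that comparison is numeric."""
    return tuple(int(part) for part in text.split("."))

def banner_is_older(server_header, product, fixed_version):
    """True/False if the banner exposes a version, None if it does not."""
    match = re.search(re.escape(product) + r"/(\d+(?:\.\d+)*)", server_header)
    if match is None:
        return None  # no version exposed: nothing can be derived
    # Backported patches are invisible here, so True may be a false positive.
    return parse_version(match.group(1)) < parse_version(fixed_version)

old = banner_is_older("Apache/2.2.15 (CentOS)", "Apache", "2.3.30")
new = banner_is_older("Apache/2.4.41 (Ubuntu)", "Apache", "2.3.30")
hidden = banner_is_older("Apache", "Apache", "2.3.30")
```

The third case, a banner without a version, shows why a derivative check can only report what the target is willing to reveal.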

The reliability of a check such as this must be discussed on three levels:

Some web server implementations allow the banner to be modified. In other cases, the information required for a check (such as the installed patches) is not exposed at all. Either way, the derivation can be wrong. This makes derivative checks very unreliable.

Scan access goes one step further than purely derivative checks. Here the target is actually accessed to determine the existence or behavior of components. But no exploiting takes place, as its complexity and invasiveness often bring significant disadvantages.

Many vulnerability scanners have generic checks for identifying typically vulnerable scripts. This task used to be done by what are known as CGI scanners, which tried out different URLs. As soon as the status code indicates that the file exists, it is assumed that the vulnerability introduced by the file exists as well.

As with the purely derivative approach, the number of potential false positives is not negligible. After all, just because a component exists and/or behaves as one would expect it to does not mean that the vulnerability is still present. The component could have been updated or corrected with a patch, a detail that scan-only access does not detect.
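The logic of such a CGI-style scan can be sketched as below. This is a simplified assumption of how these checks tend to work: only status code 200 is treated as a hit, and the function names are hypothetical. The status lookup is passed in as a callable so that the decision logic can be examined without network access.

```python
import http.client

def http_status(host, port, path, timeout=5.0):
    """Request a path with HEAD and return the HTTP status code."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("HEAD", path)
        return conn.getresponse().status
    finally:
        conn.close()

def probe_paths(get_status, paths):
    """Return the paths whose status code suggests that the file exists.

    A hit only shows that the resource is there. Whether it is still
    vulnerable (or has been patched) is not determined.
    """
    return [path for path in paths if get_status(path) == 200]

# Simulated target for illustration; a real run could use
# lambda p: http_status("192.0.2.10", 80, p) instead.
fake_server = {"/cgi-bin/foo.cgi": 200, "/members2.php": 404}
hits = probe_paths(lambda p: fake_server.get(p, 404),
                   ["/cgi-bin/foo.cgi", "/members2.php", "/server-status"])
```

Servers that answer every request with 200, such as those with custom error pages, would turn every probed path into a finding, which is one common source of the false positives mentioned above.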

Extensive exploiting is performed only in rare cases. It is time-consuming, invasive and therefore risky. Commercial vulnerability scanners practically never perform such tests. Only extended active scanners (e.g. Burp Suite from PortSwigger) or exploiting frameworks (e.g. Metasploit or ATK, the Attack Tool Kit) are consistently based on this option.

Access behavior

One approach that is not considered in the literature is analyzing the timing behavior. The basic rule is that the more complex a check is, the longer it will take. On the one hand, this is due to the increase in the number of work steps. But it can also be related to the load on resources (e.g. required clock cycles, memory usage).

Unfortunately, typical time values cannot be assigned to the three methods. Each check must be examined separately. Nevertheless, testing a single vulnerability with the three different variants can be expected to result in different timing behavior. The rhythm of the accesses can also be revealing.

The access rhythm and volume can be examined as an alternative or in addition. A simple welcome banner query requires less data volume than additionally sending exploit code and reading the response to it. The table below illustrates the increase in this complexity.

Step                                Derivative  Scan  Exploiting
Establishing a connection           x           x     x
Reading the banner                  x           x     x
Requesting a resource               -           x     x
Sending an additional exploit code  -           -     x
Response received                   -           x     x
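The step table can also be read in the other direction: from the steps observed on the wire back to the likely mechanism. The sketch below assumes that each mechanism adds steps to the previous one, as the surrounding text suggests. The step names and the function are hypothetical.

```python
# Steps per mechanism, assuming each mechanism builds on the previous one.
STEPS = {
    "derivative": {"connect", "read_banner"},
    "scan": {"connect", "read_banner", "request_resource", "receive_response"},
    "exploiting": {"connect", "read_banner", "request_resource",
                   "send_exploit", "receive_response"},
}

def classify(observed_steps):
    """Guess the mechanism from the most invasive step that was observed."""
    observed = set(observed_steps)
    if "send_exploit" in observed:
        return "exploiting"
    if "request_resource" in observed:
        return "scan"
    if "read_banner" in observed:
        return "derivative"
    return "unknown"

guess = classify(["connect", "read_banner", "request_resource", "receive_response"])
```

Classifying by the most invasive step rather than by an exact match keeps the guess usable when a step is missing from the capture.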

It must be taken into account that complex protocols, compression, retransmissions and piggyback mechanisms can distort the picture. Extensive examinations are therefore needed before a conclusion can be reached.
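Timing, rhythm and volume can be summarized from a packet capture with very little code. The following sketch assumes the capture has already been reduced to a list of (timestamp, payload size) pairs for a single check. How these pairs are obtained, the function name and the choice of metrics are assumptions for illustration.

```python
from statistics import mean

def summarize_session(packets):
    """Summarize timing, rhythm and volume of one check.

    packets: list of (timestamp_seconds, payload_bytes) in capture order.
    """
    if not packets:
        raise ValueError("no packets captured")
    times = [timestamp for timestamp, _ in packets]
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    return {
        "duration": times[-1] - times[0],                   # timing
        "mean_gap": mean(gaps) if gaps else 0.0,            # rhythm
        "max_gap": max(gaps) if gaps else 0.0,              # rhythm
        "total_bytes": sum(size for _, size in packets),    # volume
        "packet_count": len(packets),
    }

# Hypothetical capture: handshake, small request, larger response.
summary = summarize_session([(0.0, 0), (0.5, 40), (1.5, 500)])
```

Comparing such summaries for the same vulnerability across different products is more meaningful than looking at absolute values, since the distortions named above affect every measurement.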


Vulnerability scanners have become an indispensable and integral part of comprehensive security checks. Their associated automation ensures enhanced efficiency and coverage. But automation always carries the risk of incorrect results. Determining how trustworthy a product and the results it generates are helps with finding out whether and to what extent it can be used.

The behavior of the software or the individual checks can be examined during this process. A distinction can be made in this regard between derivative checks, scans and exploiting. A check’s title and description often give an indication of what approach is used. Ideally, the check’s source code can be examined to provide clarity.

Wherever this is not possible, an attempt can be made to analyze the access behavior. The timing, rhythm and data volume can provide indications so that appropriate derivations can be made.

About the Author

Marc Ruef

Marc Ruef has been working in information security since the late 1990s. He is well-known for his many publications and books. His most recent book, The Art of Penetration Testing, discusses security testing in detail. He is a lecturer at several universities, including ETH, HWZ, HSLU and IKF. (ORCID 0000-0002-1328-6357)

