Checklists or Scenarios – That is the Question

by Veit Hailperin
time to read: 11 minutes

A penetration test consists of three stages: scoping, testing and documentation. For all three, there is extensive discussion – of varying quality – within the IT security community. For scoping, there is sometimes talk, and highly polemic talk at that, of whether social engineering should also be included. Elsewhere, you will find claims that all of a company’s peripheral systems form part of the scope. It’s an important discussion, but risk scenarios are too often mixed up with the contexts in which such statements apply, which makes the debate hard for lay observers to follow. In testing, too, you will find these debates. One article on the subject worth reading is on Purple Testing, a combination of the classic schema of red (attack) versus blue (defense). Another is on Interactive Pentesting, in which the author calls for greater involvement of clients during tests. What makes these articles interesting is that they look beyond the usual bounds of these discussions. Less valuable are the hair-splitting articles that examine the exact definitions of words such as penetration test and security assessment.

One aspect that sometimes comes off second-best is documentation, which seems to affect testers the way garlic affects vampires. That reports must be written for the reader and not the tester, for example, is one simple fact that is often overlooked. A simple list of vulnerabilities may be enough when you’re collaborating directly with the development team. But where the report is required to justify a new budget, a purely technical report will rarely cut it.

At scip, we are always trying to improve, and that leads to discussions about what to test, how to test and, naturally, how to document findings. Here, I would like to reflect on two approaches to testing (web) applications – checklist-based versus scenario-based – and their implications for documentation.

Checklist-based testing

Essentially, this already exists in many automated scanners for networks and web applications. Test cases – nothing more than points in a checklist – are defined in the program, and these are then run through. The question is, how does this approach work for manual tests, and does it make sense?

Many advantages and disadvantages depend on the implementation and quality of the checklist. Are the points in the list risks or test cases/payloads? Let’s examine this using the example of session management. Depending on the approach, we can carry out the four following points in the checklist one by one:

Or simply summarized as one item:

As these are four different problems, you could well argue that each deserves its own place in the checklist. On the other hand, you could also argue that all four result in the same risk – a session being taken over. What’s more, presumably only one tool is needed for the first three items, because from an implementation perspective they are all interdependent: the probability of collision, for example, is higher if the token is short. Fortunately, most applications these days use established session token technologies that have been tested frequently and are regarded as secure. If, for example, a simple user name is used as the token, that will be noted in the test and documented. But it also makes sense to file this point under ‘session token technology’.
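The first three token-related items can be approximated with a few lines of code. The following is a minimal sketch, assuming tokens have been sampled from Set-Cookie headers during a test; the entropy estimate and the birthday-bound collision formula are standard, all helper names are illustrative:

```python
import math
import secrets

def charset_size(tokens):
    """Size of the alphabet actually observed across the sampled tokens."""
    return len(set("".join(tokens)))

def bits_of_entropy(tokens):
    """Upper bound on per-token entropy: length * log2(alphabet size)."""
    n = charset_size(tokens)
    length = min(len(t) for t in tokens)
    return length * math.log2(n) if n > 1 else 0.0

def collision_probability(bits, sessions):
    """Birthday-bound approximation: p ~ k^2 / 2^(bits+1)."""
    return min(1.0, sessions ** 2 / 2 ** (bits + 1))

# Stand-in for tokens collected from the application under test.
sampled = [secrets.token_hex(16) for _ in range(5)]
bits = bits_of_entropy(sampled)
print(f"estimated entropy: {bits:.0f} bits")
print(f"collision probability at 1M sessions: {collision_probability(bits, 10**6):.2e}")
```

Such a back-of-the-envelope check quickly shows why short tokens and collision probability are two sides of the same coin.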

If, on the other hand, we consider the issue of risk versus test case for a checklist entry on cross-site scripting (XSS), it is easier to decide whether differentiation by risk or by test makes sense:

XSSs that exist because of browser errors (universal XSS and mutation-based XSS) should be included in a checklist for browser testing. If there is a filter, and if it is possible to circumvent it, then this is another vulnerability and it is better to list it separately.
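Whether a filter can be circumvented is itself a testable checklist entry. Below is a hedged sketch: `naive_filter` is a deliberately weak stand-in written for illustration, not the behavior of any real product, and the payload list is a small sample of the variants such an entry might cover:

```python
def naive_filter(value: str) -> str:
    """Deliberately weak example filter: strips the literal '<script' token."""
    return value.replace("<script", "")

# Variants a 'filter circumvention' checklist entry might run through.
payloads = [
    "<script>alert(1)</script>",         # textbook payload the filter catches
    "<SCRIPT>alert(1)</SCRIPT>",         # case variation
    "<scr<scriptipt>alert(1)</script>",  # nesting: a single replace pass reassembles the tag
    "<img src=x onerror=alert(1)>",      # different vector entirely
]

def bypassed(payload: str) -> bool:
    """A variant counts as a bypass if executable markup survives the filter."""
    out = naive_filter(payload).lower()
    return "<script" in out or "onerror=" in out

for p in payloads:
    print(f"bypassed: {str(bypassed(p)):5}  {p}")
```

Each surviving variant is a separate demonstration of the same vulnerability, which is why it is better to list the filter bypass as its own point.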

There is no one answer to the question of test case versus risk; it must be decided on a case-by-case basis. The overriding principle is the most comprehensive coverage of the defined scope.

Benefits of checklists in testing:

  1. It’s easier to ensure that nothing is forgotten. With large applications in particular, it is easy to overlook a certain type of vulnerability or a configuration setting; the checklist helps testers make sure nothing slips through.
  2. When a vulnerability is found that isn’t included in the list, it is easy to expand the list and to include this point in future tests. This leads to continual improvement.
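Both benefits come almost for free once the checklist is treated as data rather than a static document. A minimal sketch, assuming a simple pass/fail/not-tested status per item; the item names and the standard references are illustrative, not a prescribed mapping:

```python
from dataclasses import dataclass, field

@dataclass
class ChecklistItem:
    name: str
    standard_ref: str = ""       # e.g. a section of the OWASP Testing Guide
    result: str = "not tested"   # "pass", "fail" or "not tested"

@dataclass
class Checklist:
    items: list = field(default_factory=list)

    def untested(self):
        """Everything still open: guards against forgetting a check (point 1)."""
        return [i.name for i in self.items if i.result == "not tested"]

    def extend_with(self, name, standard_ref=""):
        """Fold a newly discovered vulnerability class into future tests (point 2)."""
        self.items.append(ChecklistItem(name, standard_ref))

cl = Checklist([
    ChecklistItem("Session token entropy", "WSTG-SESS-01", "pass"),
    ChecklistItem("XSS filter circumvention", "WSTG-INPV-01"),
])
cl.extend_with("Cross-site script inclusion")  # found once, tested forever after
print(cl.untested())
```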

With checklists, it is important that testers don’t regard lists as definitive, but rather as minimum requirements. This is more a problem of how the checklist is used than the approach itself. Where a finding arises that is rare and not in the checklist, it may not be worth listing it as an additional point. Instead, it should be documented separately. In this case, concentrating solely on the list would be detrimental due to the extra effort generated without added value.

The checklist also helps with documentation:

  1. The individual points can largely be drawn from existing standards such as OWASP, FINMA or PCI DSS. Should a customer use one of these standards internally, they know straight away how to categorize the findings. At the same time, as testers we can include points we regard as important but which are not described in any of the standards. One example is the cross-site script inclusion attack, which doesn’t correspond to any point in the OWASP Testing Guide.
  2. If the customer – and this is the exception rather than the rule – wants to see only vulnerabilities listed in the documentation, filtering by vulnerability is always easier than subsequently listing every test.
  3. Many audit reports I have seen document only vulnerabilities. When you work closely with the technical team, this may be entirely sufficient and may even be desirable. But for anyone providing documentation for a broader readership, it is often a good idea to offer something more than just a list of technical security problems.
  4. The fact that the checklist is used during the test means that it doesn’t take too much extra effort to document what was tested and found not to be problematic. This is also important!
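Points 2 and 4 above amount to two views over the same data: the full picture including everything found unproblematic, and a filtered vulnerabilities-only view. A minimal sketch of that idea, with hypothetical item names and a simple pass/fail status:

```python
def render_report(results, vulnerabilities_only=False):
    """results: (item, status) pairs, status in {'pass', 'fail'}."""
    if vulnerabilities_only:
        results = [(item, status) for item, status in results if status == "fail"]
    return "\n".join(
        f"[{'OK' if status == 'pass' else '!!'}] {item}" for item, status in results
    )

results = [
    ("TLS configuration", "pass"),
    ("Session token entropy", "pass"),
    ("Stored XSS in comment field", "fail"),
]
print(render_report(results))                             # everything tested, incl. what was fine
print(render_report(results, vulnerabilities_only=True))  # the rare vulnerabilities-only request
```

Filtering down is trivial; reconstructing what was tested from a vulnerabilities-only report after the fact is not.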

The appearance of vulnerabilities in the context of all tested points is then really more a side-effect of complete documentation. This helps avoid a situation where a single major vulnerability sets the tone amid 100 points that were otherwise all positive; for example, when encoding was incomplete but session handling and server header configuration were implemented without error.

Reports that do more than just criticize are generally easier to accept. That’s because we want the report to improve security and support the developers – rather than generate resistance.

An additional, positive side-effect here is training of administrators and developers. Just because a vulnerability is not apparent in the application does not mean that it was actively avoided or even that anyone was aware of its existence.

Scenario-based testing

The scenario-based testing approach has its origins in risk analysis. Risk analysis is used to find out which risks exist and to assess them according to their impact. This can then be used to develop appropriate countermeasures. The same approach can be applied to web application penetration tests.

Compromise of the server or defacement of the website are general risks that apply everywhere. But every application also has its own specific risks. For a banking application, a user being able to carry out transfers in the name of another user would be fatal. For a mail service, one user being able to read another user’s emails is the kind of risk that must never materialize.
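Ranking such risks is what turns a risk analysis into concrete test scenarios. The following is an illustrative sketch only; the 1-5 scales and the scores themselves are assumptions, not values from any real assessment:

```python
# Hypothetical risk register for a banking application; scales and scores assumed.
risks = {
    "Transfer in another user's name": {"likelihood": 2, "impact": 5},
    "Website defacement":              {"likelihood": 3, "impact": 3},
    "Server compromise":               {"likelihood": 2, "impact": 5},
}

def score(r):
    """Classic likelihood x impact scoring."""
    return r["likelihood"] * r["impact"]

# The highest-scoring risks become the scenarios the test focuses on.
for name, r in sorted(risks.items(), key=lambda kv: score(kv[1]), reverse=True):
    print(f"{score(r):>2}  {name}")
```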

As with the checklist approach, scenarios first have to be developed. Some can be developed without the customer’s involvement, while others require insider knowledge. For example, an application that functions as a training platform may have artificial intelligence working in the background. Were an attacker able to steal it, the company’s entire business model would collapse. Without this background information, this may not be listed as a testing objective. On the other hand, a statement about this in the report would certainly be welcome.

The advantages of scenario-based testing are:

  1. A more active search for paths to the goal. The path is usually a clever chain of smaller vulnerabilities rather than a single security gap – and this is precisely the strength of the scenario-based approach. The goal is defined, and it is above all the creativity applied in, for example, reaching a protected resource that determines the quality of the test.
  2. A welcome side-effect is increased enjoyment of work. This may have no direct impact on the customer, but it certainly does indirectly, because enjoyment in work usually means better quality results.

This means of testing also has benefits for documentation:

  1. If a report is targeted at company managers who may not be so technically adept, it is easier to make it more meaningful. For example, you could make the statement that of the defined risks X, Y and Z, only Y is present and represents an actual risk. This is in major contrast to a large number of small vulnerabilities where it is not necessarily apparent how they might lead to a successful attack.
  2. Overall, scenarios are easier for people without a technical background to understand. Testers repeatedly complain of a lack of acceptance for the leading role of IT security. Acceptance becomes easier when it is apparent what is at stake. Scenarios are one way of showing this.
  3. For the technical readership of the documentation, too, scenarios help to clarify effects. A technical comprehension of cross-site request forgery (CSRF) is not the same as understanding the impact of a CSRF in a specific application. The consequences of a CSRF vary significantly. It’s no coincidence that bug bounty sites list sub-categories that may sound strange to testers, such as logout CSRF or CSRF of public forms. But this becomes understandable when you see the impact. A CSRF with which an attacker can directly carry out a transfer in a banking application is clearly a problem. When protection is lacking across the board, it is sometimes not so readily apparent why this constitutes a danger. The same applies to logout CSRF, often downplayed on the internet, but something that can have particularly grave consequences. In an attack on Uber that was, from an auditor’s perspective, particularly elegant, it was precisely a logout CSRF that proved to be a key component. The article is also a good example of how a reflected XSS can become more significant in the user context (the author calls it self-XSS).
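Part of why logout CSRF is so easy to underestimate is how trivially it is mounted: all it takes is a page that silently fires one state-changing request. A minimal sketch that generates such a proof-of-concept page; the endpoint URL is a placeholder, not a real application:

```python
def logout_csrf_poc(target="https://app.example/logout"):
    """Builds an auto-submitting HTML page that fires the state-changing request.
    The target URL is a hypothetical placeholder for the application under test."""
    return (
        f"<form id='f' method='POST' action='{target}'></form>"
        "<script>document.getElementById('f').submit()</script>"
    )

print(logout_csrf_poc())
```

On its own this only ends a session; chained with other weaknesses, as in the Uber case, it can become a key component of a full attack.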

Problems can arise when this approach is taken to the extreme. If the documentation contains only complete scenarios, known vulnerabilities that did not contribute to any scenario may be missing from the report. They might become relevant only in some future attack scenario – but fixing them now prevents that scenario from ever arising. Therefore, these vulnerabilities must also find a place in the documentation.


Both approaches have their positive aspects. In the scip Red Team, we try to draw the best from both sides. For the tests, we use checklists to guarantee completeness and comparability. The individual vulnerabilities identified sometimes give rise to new scenarios that were not previously apparent. Conversely, we also use scenarios, which can in turn lead to new points being included in the checklist. And in reports, too, we use a combination of both approaches. The goal of any report is:

  1. Transparency
  2. Completeness
  3. Comprehensibility
  4. Relevance

When it comes to transparency, the checklist approach helps enormously. It is easy to see what was tested, which areas were fine and which need improvement. The approach also helps maintain completeness. Comprehensibility and relevance are covered as far as possible by scenarios: wherever possible, scenarios feed into the individual points and are referred to in the management summary. Scenarios also allow us to ensure the appropriate customer-specific relevance.

About the Author

Veit Hailperin

Veit Hailperin has been working in information security since 2010. His research focuses on network and application layer security and the protection of privacy. He presents his findings at conferences.

