Isn’t business continuity part of security?
Andrea Covello
Companies have implemented solutions designed to protect them against viruses: strategies, concepts and, last but not least, intelligently implemented technologies that are kept up to date. What is lacking more often than not, however, is a clearly structured modus operandi for protecting the company’s IT infrastructure, as well as emergency protocols for the case that a virus is detected in the company. Sometimes there are rules concerning incident management or CERT scenarios, but when it comes to reacting quickly in a coordinated and targeted fashion, the protocols, checklists, tools and communication channels, to name just a few, are missing, especially those concerned with combating the virus threat itself. This article contains important recommendations for dealing with viruses successfully and in a rational manner.
Subjectively speaking, the subject of virus infections seems to have dropped out of the media, unless a story is about the currently popular spying incidents, which appear to be a favourite of news outlets. It is therefore time to bring the subject back to mind and back to its roots, because the risks are highest when everyone involved thinks they are safe.
In general, all appearances of malware are considered to be virus attacks. The term malware stands for software that inflicts damage and is a portmanteau of malicious (from the Latin malus, meaning bad) and software. Malware is a collective term covering the entire spectrum of software that compromises or intrudes into systems in unwanted ways. These programs can perform unwanted tasks, which may or may not damage our systems, in either an obvious or a hidden way. Examples of malware are the following:
Because the term virus has established itself as a synonym for malware, even though this is not factually correct, I will use the term “virus” for the rest of this article to keep it readable. Thus, when I write about virus protection, I mean protection against all sorts of malicious software.
These days, the main job of a virus is no longer to spread. Most of today’s viruses come in the shape of a Trojan horse and are custom-tailored to spy on systems from afar, to control them remotely, or both.
Security incidents in this context are occurrences in which the security and integrity of a company’s services, data or infrastructure are negatively affected. Such incidents can be caused by targeted attacks or by accidental tampering. Companies constantly monitor their IT infrastructure in order to spot security compromises and react to them according to their criticality.
In this article you will find advice on handling the incidents known as virus attacks. We focus on reactive measures in case of emergency. However, this article assumes that fundamental rules for handling an incident exist and are established.
The occurrence of a virus in a corporate environment should be categorized by relevance, potential effects and dangers. The following table gives an overview of the scenarios that make sense.
Case | Malware | Internal | Effect | Risk | Action | Procedure |
---|---|---|---|---|---|---|
1 | New | No | Source of information (media, Vulnerability Database, Newsletter, …) | Low | Pattern update required | Active: Active initiation of a pattern update |
2 | New | Yes | An unknown virus appears in the company, no pattern file available, AV-Console/Scanner does not sound the alarm, damage/spread in progress | Acute | Contain damage and spread | Reactive: Acute measures according to article |
3 | Known | Yes | AV-Console/Scanner sounds alarm, infected file is moved to quarantine | Low | Damage/Spread stopped | Active: AV Tools according to processes defined in teams |
4 | Known | Yes | AV-console/scanner sounds alarm, infected file cannot be moved to quarantine | Acute | Limit Damage/Spread, control distribution of pattern | Reactive: Immediate measures according to article |
5 | Known | Yes | False Positive, harmless file is recognized as virus and moved to quarantine | Medium | If needed, define exception, pattern update required, lift quarantine | Reactive: AV Tools according to processes defined in teams |
The cases that require immediate action (numbers 2 and 4) are described in detail in this document. Cases with low or medium risk can be delegated to internal teams, which deal with the issue according to standard protocols.
The sequence of events for determining the threat and answering further questions about the incident is structured as follows:
Realtime
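The scenario table can be expressed as a small decision function. The following is only a minimal sketch: the names `Observation`, `classify` and `needs_immediate_action` are hypothetical, and treating every report that has not yet been seen internally as case 1 is an assumption, since the table only lists new malware for that case.

```python
from dataclasses import dataclass
from enum import Enum


class Risk(Enum):
    LOW = "low"
    MEDIUM = "medium"
    ACUTE = "acute"


@dataclass(frozen=True)
class Observation:
    """What is known about a malware report (hypothetical field names)."""
    known_malware: bool          # a pattern file exists for this malware
    seen_internally: bool        # the malware was detected inside the company
    quarantined: bool = False    # the infected file was moved to quarantine
    false_positive: bool = False # a harmless file was flagged


def classify(obs: Observation) -> tuple[int, Risk]:
    """Return (case number, risk) following the scenario table."""
    if not obs.seen_internally:
        # Case 1: only reported by external sources so far (assumption:
        # this also covers known malware that has not appeared internally).
        return 1, Risk.LOW
    if not obs.known_malware:
        # Case 2: unknown malware inside the company, no pattern available.
        return 2, Risk.ACUTE
    if obs.false_positive:
        # Case 5: harmless file recognized as a virus.
        return 5, Risk.MEDIUM
    if obs.quarantined:
        # Case 3: scanner raised the alarm and quarantine succeeded.
        return 3, Risk.LOW
    # Case 4: scanner raised the alarm but quarantine failed.
    return 4, Risk.ACUTE


def needs_immediate_action(obs: Observation) -> bool:
    """Cases 2 and 4 are the ones that call for immediate measures."""
    return classify(obs)[1] is Risk.ACUTE
```

Calling `classify(Observation(known_malware=True, seen_internally=True))` yields case 4 with acute risk, because the file was not quarantined.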
The first step of any threat management is of vital importance: recognizing the incident and classifying its relevance. It is up to the system administrators of the targeted area to ensure that the alerting of the automated surveillance systems (malware and anti-virus tools and consoles, among others) is set up so that the generated messages are automatically relayed to the people in operations. Each message generated by a virus surveillance system should also be relayed to a central instance, where all messages received can be correlated. Thresholds should be defined to increase the accuracy of surveillance. They should be defined by suitably equipped teams and custom-tailored to the environment in which they are used.
If a system administrator learns of an intrusion into the system, he makes an initial classification. This classification must be based on the likelihood of a worst-case scenario as well as the potential damage associated with it. A detailed analysis has to follow later, once the initial threat has been successfully neutralized, and it may contradict the system administrator’s earlier classification. However, the principle of precaution is decisive: when in doubt, it is better to operate on a higher threat level than needed.
Before further measures can be taken, the predefined stakeholders and teams need to be informed of what is going on. Usually, these groups are:
Depending on the urgency of an incident, the predefined person responsible decides whether the measures taken to combat the threat need to be sped up. Usually after receiving a recommendation from the system administrator, he decides whether immediate measures are to be initiated or whether further escalation is needed. Decisions taken by the system administrator must be discussed with operations, application managers and the people in charge of the business, as well as with the incident management teams.
In particularly critical cases, those that cause a lot of damage in a company, Incident Management and any existing BCM protocols can call in an Emergency Team made up of members of other teams in the company. The person responsible for this team decides, in collaboration with Incident Management, whether other people in the company need to be informed or involved.
The goal is to limit the impact of the incident and stop its spread in order to minimize the negative effects of the intrusion. Should this not be feasible, other measures such as isolation need to be taken. If isolation is necessary, it needs to be discussed with operations, the emergency response teams and, of course, the security officer, the business owners and incident management.
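Central correlation with a threshold could look roughly like the sketch below. It is an illustration only: the class `AlertCorrelator`, its parameters and the rule "alert when the same malware name is reported by at least three distinct hosts within ten minutes" are assumptions chosen for the example, not values from this article. The sketch also assumes that alerts arrive in chronological order.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta


class AlertCorrelator:
    """Counts distinct hosts reporting the same malware in a sliding window."""

    def __init__(self, threshold: int = 3,
                 window: timedelta = timedelta(minutes=10)):
        self.threshold = threshold  # hypothetical default, tune per environment
        self.window = window        # hypothetical default, tune per environment
        # malware name -> deque of (timestamp, host), oldest first
        self._events = defaultdict(deque)

    def add(self, timestamp: datetime, host: str, malware: str) -> bool:
        """Record one alert; return True if the threshold is reached."""
        events = self._events[malware]
        events.append((timestamp, host))
        # Drop alerts that have fallen out of the time window.
        while events and timestamp - events[0][0] > self.window:
            events.popleft()
        distinct_hosts = {h for _, h in events}
        return len(distinct_hosts) >= self.threshold
```

Counting distinct hosts rather than raw messages keeps a single noisy machine from triggering an escalation on its own; whether that is the right rule depends on the environment.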
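The precautionary rule for the first classification can be sketched in a few lines. The three levels and the choice to combine likelihood and damage by taking the higher of the two are assumptions made for illustration; the only point taken from the text is that an uncertain estimate is treated as the highest level.

```python
from enum import IntEnum
from typing import Optional


class Level(IntEnum):
    LOW = 1
    MEDIUM = 2
    ACUTE = 3


def initial_classification(likelihood: Optional[Level],
                           damage: Optional[Level]) -> Level:
    """First, deliberately conservative classification of an incident.

    None means "not yet known" and is treated as ACUTE (when in doubt,
    operate on the higher threat level). Taking the maximum of both
    estimates is an assumption of this sketch.
    """
    likelihood = Level.ACUTE if likelihood is None else likelihood
    damage = Level.ACUTE if damage is None else damage
    return max(likelihood, damage)
```

The later detailed analysis may lower the level again; this function only covers the first assessment.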
As soon as the immediate negative effects have been tackled, the incident can be thoroughly analyzed. The goal is to find out the following:
From this moment on, documentation is done not just by way of Incident Management but also in the form of Lessons Learned. The system administrators are responsible for the completeness of the documentation.
To resolve the incident permanently, the solution chosen in Step 4 is carried out. The main goals are to prevent the incident from ever occurring again and to restore nominal operating conditions.
The findings made while combating the threat are to be analyzed in a Lessons Learned process. These lessons belong in the documentation of the incident and are to be relayed to everyone involved in managing the incident.
The tools that can be used to help deal with the threat depend largely on your company’s systems and on the malware. Nevertheless, each relevant division of the company should have clearly defined checklists and toolboxes that can be deployed in case of an incident and worked through systematically. It is up to the system administrators to define these tools and keep them up to date.
Based on the aforementioned advice, companies should define checklists and other clearly defined actions that are custom-tailored to each division and each subject. These lists and factsheets have to be short and lay out a specific path to follow so that they can be used in a crisis. Ideally, these lists are minimalistic step-by-step instructions consisting of text and, above all, visual aids.
These lists and factsheets are to be checked regularly and kept up to date. The emergency protocols should also be reviewed whenever there has been a significant change in the system topology, and revised protocols should themselves go through a review process.
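A regular review can be supported by a simple check of which checklists are due. In the sketch below, the `Checklist` type, the function name and the 180-day review interval are hypothetical; the article does not prescribe an interval. A checklist is flagged either because it is older than the interval or because it was last reviewed before the most recent significant change to the system.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class Checklist:
    name: str
    last_reviewed: date


def due_for_review(checklists: list[Checklist],
                   today: date,
                   max_age: timedelta = timedelta(days=180),  # assumed interval
                   system_changed_on: Optional[date] = None) -> list[str]:
    """Return the names of checklists that should be reviewed now."""
    due = []
    for checklist in checklists:
        too_old = today - checklist.last_reviewed > max_age
        predates_change = (system_changed_on is not None
                           and checklist.last_reviewed < system_changed_on)
        if too_old or predates_change:
            due.append(checklist.name)
    return due
```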
Each affected division should have a toolbox to handle an incident.
| Phase | Action | Details/Aids |
|---|---|---|
| 1. Recognize and Categorize | 1. Recognize | Malware consoles, AV, anomalies. Classifications. |
| | 2. Classification | |
| | 3. Categorisation | |
| 2. Alarm | 1. Define recipients | Pre-defined points of contact, list of responsible people. Team leaders, teams, security officer, business owner. Use synchronous media (phone, face-to-face conversation) |
| | 2. Alarm | |
| | 3. Get confirmation | |
| 3. Act immediately and isolate | 1. Assess urgency | Pre-defined points of contact, list of responsible people. Team leaders, teams, security officer, business owner. Synchronize knowledge, explain further plan of action |
| | 2. Are immediate measures necessary? | |
| | 3. Is the spread stopped? | |
| 4. Analyze and implement solutions | 1. Check if status quo has been restored | Check if status quo from before the incident has been restored. Obtain confirmation from business owners and operations. Document all phases in case management files |
| | 2. Ensure nominal operational status of operations | |
| | 3. Update case | |
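
A checklist like the one in the table can also be kept in machine-readable form so that a team always sees the next open step. The following is a minimal sketch: the phase and action names are taken from the table, while the `Phase` class and the helper functions are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Phase:
    name: str
    actions: list[str]
    done: set[str] = field(default_factory=set)

    def complete(self, action: str) -> None:
        """Mark one action of this phase as completed."""
        if action not in self.actions:
            raise ValueError(f"unknown action for phase {self.name!r}: {action!r}")
        self.done.add(action)


def build_checklist() -> list[Phase]:
    """The four phases and their actions as listed in the table."""
    return [
        Phase("Recognize and Categorize",
              ["Recognize", "Classification", "Categorisation"]),
        Phase("Alarm",
              ["Define recipients", "Alarm", "Get confirmation"]),
        Phase("Act immediately and isolate",
              ["Assess urgency", "Are immediate measures necessary?",
               "Is the spread stopped?"]),
        Phase("Analyze and implement solutions",
              ["Check if status quo has been restored",
               "Ensure nominal operational status of operations",
               "Update case"]),
    ]


def next_open_action(checklist: list[Phase]) -> Optional[tuple[str, str]]:
    """Return (phase, action) of the first step not yet completed."""
    for phase in checklist:
        for action in phase.actions:
            if action not in phase.done:
                return phase.name, action
    return None  # every step has been completed
```

Working through the list strictly in order mirrors the step-by-step character of the checklist; a real tool would probably also record who completed each step and when.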
It is vitally important not only to engage actively with the technology surrounding malware and anti-virus solutions, but also to define clear structures to be used in case of an incident. In particular, the illusion of safety should be avoided: extensive anti-malware software deployed in the system does not guarantee safety, nor does the fact that the classic subject of viruses is no longer that prevalent in the media. It is not enough to wish away things that you do not want to deal with. It is prudent to plan for possible scenarios. Such plans offer systematic points of reference and help the crisis management crew to orient itself in case of an incident.