For years it has been clear that internal employees pose a significant data-breach risk. Data loss prevention (DLP) tools offer one approach to dealing with this problem, and one of their biggest selling points is the ability to prevent data leakage automatically through blocking rules. It sounds almost too good to be true: implement a DLP tool, activate the predefined rules, observe behavior for a few weeks in monitoring mode, then switch to blocking mode, and you have an effective risk management solution. Of course, it is not that simple. In reality, few enterprise-level companies are willing to take the risk of deploying blocking mode even at a rudimentary level.
Why not? Simply put, because DLP rules are imprecise and return too many false positives (FPs). In blocking mode, these FPs would impede key business processes or render them inoperative altogether.
The reasons range from fundamentally incorrect assumptions about how DLP solutions are supposed to be used through to highly complex protected objects that cannot be described with a simple DLP rule.
The use cases DLP solutions are designed for, minimizing the risk of data leakage, cover all kinds of data. In addition to the false assumptions just described, the complexity of the protected objects themselves is often behind the high number of false positives. The problem can be illustrated with customer data.
The easiest objects to protect are identification numbers, such as account numbers. IDs, particularly those with more characters than a telephone number, are easy to implement (via indexing or regular-expression rules). Few or no FPs can be expected for these types of IDs.
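The pattern-based case can be sketched with a regular expression. The 16-digit account-number format below is purely illustrative and not tied to any specific DLP product:

```python
import re

# Hypothetical account-number format: 16 digits in groups of 4,
# optionally separated by a hyphen or space. Illustrative only.
ACCOUNT_NUMBER = re.compile(r"\b\d{4}[- ]?\d{4}[- ]?\d{4}[- ]?\d{4}\b")

def scan(text: str) -> list[str]:
    """Return all substrings that look like account numbers."""
    return ACCOUNT_NUMBER.findall(text)

print(scan("Please charge account 1234-5678-9012-3456 today."))
```

Because such an ID is long and rigidly structured, ordinary prose almost never matches it by accident, which is why FP rates for these rules stay negligible.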
The problem lies with data that does not follow a fixed pattern, such as free-text fields like names or addresses. These fields are inherently susceptible to FPs, because practically anything can occur in them.
If, for instance, you want to represent your customer base in a DLP rule, you would most likely index the data from the customer database. However, this raises a whole set of problems. Even in the rather unusual scenario in which the customer database has been entirely cleaned up and validated, there are still plenty of sticking points in the various free-text attributes.
It is very likely that a process will have to be defined to keep unvalidated data out of the DLP rules. A single undesired attribute in a DLP rule can unleash a storm of FPs.
Even with data quality under control, free-text fields still have idiosyncrasies that drive up FP rates. For example, "An" is a popular Asian first name, but it is also a commonly used preposition in German. Do we exclude customers named "An", or accept the high number of FPs triggered by the preposition?
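The ambiguity can be reproduced with a naive keyword rule built from customer first names. The names and the German sentence below are illustrative:

```python
import re

# Illustrative: a keyword rule generated from indexed customer first names.
customer_names = {"An", "Maria", "Keiko"}
pattern = re.compile(r"\b(" + "|".join(sorted(customer_names)) + r")\b",
                     re.IGNORECASE)

# "An alle Mitarbeiter" means "To all employees": "an" here is a preposition.
german_sentence = "An alle Mitarbeiter: bitte die Anlage beachten."
matches = pattern.findall(german_sentence)
# The preposition is indistinguishable from the customer name "An",
# so this perfectly ordinary sentence produces a false positive.
print(matches)
```

No amount of regex polishing fixes this: the token itself is ambiguous, so the choice is between excluding the name and accepting the FPs.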
Ultimately, there is no way around excluding certain data from the DLP rules, especially when thousands or hundreds of thousands of records must be covered. But every exclusion that minimizes the FP rate also reduces the coverage of the DLP rule. If 5% or 10% of the initial data has to be excluded from the customer database, this must be documented and accepted as residual risk.
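The trade-off is simple arithmetic, and making it explicit is what turns an exclusion list into a documented residual risk. The figures below are illustrative:

```python
# Sketch of the coverage trade-off: every record excluded to cut FPs
# directly lowers the share of the customer base the rule protects.
total_records = 100_000
excluded = 8_000   # e.g. ambiguous names, unvalidated entries (illustrative)

coverage = (total_records - excluded) / total_records
residual_risk = excluded / total_records
print(f"coverage: {coverage:.1%}, residual risk: {residual_risk:.1%}")
# coverage: 92.0%, residual risk: 8.0%
```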
If you have reached this point, the tools bundled with DLP solutions are unlikely to be much help. The DLP setup is, of course, highly specific to the particular company, and each company must handle this complexity on its own.
Alongside the technical DLP implementation, an administrative body must be established to deal specifically with DLP rules. This DLP policy management is designed to create transparency about the quality and status of the DLP rules, and to improve that quality continuously.
DLP policy management should sit between the business and IT departments. One possible setup is a two-person team: one member from the business side (risk) and one from IT (the DLP rule author).
It is important to understand that while DLP rules can be defined during an initial implementation project, they usually start with very high FP rates. The crucial tuning (minimizing FP rates while increasing coverage) can only succeed over time. Policy management therefore ensures that DLP rules can be improved iteratively during operation, even after the project is complete.
So how is DLP policy management supposed to work? Fundamentally, only an iterative process can be effective when protecting large volumes of data: too many factors affect the quality of DLP rules, and most of them are still unknown during the initial DLP project. The following concepts can therefore be effective.
The simplest representation of a DLP rule is its configuration in the DLP solution, but this reveals nothing about the rule's quality. To gain an end-to-end view of a DLP rule, at least the following factors should be documented:
To address the iterative aspect, version management needs to be introduced.
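One minimal way to sketch such end-to-end documentation with version history is a small data model. The field names below are assumptions for illustration, not taken from any DLP product:

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass(frozen=True)
class RuleVersion:
    version: int
    changed_on: date
    change_note: str
    fp_rate: float      # measured false-positive rate for this version
    coverage: float     # share of protected records the rule covers

@dataclass
class DlpRule:
    name: str
    owner_business: str     # risk owner on the business side
    owner_it: str           # rule author on the IT side
    versions: list[RuleVersion] = field(default_factory=list)

    def release(self, note: str, fp_rate: float, coverage: float) -> RuleVersion:
        """Record a new immutable version of the rule."""
        v = RuleVersion(len(self.versions) + 1, date.today(),
                        note, fp_rate, coverage)
        self.versions.append(v)
        return v

rule = DlpRule("customer-names", "Risk Office", "DLP Team")
rule.release("initial index of customer DB", fp_rate=0.34, coverage=1.00)
rule.release("excluded ambiguous first names", fp_rate=0.05, coverage=0.92)
```

Keeping versions immutable makes the tuning history auditable: each release records who changed what, and how FP rate and coverage moved as a result.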
If DLP rules are managed end to end throughout their lifecycle, reporting is the easy part: the information is self-explanatory and does not need to be interpreted, provided data collection is centralized and automated. Standard DLP solutions, however, offer few tools for this, and you are often on your own. For this reason, the reporting features should be evaluated closely during the DLP implementation project, with the requirements of DLP policy management in mind.
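Centralized, automated collection can be as simple as computing FP rates per rule from analyst dispositions. The sketch below assumes incidents are exported as (rule, disposition) pairs; names and data are illustrative:

```python
from collections import Counter

# Hypothetical incident export: (rule name, analyst disposition).
incidents = [
    ("customer-names", "false_positive"),
    ("customer-names", "confirmed"),
    ("customer-names", "false_positive"),
    ("account-numbers", "confirmed"),
]

def fp_rate(rule: str) -> float:
    """Share of a rule's incidents that analysts marked as false positives."""
    counts = Counter(d for r, d in incidents if r == rule)
    total = sum(counts.values())
    return counts["false_positive"] / total if total else 0.0

print(fp_rate("customer-names"))  # 2 of 3 incidents were FPs
```

Fed by an automated export rather than manual tallies, a metric like this needs no interpretation, which is exactly the property the reporting should have.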
DLP projects should be business-driven. IT departments are service providers that implement the requirements of the business. Because the detailed requirements for a DLP solution cannot be specified by IT in the way they can for an anti-virus solution, this input must come clearly from the business side.
DLP solutions can be very expensive and yet have only a minimal impact on reducing the risk of data leakage. It is tempting to install a tool with a blocking mode and assume that risks have been effectively minimized; this problem is particularly common at the enterprise level. But if a DLP solution is understood as a control for already-existing measures against data leakage, it serves as an excellent broken-process detector.
Yet inconsistencies in data management (where DLP at least identifies the first broken processes) and the properties of the many different kinds of data requiring protection make creating and maintaining effective DLP rules complex. In addition, the tools included with DLP solutions are usually inadequate. For this reason, clearly structured policy management should be established as an operational standard from the start, and particular attention should be paid to reporting functions when evaluating a DLP solution. If the budget does not provide for policy management, the result is usually stagnation concealed by misleading reporting (which increases risk!), or a new DLP solution is considered to straighten things out again, which is expensive and of limited benefit.
Those who can implement DLP policy management from the outset should do so conscientiously. Those still struggling with an onslaught of FPs or unsatisfactory coverage should also consider introducing a DLP policy management solution – even if this may require taking a few steps backwards.