Anti-fraud in online services – recognizing fraudulent anomalies


by Marc Ruef
on June 15, 2017
time to read: 13 minutes

Banks, credit card issuers and online traders are all confronted with the problem of fraud. This article discusses how malicious activities can be identified and responded to, based on the behavior of fraudsters and their access attempts.

The normal user

It is possible to say that there is such a thing as a normal user. This concept covers all users whose actions are normal, or at least reasonably so. For example, the largest proportion of Facebook users are between 25 and 34 years old (29.7%), female (76%) and use the site for an average of 20 minutes. Combining these three qualities yields the normal Facebook user.

Such statistics can be produced for all offers, services and applications. The results from different platforms are broadly similar in some cases, while in others there are clear sociodemographic and/or technical deviations.

To detect fraudulent behavior, we first need to identify normal behavior. Examples at a technical level include the geolocation of the client IP address, the web browser used (user agent), and the typical times and durations of sessions.

Deviations from the determined norm can then be considered as possible indications of fraudulent behavior. An initial implementation should be based on the average user; professional solutions in the finance sector compare against the normal behavior of the specific user, of users in the same zip code area, and of users in the same geographic language area. Other groupings (e.g. the chosen web browser) are also conceivable.
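Such a comparison against a peer group can be sketched in a few lines. The following Python snippet is an illustration only (the session durations and the three-sigma cutoff are invented assumptions, not part of any production solution); it flags a user whose session length deviates strongly from the average of comparable users:

```python
from statistics import mean, stdev

# Hypothetical sketch: measure how far a user's value lies from the
# peer-group average, expressed in standard deviations.
def deviation_in_sigmas(value: float, peer_values: list[float]) -> float:
    mu, sigma = mean(peer_values), stdev(peer_values)
    return abs(value - mu) / sigma if sigma else 0.0

# Session minutes of comparable users (e.g. same zip code area) -- made up.
peers = [18.0, 22.0, 20.0, 19.0, 21.0]

print(deviation_in_sigmas(55.0, peers) > 3)  # True: a clear outlier
```

The same pattern works for any numeric attribute for which a peer-group baseline can be computed.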

Penalty points for deviations

The recommended approach does not envisage that all deviations should immediately be seen as fraudulent behavior and sanctioned. Rather, evidence should be compiled by combining various indications that allow an objective decision to be reached.

To this end, penalty points should be defined for the individual deviations, which are listed and divided into various categories of severity. It is not necessary to assign 51 points to one attack and 59 points to a different one; it is best to work with single-digit values, in this case 5 and 6. This simplifies the system enormously and keeps it easy to survey.

To avoid false positives, an increase in the severity of an infringement should result in disproportionate increments. Exponential increases in the form of 1, 2, 4, 8 and 16 have proved successful. This makes it much more difficult for a customer to inadvertently score the maximum number of points than if a scale of 1, 2, 3, 4, 5 were employed.

The scale should offer three, four, five or six increments. Fewer than three limits the options for differentiation, while more has a negative impact on complexity and traceability.
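The exponential scale described above is straightforward to compute. The following Python sketch (the five-level scale is an assumption taken from the 1, 2, 4, 8, 16 example) maps a severity category to its penalty points:

```python
# Sketch: map a severity category (1..levels) to exponentially growing
# penalty points, as in the 1, 2, 4, 8, 16 scale described above.
def penalty_for_severity(severity: int, levels: int = 5) -> int:
    if not 1 <= severity <= levels:
        raise ValueError("severity out of range")
    return 2 ** (severity - 1)

print([penalty_for_severity(s) for s in range(1, 6)])  # [1, 2, 4, 8, 16]
```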

The metric should always be customized and be developed iteratively when devising the model.

Example of a Swiss online retailer

The following example shows how penalty points for various deviations (among them geolocation) can be determined for a Swiss online retailer that does not ship internationally:

ID  Deviation                                                                  Penalty points

Client IP address
 1  IP address outside Switzerland                                             1
 2  IP address outside the German-speaking countries                           2
 3  IP address outside the EU                                                  4
 4  IP address outside Europe                                                  6
 5  IP address in a country with a high cybercrime rate (e.g. Russia, China)   8
 6  IP address of a known VPN service or Tor exit node                         8

Web browser
 7  User agent not an up-to-date browser                                       4
 8  Empty user agent string                                                    8

Dispatch address
 9  Dispatch to known users with outstanding accounts                          2
10  Dispatch to known users with known debt enforcement proceedings            4
11  Dispatch to new users without previous orders                              4
12  Dispatch to non-existent addresses                                         6
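A table like the one above lends itself to a data-driven implementation. The following Python sketch (the rule names are hypothetical) expresses the deviations as a lookup table and scores a request by summing the points of all matching entries:

```python
# Hypothetical rule names mapped to the penalty points from the table above.
RULES = {
    "ip_outside_switzerland": 1,
    "ip_outside_german_speaking": 2,
    "ip_outside_eu": 4,
    "ip_outside_europe": 6,
    "ip_high_cybercrime_country": 8,
    "ip_vpn_or_tor": 8,
    "ua_outdated_browser": 4,
    "ua_empty": 8,
    "ship_outstanding_accounts": 2,
    "ship_debt_enforcement": 4,
    "ship_new_user": 4,
    "ship_nonexistent_address": 6,
}

def score(deviations):
    # Unknown deviation names contribute 0 rather than raising an error.
    return sum(RULES.get(d, 0) for d in deviations)

print(score(["ip_vpn_or_tor", "ua_empty", "ship_nonexistent_address"]))  # 22
```

Keeping the rules as data rather than code makes it easy to reassign penalty points later, as the iterative development of the metric demands.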

Time-based deviations

A particular class of deviations can be observed on the time axis. There are pauses between activities that are typical of normal users. For example, it takes x seconds for the corresponding credentials to be entered after the login page has been loaded. If this pause is particularly short or long, this may be an indication of suspicious behavior. This approach makes it easy to identify automations (e.g. brute-force attacks). An attacker would then have to introduce random but realistic delays to avoid detection.

Time-based deviations are always based on the time period between two different actions. These actions do not need to occur in direct succession. Entering credentials and submitting a login form are generally performed in succession. An alternative example is online banking, where it is also possible to monitor the time period between entry of the user name and the first executed transaction. Deviations there may indicate that accounts are being raided in a targeted manner.
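A minimal check for such time-based deviations might look as follows in Python; the 2–120 second window for entering credentials is purely an illustrative assumption:

```python
# Assumed plausibility window for the pause between loading the login page
# and submitting credentials (illustrative values, not empirical).
MIN_SECONDS, MAX_SECONDS = 2.0, 120.0

def is_timing_suspicious(page_loaded_at: float, submitted_at: float) -> bool:
    """Flag pauses that are implausibly short (automation) or long."""
    elapsed = submitted_at - page_loaded_at
    return not (MIN_SECONDS <= elapsed <= MAX_SECONDS)

print(is_timing_suspicious(1000.0, 1000.4))  # True  (too fast, likely a bot)
print(is_timing_suspicious(1000.0, 1012.0))  # False (plausible human pause)
```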

As with all deviations, time-based deviations should also attract penalty points, following the same pattern as the evaluation of the original actions. What counts as an 'action' here, however, is the temporal sequence of two actions.

Thresholds for actions

An action can be triggered when a predefined threshold is reached. Such actions cover a whole range of passive or active measures, for example a warning message, advanced logging, delayed processing, additional authentication, or the complete disabling of access.

It is therefore necessary to define which action should be triggered when a certain number of penalty points has been accumulated. The difficulty here is finding a happy medium. False negatives mean that no illegitimate accesses were identified or that they were not responded to. False positives mean that restrictions have been applied too rigorously and that legitimate access has therefore been unnecessarily made more difficult. Both must be prevented.

Example of incorrect logins

An easily comprehensible example shows what happens in the event of incorrect login attempts. For simplicity’s sake, we have worked on the assumption that any incorrect login attempt increases the penalty points by 1.

Increment  Threshold  Action
1          3          Account holder is notified of suspicious activity and offered a password reset in case it has been forgotten.
2          5          Data processing is delayed by 1–3 seconds to slow down brute-force attacks considerably.
3          10         Login is no longer possible. The user is not notified of this, so brute-force attempts run in vain; even the correct login data would no longer succeed.
4          25         The source IP address is locked for all login attempts on all user accounts for 15 minutes, to limit wide-scale brute-force attacks.

Here we can see how an inner-to-outer approach has been selected: the penalty is initially defined as narrowly as possible (only specific information) and then gradually expanded (slower processing followed by restrictions). This allows collateral damage to be prevented, or at least delayed.

Approach to determine increments

The aspect of this approach that causes most customers the greatest difficulty is the creation of a metric with suitable increments. It seems simple enough for incorrect login attempts, but it becomes more complicated when monitoring transactions after a successful login.

To understand this better, we need to work through scenarios, and the points generated in these scenarios need to be sanctioned accordingly. For example, suppose a fraudster has taken over an account to send an order to a non-existent address (6 points) and has secured and automated access via VPN (8 points), with no accompanying user-agent string being sent (8 points); this results in 22 points. Based on the thresholds defined below, this would lead to a restriction of accessibility:

Threshold  Action
 4         Warning/error message
 6         Advanced logging
12         Processing delayed
16         Particular bodies informed
16         Additional identification required
18         Additional authentication required
20         Manual check required
22         Accessibility restricted
25         All accesses disabled
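The threshold table can likewise be expressed as data. The following Python sketch returns every action whose threshold has been reached by a given score, so that the 22-point scenario above triggers everything up to the restriction of accessibility:

```python
# The threshold table above as sorted (threshold, action) pairs.
THRESHOLDS = [
    (4, "warning/error message"),
    (6, "advanced logging"),
    (12, "processing delayed"),
    (16, "particular bodies informed"),
    (16, "additional identification required"),
    (18, "additional authentication required"),
    (20, "manual check required"),
    (22, "accessibility restricted"),
    (25, "all accesses disabled"),
]

def triggered_actions(points: int):
    """All actions whose threshold has been reached by the given score."""
    return [action for threshold, action in THRESHOLDS if points >= threshold]

print(triggered_actions(22)[-1])  # accessibility restricted
```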

When discussing the scenarios, it may become apparent that some violations are not assigned penalty points clearly enough. In other words, minor infringements are not recognized at a sufficiently early stage or cannot be distinguished from gross violations. Should this be the case, the underlying penalty points will need to be reassigned. Developing a sustainable and reliable model requires much effort and discussion.

Example of an implementation

In a client/server-based solution, the security mechanisms must be implemented on the server to prevent manipulation. The points at which penalty points can be assigned must be defined in the application logic. In this example, a penalty point is added for each incorrect login attempt.

// Verify the submitted password against the stored hash
if(password_verify($_POST['password'], $row['password'])){
   $_SESSION['penalty'] = 0; // reset the score after a successful login
   echo 'You are authenticated!';
}else{
   // Increase the penalty score for every failed attempt
   $_SESSION['penalty'] = ($_SESSION['penalty'] ?? 0) + 1;
   echo 'Authentication failed.';
}

In a further step, the application should check as early as possible whether a penalty threshold has already been reached. Note that the thresholds must be tested from the highest downwards, otherwise the first, lowest branch would always match and the stricter actions would never be triggered:

// Check the strictest threshold first
if($_SESSION['penalty'] >= 10){
   disablelogin();
}elseif($_SESSION['penalty'] >= 5){
   slowdown();
}elseif($_SESSION['penalty'] >= 3){
   mailuser();
}

An intelligent implementation prevents the execution of unnecessary code (by checking early in the application logic) and relies on sensible thresholds and sustained actions.

Working with simulations

The challenge is to develop a reliable model. Doing so requires a comprehensive evaluation of what constitutes legitimate behavior and may require several months of user accesses to be evaluated and bundled. The large amount of data involved makes this task complex and time-consuming.

Nevertheless, or perhaps precisely for that reason, we recommend a simulation-based approach. The model is provided as a simulation into which the data gathered to date can be incorporated. Parameterizing the events, thresholds and actions allows statistical statements to be made about the reliability (false positives and false negatives) of individual settings.
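Such a simulation can be prototyped quickly. The following Python sketch (the distributions and all numbers are invented assumptions, not empirical values) scores synthetic legitimate and fraudulent sessions and reports the false-positive and false-negative rates for several candidate thresholds:

```python
import random

random.seed(42)  # reproducible synthetic data

def simulate(threshold: int, n: int = 10_000):
    """Estimate FP/FN rates for a blocking threshold on synthetic scores."""
    fp = fn = 0
    for _ in range(n):
        # Assumption: legitimate users accumulate few points, fraudsters many.
        legit_score = random.gauss(3, 2)
        fraud_score = random.gauss(20, 5)
        if legit_score >= threshold:
            fp += 1  # legitimate access would be sanctioned
        if fraud_score < threshold:
            fn += 1  # fraud would go unsanctioned
    return fp / n, fn / n

for threshold in (8, 12, 16):
    fp_rate, fn_rate = simulate(threshold)
    print(f"threshold={threshold}: FP={fp_rate:.2%} FN={fn_rate:.2%}")
```

Replacing the synthetic distributions with the access data gathered to date turns this toy loop into the kind of parameterized simulation described above.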

It is never possible to achieve a perfect model, and a certain degree of deviation from the optimum must always be reckoned with. The simulation helps in identifying and accepting this deviation. Despite these measures, perhaps 2% of fraud cases cannot be detected and prevented. Nonetheless, catching the remaining 98% represents a substantial improvement.

Summary

Fraud in online services is not a new phenomenon. If anything, it has become much more professional. It is essential that it is tackled in order to ensure that certain services remain reliable and economical.

It therefore makes sense to employ anomaly detection. To be able to identify and correctly evaluate deviations, it is first necessary to determine what constitutes normal behavior. Assigning penalty points means that predefined thresholds are eventually reached, which then result in specific actions being triggered. This can make attacks more difficult or allow them to be completely repelled.

A simulation makes it easier to develop an optimal model. It is important to remember that it is generally only possible to get close to the optimum, rather than actually reach it. However, there is always room for improvement and this should therefore be the objective.

About the Author

Marc Ruef

Marc Ruef has been working in information security since the late 1990s. He is well-known for his many publications and books. His latest, The Art of Penetration Testing, discusses security testing in detail. He is a lecturer at several universities, including ETH, HWZ, HSLU and IKF. (ORCID 0000-0002-1328-6357)
