Online Tracking - How does Tracking on the Internet work?

by Ralph Meier
on March 25, 2021

Key Points

  • The most common tracking method is the use of cookies
  • Because of the GDPR and the blocking of third-party cookies, trackers increasingly rely on other technologies
  • How a fingerprint of a browser is created
  • Tips to protect yourself from tracking

“After all, I have nothing to hide, online tracking does not affect me, right?” Statements like this prompted this article and the question behind it: How does tracking on the Internet actually work?

Tracking on the Internet takes several forms. The following sections present and explain some of them; the list is not exhaustive.

Tracking through Cookies

The most common and widespread method of tracking is the use of cookies. Cookies are small pieces of text data that a website stores in the user’s browser. They allow returning users on the same device and browser to use an online store or a forum without logging in again. Cookies can also be used to track and log interactions and movements on a website. With ordinary (first-party) cookies, visitor interactions can only be tracked on the website that set them.
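As a minimal sketch of this mechanism, the following Python snippet uses the standard library module http.cookies to build the Set-Cookie header a site might send after a login, and to parse the Cookie header that the browser sends back on a later visit. The cookie name session_id, the token length and the one-week lifetime are assumptions chosen for illustration, not taken from any particular website.

```python
import secrets
from http.cookies import SimpleCookie
from typing import Optional


def issue_session_cookie() -> str:
    """Return the value of a Set-Cookie header for a new session."""
    cookie = SimpleCookie()
    cookie["session_id"] = secrets.token_hex(16)  # hypothetical cookie name
    cookie["session_id"]["path"] = "/"
    cookie["session_id"]["max-age"] = 7 * 24 * 3600  # assumed lifetime: one week
    cookie["session_id"]["httponly"] = True  # not readable from JavaScript
    cookie["session_id"]["secure"] = True  # only sent over HTTPS
    return cookie["session_id"].OutputString()


def read_session_id(cookie_header: str) -> Optional[str]:
    """Extract the session ID from the Cookie header of a later request."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get("session_id")
    return morsel.value if morsel is not None else None
```

On a later visit, the server would call read_session_id() on the incoming Cookie header and look up the returned ID in its session store. If the function returns None, the visitor is treated as new or logged out.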

Tracking Provider

Often, this is not enough for online store operators, so they turn to a tracking provider. Tracking providers work with embedded or linked scripts and components such as images, tracking pixels, web pages and fonts, to name a few. These objects are loaded when a web page is accessed and trigger requests to the provider’s tracking servers.

A well-known example is Facebook’s like button. It informs Facebook about every visit to a web page in which the button is embedded. If the cookies created earlier have not been removed, Facebook can also link the visit to a previously used Facebook account. Otherwise, the visit is linked to an advertising ID.

In addition to the like button, Facebook has other tracking methods such as the newer Facebook Pixel. When a purchase is made in an online store, it is used to track which ads the buyer has seen and which ad led them to the website. The events that the Facebook Pixel reacts to and logs can be customized. Facebook uses it to give online store operators a better overview of their current ad campaigns and to sell more ads on its platform.

Tracking providers store the information obtained about a user in a cookie in that user’s browser. Such cookies are often referred to as tracking cookies or third-party cookies, as they do not originate from the visited website itself but from third-party providers. Tracking cookies often store user characteristics such as the IP address and information about the browser or device used.
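How a third-party cookie lets a provider link visits across unrelated websites can be illustrated with a toy model in Python. It is only a sketch under simplifying assumptions: the class TrackingServer, the cookie name uid and the example URLs are hypothetical, and the cookie here holds nothing but a random ID while the visit history is kept on the server side.

```python
import secrets
from collections import defaultdict
from typing import Dict, List, Optional, Tuple


class TrackingServer:
    """Toy model of a third-party tracker that serves a tracking pixel."""

    def __init__(self) -> None:
        # Maps a visitor ID to the list of pages on which the pixel was loaded.
        self.profiles: Dict[str, List[str]] = defaultdict(list)

    def handle_pixel_request(
        self, referer: str, uid_cookie: Optional[str]
    ) -> Tuple[str, Optional[str]]:
        """Log one pixel request.

        Returns the visitor ID and, for a new visitor, the Set-Cookie header
        value that the tracker would send back.
        """
        set_cookie = None
        if uid_cookie is None:
            # First contact: assign a random ID and ask the browser to store it.
            uid_cookie = secrets.token_hex(8)
            set_cookie = f"uid={uid_cookie}; Max-Age=31536000; Secure; SameSite=None"
        # The Referer header reveals which page embedded the pixel.
        self.profiles[uid_cookie].append(referer)
        return uid_cookie, set_cookie


tracker = TrackingServer()
# The visitor opens a shop page that embeds the pixel; no cookie exists yet.
uid, header = tracker.handle_pixel_request("https://shop.example/shoes", None)
# Later, a news site embeds the same pixel; the browser sends the cookie along.
tracker.handle_pixel_request("https://news.example/article", uid)
print(tracker.profiles[uid])
# ['https://shop.example/shoes', 'https://news.example/article']
```

Because both requests go to the same tracker domain, the browser attaches the same cookie each time, and the two visits end up in one profile even though the shop and the news site have nothing to do with each other.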

Blocking Third-Party Cookies

Internet browsers such as Safari or Firefox started blocking third-party cookies as a default setting a few years ago to counteract such tracking methods. Tracking providers then instructed customers to integrate their tracking code directly into the website in order to be able to create first-party cookies, i.e. cookies in the context of the visited website.

Changes due to the GDPR

Since the European GDPR became applicable on May 25, 2018, tracking cookies, and cookies in general, have become much more visible through the now-common cookie banners on websites. Since then, storing tracking cookies without the visitor’s consent is no longer allowed. Despite the threat of penalties for non-compliance, cookies are sometimes stored in visitors’ browsers even before they have selected an option. It also happens that a visitor selects “allow only necessary cookies”, but the website records “allow all cookies” and sets all cookies in the visitor’s browser.

Fingerprinting

Browser fingerprinting is about creating a fingerprint of the browser in use. This can be done by combining information available to the browser, such as screen resolution, installed fonts or installed plugins.

However, there are also more sophisticated variants such as canvas fingerprinting. It requires JavaScript to be enabled in the visitor’s browser. The visited web page contains code for generating a fingerprint, which first creates a canvas element. Canvas elements are created with the HTML tag <canvas> and allow shapes such as rectangles and circles to be drawn in any size, color and combination.

Various 2D graphics and texts are then rendered onto this canvas element. For the text, a pangram is usually used, that is, a sentence that contains all the letters of the alphabet. Different fonts and font sizes are used to achieve higher entropy (information content). The fingerprinting code then calls the toDataURL() method, which returns the content of the canvas element as a string containing a Base64-encoded data URL. Due to differences in the OpenGL version, the browser’s rendering engine and the installed fonts, the resulting string differs enough to distinguish a visitor’s browser from those of other visitors. To further increase the information content, additional information such as the configured time zone, the operating system and the attributes mentioned above is added to the fingerprint. The result is a long, unique string that is hashed and then sent via HTTP request or stored in a cookie so that the browser can be recognized on other websites.
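The final step, combining the collected attributes and hashing them, can be sketched in Python. The canvas rendering itself happens in the browser via JavaScript, so the toDataURL() result appears here only as a placeholder string. The attribute names, the example values and the choice of SHA-256 are assumptions for illustration; the text above only states that the combined string is hashed.

```python
import hashlib
import json
from typing import Mapping


def fingerprint(attributes: Mapping[str, str]) -> str:
    """Hash a set of browser attributes into a fixed-length identifier."""
    # Serialize with sorted keys so the same attributes always give the same hash.
    canonical = json.dumps(dict(attributes), sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical values; in a real script they would be collected in the browser.
browser_a = {
    "canvas": "data:image/png;base64,iVBORw0KGgo...",  # placeholder for toDataURL() output
    "timezone": "Europe/Zurich",
    "platform": "Linux x86_64",
    "screen": "1920x1080",
}
browser_b = dict(browser_a, screen="2560x1440")

print(fingerprint(browser_a))
print(fingerprint(browser_a) == fingerprint(browser_b))  # False: one attribute differs
```

A single differing attribute yields a completely different hash, which is why adding more attributes makes two browsers less likely to collide. The downside for the tracker is that any change, such as a new screen or a browser update that alters rendering, also changes the fingerprint.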

Favicon Method

This tracking method is related to fingerprinting but relies on favicons. Each time a web page is loaded, the browser checks whether the favicon for this page is already in its cache or needs to be downloaded. The paper “Tales of Favicons and Caches: Persistent Tracking in Modern Browsers” showed that a device can be fingerprinted using different favicons. The requested page redirects the browser to n other web pages, each with a different favicon. For each of these pages, it is determined whether the favicon is already in the cache. From this pattern, a fingerprint is created that can be used to recognize the device. The browser manufacturer Brave has already fixed this problem and allows stored favicons to be deleted manually. In Firefox, favicons were never cached at all. At the time of writing, Safari, Google Chrome and Microsoft Edge had not yet fixed the problem.

It is not known whether this method has been used by tracking providers so far.
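The idea of turning cache hits and misses into an identifier can be sketched as a small Python simulation. It assumes that each of the n helper pages stands for one bit of an ID: on the first visit the browser is only sent to the pages whose bit is 1, and on later visits the pattern of cached favicons is read back. The constant N_BITS, the function names and the set that models the cache are hypothetical. A real server could not inspect the cache directly and would have to infer it from which favicon requests arrive.

```python
from typing import List, Set

N_BITS = 8  # assumed number of helper pages; more pages allow more distinct IDs


def pages_to_visit(visitor_id: int) -> List[int]:
    """Write phase: indices of the helper pages whose favicon should be cached.

    Page i is visited exactly when bit i of the ID is 1.
    """
    return [i for i in range(N_BITS) if (visitor_id >> i) & 1]


def recover_id(cached_pages: Set[int]) -> int:
    """Read phase: rebuild the ID from which favicons were found in the cache."""
    return sum(1 << i for i in range(N_BITS) if i in cached_pages)


# Simulated favicon cache of one browser (a set of helper-page indices).
favicon_cache: Set[int] = set()

# First visit: the browser is sent through the pages that encode ID 0b10110010 (178).
favicon_cache.update(pages_to_visit(0b10110010))

# Later visit: cached favicons are not requested again, which reveals the bits.
print(recover_id(favicon_cache))  # 178
```

With n helper pages, this scheme can distinguish up to 2^n cache patterns, so a few dozen pages would already be enough to give each browser its own pattern.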

What does Google know about me?

By default, Google displays personalized advertising in its own services and on websites and apps of Google advertising partners. For this purpose, the Google account or an advertising ID linked to a device or browser is used. All data generated while using Google services, i.e. activities and information, is saved in order to assign suitable categories to the user. Advertising is then displayed based on these categories. All Google account holders can view the categories assigned to them.

A snippet of selected assigned categories from my personal Google account:

[Figure: Extract from the assigned categories of my personal Google account]

I was particularly surprised by the Parent Status category. Clicking a category reveals more details, albeit very general ones, about why it was assigned. For Parent Status, the rationale was as follows:

Google estimates this demographic because your signed-in activity on Google services, and on other websites and apps, is similar to people who have told Google that they are in this category.

Personalized advertising can also be deactivated on the page that lists the categories. Your own activities at Google and the history in which they are recorded can be viewed and configured at myactivity.google.com. It is quite interesting to look at this information, as long as recording is still enabled. If you are curious and want to see all the data connected to your Google account, you can request it at takeout.google.com.

How can you mitigate tracking as a user?

More tips and recommendations on preventing tracking and generally maintaining privacy can be found at privacytools.io and ssd.eff.org.

Conclusion

The insights gained through the tracking methods described above are mostly used for personalized advertising. Some see an advantage in this: “At least then I will see what I am interested in.” On the other hand, the Cambridge Analytica scandal has shown that such personal data can be used to influence people on a large scale with targeted election advertising.

In the end, it is up to each individual how much information they want to reveal to which website or manufacturer. Maintaining privacy by circumventing tracking usually involves a certain amount of effort.

About the Author

Ralph Meier

Ralph Meier completed an apprenticeship as an application developer, with a focus on web development with Java, at a major Swiss bank and then completed a Bachelor of Science in Computer Science UAS Zurich at the ZHAW School of Engineering. His primary task is doing security-related analysis of web applications and services. (ORCID 0000-0002-3997-8482)
