Excel Forensics - Detecting Activities without Track Changes

by Marc Ruef
time to read: 6 minutes

Keypoints

  • Forensic investigations traditionally focus on the file system
  • Manipulation of Office documents may also be significant
  • The use of the Track Changes function is one preferred method
  • Analysis of the Excel format also reveals chronological changes to data
  • This makes it possible to detect and trace actions

Forensic investigation of manipulated files traditionally focuses on fragments in the file system. But modern file formats can also contain data that may offer insights into activities and users. During a customer-specific project, we discovered a new approach to this in the case of Excel documents.

The case

The customer created an Excel document in XLSX format. The file was stored on a file server, which meant it could be edited by multiple users.

One of these legitimate users illegitimately manipulated the file content. This caused damage, and we were asked to find out who was behind the malicious manipulation.

Forensic investigation

The customer provided us with the file for analysis. Since the malicious manipulation, there had been a number of legitimate modifications to the file content. In addition, the illegitimate manipulation had ultimately been reversed (overwritten). A situation of this sort is not an ideal starting point for a forensic investigation.

Traditional approach: Track Changes

The traditional approach is to check the data generated by the Track Changes function. This function can be enabled in Excel (and other Office products) to record each change made to the file content. All recorded changes can then be viewed, accepted or rejected.

The disadvantage of this approach is that this function must be deliberately enabled. It is not activated by default and can also be disabled by the user at a later stage.

In our case, Track Changes was not enabled, so we were unable to gather any useful data on this level.

Advanced alternatives: shared strings

Excel documents saved in XLSX format consist of XML files based on the Office Open XML spreadsheet format (SpreadsheetML), packed in a ZIP archive. If you change the file extension from .xlsx to .zip, the file can be unpacked in the usual way (using 7-Zip, for instance). The archive contains individual files and subdirectories.
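Renaming the file is not strictly necessary, since any ZIP library can open the container directly. The following is a minimal sketch using Python's standard zipfile module; the function name list_xlsx_parts and the file name in the usage note are illustrative assumptions, not part of the original investigation.

```python
import zipfile


def list_xlsx_parts(source):
    """Return the names of all files stored inside an XLSX (ZIP) container.

    `source` may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(source) as archive:
        # namelist() returns the member names in the order stored in the archive
        return archive.namelist()
```

Calling list_xlsx_parts("report.xlsx") on a hypothetical workbook would typically return entries such as [Content_Types].xml, xl/workbook.xml and xl/sharedStrings.xml.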

[Figure: Content of an XLSX file]

The XML documents are stored in the xl/ subdirectory, which contains a file called sharedStrings.xml. This file contains every text value that occurs at least once in a cell of the Excel document (numbers are stored directly in the worksheet). The worksheets under xl/worksheets/ do not repeat these strings; they reference them by their position in the list, so a value that occurs multiple times is stored only once. This referential usage saves resources, much like a relational database.
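To illustrate the referencing: in SpreadsheetML, a cell that uses a shared string carries the attribute t="s", and its value element holds the zero-based index into sharedStrings.xml. The sketch below extracts these references from a worksheet. The sample worksheet is hand-written for illustration and is not taken from the case file; the function name is an assumption.

```python
import xml.etree.ElementTree as ET

MAIN_NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"


def shared_string_references(sheet_xml):
    """Map cell addresses (e.g. "A1") to the shared-string index they reference."""
    root = ET.fromstring(sheet_xml)
    references = {}
    for cell in root.iter(f"{{{MAIN_NS}}}c"):
        # t="s" marks a cell whose <v> holds an index into sharedStrings.xml
        if cell.get("t") != "s":
            continue
        value = cell.find(f"{{{MAIN_NS}}}v")
        if value is not None and value.text:
            references[cell.get("r")] = int(value.text)
    return references


# Hand-written sample in the shape of xl/worksheets/sheet1.xml (illustrative only)
SAMPLE_SHEET = b"""<worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">
  <sheetData>
    <row r="1">
      <c r="A1" t="s"><v>0</v></c>
      <c r="B1" t="s"><v>1</v></c>
      <c r="C1"><v>42</v></c>
    </row>
    <row r="2">
      <c r="A2" t="s"><v>0</v></c>
    </row>
  </sheetData>
</worksheet>"""

print(shared_string_references(SAMPLE_SHEET))
# {'A1': 0, 'B1': 1, 'A2': 0}
```

Cells A1 and A2 point to the same entry (index 0), while the numeric cell C1 does not reference the shared strings at all.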

What is interesting about this file is that the shared strings it contains are recorded in chronological order. So if the value Test 1 is entered into a cell and then Test 2 in another, both of these values will be stored in precisely this order.

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<sst xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" count="3" uniqueCount="3"><si><t>The first entry</t></si><si><t>The second entry</t></si><si><t>The third entry</t></si></sst>
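Reading the entries in their stored order can be automated. The following is a minimal sketch using only Python's standard library; the function names are assumptions, and the handling of rich-text runs is a simplification (phonetic annotations and other sub-elements are ignored).

```python
import xml.etree.ElementTree as ET
import zipfile

NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}


def parse_shared_strings(xml_bytes):
    """Return the shared strings in the order they appear in the file."""
    root = ET.fromstring(xml_bytes)
    entries = []
    for si in root.findall("m:si", NS):
        # Plain entries use a single <t>; rich text splits the value over <r><t> runs
        parts = si.findall("m:t", NS) + si.findall("m:r/m:t", NS)
        entries.append("".join(t.text or "" for t in parts))
    return entries


def shared_strings_from_xlsx(source):
    """Read xl/sharedStrings.xml from an XLSX container (path or file object)."""
    with zipfile.ZipFile(source) as archive:
        return parse_shared_strings(archive.read("xl/sharedStrings.xml"))


# The example document shown above
SAMPLE = b'''<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<sst xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" count="3" uniqueCount="3"><si><t>The first entry</t></si><si><t>The second entry</t></si><si><t>The third entry</t></si></sst>'''

for index, text in enumerate(parse_shared_strings(SAMPLE)):
    print(index, text)
# 0 The first entry
# 1 The second entry
# 2 The third entry
```

The printed index is the position that worksheet cells use to reference an entry, so it doubles as the ordering key for the analysis.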

This means you can trace the order in which data was entered. When a cell is overwritten, its entry in the shared strings is replaced as well, and when a cell is deleted, its entry is removed. It is not possible to inject tags, as special characters (e.g. angle brackets) are escaped as XML entities (< is stored as &lt;).

| Action in Excel | Effect in sharedStrings.xml |
|---|---|
| Fill cell | New entry at the end of the list |
| Overwrite cell | Replace existing entry in the same position |
| Delete cell | Existing entry deleted without replacement |
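The reason tag injection fails can be shown with a short sketch. It uses Python's own XML escaping as a stand-in for what Excel does when saving; that equivalence is an assumption for illustration, not a statement about Excel's internals.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Text a user might type into a cell in an attempt to forge an entry
cell_input = "<si><t>injected</t></si>"

# Escaping turns the angle brackets into entities before the value is stored
stored = f"<t>{escape(cell_input)}</t>"
print(stored)
# <t>&lt;si&gt;&lt;t&gt;injected&lt;/t&gt;&lt;/si&gt;</t>

# A parser therefore sees plain text, not additional elements
element = ET.fromstring(stored)
print(len(element))                # 0 (no child elements)
print(element.text == cell_input)  # True
```

The typed markup survives only as the text content of a single entry, so it cannot add, remove or reorder entries in the list.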

In this particular case, we were able to use individual entries to identify who had entered data before and after the malicious manipulation, which greatly narrowed down the number of potential culprits.
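The idea of bracketing an entry by its neighbours can be sketched as follows. This is a hypothetical illustration only: the function name, the sample entries and the assumption that the position of the entry of interest is already known are not taken from the actual case.

```python
def surrounding_entries(entries, position, radius=2):
    """Return the entries stored just before and just after `position`."""
    before = entries[max(0, position - radius):position]
    after = entries[position + 1:position + 1 + radius]
    return before, after


# Hypothetical ordered list, as it might be read from sharedStrings.xml
entries = ["entry A", "entry B", "entry of interest", "entry C", "entry D"]

before, after = surrounding_entries(entries, position=2, radius=1)
print(before, after)
# ['entry B'] ['entry C']
```

If the authors of the neighbouring entries can be established by other means, they indicate who worked on the file around the time the entry of interest was written.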

Conclusion

Good forensic investigators have to be creative with the means at their disposal. Modern computer systems offer numerous data sources that can be used for the purposes of analysis. This can be crucial, as the culprits are often unaware that this is possible. That was the case here, where analyzing the circumstances made a major contribution to the progress of the investigations.

About the Author

Marc Ruef

Marc Ruef has been working in information security since the late 1990s. He is well known for his many publications and books. His most recent book, The Art of Penetration Testing, discusses security testing in detail. He lectures at several institutions, including ETH, HWZ, HSLU and IKF. (ORCID 0000-0002-1328-6357)
