XXE (XML External Entity) vulnerabilities are increasingly being found and reported in major web applications such as Facebook and PayPal; a quick look at the recent bug bounty reports for these sites confirms this. Although XXE has been around for many years, it has never received as much attention as it deserves. Most XML parsers are vulnerable to it by default, which means it is the developer's responsibility to make sure the application is free of this vulnerability. In this article we will explore what XML external entities are and how they can be attacked.
What are XML external entities?
If you are not familiar with XML, you can think of it as a format used to describe data. Two systems running on different technologies can communicate and exchange data with one another using XML. For example, an XML document that describes an employee might contain ‘name’, ‘salary’, and ‘address’; these are called XML elements.
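To make the employee example concrete, here is a minimal sketch in Java that parses such a document and reads one element back. The element values and the class and method names are made up for illustration; only the element names ‘name’, ‘salary’ and ‘address’ come from the description above.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class EmployeeXmlDemo {
    // Hypothetical sample document; the values are invented for this sketch.
    static final String EMPLOYEE_XML =
          "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
        + "<employee>\n"
        + "  <name>Jane Doe</name>\n"
        + "  <salary>50000</salary>\n"
        + "  <address>12 Example Street</address>\n"
        + "</employee>\n";

    // Returns the text content of the first element with the given tag name.
    static String readElement(String xml, String tag) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return doc.getElementsByTagName(tag).item(0).getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("could not read <" + tag + ">", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("name    = " + readElement(EMPLOYEE_XML, "name"));
        System.out.println("salary  = " + readElement(EMPLOYEE_XML, "salary"));
        System.out.println("address = " + readElement(EMPLOYEE_XML, "address"));
    }
}
```

Any system that understands XML can read the same three elements, regardless of the technology it runs on.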
XML documents can also contain ‘entities’, which are declared within the DOCTYPE header. An external entity is defined using a system identifier and can access local or remote content. For example, a document can declare an entity in its DOCTYPE and then reference it from within an element.
Consider an external entity ‘entityex’ declared with the value file:///etc/passwd, that is, <!ENTITY entityex SYSTEM "file:///etc/passwd">. During XML parsing, every reference to this entity is replaced with the content found at that location. The keyword ‘SYSTEM’ instructs the parser to read the entity value from the URI that follows. This is helpful when the same value is used many times in a document.
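The substitution described above can be sketched in Java as follows. To keep the example harmless and repeatable, the entity points at a temporary file created by the program rather than at /etc/passwd; the class name, method names and file content are assumptions made for illustration. The sketch also assumes a JDK whose default parser settings still resolve external entities, which may not hold on a hardened installation.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class ExternalEntityDemo {
    // Declares the external entity 'entityex' and references it twice.
    static String buildDocument(String systemUri) {
        return "<?xml version=\"1.0\"?>\n"
             + "<!DOCTYPE employee [\n"
             + "  <!ENTITY entityex SYSTEM \"" + systemUri + "\">\n"
             + "]>\n"
             + "<employee><name>&entityex;</name><address>&entityex;</address></employee>";
    }

    // Parses with default (unhardened) settings and returns one element's text.
    static String parseAndRead(String xml, String tag) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getElementsByTagName(tag).item(0).getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }

    // Writes a harmless temp file, points the entity at it, and reads it back.
    static String resolveFromTempFile(String content, String tag) {
        try {
            Path file = Files.createTempFile("entityex", ".txt");
            try {
                Files.write(file, content.getBytes(StandardCharsets.UTF_8));
                return parseAndRead(buildDocument(file.toUri().toString()), tag);
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(resolveFromTempFile("shared-value", "name"));
        System.out.println(resolveFromTempFile("shared-value", "address"));
    }
}
```

Both elements receive the file's content, which is the convenience external entities were designed for, and also the behaviour an attacker abuses.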
What is an XXE attack?
With XML entities, the ‘SYSTEM’ keyword causes an XML parser to read data from a URI and substitute it into the document. An attacker can therefore supply their own entity declaration and make the application display the resolved value. In simple words, the attacker forces the XML parser to access a resource of their choosing, which could be a file on the server or on any remote system the server can reach. For example, an entity that points at a file or directory on the server causes its contents to be fetched and displayed to the user.
How to identify XXE vulnerabilities
The straightforward answer is to identify the endpoints that accept XML as input. Sometimes, however, the endpoints that accept XML are not obvious (for example, when the client only ever uses JSON to access the service). In these cases, a pen tester has to try different things, such as changing the HTTP method or the Content-Type header, and see how the application responds. If the application parses the XML content, XXE may be possible.
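As a rough illustration of the Content-Type check, the sketch below uses Java's built-in HTTP client API to build an XML variant of a request that a client would normally send as JSON. The endpoint URL, the element names and the class name are hypothetical, and the request is only constructed here, not sent; probes like this should only be run against systems you are authorised to test.

```java
import java.net.URI;
import java.net.http.HttpRequest;

final class ContentTypeProbe {
    // Hypothetical XML equivalent of a JSON body such as {"name":"test"}.
    static final String SAMPLE_XML_BODY =
        "<?xml version=\"1.0\"?><root><name>test</name></root>";

    // Builds a POST request that declares its body as XML instead of JSON.
    static HttpRequest xmlVariant(URI endpoint, String xmlBody) {
        return HttpRequest.newBuilder(endpoint)
                .header("Content-Type", "application/xml")
                .POST(HttpRequest.BodyPublishers.ofString(xmlBody))
                .build();
    }

    public static void main(String[] args) {
        // Placeholder endpoint; replace with a target you are allowed to test.
        HttpRequest probe = xmlVariant(
                URI.create("https://app.example.test/api/profile"), SAMPLE_XML_BODY);
        System.out.println(probe.method() + " " + probe.uri());
        System.out.println(probe.headers().map());
    }
}
```

If the server answers the XML variant in the same way as the JSON original, rather than rejecting the unexpected content type, the endpoint is probably handing the body to an XML parser and is worth testing further.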
How to confirm
For this demo, let us use the site http://testhtml5.vulnweb.com/, a test site maintained by Acunetix for verifying the capabilities of the Acunetix web scanner. Visit the site and click on the ‘Forgot Password’ link under ‘Login’. Observe that the application transmits the data as XML.
From the request and response we can see that the application processes XML: it receives certain input and displays it back. To test whether the XML parser also processes entity declarations that we send, I sent a modified request that declares an entity and references it in the input.
Modified Request & Response:
In the modified request, we introduced an entity (myentity) and referenced it in the input. The response shows that the XML parser processed the declaration and echoed back the entity's value rather than the literal reference. This confirms that the parser accepts inline DTDs and expands entities, which is a strong indication that the application is vulnerable to XXE attacks.
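The confirmation step can be reproduced offline with a small Java sketch. The handler below is a local stand-in for a server that parses the request with default settings and echoes one element back; the element names, the response text and the marker value are assumptions made for this sketch and are not taken from the test site.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class EntityEchoProbe {
    // Unique marker so an echoed value cannot be confused with normal output.
    static final String MARKER = "xxe-probe-7f3a";

    // Hypothetical server-side handler: parses the body and echoes <username>.
    static String handleRequest(String xmlBody) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xmlBody)));
            String user = doc.getElementsByTagName("username").item(0).getTextContent();
            return "Unable to find user: " + user;
        } catch (Exception e) {
            return "Error: malformed request";
        }
    }

    // Request that declares the internal entity 'myentity' and references it.
    static String probeRequest() {
        return "<?xml version=\"1.0\"?>\n"
             + "<!DOCTYPE forgot [ <!ENTITY myentity \"" + MARKER + "\"> ]>\n"
             + "<forgot><username>&myentity;</username></forgot>";
    }

    // True when the response contains the expanded entity value.
    static boolean entityWasExpanded(String response) {
        return response.contains(MARKER);
    }

    public static void main(String[] args) {
        String response = handleRequest(probeRequest());
        System.out.println(response);
        System.out.println("entity expanded: " + entityWasExpanded(response));
    }
}
```

An internal entity is used for the probe because it is harmless: it shows that the parser honours DOCTYPE declarations without touching any file or network resource.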
How to exploit
The following code samples can be used for exploitation of the XML external entity vulnerability.
One well-known payload is the ‘billion laughs’ attack, in which nested entity definitions expand exponentially and consume almost 3 GB of memory. Apart from this, an attacker can also read sensitive data on servers that the application can reach, and look for open ports on backend systems by using the parser to perform port scanning.
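The 3 GB figure follows from simple arithmetic, sketched below in Java. The calculation assumes the commonly published form of the payload, in which a base entity holds the string "lol" and each of nine further entities references the previous one ten times; the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

final class BillionLaughsMath {
    // Copies of the base string produced when `levels` nested entities each
    // reference the previous level `fanout` times: fanout ^ levels.
    static long expandedCopies(int fanout, int levels) {
        long copies = 1;
        for (int i = 0; i < levels; i++) {
            copies = Math.multiplyExact(copies, (long) fanout); // fails loudly on overflow
        }
        return copies;
    }

    // Total size in bytes of the fully expanded top-level entity.
    static long expandedBytes(String base, int fanout, int levels) {
        long baseBytes = base.getBytes(StandardCharsets.UTF_8).length;
        return Math.multiplyExact(expandedCopies(fanout, levels), baseBytes);
    }

    public static void main(String[] args) {
        long copies = expandedCopies(10, 9);
        long bytes = expandedBytes("lol", 10, 9);
        System.out.println("copies of \"lol\": " + copies);
        System.out.println("expanded size in bytes: " + bytes);
    }
}
```

Ten references at each of nine levels give 10^9 copies of a three-byte string, or 3 × 10^9 bytes, from a document that is itself under a kilobyte.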
The impact of exploiting this vulnerability can be severe: it allows an attacker to read sensitive files on the server, perform denial-of-service attacks against it, and more.
As discussed above, the main problem is that the XML parser processes untrusted data sent by the user. It may not be easy, or even possible, to validate just the data inside the system identifier in the DTD, and most XML parsers are vulnerable to XML external entity (XXE) attacks by default. Therefore, the best solution is to configure the XML processor to use a local static DTD and to disallow any DTD declared in the XML document.
For example, in Java, external entity attacks can be prevented by configuring the parser factory before any untrusted XML is parsed: external general entities and external parameter entities are disabled, and inline DOCTYPE declarations are disallowed.
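A minimal sketch of such a configuration for the JDK's built-in DOM parser is shown below. The feature URIs are the ones recognised by the Xerces-based parser that ships with the JDK; other parser implementations may use different names, so treat this as a starting point rather than a complete hardening guide. The class and method names are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

final class HardenedXmlParser {
    // Returns a factory that refuses DOCTYPE declarations and external entities.
    static DocumentBuilderFactory hardenedFactory() throws ParserConfigurationException {
        DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
        // Reject any document that carries a DOCTYPE (inline DTD) at all.
        f.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        // Defence in depth in case DOCTYPEs must be allowed later.
        f.setFeature("http://xml.org/sax/features/external-general-entities", false);
        f.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
        f.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
        f.setXIncludeAware(false);
        f.setExpandEntityReferences(false);
        return f;
    }

    // True when the hardened parser refuses the document.
    static boolean rejects(String xml) {
        try {
            hardenedFactory().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return false;
        } catch (SAXException e) {
            return true; // parser raised a fatal error, e.g. DOCTYPE is disallowed
        } catch (Exception e) {
            throw new IllegalStateException("unexpected failure", e);
        }
    }

    public static void main(String[] args) {
        String attack = "<?xml version=\"1.0\"?>\n"
                + "<!DOCTYPE a [ <!ENTITY entityex SYSTEM \"file:///etc/passwd\"> ]>\n"
                + "<a>&entityex;</a>";
        System.out.println("attack rejected: " + rejects(attack));
        System.out.println("plain XML rejected: " + rejects("<a>ok</a>"));
    }
}
```

Disallowing the DOCTYPE outright is the strictest of these settings; the remaining features matter mainly when an application genuinely needs DTDs and cannot enable it.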