In which cases can a path traversal vulnerability occur? How can you detect this flaw and protect yourself from it?
This is what we will detail in this article.
Definition of a path traversal attack
A path traversal or directory traversal attack aims at accessing and reading files stored outside the tree structure exposed directly by the web service.
It consists of modifying a request’s parameters to navigate the tree structure. The attacker’s goal is to browse directories and reach sensitive files that should not normally be accessible (configuration files, source code…).
In some situations, the attacker may even gain access to unauthorised functionality, such as writing files on the server. This can allow them to take control of the server, in which case the vulnerability becomes a remote code execution (RCE).
How does the path traversal vulnerability occur?
Most web applications use locally stored resources (images, scripts, text files…) to perform their tasks. Sometimes, these resources are embedded in other pages via parameters that a user can manipulate.
The path traversal flaw occurs when user-supplied parameters aren’t sanitised and/or access control on the resources is missing.
An attacker can then modify the request parameters so that the application returns other resources.
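As a minimal sketch of the mechanism, the Python snippet below joins a user-supplied file name to a base directory without any validation. The directory `/var/www/app/static` and the function `naive_resolve` are hypothetical names chosen for illustration; the point is only that `../` sequences, or an absolute path, let the resulting path leave the intended directory.

```python
import posixpath

# Hypothetical directory the application intends to serve files from.
BASE_DIR = "/var/www/app/static"


def naive_resolve(filename: str) -> str:
    """Build a path the vulnerable way: join user input with no validation."""
    return posixpath.normpath(posixpath.join(BASE_DIR, filename))


# Expected use: the path stays inside BASE_DIR.
print(naive_resolve("logo.png"))                # /var/www/app/static/logo.png

# Manipulated parameter: each "../" climbs one directory level.
print(naive_resolve("../../../../etc/passwd"))  # /etc/passwd

# An absolute path discards the base directory entirely.
print(naive_resolve("/etc/passwd"))             # /etc/passwd
```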
The impact of this flaw is generally critical. Indeed, depending on the context, the attacker might be able:
- to read files, potentially:
  - configuration files, which usually contain secrets (credentials, keys…) that can then be used to exploit further vulnerabilities,
  - sensitive operating system files,
- to read the source code,
- to analyse the organisation of the server,
- sometimes to write on the server, which can lead to:
  - a modification of the application’s behaviour,
  - or even a takeover of the server.
How to identify the directory traversal flaw?
To find a path traversal flaw, you need to rigorously list all the places where users can send data.
The OWASP Testing Guide details points to look for:
- Are there parameters that are related to operations requiring files?
- Are there unusual file extensions that are accepted?
- Are there interesting variable names (item, file, home…)?
- Are cookies used for the generation of pages or templates?
Once these points have been identified, first test different techniques to exploit the vulnerability. You can then try techniques to bypass the controls in place.
The most common method is to check if it’s possible to go up to other directories:
- directly with ../
- via URL encoding, which gives for example %2e%2e%2f
- via Unicode notations, which gives for example %u2216%u2216
- with the file: scheme
- via wrappers
- …
These different techniques may need to be combined, as some protections may be in place while others aren’t.
You can rely on PayloadsAllTheThings, which lists a wide variety of payloads.
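To illustrate why encoded variants are worth testing, here is a small Python sketch. It assumes a hypothetical weak filter, `naive_filter`, that only rejects a literal `../`, and uses the standard `urllib.parse.unquote` to show what an application would see after one or two rounds of URL decoding. Real targets may decode differently, so treat this as an illustration rather than a description of any particular server.

```python
from urllib.parse import unquote


def naive_filter(value: str) -> bool:
    """Hypothetical weak check: accept input unless it contains a literal '../'."""
    return "../" not in value


payloads = [
    "../etc/passwd",              # plain traversal
    "%2e%2e%2fetc/passwd",        # URL-encoded once
    "%252e%252e%252fetc/passwd",  # URL-encoded twice
]

for raw in payloads:
    once = unquote(raw)    # what the application sees after one decoding pass
    twice = unquote(once)  # what it sees if a second component decodes again
    print(f"{raw!r}: accepted={naive_filter(raw)}, once={once!r}, twice={twice!r}")
```

The two encoded payloads pass the filter because the check runs on the raw value, yet both end up as `../etc/passwd` once decoded. This is one reason validation should be applied to the fully decoded value.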
A second possibility is to include external resources directly in the call parameters, in cookies or other vectors.
Examples of path traversal vulnerabilities encountered
During the web application penetration tests we carry out, we regularly encounter this flaw.
Some of the vulnerabilities discovered are classic situations: for example, the parameter used in an include function isn’t protected, or it’s possible to manipulate a parameter passed to a curl function that is permitted to use file://.
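For the curl-style case, one possible guard is to check the URL scheme before fetching anything. The sketch below is written in Python with the standard `urllib.parse.urlsplit`; the function name `is_fetchable` and the allowlist of `http` and `https` are assumptions made for this example, not the configuration of any application we tested.

```python
from urllib.parse import urlsplit

# Assumption for this example: only web URLs are legitimate inputs.
ALLOWED_SCHEMES = {"http", "https"}


def is_fetchable(url: str) -> bool:
    """Return True only for URLs with an allowed scheme and a host name."""
    parts = urlsplit(url)
    return parts.scheme.lower() in ALLOWED_SCHEMES and bool(parts.hostname)


print(is_fetchable("https://example.com/avatar.png"))  # True
print(is_fetchable("file:///etc/passwd"))              # False
print(is_fetchable("/etc/passwd"))                     # False
```

A scheme check like this blocks file:// but does not, on its own, stop requests to internal hosts; that needs a separate control on the destination.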
Other flaws are more hidden. For instance, we tested an application that had a PDF generator. The generator included the user’s personal data in the PDF.
To exploit the vulnerability, we modified the user’s “address” field by inserting <iframe src='file:///etc/passwd'></iframe>. The generator interpreted the HTML. The problem was that the library was allowed to retrieve files via file://.
When generating the PDF, the iframe was interpreted and displayed the resources we had requested. The impact of a flaw like this is critical.
How to protect yourself from path traversal?
To avoid these flaws, several measures should be implemented:
- Do not use user input directly to call a file.
- User data shouldn’t be interpreted. It should be encoded, escaped and cleaned.
- User input should be validated against a list of allowed expressions. If this isn’t possible, validation must confirm that the input contains only allowed content (e.g. only alphanumeric characters).
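These measures can be sketched in Python as follows. The base directory, the allowed file name pattern and the function `resolve_user_file` are all hypothetical. The sketch combines an allowlist expression with a check that the resolved path is still inside the base directory, which acts as a second line of defence (for instance against symbolic links).

```python
import re
from pathlib import Path

# Hypothetical directory the application is allowed to read from.
BASE_DIR = Path("/var/www/app/static")

# Assumed allowlist: simple names with a known extension, no separators.
ALLOWED_NAME = re.compile(r"[A-Za-z0-9_-]+\.(?:png|jpg|txt)")


def resolve_user_file(filename: str) -> Path:
    """Return a path inside BASE_DIR, or raise ValueError."""
    # 1. Validate the input against the allowed expression.
    if not ALLOWED_NAME.fullmatch(filename):
        raise ValueError("file name not allowed")

    # 2. Resolve the final path and confirm it is still under BASE_DIR.
    base = BASE_DIR.resolve()
    candidate = (base / filename).resolve()
    candidate.relative_to(base)  # raises ValueError if outside base
    return candidate


for name in ["report.txt", "../../etc/passwd", "/etc/passwd"]:
    try:
        print(name, "->", resolve_user_file(name))
    except ValueError as exc:
        print(name, "-> rejected:", exc)
```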
In the case of our PDF generator example, the remediation is to ensure that HTML is not interpreted and that file: is not allowed. Furthermore, it should not be possible to load local files in iframes, nor local web resources (SSRF).
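As an illustration of the "HTML is not interpreted" part, the Python sketch below escapes the user-controlled field before it is embedded in the HTML handed to a PDF generator. The function `render_address` is a hypothetical helper, and `html.escape` comes from the standard library. The generator should still be configured separately to refuse file: and local resources.

```python
import html


def render_address(address: str) -> str:
    """Escape user data so it is rendered as text, not interpreted as HTML."""
    return f"<p>{html.escape(address, quote=True)}</p>"


payload = "<iframe src='file:///etc/passwd'></iframe>"
print(render_address(payload))
# <p>&lt;iframe src=&#x27;file:///etc/passwd&#x27;&gt;&lt;/iframe&gt;</p>
```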