What is NoSQL Injection? Exploitations and Security Tips

SQL injections are well-known and widely documented vulnerabilities. They exploit flaws in relational databases to manipulate or extract sensitive data.

With the rapid growth of modern web applications, NoSQL databases have gained in popularity, offering greater flexibility and scalability than their SQL counterparts.

However, this flexibility does not come without risks: NoSQL injections, although less publicised, are a real threat to the security of applications using NoSQL databases.

In this article, we will explore the principles of NoSQL databases and understand how NoSQL injections work. We will also analyse concrete exploitation scenarios and best practices for effective protection.

Comprehensive Guide to NoSQL Injections

What is NoSQL Injection?
How Do NoSQL Databases Work?
- What are the types of NoSQL databases?
- How NoSQL works: focus on MongoDB
NoSQL Injection Vulnerabilities Exploitation Scenarios
- Authentication bypass using NoSQL injection attack
  - Operator injection
  - CVE-2024-48573 analysis
- Data extraction via NoSQL injection
  - JavaScript injection
  - Cypher injection
How to Prevent NoSQL Injections?

What is NoSQL Injection?

A NoSQL injection is an attack that targets NoSQL databases by exploiting weaknesses in the way an application builds its queries. The attacker manipulates these insecure queries to bypass authentication or steal data.

Unlike SQL databases, where injections are based on SQL queries (such as SELECT or INSERT), NoSQL databases use query languages specific to each type of database. A NoSQL injection therefore consists of inserting malicious code into these queries, thereby modifying their behaviour and enabling the attacker to perform unauthorised actions.

Before diving into the various scenarios for exploiting NoSQL injections, it is essential to understand how NoSQL databases work.

How Do NoSQL Databases Work?

NoSQL databases are designed to meet the needs of modern applications, which require great flexibility and optimum performance when faced with massive volumes of data.

Unlike traditional relational databases (SQL), they are not based on a rigid, structured schema of tables with defined relationships.

One of the fundamental principles of NoSQL databases is their ability to scale horizontally. Where a relational database generally requires the power of a single server to be increased (vertical scaling), a NoSQL database can be distributed across several machines, which spreads the load and increases fault tolerance. This approach lets systems grow without compromising performance.

NoSQL databases are also designed to offer optimised performance, particularly in terms of reading and writing. They are often used for applications requiring very fast response times. In addition, their operation is often based on data replication and distribution mechanisms, guaranteeing high availability and resilience in the event of failure.

What are the types of NoSQL databases?

There are several types of NoSQL database, each adapted to specific use cases.

Document-oriented databases, such as MongoDB, store information in the form of JSON-like documents (BSON in MongoDB's case) or XML documents.
Key-value databases, such as Redis or DynamoDB, are optimised for rapid access and are commonly used for session management or caching.
Others, such as Cassandra or HBase, adopt a column-oriented model to optimise performance on very large datasets.
Finally, graph-oriented databases, such as Neo4j, are designed to manage complex relationships between data.

The choice of a NoSQL database therefore depends on the specific needs of the application. If flexibility and speed of data access are paramount, a document- or key-value-oriented database will often be preferred. On the other hand, if the data is highly relational or requires complex analysis, a graph- or column-oriented database may be more appropriate.

How NoSQL works: focus on MongoDB

We’re now going to take a closer look at how MongoDB works. This is the NoSQL database we come across most often during our audits.

Here is an example of a document stored on MongoDB:

{
  "_id": ObjectId("5f4f7fef2d4b45b7f11b6d7a"),
  "user_id": "Kevin",
  "age": 29,
  "Status": "A"
}

The ‘_id’ field is reserved for MongoDB. It serves as a primary key and uniquely identifies each document in a collection.

MongoDB offers great flexibility in data storage. The same type of object can contain different fields from one document to another. For example, one document may include a ‘Country’ field, while another will not. This allows users to fill in only the relevant information, avoiding the storage of empty fields.

To interact with a MongoDB database, the mongosh command line tool is particularly useful.

You can then browse the collections to analyse the data. The first commands are use (select the database) and show collections (list the collections it contains).

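Data is added using insertOne, for example db.users.insertOne({user_id: "Kevin", age: 29, Status: "A"}).

Other operations on the database are carried out using functions such as insertMany, find, updateOne, replaceOne, or deleteOne (which replaces the deprecated remove), among others.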

NoSQL Injection Vulnerabilities Exploitation Scenarios

Authentication bypass using NoSQL injection attack

Operator injection

A NoSQL injection occurs for the same reason as an SQL injection: the direct insertion of user data into a query sent to the database. Unlike SQL, there is no universal NoSQL language, which makes each injection specific to the implementation used (MongoDB, Neo4j, Cassandra, etc.).

A simple example of a NoSQL injection would be the following:

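// Vulnerable: the $_POST values are placed in the filter without any type check.
$manager = new MongoDB\Driver\Manager("mongodb://127.0.0.1:27017");
$query = new MongoDB\Driver\Query(array("email" => $_POST['email'], "password" => $_POST['password']));
// 'db.users' is the namespace: database "db", collection "users".
$cursor = $manager->executeQuery('db.users', $query);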

During authentication, the application must check whether the email address and password supplied exist in the database and are correctly associated.

The query sent to the database will look like this:

db.users.find({
    email: "[email protected]",
    password: "superPassword"
});

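As the parameters are controlled by the user, they can change the structure of the query by sending operators instead of plain strings. In PHP, for example, a form field named email[$ne] is parsed into an array, which the driver then interprets as an operator expression. There are several operators, including:

$eq: matches documents where the field is equal to the given value.
$ne: matches documents where the field is different from the given value.
$gt: matches documents where the field is strictly greater than the given value ($gte also accepts equality).
$and: combines several conditions, all of which must be satisfied.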

The $ne operator can be used to bypass the authentication process. With the following query, the attacker will be able to access the first account in the database, without needing to know any valid credentials:

db.users.find({
    email: {$ne: "[email protected]"},
    password: {$ne: "invalidPassword"}
});

The database will return all accounts whose email is not ‘[email protected]’ and whose password is not ‘invalidPassword’, and the application will typically log the attacker in as the first of them. Other operators, such as $regex, could also be used to obtain a similar result, by applying a regular expression that matches all the emails and passwords.
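To make the mechanism concrete, here is a minimal sketch that simulates it in plain JavaScript, without a database. The users array, the sample accounts and the matches helper are hypothetical stand-ins: matches only imitates equality and $ne as described above and is not MongoDB's implementation.

```javascript
// Hypothetical in-memory stand-in for the users collection.
const users = [
  { email: "admin@example.test", password: "s3cret" },
  { email: "bob@example.test", password: "hunter2" },
];

// Simplified imitation of a filter condition:
// a plain value means equality, {$ne: v} means "different from v".
function matches(fieldValue, condition) {
  if (condition !== null && typeof condition === "object" && "$ne" in condition) {
    return fieldValue !== condition.$ne;
  }
  return fieldValue === condition;
}

// Vulnerable lookup: request values go into the filter unchecked.
function findUser(body) {
  const filter = { email: body.email, password: body.password };
  return users.find(
    (u) => matches(u.email, filter.email) && matches(u.password, filter.password)
  );
}

// Expected use: both values are strings, and the password is wrong.
const legit = findUser(JSON.parse('{"email":"bob@example.test","password":"wrong"}'));

// Injection: a JSON body can carry objects instead of strings.
const injected = findUser(JSON.parse('{"email":{"$ne":"x"},"password":{"$ne":"x"}}'));

console.log(legit);    // undefined
console.log(injected); // the first account in the array
```

The only difference between the two calls is the type of the values: once an object reaches the filter, it is read as an operator rather than as data.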

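To correct this vulnerability, the first step is to check that the keys received are on an allowlist (in this case, ‘email’ and ‘password’) and that each value is a plain string rather than an object or array, since operators can only be injected through structured values. In general, it is also recommended to validate user input (for example, to ensure that the email is in the correct format) and to use parameterised queries where the database driver offers them.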
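These checks can be sketched as follows in JavaScript. The parseCredentials helper and the ALLOWED_KEYS list are hypothetical names, and the email pattern is deliberately loose; it illustrates the idea rather than a complete validation layer.

```javascript
// Hypothetical allowlist of the keys a login request may contain.
const ALLOWED_KEYS = ["email", "password"];

function parseCredentials(body) {
  if (body === null || typeof body !== "object" || Array.isArray(body)) {
    throw new TypeError("body must be an object");
  }
  // Reject any key that is not explicitly expected.
  for (const key of Object.keys(body)) {
    if (!ALLOWED_KEYS.includes(key)) {
      throw new TypeError(`unexpected key: ${key}`);
    }
  }
  const { email, password } = body;
  // Rejecting non-strings blocks payloads such as {"$ne": "x"}.
  if (typeof email !== "string" || typeof password !== "string") {
    throw new TypeError("email and password must be strings");
  }
  // Loose format check; a real application would likely use a stricter one.
  if (!/^[^\s@]+@[^\s@]+$/.test(email)) {
    throw new TypeError("email has an invalid format");
  }
  return { email, password };
}
```

Only the object returned by parseCredentials would then be used to build the database filter; a request carrying {"$ne": "x"} as a value is rejected before any query is sent.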

CVE-2024-48573 analysis

Aquila CMS was affected by a NoSQL injection vulnerability similar to the one described above. The flaw was in the password reset functionality and allowed an unauthenticated user to change the password of an existing account.

Here is the vulnerable code:

const resetPassword = async (token, password) => {
    const user = await Users.findOne({resetPassToken: token});
    if (password === undefined) {
        if (user) {
            return {message: 'Valid Token'};
        }
        return {message: 'Invalid Token'};
    }

    if (user) {
        try {
            user.password = password;
            user.needHash = true;
            await user.save();
            await Users.updateOne({_id: user._id}, {$unset: {resetPassToken: 1}});
            return {message: 'Password has been reset.'};
        } catch (err) {
            if (err.errors && err.errors.password && err.errors.password.message === 'FORMAT_PASSWORD') {
                throw NSErrors.LoginSubscribePasswordInvalid;
            }
            if (err.errors && err.errors.email && err.errors.email.message === 'BAD_EMAIL_FORMAT') {
                throw NSErrors.LoginSubscribeEmailInvalid;
            }
            throw err;
        }
    }
    return {message: 'User not found, unable to reset password.', status: 500};
};

The findOne function is called with a parameter provided by the user (the token). No checks are performed on this parameter: its type is never verified, and the filter sanitisation offered by the Mongoose library (sanitizeFilter) is not applied to findOne by default.

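To exploit this vulnerability, an attacker simply sends an object containing the $ne operator as the token parameter instead of a string. The filter then matches the first account whose reset token differs from the supplied value (or is absent), so the attacker can change that account's password without holding a valid token.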

To correct this problem, the sanitizeFilter function must be explicitly called to prevent the injection of any malicious operators.
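Mongoose documents this sanitisation as wrapping any nested object that has a key starting with $ in an $eq. The sketch below imitates that idea for a flat filter so the effect is visible; neutraliseOperators and hasOperatorKey are hypothetical names, and this is not Mongoose's own code.

```javascript
// True when a value is a plain object with at least one key starting with "$".
function hasOperatorKey(value) {
  return (
    value !== null &&
    typeof value === "object" &&
    !Array.isArray(value) &&
    Object.keys(value).some((k) => k.startsWith("$"))
  );
}

// Hypothetical helper: wrap operator-bearing values in $eq so that they are
// compared literally instead of being interpreted as query operators.
function neutraliseOperators(filter) {
  const safe = {};
  for (const [field, value] of Object.entries(filter)) {
    safe[field] = hasOperatorKey(value) ? { $eq: value } : value;
  }
  return safe;
}

console.log(JSON.stringify(neutraliseOperators({ resetPassToken: { $ne: "x" } })));
// {"resetPassToken":{"$eq":{"$ne":"x"}}}
console.log(JSON.stringify(neutraliseOperators({ resetPassToken: "abc123" })));
// {"resetPassToken":"abc123"}
```

With the wrapped filter, the query looks for a token literally equal to the object {$ne: "x"}, which no stored token will match. Checking that the token is a string before running the query is a simpler, complementary safeguard.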

Data extraction via NoSQL injection

Exploiting a NoSQL vulnerability generally has less impact than an SQL injection, as data recovery is often limited to a specific collection (although this depends on the context, as we’ll see later). For example, an attack could extract all the comments from a blog, but not the user passwords, unlike an SQL injection where it is possible to extract the entire database.

However, in what follows, we will explore how it is possible to recover data from a NoSQL database, even without any prior knowledge of the structures or data.

JavaScript injection

One type of injection specific to NoSQL databases is Server-Side JavaScript Injection (SSJI). It becomes possible when a query uses the $where operator, which evaluates a JavaScript expression on the server against each document.

Let’s take the example of an authentication request:

db.users.find({
    $where: 'this.username === "<username>" && this.password === "<password>"'
});

If, through the ‘username’ or ‘password’ parameters, we manage to make the expression evaluate to true, the query will return a user and authentication will be bypassed.

This is achieved by using the OR (||) operator:

this.username === "" || true || ""=="" && this.password === "<password>"

We can test this expression in a browser’s JavaScript console to see that it always returns true, without needing to know the username or password.
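The same check can be scripted. The following sketch rebuilds the $where string by concatenation, as the vulnerable application is assumed to do, and evaluates it locally against a sample document; buildWhere, evaluate and the document are hypothetical and only imitate what the server does.

```javascript
// Builds the $where string by concatenation, as the vulnerable code would.
function buildWhere(username, password) {
  return `this.username === "${username}" && this.password === "${password}"`;
}

// Evaluates the expression with `this` bound to one document,
// imitating the per-document evaluation performed by the server.
function evaluate(expression, doc) {
  return Boolean(new Function(`return (${expression});`).call(doc));
}

const doc = { username: "kevin", password: "s3cret" }; // hypothetical document

const normal = buildWhere("kevin", "wrong");
const injected = buildWhere('" || true || ""=="', "wrong");

console.log(evaluate(normal, doc));   // false
console.log(evaluate(injected, doc)); // true
```

Because && binds more tightly than ||, the injected expression reads as (username check) || true || (rest), so the password comparison no longer matters.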

The more interesting objective here is to extract data, in particular the username and password of one of the accounts. To do this, instead of always returning true, the expression should return true only if a specific condition is met, for example on the ‘password’ field.

The match function can be used for this purpose; it takes a regular expression (regex) as a parameter:

(this.password.match('^a.*'))

This expression will only return true if the target user’s password begins with an ‘a’. The application’s response changes as soon as the correct character is supplied, which tells the attacker that the guess is right. The process is then repeated for the next position, trying each candidate character in turn, so the data is extracted character by character.
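The loop can be sketched as follows. To keep the example self-contained, the application is replaced by a local oracle function that answers true when the injected pattern matches a stored value; makeOracle, extract, CHARSET and the sample password are all hypothetical, and the sketch only handles alphanumeric values.

```javascript
const CHARSET = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";

// Stand-in for the application: answers true when the injected regex matches
// the stored password. Against a real target this would be a request whose
// response differs between the two cases.
function makeOracle(storedPassword) {
  return (pattern) => storedPassword.match(pattern) !== null;
}

// Recovers an alphanumeric value one character at a time.
function extract(oracle, maxLength = 64) {
  let known = "";
  while (known.length < maxLength) {
    // Stop as soon as the known prefix is the whole value.
    if (oracle(`^${known}$`)) return known;
    // Try each candidate character for the next position.
    const next = [...CHARSET].find((c) => oracle(`^${known}${c}.*`));
    if (next === undefined) return known; // next character is outside CHARSET
    known += next;
  }
  return known;
}

console.log(extract(makeOracle("s3cret"))); // s3cret
```

Each position costs at most one request per candidate character, which is why this kind of extraction is slow but entirely feasible.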

It is important to note that this attack will be less effective if the password is stored as a hash in the database, rather than in clear text. However, the attacker can attempt to break the hash offline using a tool such as Hashcat.

Cypher injection

We are now going to look at another type of database: Neo4j, a graph-oriented database. The query language associated with this technology is called Cypher Query Language, which gives rise to what are known as Cypher injections. The principle remains the same as for the NoSQL injections discussed above; the difference lies solely in the payload, which must be adapted to this specific language.

Let’s take the example of a film catalogue where the user enters the name of the film they want to watch. The application responds by displaying all the films available with that name on the platform.

The request is as follows:

query = f"MATCH (m:Movie) WHERE toLower(m.title) CONTAINS toLower('{name}') RETURN m.title AS title"

In this example, the MATCH keyword is the equivalent of a SELECT in SQL. The user has control over the name parameter.

To demonstrate the vulnerability, we will try to extract the titles of all the films on the platform. To do this, simply inject a condition that always evaluates to true. As when bypassing authentication with an SQL injection, the expression 1=1 can be used.

With the payload test') or 1=1 return m.title AS title// supplied as the name, the resulting query matches all the films (the // comments out the rest of the original query):

query = f"MATCH (m:Movie) WHERE toLower(m.title) CONTAINS toLower('test') or 1=1 return m.title AS title// RETURN m.title AS title"

As you can see, the principle remains classic: we exit the current condition (here CONTAINS toLower()) to create a new one which will always be true, allowing us to retrieve all the values for a node. In our case, it is also possible to extract other data from the database by adding a new query to the injection.

An example is shown below:

query = f"MATCH (m:Movie) WHERE toLower(m.title) CONTAINS toLower('test') MATCH (s:Credentials) WITH s.password as pwd return pwd as title// RETURN m.title AS title"

The injection is used here to extract passwords from the database. The final statement (return pwd as title) is crucial, as it allows the application to receive a title field in the response to the request, thus avoiding any errors.

How to Prevent NoSQL Injections?

To protect against NoSQL injections, it is essential to follow a few good development and security practices.

First of all, we recommend using parameterised queries wherever the database driver supports them (the equivalent of prepared statements in SQL). This separates the data from the commands, preventing any injection attempts. In addition, each parameter must be carefully checked to ensure that it does not contain any dangerous characters or operators.
Secondly, it is crucial to systematically validate user input. Never trust data coming from the user and always check that it conforms to expectations before using it in a query.
Another important measure is to set up whitelists for authorised data fields. This reduces the risk of malicious manipulation of the database by only accepting validated data.
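For the film catalogue example, separating data from the command can be sketched as follows in JavaScript. The buildMovieQuery helper is hypothetical; the text and parameters it returns are the kind of values one would hand to a driver call such as the Neo4j JavaScript driver's session.run(text, parameters), which is not executed here.

```javascript
// The query text is a constant; user input travels separately as the $name
// parameter and is never concatenated into the text.
const MOVIE_QUERY =
  "MATCH (m:Movie) WHERE toLower(m.title) CONTAINS toLower($name) RETURN m.title AS title";

// Hypothetical helper returning the query text and its parameters.
function buildMovieQuery(name) {
  if (typeof name !== "string") {
    throw new TypeError("name must be a string");
  }
  return { text: MOVIE_QUERY, parameters: { name } };
}

const q = buildMovieQuery("test') OR 1=1 RETURN m.title AS title//");
console.log(q.text === MOVIE_QUERY); // true: the payload did not alter the query
// With a driver session, this would be run as: session.run(q.text, q.parameters)
```

The payload ends up as an ordinary string value that the database compares against film titles, so it can no longer close the quote or append a second clause.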

Authors: Julien BRACON – Pentester & Amin TRAORÉ – CMO @Vaadata