My SQLi adventure or: why you should make sure your WAF is configured properly
In August 2021 I was tasked with performing a Web Application security assessment for a large client. The automated scanning tool returned a possible SQL injection which, just like last time, couldn't be exploited using the said tool. The reason was Cloudflare's WAF and more specifically its SQL Injection filter.
The techniques described below do not compose a "universal" bypassing technique as the one
I've discovered previously and described here. Definitely not all of them are applicable against a fully configured Cloudflare WAF instance.
It is more accurate to say that this post describes my SQLi adventure from start to finish. During this I thought that I had bypassed Cloudflare WAF's SQLi filter. It turned out that the Cloudflare installation was not properly configured (some rules in the "OWASP Uri SQL Injection Attacks" Rule Set were not enabled). For this reason the findings were rejected for Cloudflare's bug bounty program. I feel, however, that the findings described in this post are interesting and show a number of points that security people need to notice when deploying security protections for their apps. Some partial findings might also come in handy to a person who tries to bypass WAFs in general. An attacker could leverage one or several techniques, depending on the end application, to bypass some WAFs and exfiltrate data by abusing an SQLi vulnerability.
Read below and I believe you will understand what I'm talking about.
Details about the application
The purpose of this web application is not important for the writeup. The stack on which it is written is not really important either, except for the /graphql endpoint which is exposed on the server, and the PostgreSQL service running the queries. GraphQL is a query language which processes a query and, in this case, runs the appropriate SQL queries on the backend and returns the data requested. I've never used or read about GraphQL before this assessment, so I followed the official tutorial which gave me a basic understanding.
The queries are formed similarly to JSON objects. The application sends a POST request to the /graphql endpoint. The request body contains the GraphQL query and looks something like this:
The unsanitized parameter at X-MARK is passed from GraphQL and replaced in a PostgreSQL stored procedure, thus allowing us to inject into an SQL query that
executes on the backend server.
This is pretty much what we have to know at this point about the application. In the sections following I lay out the observations I made while testing the application for the SQL injection and which led me to successful exploitation, including bypass of the Cloudflare SQLi filter.
The first important observation made is that the server responds differently to a valid email input (i.e. when the SQL query returns data) than it does to an invalid one. The HTTP code returned is always 200, but the response body differs:
This returns "OK":
This returns "NOT OK":
This might not look like much at first, but it was actually the mechanism that allowed me to validate the existence of specific data in the database.
It led me to the idea to perform a Blind SQL injection attack leveraging this mechanism and automate it with a script.
The script constructs a payload and sends it with the
POST request, which in its turn modifies the SQL query which runs on the backend server.
The process is described using steps below:
- We have a set of characters, called an
alphabetThis set includes all possible characters which can be part of a piece of data we are trying to extract from the database.
- Assume that
resourceis the SQL resource required to be retrieved. It can be the DB name, the username, a cell value in any table etc.
resourcewill be retrieved character-by-character. The character we are trying to retrieve is called
- We loop through the
alphabetand we compare each of its letters against
- If there is a match we log the result and move to the next
- At the end we have the complete
resourceby concatenating all the
The second important observation, which made things for me a lot easier, is the error message displayed in cases where
the SQL query performed was malformed. The error message did not allow me to perform an Error-Based SQL injection,
but instead displayed the full SQL query performed on the SQL server by PostgreSQL. It will become evident why this is important in a bit.
The SQL query extracted from the error message looked something like this:
1 2 3 4 5 6 7 8 9 10 11 12 SELECT t1.* FROM table1 AS t1 LEFT JOIN table2 AS t2 ON t2.id = t1.id AND t2.q_id IN (1, 2, 3, 4) AND t2.input = LOWER('X-MARK') WHERE t1.deleted_at IS NULL AND t1.name = $1 AND ( LOWER(t1.email) = LOWER('X-MARK') OR LOWER(t2.input) = LOWER('X-MARK') );
The user input that was submitted as the 'email' variable is reflected at the X-MARK. These are the two reasons that having the full SQL query was important for the exploitation process:
- One can see that the user input is wrapped in parentheses. This piece of information makes payload production process much easier, as one would know that the input must contain the closing parenthesis that matches the opening parenthesis, in order for the end SQL query to be valid.
- User input is reflected in multiple places in the query (lines 5, 10, 11). What gave me some trouble was line #5. If it were the case where X-MARK was only reflected inside the WHERE clause, things would've been easier. But in this case I had to make sure that my input didn't mess up the table JOIN. This was necessary in order to make sure the correct table rows were produced and queried against, so that I could get the data I wanted.
I started by trying to find the name of the current database. The SQL query to do this in PostgreSQL is:
There was a lot to take into consideration and it got wild from the beginning.
I had to come up with ways to bypass Cloudflare right from the beginning.
WAF Bypass #1
First of all, I had to start with comparing the first character of current_database() with the character 'a'. The way to do this is PostgreSQL is: Cloudflare blocks the 'substr' function, so the trick is to use either the 'left' or 'right' function. I used the 'right' function for this because 'left' gave me some trouble when trying to figure out when I had found all of the database name's characters. The new query (which compares 'a' with the last resource's character looks like this: and goes by undetected by the WAF.
NOTE: The function right(current_database(), N) returns the N rightmost characters of the database name. Because of this, when the last character is found to be, for example, X, the next call to the function should be:
Since we already know that we have to close an opening parenthesis in the query (from Observation #2), the body of the POST Request looks like this (only showing the 'email' variable here):
However, remembering how the backend SQL query includes a JOIN clause (from Observation #2), I also added some extra stuff in the query to make sure the SQL Join was executed correctly in the background. The body of the POST request looked like this:
The subsequent SQL query on the server looked like this:
1 2 3 4 5 6 7 8 9 10 11 12 SELECT t1.* FROM table1 AS t1 LEFT JOIN table2 AS t2 ON t2.id = t1.id AND t2.q_id IN (1, 2, 3, 4) AND t2.input = LOWER('email@example.com') OR t2.id = t1.id AND t2.q_id IN (1, 2, 3, 4) AND ('a'=(SELECT(right((SELECT current_database()),1)))) AND ('a'='a') WHERE t1.deleted_at IS NULL AND t1.name = $1 AND ( LOWER(t1.email) = LOWER('firstname.lastname@example.org') OR t2.id = t1.id AND t2.q_id IN (1, 2, 3, 4) AND ('a'=(SELECT(right((SELECT current_database()),1)))) AND ('a'='a') OR LOWER(t2.input) = LOWER('email@example.com') OR t2.id = t1.id AND t2.q_id IN (1, 2, 3, 4) AND ('a'=(SELECT(right((SELECT current_database()),1)))) AND ('a'='a') );
It's complicated, but the idea remains the same: if 'a' is the rightmost character of the Database name, we'll get an "OK" response from the server.
Anyway, after submitting this request to the server, I saw the dreaded page of the (misconfigured) Cloudflare WAF telling me that my request was blocked. Time for some more fiddling!
Before I gave it another go, I had to understand what was Cloudflare's issue with my query.
After some trial and error I figured out that the problem was the spaces in the FROM clause in the resulting server SQL query. This lead me to the second WAF bypass.
WAF Bypass #2
The second WAF bypass technique utilized here eliminates spaces in the SQL query and encloses the parts of the SQL FROM clause in parentheses. In short:
So the resulting body of the POST request becomes:
and the subsequent SQL query on the server looks like this:
1 2 3 4 5 6 7 8 9 10 11 12 SELECT t1.* FROM table1 AS t1 LEFT JOIN table2 AS t2 ON t2.id = t1.id AND t2.q_id IN (1, 2, 3, 4) AND t2.input = LOWER('firstname.lastname@example.org')OR(t2.id = t1.id)AND(t2.q_id IN (1, 2, 3, 4))AND('a'=(SELECT(right((SELECT current_database()),1))))AND('a'='a') WHERE t1.deleted_at IS NULL AND t1.name = $1 AND ( LOWER(t1.email) = LOWER('email@example.com')OR(t2.id = t1.id)AND(t2.q_id IN (1, 2, 3, 4))AND('a'=(SELECT(right((SELECT current_database()),1))))AND('a'='a') OR LOWER(t2.input) = LOWER('firstname.lastname@example.org')OR(t2.id = t1.id)AND(t2.q_id IN (1, 2, 3, 4))AND('a'=(SELECT(right((SELECT current_database()),1))))AND('a'='a') );
The whole process could be (and was) automated at this point to find the whole value of the database name. The same process also retrieved the username (by using the user function) and the database version (by using the version() function). But what about data that was stored in the database tables? The generic query to retrieve these data, as well as the query using the bypass methods I used in my previous article were not working. Both were blocked:
Why were my queries blocked? The problem is the FROM clause that comes right after the SELECT clause. The following query would pass through the (misconfigured) Cloudflare WAF SQLi filter just fine:
As soon as the WHERE clause was introduced at the end of the query, the WAF would kick in and block the request. My ultimate goal was to retrieve data from any table. It was time to dig deeper into the rabbit hole.
(Kinda sidetracked) Third attempt
I'm giving here an early failed attempt for the SQLi, just because I'd like for this post to
show people how the thought process unfolds during a pentest.
In my attempt to decomplicate things (i.e. break away from all the JOIN and FROM clauses), I put to use a simple semicolon and comment trick (;--). The plan was to first retrieve the database name and then build on from there to retrieve data in tables:
The resulting SQL query on the server is the following:
1 2 3 4 5 6 7 8 9 10 11 12 SELECT t1.* FROM table1 AS t1 LEFT JOIN table2 AS t2 ON t2.id = t1.id AND t2.q_id IN (1, 2, 3, 4) AND t2.input = LOWER('email@example.com')OR('a'=(SELECT(right((SELECT database_name()),1))))AND('a'='a');--') WHERE t1.deleted_at IS NULL AND t1.name = $1 AND ( LOWER(t1.email) = LOWER('firstname.lastname@example.org')OR('a'=(SELECT(right((SELECT database_name()),1))))AND('a'='a');--') OR LOWER(t2.input) = LOWER('email@example.com')OR('a'=(SELECT(right((SELECT database_name()),1))))AND('a'='a');--') );
Anyway, this of course doesn't work for two reasons:
- Everything after line #5 is ignored because of the comment. This is not necessarily limiting, but I'd rather perform my SQLi inside the FROM clause.
- I get the following error: "bind message supplies 1 parameters, but prepared statement requires 0". This is because the name variable is passed to the prepared statement but line #8 is ignored so the new prepared statement does not expect the variable.
I'm giving here another early attempt of mine to retrieve data from a table, that got me closer to my goal. What I knew until this point was that the following payload would get blocked by the WAF:
NOTE: I added the LIMIT and OFFSET keywords in order to retrieve only one row from table1. LIMIT indicates that we only want one row retrieved and OFFSET indicates how many rows we want to skip before starting retrieving data. In this case OFFSET 0 indicates that the database should skip 0 rows and return the first row in table1. This is useful in order to retrieve all the table's rows one-by-one.
WAF Bypass #3
Looking back at the SQL query retrieved from the error produced by the database server, I noticed that using the FROM clause might not be necessary. The table1 table is aliased as t1 in the query using the AS keyword, and any of its columns can be referenced based on t1. You can query the column column1 of table1 as such:
This passes through the (misconfigured) Cloudflare WAF just fine and thus the payload in the body of the POST request can be transformed as such:
The subsequent SQL query on the server looks like this:
This works fine, but the limitation is that only data from table1 (or table2) can be extracted, as these are the only tables aliased in the server SQL query. Moving on to the final, successful attempt.
Final (and successful) attempt
All right, if I wanted to retrieve any piece of data I wanted from the database, I had to put aside WAF bypass technique #3. At this point it looked like there was no way to avoid using the FROM clause. It also looked like there was no way to successfully conceal the FROM clause into the payload without it being detected by Cloudflare's WAF. Great. It seemed like the answer to was I was looking for was not in the SQL query. I had to take a step back.
We've seen that the body of the request sent was a GraphQL query, which was then translated into an SQL query.
So my next attempt was to mutate the GraphQL query and somehow manage to conceal the FROM clause in there,
which would hopefully translate as a working SQL query on the server.
As I mentioned before a GraphQL query is structured similarly to a JSON object. Data in JSON is stored in a dictionary as name/value pairs which are both strings. GraphQL queries require string keys but allow arbitrary parameters. These rules apply to GraphQL:
- Data is separated by commas
- Curly braces hold objects
- Square brackets hold arrays
So a GraphQL query parameter could look like any of the following ways:
This got me thinking: what happens to the SQL query on the backend server if I pass the value of the object as an array instead of a string. In short, I wanted to break the SQL injection query and move the FROM clause to a different object in order to trick Cloudflare.
I turned this
The request bypassed the WAF and I got an error back from the database, complaining about a malformed SQL query and showing me the full resulting SQL query. The query at the injection point looked like this:
Notice the comma (,) right after SELECT column1? That was my golden ticket out of the SQLi-filtering hell. Passing the value of the GraphQL query parameter as an array is translated as a string in the backend SQL server. The string is simply the concatenation of the array's items separated by a comma and a space character! The SQL query was malformed at this point, but I could comment out the commas and have a valid, WAF-bypassing request, that retrieved whatever data I wanted from whichever database table I chose to!
This is the body of the final POST request:
and the resulting valid SQL query on the SQL server:
Full Speed Ahead
To make the table retrieving process easier, I wrote a script in Python to automate the process. The pseudocode of the script goes something like this:
And this is how I exploited GraphQL to craft an SQL injection that bypassed a misconfigured Cloudflare WAF instance and was able to retrieve the whole database in the back end. As I mentioned in the beginning, this combination of bypassing techniques do not work against a properly-configured Cloudflare WAF.
The safest way to mitigate SQL injections on your databases is prepared statements. These come in most database interaction libraries for most languages. You can find a full list of ways to mitigate SQL injections at OWASP.
I mentioned this last time, but it is my opinion that the shift-left approach is of the greatest importance when developing software. You can spend lots of money and time configuring IDSs, WAFs and controls in general, but the most important thing is for developers to give the appropriate attention to the security of the applications they are developing (or even better have a penetration tester have a go at it).
PS: Don't forget to sanitize your users' input.