Andrea Hauser
This is how secure GraphQL really is
To understand how GraphQL implementations can be attacked, you must first understand the basics of GraphQL itself. GraphQL offers a way to query data, modify data and receive real-time updates. This article discusses queries and mutations.
Querying data looks like this:
query{ human(id: "1000") { name height } }
This query for the person with the ID 1000 is then answered as follows:
{ "data": { "human": { "name": "Luke Skywalker", "height": 1.72 } } }
Note that the received data has the same structure as the issued query. The data in the example above is formatted for readability; in the body of the actual POST request it usually looks less tidy:
{"operationName":null,"variables":{},"query":"{\n human(id: \"1000\") {\n name\n height\n }\n}\n"}
And the response body looks like this:
{"data":{"human":{"name":"Luke Skywalker","height":1.72}}}
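The step from the readable query to the request body can be sketched in a few lines of Python. This is a minimal illustration, not part of any GraphQL library: the function name `build_graphql_body` is hypothetical, and the sketch assumes the server expects the `operationName`, `variables` and `query` keys seen in the request body above.

```python
import json


def build_graphql_body(query, variables=None, operation_name=None):
    """Serialize a GraphQL operation into the JSON body of a POST request."""
    payload = {
        "operationName": operation_name,  # None is serialized as null
        "variables": variables or {},
        "query": query,
    }
    # Compact separators mirror the wire format of the example request.
    return json.dumps(payload, separators=(",", ":"))


# The query from the example, with the line breaks a client typically sends.
query = '{\n human(id: "1000") {\n name\n height\n }\n}\n'
body = build_graphql_body(query)
print(body)
```

The printed string has the same shape as the request body in the example: the line breaks of the query become `\n` sequences and the quotes around the ID are escaped.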
The example for mutations will introduce variables at the same time. The mutation itself looks like this:
mutation ($ep: Episode!, $review: ReviewInput!) { createReview(episode: $ep, review: $review) { stars commentary } }
At the same time, the variables $ep and $review must be defined. This is done as follows:
{ "ep": "JEDI", "review": { "stars": 5, "commentary": "This is a great movie!" } }
In the POST request body the whole thing looks like this:
{"operationName":null,"variables":{"ep":"JEDI","review":{"stars":5,"commentary":"This is a great movie!"}},"query":"mutation ($ep: Episode!, $review: ReviewInput!) {\n createReview(episode: $ep, review: $review) {\n stars\n commentary\n }\n}\n"}
The response to this mutation is as follows:
{ "data": { "createReview": { "stars": 5, "commentary": "This is a great movie!" } } }
Besides actually creating the review with createReview(...), this example also asks for the result of the action just performed: the fields stars and commentary in the curly brackets after createReview(...). Querying the result of the mutation is not very interesting in this example, but for a mutation like login(username,password), the result could be the login status and the authentication token.
This covers the basics needed to follow the rest of the article. The basic GraphQL examples used here come directly from graphql.org. For further examples and explanations, it is recommended to consult graphql.org directly, because the documentation available there is comprehensive and easy to understand.
The next sections each deal with concrete attack possibilities on a GraphQL endpoint.
In the simplest cases, the GraphQL endpoint is reachable at the URL /graphql. However, developers are free to choose the endpoint. If the request does not go to /graphql or something similar, the POST request body must be examined more closely. If it contains keywords like query or mutation, or many \n sequences, it may also be a GraphQL endpoint. What POST request and response bodies look like in detail has already been discussed in the introduction.
Another way to identify GraphQL endpoints is to deliberately provoke error messages. If query={} is sent to a suspected endpoint and the response contains something along the lines of "Syntax Error: Expected Name, found }", it can be assumed that it is a GraphQL endpoint.
It is also important to note that GraphQL endpoints can be addressed via both GET and POST requests. Both variants should always be tried.
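The two identification heuristics above can be expressed as small checks on captured traffic. The following is a rough sketch under stated assumptions: both function names are hypothetical, the request check assumes a JSON body with a `query` key as in the introduction, and the error check only matches the message quoted above, whose wording may differ between GraphQL implementations.

```python
import json


def looks_like_graphql_request(body):
    """Heuristic: a JSON object whose "query" value starts like a GraphQL operation."""
    try:
        payload = json.loads(body)
    except ValueError:
        return False
    if not isinstance(payload, dict):
        return False
    query = payload.get("query")
    # A shorthand query starts with "{", named operations with a keyword.
    return isinstance(query, str) and query.lstrip().startswith(("query", "mutation", "{"))


def looks_like_graphql_error(response_text):
    """Heuristic: the response contains the parser error provoked by query={}."""
    return "Syntax Error" in response_text and "Expected Name" in response_text


captured_body = '{"operationName":null,"variables":{},"query":"{\\n human(id: \\"1000\\") {\\n name\\n }\\n}\\n"}'
print(looks_like_graphql_request(captured_body))
print(looks_like_graphql_error('{"errors":[{"message":"Syntax Error: Expected Name, found }"}]}'))
```

Both checks are deliberately loose. A positive result only suggests that an endpoint is worth a closer look, and a negative result does not rule GraphQL out.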
If not properly disabled, appending ?debug=1 to the URL can enable descriptive error messages and stack traces. For attacks on GraphQL endpoints, these stack traces make it much easier to debug failed attempts. Production environments should therefore never expose technical error messages, and the debug parameter should be disabled.
Introspection queries are queries that allow you to query the entire schema of an application; that is, to list all possible queries and mutations as well as all defined types. The default setups of GraphQL do not prevent introspection queries. The information that can be obtained by an introspection query makes the work of an attacker easier. A comprehensive introspection query looks like this:
query IntrospectionQuery { __schema { queryType { name } mutationType { name } subscriptionType { name } types { ...FullType } directives { name description locations args { ...InputValue } } } } fragment FullType on __Type { kind name description fields(includeDeprecated: true) { name description args { ...InputValue } type { ...TypeRef } isDeprecated deprecationReason } inputFields { ...InputValue } interfaces { ...TypeRef } enumValues(includeDeprecated: true) { name description isDeprecated deprecationReason } possibleTypes { ...TypeRef } } fragment InputValue on __InputValue { name description type { ...TypeRef } defaultValue } fragment TypeRef on __Type { kind name ofType { kind name ofType { kind name ofType { kind name ofType { kind name ofType { kind name ofType { kind name ofType { kind name } } } } } } } }
The response to such an introspection query is usually very large, so processing it manually and extracting the interesting information is very time-consuming. Tools like GraphQL Voyager can visualize the results of introspection queries. Such visualizations can then be evaluated and the interesting areas approached more specifically. The visualization may also reveal that data fields are accessible that should not be public.
To prevent the simple and complete evaluation of the entire schema, introspection should be deactivated in the production environment. How easy this is depends on the language and library used. For PHP, for example, introspection can be deactivated as follows:
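When no visualization tool is at hand, a first overview can be extracted with a short script. The sketch below assumes the response has the shape requested by the introspection query above (`data.__schema` with `queryType`, `mutationType` and `types`). The function name `summarize_schema` and the tiny sample response are illustrative assumptions, not output from a real server.

```python
def summarize_schema(introspection_result):
    """List root query fields, root mutation fields and object types with their fields."""
    schema = introspection_result["data"]["__schema"]
    query_root = (schema.get("queryType") or {}).get("name")
    mutation_root = (schema.get("mutationType") or {}).get("name")

    summary = {"query": [], "mutation": [], "types": []}
    for type_info in schema.get("types", []):
        name = type_info.get("name") or ""
        if name.startswith("__"):
            continue  # skip the introspection meta types such as __Schema
        field_names = [field["name"] for field in type_info.get("fields") or []]
        if name == query_root:
            summary["query"] = field_names
        elif name == mutation_root:
            summary["mutation"] = field_names
        elif type_info.get("kind") == "OBJECT":
            summary["types"].append((name, field_names))
    return summary


# Hypothetical, heavily shortened introspection response for illustration only.
sample = {"data": {"__schema": {
    "queryType": {"name": "Query"},
    "mutationType": {"name": "Mutation"},
    "subscriptionType": None,
    "types": [
        {"kind": "OBJECT", "name": "Query", "fields": [{"name": "human"}]},
        {"kind": "OBJECT", "name": "Mutation", "fields": [{"name": "createReview"}]},
        {"kind": "OBJECT", "name": "Human", "fields": [{"name": "name"}, {"name": "height"}]},
        {"kind": "SCALAR", "name": "String", "fields": None},
        {"kind": "OBJECT", "name": "__Schema", "fields": [{"name": "types"}]},
    ],
}}}

summary = summarize_schema(sample)
print(summary["query"], summary["mutation"], summary["types"])
```

The list of mutations is usually the most useful starting point, since it shows every operation that changes state.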
<?php use GraphQL\GraphQL; use GraphQL\Validator\Rules\DisableIntrospection; use GraphQL\Validator\DocumentValidator; DocumentValidator::addRule(new DisableIntrospection());
The exact details for other implementation languages used can be found most easily in the documentation of the implementation itself.
Besides introspection itself, developers can also use GraphiQL, the GraphQL IDE. The IDE may be reachable under endpoints such as /graphiql, /graphql/console or /__graphql. This makes the same information available as introspection does, but in a nice and readable form.
GraphiQL should only be used in development or test environments that are not available to the public.
If a schema is designed carelessly, two types can reference each other and form a loop. Queries against such a loop can be nested deeper and deeper, overloading the server and thus leading to denial of service. As an example, consider an application that lists authors and their works. One query returns an author, and the result contains that author's works. In addition, the author of a work can be queried. These two relationships can be nested as deeply as desired.
query { author(id:1) { books { author { books { author { books { author { … } } } } } } } }
There are different approaches to solving this problem. One is depth limiting: a maximum is set for how deeply queries may be nested. Another is to limit the execution time of a query and abort any query that exceeds the limit.
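A depth limit can be illustrated with a simple pre-check that counts nested selection sets before a query is executed. This is a deliberately naive sketch: the function names and the default limit of 5 are assumptions, braces inside string arguments are skipped, but fragments are not resolved and braces of input objects count as nesting too. Production systems would more likely use the validation rules of their GraphQL library.

```python
def query_depth(query):
    """Return the maximum nesting depth of curly brackets outside string literals."""
    depth = max_depth = 0
    in_string = False
    escaped = False
    for ch in query:
        if in_string:
            if escaped:
                escaped = False  # the character after a backslash is literal
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch == "{":
            depth += 1
            max_depth = max(max_depth, depth)
        elif ch == "}":
            depth -= 1
    return max_depth


def enforce_depth_limit(query, max_depth=5):
    """Raise ValueError if the query is nested deeper than max_depth."""
    depth = query_depth(query)
    if depth > max_depth:
        raise ValueError(f"query depth {depth} exceeds limit {max_depth}")
    return depth


# The leaf field "name" is a hypothetical addition to the author/books example.
nested = "query { author(id:1) { books { author { books { name } } } } }"
print(query_depth(nested))
```

Calling `enforce_depth_limit(nested, max_depth=3)` raises a `ValueError`, so a server using such a check would reject the query before resolving any of it.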
In the GraphQL documentation batching is described as follows:
Without additional consideration, a naive GraphQL service could be very “chatty” or repeatedly load data from your databases. This is commonly solved by a batching technique, where multiple requests for data from a backend are collected over a short period of time and then dispatched in a single request to an underlying database or microservice.
This means that a single request can contain several mutations at once, which are then processed one by one. This allows, for example, brute force attacks on login procedures and on token-based two-factor mechanisms. Since only a single large POST request is required, such a brute force attack may go unnoticed by classic WAFs. To prevent such attacks, brute force attempts must be recognized and handled in the business logic.
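Handling brute force attempts in the business logic means counting attempts per account rather than per HTTP request. The following sketch shows one possible shape of such a counter. The class name, the limits and the injectable clock are assumptions made for illustration, and a real deployment would need shared storage instead of an in-memory dictionary.

```python
import time
from collections import defaultdict, deque


class LoginAttemptLimiter:
    """Track failed logins per username within a sliding time window."""

    def __init__(self, max_attempts=5, window_seconds=300.0, clock=time.monotonic):
        self.max_attempts = max_attempts
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the behaviour can be tested
        self._failures = defaultdict(deque)

    def is_blocked(self, username):
        """Return True if the account has too many recent failed attempts."""
        failures = self._failures[username]
        cutoff = self.clock() - self.window_seconds
        while failures and failures[0] <= cutoff:
            failures.popleft()  # drop failures that left the window
        return len(failures) >= self.max_attempts

    def record_failure(self, username):
        """Remember one failed login attempt for this account."""
        self._failures[username].append(self.clock())


limiter = LoginAttemptLimiter(max_attempts=3, window_seconds=60)
for _ in range(3):
    limiter.record_failure("luke")
print(limiter.is_blocked("luke"))
```

A resolver for a mutation like login(username,password) would call `is_blocked` before checking the password and `record_failure` after a wrong one. Because the check runs once per mutation, many login attempts packed into a single POST request are counted individually.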
Besides all the GraphQL-specific vulnerabilities, it should not be forgotten that the internal services and databases behind the GraphQL API may suffer from classic vulnerabilities from the OWASP Top 10 or similar. GraphQL also does not implement authentication and authorization itself. These concepts are left to the developers of the backend and should always be checked whenever a GraphQL API is encountered.
With every new technology there are more pitfalls for developers. The attacks listed here and their countermeasures can be used as a first step to implement a more secure GraphQL application.