When you want to quickly grep for something but the pattern is too elaborate, Semgrep comes in really handy. It’s a static analysis tool with a lot of great use cases, but one use I don’t hear about often is quickly writing disposable rules to validate an idea when reviewing code. So that’s what we’re going to do here!
Cross-site request forgery (CSRF) on GET requests
Most mature web applications and frameworks handle CSRF protections on POST/PUT/DELETE requests automatically. GET requests, however, are not supposed to perform any state-changing actions and have no CSRF protections. That’s where errors can slip in [1]! To quickly check for GET CSRF, I like to grep through all the GET (or even HEAD) routes and look for action words like create, update, delete, etc. It’s a basic heuristic, but it works well enough to catch mistakes and low-hanging fruit.
Ungreppable patterns
In some frameworks, like Ruby on Rails for example, route definitions are mostly one-liners:
get 'profile', action: :show, controller: 'users'
However, some patterns are more complicated, like this example from Kibana:
router.get(
  {
    path: '/internal/app_search/log_settings',
    validate: false,
  },
  enterpriseSearchRequestHandler.createRequest({
    path: '/as/log_settings',
  })
);
This is where Semgrep will help.
One might say that the code snippet above isn’t too bad and could be grepped if we included some newlines in the regex. However, the path isn’t always in the same place, and a Semgrep rule is much more reliable.
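To make the difference concrete, here is a minimal Python sketch of the line-based heuristic. It is only an illustration: the second Rails route, the /update suffix on the Kibana-style path, and the handler argument are all made up and not taken from either code base.

```python
import re

# One-liner routes in the Rails style. The second route is hypothetical.
RAILS_ROUTES = """\
get 'profile', action: :show, controller: 'users'
get 'profile/delete', action: :destroy, controller: 'users'
"""

# A multi-line route in the Kibana style. The '/update' suffix and the
# 'handler' argument are hypothetical.
KIBANA_ROUTE = """\
router.get(
  {
    path: '/internal/app_search/log_settings/update',
    validate: false,
  },
  handler
);
"""

# Line-based heuristic: the word "get" followed by an action word on the same line.
LINE_PATTERN = re.compile(r"\bget\b.*(create|update|delete)")


def grep_lines(source: str) -> list[str]:
    """Return the lines that a line-oriented grep would flag."""
    return [line for line in source.splitlines() if LINE_PATTERN.search(line)]


rails_hits = grep_lines(RAILS_ROUTES)
kibana_hits = grep_lines(KIBANA_ROUTE)
print(rails_hits)   # the hypothetical 'profile/delete' route is flagged
print(kibana_hits)  # empty: "get" and the action word are on different lines
```

The one-liner is caught, but the multi-line route slips through because the HTTP method and the path never share a line.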
Workflow for building a rule
I want to match routes defined as in the snippet above where the first path sounds like a state-changing action.
The first part of the workflow is to actually know the tool you’re working with! Read the documentation about writing rules so you know the features at your disposal. From reading the documentation, I know that metavariable-regex is going to be useful to me here.
Using the playground
Semgrep.live is a playground where you can quickly test your rules with the latest version of Semgrep from the comfort of your browser. (Note: I wrote this a few months ago and the editor doesn’t look the same anymore! The workflow still works, don’t worry about it.)
Let’s start a new rule by setting TypeScript as the target programming language and pasting the Kibana code from the beginning of this blog post.
What I’m looking for here are path: "something" patterns inside a router.get(...) call, so I will express that in Semgrep terms. The Semgrep code is very close to the sentence I just wrote: code is path: $PATH, and is inside router.get(...).
It matches both occurrences of path, but that’s perfectly fine. Here’s a quick breakdown of how the rule works, but really, read the documentation. :)
- ... works a bit like .* would in a regular expression; it will match anything and conveniently looks a lot like what someone would intuitively write to express that idea in a sentence
- “and is inside” (or pattern-inside as we’ll see soon) tells Semgrep to look for the path: pattern only in specific places
- $PATH in path: $PATH tells Semgrep that I want whatever is assigned to path to be saved in the $PATH variable
We’re almost there already! The most important part is still missing, however: actually matching only on action names that “sound” state-changing. To do this, let’s switch to the Advanced tab of the playground. While I’m there, I’ll give a meaningful id to my rule and set languages to TypeScript and JavaScript because both are used in Kibana.
This is where having read the documentation (have I mentioned that already?) is going to pay off, otherwise things might start looking a little cryptic. The playground is now showing the YAML representation of the rule I was writing over in the Simple tab. A few things to take note of:
- “code is path: $PATH” was translated to
      - pattern: |
          path: $PATH
  Starting a value with | is one of the many many (too many) ways to define a string in YAML. - pattern: "path: $PATH" would have been equivalent, but as patterns are frequently multi-line, the | way to express strings is useful.
- “and is inside router.get(...)” was translated to
      - pattern-inside: router.get(...)
- Both of those are nested under patterns, which allows you to use multiple patterns and apply a logical “and” to all of them. pattern-either exists when an “or” is desired, and they can be combined and nested at will.
- There’s a message attribute that Semgrep will print when it finds a match.
- There’s a severity as well. I’ll keep WARNING here given that this isn’t going to be free of false positives, but I might use ERROR when I’m really confident in a rule.
For the last part, I want Semgrep to find action words in the last segment of the path present in $PATH.
The documentation for metavariable-regex mentions the following:
“The metavariable-regex operator searches metavariables for a Python re compatible expression. This is useful for filtering results based on a metavariable’s value. It requires the metavariable and regex keys and can be combined with other pattern operators.”
This is precisely what I’m looking for. metavariable is $PATH and regex is ^.*/[^/]*(create|update|delete)[^/]*$ (see it in action on regex101 if you’re not super comfortable with regular expressions yet).
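Since the regex is Python re compatible, it can also be sanity-checked locally before going back to the playground. The sketch below runs it against paths that appear in this post, plus one made-up path (/api/create/list) to show that an action word outside the last segment is ignored.

```python
import re

# The regex from the rule, written as a Python raw string.
ACTION_PATH = re.compile(r"^.*/[^/]*(create|update|delete)[^/]*$")

samples = [
    "/internal/app_search/log_settings",
    "/internal/workplace_search/sources/create",
    "/as/engines/:engineName/curations/find_or_create",
    "/api/create/list",  # hypothetical: action word is not in the last segment
]

# Map each path to whether the regex matches it.
matches = {path: bool(ACTION_PATH.search(path)) for path in samples}

for path, hit in matches.items():
    print(f"{'match' if hit else 'skip ':5}  {path}")
```

Only the two paths whose last segment contains an action word match. The [^/]* on both sides of the alternation is what pins the action word to the final path segment.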
It didn’t match anything in my code snippet (which was expected) so I added another one with a made-up vulnerable pattern to validate that it works.
To polish things up, I changed the message to message: Check $PATH for GET CSRF, and Semgrep will replace $PATH with the actual path in the output.
This is what the final rule looks like:
rules:
  - id: kibana_get_csrf
    patterns:
      - pattern: |
          path: $PATH
      - pattern-inside: router.get(...)
      - metavariable-regex:
          metavariable: $PATH
          regex: ^.*/[^/]*(create|update|delete)[^/]*$
    message: Check $PATH for GET CSRF
    languages: [ts, js]
    severity: WARNING
My real rule has more words for the “action word” regex but I leave that as an exercise to the reader.
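If the word list grows, one option is to generate the regex instead of editing it by hand. This is just a sketch: build_action_regex is a hypothetical helper, and every word after the first three is an example of mine, not the list from my real rule.

```python
import re

# Hypothetical word list; only the first three come from the rule above.
ACTION_WORDS = ["create", "update", "delete", "remove", "reset", "enable", "disable"]


def build_action_regex(words: list[str]) -> str:
    """Build the last-segment path regex for a list of action words."""
    # Escape each word so that special characters cannot alter the regex.
    alternation = "|".join(re.escape(word) for word in words)
    return rf"^.*/[^/]*({alternation})[^/]*$"


regex = build_action_regex(ACTION_WORDS)
print(regex)  # paste this value into the regex key of the rule
```

With only create, update and delete as input, the helper produces exactly the regex used in the rule above.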
Use your rule
Now that the rule is written, it’s time to use it! Save the rule in a file and run Semgrep (output slightly trimmed to keep only the relevant bits):
$ semgrep scan --config kibana_get_csrf.yml --metrics off
  x-pack/plugins/enterprise_search/server/routes/app_search/curations.ts
     kibana_get_csrf
        Check '/internal/app_search/engines/{engineName}/curations/find_or_create' for GET CSRF
        110┆ path: '/internal/app_search/engines/{engineName}/curations/find_or_create',
          ⋮┆----------------------------------------
     kibana_get_csrf
        Check '/as/engines/:engineName/curations/find_or_create' for GET CSRF
        121┆ path: '/as/engines/:engineName/curations/find_or_create',
  x-pack/plugins/enterprise_search/server/routes/workplace_search/sources.ts
     kibana_get_csrf
        Check '/internal/workplace_search/sources/create' for GET CSRF
        924┆ path: '/internal/workplace_search/sources/create',
          ⋮┆----------------------------------------
     kibana_get_csrf
        Check '/ws/sources/create' for GET CSRF
        942┆ path: '/ws/sources/create',
And we have two findings (each route shows up twice because both of its path values match)! The second one isn’t a CSRF: it’s part of an OAuth flow and there’s a CSRF token passed as a query parameter. The first finding, however, was indeed a real CSRF. It was reported to the Elastic bug bounty program and was fixed in version 8.4.0 by changing the route to require POST.
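On a larger code base, it can help to group the matches per file before triaging them. The sketch below assumes the JSON layout I understand semgrep scan --json to produce (a results list whose entries carry check_id, path, start.line and extra.message). Treat that layout as an assumption and check it against your Semgrep version. The sample is hand-written from the output above, and group_by_file is a hypothetical helper.

```python
import json
from collections import defaultdict

# Hand-written sample mirroring the matches shown above. The field layout is
# an assumption about `semgrep scan --json`; only the fields used are included.
SAMPLE_OUTPUT = """
{
  "results": [
    {"check_id": "kibana_get_csrf",
     "path": "x-pack/plugins/enterprise_search/server/routes/app_search/curations.ts",
     "start": {"line": 110},
     "extra": {"message": "Check '/internal/app_search/engines/{engineName}/curations/find_or_create' for GET CSRF"}},
    {"check_id": "kibana_get_csrf",
     "path": "x-pack/plugins/enterprise_search/server/routes/app_search/curations.ts",
     "start": {"line": 121},
     "extra": {"message": "Check '/as/engines/:engineName/curations/find_or_create' for GET CSRF"}},
    {"check_id": "kibana_get_csrf",
     "path": "x-pack/plugins/enterprise_search/server/routes/workplace_search/sources.ts",
     "start": {"line": 924},
     "extra": {"message": "Check '/internal/workplace_search/sources/create' for GET CSRF"}},
    {"check_id": "kibana_get_csrf",
     "path": "x-pack/plugins/enterprise_search/server/routes/workplace_search/sources.ts",
     "start": {"line": 942},
     "extra": {"message": "Check '/ws/sources/create' for GET CSRF"}}
  ]
}
"""


def group_by_file(report: dict) -> dict[str, list[tuple[int, str]]]:
    """Group (line, message) pairs by the file they were found in."""
    grouped = defaultdict(list)
    for result in report.get("results", []):
        entry = (result["start"]["line"], result["extra"]["message"])
        grouped[result["path"]].append(entry)
    return dict(grouped)


grouped = group_by_file(json.loads(SAMPLE_OUTPUT))
for path, hits in grouped.items():
    print(f"{path}: {len(hits)} matches at lines {[line for line, _ in hits]}")
```

Grouped this way, the four matches collapse into the two route files that actually need a manual look.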
Conclusion
With this quick walkthrough I hope that you can feel more confident to start using Semgrep. It’s a really fun tool that sits in between grep and more in-depth code analysis tools like CodeQL. Its support for certain languages isn’t quite there yet, but report all the bugs you find and contribute to making it better. There’s an easy link to report bugs in the playground. I’ve reported a few myself and the team is always super helpful.
PS: I know I kind of sound like I was sponsored to write this, but I swear I’m not affiliated with Semgrep or the company making it (r2c). :)
[1]: GraphQL APIs can be another interesting vector for CSRF, but I won’t cover that here.