Regular Expression Denial of Service

Service uptime and availability are crucial to the success of online businesses. In a rapidly evolving world, transactions must remain productive and help clients meet their business goals in a timely manner. Downtime can be caused deliberately through an attack known as Denial of Service (DoS). In this article, we will learn about a specific form of Denial-of-Service attack that exploits poorly written regular expressions. According to a Snyk report published in 2019, ReDoS vulnerabilities in Node applications had spiked by 143%.

To better understand ReDoS, or Regex DoS, let us first look at some under-the-hood concepts of regular expressions and how they are used in modern web applications.

Regular Expressions: A Primer

A regular expression is a sequence of characters that defines a search pattern. Regular expressions are a powerful tool for manipulating text, but they can be complex and difficult to master. They can match a wide range of patterns, including specific characters, words, numbers, and symbols, as well as more complex constructs, such as groups, repetitions, and alternations. Regular expressions use special characters and operators, such as * (zero or more), + (one or more), ? (zero or one), and | (or), to create these patterns. Proper testing and optimization are necessary to avoid performance issues and vulnerabilities, such as ReDoS attacks.

To better understand a regex, consider the following regular expression:

^[A-Za-z]+$

This regular expression matches any non-empty string made up entirely of the letters A to Z, in upper or lower case. The "^" and "$" symbols anchor the match to the beginning and end of the string, respectively. The square brackets "[ ]" define a character class, here two ranges of characters, and the "+" symbol indicates that the preceding character or group must appear one or more times.

Hence, a string like “ExpectoPatronum” would match this expression, whereas the string “Expecto12Patronum” would fail to match as it contains non-alphabetic characters.
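
The behaviour described above can be checked with a few lines of Python. This is only a minimal sketch: the helper name is_letters_only is made up for illustration and is not part of any library.

```python
import re

# The pattern from the text: one or more ASCII letters and nothing else.
LETTERS_ONLY = re.compile(r"^[A-Za-z]+$")

def is_letters_only(text: str) -> bool:
    """Return True when text consists solely of the letters A-Z or a-z."""
    return LETTERS_ONLY.match(text) is not None

for candidate in ("ExpectoPatronum", "Expecto12Patronum", ""):
    print(repr(candidate), is_letters_only(candidate))
```

One Python-specific caveat: "$" also matches just before a trailing newline, so re.fullmatch with the pattern [A-Za-z]+ is the stricter choice when the input must contain nothing else at all.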

Regular Expression DoS

Building upon this concept, imagine a regex pattern that gets stuck when a certain string is fed into it. Such a pattern is called an Evil Regex. A Regular Expression Denial of Service (ReDoS) attack exploits the way some software processes regular expressions, or in other words, it exploits an evil regex.

An attacker can craft a specially designed input that triggers a regular expression to perform many unnecessary computations. This can cause the software to freeze or crash, and consume excessive resources, such as CPU time and memory, rendering the system unusable.

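For example, consider the following Regular Expression:

^(g+)+$

If I supply this input - gggggggggggggggggggggggg! - the regex can never match because of the trailing "!", but a backtracking engine first tries every possible way of splitting the run of g's between the inner and the outer "+". The number of attempts roughly doubles with each additional "g", so for a long enough input the match effectively never finishes. The anchors matter here: without "^" and "$", as in (g+)+, a search would succeed immediately on the leading g's. As a result, the dependent library or application that uses this pattern to validate some input is going to hang and may even crash. Regular expressions are used in almost every layer of the web to perform some functionality. In the next section, we will see how they appear in different components.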
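
To see the effect without freezing a machine, the following sketch times a failed match for a few small input sizes. It assumes the anchored form of the pattern, ^(g+)+$, so that the whole input has to match, and it relies on Python's re module being a backtracking engine. The helper name time_failed_match is hypothetical, and the timings will differ from machine to machine.

```python
import re
import time

# Nested quantifiers: the inner g+ and the outer (...)+ can split a run of
# g's in many different ways. The anchors force the whole input to match.
EVIL = re.compile(r"^(g+)+$")

def time_failed_match(n: int) -> float:
    """Seconds spent rejecting n g's followed by '!' (illustrative helper)."""
    attack = "g" * n + "!"
    start = time.perf_counter()
    result = EVIL.match(attack)
    elapsed = time.perf_counter() - start
    assert result is None  # the trailing '!' can never match
    return elapsed

# Sizes are kept small on purpose: the work grows very quickly with n.
for n in (10, 14, 18):
    print(f"n={n:2d}  {time_failed_match(n):.4f}s")
```

Increasing n by a handful of characters at a time is usually enough to watch the time climb from milliseconds to seconds.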

Vulnerable Software Impacted by ReDoS

Any software application that runs regular expressions over untrusted input is potentially vulnerable to ReDoS attack. Some examples of affected software include:

1. Web Applications: Web applications use regular expressions for input validation, for example to validate passwords, usernames, and email addresses. This opens the potential for ReDoS attacks in web applications. The vulnerability may lie on the client side, the server side, or both. There are mainly two strategies for input validation using regex:

  • Whitelisting: Accept the known good by validating the entire user input. This type of regex only accepts valid input that matches the pattern.
  • Blacklisting: Reject anything that matches any of the known-bad commands/inputs. In this scenario, the regular expression is used to identify attack fingerprints.

Many programmers are unaware of evil regex. QA testing of a web application mainly focuses on checking valid inputs, whereas an attacker exploits invalid inputs: the regex engine running inside the web application will try all possible paths before it rejects the input. Another reason ReDoS plagues web applications is the simplicity of the attack. The code base of the application is often open source, and the same regex pattern is often used on both the client side and the server side, so an attacker can simply read the pattern. There is also a lack of dynamic tools for regex evaluation.
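
The two validation strategies can be sketched as follows. Both patterns and both function names are hypothetical and chosen only for illustration; a real application would define its own rules.

```python
import re

# Hypothetical patterns, for illustration only.
USERNAME_ALLOWED = re.compile(r"[A-Za-z0-9_]{3,20}")      # whitelist
ATTACK_FINGERPRINTS = re.compile(r"<script|\.\./", re.I)   # blacklist

def accept_by_whitelist(value: str) -> bool:
    """Accept only input that matches the known-good pattern in full."""
    return USERNAME_ALLOWED.fullmatch(value) is not None

def reject_by_blacklist(value: str) -> bool:
    """Return True when the input contains a known-bad fragment."""
    return ATTACK_FINGERPRINTS.search(value) is not None

print(accept_by_whitelist("harry_99"))          # valid username
print(accept_by_whitelist("harry<script>"))     # contains disallowed characters
print(reject_by_blacklist("../../etc/passwd"))  # path traversal fingerprint
```

Either way the regex runs over untrusted input, so the shape of the pattern matters. Neither pattern in this sketch nests one quantifier inside another, which is the construct that makes a pattern evil.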

2. Text Editors: Text editors allow a user to edit text or write source code files in different programming languages. This type of software typically comes with search-and-replace functionality that uses regular expressions to match whatever the user searches for. Consequently, it exposes an attack surface for ReDoS. This includes popular text editors like Vim, Sublime Text, and Atom. One interesting case study is CKEditor 5, an online text editor that allows text to be exported to Markdown. During the export process, it prevents default escaping of links. The regular expression used to recognize links can run extremely slowly when an attacker supplies a crafted input, which may lead to denial of service.
3. Command-line Utilities: Command-line utilities that use regular expressions for search and replace, such as grep and sed, can also be exposed to ReDoS attacks. The effect is easy to observe from the command line. The following command runs an evil regex in Node and pipes the output to gnomon, a utility that prepends timing information to each line of output:
node -p "/^(\w+\s?)*$/.test('A long sentence with invalid characters that takes so much time to be matched that it potentially causes our CPU usage to increase drastically!!!')" | gnomon

While this command runs, the Node process consumes a very high percentage of CPU as it evaluates the regular expression. This happens for the following reasons:

  • The engine starts at the beginning of the input and tries to match the group in parentheses, \w+\s?, which is one word optionally followed by a whitespace character.
  • As the quantifiers + and * are greedy, the engine matches as many valid words as it can: “A long sentence with invalid characters that takes so much time to be matched that it potentially causes our CPU usage to increase drastically”.
  • The star quantifier in (\w+\s?)* is applied once more, but there are no more valid words in the input, only “!!!”, so the group matches nothing further.
  • Due to the $ anchor in our pattern, the regex engine now needs to be at the end of the input. The remaining “!!!” is still unmatched, so there is no match.
  • The engine moves one step back to the previous position and tries a different path, hoping to find a match. This is called backtracking. The quantifier + gives back the last character it consumed, so the final repetition now covers only “drasticall”, and the engine tries to match the rest of the pattern from there.
  • The engine then continues its search from that position: the * quantifier is applied again and matches the leftover “y” as a new repetition. The $ anchor is tried again, and it fails again at “!!!”. Because every word can be split up in many ways, the engine repeats this for an enormous number of combinations before it finally gives up.
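
The walk-through above can be made concrete with a toy simulation. The sketch below is not how any real engine is implemented. It is a hand-written, naive backtracking matcher for the single pattern ^(\w+\s?)*$ that counts how often the group is tried. The names is_word_char and count_attempts are invented for this example, and \w is approximated as letters, digits, and underscore.

```python
def is_word_char(ch: str) -> bool:
    # Approximation of \w: letters, digits, and underscore.
    return ch.isalnum() or ch == "_"

def count_attempts(text: str) -> tuple[bool, int]:
    """Naively match ^(\\w+\\s?)*$ and count how often the group is tried."""
    attempts = 0

    def match_from(pos: int) -> bool:
        nonlocal attempts
        # Find the longest run of word characters starting at pos.
        end = pos
        while end < len(text) and is_word_char(text[end]):
            end += 1
        # Greedy \w+: try the longest word first, then give back one
        # character at a time. This loop is the backtracking.
        for word_end in range(end, pos, -1):
            attempts += 1
            # Greedy \s?: first try to consume one whitespace character.
            if word_end < len(text) and text[word_end].isspace():
                if match_from(word_end + 1):
                    return True
            # Otherwise let \s? match nothing and repeat the group.
            if match_from(word_end):
                return True
        # No further repetition worked, so $ must match right here.
        return pos == len(text)

    return match_from(0), attempts

print(count_attempts("ab cd"))  # valid input: matched after two attempts
for n in (4, 8, 12, 16):
    # n word characters followed by '!': attempts grow as 2**n - 1
    print(n, count_attempts("a" * n + "!")[1])
```

Valid input is accepted almost immediately, while a single invalid character at the end makes the number of attempts double with every extra word character. That is the pattern of growth an attacker relies on.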

4. Programming Languages and Libraries: Many programming languages and libraries use regular expressions for tasks such as input validation and pattern matching. Examples include JavaScript, Python, Ruby, .NET, and Java. To better understand how a programming language can be impacted, let us look at the interesting case study of CVE-2023-28755. This affected the URI component of Ruby, which ships as the uri gem. The URI parser mishandled invalid URLs that contained specific characters. Consequently, parsing such strings into URI objects took far longer than normal, and an attacker could exploit that increase in execution time to cause a denial of service.

PoC

For a better understanding of this vulnerability, I present my experiment with the online demo of CKEditor 5. The demo is available for you to try at: https://ckeditor.com/docs/ckeditor5/latest/features/markdown.html. On the demo page, I found a demo of the Markdown editor tool.

The section highlighted in red is our target area to be checked. In the input box, I supplied the following input:

https://''''''''''''''''''''''''''''''''''''''''''''''''''''''

The application responded normally with the same output. The moment I tried to add some extra slashes in between, as seen in the screenshot, the application stopped rendering the input. The regex engine got stuck parsing the URL that I had supplied. I left the page open in a browser tab for almost a day, and it still could not finish parsing it.

I even made the input box blank, but the output remained the same as the one that I had previously supplied. This is how a regex DoS vulnerability works in a web text editor application.

Fixes

ReDoS is a critical vulnerability because it can bring down services and impact availability for users. Hence, security teams must be able to prevent or detect it before the code moves into production.

Avoid Dynamic Regex Pattern

In a dynamic regular expression, the pattern itself is built from the input text. This lets the application tailor the match closely to whatever the user supplies. However, such patterns are dangerous: a malicious input can turn the pattern into an evil regex that hangs and makes the application unresponsive. Regex patterns built dynamically from input must be avoided wherever possible. If one is unavoidable, the input must be appropriately sanitized before it becomes part of the pattern.
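
One common way to sanitize input before it becomes part of a pattern is to escape every metacharacter, so that the user's text is treated as a literal. The sketch below uses Python's re.escape for this. The function name find_literal and the sample strings are hypothetical, and escaping is only one possible approach.

```python
import re

def find_literal(user_text: str, document: str) -> list[str]:
    """Search document for user_text as literal text, ignoring case."""
    # re.escape neutralises metacharacters such as ( ) + * so the user's
    # text cannot introduce groups or quantifiers into the pattern.
    pattern = re.compile(re.escape(user_text), re.IGNORECASE)
    return pattern.findall(document)

# The evil pattern is now just a string to look for, not a regex to run.
print(find_literal("(g+)+", "see (g+)+ and (G+)+"))
print(find_literal("a.c", "abc a.c"))
```

With escaping in place, an attacker who submits (g+)+ only triggers a plain text search for those five characters.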

Check Before Implementation

Any regex pattern should be checked for ReDoS safety before it is included in the functionality of an application. Some tools and methods for testing this are:

1. Dynamic regex testing: Here you may try testing in the following manner:

    • Technique 1:
      • Attempt to penetrate the system with different inputs.
      • Observe the response time of the system. If it increases, repeat the characters of that input to make it longer.
      • If the response time keeps growing as the input gets longer, the application may be using an evil regex and may be vulnerable to ReDoS.
    • Technique 2:
      • In places where the application accepts input, try injecting an invalid escape sequence like “\m”.
      • If the response to this invalid sequence differs from the usual response, the input is probably being interpreted as a regular expression, and the application may be vulnerable to ReDoS.
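
Both techniques can be rehearsed locally before trying them against a real target. In the sketch below, search_handler and validate_handler are hypothetical stand-ins for application endpoints, and an exception plays the role of a "different response". Against a real system the tester would compare HTTP responses and timings instead.

```python
import re
import time
from typing import Callable

def probe_growth(handler: Callable[[str], object], unit: str = "a",
                 tail: str = "!", sizes=(8, 12, 16)) -> list[tuple[int, float]]:
    """Technique 1: time the handler while repeating one character."""
    timings = []
    for n in sizes:
        start = time.perf_counter()
        handler(unit * n + tail)
        timings.append((n, time.perf_counter() - start))
    return timings

def reacts_to_bad_escape(handler: Callable[[str], object]) -> bool:
    """Technique 2: does the invalid escape \\m change the outcome?"""
    try:
        handler("\\m")
    except re.error:
        return True   # the input was compiled as a pattern and rejected
    return False

def search_handler(user_input: str) -> bool:
    # Hypothetical handler that compiles user input as a pattern.
    return re.search(user_input, "some stored text") is not None

def validate_handler(user_input: str) -> bool:
    # Hypothetical handler with a fixed pattern that nests quantifiers.
    return re.match(r"^(a+)+$", user_input) is not None

print(reacts_to_bad_escape(search_handler))    # input reaches the regex compiler
print(reacts_to_bad_escape(validate_handler))  # input is only matched as data
for n, seconds in probe_growth(validate_handler):
    print(n, f"{seconds:.5f}s")
```

Timings that keep rising sharply as the repeated input grows are the signal described in Technique 1. A handler that fails on \m is the signal described in Technique 2.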

2. Penetration testing

3. Fuzzing: This can be done with automated fuzzing tools. A list of inputs is supplied, and the response time and response code are recorded for each input.

4. Static regex code analysis: Here one can try to check in the following manner:

    • Analyse the source code to check for regex patterns being used.
    • For every regular expression pattern that you find, check whether it contains any evil pattern or whether it accepts user-defined input into any parameter.
    • If such a pattern is found, there is a possibility that the application can be DoS-ed through that regular expression.
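
As a starting point for static analysis, the sketch below flags one common evil construct: a group that contains + or * and is itself followed by +, * or {. This heuristic is an assumption of this example, not a complete analysis. It ignores character classes and escapes, so it can raise false alarms and miss real problems. The names NESTED_QUANTIFIER, looks_evil, and scan are hypothetical.

```python
import re

# Rough heuristic: "(" ... a + or * somewhere inside ... ")" followed by
# another quantifier. Nested groups are not handled.
NESTED_QUANTIFIER = re.compile(r"\([^()]*[+*][^()]*\)[+*{]")

def looks_evil(pattern: str) -> bool:
    """Return True when the pattern text contains a quantified group
    that itself contains a quantifier."""
    return NESTED_QUANTIFIER.search(pattern) is not None

def scan(patterns: list[str]) -> list[str]:
    """Return the patterns that deserve a closer manual review."""
    return [p for p in patterns if looks_evil(p)]

found_in_source = [r"^[A-Za-z]+$", r"(g+)+", r"^(\w+\s?)*$", r"(abc)+"]
for suspicious in scan(found_in_source):
    print("review:", suspicious)
```

Anything the scan reports still needs a manual look, and a clean result does not prove that a pattern is safe.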

June 9, 2023
© HAKIN9 MEDIA SP. Z O.O. SP. K. 2023