The MarkDownTime Vulnerability Explained & How to Prevent It

Everybody is familiar with downtime in major services. It can be very frustrating when a platform your organization depends upon becomes unavailable. And when that platform is a critical part of your software supply chain, downtime means your production pipeline stops working and, in effect, your entire software factory is down. The damage can be very expensive. Now, imagine what would happen if a bad actor found a vulnerability that allows an unauthenticated user to take down business-critical infrastructure with one line of code...

In this article, we will explore "MarkDownTime" - a vulnerability we found in a very popular implementation of the markdown engine and the Denial-of-Service (DoS) attack that it could cause on dependent projects, such as GitHub and GitLab. Software supply chains can contain multiple looming threats and vulnerable dependencies. When a popular library is vulnerable to an easy-to-exploit attack, it will potentially cause millions of organizations to be vulnerable. Many commercial products might use libraries such as the ones we'll discuss, so users can be exposed to threats without even knowing. This is a call for action – all vendors using Markdown need to check if they are using a vulnerable implementation, as described below, and take the necessary actions to avoid being attacked.

What is Markdown?

Markdown is a lightweight markup language for creating formatted text using a plain-text editor. Since so many projects require this functionality, there are multiple implementations for multiple purposes and languages, following the CommonMark Spec. One of the most popular variants is GitHub’s implementation, GFM (GitHub Flavored Markdown), which introduced various extended features, such as syntax highlighting, task lists, and tables. GFM is used internally by multiple libraries that provide Markdown support for popular languages, such as Ruby, Go, Haskell, Lua, Perl, Python, and R.

Software Supply Chain Vulnerabilities

The problem with vulnerabilities in popular third-party libraries is that it’s incredibly complicated to track which projects are affected. We’re not talking about forks that are easily updated when a bug is found, but about uncontrolled copies of the original vulnerable code. Each Markdown implementation has a different codebase; some were created to support different languages (JavaScript, Golang, C, Ruby, etc.), and some were created to implement various features. When such a library becomes popular and widespread, a vulnerability inside it could potentially enable an attack on millions of projects.

Let’s look at commonmarker, RubyGems’ most popular markdown parsing library, as an example. This library has over 1 million dependent projects, and the gem has been downloaded more than 27 million times. Its internal implementation is based on a copy of GFM. So a vulnerability in the GFM code will affect not only the GitHub project and all of its dependents, but also commonmarker and all of its dependents. The same goes for any other implementation of GFM, and the picture becomes chaotic. It’s practically impossible to keep track of all affected projects.

During our research, we found two SCM services susceptible to MarkDownTime – GitHub and GitLab – and reported it to them. All issues were fixed.

This article will explore the exploit on GitHub through the commonmarker RubyGem. GitHub exposed the markdown engine through a REST API that, in some cases, did not require authentication. As a result, an unauthenticated attacker could easily bring down any GitHub server, shutting down production pipelines. Rate limiting does not prevent the attack from taking down the service.

GitHub Vulnerability and Exploitation

GitHub uses commonmarker as its markdown engine. As stated above, commonmarker is a Ruby gem wrapper for libcmark-gfm. Issuing a markdown request to the /markdown API endpoint with specially crafted text results in high CPU usage until the request times out after 60 seconds.

Steps to Reproduce

MarkDownTime is triggered by issuing an HTTP POST request to /markdown with the following arguments:

Set the text field to the following value (generated here by a Python one-liner to avoid bloating the article)
- The script: python3 -c 'print("![l" * 100000 + "\n")'
Set the gfm field to either True or False (both cases are vulnerable).
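
For readers who prefer a script over a shell one-liner, the same text value can be built as shown in this minimal sketch. The names build_payload and repeats are illustrative only and are not part of any GitHub tooling; the code only constructs the string and does not send it anywhere.

```python
def build_payload(repeats: int = 100_000) -> str:
    """Build the text described above: "![l" repeated, then a newline.

    `build_payload` and `repeats` are hypothetical names used for illustration.
    """
    return "![l" * repeats + "\n"


if __name__ == "__main__":
    payload = build_payload()
    # Each "![l" unit is 3 characters, so the payload stays small
    # compared with the CPU time it costs a vulnerable parser.
    print(f"{len(payload)} characters, {payload.count('![')} bracket openers")
```

With the default of 100,000 repetitions the text is roughly 300 KB, which is small enough to fit in an ordinary request.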

The markdown API's authorization is configurable, so in some cases it is possible to attack without authorization (for example, GitHub cloud and Enterprise deployments where the markdown API is configured not to require authorization). Multiple requests can be issued in parallel to create a larger impact. Unauthenticated users can leverage MarkDownTime to mount a DoS attack on GitHub cloud services. Rate limiting doesn’t help, as a small number of requests is enough to carry out the attack.

Note: this vulnerability affected both GitHub.com and GitHub Enterprise Server. See GitHub's documentation on Markdown for reference.

What Is the Bug Behavior?

When rendering a markdown request, GitHub uses the commonmarker Ruby package with the autolink extension. This allows an attack with polynomial complexity, causing DoS. The first entry into the .so is the call to markdown_to_html from commonmarker.rb:
[Figure: code snippet of the HTML rendering method in commonmarker.rb, with UTF-8 encoding]

markdown_to_html is exported by commonmarker.so. It calls static VALUE rb_markdown_to_html, which calls cmark_parser_finish, which eventually calls parse_inline in a loop over subj, the text given by the user.
So it peeks at and advances past every character in the user text. The flow we are interested in is this:
[Figure: switch statement in the C parser that handles each character]

push_bracket adds a bracket to subj, with image set to true and active set to true.
Eventually, try_extensions calls the autolink flow, which loops over all created brackets that have image true and active true (from the match function in autolink.c):
[Figure: conditional C code that checks whether the inline parser is in a bracket and returns NULL if so]

From autolink.c, cmark_inline_parser_in_bracket:
[Figure: C code snippet of the inline parser check for image bracket status]

So, if we repeat “![l” many times as the user text in the markdown API request, each occurrence creates a bracket, and every “l” (default flow) loops over all brackets created so far. The total work therefore grows polynomially with the input length, causing DoS.
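
The growth described above can be illustrated with a toy model in Python. This is a sketch of the described flow only, not the real cmark-gfm code: count_bracket_visits is a hypothetical function, and it assumes that every "![" pushes one bracket and that every other character walks the whole bracket stack once.

```python
def count_bracket_visits(text: str) -> int:
    """Count bracket-stack visits in a toy model of the flow described above.

    Assumptions (not taken from the real parser source):
    - every "![" pushes one image bracket that stays active;
    - every other character walks all brackets created so far.
    """
    brackets = []  # stand-in for the parser's bracket stack
    visits = 0
    pos = 0
    while pos < len(text):
        if text.startswith("![", pos):
            # Models push_bracket: image true, active true.
            brackets.append({"image": True, "active": True})
            pos += 2
        else:
            # Models the default flow: the autolink check scans every bracket.
            for _bracket in brackets:
                visits += 1
            pos += 1
    return visits


if __name__ == "__main__":
    for repeats in (100, 200, 400):
        print(repeats, count_bracket_visits("![l" * repeats))
```

In this model, n repetitions cost n(n+1)/2 visits, so doubling the input roughly quadruples the work. This is why a payload that is small on the wire can still occupy a CPU until the request times out.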

Commonmarker Exploitation

[Animation: how the exploitation happens within the commonmarker engine, following the analysis in the previous section]

Commonmarker DoS Impact

After 60 seconds (timeout), the request fails. Meanwhile, on the server end, a single CPU is fully consumed. Issuing multiple requests in parallel consumes multiple CPUs. By repeatedly sending a small number of requests, an attacker can make the entire server completely unavailable.

Commonmarker CVE Disclosures

We reported the vulnerability to both GitLab and GitHub, as well as to the commonmarker RubyGem maintainer. The report was acknowledged for all implementations. GitLab issued CVE-2022-2931 and fixed it internally. GitHub issued CVE-2022-39209, fixed it in cmark-gfm, and afterwards fixed all of their products. commonmarker’s maintainer fixed it and issued the security advisory GHSA-4qw4-jpp4-8gvp.

Vulnerability Disclosure Timeline

[2022-04-18] Reported the vulnerability to GitLab
[2022-05-10] GitLab verified the vulnerability
[2022-06-29] Reported the vulnerability to GitHub
[2022-06-30] GitHub verified the vulnerability
[2022-07-27] Reported the vulnerability to gjtorikian (commonmarker's maintainer)
[2022-08-30] GitLab shipped a fix for the vulnerability with version 15.3.2
[2022-09-21] commonmarker RubyGem shipped a fix for the vulnerability with version 0.23.6
[2022-11-18] GitHub confirmed remediation throughout their products and provided approval for this writeup

How do I know if I’m vulnerable? And what should I do if I am?

If your product supports markdown capabilities, try to understand whether or not it’s vulnerable by following one of these options:

Which markdown library is used by your project?
1. Are you directly using cmark-gfm? Make sure it is updated to at least 0.29.0.gfm.6.
2. Are you using the Ruby wrapper, commonmarker? Make sure it is updated to at least v0.23.6.
3. If you’re using another markdown library, check whether it’s based on GFM and whether it contains the vulnerable autolink extension as described above. Understanding cmark-gfm’s fix can help with this assessment. If you find the vulnerability, it is highly advised to contact the maintainer and alert them to the issue.
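
For Ruby projects, the commonmarker check in item 2 can be automated against a Gemfile.lock. The following is a minimal sketch: the helper names are hypothetical, and it assumes the usual lockfile layout in which a resolved gem appears as an indented "commonmarker (x.y.z)" line.

```python
import re

# First fixed commonmarker release named in this article.
FIXED_COMMONMARKER = (0, 23, 6)


def locked_commonmarker_version(lockfile_text: str):
    """Return the pinned commonmarker version as a tuple of ints, or None.

    Hypothetical helper. Constraint lines such as "commonmarker (~> 0.23)"
    are skipped because the version must start right after the parenthesis.
    """
    match = re.search(
        r"^[ \t]+commonmarker \((\d+(?:\.\d+)*)", lockfile_text, re.MULTILINE
    )
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def is_patched(version) -> bool:
    """True if the version is at least the fixed release."""
    return version >= FIXED_COMMONMARKER


if __name__ == "__main__":
    with open("Gemfile.lock", encoding="utf-8") as handle:
        version = locked_commonmarker_version(handle.read())
    if version is None:
        print("commonmarker is not pinned in this lockfile")
    else:
        print(version, "patched" if is_patched(version) else "VULNERABLE")
```

A lockfile check only covers direct use of the gem; copies of the GFM code vendored into other libraries still need the manual review described in item 3.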
If you don’t know which markdown library is used, you can attempt to run the exploit, but make sure you’re not doing it in a production environment, since a successful attempt will make your services unavailable. To do this, input the vulnerable string (as described above under “Steps to Reproduce”) into your markdown renderer. If you see very high CPU usage for a long time, hanging services, or any other indicator of DoS, you are vulnerable. In that case, go back to step (1) and put extra effort into understanding which library is used so that you can fix the vulnerability.
Contact us! We would be happy to help.
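
The local test described above can be made less disruptive by starting with small inputs and watching how render time grows, rather than sending the full-size string straight away. The sketch below assumes a render callable that you supply yourself as a wrapper around the markdown library under test; both function names are hypothetical.

```python
import time
from typing import Callable, Iterable, List, Tuple


def measure_render_times(
    render: Callable[[str], object], repeat_counts: Iterable[int]
) -> List[Tuple[int, float]]:
    """Time `render` on "![l" repeated n times, for each n in repeat_counts.

    `render` is an assumption: a wrapper you write around your own renderer.
    """
    results = []
    for repeats in repeat_counts:
        text = "![l" * repeats + "\n"
        start = time.perf_counter()
        render(text)
        results.append((repeats, time.perf_counter() - start))
    return results


def grows_faster_than_linear(
    results: List[Tuple[int, float]], margin: float = 1.5
) -> bool:
    """True if time grows clearly faster than input size between any two steps."""
    for (n1, t1), (n2, t2) in zip(results, results[1:]):
        if t1 > 0 and t2 / t1 > margin * (n2 / n1):
            return True
    return False
```

A run might call measure_render_times(my_render, [1000, 2000, 4000]) in a non-production environment, where my_render stands for your own wrapper. Very small inputs give noisy timings, so a positive result is a signal to investigate the library version, not proof on its own.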

Vulnerable Dependencies - The Struggle

When you find vulnerable code within your organization, it is usually an in-house fix: you understand the bug and fix it ASAP. However, much of the code in most organizations consists of third-party libraries that are maintained by other people (if at all). That can cause trouble.

During the process of disclosing MarkDownTime, we experienced why addressing vulnerable dependencies can be so challenging. We first encountered the issue in GitLab, and after further investigation, we realized the issue originates in commonmarker. In our report to GitLab, we stressed that the problem should be fixed in the library. GitLab tried to contact the library's maintainer but didn't succeed, and eventually decided to fix it in-house by circumventing the underlying bug (exploitable data is blocked before being rendered by the commonmarker library). We also talked directly to the library's maintainer, explaining everything about the bug and suggesting a pull request with a fix, but unfortunately, he told us, “I have no plan for a fix. I don't care”. Fortunately, after we discovered that the GitHub platform was also vulnerable, GitHub fixed the issue in their own library and succeeded in getting the maintainer to fix the bug and issue a security advisory.

The process can be frustrating and, as you can see, can take months to fix completely. GitHub and GitLab are just two organizations we noticed that are vulnerable, but there are obviously many more products out there that are still susceptible to this attack. The entire software world is interconnected in a way we cannot anticipate. These dependencies often lead to catastrophic or unexpected consequences, such as discovering a widespread vulnerability affecting many products, as we saw in the Log4Shell incident.

There is a need for a better visibility and trust mechanism: a way to KNOW what components are used everywhere (SBOM, anyone?) and to exchange vulnerability and risk information better and faster.

Software Supply Chain Attacks - How We Can Help

The Legit Security Platform is designed to remediate supply chain threats. In this case, the integrated SBOM and tools like Dependabot will report the use of a vulnerable version of commonmarker. Legit is a comprehensive platform; dependency threats are just one of many security issues and vulnerabilities we detect, alongside SDLC misconfigurations, pipeline issues, IaC, secrets, SCA, and more. We also try to help the software supply chain security community by creating open-source projects, finding software supply chain bugs, and helping to remediate them.
