Everybody is familiar with downtimes in major services. It can be very frustrating when a platform your organization depends upon becomes unavailable. And when it comes to a critical part of your software supply chain, downtime means your production pipeline stops working, and basically, your entire software factory is down. The damage can be very expensive. Now, imagine what would happen if a bad actor finds a vulnerability that allows an unauthenticated user to take down business critical infrastructure with one line of code...
In this article, we will explore "MarkDownTime", a vulnerability we found in a very popular implementation of the markdown engine, and the Denial-of-Service (DoS) attack that it could cause on dependent projects, such as GitHub and GitLab. Software supply chains can contain multiple looming threats and vulnerable dependencies. When a popular library is vulnerable to an easy-to-exploit attack, it can leave millions of organizations vulnerable. Many commercial products might use libraries such as the ones we'll discuss, so users can be exposed to threats without even knowing. This is a call to action: all vendors using Markdown need to check whether they are using a vulnerable implementation, as described below, and take the necessary actions to avoid being attacked.
What is Markdown?
Markdown is a lightweight markup language for creating formatted text using a plain text editor. Since so many projects require this functionality, there are multiple implementations for multiple purposes and languages, following the CommonMark Spec. One of the most popular variants is GitHub’s implementation, GFM (GitHub Flavored Markdown), which introduced various extended features, such as syntax highlighting, task lists, and tables. GFM is used internally by multiple libraries that provide Markdown support for popular languages, such as Ruby, Go, Haskell, Lua, Perl, Python, and R.
Software supply chain vulnerabilities
Let’s look at commonmarker, Ruby’s most popular markdown parsing library, as an example. This library has over 1 million dependent projects, and the gem has been downloaded more than 27 million times. Its internal implementation is based on a copy of GFM. So a vulnerability in the GFM code affects not only the GitHub project and all of its dependents, but also commonmarker and all of its dependents. The same goes for any other implementation of GFM, and the picture becomes chaotic. It’s practically impossible to keep track of all affected projects.
During our research, we found two SCM services susceptible to MarkDownTime – GitHub and GitLab – and reported it to them. All issues were fixed.
This article will explore the exploit on GitHub using the commonmarker gem. GitHub exposed the markdown engine through a REST API that, in some cases, did not require authentication. As a result, an unauthenticated attacker could easily bring down any GitHub server, shutting down production pipelines. Rate limiting does not prevent the attack from taking down the service.
GitHub vulnerability and exploitation
GitHub uses commonmarker as its markdown engine. As stated above, commonmarker is a Ruby gem wrapper for libcmark-gfm. Issuing a markdown request to the /markdown API endpoint, with a specially-crafted text, results in high CPU usage for 60 seconds (request timeout).
Steps to reproduce
MarkDownTime is triggered by issuing an HTTP POST request to /markdown with the following arguments:
The text field, set to the following value (given as a Python one-liner to avoid bloating the article):
python3 -c 'print("![l" * 100000 + "\n")'
The gfm field, set to either True or False (both cases are vulnerable).
Authorization for the markdown API is configurable, so in some cases it is possible to attack without authorization (e.g. GitHub cloud and enterprise instances that configure the markdown API not to require it). Multiple requests can be issued in parallel to create a larger impact. Unauthenticated users can leverage MarkDownTime to mount a DDoS attack on GitHub cloud services. Rate limiting doesn’t help, as a small number of requests is enough to initiate the attack.
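The request described above can be assembled roughly as follows. This is a minimal sketch that only builds the payload and a request body and sends nothing. The field names text and gfm come from the steps above; the helper names and the choice of JSON encoding are assumptions made for illustration.

```python
import json


def build_payload(repeats: int = 100_000) -> str:
    """Return the string built inside print() in the one-liner above."""
    return "![l" * repeats + "\n"


def build_request_body(repeats: int = 100_000, gfm: bool = True) -> str:
    """Serialize the two fields named in the steps (JSON encoding is assumed)."""
    return json.dumps({"text": build_payload(repeats), "gfm": gfm})


if __name__ == "__main__":
    payload = build_payload()
    body = build_request_body()
    # The payload is small compared with the CPU time it can consume.
    print(f"payload characters: {len(payload)}")
    print(f"request body bytes: {len(body.encode('utf-8'))}")
```

With the default of 100,000 repetitions the payload is 300,001 characters long, which is why a handful of requests is enough.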
What is the bug behavior?
When rendering a markdown request, GitHub uses the commonmarker Ruby package with the autolink extension. This allows a polynomial attack, causing DoS. The call chain into the shared object (.so) is as follows:
The first entry into the .so is the call to markdown_to_html from commonmarker.rb.
markdown_to_html is exported by commonmarker.so and invokes static VALUE rb_markdown_to_html.
rb_markdown_to_html calls cmark_parser_finish.
cmark_parser_finish eventually calls parse_inline in a loop over subj, which is the text given by the user.
So it peeks at and advances over every character of the user text. The flow we are interested in is this:
push_bracket adds a bracket with image set to true.
try_extensions calls the autolink flow, which loops over all created brackets and checks whether they have image set to true and active set to true (from autolink.c).
So, if we repeat “![l” many times in the user text sent to the markdown API, each occurrence creates a bracket, and every “l” (default flow) loops over all brackets created so far. The total work therefore grows polynomially with the input length, causing DoS.
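The growth can be illustrated with a toy model. The sketch below is not the cmark-gfm code; it only counts, under the description above, how many bracket entries the autolink check would walk if every “![” pushes one bracket and every “l” loops over all brackets created so far. The function name is hypothetical.

```python
def count_bracket_visits(repeats: int) -> int:
    """Toy model of the flow described above, not the real parser.

    For each "![l" unit, "![" pushes one bracket, then "l" makes the
    autolink check walk every bracket pushed so far.
    """
    brackets = 0  # number of brackets created so far
    visits = 0    # total bracket entries walked by the autolink check
    for _ in range(repeats):
        brackets += 1       # "![" pushes a bracket
        visits += brackets  # "l" loops over all created brackets
    return visits


if __name__ == "__main__":
    for n in (1_000, 10_000, 100_000):
        print(n, count_bracket_visits(n))
```

Under this model the total is n(n+1)/2, so doubling the input roughly quadruples the work, and the 100,000 repetitions from the steps to reproduce come to about 5 billion bracket visits for an input of roughly 300 KB.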
The following GIF shows how the exploitation unfolds within the commonmarker engine, based on the analysis in the previous section.
Commonmarker DoS Impact
After 60 seconds (timeout), the request fails. Meanwhile, on the server end, a single CPU is pinned at full utilization. Issuing multiple requests in parallel pins multiple CPUs. By repeatedly sending a small number of requests, an attacker can make the entire server completely unavailable.
Commonmarker CVE Disclosures
We reported the vulnerability to both GitLab and GitHub, as well as to the commonmarker gem maintainer. The report was acknowledged for all implementations. GitLab issued CVE-2022-2931 and fixed it internally. GitHub issued CVE-2022-39209, fixed it in cmark-gfm, and afterwards fixed all of their products. commonmarker’s maintainer fixed it and issued security advisory GHSA-4qw4-jpp4-8gvp.
Vulnerability Disclosure Timeline
[2022-04-18] Reported the vulnerability to GitLab
[2022-05-10] GitLab verified the vulnerability
[2022-06-29] Reported the vulnerability to GitHub
[2022-06-30] GitHub verified the vulnerability
[2022-07-27] Reported the vulnerability to gjtorikian (commonmarker's maintainer)
[2022-08-30] GitLab shipped a fix for the vulnerability with version 15.3.2
[2022-09-21] commonmarker RubyGem shipped a fix for the vulnerability with version 0.23.6
[2022-11-18] GitHub confirmed remediation throughout their products and provided approval for this writeup
How do I know if I’m vulnerable? And what should I do if I am?
If your product supports markdown capabilities, try to understand whether or not it’s vulnerable by following one of these options:
1. Which markdown library is used by your project?
   a. Are you directly using cmark-gfm? Make sure it is updated to at least 0.29.0.gfm.6.
   b. Are you using the Ruby wrapper, commonmarker? Make sure it is updated to at least v0.23.6.
   c. If you’re using another markdown library, check if it’s based on GFM. Check if it contains the vulnerable autolink extension as described above. Understanding cmark-gfm’s fix can help with this assessment. If you find the vulnerability, it is highly advised to contact the maintainer and alert them about the issue.
2. If you don’t know which markdown library is used, you can attempt to run the exploit, but make sure you’re not doing it in a production environment, since a successful attempt will make your services unavailable. To do this, you’ll need to input the vulnerable string (as described above under “Steps to reproduce”) to your markdown renderer. If you see very high CPU usage for a long time, hanging services, or any other indicator of DoS, you are vulnerable. If that’s the case, it is recommended to go back to step 1 and put extra effort into understanding which library is used, in order to fix the vulnerability.
3. Contact us! We would be happy to help.
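The version checks in the first option can be automated along these lines. This is a minimal sketch, assuming you already know the installed version string. The helper names are hypothetical, the minimum versions are the ones listed above, and the comparison looks only at the numeric components, so it ignores pre-release suffixes.

```python
import re


def version_key(version: str) -> tuple:
    """Extract the numeric components of a version string, in order.

    For example, "v0.23.6" gives (0, 23, 6) and "0.29.0.gfm.6"
    gives (0, 29, 0, 6).
    """
    return tuple(int(part) for part in re.findall(r"\d+", version))


# Minimum fixed versions as listed above.
MIN_FIXED = {
    "cmark-gfm": "0.29.0.gfm.6",
    "commonmarker": "0.23.6",
}


def is_patched(library: str, installed: str) -> bool:
    """Return True if `installed` is at least the fixed version for `library`."""
    return version_key(installed) >= version_key(MIN_FIXED[library])


if __name__ == "__main__":
    print(is_patched("commonmarker", "0.23.5"))
    print(is_patched("cmark-gfm", "0.29.0.gfm.6"))
```

A check like this only covers direct use of the two libraries named above; other GFM-based libraries still need the manual review described in item c.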
Vulnerable Dependencies - the struggle
When you find vulnerable code within your organization, it is usually an in-house fix: you understand the bug and fix it ASAP. However, much of the code in most organizations is made up of third-party libraries that are maintained by other people (if at all). That can cause trouble.
During the process of disclosing MarkDownTime, we experienced why addressing vulnerable dependencies can be so challenging. We first encountered the issue in GitLab, and after further investigation, we realized the issue originates in commonmarker. In our report to GitLab, we stressed that the problem should be fixed in the library. They tried to contact the library's maintainer but didn't succeed, and eventually decided to fix it in-house by circumventing the underlying bug (exploitable data is blocked before being rendered by the commonmarker library). We also talked directly to the library's maintainer, explaining everything about the bug and proposing a pull request with a fix, but unfortunately he told us, “I have no plan for a fix. I don't care”. Fortunately, after we discovered that the GitHub platform was also vulnerable, GitHub fixed the issue in their library and succeeded in getting him to fix the bug and issue a security advisory.
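The idea of blocking exploitable data before it reaches the renderer can be sketched as follows. This is not GitLab's actual fix, whose details are not described here; the threshold, the exception, and the function name are all hypothetical, and a real deployment would need to tune the limit against legitimate documents.

```python
MAX_IMAGE_OPENERS = 1_000  # hypothetical threshold, tune per deployment


class SuspiciousMarkdownError(ValueError):
    """Raised when input is rejected before it reaches the renderer."""


def guard_markdown(text: str, max_openers: int = MAX_IMAGE_OPENERS) -> str:
    """Reject text containing an unusually high number of "![" sequences.

    Returns the text unchanged when it passes, so the call can wrap
    whatever is handed to the markdown renderer.
    """
    openers = text.count("![")
    if openers > max_openers:
        raise SuspiciousMarkdownError(
            f'{openers} "![" sequences exceed the limit of {max_openers}'
        )
    return text


if __name__ == "__main__":
    print(guard_markdown("![alt](image.png)"))
```

A guard like this works around the bug without fixing it, which is why updating the underlying library remains the preferred remedy.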
The process can be frustrating and, as you can see, a complete fix can take months. GitHub and GitLab are just the two organizations we found to be vulnerable, but there are likely many more products out there that are still susceptible to this attack. The entire software world is interconnected in ways we cannot anticipate. These dependencies often lead to catastrophic or unexpected consequences, such as a widespread vulnerability affecting many products, as we saw in the Log4Shell incident.
There is a need for better visibility and trust mechanisms: to KNOW what components are used everywhere (SBOM, anyone?) and to have a better and faster exchange of vulnerability and risk information.
Software supply chain attacks - how we can help
The Legit Security Platform is designed to remediate supply chain threats. In this case, the integrated SBOM and tools like Dependabot will report the use of a vulnerable version of commonmarker. Legit is a comprehensive platform; dependency threats are just one of many security issues and vulnerabilities we detect, including SDLC misconfigurations, pipeline issues, IaC, secrets, SCA, etc. We also try to help the software supply chain security community by creating open-source projects, finding software supply chain bugs, and helping to remediate them.