GitHub Security Lab

Getting RCE in Chrome with incomplete object initialization in the Maglev compiler

In this post I’ll exploit CVE-2023-4069, a type confusion vulnerability that I reported in July 2023. The vulnerability, which allows remote code execution (RCE) in the renderer sandbox of Chrome with a single visit to a malicious site, is found in v8, the Javascript engine of Chrome. It was filed as bug 1465326 and subsequently fixed in version 115.0.5790.170/.171.

Vulnerabilities like this are often the starting point for a “one-click” exploit, which compromises the victim’s device when they visit a malicious website. A renderer RCE in Chrome allows an attacker to compromise and execute arbitrary code in the Chrome renderer process. That being said, the renderer process has limited privilege, so such a vulnerability needs to be chained with a second “sandbox escape” vulnerability (either another vulnerability in the Chrome browser process or one in the operating system) to compromise Chrome itself or the device. While many of the most powerful and sophisticated “one-click” attacks are highly targeted, and average users may be more at risk from less sophisticated attacks such as phishing, users should still keep Chrome up-to-date and enable automatic updates, as vulnerabilities in v8 can often be exploited relatively quickly.

The current vulnerability, CVE-2023-4069, exists in the Maglev compiler, a new mid-tier JIT compiler in Chrome that optimizes Javascript functions based on previous knowledge of the input types. This kind of optimization is called speculative optimization, and care must be taken to make sure that these assumptions about the inputs are still valid when the optimized code is used. The complexity of the JIT engine has led to many security issues in the past and has made it a popular target for attackers.

Maglev compiler

The Maglev compiler is a mid-tier JIT compiler used by v8. Compared to the top-tier JIT compiler, TurboFan, Maglev generates less optimized code, but compiles it faster. Having multiple JIT compilers is common in Javascript engines: with multiple compiler tiers, the engine can strike a better tradeoff between compilation time and runtime optimization. Generally speaking, when a function is first run, slow bytecode is generated. As the function runs more often, it may get compiled into more optimized code, first by the lowest-tier JIT compiler. If the function is used more often still, its optimization tier is moved up, resulting in better runtime performance, but at the expense of a longer compilation time. The idea is that for code that runs often, the runtime cost will likely outweigh the compile-time cost. You can consult An Introduction to Speculative Optimization in V8 by Benedikt Meurer for more details of how the compilation process works.

The Maglev compiler is enabled by default starting from version 114 of Chrome. Similar to TurboFan, it goes through the bytecode of a Javascript function, taking into account the feedback collected from previous runs, and transforms the bytecode into more optimized code. However, unlike TurboFan, which first transforms bytecode into a “Sea of Nodes” representation, Maglev uses its own intermediate representation and first transforms bytecode into SSA (static single-assignment) nodes, which are declared in the file maglev-ir.h. At the time of writing, the compilation process of Maglev consists mainly of two phases of optimization: the first phase involves building a graph from the SSA nodes, while the second phase consists of optimizing the representations […]
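
To make the tiering described above concrete, here is a minimal, hypothetical sketch (not taken from the post) of pushing a function through Maglev in v8’s d8 shell. The --allow-natives-syntax and --maglev flags and the %PrepareFunctionForOptimization / %OptimizeMaglevOnNextCall intrinsics are assumptions about the test setup, not something the write-up prescribes:

    // Run with: d8 --allow-natives-syntax --maglev sketch.js   (assumed flags)
    function add(a, b) {
      return a + b;                        // feedback here records "small integer" inputs
    }

    %PrepareFunctionForOptimization(add);  // assumed natives-syntax intrinsic
    add(1, 2);                             // warm-up calls collect integer type feedback
    add(3, 4);

    %OptimizeMaglevOnNextCall(add);        // assumed intrinsic: request the Maglev tier
    add(5, 6);                             // now runs Maglev-compiled code

    // An input that contradicts the recorded feedback invalidates the speculation:
    // the optimized code deoptimizes and execution falls back to bytecode.
    add("5", 6);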

Read More

Coordinated Disclosure: 1-Click RCE on GNOME (CVE-2023-43641)

Today, in coordination with Ilya Lipnitskiy (the maintainer of libcue) and the distros mailing list, the GitHub Security Lab is disclosing CVE-2023-43641, a memory corruption vulnerability in libcue. We have also sent a text-only version of this blog post to the oss-security list.

It’s quite likely that you have never heard of libcue before, and are wondering why it’s important. This situation is neatly illustrated by xkcd 2347. libcue is a library used for parsing cue sheets, a metadata format for describing the layout of the tracks on a CD. Cue sheets are often used in combination with the FLAC audio file format, which means that libcue is a dependency of some audio players, such as Audacious. But the reason why I decided to audit libcue for security vulnerabilities is that it’s used by tracker-miners: an application that’s included with GNOME, the default graphical desktop environment of many open source operating systems. The purpose of tracker-miners is to index the files in your home directory to make them easily searchable, for example through the desktop search bar. The index is automatically updated when you add or modify a file in certain subdirectories of your home directory, in particular including ~/Downloads.

To make a long story short, that means that inadvertently clicking a malicious link is all it takes for an attacker to exploit CVE-2023-43641 and get code execution on your computer. The video shows me clicking a link in a webpage, which causes a cue sheet to be downloaded. Because the file is saved to ~/Downloads, it is then automatically scanned by tracker-miners. And because it has a .cue filename extension, tracker-miners uses libcue to parse the file. The file exploits the vulnerability in libcue to gain code execution and pop a calculator. Cue sheets are just one of many file formats supported by tracker-miners. For example, it also includes scanners for HTML, JPEG, and PDF.

I am delaying publication of the proof of concept (PoC) used in the video, to give users time to install the patch. But if you’d like to test whether your system is vulnerable, try downloading this file, which contains a much simpler version of the PoC that merely causes a (benign) crash. The offsets in the full PoC need to be tuned for different distributions. I have only done this for Ubuntu 23.04 and Fedora 38, the most recent releases of Ubuntu and Fedora at this time. In my testing, I have found that the PoC works very reliably when run on the correct distribution (and will trigger a SIGSEGV when run on the wrong distribution). I have not created PoCs for any other distributions, but I believe that all distributions that run GNOME are potentially exploitable.

The bug in libcue

libcue is quite a small project. It’s primarily a bison grammar for cue sheets, with a few data structures for storing the parsed data. A simple example of a cue sheet looks like this:

    REM GENRE "Pop, dance pop"
    REM DATE 1987
    PERFORMER "Rick Astley"
    TITLE "Whenever You Need Somebody"
    FILE "Whenever You Need Somebody.mp3" MP3
      TRACK 01 AUDIO
        TITLE "Never Gonna Give You Up"
        PERFORMER "Rick Astley"
        SONGWRITER "Mike Stock, Matt Aitken, Pete Waterman"
        INDEX 01 00:00:00
      TRACK 02 AUDIO
        TITLE "Whenever You Need Somebody"
        PERFORMER "Rick Astley"
        SONGWRITER "Mike Stock, […]
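
To illustrate the delivery step described above, here is a hypothetical sketch (my own illustration, not the withheld PoC) of how a page can offer a .cue file so that a single click drops it into ~/Downloads, where tracker-miners will hand it to libcue; the file name and contents are placeholders:

    // Hypothetical delivery sketch only; the exploit payload itself is withheld.
    const cueSheet = 'FILE "x.mp3" MP3\n  TRACK 01 AUDIO\n    INDEX 01 00:00:00\n';
    const link = document.createElement("a");
    link.href = URL.createObjectURL(new Blob([cueSheet], { type: "text/plain" }));
    link.download = "innocent.cue";    // the .cue extension routes the file to libcue
    link.textContent = "Download free music";
    document.body.appendChild(link);   // one click saves the file to ~/Downloads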

Read More

Getting RCE in Chrome with incorrect side effect in the JIT compiler

In this post, I’ll explain how to exploit CVE-2023-3420, a type confusion vulnerability in v8 (the Javascript engine of Chrome) that I reported in June 2023 as bug 1452137. The bug was fixed in version 114.0.5735.198/199. It allows remote code execution (RCE) in the renderer sandbox of Chrome with a single visit to a malicious site.

Vulnerabilities like this are often the starting point for a “one-click” exploit, which compromises the victim’s device when they visit a malicious website. A renderer RCE in Chrome allows an attacker to compromise and execute arbitrary code in the Chrome renderer process. The renderer process has limited privilege though, so the attacker then needs to chain such a vulnerability with a second “sandbox escape” vulnerability (either another vulnerability in the Chrome browser process, or a vulnerability in the operating system) to compromise either Chrome itself or the device. For example, a chain consisting of a renderer RCE (CVE-2022-3723), a Chrome sandbox escape (CVE-2022-4135), and a kernel bug (CVE-2022-38181) was discovered to be exploited in the wild in “Spyware vendors use 0-days and n-days against popular platforms” by Clement Lecigne of the Google Threat Analysis Group. While many of the most powerful and sophisticated “one-click” attacks are highly targeted, and average users may be more at risk from less sophisticated attacks such as phishing, users should still keep Chrome up-to-date and enable automatic updates, as vulnerabilities in v8 can often be exploited relatively quickly by analyzing patches once they are released.

The current vulnerability exists in the JIT compiler in Chrome, which optimizes Javascript functions based on previous knowledge of the input types (for example, number types, array types, etc.). This is called speculative optimization, and care must be taken to make sure that these assumptions about the inputs are still valid when the optimized code is used. The complexity of the JIT engine has led to many security issues in the past and has made it a popular target for attackers. The Phrack article “Exploiting Logic Bugs in JavaScript JIT Engines” by Samuel Groß is a very good introduction to the topic.

The JIT compiler in Chrome

The JIT compiler in Chrome’s v8 Javascript engine is called TurboFan. Javascript functions in Chrome are optimized according to how often they are used. When a Javascript function is first run, bytecode is generated by the interpreter. As the function is called repeatedly with different inputs, feedback about these inputs, such as their types (for example, are they integers, or objects, etc.), is collected. After the function has run enough times, TurboFan uses this feedback to compile optimized code for the function, making assumptions based on the feedback to optimize the bytecode. After this, the compiled optimized code is used to execute the function. If these assumptions become incorrect after the function is optimized (for example, new input is used with a type that differs from the feedback), then the function is deoptimized, and the slower bytecode is used again. Readers can consult, for example, “An Introduction to Speculative Optimization in V8” by Benedikt Meurer for more details of how the compilation process works. TurboFan itself is a well-studied subject and there is a vast amount of literature documenting its inner workings, so I’ll only go through the background that is needed for this […]
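
As a concrete illustration of the feedback-then-deoptimize cycle described above, here is a simplified sketch (my own example, not code from the post); the warm-up count is arbitrary, since tier-up thresholds and timing are engine-internal details:

    function readX(o) {
      return o.x;                        // feedback records the hidden class (map) of o
    }

    // Warm-up: every call sees the same object shape {x}, so the JIT can
    // speculate that o.x lives at a fixed offset in the object.
    for (let i = 0; i < 100000; i++) readX({ x: i });

    // An object with a different shape breaks that assumption: the optimized
    // code bails out (deoptimizes) and the slower bytecode takes over again.
    readX({ y: 1, x: 2 });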

Read More

mTLS: When certificate authentication is done wrong

Although X.509 certificates have been around for a while, they have become more popular for client authentication in zero-trust networks in recent years. Mutual TLS, or authentication based on X.509 certificates in general, brings advantages compared to passwords or tokens, but you get increased complexity in return. In this post, I’ll deep dive into some interesting attacks on mTLS authentication. We won’t bother you with heavy crypto stuff; instead, we’ll have a look at implementation vulnerabilities and how developers can make their mTLS systems vulnerable to user impersonation, privilege escalation, and information leakages. We will present some CVEs we found in popular open source identity servers and ways to exploit them. Finally, we’ll explain how these vulnerabilities can be spotted in source code and how to fix them. This blog post is based on work that I recently presented at Black Hat USA and DEF CON.

Introduction: What is mutual TLS?

Website certificates are a very widely recognized technology, even to people who don’t work in the tech industry, thanks to the padlock icon used by web browsers. Whenever we connect to Gmail or GitHub, our browser checks the certificate provided by the server to make sure it’s truly the service we want to talk to. Fewer people know that the same technology can be used to authenticate clients: the TLS protocol is also designed to be able to verify the client using public and private key cryptography. It happens at the handshake level, even before any application data is transmitted.

If configured to do so, a server can ask a client to provide a security certificate in the X.509 format. This certificate is just a blob of binary data that contains information about the client, such as its name, public key, issuer, and other fields:

    $ openssl x509 -text -in client.crt
    Certificate:
        Data:
            Version: 1 (0x0)
            Serial Number:
                d6:2a:25:e3:89:22:4d:1b
            Signature Algorithm: sha256WithRSAEncryption
            Issuer: CN=localhost          //used to locate the issuer's certificate
            Validity
                Not Before: Jun 13 14:34:28 2023 GMT
                Not After : Jul 13 14:34:28 2023 GMT
            Subject: CN=client            //aka "user name"
            Subject Public Key Info:
                Public Key Algorithm: rsaEncryption
                    RSA Public-Key: (2048 bit)
                    Modulus:
                        00:9c:7c:b4:e5:e9:3d:c1:70:9c:9d:18:2f:e8:a0:

The server checks that this certificate is signed by one of the trusted authorities. This is a bit similar to checking the signature of a JWT token. Next, the client sends a “Certificate verify” message signed with the private key, so that the server can verify that the client actually has the private key.

How certificates are validated

“Certificate validation” commonly refers to the PKIX certificate validation process defined in RFC 5280. In short, in order to validate the certificate, the server constructs a certification path (also known as a certificate chain) from the target certificate to a trust anchor. The trust anchor is a self-signed root certificate that is inherently trusted by the validator. The end entity certificate is often signed by an intermediate CA, which is in turn signed by another intermediate certificate or directly by a trust anchor. Then, for each certificate in the chain, the validator checks the signature, validity period, allowed algorithms and key lengths, key usage, and other properties. There are also a number of optional certificate extensions: if they are included in the certificate, they can be checked as well. This process is quite complicated, so every language or […]
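
To show what the server side of the handshake described above can look like in practice, here is a minimal Node.js sketch (my own illustration, not from the post; the key and certificate file names are placeholders):

    // Minimal mTLS server sketch; key/certificate paths are assumed placeholders.
    const tls = require("tls");
    const fs = require("fs");

    const server = tls.createServer({
      key: fs.readFileSync("server.key"),
      cert: fs.readFileSync("server.crt"),
      ca: fs.readFileSync("ca.crt"),   // trust anchor used to validate the client chain
      requestCert: true,               // ask the client for an X.509 certificate
      rejectUnauthorized: true,        // abort the handshake if validation fails
    }, (socket) => {
      // At this point the certificate chain and the "Certificate verify" step
      // have already been checked at the TLS layer.
      const peer = socket.getPeerCertificate();
      socket.write(`hello ${peer.subject.CN}\n`);  // the Subject CN acts as the "user name"
      socket.end();
    });

    server.listen(8443);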

Read More

The GitHub Security Lab’s journey to disclosing 500 CVEs in open source projects

When I stepped onto the scale this morning, I remembered that there are some numbers that feel awkward to celebrate, while perhaps some others are worth celebrating! Recently, the GitHub Security Lab passed the milestone of 500 CVEs disclosed to open source projects. What’s a CVE? In short, it’s the record of a security vulnerability, published under the CVE program, intended to inform impacted users. So, finding more vulnerabilities in open source shouldn’t be good news, right? Even as developer communities are getting better at keeping themselves secure, security issues may still slip through their defenses. This means that there will always be a need for security researchers, like the Security Lab, to discover and help fix them.

If you’re not familiar with the Security Lab, we’re a team of security experts who work with the broader open source community to help fix security issues in their projects, with the goal of improving the overall security posture of open source. Our core activity is to audit open source projects (not only the ones hosted on GitHub) and help their maintainers fix the vulnerabilities we find, for free. This research is foundational for our other activities, such as education, improvement of our open source static analysis rules, and tooling. And now we are celebrating more than 500 CVEs disclosed.

How did we get here?

The history of the Security Lab dates back to Semmle, the company that created CodeQL and was later acquired by GitHub. 2017 was a pivotal year, as we realized how powerful our product could be for finding security vulnerabilities. Unlike many other static analysis tools, CodeQL makes it possible to codify insecure patterns and respond quickly to new security threats at scale. To showcase this capability, Semmle created a small security research team that used CodeQL to search for vulnerabilities in open source projects, and a web portal named LGTM.com where all open source projects could run CodeQL for free and be alerted of potential security flaws directly within their pull requests. This approach grew into an important company objective: find and fix vulnerabilities at scale in open source. It was a way of giving back to the open source community, as any software company should.

In September 2019, GitHub acquired Semmle, providing an ideal home for advancing the goal of improving open source security at scale. This led to the creation of the Security Lab, with a larger team and new initiatives, including curating the GitHub Advisory Database. The GitHub Advisory Database provides developers with the most accurate information about known security issues in their open source dependencies. GitHub also incorporated CodeQL as the foundation of code scanning and a core pillar of GitHub Advanced Security (GHAS), keeping it free for open source. Code scanning reached parity with LGTM.com in 2022. We have also expanded beyond CodeQL and now use a variety of tools in our audit activities, such as fuzzing. But CodeQL remains one of the most effective tools in our toolbox, because it enables us to conduct variant analysis at scale and allows us to share our knowledge of insecure patterns with the community, in the form of executable CodeQL queries.

The secret? Our maintainers-first approach

Not all reports get a CVE. CVE records are useful for informing downstream consumers, so when there is no downstream […]

Read More