Strengthening security of the software supply chain for LLVM
A lot of time and effort is put into writing security-focused software. Hardware vendors routinely add new features that help software developers increase the security of their software. Memory-safe languages like Rust, which help developers write safer code, are becoming more and more popular. However, advancements in software security can be rendered useless if the supply chain for delivering software is compromised. As we’ve seen with the recent xz incident, a supply chain vulnerability can be exploited with malicious intent. In the LLVM project, we’ve been working to secure our own software supply chain to help protect against these kinds of attacks.
****Managing commit access****
LLVM is a collection of reusable libraries and tools for building compilers. It includes a C compiler (Clang), a Fortran compiler (Flang), and a C++ standard library (libc++), among other things. LLVM is a very large project. It has a single Git repository that contains several related subprojects. Overall, the llvm-project Git repository receives about 300 commits a day and has over 600 unique committers each month. To facilitate this rapid pace of development, the project has granted commit access to about 1,600 contributors. Having so many contributors with commit access is beneficial to the project, allowing it to sustain a high rate of feature development. However, having so many committers does increase the risk of someone outside the project gaining commit access using compromised credentials.
To mitigate this risk, the LLVM Project has recently adopted a policy of removing commit access from any user with fewer than five interactions (a commit, a PR review, or a comment) with the repository in the past year.
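To make the policy concrete, here is a minimal sketch of how such a check could be expressed. It is not the script the LLVM Project actually uses. The function name, the input format (a list of username and timestamp pairs), and the sample data are all assumptions made for illustration, and "the past year" is approximated as 365 days.

```python
from datetime import datetime, timedelta, timezone

MIN_INTERACTIONS = 5          # threshold stated in the policy
WINDOW = timedelta(days=365)  # "the past year", approximated as 365 days


def inactive_committers(committers, interactions, now):
    """Return committers with fewer than MIN_INTERACTIONS inside the window.

    committers:   usernames that currently have commit access
    interactions: (username, timestamp) pairs, one per commit,
                  PR review, or comment (hypothetical input format)
    """
    cutoff = now - WINDOW
    counts = {user: 0 for user in committers}
    for user, when in interactions:
        # Only count recent activity from people who hold commit access.
        if user in counts and when >= cutoff:
            counts[user] += 1
    return sorted(user for user, n in counts.items() if n < MIN_INTERACTIONS)


# Hypothetical data for illustration only.
now = datetime(2024, 6, 1, tzinfo=timezone.utc)
committers = ["alice", "bob", "carol"]
interactions = (
    [("alice", now - timedelta(days=d)) for d in range(10)]         # 10 recent
    + [("bob", now - timedelta(days=400 + d)) for d in range(8)]    # all stale
    + [("carol", now - timedelta(days=30 * d)) for d in range(4)]   # only 4
)

print(inactive_committers(committers, interactions, now))  # ['bob', 'carol']
```

In this sample, bob has plenty of history but none of it falls inside the window, and carol is active but below the threshold, so both would be flagged.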
Users with few interactions don’t need commit access. Infrequent contributors can submit pull requests, and it’s not much effort for a patch reviewer or someone else with commit access to merge the PR. Also, users who are no longer active in the project may be at a greater risk of having their credentials compromised, because they may not be monitoring their GitHub account (or access tokens) and may not be up to date on the latest security recommendations for the LLVM Project.
Overall, fewer committers means less chance for compromise, but there still needs to be a certain number of people with commit access to keep the project running efficiently. Limiting the number of committers to some small number may be best for security, but it may not be best for the project overall. It’s important to balance security concerns with the other goals of any project.
****Securing GitHub Actions****
GitHub Actions is a framework allowing automation of various tasks for projects hosted on GitHub. The most common way projects use GitHub Actions is for continuous integration (CI) testing.
Typically, CI jobs are run when someone submits a pull request. This may seem harmless at first, but running a CI job for a pull request does come with some risks. Anyone with a GitHub account can submit a pull request, and it’s easy to modify CI tests to execute arbitrary code. So GitHub Actions jobs need to be hardened against running untrusted code from potentially malicious users.
Untrusted code could potentially steal login credentials from a machine, or take over a machine to use for crypto mining, DDoS attacks, or some other kind of intensive task. GitHub has a number of best practices to help protect projects’ Actions jobs. The LLVM project has been reviewing these and making changes to help ensure that our jobs are safe.
One safety feature we’ve enabled is to prevent GitHub Actions from running automatically for first-time contributors. Anyone contributing their first pull request to the project needs to have the GitHub Actions jobs started manually by someone with write access to the project. This helps protect against scripted attacks where attackers automatically create thousands of pull requests against various projects trying to find which ones are vulnerable. Requiring a manual step allows someone trusted in the project to review the PR to ensure it is legitimate.
Another best practice we’ve adopted is to ensure that all GitHub Actions jobs triggered by pull requests run in the fork’s context, and not in the context of the main llvm-project repo. When a job runs in the fork’s context, it doesn’t have access to any of the main repository’s secrets, and it is not permitted to modify code or issues in the repo. This is achieved by using the pull_request event to trigger PR jobs instead of the pull_request_target event.
We also ensure that the access tokens used by a GitHub Actions job have the fewest permissions necessary for the job to do its task. In most cases, our tokens have read-only access, so there’s little to no damage that could be done with them even if they were to be compromised.
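The two practices above (triggering on pull_request rather than pull_request_target, and keeping token permissions minimal) lend themselves to an automated check. The sketch below is one way such an audit could look. It is not LLVM's tooling: it assumes the workflow file has already been parsed into a Python dictionary, and the function name and both sample workflows are hypothetical. A warning here means "review this", since some jobs legitimately need write access.

```python
def audit_workflow(workflow):
    """Return a list of warnings for an already-parsed workflow mapping."""
    warnings = []

    # "on" may be a single event name, a list of names, or a mapping.
    triggers = workflow.get("on", [])
    if isinstance(triggers, str):
        triggers = [triggers]
    if "pull_request_target" in triggers:
        warnings.append(
            "trigger: pull_request_target runs in the base repository's context"
        )

    # Without an explicit block, the token gets the repository default.
    if "permissions" not in workflow:
        warnings.append(
            "workflow: no top-level permissions block, repository default applies"
        )

    # Collect the workflow-level block and every job-level block.
    blocks = [("workflow", workflow.get("permissions"))]
    for name, job in workflow.get("jobs", {}).items():
        blocks.append((f"job '{name}'", job.get("permissions")))

    for where, perms in blocks:
        if perms == "write-all":
            warnings.append(f"{where}: token has write-all permissions")
        elif isinstance(perms, dict):
            for scope, level in perms.items():
                if level == "write":
                    warnings.append(f"{where}: token can write to '{scope}'")
    return warnings


# Hypothetical workflows for illustration only.
risky = {
    "on": {"pull_request_target": {}},
    "jobs": {"label": {"permissions": {"pull-requests": "write"}, "steps": []}},
}
hardened = {
    "on": ["pull_request"],
    "permissions": {"contents": "read"},
    "jobs": {"test": {"steps": []}},
}

for warning in audit_workflow(risky):
    print(warning)
print(audit_workflow(hardened))  # []
```

The risky sample produces three warnings (the trigger, the missing top-level permissions block, and the job's write scope), while the hardened sample produces none.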
****Release asset safety****
We’ve also recently made some improvements to how the LLVM Project handles release assets (sources and binaries). For one, we’ve started using the artifact attestation feature from GitHub. This allows users to verify not only that the artifacts are legitimate, but also which exact GitHub Actions job was used to generate them. This helps provide transparency and makes it easier for a third party to reproduce the artifacts.
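For readers who want to check a download themselves, attestations can be verified with the GitHub CLI's gh attestation verify command. The sketch below wraps that command in Python. The helper names and the artifact file name are hypothetical, the repository slug llvm/llvm-project is assumed to be the one that produced the artifact, and the verification step requires the gh tool to be installed.

```python
import shutil
import subprocess


def attestation_command(artifact_path, repo="llvm/llvm-project"):
    """Build the GitHub CLI command that verifies an artifact's attestation."""
    return ["gh", "attestation", "verify", artifact_path, "--repo", repo]


def verify_artifact(artifact_path, repo="llvm/llvm-project"):
    """Return True if gh reports a valid attestation for the artifact."""
    if shutil.which("gh") is None:
        raise RuntimeError("GitHub CLI (gh) not found on PATH")
    result = subprocess.run(
        attestation_command(artifact_path, repo),
        capture_output=True,
        text=True,
    )
    # gh exits with a non-zero status when verification fails.
    return result.returncode == 0


# Hypothetical file name, used only to show the command that would run.
print(" ".join(attestation_command("LLVM-release.tar.xz")))
```

Calling verify_artifact on a downloaded file would then return True only when gh finds an attestation that matches the file and was produced by the given repository.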
We no longer host third-party binaries (with a few exceptions) on the official release page. In the past, we let any contributor upload release binaries to our release page, but we didn’t have a good way to validate these binaries, so we created a different space for these kinds of uploads. Now, on our official release page, we only host binaries that have been generated on the GitHub infrastructure and have artifact attestations to go with them (with one exception for Windows binaries).
****Security is an ongoing process****
Securing the supply chain is never done. We always need to be aware of new threats, and also of new technologies that can help protect projects. One way we keep track of how the project is doing is with the OpenSSF security scorecard. This monitors the repository and checks for common mistakes or other security issues. This scorecard is updated daily for our project and displayed as a badge on the GitHub repository’s landing page.
Having automated tools to help secure the repository is helpful, but not enough. It’s up to every contributor to the project to do their part to help keep the repository safe. This means keeping their own login credentials safe, helping to review code before it’s committed to the repo, and being vigilant for suspicious activity in the project. Given that LLVM is a compiler, it is especially important to keep it safe. Malicious code in a compiler can propagate into anything the compiler builds, making it a potential access point for attackers to any software in a distribution.
As the software industry gets better at creating security-focused software, whether through hardware features, new languages, or compiler-based techniques, attackers are looking for other approaches to compromise software. The software supply chain can be a weak point for even the most hardened software. It’s vital that software developers (whether working on open source or proprietary software) understand supply chain security and stay educated about the latest research and best practices.