So this researcher may have gotten lucky in choosing to dig into the one tool that CodeRabbit unluckily forgot to run inside its isolation mechanism.
The only other safeguard I can think of is a whitelist, perhaps of file pathnames. This helps maintain a safe-by-default posture. Taking it further, the whitelist could be specified in config and require change approval from a second team.
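Something like this, as a very rough sketch (the patterns and config shape are made up for illustration, not anything CodeRabbit actually documents):

```ts
// Safe-by-default: only files matching the allowlist are ever handed to
// external tools; tool config files that can trigger code execution are
// explicitly denied. Patterns here are illustrative.
const ALLOWED_PREFIXES = ["src/", "lib/", "docs/"];
const DENIED_NAMES = [".rubocop.yml", ".env"];

export function isPathAllowed(path: string): boolean {
  if (DENIED_NAMES.some((n) => path === n || path.endsWith(`/${n}`))) return false;
  return ALLOWED_PREFIXES.some((p) => path.startsWith(p));
}
```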
They are not. The GitHub App private key should never be exposed in the environment, period; you're supposed to keep the key in an HSM or key vault and only use it to sign the short-lived JWT that gets exchanged for per-repo access tokens. Per the GH docs [0]:
> The private key is the single most valuable secret for a GitHub App. Consider storing the key in a key vault, such as Azure Key Vault, and making it sign-only. This helps ensure that you can't lose the private key. Once the private key is uploaded to the key vault, it can never be read from there. It can only be used to sign things, and access to the private key is determined by your infrastructure rules.
> Alternatively, you can store the key as an environment variable. This is not as strong as storing the key in a key vault. If an attacker gains access to the environment, they can read the private key and gain persistent authentication as the GitHub App.
[0]: https://docs.github.com/en/apps/creating-github-apps/authent...
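The sign-only setup the docs describe is not much code either. A minimal sketch, assuming the key lives in Google Cloud KMS (my assumption, not something from the write-up): the app assembles the short-lived JWT itself, and only the signature ever comes out of the vault.

```ts
import { KeyManagementServiceClient } from "@google-cloud/kms";
import { createHash } from "node:crypto";

const kms = new KeyManagementServiceClient();

const b64url = (buf: Buffer) =>
  buf.toString("base64").replace(/\+/g, "-").replace(/\//g, "_").replace(/=+$/, "");

// Build a short-lived GitHub App JWT; the RSA private key stays inside KMS
// and is only ever asked to sign the digest.
async function appJwt(appId: string, kmsKeyVersion: string): Promise<string> {
  const now = Math.floor(Date.now() / 1000);
  const header = b64url(Buffer.from(JSON.stringify({ alg: "RS256", typ: "JWT" })));
  const payload = b64url(Buffer.from(JSON.stringify({ iat: now - 60, exp: now + 540, iss: appId })));
  const signingInput = `${header}.${payload}`;
  const digest = createHash("sha256").update(signingInput).digest();
  const [res] = await kms.asymmetricSign({
    name: kmsKeyVersion, // an RSA_SIGN_PKCS1_2048_SHA256 key version
    digest: { sha256: digest },
  });
  return `${signingInput}.${b64url(Buffer.from(res.signature as Uint8Array))}`;
}
```

That JWT then gets traded for a per-installation access token, which is the only credential that should ever get near a job touching customer code.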
Curious what this (isolation mechanism) means if anyone knows.
If they're anything like the typical web startup "developing fast but failing faster", they're probably using Docker containers for "security isolation".
(likely asked AI to implement X, and the AI completely disregarded the need to sandbox).
Based on the env vars, it seems like they're only using Anthropic, OpenAI, etc.?
Is that good? I assume it just catches a different 10% of the bugs.
If you want to learn how CodeRabbit does the isolation, here's a blog post about it: https://cloud.google.com/blog/products/ai-machine-learning/h...
It's really hard to trust a "hey we got this guys" statement after a fuckup this big
> After responsibly disclosing this critical vulnerability to the CodeRabbit team, we learned from them that they had an isolation mechanism in place, but Rubocop somehow was not running inside it.
In case you don't want to read through the PR:
1. You run git clone inside the GCR function, so you have, at the very least, a user token for the git provider
2. The RCE exploit basically used the external tools, like a static analysis checker, which, again, run inside your GCR function
3. As a contrived example, if I could RCE `console.log(process.env)` then seemingly I could do `fetch(mywebsite....`
I get it, you can hand-wave some amount of "VPC" and "sandbox" here. But you're still executing code; explicitly labeling it "untrusted" and "sandboxed" doesn't excuse it.
You can store the credentials in a key vault and then still post them on pastebin. The issue is that the individual runner has the key in its environment variables. Both can be true: the key can be handed to the runner via env and also be stored in a key vault.
The important distinction here is: have you removed the master key and other sensitive credentials from the environment passed into scanners that come into contact with untrusted customer code?
What mechanism are you suggesting where access to the production system doesn’t let you also access that secret?
Like, I get that in this specific case, where you are running untrusted code, that environment should have been isolated and these keys not passed in; but running untrusted code isn't a common feature of most applications.
This would make it so that even a compromised downstream service wouldn't actually be able to exfiltrate the authentication token, and all its misdeeds would be logged by the proxy service, making post-incident remediation easier (and making it possible to definitively prove whether anything bad actually happened).
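A bare-bones sketch of that proxy, assuming Node (the secret path is illustrative, and body forwarding, header filtering, and authenticating the sandbox to the proxy are all left out):

```ts
import { createServer } from "node:http";
import { readFileSync } from "node:fs";

// The token lives only with the proxy, e.g. as a mounted secret (path is
// illustrative); the sandboxed job never sees it.
const token = readFileSync("/run/secrets/github_token", "utf8").trim();

// Sandboxed jobs call this service instead of api.github.com.
createServer(async (req, res) => {
  const upstream = await fetch(`https://api.github.com${req.url}`, {
    method: req.method,
    headers: { Authorization: `Bearer ${token}`, Accept: "application/vnd.github+json" },
  });
  console.log(`${req.method} ${req.url} -> ${upstream.status}`); // audit trail
  res.writeHead(upstream.status);
  res.end(Buffer.from(await upstream.arrayBuffer()));
}).listen(8080);
```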
I believe the answer here was to exchange the token for something scoped to the specific repo CodeRabbit is running in, but alas, that doesn't remove the "RCE" _on_ the repo.
What I'm clearly mis-remembering is being able to exchange the token for a smaller scope, e.g., "hey, sign this JWT with scopes=[org/repo1, org/repo2], permissions=write".
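(GitHub's installation-token endpoint does accept `repositories` and `permissions` when minting the token, so that down-scoping exists.) A rough sketch, where `appJwt` is a JWT signed with the App's private key, and the repo name and permission set are illustrative:

```ts
// Trade the App JWT for an installation token limited to one repo and a
// narrow permission set (names here are illustrative).
async function scopedInstallationToken(appJwt: string, installationId: number): Promise<string> {
  const res = await fetch(
    `https://api.github.com/app/installations/${installationId}/access_tokens`,
    {
      method: "POST",
      headers: {
        Authorization: `Bearer ${appJwt}`,
        Accept: "application/vnd.github+json",
      },
      body: JSON.stringify({
        repositories: ["repo-under-review"], // hypothetical repo name
        permissions: { contents: "read", pull_requests: "write" },
      }),
    },
  );
  if (!res.ok) throw new Error(`token exchange failed: ${res.status}`);
  return (await res.json()).token; // expires after about an hour
}
```

Which, as noted, still leaves an attacker with whatever that scoped token can do to the repo the job is running against.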
If anything, they got paid in exposure.
Instead, they took it as an opportunity to market their new sandboxing on Google's blog [2], again with no mention of why their hand was forced into building the sandboxing they should have had before they rushed to onboard thousands of customers.
I have no idea what their plan was. They had to have known the researchers would eventually publish this. Perhaps they were hoping it wouldn't get the same amount of attention as it would have if they'd posted it on their own blog.
Here is a tool with 7,000+ customers and access to 1 million code repositories that was breached with an exploit a clever 11-year-old could have created. (edit: 1 million repos, not customers)
When the exploit is so simple, I find it likely that bots or black hats or APTs had already found a way in and established persistence before the white-hat researchers reported the issue. If so, patching the issue might prevent NEW bad actors from penetrating CodeRabbit's environment, but it might not evict any bad actors already lurking there.
I know security is hard, but come on, guys.
What a piece of shit company.
People were quick to blame Firebase instead of the devs.
Vibe coders are so fucking annoying, mostly dumb, and super lame.
Taking care of private user data in a typical SaaS is one thing, but here you have the keys to mount targeted supply-chain attacks that could really wreak havoc.
But at the same time, as a customer of GitHub, I would prefer that GitHub made it harder for vendors like CodeRabbit to make mistakes like this.
If you have an app with access to more than 1M repos, it would make sense for GitHub to require a short-lived token to access a given repository and only allow the "master" private key to update the app info or whatever.
And/or maybe design mechanisms that only allow minting of these tokens for the repo whenever a certain action is run (i.e., not arbitrarily).
But at the end of the day, yes, it's impossible for GitHub to both allow users to grant full access to whatever app and at the same time ensure stuff like this doesn't happen.
I'd rather GitHub finally fix their registry to let these GH Apps push/pull with their own tokens instead of a PAT.
It is absurd that anyone can mess up anything and have absolutely 0 consequences.
Some ways to prevent this from happening:
1. Don't let spawned processes inherit your whole env; all major languages have a way to pass a sub-process only an allowlisted set of env vars (see the sketch after this list)
2. Don't store secrets in env vars; use a good secrets vault (with a cache)
3. Tenant isolation, as much as you can
4. And most obviously: don't run processes that can execute the code they are scanning, especially if that code is not your code (harder to tell, but always be paranoid)
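For point 1, a minimal Node sketch (the allowlist and function name are illustrative):

```ts
import { spawn } from "node:child_process";

// Run an analyzer with an explicit, minimal environment instead of inheriting
// process.env wholesale; the allowlist here is illustrative.
function runAnalyzer(cmd: string, args: string[], repoDir: string) {
  const allowed = ["PATH", "HOME", "LANG"]; // nothing secret
  const env = Object.fromEntries(
    allowed
      .filter((k) => process.env[k] !== undefined)
      .map((k) => [k, process.env[k] as string]),
  );
  return spawn(cmd, args, { cwd: repoDir, env, stdio: "inherit" });
}
```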
It's very efficient to delegate something to one major actor, but we are introducing single points of failure and becoming less resilient to vulnerabilities.
Critical systems should have defense in depth and decentralized architectures, and should avoid trusting new providers with too many moving parts.
Why would you even grant it such permissions? This is ridiculous.
For example, why would a code-analysis service like this need git write access in the first place?
The only consolation here is that it'd be difficult to forge git repositories, because the SHA hashes would conflict with any existing checkout. But presumably even there the success rate would still be high enough, especially against front-end repositories whose maintainers may not understand what has happened and simply move on with the replaced repo without checking what went wrong.
I understand mistakes happen, but the lack of transparency when they do makes them look bad.
Imagine the following case: a company sends you a repository as a test task before an interview. You run something like "npm install" or the Rust compiler, and now your computer is controlled by an attacker.
Maybe those tools should explicitly confirm before executing every external command (caching the list of allowed commands so they don't ask again). And maybe Linux should provide an easy-to-use, safe sandbox for developers; currently I have to build sandboxes from scratch myself.
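The confirm-and-cache idea could be this small (cache file name and prompt are made up):

```ts
import { spawnSync } from "node:child_process";
import { existsSync, readFileSync, writeFileSync } from "node:fs";
import { createInterface } from "node:readline/promises";

const CACHE = ".allowed-commands.json"; // illustrative location

// Ask once per command, remember the answer, refuse everything else.
async function confirmAndRun(cmd: string, args: string[]) {
  const allowed: string[] = existsSync(CACHE) ? JSON.parse(readFileSync(CACHE, "utf8")) : [];
  if (!allowed.includes(cmd)) {
    const rl = createInterface({ input: process.stdin, output: process.stdout });
    const answer = await rl.question(`Allow running "${cmd}"? [y/N] `);
    rl.close();
    if (answer.toLowerCase() !== "y") throw new Error(`blocked: ${cmd}`);
    writeFileSync(CACHE, JSON.stringify([...allowed, cmd]));
  }
  return spawnSync(cmd, args, { stdio: "inherit" });
}
```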
Also, this is an indication of why it is a bad idea to use environment variables for secrets and configuration. Whoever wrote the "12-factor app" doesn't know that there are command-line switches and configuration files for this.
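And the non-env alternative really is just a flag and a file read (flag name and path are illustrative):

```ts
import { readFileSync } from "node:fs";

// e.g.  node reviewer.js --github-key-file /secrets/github-app.pem
// The point is the secret never sits in process.env, where every child
// process inherits it by default.
const i = process.argv.indexOf("--github-key-file");
if (i === -1 || !process.argv[i + 1]) {
  throw new Error("missing --github-key-file <path>");
}
const privateKey = readFileSync(process.argv[i + 1], "utf8");
```

(A compromised child process can of course still read the file if it can reach the path, so this complements isolation rather than replacing it.)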
What a bizarre world we're living in, where computers can talk about how they're being hacked while it's happening.
Also, this is pretty worrisome:
> Being quick to respond and remediate, as the CodeRabbit team was, is a critical part of addressing vulnerabilities in modern, fast-moving environments. Other vendors we contacted never responded at all, and their products are still vulnerable. [emphasis mine]
Props to the CodeRabbit team, and, uh, watch yourself out there otherwise!