From MCP to shell: MCP auth flaws enable RCE in Claude Code, Gemini CLI and more

https://verialabs.com/blog/from-mcp-to-shell/

75•stuxf•4h ago

Comments

fennecbutt•3h ago

Unsurprising. I've left many a comment on what I think of MCP and so have many others.

I'm still not sure why everyone's acting like it's some well thought out system and not just tool descriptions shoveled into JSON and then shoved at an LLM. It's not a fundamental architectural change to enhance tool calls, it just got given a fancy name.

I do get that having a common structure for tool calling is very convenient but it's not revolutionary. What's revolutionary is everyone training their models for a tool calling spec and I'm just not sure that we've seen that yet.

greysteil•2h ago

I dunno, I’m still pretty surprised the MCP server auth process could pop a calculator on widely adopted clients. The protocol isn’t perfect but that’s totally unnecessary unsafe. Glad it’s fixed!

orphea•2h ago

  > Glad it’s fixed!

...and they used some random package with version 0.0.1 instead of writing 20 lines of code themselves.

It's astonishing how allergic some people are to writing their own code, even the simplest shit has to be a dependency. Let's increase the attack surface, that's fine, what can go wrong, right?

https://github.com/modelcontextprotocol/use-mcp/commit/96063...

chrisweekly•1h ago

You have a valid point about dependency management in general, but in this case, the v0.0.1 package was created by the same author "geelen" as the commit you linked. So, they're not allergic to writing the code, and it's not "some random package".

CuriouslyC•2h ago

MCP is legit bad, and it won't last long, just polluting context with MCP output alone is enough to make it a poor long term solution. We're going to end up with some sort of agent VM, where tool data can be conditionally expanded for processing in a given turn without persistently polluting context (think context templates).

mehdibl•1h ago

MCP is only transport protocol here.

And you need tools to connect to external "systems", the context "pollution" can be managed easily. Also even if you don't use MCP you need to use tools and they need to expose their schema to the AI model.

I feel the MCP hype over bad security got a lot confused and very defensive over MCP or more globally tools use.

moduspol•1h ago

LLMs are supposed to be smart. Why can't I point it to API docs and have it call an API?

And why wouldn't we move toward that direction instead of inventing a new protocol?

caust1c•2h ago

Good research. I'm glad people are hopping on this. Lots of surface area to cover and not enough time!

behnamoh•2h ago

I've disabled all MCP servers on my machine until this security nightmare is fully resolved.

MCP is not that elegant anyway, looks more like a hack and ignores decades of web dev/security best practices.

mehdibl•2h ago

What the issues, if you use quality MCP tools?

Also MCP is only transport and there is a lot of mixup to blame the MCP, as most of the prompt injection and similar come from the "TOOLS" behind the MCP. Not MCP as it self here.

Seem this security hype forget one key point: Supply chain & trusted sources.

What is the risk running an MCP server from Microsoft? Or Anthropic? Google?

All the reports explain attacks using flawed MCP servers, so from sources that either are malicious or compromised.

agoodusername63•1h ago

> What the issues, if you use quality MCP tools?

Really doesn't help when discovery of "quality" MCP tools, whatever that means, is so difficult.

mehdibl•2h ago

IT start with "Evil MCP Server".

So you need a server flawed + XSS issue on Cloudflare.

Then you need to use Claude Code, so it's more an issue in Claude Code/Gemini implementation already than MCP.

So if you are ok to run any MCP from any source you have worse issues.

But good find in the open command how it's used in Claude Code/Gemini.

greysteil•2h ago

Is $2,300 the going rate for an RCE with a totally believable attack vector these days?

eranation•2h ago

With my limited understanding of LLMs and MCPs (and please correct me if I'm wrong), even without having to exploit an XSS vulnerability as described in the post (sorry for being slightly off topic), I believe MCPs (and any tool calls protocol) suffer from a fundamental issue, a token is a token, hence prompt injection is probably impossible to 100% protect against. The main root cause of any injection attack is the duality of input, we use bytes, (and in many cases in the form of a string) to convey both commands and data, "rm -rf /" can be an input in a document about dangerous commands, or a command passed to a shell command executor by a tool call. To mitigate such injection attacks, in most programming language there are ways to clearly separate data from commands, in the most basic way, via deterministic lexical structure (double quotes) or or escaping / sanitizing user input, denly-list of dangerous keywords (e.g. "eval", "javascript:", "__proto__") or dedicated DSLs for building commands that pass user input separately (Stored procedures, HTML builders, shell command builders). The solution to the vulnerability in the post is one of them (sanitizing user input / deny-list)

But even if LLMs will have a fundamental hard separation between "untrusted 3rd party user input" (data) and "instructions by the 1st party user that you should act upon" (commands) because LLMs are expected to analyze the data using the same inference models as interpreting commands, there is no separate handling of "data" input vs "command" input to the best of my understanding, therefore this is a fundamentally an unsolvable problem. We can put guardrails, give MCPs least privilege permissions, but even with that confused deputy attacks can and will happen. Just like a human can be fooled by a fake text from the CEO asking them to help them reset their password as they are locked out before an important presentation to a customer, and there is no single process that can 100% prevent all such phishing attempts, I don't believe there will be a 100% solution to prevent prompt injection attacks (only mitigated to become statistically improbable or computationally hard, which might be good enough)

Is this a well known take and I'm just exposing my ignorance?

EDIT: my apologies if this is a bit off topic, yes, it's not directly related to the XSS attack in the OP post, but I'm past the window of deleting it.

Jimmc414•2h ago

While this vulnerability has nothing to do with prompt injection or LLMs interpreting tokens, you do raise a debatable point about prompt injection being potentially unsolvable.

edit: after parent clarification

eranation•1h ago

Yes, my bad, I'm not talking about this particular XSS attack, I'm wondering if MCPs in general have a fundamental injection problem that isn't solvable, indeed a bit off topic.

edit: thanks for the feedback!

mattigames•1h ago

Aside from being offtopic or not I want to add that it is indeed well known https://news.ycombinator.com/item?id=41649832

eranation•1h ago

Thanks! Although thinking of it, while it's not deterministically solvable, I'm sure something like this is what currently being done, e.g, let's say <user-provided-input> </user-provided-input> <tool-response></tool-response> are agreed upon tags to demarcate user generated input, then sanitizing is merely, escaping any injected closing tag, (e.g. </user-provided-input>) to </user-provided-input> (and flagging it as an injection attempt)

Then we just need to train LLMs to 1. not treat user provided / tool provided input as instructions (although sometimes this is the magic, e.g. after doing tool call X, do tool call Y, but this is something the MCP authors will need to change, by not just being an API wrapper...)

2. distinguish between a real close tag and an escaped one, although unless it's "hard wired" somewhere in the inference layer, it's only a matter of statistically improbable for an LLM to "fall for it" (I assume some will attempt, e.g. convince the LLM there is instruction from OpenAI corporate to change how these tags are escaped, or that there is a new tag, I'm sure there are ways to bypass it, but it's probably going to make it less of an issue).

I assume this is what currently being done?

Jimmc414•1h ago

Some of the comments seem to imply that MCP servers should be safe to connect to regardless of trust level, like websites you can safely visit.

But MCP servers are more analogous to a PyPI packages you pip install, npm modules you add to your project or a VSCode extension.

Nobody would argue that pip is fundamentally broken because running pip install malicious-package can compromise your system. That's expected behavior when you execute untrusted code.

jdns•1h ago

i'd honestly say it's closer (but not analogous) to opening a website in your browser. you wouldn't expect javascript on a website to be able to escape the sandbox and run arbitrary code on your computer.

companies taking this seriously and awarding bounties is indicative it's fairly severe

mehdibl•1h ago

this issue is not even MCP at the core. Claude Code/ Gemini CLI were opening "url's" without sanitization and validation. That's the core flaw. There is a second issue with an XSS flawed package too in the bridge that is easy to patch.

So there is a chain of issues and you need to leverage them to get there and first pick an MCP that is flawed from a bad actor.

jdns•1h ago

yeah, i was comparing MCP clients to browsers. connecting to an MCP shouldn't leave you vulnerable to RCE on your host.

also, the way MCP servers are presented right now is in sort of a "marketplace" fashion meaning it's not out of the question you could find one hosted by a bad actor. PyPI/npm are also like this, but it's different since it's not like you can vet the source code of a running MCP. packages are also versioned, unlike MCP where whoever is hosting them can change the behaviour at any time without notice.

mehdibl•1h ago

There is confusion.

1. Not all MCP tools connect to the web or fetch emails. So the shortcut all MCP's are doomed is also the wrong way to adress this.

2. Issue is with MCP with untrusted external sources like web/email that need sanitization like we do with web forms.

3. A lot of warning point bad MCP's! But that apply to any code you might download/ use from the internet. Any package can be flawed. Are you audit them all?

So yeah, on my side I feel this security frenzy over MCP is over hyped. VS the real risk and there is a lot of shortcuts, masking a key issue that is supply chain as an MCP owned issue here and I see that in so many doom comment here.

Jimmc414•1h ago

Also, while I'm generally uncomfortable with being in a position to defend Google, it's a bit questionable calling the Google fix "not very robust" for escaping single quotes in PowerShell.

Perhaps minimal, but this does in fact prevent the specific attack vector they demonstrated. The criticism seems unnecessarily harsh given that Google addressed the vulnerability immediately.

Libghostty is coming

Find SF parking cops

Android users can now use conversational editing in Google Photos

Markov chains are the original language models

How to draw construction equipment for kids

Launch HN: Strata (YC X25) – One MCP server for AI to handle thousands of tools

Go has added Valgrind support

From MCP to shell: MCP auth flaws enable RCE in Claude Code, Gemini CLI and more

Always Invite Anna

Mesh: I tried Htmx, then ditched it

Nine things I learned in ninety years

x402 — An open protocol for internet-native payments

Getting AI to work in complex codebases

Getting More Strategic

Restrictions on house sharing by unrelated roommates

Thundering herd problem: Preventing the stampede

Structured Outputs in LLMs

OpenDataLoader-PDF: An open source tool for structured PDF parsing

Agents turn simple keyword search into compelling search experiences

Zinc (YC W14) Is Hiring a Senior Back End Engineer (NYC)

Zoxide: A Better CD Command

Denmark wants to push through Chat Control

Shopify, pulling strings at Ruby Central, forces Bundler and RubyGems takeover

YAML document from hell (2023)

Show HN: Run Qwen3-Next-80B on 8GB GPU at 1tok/2s throughput

The Great American Travel Book: The book that helped revive a genre

Smooth weighted round-robin balancing

Processing Strings 109x Faster Than Nvidia on H100

Show HN: Kekkai – a simple, fast file integrity monitoring tool in Go

Permeable materials in homes act as sponges for harmful chemicals: study

From MCP to shell: MCP auth flaws enable RCE in Claude Code, Gemini CLI and more

Comments

Libghostty is coming

Find SF parking cops

Android users can now use conversational editing in Google Photos

Markov chains are the original language models

How to draw construction equipment for kids

Launch HN: Strata (YC X25) – One MCP server for AI to handle thousands of tools

Go has added Valgrind support

From MCP to shell: MCP auth flaws enable RCE in Claude Code, Gemini CLI and more

Always Invite Anna

Mesh: I tried Htmx, then ditched it

Nine things I learned in ninety years

x402 — An open protocol for internet-native payments

Getting AI to work in complex codebases

Getting More Strategic

Restrictions on house sharing by unrelated roommates

Thundering herd problem: Preventing the stampede

Structured Outputs in LLMs

OpenDataLoader-PDF: An open source tool for structured PDF parsing

Agents turn simple keyword search into compelling search experiences

Zinc (YC W14) Is Hiring a Senior Back End Engineer (NYC)

Zoxide: A Better CD Command

Denmark wants to push through Chat Control

Shopify, pulling strings at Ruby Central, forces Bundler and RubyGems takeover

YAML document from hell (2023)

Show HN: Run Qwen3-Next-80B on 8GB GPU at 1tok/2s throughput

The Great American Travel Book: The book that helped revive a genre

Smooth weighted round-robin balancing

Processing Strings 109x Faster Than Nvidia on H100

Show HN: Kekkai – a simple, fast file integrity monitoring tool in Go

Permeable materials in homes act as sponges for harmful chemicals: study