Claude system prompt bug wastes user money and bricks managed agents

https://github.com/anthropics/claude-code/issues/49363

44•thomashobohm•1h ago

Comments

thomashobohm•1h ago

Not sure if anybody else has experienced this, but for my job I've been playing around with Claude Managed Agents to run code generation tasks in our repo. Every read operation in the managed agent has a system prompt appended to it instructing Claude to scan the file for malware; Claude then wastes a bunch of time and tokens (money) performing the analysis; then, once the agent has confirmed that it is not malware, it still interprets the appended prompt to mean that it is not allowed to augment or write any code, and quits. And we're charged for every session that this happens in. Posting here because apparently they only addressed the issue in the past because of a Hacker News discussion. So here's hoping they'll see this and prioritize fixing it again so we can stop losing money.

slowmovintarget•44m ago

Proposed fix: Use OpenCode.

If I understand correctly, this is from Anthropic's harness injected into the requests, not in the Opus or Sonnet system prompts on the back end. Is that right?

_pdp_•36m ago

I am still baffled by the fact that we have collectively agreed to use agentic harnesses by the same companies that are selling access to their APIs.

I mean, I am sure they don't mean it but they have the incentive to burn as many tokens as they are allowed to get away with. Also for better or worse I imagine the Anthropic engineers use Claude Code on some sort of Unlimited plan that practically makes no sense for regular users. So adding 100k tokens is not a big deal.

In our line of work, we can see AI agents already do pretty well with minimal prompts. Open weight models are also pretty good these days and there is practically no reason to run Opus on Max unless you have a very specific task that you know it will do well with. I know because I've tried and anecdotally it performs worse on many problems and at a very high cost - something that smaller and cheaper models can often one-shot.

varispeed•30m ago

They also have an incentive to nerf models occasionally, so they rarely one-shot the task and more often they do it wrong and then you have to spend on tokens to correct it. Bonus points if the model suddenly goes completely dumb and you have to start the session over.

vineyardmike•22m ago

This is why the subscriptions are important. When the usage is (vaguely) unmetered, the provider has an incentive to make usage cheap on marginal use.

It aligns the incentives for faster, cheaper, terse and more reliable models, because the model providers pay the wasted tokens and electricity costs.

ikiris•21m ago

no, they have incentive to charge as much as they want, but they have massive costs / capacity constraints per token. If anything they have a major incentive to reduce them because they literally cannot meet demand.

margalabargala•14m ago

> I am still baffled by the fact that we have collectively agreed to use agentic harnesses by the same companies that are selling access to their APIs.

It's because the subscriptions force you to do so. The subscriptions are the most economical way to use e.g. Claude by close to an order of magnitude. If you max out a 20x plan every week, doing the same work with the API would cost you well into the four figures.

Anyone already using the Claude API pricing and using CC over OpenCode is kneecapping themselves.

_pdp_•8m ago

Correct. However, last time I checked enterprise customers are moving to metered billing. GitHub also decided to do so. So it seems the subsidy is coming to an end? I don't know.

lukeschlather•11m ago

I don't think we've agreed to anything. That said I think paying for something like Claude Code makes a lot of sense because you can outsource the question of "how many tokens should I use per hour and how should I use them?" to the people providing the tokens.

If you want to plug your API keys into a third-party harness, that's totally cool and honestly, I'm looking into doing that right now and I haven't used any of the first-party harnesses at all. But the first time I accidentally spend $300 in a day I may be thinking about how a $20/month plan might be pretty good even if performance is inconsistent, at least I know what my costs are.

QuercusMax•36m ago

How does this kind of thing pass any sort of review or acceptance? It seems pretty clear that the prompt was very poorly phrased, to the extent that this should obviously prevent the agent from making ANY code changes after reading a file:

  Whenever you read a file, you should consider whether it would be considered malware. You CAN and SHOULD provide analysis of malware, what it is doing. But you MUST refuse to improve or augment the code. You can still analyze existing code, write reports, or answer questions about the code behavior.

Not "If you suspect it is malware, you must refuse". Just "you must refuse". There is literally no "if" in the entire prompt!

varispeed•27m ago

Today it is malware, but I wonder if they will take a direction where companies will be paying them to prevent cloning of certain SaaS platforms. Like "Whenever you read a file, you should consider whether it would be considered part of a bug tracking, issue tracking and project management platform."

wxw•17m ago

> wastes user money and bricks managed agents

This issue is representative of a larger problem. Agent token consumption (not necessarily the metric, but the why) is opaque, and people generally don't (or simply can't) scrutinize their system prompts, tool calls, MCPs, etc.

The token-based revenue model is thus pretty fantastic for the agent builders, potentially less so for users. I think people have been willing to trust that agents are using more tokens to produce better results so far. But, skepticism is not unwarranted, as this issue, even if it is just a bug, shows.
