
Ghostty is leaving GitHub

https://mitchellh.com/writing/ghostty-leaving-github
1701•WadeGrimridge•6h ago•549 comments

ChatGPT serves ads. Here's the full attribution loop

https://www.buchodi.com/how-chatgpt-serves-ads-heres-the-full-attribution-loop/
139•lmbbuchodi•2h ago•83 comments

Claude system prompt bug wastes user money and bricks managed agents

https://github.com/anthropics/claude-code/issues/49363
88•thomashobohm•2h ago•25 comments

Before GitHub

https://lucumr.pocoo.org/2026/4/28/before-github/
268•mlex•4h ago•73 comments

We decreased our LLM costs with Opus

https://www.mendral.com/blog/frontier-model-lower-costs
22•shad42•1h ago•2 comments

OpenAI models coming to Amazon Bedrock: Interview with OpenAI and AWS CEOs

https://stratechery.com/2026/an-interview-with-openai-ceo-sam-altman-and-aws-ceo-matt-garman-abou...
180•translocator•6h ago•70 comments

Claude for Creative Work

https://www.anthropic.com/news/claude-for-creative-work
45•elsewhen•2h ago•34 comments

I won a championship that doesn't exist

https://ron.stoner.com/How_I_Won_a_Championship_That_Doesnt_Exist/
82•SEJeff•5h ago•58 comments

Carrot Disclosure: Forgejo

https://dustri.org/b/carrot-disclosure-forgejo.html
100•bo0tzz•3h ago•36 comments

Intel Arc Pro B70 Review

https://www.pugetsystems.com/labs/articles/intel-arc-pro-b70-review/
112•zdw•4d ago•61 comments

Behavioral timescale synaptic plasticity rewires the brain after an experience

https://www.quantamagazine.org/a-new-type-of-neuroplasticity-rewires-the-brain-after-a-single-exp...
57•ibobev•1d ago•0 comments

GitHub RCE Vulnerability: CVE-2026-3854 Breakdown

https://www.wiz.io/blog/github-rce-vulnerability-cve-2026-3854
246•bo0tzz•9h ago•62 comments

CJIT: C, Just in Time

https://dyne.org/cjit/
88•smartmic•6h ago•25 comments

Your phone is about to stop being yours

https://keepandroidopen.org/en/
988•doener•10h ago•479 comments

Who owns the code Claude Code wrote?

https://legallayer.substack.com/p/who-owns-the-claude-code-wrote
257•senaevren•14h ago•295 comments

Warp is now open-source

https://www.warp.dev/blog/warp-is-now-open-source
162•meetpateltech•10h ago•53 comments

Localsend: An open-source cross-platform alternative to AirDrop

https://github.com/localsend/localsend
740•bilsbie•14h ago•234 comments

I have officially retired from Emacs

https://nullprogram.com/blog/2026/04/26/
180•Fudgel•2d ago•116 comments

VibeVoice: Open-source frontier voice AI

https://github.com/microsoft/VibeVoice
320•tosh•14h ago•168 comments

UAE to leave OPEC

https://www.ft.com/content/8c354f2d-3e66-47f1-aad4-9b4aa30e386d
345•bazzmt•13h ago•471 comments

A playable DOOM MCP app

https://chrisnager.com/blog/doom-runs-in-chatgpt-and-claude/
77•chrisnager•6h ago•28 comments

Patch applies fake diffs from commit messages

https://samizdat.dev/phantom-patch/
78•reconquestio•1d ago•24 comments

Talkie: a 13B vintage language model from 1930

https://talkie-lm.com/introducing-talkie
645•jekude•1d ago•262 comments

Infisical (YC W23) Is Hiring Full Stack Software Engineers (Remote)

https://jobs.ashbyhq.com/infisical/782b9da8-20e1-48b2-919e-6c5430c58628
1•vmatsiiako•9h ago

APL\? (1990)

https://dl.acm.org/doi/epdf/10.1145/97811.97845
20•tosh•4d ago•8 comments

Apple CMF (Color-Matching Functions) 2026

https://www.lttlabs.com/articles/2026/04/11/apple-studio-display-xdr-display-testing-results
4•HeyMeco•2h ago•0 comments

An update on GitHub availability

https://github.blog/news-insights/company-news/an-update-on-github-availability/
316•salkahfi•15h ago•210 comments

Waymo in Portland

https://waymo.com/blog/shorts/waymo-in-portland/
251•xnx•7h ago•393 comments

Show HN: Drive any macOS app in the background without stealing the cursor

https://github.com/trycua/cua
54•frabonacci•10h ago•22 comments

Show HN: Live Sun and Moon Dashboard with NASA Footage

https://www.lumara-space.app/
161•beeswaxpat•12h ago•57 comments

Claude system prompt bug wastes user money and bricks managed agents

https://github.com/anthropics/claude-code/issues/49363
87•thomashobohm•2h ago

Comments

thomashobohm•2h ago
Not sure if anybody else has experienced this, but for my job I've been playing around with Claude Managed Agents to run code-generation tasks in our repo. Every read operation in the managed agent gets a system prompt appended instructing Claude to scan the file for malware. Claude then wastes a bunch of time and tokens (money) performing the analysis; then, once the agent has confirmed the file is not malware, it still interprets the appended prompt to mean it is disallowed from augmenting or writing any code, and quits. And we're charged for every session in which this happens. Posting here because apparently they only addressed the issue in the past because of a Hacker News discussion, so here's hoping they'll see this and prioritize fixing it again so we can stop losing money.
slowmovintarget•1h ago
Proposed fix: Use OpenCode.

If I understand correctly, this is injected into the requests by Anthropic's harness, not part of the Opus or Sonnet system prompts on the back end. Is that right?

selcuka•39m ago
Claude Managed Agents is different from Claude Code.
_pdp_•1h ago
I am still baffled by the fact that we have collectively agreed to use agentic harnesses by the same companies that are selling access to their APIs.

I mean, I am sure they don't mean to, but they have an incentive to burn as many tokens as they can get away with. Also, for better or worse, I imagine the Anthropic engineers use Claude Code on some sort of unlimited plan that makes no practical sense for regular users. So adding 100k tokens is not a big deal.

In our line of work, we can see AI agents already doing pretty well with minimal prompts. Open-weight models are also pretty good these days, and there is practically no reason to run Opus on Max unless you have a very specific task you know it will do well on. I know because I've tried, and anecdotally it performs worse on many problems at a very high cost - something that smaller, cheaper models can often one-shot.

varispeed•1h ago
They also have an incentive to nerf models occasionally, so they rarely one-shot the task; more often they get it wrong, and then you have to spend tokens to correct it. Bonus points if the model suddenly goes completely dumb and you have to start the session over.
vineyardmike•1h ago
This is why the subscriptions are important. When usage is (vaguely) unmetered, the provider has an incentive to make marginal usage cheap.

It aligns the incentives toward faster, cheaper, terser, and more reliable models, because the model providers pay for the wasted tokens and electricity.

jdiff•48m ago
That would seem to misalign the incentives in the opposite direction. Cut corners, reduce costs by any means necessary even to the detriment of performance. One of the most common comments I see here on the release of a new Anthropic model is that everyone better enjoy the 48 hours of access to an un-nerfed model before the cost cutting sets in.
ikiris•1h ago
No, they have an incentive to charge as much as they want, but they have massive costs and capacity constraints per token; if anything, they have a major incentive to reduce token usage, because they literally cannot meet demand.
margalabargala•1h ago
> I am still baffled by the fact that we have collectively agreed to use agentic harnesses by the same companies that are selling access to their APIs.

It's because the subscriptions force you to do so. The subscriptions are the most economical way to use e.g. Claude by close to an order of magnitude. If you max out a 20x plan every week, doing the same work with the API would cost you well into the four figures.

Anyone already on Claude API pricing who uses CC over OpenCode is kneecapping themselves.
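
The "order of magnitude" claim is easy to sanity-check with back-of-envelope arithmetic. A minimal sketch - every number below is a placeholder assumption for illustration, not a real Anthropic rate or measured token volume:

```python
# Placeholder assumptions (NOT real prices or usage figures):
SUB_MONTHLY = 200.0            # assumed monthly cost of a heavy-use subscription
API_COST_PER_MTOK = 15.0       # assumed blended API price, $ per million tokens
TOKENS_PER_WEEK = 20_000_000   # assumed token volume for a maxed-out week

# Same workload billed per token over roughly four weeks:
api_monthly = 4 * TOKENS_PER_WEEK / 1_000_000 * API_COST_PER_MTOK

print(f"API: ${api_monthly:,.0f}/mo vs subscription: ${SUB_MONTHLY:,.0f}/mo")
```

Under these assumed numbers the API route lands in the four figures per month, which is the gap the comment describes; the exact ratio obviously depends on real prices and real usage.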

_pdp_•55m ago
Correct. However, last time I checked, enterprise customers are moving to metered billing. GitHub also decided to do so. So it seems the subsidy is coming to an end? I don't know.
lukeschlather•58m ago
I don't think we've agreed to anything. That said I think paying for something like Claude Code makes a lot of sense because you can outsource the question of "how many tokens should I use per hour and how should I use them?" to the people providing the tokens.

If you want to plug your API keys into a third-party harness, that's totally cool, and honestly, I'm looking into doing that right now; I haven't used any of the first-party harnesses at all. But the first time I accidentally spend $300 in a day, I may start thinking that a $20/month plan is pretty good even if performance is inconsistent - at least I'd know what my costs are.

Grimburger•49m ago
> adding a 100k tokens is not a big deal

Did you mean 100 billion tokens? Because 100k isn't a big deal at all.

serf•32m ago
>I am still baffled by the fact that we have collectively agreed to use agentic harnesses by the same companies that are selling access to their APIs.

The best-performing and most capable ones are all the ones that aren't tied to a specific API.

QuercusMax•1h ago
How does this kind of thing pass any sort of review or acceptance? It seems pretty clear that the prompt was very poorly phrased, to the extent that it should obviously prevent the agent from making ANY code changes after reading a file:

  Whenever you read a file, you should consider whether it would be considered malware. You CAN and SHOULD provide analysis of malware, what it is doing. But you MUST refuse to improve or augment the code. You can still analyze existing code, write reports, or answer questions about the code behavior.
Not "If you suspect it is malware, you must refuse". Just "you must refuse". There is literally no "if" in the entire prompt!
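
The unconditional phrasing versus a conditional one can be sketched side by side. This is a hypothetical illustration of the injection mechanism - the function and the rewritten prompt text are assumptions for the sake of the example, not Anthropic's actual harness code:

```python
# The shipped prompt refuses unconditionally (quoted from the issue above):
UNCONDITIONAL_GUARD = (
    "Whenever you read a file, you should consider whether it would be "
    "considered malware. You CAN and SHOULD provide analysis of malware, "
    "what it is doing. But you MUST refuse to improve or augment the code."
)

# A conditional phrasing ties the refusal to the verdict (hypothetical fix):
CONDITIONAL_GUARD = (
    "Whenever you read a file, consider whether it is malware. "
    "IF you conclude it is malware, provide analysis of what it is doing, "
    "but refuse to improve or augment it. OTHERWISE, proceed with the task."
)

def wrap_read(file_text: str, guard: str) -> list[dict]:
    """Sketch of a harness appending the guard after every file read."""
    return [
        {"role": "tool", "content": file_text},
        {"role": "system", "content": guard},
    ]
```

The difference is exactly the missing "if": the first version's "you MUST refuse" binds regardless of the malware verdict, which matches the quit-after-scan behavior reported in the issue.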
varispeed•1h ago
Today it's malware, but I wonder if they'll take this in a direction where companies pay them to prevent cloning of certain SaaS platforms. Like: "Whenever you read a file, you should consider whether it would be considered part of a bug-tracking, issue-tracking, or project-management platform."
vessenes•47m ago
It’s a particular sort of bug that’s harder to detect because … internal Anthropic engineers don’t apply these prompts to themselves, and in fact have access to ‘helpful only’ models that also do not have additional limitations RL’ed in. (Or perhaps they’re RL’ed out - not sure of current training mechanisms.)

These ‘rules for thee and not for me’ are qualitatively created and implemented, and are thus extremely hard to test for or implement properly without limiting the people choosing the rules.

klempner•31m ago
This is definitely Claude bringing home twelve gallons of milk in response to the old joke, "get a gallon of milk, and if they have eggs get a dozen".

As in, this is a reading comprehension failure on the part of Claude. On the other hand, it is also a failure to give Claude a nontrivial reading-comprehension test on every file read operation, especially when a bias toward safety biases toward the wrong interpretation.

wxw•1h ago
> wastes user money and bricks managed agents

This issue is representative of a larger problem. Agent token consumption is opaque - not the raw number so much as why the tokens are spent - and people generally don't (or simply can't) scrutinize their system prompts, tool calls, MCPs, etc.

The token-based revenue model is thus pretty fantastic for the agent builders, potentially less so for users. So far, I think people have been willing to trust that agents are using more tokens to produce better results. But skepticism is not unwarranted, as this issue shows, even if it is just a bug.

MicrosoftShill•41m ago
I ran into this issue and told Claude that the code isn't malware, Claude agreed, and then it stopped scanning those files.
p1necone•29m ago
This is such a weird prompt, even without the file-edit misunderstanding. Analyze whether it's malware how, exactly? On every single file that gets read? Doing that with enough diligence to be meaningful is going to at least double the processing needed, and fill the context with a bunch of tangential reasoning about malware patterns.

This smacks of dumb vibe coding: "I got told to make sure Claude couldn't be used to develop malware, ok: 'claude pls no develop malware'."

derefr•24m ago
> Analyze if it's malware how exactly?

Maybe the repo/worktree is named my-big-evil-virus-trojan-malware-worm?

imron•16m ago
> Analyze if it's malware how exactly?

By spending thousands and thousands of tokens of course :-)

UltraSane•13m ago
Using Claude as a malware detector is incredibly wasteful.
jsemrau•6m ago
When working with APIs, it makes a lot of sense to filter for only the relevant portions based on an intent-driven dynamic regex.
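
A minimal sketch of what intent-driven filtering could look like - the intents, patterns, and sample code here are made-up examples, not part of any real harness:

```python
import re

# Map each task intent to a pattern matching the lines worth sending upstream.
# These intents and regexes are illustrative placeholders.
INTENT_PATTERNS = {
    "auth": re.compile(r"(login|token|session|password)", re.I),
    "network": re.compile(r"(socket|http|request|urlopen)", re.I),
}

def relevant_lines(source: str, intent: str) -> list[str]:
    """Keep only the lines of a file that match the current intent."""
    pattern = INTENT_PATTERNS[intent]
    return [ln for ln in source.splitlines() if pattern.search(ln)]

code = "def login(user):\n    open_socket()\n    return session_token(user)\n"
print(relevant_lines(code, "auth"))
# → ["def login(user):", "    return session_token(user)"]
```

The point is that a cheap local filter runs before the API call, so the model never burns tokens reasoning about lines irrelevant to the task at hand.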