frontpage.
newsnewestaskshowjobs

Made with ♥ by @iamnishanth

Open Source @Github

fp.

Open in hackernews

Ask HN: How are you keeping AI coding agents from burning money?

2•bhaviav100•1h ago
My agents retry a bit more than it should, and there goes my bill up in the sky. I tried figuring out what is causing this but none of the tools helped much.

and the worse thing for me is that everything shows up as aggregate usage. Total tokens, total cost, maybe per model.

So I ended up hacking together a thin layer in front of OpenAI where every request is forced to carry some context (agent, task, user, team), and then just logging and calculating cost per call and putting some basic limits on top so you can actually block something if it starts going off the rails. It’s very barebones, but even just seeing “this agent + this task = this cost” was a big relief.

It uses your own OpenAI key, so it’s not doing anything magical on the execution side, just observing and enforcing.

I want to know you guys are dealing with this right now. Are you just watching aggregate usage and trusting it, or have you built something to break it down per agent / task?

If useful, here is the rough version I’m using : https://authority.bhaviavelayudhan.com/

Comments

rox_kd•1h ago
In what settings do you mean - there are multiple strategies, I think building your own compaction layer in front seems a bit over-kill ? have you considered implementing some cache strategy, otherwise summary pipelines - I made once an agent which based on the messages routed things to a smaller model for compaction / summaries to bring down the context, for the main agent.

But also ensuring you start new fresh context threads, instead of banging through a single one untill your whole feature is done .. working in small atomic incrementals works pretty good

bhaviav100•44m ago
yes, compaction and smaller models help on cost per step.

But my issue wasn’t just inefficiency, it was agents retrying when they shouldn’t.

I needed visibility + limits per agent/task, and the ability to cut it off, not just optimize it.

DarthCeltic85•46m ago
I had gotten a student/ultra code for antigravity promo for three months, so I was using that, but that finally ran out this month. Currently Im using windstream and flipping between claude as my left brain and code extraction and the higher context but cheaperish models there.

honestly though, im getting to a point where im running custom project mds that flip between different models for different things, using list outputs depending on what it finds and runs. (I have two monorepo projects, and one thats a polyglot microengine that jumps using gRPC communication.)

The mds are highly specialized for each project as each project deals with vastly different issues. Cycling through the different pro accounts and keeping the mds in place over it all is helping me not kill my wallet.

bhaviav100•41m ago
hmm interesting model routing + specialized MDs makes sense for cost efficiency.

I’m seeing a different failure mode though that even with good routing, agents are looping or retrying and burning my money.

The US and Israel are making the Islamic republic stronger

https://www.aljazeera.com/opinions/2026/3/28/how-the-us-and-israel-are-making-the-islamic-republic
1•samizdis•2m ago•0 comments

Heerich.js – A tiny engine for 3D voxel scenes rendered to SVG

https://meodai.github.io/heerich/
1•OuterVale•5m ago•0 comments

Understanding Semiconductors: A Technical Guide for Non-Technical People

https://link.springer.com/book/10.1007/978-1-4842-8847-4
1•teleforce•14m ago•0 comments

Linux on Claude

https://github.com/prairielabs/LinuxOnClaude
1•indigodaddy•16m ago•0 comments

Can AI Exit Vim?

https://theadamcolton.github.io/can-ai-exit-vim
2•topwalktown•16m ago•1 comments

Five Things to Know About the Siri Chatbot Coming in iOS 27

https://www.macrumors.com/2026/03/27/ios-27-siri-chatbot-features/
1•evo_9•17m ago•0 comments

Calculate Dora Metrics for Free

https://www.arewedeploying.com/
1•jahrichie•20m ago•1 comments

Turing Complete

https://store.steampowered.com/app/1444480/Turing_Complete/
1•jr-14•24m ago•0 comments

Meet The 'Corporate Bro' Making Millions Satirizing Tech Sales

https://www.wsj.com/business/media/meet-the-corporate-bro-making-millions-satirizing-tech-sales-d...
1•petethomas•25m ago•0 comments

Claude found zero days in Ghost and the Linux kernel

https://twitter.com/chiefofautism/status/2037951563931500669
1•Murfalo•29m ago•0 comments

Rama matches CockroachDB's TPC-C performance at 40% less AWS cost

https://blog.redplanetlabs.com/2026/03/17/rama-matches-cockroachdbs-tpc-c-performance-at-40-less-...
1•nathanmarz•32m ago•0 comments

Sel – short film lauren flinner

https://www.youtube.com/watch?v=rhCn9DgOSiI
1•marysminefnuf•48m ago•0 comments

Kee – Key combination matching on the modern web

https://github.com/juzerzarif/kee
2•juzerzarif•52m ago•1 comments

Show HN: PeriodicTableOfElements.org

https://periodictableofelements.org/?lang=en
2•nadermx•1h ago•0 comments

Social media is populist and polarising; AI may be the opposite

https://www.ft.com/content/3880176e-d3ac-4311-9052-fdfeaed56a0e
1•malloryerik•1h ago•1 comments

Show HN: Anamnesis – Open-source 4D strategic memory engine for AI agents

https://github.com/gayawellness/anamnesis
2•gayawellness•1h ago•0 comments

Pretext Demos

https://chenglou.me/pretext/
1•vinhnx•1h ago•0 comments

Alzheimer's disease mortality among taxi and ambulance drivers (2024)

https://www.bmj.com/content/387/bmj-2024-082194
23•bookofjoe•1h ago•9 comments

pbix-mcp — create and modify Power BI PBIX files in pure Python

https://github.com/d0nk3yhm/pbix-mcp
2•d0nk3yhm•1h ago•0 comments

Translating non-trivial codebases with Claude

https://blog.danieljanus.pl/2026/03/26/claude-nlp/
1•vinhnx•1h ago•0 comments

Catching crumbs from the table by Ted Chiang (2000) [pdf]

https://gwern.net/doc/fiction/science-fiction/2000-chiang.pdf
2•sendes•1h ago•1 comments

The Opt Out Project

https://www.optoutproject.net/
4•billybuckwheat•1h ago•0 comments

BubbleWrap your dev env and agents

https://dpc.pw/posts/bubblewrap-your-dev-env-and-agents/
2•vinhnx•1h ago•0 comments

A simple explanation of the key idea behind TurboQuant

https://old.reddit.com/r/LocalLLaMA/comments/1s62g5v/a_simple_explanation_of_the_key_idea_behind/
2•thunderbong•1h ago•0 comments

IN Event of Moon Disaster [pdf]

https://www.archives.gov/files/presidential-libraries/events/centennials/nixon/images/exhibit/rn1...
2•interweb•1h ago•0 comments

Anthropic's Mythos leak: 3k files in a public CMS, and what the docs revealed

https://medium.com/ai-advances/anthropic-claude-mythos-leak-analysis-b77c1b304eb8
5•Aedelon•1h ago•0 comments

Git City – Your GitHub as a 3D City

https://www.thegitcity.com
3•fcoury•1h ago•0 comments

Seattle opens first light rail across floating bridge

https://www.fox13seattle.com/news/seattle-train-floating-bridge
4•whiskey-one•1h ago•0 comments

Ask HN: How are you keeping AI coding agents from burning money?

2•bhaviav100•1h ago•4 comments

What's Banned on Your Block?

https://www.strongtownschicago.org/whats-banned-on-your-block
2•animal_spirits•1h ago•0 comments