Show HN: I nerfed our coding agents on purpose

19•noahfradin•1h ago
Tl;dr: I trained a classifier that routes each request to the least expensive model and reasoning depth able to complete it. Coupling that with additional automated token-efficiency techniques has yielded 3x usage for the same spend. For anyone interested in trying it themselves: https://nerfguard.com

Several teammates and I recently switched from Claude Code to Codex. We still bounce between the tools, but Codex’s speed and steerability, coupled with its performance gains, were hard to ignore. One downside was that per-token pricing kicked in much sooner. This is happening across the board, but we felt it more acutely in Codex. We’re a startup full of people who work around the clock and are obsessed with building, so naturally our daily bill alone was striking.

Luckily, we’re going after a big mission, and speed matters significantly more than marginal token spend at the edges. Still, it struck us as ludicrous: our product has the side effect of decreasing token spend and speeding up agentic workflows by many orders of magnitude, yet we were using top-tier models for all kinds of internal coding tasks without any of those optimizations. The waste felt pretty ridiculous. The most glaring culprit was that we were seemingly using the max-intelligence model on max reasoning for every task, even when the task clearly didn’t require it. As a company that spends a lot of time on cached intelligence, we could also easily see plenty of other low-hanging fruit.

So, on a recent weekend, I quickly built a tool to optimize our usage. At its core is a very fast classifier that maps each request to the lowest level of intelligence the task requires, with some nice token optimizations on top. The result is roughly the same quality for several times less token spend. Even more exciting for us, the properly bin-packed intelligence and reasoning levels meant our speed also went up considerably. This wasn’t negligible.
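To make the routing idea concrete, here is a minimal sketch in Python. It is not the actual Nerfguard implementation: the post describes a trained classifier, while this stand-in uses simple keyword and length heuristics. The tier names, model labels, reasoning labels, and hint lists are all hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    model: str      # hypothetical tier label, not a real model ID
    reasoning: str  # hypothetical reasoning-depth label


# Cheapest tier first. All names here are illustrative assumptions.
ROUTES = {
    "simple": Route("small-model", "low"),
    "moderate": Route("mid-model", "medium"),
    "complex": Route("large-model", "high"),
}

# Keyword hints standing in for a trained classifier (assumption).
# Matching is by substring, so "prefix" would also match "fix".
COMPLEX_HINTS = ("refactor", "architecture", "race condition", "migrate", "design")
MODERATE_HINTS = ("implement", "debug", "add test", "fix")


def classify(request: str) -> str:
    """Return the lowest tier that the heuristics consider sufficient."""
    text = request.lower()
    if any(hint in text for hint in COMPLEX_HINTS) or len(text) > 2000:
        return "complex"
    if any(hint in text for hint in MODERATE_HINTS) or len(text) > 400:
        return "moderate"
    return "simple"


def route(request: str) -> Route:
    """Pick the model and reasoning depth for a single request."""
    return ROUTES[classify(request)]


if __name__ == "__main__":
    for prompt in (
        "Rename variable foo to bar",
        "Fix the failing login test",
        "Refactor the auth module to remove the race condition",
    ):
        print(prompt, "->", route(prompt))
```

A real router would replace `classify` with a learned model and would probably need a fallback that escalates to a higher tier when the cheap one fails.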

We’ve observed up to 3x savings, plus hours saved per person per day that we would otherwise have spent waiting on tool turns and coding agent responses.

For us, that means improved engineering velocity and significantly higher usage for the same spend. It also means more usage before getting throttled.
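The link between routing and "more usage for the same spend" is simple arithmetic, sketched below. Every number is a made-up assumption for illustration (relative cost units per request and an invented traffic mix), not a measurement from the post.

```python
import math


def blended_cost(mix, cost_per_request):
    """Average cost per request, given each tier's share of traffic."""
    if not math.isclose(sum(mix.values()), 1.0):
        raise ValueError("traffic shares must sum to 1")
    return sum(share * cost_per_request[tier] for tier, share in mix.items())


def usage_multiplier(baseline_cost, routed_cost):
    """How many times more requests fit in the same budget after routing."""
    return baseline_cost / routed_cost


# Hypothetical relative costs per request (arbitrary units).
COST = {"simple": 1.0, "moderate": 3.0, "complex": 10.0}
# Hypothetical share of requests that the router sends to each tier.
MIX = {"simple": 0.5, "moderate": 0.3, "complex": 0.2}

baseline = COST["complex"]        # everything on the top tier
routed = blended_cost(MIX, COST)  # cost after routing by difficulty
multiplier = usage_multiplier(baseline, routed)

if __name__ == "__main__":
    print(f"{multiplier:.2f}x requests for the same budget")
```

With these invented inputs the multiplier comes out at about 2.9x. The real figure depends entirely on the actual price gap between tiers and on how many requests can safely be routed down.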

As I told friends about this, they also wanted to start using it to maximize the usage they could get out of their coding agent plans. Engineers at many of the most cutting-edge AI companies now use this tool to optimize their token utilization in this way. Not just to save money, but to maximize output. Turns out that the best way to avoid getting nerfed by Claude is to selectively nerf yourself on purpose. We decided to release it for the rest of the builder community to use as well. You can now turn on Nerfguard for yourself and start getting more usage today.

Comments

andrewlau624•1h ago
compelling. i've seen context compression and caching tools before, but combining spend optimization with model routing and throughput gains is a smart angle.
snookie139•1h ago
Nice! Always thought something like this should exist. Will definitely try it out!
FLFSandy•1h ago
Wow, we are really struggling with our token costs. I'll def be sharing it with our team!
woodedpisces•1h ago
how much do your tokens actually cost? for me, it's no more than a few thousand so I don't really see the need for this.
gnabgib•45m ago
What's a few thousand kilos of gold, between friends?
kburman•1h ago
All new accounts, created within a few minutes of each other. Nothing to see here.
jonappleseed22•1h ago
Thanks for the feedback. We have a few interns who are new to Hacker News.

If you have feedback on the product, would love to hear!

gnabgib•46m ago
I thought this looked interesting and you were going to test..? https://news.ycombinator.com/item?id=48419764
AmazingEveryDay•1h ago
It's sad and inept, but worse yet, perhaps they'll learn and use seasoned accounts next time.
gnabgib•47m ago
Complete with embarrassing replies. Amazing
aka22208•1h ago
I’ll give this a shot. I might be doing it wrong, but if I split the work between Codex and Claude in the same VS instance, I rarely run into usage limitations.
