Ask HN: How are you estimating API costs before committing to an architecture?

1•sarthakaggarwal•2h ago
I've been building agent workflows on top of Claude's API and keep running into the same problem: I can't predict what a feature will cost until I've already built it and run it at some scale.

The input side is manageable — you can count tokens before sending. But output tokens are essentially unknowable upfront, and with agents that chain multiple calls (tool use, multi-turn reasoning, retries on failure), a single user action might be 3 API calls or 40. Multiply that by prompt caching behavior (which is great when it hits, but you can't always guarantee it), and the cost variance per task can easily be 10-20x.
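To make that variance concrete, here is a rough back-of-envelope sketch in Python that bounds the cost of one task from the number of chained calls, the output length, and the cache hit ratio. Everything in it is an assumption for illustration: the `Prices` values are placeholders rather than any provider's real rates, input size is treated as constant per call (in a real agent loop the context usually grows each turn), and cache-write surcharges are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prices:
    """USD per million tokens. Placeholder numbers, not real list prices."""
    input: float
    output: float
    cache_read: float


def call_cost(prices: Prices, input_tokens: int, output_tokens: int,
              cache_hit_ratio: float) -> float:
    """Cost of one API call, with a fraction of the input served from cache."""
    cached = input_tokens * cache_hit_ratio
    uncached = input_tokens - cached
    total = (uncached * prices.input
             + cached * prices.cache_read
             + output_tokens * prices.output)
    return total / 1_000_000


def task_cost_bounds(prices: Prices, calls_min: int, calls_max: int,
                     input_tokens: int, out_min: int, out_max: int,
                     cache_best: float, cache_worst: float) -> tuple[float, float]:
    """Best case: few calls, short outputs, good caching. Worst case: the opposite."""
    low = calls_min * call_cost(prices, input_tokens, out_min, cache_best)
    high = calls_max * call_cost(prices, input_tokens, out_max, cache_worst)
    return low, high


if __name__ == "__main__":
    # Hypothetical rates and workload, chosen only to show the spread.
    prices = Prices(input=3.0, output=15.0, cache_read=0.3)
    low, high = task_cost_bounds(prices, calls_min=3, calls_max=40,
                                 input_tokens=4000, out_min=200, out_max=1000,
                                 cache_best=0.9, cache_worst=0.0)
    print(f"per-task cost between ${low:.4f} and ${high:.4f}")
```

A bound like this will not predict the actual bill, but it does show which knob dominates: here the call count and the cache miss rate compound, so capping the number of agent steps narrows the range more than trimming output length does.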

This makes it really hard to do basic things like: set pricing for an AI-powered feature, decide whether an approach is even economically viable before building it, or give finance any kind of credible forecast.

What I've tried/looked at so far:

- Anthropic's token counting endpoint gives you exact input token counts pre-flight, which helps, but doesn't solve the output/chaining problem

- Logging everything post-hoc and building up averages per workflow — works but you're already committed by that point

- Setting hard spend caps at the API level — blunt instrument, doesn't help with per-feature attribution

- Looked at various OSS tools (ccusage, Langfuse, Helicone): mostly retrospective dashboards, good for "what did I already spend?" but not "what will I spend?"
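One way to turn the logging approach into something forward-looking is to run a small pilot, log the cost of each task, and bootstrap a monthly forecast from that sample instead of quoting a single average. The sketch below is a minimal illustration under stated assumptions: `pilot_costs` is a hypothetical list of per-task costs in USD pulled from your own logs, tasks are treated as independent, and the pilot is assumed to be representative of production traffic.

```python
import random
import statistics


def forecast_monthly_spend(pilot_costs: list[float], tasks_per_month: int,
                           trials: int = 500, seed: int = 0) -> dict[str, float]:
    """Bootstrap the monthly total: resample logged per-task costs with
    replacement, sum them, and summarize the distribution of totals."""
    if not pilot_costs:
        raise ValueError("need at least one logged task cost")
    rng = random.Random(seed)  # fixed seed so the forecast is reproducible
    totals = [
        sum(rng.choices(pilot_costs, k=tasks_per_month))
        for _ in range(trials)
    ]
    # n=20 yields 19 cut points at 5% steps; the last one is the 95th percentile.
    cuts = statistics.quantiles(totals, n=20, method="inclusive")
    return {
        "mean": statistics.fmean(totals),
        "p50": statistics.median(totals),
        "p95": cuts[18],
    }


if __name__ == "__main__":
    # Hypothetical pilot: mostly cheap tasks plus a few expensive agent runs.
    pilot = [0.02] * 40 + [0.15] * 8 + [0.90] * 2
    print(forecast_monthly_spend(pilot, tasks_per_month=2000))
```

Giving finance a p50 and a p95 rather than one number makes the heavy tail visible. It still requires running something first, but a prototype with a few dozen tasks is a much smaller commitment than a shipped feature.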

How are you handling this, especially if you're running agent-heavy workloads or building products where AI cost is a meaningful part of COGS? Are you doing any kind of pre-flight estimation? Cost-aware routing between models? Or just building first and optimizing later?
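For reference, cost-aware routing can be as simple as picking the cheapest model that clears a quality bar and fits a per-call budget. The sketch below is only one possible shape for it: the model names, prices, and `quality_tier` values are made up, and the expected output length is a guess you would supply from your own logs.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelOption:
    name: str            # hypothetical label, not a real model ID
    input_price: float   # USD per million input tokens (placeholder)
    output_price: float  # USD per million output tokens (placeholder)
    quality_tier: int    # higher means more capable; assigned by you


def estimate_cost(option: ModelOption, input_tokens: int,
                  expected_output_tokens: int) -> float:
    """Pre-flight estimate: exact input count plus a guessed output length."""
    return (input_tokens * option.input_price
            + expected_output_tokens * option.output_price) / 1_000_000


def route(options: list[ModelOption], input_tokens: int,
          expected_output_tokens: int, min_tier: int,
          budget_usd: float) -> Optional[ModelOption]:
    """Return the cheapest option that meets the tier and budget, else None."""
    eligible = [
        o for o in options
        if o.quality_tier >= min_tier
        and estimate_cost(o, input_tokens, expected_output_tokens) <= budget_usd
    ]
    if not eligible:
        return None  # caller decides: refuse, truncate context, or ask a human
    return min(eligible,
               key=lambda o: estimate_cost(o, input_tokens, expected_output_tokens))


if __name__ == "__main__":
    options = [
        ModelOption("small-model", input_price=1.0, output_price=5.0, quality_tier=1),
        ModelOption("large-model", input_price=3.0, output_price=15.0, quality_tier=2),
    ]
    choice = route(options, input_tokens=10_000, expected_output_tokens=1_000,
                   min_tier=1, budget_usd=0.05)
    print(choice.name if choice else "no model fits the budget")
```

Returning `None` instead of silently falling back keeps the over-budget case explicit, which also gives you a natural point to record per-feature attribution.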

Comments

verdverm•2h ago
I stay relatively human-in-the-loop to keep costs down and quality up. My custom agent setup also has some "weird tricks" to keep context size down. Starting with a cheaper model like gemini-3-flash has good ROI too.
