Ask HN: How are people forecasting AI API costs for agent workflows?

4•Barathkanna•5h ago
I’ve been experimenting with agent-based features and one thing that surprised me is how hard it is to estimate API costs.

A single user action can trigger anywhere from a few to dozens of LLM calls (tool use, retries, reasoning steps), and with token-based pricing the cost can vary a lot.
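To make that spread concrete, here is a rough sketch of the arithmetic. The per-token prices, the token counts, and the 3-to-36 call range are made-up assumptions for illustration, not any provider's real rates, and `Pricing`, `call_cost`, and `action_cost_range` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Pricing:
    # Hypothetical prices in USD per 1M tokens; substitute your provider's rates.
    input_per_mtok: float
    output_per_mtok: float


def call_cost(pricing: Pricing, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one LLM call under token-based pricing."""
    return (
        input_tokens * pricing.input_per_mtok
        + output_tokens * pricing.output_per_mtok
    ) / 1_000_000


def action_cost_range(pricing, min_calls, max_calls, input_tokens, output_tokens):
    """Best-case and worst-case cost of one user action, assuming every
    call uses roughly the same number of tokens (a simplification)."""
    per_call = call_cost(pricing, input_tokens, output_tokens)
    return min_calls * per_call, max_calls * per_call


# Assumed numbers: 2,000 input and 500 output tokens per call,
# and between 3 and 36 calls per user action.
pricing = Pricing(input_per_mtok=3.0, output_per_mtok=15.0)
low, high = action_cost_range(pricing, 3, 36, 2_000, 500)
print(f"${low:.4f} to ${high:.4f} per action")
```

With these assumed numbers the worst case costs 12 times the best case for the same user action, which is the kind of variance that makes flat pricing hard to set.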

How are builders here planning for this when pricing their SaaS?

Are you just padding margins, limiting usage, or building internal cost tracking? Also curious, would a service that offers predictable pricing for AI APIs (like a fixed subscription cost) actually be useful for people building agentic workflows?

Comments

clearloop•5h ago
imo switching to local models could be an option
Barathkanna•4h ago
Local models solve the marginal cost problem, but they move the complexity into infrastructure and throughput planning instead.
clearloop•2h ago
Makes sense, it really depends on the use case. I'm building my own version of claw, openwalrus, with a local-LLMs-first goal. I think I'll use local models for daily tasks that depend heavily on tool calling, but for coding or research I'll keep using remote models.

And this topic actually inspired me to add a built-in gas meter for tokens.
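One way such a gas meter could look, as a minimal sketch only. This is not openwalrus code; `TokenGasMeter` and `OutOfGas` are hypothetical names, and the 10,000-token budget is an arbitrary assumption.

```python
class OutOfGas(Exception):
    """Raised when a charge would push usage past the token budget."""


class TokenGasMeter:
    """Tracks tokens spent against a fixed budget, like gas in a VM."""

    def __init__(self, limit: int):
        self.limit = limit
        self.used = 0

    @property
    def remaining(self) -> int:
        return self.limit - self.used

    def charge(self, tokens: int) -> None:
        """Deduct tokens, refusing the charge if it would exceed the budget."""
        if tokens < 0:
            raise ValueError("tokens must be non-negative")
        if self.used + tokens > self.limit:
            raise OutOfGas(f"need {tokens} tokens, only {self.remaining} left")
        self.used += tokens


# Assumed budget of 10,000 tokens for one task.
meter = TokenGasMeter(limit=10_000)
meter.charge(4_000)  # first LLM call
meter.charge(5_000)  # second LLM call
try:
    meter.charge(2_000)  # would exceed the budget, so the task halts here
    halted = False
except OutOfGas:
    halted = True
```

Charging before each call, using an estimate of the tokens it will consume, keeps the task from overshooting the budget. Charging afterwards with the actual usage is simpler but lets the last call run over.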

Lazy_Player82•4h ago
Honestly, if you're designing your agent workflows properly with hard limits on retries and tool calls, the variance shouldn't be that wild. Most of the unpredictability comes from not having those guardrails in place early on. A few weeks of real production data usually shows the average cost is more stable than you'd expect.
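A minimal sketch of what those guardrails might look like in an agent loop. The `step` callable is a hypothetical stand-in for one LLM or tool call, and the limits of 8 tool calls and 2 retries are arbitrary assumptions.

```python
class LimitExceeded(Exception):
    """Raised when the agent hits a hard cap instead of looping on."""


def run_agent(step, max_tool_calls=8, max_retries=2):
    """Run `step` until it returns a result, under hard caps.

    `step(i)` is a hypothetical callable: it returns a result when the
    task is done, returns None to ask for another tool call, and raises
    RuntimeError on a transient failure.
    """
    tool_calls = 0
    retries = 0
    while True:
        if tool_calls >= max_tool_calls:
            raise LimitExceeded("tool-call limit reached")
        try:
            result = step(tool_calls)
        except RuntimeError:
            retries += 1
            if retries > max_retries:
                raise LimitExceeded("retry limit reached")
            continue  # retry the same step
        tool_calls += 1
        if result is not None:
            return result, tool_calls, retries


# A toy step that finishes on its third call.
outcome = run_agent(lambda i: "done" if i == 2 else None)
```

With both caps in place, a single user action can never make more than `max_tool_calls + max_retries + 1` attempts, so the worst-case cost per action is bounded and can be priced in.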
Barathkanna•4h ago
True, but for early stage builders it’s harder to design those guardrails upfront. A lot of the time you only discover the retry patterns and cost spikes once real users start hitting the system.
Lazy_Player82•4h ago
Fair point. And honestly, with more non-technical builders shipping agent-based products these days, that's probably where a service like this makes the most sense – for people who don't yet have the experience to know what guardrails to put in place.
Barathkanna•4h ago
Exactly. That’s actually why we started building Oxlo.ai. Early stage builders usually just want to experiment without worrying too much about token cost spikes.
sriramgonella•4h ago
Local models are better for controlling costs, whereas commercial models are very expensive and give you no control over the cost. However, the local model training setup needs to be architected very well to train continuously.
thiago_fm•1h ago
That isn't true. If you run local models, you'll also have to spend on operations.

Maybe focus first on providing value and later you can optimize this setup.

thiago_fm•1h ago
Just add very high hard limits, plus instrumentation so you can track costs and re-evaluate the limits accordingly.

This takes a couple of hours at most.
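Roughly what that could amount to, as a sketch under assumptions: `CostTracker` is a hypothetical name, the 5 USD per-user cap is arbitrary, and a real setup would persist the numbers rather than keep them in memory.

```python
import statistics
from collections import defaultdict


class CostTracker:
    """Records per-user spend and enforces a deliberately high hard cap."""

    def __init__(self, hard_limit_usd: float):
        self.hard_limit_usd = hard_limit_usd
        self.spend = defaultdict(float)  # user_id -> USD spent this period

    def record(self, user_id: str, cost_usd: float) -> None:
        """Add the cost of one completed LLM call to the user's total."""
        self.spend[user_id] += cost_usd

    def allowed(self, user_id: str) -> bool:
        """True while the user is still under the hard cap."""
        return self.spend.get(user_id, 0.0) < self.hard_limit_usd

    def summary(self) -> dict:
        """Aggregate numbers to review when re-evaluating the cap."""
        values = list(self.spend.values())
        if not values:
            return {"users": 0, "mean": 0.0, "max": 0.0}
        return {
            "users": len(values),
            "mean": statistics.mean(values),
            "max": max(values),
        }


tracker = CostTracker(hard_limit_usd=5.0)  # assumed cap per user
tracker.record("alice", 0.25)
tracker.record("alice", 0.5)
tracker.record("bob", 6.0)
report = tracker.summary()
```

Here `tracker.allowed("bob")` is false because his spend passed the cap, while `alice` can keep going. The `summary()` output is what you would look at after a few weeks to decide whether the cap should move.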