I've heard from sources I trust that both AWS and Google Gemini charge more for inference than it costs them in energy to run it.
You can get a good estimate of the truth here by considering open-weight models. It's possible to determine exactly how much energy it costs to serve DeepSeek V3.2 Exp, since that model is open weight. So run that calculation, then look at how much providers are charging to serve it and see whether they are likely operating at a loss.
Here are some prices for that particular model: https://openrouter.ai/deepseek/deepseek-v3.2-exp/providers
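For a sense of scale, here's a rough back-of-envelope sketch in Python. Every number in it (GPU power draw, PUE, electricity price, aggregate throughput) is an illustrative assumption, not a measured figure for DeepSeek V3.2 Exp or any particular provider; plug in real serving numbers to do the comparison properly.

    # Back-of-envelope energy cost per million output tokens.
    # All numbers below are illustrative assumptions, not measurements.

    gpu_power_kw = 8 * 0.7           # assumed node: 8 GPUs at ~700 W each
    pue = 1.3                        # assumed datacenter overhead (cooling etc.)
    electricity_usd_per_kwh = 0.08   # assumed industrial electricity price
    tokens_per_second = 3000         # assumed aggregate node throughput

    node_kw = gpu_power_kw * pue
    usd_per_hour = node_kw * electricity_usd_per_kwh
    tokens_per_hour = tokens_per_second * 3600
    usd_per_million_tokens = usd_per_hour / tokens_per_hour * 1_000_000

    print(f"~${usd_per_million_tokens:.3f} per million tokens, energy only")
    # Compare with the per-million-token prices on the OpenRouter page above;
    # this excludes hardware amortization, networking, and margin.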
Or: what are they bleeding money on?
That doesn't mean that when they do charge for the models - especially via their APIs - they are serving them at a unit-cost loss.
I would presume that companies selling compute for AI inference either make some money or at least break even when they serve a request. But I wouldn't be surprised if they are subsidizing this cost for the time being.
[1]: https://finance.yahoo.com/news/sam-altman-says-losing-money-...
https://twitter.com/sama/status/1876104315296968813
"insane thing: we are currently losing money on openai pro subscriptions!
people use it much more than we expected"
I don't doubt that they lose money on the $200 subscription, because the people who pay $200 are probably the same people who will max out usage over time, no matter how wasteful. Sam Altman framed it as "it's so useful people are using it more than we expected!" because he wants everyone to believe LLMs are the future. It's all bullshit.
If I had to guess, they probably at least break even on API calls, and might make some money on lower-tier subscriptions (i.e. people who pay for it but use it sparingly, on an as-needed basis).
But that is boring, and hints at limited usability. Investors won't want to burn hundreds of billions in cash for something that may be sort of useful. They want destructive amounts of money in return.
Bad idea, bad execution, I like it when a plan comes together.
anupsingh123•3h ago
Building justcopy.ai - lets you clone, customize and ship any website. Built 7 AI agents to handle the dev workflow automatically.
Kicked them off to test something. Went to grab coffee.
Came back to a $100 spike on my OpenRouter bill. First thought: "holy shit we have users!"
We did not have users.
Added logging. The agent was still running. Making calls. Spending money. Just... going. Completely autonomous in the worst possible way. Final damage: $200.
The fix was embarrassingly simple:
- Check for interrupts before every API call
- Add hard budget limits per session
- Set timeouts on literally everything
- Log everything so you're not flying blind
Basically: autonomous ≠ unsupervised. These things will happily burn your money until you tell them to stop.
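To make that concrete, here's a minimal Python sketch of those four guardrails wrapped around an agent loop. The names (call_model, interrupted, the dollar and time limits, the "DONE" completion signal) are hypothetical stand-ins, not the actual justcopy.ai agent code.

    import time

    MAX_SPEND_USD = 5.00        # hard budget per session (assumed limit)
    MAX_WALL_SECONDS = 600      # timeout on the whole session
    PER_CALL_TIMEOUT = 60       # timeout on each individual API call

    class BudgetExceeded(Exception):
        pass

    def run_agent(task, call_model, interrupted):
        """call_model(prompt, timeout) -> (text, cost_usd); interrupted() -> bool."""
        spent = 0.0
        started = time.monotonic()
        log = []
        prompt = task
        while True:
            # Check for interrupts before every API call.
            if interrupted():
                break
            # Hard budget limit per session.
            if spent >= MAX_SPEND_USD:
                raise BudgetExceeded(f"spent ${spent:.2f} of ${MAX_SPEND_USD:.2f}")
            # Wall-clock timeout on the whole session.
            if time.monotonic() - started > MAX_WALL_SECONDS:
                break
            # Each call gets its own timeout.
            text, cost = call_model(prompt, timeout=PER_CALL_TIMEOUT)
            spent += cost
            # Log everything so you're not flying blind.
            log.append({"cost": cost, "total": spent, "output": text[:200]})
            if "DONE" in text:   # assumed completion signal from the agent
                break
            prompt = text
        return log

    # Usage with a stubbed model call:
    # logs = run_agent("clone the site", call_model=my_openrouter_call,
    #                  interrupted=lambda: stop_requested.is_set())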
Has this happened to anyone else? What safety mechanisms are you using?
anupsingh123•2h ago
I started this project out of frustration. When I tried to clone other projects using Claude Code and customize them a bit—simple Next.js, ECS, CDK, and Express server setups—it took several hours just to get everything working. I realized that while vibe coding is great, it's still time-consuming to build a production-ready, functioning product.
W3schoolz•2h ago
That said, I think a budget limit of $5-10k per agent makes sense IMO. You're underpaying your agents and won't get principal engineer quality at those rates.
magicalhippo•2h ago
Agents doing nothing, just doing things for the sake of doing things.
Seems we're there.