AI API Prices are 90% Subsidized

https://tinyml.substack.com/p/the-unsustainable-economics-of-llm
20•csoham•3h ago

Comments

PaulHoule•3h ago
When the AI hype train left the station I said "we don't understand how these things work at all and they're going to get much cheaper to run" and that turned out to be... true.

Vendors of legacy models like ChatGPT-4 already have to subsidize inference to keep up with new entrants built on a better foundation. Inference costs can likely be brought down by another factor of ten or so, so of course you have to subsidize these by 90% to match where the industry will be in 2-3 years.
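
The link between "a factor of ten" and "90%" in the comment above can be checked with a small calculation. This is a minimal sketch, assuming that "subsidy" means the share of today's cost not covered by a price set equal to the future cost; the function name `implied_subsidy` is hypothetical.

```python
def implied_subsidy(cost_reduction_factor: float) -> float:
    """Share of today's cost left uncovered when pricing at the future cost.

    Assumption: the future cost is today's cost divided by
    cost_reduction_factor, and the price is set equal to that future cost.
    """
    if cost_reduction_factor < 1:
        raise ValueError("cost_reduction_factor must be at least 1")
    return 1.0 - 1.0 / cost_reduction_factor


if __name__ == "__main__":
    for factor in (2, 5, 10):
        print(f"{factor}x cheaper -> {implied_subsidy(factor):.0%} subsidy today")
```

Under that assumption, a tenfold cost reduction corresponds to covering only a tenth of today's cost, which is the 90% figure.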

revskill•3h ago
No lol. The quality is mostly bad. Basically you need to prompt in detail, like writing a novel, for the LLM to understand. At that price, we want real AI that actually has common sense, not just an autocompletion tool.

Stop advertising LLMs as AI; sell them as a superior copy & paste engine instead.

What's worst about LLMs is that the more you talk with one, the worse it gets, to the point of breaking.

mrtksn•3h ago
Subsidized is probably not the correct word here; it's more like a loss leader in a land-grab race.

It's like the early days of the internet when everything was amazing and all the people who put money into this thing were "losing" their money.

It's going to be like this until monopolization sets in and moats become defensible; then they will enshittify the crap out of it and make their money back 10x, 100x, etc.

apsec112•3h ago
This ignores batching (token generation is much more efficient in batches), and I strongly suspect it was itself written by AI, given the heavy use of bullets.
biophysboy•2h ago
is it common for adjacent tokens to use the same weights in a memory cache?
twoodfin•45m ago
The “X—not Y” pattern is also a dead giveaway.
GaggiX•2h ago
This calculation doesn't account for batching, so it makes no sense.
BriggyDwiggs42•2h ago
On average how much does batching bring costs down?
GaggiX•1h ago
Batching balances the compute and memory-bandwidth bottlenecks, so by a lot: with continuous batching you can easily see 10x, 20x, or more.
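
The batching argument in this subthread can be sketched with a simple roofline model. This is an illustration under stated assumptions, not a benchmark: each decode step is taken to cost the larger of the time to stream the weights from memory once and the time to do about 2 FLOPs per parameter per token in the batch. KV cache traffic, which grows with batch size and context length, is ignored, and the hardware figures in `HW` are made-up round numbers.

```python
def decode_step_seconds(batch_size, params, bytes_per_param, mem_bandwidth, peak_flops):
    """Estimate the time of one decode step with a simple roofline model."""
    # Weights are streamed from memory once per step, regardless of batch size.
    memory_time = params * bytes_per_param / mem_bandwidth
    # Each generated token costs roughly 2 FLOPs per parameter.
    compute_time = 2 * params * batch_size / peak_flops
    # The step is limited by whichever resource takes longer.
    return max(memory_time, compute_time)


def batching_speedup(batch_size, **hw):
    """Cost per token at batch size 1 divided by cost per token at batch_size."""
    single = decode_step_seconds(1, **hw)
    batched_per_token = decode_step_seconds(batch_size, **hw) / batch_size
    return single / batched_per_token


# Illustrative values only (assumptions, not measurements of any real device).
HW = dict(params=70e9, bytes_per_param=2, mem_bandwidth=2e12, peak_flops=4e14)

if __name__ == "__main__":
    for b in (1, 16, 64, 1000):
        print(b, round(batching_speedup(b, **HW), 1))
```

In this model the per-token cost falls almost linearly with batch size until compute becomes the bottleneck (at a batch of 200 with these numbers), which is consistent with the 10x to 20x range quoted above for moderate batch sizes.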
impure•50m ago
I’ve been playing around with Gemma E4B and have gotten really good results. That’s a model you can run on a phone. So although prices have been going up recently I suspect they will start to fall again soon.
python273•37m ago
A much better article on token prices: https://www.tensoreconomics.com/p/llm-inference-economics-fr...

There's not much incentive to subsidize prices for OpenRouter providers for example, and the prices are much lower than the $6.37/M estimate from the article.

https://openrouter.ai/meta-llama/llama-3.3-70b-instruct

avg $0.37/M input tokens, $0.73/M output tokens (21 providers)

Llama is not even a good example, as recent models are more optimized, using Mixture of Experts and KV cache compression.
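
To put the figures quoted above side by side, here is a small sketch. It uses only the numbers from this comment; comparing the article's $6.37/M estimate against the average output price is an assumption, since the comment does not say which token type the estimate refers to. The helper names are hypothetical.

```python
# Prices in US dollars per million tokens, as quoted in the comment above.
ARTICLE_ESTIMATE_PER_M = 6.37
OPENROUTER_INPUT_PER_M = 0.37
OPENROUTER_OUTPUT_PER_M = 0.73


def request_cost(input_tokens, output_tokens,
                 input_per_m=OPENROUTER_INPUT_PER_M,
                 output_per_m=OPENROUTER_OUTPUT_PER_M):
    """Dollar cost of one request at the given per-million-token prices."""
    return (input_tokens * input_per_m + output_tokens * output_per_m) / 1_000_000


def estimate_to_listed_ratio():
    """How many times higher the article's estimate is than the listed output price."""
    return ARTICLE_ESTIMATE_PER_M / OPENROUTER_OUTPUT_PER_M


if __name__ == "__main__":
    print(f"estimate / listed output price: {estimate_to_listed_ratio():.1f}x")
    print(f"1M input + 1M output tokens: ${request_cost(1_000_000, 1_000_000):.2f}")
```

On these numbers the article's estimate is roughly 8.7 times the average listed output price.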

A Dictionary of the Language of Myst's D'ni

http://www.eldalamberon.com/dni_dict.htm
1•lelandfe•5m ago•0 comments

You are what you launch: how software became a lifestyle brand

https://omeru.bearblog.dev/lifestyle/
1•sebst•8m ago•0 comments

Habit tracking for neurodivergent minds – Habitualy

https://www.habitualy.app/
1•abdullahss•10m ago•0 comments

The Implementation of This Site

https://vale.rocks/posts/the-implementation-of-this-site
1•OuterVale•11m ago•0 comments

AI can't cross this line and we don't know why

https://www.youtube.com/watch?v=5eqRuVp65eY
2•domofutu•11m ago•0 comments

But AI has no "I": A phenomenology of intelligence without self

https://d1wqtxts1xzle7.cloudfront.net/123187462/But_AI_has_no_I_A_phenomenology_of_intelligence_without_self_5.27-libre.pdf?1749430868=&response-content-disposition=inline%3B+filename%3DBut_AI_has_no_I_A_phenomenology_of_intel.pdf&Expires=1750639709&Signature=U5twm0dQbd0JKps5RezarQ339EkfaoOZIF2GYIDONMo6w8EdkvVEZd~UHdl426OVJK9TUChsdVCUY9f1cNzKyxR3O9nZqSL3OeO7tlF9zTEqg24K4Gdzv5aBGYJw9e2kVMeZijgo9CU4IFOGEJeCBy7-dQP6zy7yHMbnYWsKXu0WoJFI64ompe7TwObxFpZWFsGnNsZj6aiqb4SAL3brMCD46KZlIwCLMr-4Vc9izpGyuo1AXY3v3PCJPRctOnRp14iFBRWub7jXKIUrWxXmByn0650VbVF-HvVwuuC4frVk2cHJQ51NGDfffyR6awd-9wAwSTd8~slN0Fec5hnzfQ__&Key-Pair-Id=APKAJLOHF5GGSLRBV4ZA
1•domofutu•12m ago•0 comments

The Future of Personalized Medicine Is Here: KJ's Story

https://www.chop.edu/centers-programs/genetherapy4inheritedmetabolicdisorders/future-personalized-medicine-here-kjs
1•monista•16m ago•0 comments

Claude-Code-SDK-Ts

https://github.com/instantlyeasy/claude-code-sdk-ts
2•handfuloflight•20m ago•0 comments

WWDC25: What's New in Passkeys [video]

https://www.youtube.com/watch?v=mV68bUYVSL0
2•DuckConference•20m ago•0 comments

Dev jobs are about to get a hard reset and nobody's ready

https://old.reddit.com/r/ClaudeAI/comments/1lhgdbd/dev_jobs_are_about_to_get_a_hard_reset_and/
4•ubj•22m ago•0 comments

Former Tesla owner: Here's what went wrong with my Tesla

https://old.reddit.com/r/AskReddit/comments/1cx5l0k/comment/l5127f6/
1•behnamoh•25m ago•0 comments

Gemini 2.5 threatening to kill itself after failing to debug your code

https://i.imgur.com/uwA7wxv.png
2•jay_kyburz•25m ago•0 comments

Apple defeats AliveCor bid to block US smartwatch imports in US appeal

https://www.reuters.com/legal/litigation/apple-defeats-alivecor-bid-block-us-smartwatch-imports-us-appeal-2025-03-07/
1•teleforce•38m ago•0 comments

Carts of Darkness (2008) [video]

https://www.youtube.com/watch?v=zi-f_J6hV-g
1•toomuchtodo•40m ago•1 comments

The Downside of Diversity

https://www.nytimes.com/2007/08/05/world/americas/05iht-diversity.1.6986248.html
1•mhb•40m ago•0 comments

All the Little Data

https://www.newcartographies.com/p/all-the-little-data
2•wigwamnh•43m ago•0 comments

Finding a billion factorials in 60 ms with SIMD

https://codeforces.com/blog/entry/143279
3•todsacerdoti•49m ago•0 comments

The Disney Bomb

https://en.wikipedia.org/wiki/Disney_bomb
1•beatthatflight•52m ago•1 comments

Jony Ive Deal Removed from OpenAI Site over Trademark Suit

https://www.bloomberg.com/news/articles/2025-06-22/jony-ive-deal-removed-from-openai-site-over-trademark-suit
3•thenicepostr•55m ago•1 comments

AI Tools for Hedge Funds

https://alexizydorczyk.com/ai-for-hedge-funds.html
1•izyda•58m ago•0 comments

Show HN: Lego Island Playable in the Browser

https://isle.pizza
3•foxtacles•59m ago•0 comments

Why Doctors Hate Their Computers (2018)

https://web.archive.org/web/20250104014248/https://www.newyorker.com/magazine/2018/11/12/why-doctors-hate-their-computers
1•PaulHoule•1h ago•0 comments

Steven Pinker's five-point plan to save Harvard from itself (2023)

https://www.bostonglobe.com/2023/12/11/opinion/steven-pinker-how-to-save-universities-harvard-claudine-gay/
1•mpweiher•1h ago•0 comments

Yah This Is C#

https://twitter.com/davidfowl/status/1936914514307657830
4•gokhan•1h ago•0 comments

Scientists find three years left of remaining carbon budget for 1.5°C

https://www.leeds.ac.uk/research-32/news/article/5801/scientists-find-three-years-left-of-remaining-carbon-budget-for-1-5-c
2•geox•1h ago•0 comments

Hawaii Highways

http://www.hawaiihighways.com/
2•yakattak•1h ago•0 comments

Tell HN: Knowledge is dead. Insight is currency in the age of AI

2•INKidea•1h ago•3 comments

Neil Sloane's favourite integer sequences

https://www.theguardian.com/science/alexs-adventures-in-numberland/2014/oct/07/neil-sloane-the-man-who-loved-only-integer-sequences
1•qifzer•1h ago•1 comments

Wave of syringe attacks mar France's street music festival

https://www.france24.com/en/live-news/20250622-wave-of-syringe-attacks-mar-france-s-street-music-festival
6•pizza•1h ago•2 comments

Vintage Supermarket Photos

https://theimaginaryworld.com/groceryA1.html
1•gaws•1h ago•0 comments