The current AI pricing was always going to go away

https://arnon.dk/the-current-ai-pricing-was-always-going-to-go-away/
33•arnon•4h ago

Comments

dtagames•1h ago
Some of these coming price increases will move dev work back to dedicated shops and teams when individuals and non-devs won't want to pay the AI bill to finish and ship their projects.

An outside small dev shop or internal dev team can pay these prices and spread the cost over several customers or departments, but the era of giving everyone AI and telling them to dev stuff is about to be over.

throwa356262•1h ago
This is only true if your world is limited to OpenAI, Anthropic and the like.

There are a whole bunch of companies elsewhere in the world that are getting better and cheaper every month, hardware side included. All without the infinite VC money.

fallpeak•40m ago
This is slightly more tasteful slop than average (I'm thinking probably Claude rather than ChatGPT?), but it's still 100% AI written: https://www.pangram.com/history/c55ab69b-e0a9-49a0-8056-2fcd...
0x3f•22m ago
This... is not a reliable AI detection method at all.
extr•21m ago
Pangram is highly reliable.
fallpeak•19m ago
You are incorrect. There, now we've both made unsupported assertions. Care to provide any evidence for your position?

For what it's worth, when I provide a Pangram link it's because I can already tell something is AI and I'm attempting to provide objective third-party confirmation so the conversation doesn't just degrade into me asserting that I have superior taste to you.

_fat_santa•40m ago
I wonder how much of Uber blowing their AI budget and MSFT pulling their Claude Code licenses can be attributed to "tokenmaxxing".

When Meta announced token leaderboards and others followed, I could see this being the logical conclusion. That whole trend is so dumb because it leads to this.

Company announces they will measure developer performance by how many tokens they burn and constantly talks about how the best developers burn the most tokens. Developers see the message and start burning tokens. And then the company acts surprised when their bills go through the roof.

I personally use my OpenAI subscription pretty heavily, 2-3 agents running practically all day on various tasks but I never even get close to running into limits while I hear about others blowing through limits on multiple accounts in the same time period. I'm convinced that most of those folks and their elaborate workflows aren't really for productivity but for bragging rights about how much they use AI.

cayleyh•17m ago
> I personally use my OpenAI subscription pretty heavily, 2-3 agents running practically all day on various tasks but I never even get close to running into limits

Same. But if I was working for an organization that measured token usage, you can bet I would be doing things like creating a cron job that uses claude to create a customized bespoke report update of the current status of all my open assigned tickets and message that to myself 4 times a day... token burn for zero purpose whatsoever.

bdcravens•11m ago
The same here, where I haven't come close to hitting any of my CC limits. Even though I'm more productive than I've ever been (as measured by finished, valuable tasks running in production) and I'm clearing out months of backlog, I reach one of two conclusions when I hear about others who say they need more:

1. I'm doing it wrong. Apparently I'm supposed to give it a vague paragraph about what the business does, and I can run off and sip margaritas and wake up to a fully fleshed business.

2. They don't know what they're doing, and they're sending the LLM off on a wild goose chase that it does a reasonable job of working its way out of, so they consider it a success despite the waste.

ai_fry_ur_brain•10m ago
I make like 2 prompts a week to Gemini Flash on the web and get more done than all the people who are exhibiting literal manic behavior in the way they use LLMs.
rirze•3m ago
> I'm convinced that most of those folks and their elaborate workflows aren't really for productivity but for bragging rights about how much they use AI.

This is quite the reductive, charged statement. Can I ask what subscription plan you're using?

My personal experience is nothing like this: I work on ever-expanding codebases, so I can easily burn tokens. Not to mention, structured agentic coding with adversarial reviews & task organization is not token-efficient. Additionally, for the problems I'm working on, only xhigh or high reasoning gives me worthwhile results while saving time. There are definitely configurations where default consumption doesn't work.

For reference, I used 15 billion tokens (most of it cached) last month on my day job's enterprise plan. That doesn't include my personal plans' usage.
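As a rough illustration of how a figure like this turns into a bill, here is a minimal sketch of the arithmetic. The split between cached and uncached tokens and both per-million-token prices are hypothetical placeholders chosen for the example, not any provider's actual rates and not numbers from this thread.

```python
def monthly_cost(uncached_tokens, cached_tokens, price_per_mtok, cached_price_per_mtok):
    """Return the bill in dollars for one month of token usage.

    Prices are quoted per million tokens, so divide by 1_000_000 at the end.
    """
    total = uncached_tokens * price_per_mtok + cached_tokens * cached_price_per_mtok
    return total / 1_000_000


# Hypothetical inputs: 15 billion tokens in total, of which 13.5 billion
# are cache hits billed at a discounted rate.
cost = monthly_cost(
    uncached_tokens=1_500_000_000,
    cached_tokens=13_500_000_000,
    price_per_mtok=3.00,          # assumed price, dollars per million tokens
    cached_price_per_mtok=0.30,   # assumed cached price, dollars per million tokens
)
print(f"${cost:,.2f}")
```

Under these assumed rates the cached tokens make up 90% of the volume but less than half of the bill, which is why the cached share matters so much when comparing usage numbers.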

pydry•3m ago
I really wish the management behind these dumb ideas couldn't just quietly pretend they never did it once it goes out of fashion.

The fact that somebody established a leaderboard for tokenmaxxing ought to follow you around like a black cloud for the rest of your career once the collective hallucination lifts and people realize just how monumentally stupid it was.

adamesque•37m ago
It's hard to take this piece seriously if he's citing _Ed Zitron's_ math, and equally hard to make the blanket statement that flat-rate plans = "the current AI pricing". But yes, those pricing models were pretty silly and unsustainable.
kimixa•7m ago
Get back to me when there's an AI company that's actually profitable and we can compare their service and pricing.

Claiming that there's some small subset of their services (like inference per token) that's "profitable" doesn't mean anything when it relies on everything else that company is still paying for. If you could make money from it at current prices - why aren't they?

Otherwise it's just "how much they're willing to subsidize".

PiRho3141•31m ago
This is where open source models are important.

The latest DeepSeek V4 Pro model is 2-5x cheaper than Claude Sonnet 4.6. Cursor's Compose 2.5, which was just recently released, is 6x cheaper than Sonnet.

The state of the art models are going to get better and more expensive and smaller models are going to get cheaper.

There will be a point where the intelligence of the cheap and the state-of-the-art models is indistinguishable to humans, just as I can't tell the difference between Terence Tao and my university math professor.

I don't always need the smartest and most expensive models. I will need them every once in a while and will gladly pay that price if I have to. What I do need is the model that will solve the current problem I have in a reasonable amount of time.

squidbeak•22m ago
Deepseek V4 Flash is far cheaper still, and a better model to compare to Sonnet 4.6. I'm finding it a reliable workhorse.
anonzzzies•8m ago
Yep, people who never used it say it is not good.
greenmilk•20m ago
> The state of the art models are going to get better and more expensive and smaller models are going to get cheaper.

Why do you think this will be true?

Right now I see the major US labs betting on gaining an advantage from having way more compute, and I see Chinese labs competing with one another in a resource-scarce environment, so they place much more emphasis on compute-efficiency.

But the supply chains that feed into the massive data center growth in the US are strained; there are energy, memory, and logistical bottlenecks to name a few.

In the medium-long run, compute capacity will not grow exponentially forever. Somehow it has for decades, but there can be no infinite exponential growth, and that point may be when the planet really starts to cook itself.

Maybe the US labs will become more compute-constrained, and then have to compete on efficiency.

Or maybe things change fundamentally in some other way I'm not thinking of.

nightski•13m ago
The labs have a perverse incentive to make things as expensive compute-wise as possible. The only thing keeping this somewhat in check is competition, but it's intentionally being gatekept by locking up the supply of computing infrastructure. With 3 players it's pretty easy to collude, even if indirectly. They can't burn trillions forever. Nvidia's 75% profit margins are not sustainable forever.

Things will normalize, but it will take time.

ai_fry_ur_brain•13m ago
I want open source models to fail for the most part, so LLM fanboys have no other option than to go back to whatever it was they were doing before (crypto??).
clhodapp•5m ago
[delayed]
sometimelurker•1m ago
sorry to nitpick (I totally agree with what ur saying btw, I run Ministral-3b on my hardware as my go-to bc I don't usually need the "smartest and most expensive models")

> This is where open source models are important

open-weights, the training data isn't public

Havoc•23m ago
Inference costs absolutely did fall. And even more so when looking at the intelligence it buys you.

E.g. compare, say, GPT-3.5 to the latest DeepSeek: both cheaper and more capable.

abtinf•21m ago
Insofar as I can tell, inference is on a certain path toward becoming "free". The models are now extremely powerful on high-end consumer hardware, and the efficiency trend seems likely to continue.

Here is a recent non-rigorous benchmark I ran against a bunch of models. Qwen3.6 35B A3B fine-tuned with Opus data runs plenty fast on my local machine and produces outstanding results - easily in the top 5, comparable to GPT 5.5 Pro (which is $180/mtok).

https://gistpreview.github.io/?31d66ef69e4aed3efae1aec69d86c...

I've predicted for years now that the industry will head down the path of the virus scanning vendors: selling subscriptions to be able to download the latest versions of models. I simply don't see how any other business model is remotely viable, except at the very highest end of inference or video gen.

anonzzzies•6m ago
That local hardware is not consumer, though, but prosumer. Consumer is a $500 laptop running that, and that is not currently the case.
extr•20m ago
What is the OP talking about? $/unit intelligence is going down rapidly. You can achieve what would have been considered miracles in 2022 with < $10.
bdcravens•8m ago
Absolutely, though I think the expectations are being set by those who have watched too many "OpenClaw business on autopilot" videos.
infecto•18m ago
Has this not been true for a long time now? Most companies have had enterprise/business-level pricing that was closely tied to usage for what feels like at least a year.
YetAnotherNick•14m ago
You are comparing two different models. It's like saying the Roadster is more expensive than the Model S. No model's pricing actually increased, and I am using GPT-4o at the same price as before.

You can see price vs. performance on Artificial Analysis, and the Pareto-optimal models are all just 6 months old.
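For readers unfamiliar with the term, "Pareto-optimal" here means a model that no other model beats on both price and performance at once. A minimal sketch of how such a frontier could be computed follows; the `Model` type, the model names, and all prices and scores are made up for illustration and are not taken from Artificial Analysis or any real benchmark.

```python
from typing import NamedTuple


class Model(NamedTuple):
    name: str
    price: float  # dollars per million tokens (lower is better)
    score: float  # benchmark score (higher is better)


def pareto_frontier(models):
    """Return the models that no other model beats on both price and score."""
    frontier = []
    best_score = float("-inf")
    # Walk from cheapest to most expensive; on a price tie, the higher score goes first.
    for m in sorted(models, key=lambda m: (m.price, -m.score)):
        # A model stays only if it scores strictly higher than every cheaper one.
        if m.score > best_score:
            frontier.append(m)
            best_score = m.score
    return frontier


# Entirely hypothetical data points.
models = [
    Model("a", 1.0, 50),
    Model("b", 2.0, 45),   # pricier and worse than "a": dominated
    Model("c", 3.0, 70),
    Model("d", 3.0, 65),   # same price as "c" but worse: dominated
    Model("e", 10.0, 70),  # same score as "c" but pricier: dominated
]
frontier_names = [m.name for m in pareto_frontier(models)]
print(frontier_names)
```

Sorting by price first keeps the scan to a single pass: once the models are in price order, only a running maximum of the score is needed to decide whether each one is dominated.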

plaidfuji•9m ago
kind of sobering to realize that whether your job can be profitably automated away comes down to what $/token some hyperscale AI provider can deliver… I suppose it’s nice that this article highlights some upward pressure on that number.
pacman1337•8m ago
I get similar results for deepseek and opus but opus is way faster. I guess deepseek streams thinking and makes it slower?
alligatorplum•3m ago
I seldom use my PC anymore since I got a laptop. With the cost per token increasing, along with the random "features" where models will just eat through your tokens in one hour, I have really been tempted to turn my PC into a server to run local models.
alfiedotwtf•2m ago
> Memory for 4x expensive

> Did we collectively forget second order thinking?

I bought 2x 16GB NVIDIA cards this week because I don’t see hardware getting cheaper anytime soon, and because of that I totally don’t see the point of “waiting until prices go lower for graphics cards”, because that might not happen for a long time yet!

In fact, if you factor in world events (and the ones that haven’t happened yet but eventually will, e.g. China’s long-planned 2027 taking of Taiwan), then there’s no way graphics card prices are going to be accessible to mere mortals until at least 2028.

But my real reasoning is that you’re going to see a flood of OpenAI and Anthropic users leave because of a) increasing pricing plans, and b) impending business laws on the horizon about protecting sovereign data from AI (i.e. data in the cloud for training is a no-no).

So what happens when people and companies one by one start leaving the SOTA AI cloud for from-good-enough-to-wow models? RAM and graphics cards become the new toilet paper, which is going to double current prices again.

Upgrade now before it’s too late folks!
