
LLM inference load balancer optimized for AMD Radeon VII GPUs

https://github.com/janit/viiwork
1•velmu•2m ago•0 comments

Show HN: I built a tool to show how much ARR you lose to FX fees

https://fixmyfx.com
1•TaniaBell_PD•8m ago•1 comment

3 New world class MAI models, available in Foundry

https://microsoft.ai/news/today-were-announcing-3-new-world-class-mai-models-available-in-foundry/
2•geox•8m ago•0 comments

Get alerts of stolen bikes in your area – Register your bike in case of theft

https://bikewatch.app
1•fullstacking•12m ago•1 comment

The Health and Healthcare Spending Effects of GLP-1s

https://www.nber.org/digest/202604/health-and-healthcare-spending-effects-glp-1s
1•neehao•13m ago•0 comments

Steam to Show Estimated FPS

https://www.tomshardware.com/video-games/pc-gaming/steam-starts-gathering-fps-data-with-latest-cl...
1•ortusdux•15m ago•0 comments

Gstack for Learning Chinese

https://github.com/geometer-jones/the-big-learn
1•geometerJones•15m ago•1 comment

KDE is getting support for the xx-fractional-scale-v2 Wayland protocol

https://www.neowin.net/news/kde-is-getting-support-for-the-xx-fractional-scale-v2-wayland-protocol/
1•bundie•15m ago•0 comments

Onepilot – Deploy AI coding agents to remote servers from your iPhone

https://onepilotapp.com
8•elmlabs•24m ago•4 comments

Tandem: An IDE for non-code docs for real-time collaboration with Claude Code

https://github.com/bloknayrb/tandem
2•bloknayrb•24m ago•1 comment

Show HN: A Dad Joke Website

https://joshkurz.net/
2•joshkurz•25m ago•0 comments

Everything I hate about the Mac

https://blog.d11r.eu/mac/
4•dominicq•25m ago•2 comments

No Agenda, No Meeting

https://noagendanomeeting.net
2•benbalter•26m ago•1 comment

Show HN: Real-time AI (audio/video in, voice out) on an M3 Pro with Gemma E2B

https://github.com/fikrikarim/parlor
2•karimf•26m ago•0 comments

Moon Mission Orbit Animations

https://sankara.net/astro/lunar-missions/mission.html?mission=artemis2
1•jaypatelani•26m ago•0 comments

New Programming Language – Codescript

https://github.com/GHisaque/Codescript/releases/tag/v1.0.0
1•IsaqueCrystal•27m ago•2 comments

Prysma: Anatomy of an LLVM Compiler Built from Scratch in 8 Weeks

https://old.reddit.com/r/Compilers/comments/1sccdmi/prysma_anatomy_of_an_llvm_compiler_built_from/
1•zyphorah•27m ago•1 comment

AI Is Rewiring World's Most Prolific Film Industry

https://www.reuters.com/technology/ai-is-rewiring-worlds-most-prolific-film-industry-2026-04-04/
1•rcarr•27m ago•0 comments

Callvent – I built an app that turns phone calls into calendar events

https://callvent.app/en/blog/building-callvent/
1•robertmittl•29m ago•0 comments

Ask HN: LLM-Based Spam Filter

1•michidk•37m ago•0 comments

Show HN: Built a model-agnostic, desktop-native, research studio for local files

https://old.reddit.com/r/LLMDevs/comments/1sbusn8/new_pdfviewer_notes_panel_search_downloader_tool/
1•ieuanking•38m ago•0 comments

Josefina Aguilar, Master Clay Artisan, Dies at 80

https://www.nytimes.com/es/2026/04/02/espanol/cultura/josefina-aguilar-artesana.html
1•paulpauper•43m ago•0 comments

The CA Minimum Wage Increase: Summing Up

https://marginalrevolution.com/marginalrevolution/2026/04/the-ca-minimum-wage-increase-summing-up...
2•paulpauper•43m ago•0 comments

What if everything still ran on vacuum tubes? [video]

https://www.youtube.com/watch?v=mEpnRM97ACQ
2•marklit•44m ago•1 comment

Smartphones, Online Music Streaming, and Traffic Fatalities

https://www.nber.org/papers/w34866
1•naves•46m ago•0 comments

Claude Code skill to preserve traditional Unix style conventions

https://github.com/agiacalone/unix-conventions
2•agiacalone•46m ago•1 comment

How Close Is Too Close? Applying Fluid Dynamics Research Methods to PC Cooling

https://www.lttlabs.com/articles/2026/04/04/how-close-is-too-close-applying-fundamental-fluid-dyn...
1•LabsLucas•47m ago•1 comment

DIY Air Drums

https://www.instructables.com/SpaceDrums-Play-Drums-in-the-Air/
2•nlarion•50m ago•0 comments

Marc Andreessen on why "this time is different" in AI

https://www.latent.space/p/pmarca
3•theorchid•51m ago•0 comments

Microsoft Hasn't Had a Coherent GUI Strategy Since Petzold

https://www.jsnover.com/blog/2026/03/13/microsoft-hasnt-had-a-coherent-gui-strategy-since-petzold/
8•naves•52m ago•1 comment

Codex is switching to API pricing based usage for all users

https://help.openai.com/en/articles/20001106-codex-rate-card
138•ccmcarey•2h ago

Comments

SilverElfin•2h ago
Does this mean there’s no such thing as a “subscription” to ChatGPT for businesses? I thought they offered businesses a subscription with some amount of built in quota previously, including for the side products like codex and sora.
afrisch•1h ago
There are still subscriptions that give access to both ChatGPT and Codex, but with a much smaller usage quota than before the change (which came at the same time as the end of the 2x promo). I couldn't find the equivalent in terms of credit for the usage included with these $20/25 seats...
m-hodges•1h ago
The days of subsidized access are rapidly coming to an end.
LtWorf•1h ago
Good!
thejazzman•1h ago
It’s kind of a rug pull to effectively raise the price like 10x. I can’t afford to finish some of my projects with this change
SecretDreams•1h ago
That is okay.

Ultimately, we need to know the true cost of this technology to evaluate how effectively or ineffectively it can displace the workforce that existed before it.

techgnosis•58m ago
Agreed, this has to happen and the sooner the better.
GaggiX•1h ago
There are plenty of good models on Openrouter that are very cheap, maybe it's time to experiment with alternatives.
sfmike•1h ago
what are some of them?
oompydoompy74•1h ago
Kimi K2
GaggiX•1h ago
MiniMax M2.7, MiMo-V2-Pro, GLM-5, GLM5-turbo, Kimi K2.5, DeepSeek V3.2, Step 3.5 Flash (this last one is particularly cheap while still being powerful).
subscribed•31m ago
Can't judge the quality of the comparison, but I'd start from https://arena.ai/leaderboard/code and maybe from OpenRouter's ranking.
JesseTG•1h ago
Is writing it by hand the old-fashioned way not on the table?
DecoySalamander•1h ago
Not really. Many scenarios where that would mean spending 50x the time or hiring a team.
thejazzman•1h ago
Absolutely not. I took on some things that would normally take 5-10 people and many months.

Some people turn out slop. I was really excited to try and make some impressive shit. My whole life has been dedicated to trying to embody what Apple preached in the early days.

I knew this was coming, but I thought I had a little more time to try and get them over the finish line, ya know?

Maintenance by hand might be achievable, but it's extremely hard when you've built something really big.

I've only got so much savings left to live on.

I'm not saying anyone owes me anything, but we all need to pivot, and I'm a lot less sure my pivot is going to work out now.

SlinkyOnStairs•51m ago
> I took on some things that would normally take 5-10 people and many months.

Based on what, exactly?

It's very easy to claim some software would've taken you months to make, but this is ridiculous. Estimating project duration is well known to be impossible in this field. A few years ago you'd get laughed out of the room for making such predictions.

> I’ve only got so much savings left to live on.

Respectfully, what are you doing here?

Yeah sure, the Apple dream. But supposing AI did in fact make you this legendary 100x developer, it would do the same for everyone else, including those with significantly more resources. You'd still be run out of the market by those with bigger budgets or more marketing, and end up penniless all the same.

I would strongly recommend you not put all your proverbial eggs in this basket.

dmd•1h ago
It's really not. As a one-person IT department I'm now able to build things in hours or days that previously would have taken me weeks or even months to build (and thus they didn't get done). Things people have wanted for years that I never had the time for, I can now say "yes" to.
bornfreddy•52m ago
Then I would say they judged the situation correctly when they decided to raise prices.

That said: competition will soon kick in.

sdevonoes•39m ago
Ridiculous
le-mark•9m ago
What am I an assembler programmer now?!? Am I to plug wires and flip switches!?!

/s

nearbuy•49m ago
If my math is right, assuming a mix of around 70% cached tokens, 20% input tokens, and 10% output tokens, it breaks even with the old pricing at around 130k tokens per message, or about 13k output tokens per message.

With the hidden reasoning tokens and tool calls, I have no idea how many tokens I typically use per message. I would guess maybe a quarter of that, which would make the new pricing cheaper.
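
For concreteness, that back-of-envelope math can be sketched as below. The per-token rates are hypothetical placeholders (the actual rate card is at the link above); only the 70/20/10 mix comes from the comment.

```python
# Hypothetical credits per 1M tokens, by rate-card category.
RATE_CACHED = 0.125
RATE_INPUT = 1.25
RATE_OUTPUT = 10.0

# Assumed token mix from the comment: 70% cached, 20% fresh input, 10% output.
MIX = {"cached": 0.70, "input": 0.20, "output": 0.10}

def cost_per_message(total_tokens: int) -> float:
    """Credits consumed by one message under the assumed token mix."""
    return (
        total_tokens * MIX["cached"] * RATE_CACHED / 1_000_000
        + total_tokens * MIX["input"] * RATE_INPUT / 1_000_000
        + total_tokens * MIX["output"] * RATE_OUTPUT / 1_000_000
    )

def break_even_tokens(flat_credits_per_message: float) -> int:
    """Message size at which token billing equals a flat per-message charge."""
    blended_per_token = cost_per_message(1_000_000) / 1_000_000
    return round(flat_credits_per_message / blended_per_token)
```

With real rates plugged in, `break_even_tokens` gives the per-message token count above which the new pricing costs more than the old flat estimate.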

SoftTalker•43m ago
Sounds like saying my plan to get rich buying up $10 bills for $1 hit kind of a rug pull in that people aren't selling them for that price anymore.
bloppe•43m ago
I don't think you can call it a rug pull when everybody saw it coming from miles away
nojito•1h ago
So many folks are burning tokens just to burn them.

The infrastructure build out just can't keep up with it.

Bombthecat•1h ago
Management demands it
UltraSane•1h ago
subsidies always lead to waste.
subscribed•42m ago
This is false.

Two examples:

- https://www.msn.com/en-us/money/other/three-years-after-tria...

- https://record.umich.edu/articles/public-school-investment-r...

_fizz_buzz_•48m ago
Although I have to say I am sometimes surprised how much people burn through their usage. I was briefly on a Claude Max plan and then switched to a pro plan and still almost never hit my limit.
butterlettuce•46m ago
It’s Joever.
Rastonbury•1h ago
So Anthropic bundled CC with Claude.ai because OAI bundled ChatGPT with Codex, and now OAI is unbundling; the IPO must be around the corner. Writing is also on the wall for CC usage based subscriptions now that main competitor effectively got rid of it. How are the Chinese models looking?
matheusmoreira•18m ago
> Writing is also on the wall for CC usage based subscriptions now that main competitor effectively got rid of it.

And I just subscribed for a year's worth of Claude... Terrible timing I guess. Do you know if the open models are viable?

__mharrison__•1h ago
For the past month, I've been claiming that $20/mo codex is the best deal in AI.

Now I'm going to have to find the new best deal.

verdverm•1h ago
We are exiting a hype cycle, well into the adoption curve. Subscriptions were never going to last.

My next step is going to be evaluating open and local models to see if they are sufficiently close to par with frontier models.

My hope is that the end of seat-based pricing comes with this tech cycle. I was looking for a document signing provider that doesn't charge a monthly fee; I only need a few docs a year.

__mharrison__•51m ago
I recently experimented with creating a Python library from scratch with Codex. After I was done, I took the PRD and task list that was generated and fed them to opencode with Qwen 3.5 running locally.

Opencode was able to create the library as well. It just took about 2x longer.

selectodude•47m ago
Which version of Qwen 3.5 did you use?
verdverm•46m ago
which quant as well
alifeinbinary•45m ago
I'm developing software in this area right now, so I try a lot of the new models. They're not even close for coding tasks. It basically comes down to 26b parameters vs 1T parameters, quantisation, and smaller context sizes; there's no comparison. However, for agentic work, tool calling, and text summarisation, local LLMs can be quite capable. Workloads that run as background tasks, where you're not concerned about TTFB, cold starts, tok/s etc., are where local AI is useful.

If you have an M processor then I would recommend that you ditch Ollama because it performs slowly. We get double or triple tok/s using omlx or vmlx, respectively, but vmlx doesn't have extensive support for some models like gpt-oss.

AstroBen•38m ago
Kimi K2.5 (as an example) is an open model with 1T params. I don't see a reason it has to be local for most use cases- the fact that it's open is what's important.
piyh•1h ago
Already paying for Google Photos storage, AI Pro for an extra $7 is a steal with Antigravity.
matt_heimer•57m ago
That's only good for the web based UI. If you want Gemini API access, which is what this article is about, then you must go the AI Studio route, and pricing is API usage based. It does have a free usage tier, and new signups can get $300 in free credits for the paid tier, so I think it's still a good deal, just not as good as using the subscriptions would be.
spijdar•34m ago
No? Isn't the article about Codex, which is roughly equivalent to "Gemini CLI" and Google's Antigravity? Google's subscriptions include quotas for both of those, albeit the $20 monthly "Pro" plan has had its "Pro" model quota slashed in the last few weeks. You still get a large number of "Gemini 3 Flash" queries, which has been good enough for the projects I've toyed with in Antigravity.
matt_heimer•18m ago
I guess that's true but I find Google's models better than their public tooling. The Pro subscription includes "Gemini Code Assist and Gemini CLI" but the Gemini Code Assist plugin for IntelliJ which is my daily driver is broken most of the time to the degree that it's completely unusable. Sometimes you can't even type in the input box.

The only way I can do serious development with Gemini models is with other tooling (Cline, etc) that requires API based access which isn't available as part of the subscription.

operatingthetan•18m ago
Google is by far the best deal for AI, they give you so many 'buckets' of usage for a variety of products, and they seem to keep adding them.
kingstnap•7m ago
If you aggressively use all the buckets, Google is incredibly generous. In theory, for one AI Pro subscription you can get what is a ridiculous return on investment with a family plan.

You could probably be costing Google literally thousands if all 6 members were spamming video and image generation and Antigravity.

operatingthetan•2m ago
The family sharing is the real hack lol. I don't think any other provider does that.
purrcat259•56m ago
Good luck sticking within limits. I have been burning through my baseline limits insanely fast, within a few prompts; a marked change from a few weeks ago.

There are a few complaints online about the same happening to multiple users.

Otherwise Antigravity has been great.

cmrdporcupine•2m ago
I bought one of the google AI packages that came with a pile of drive storage and Gemini access.

Unfortunately gemini as a coding agent is a steaming useless pile. They have no right selling it, cheap open weight Chinese models are better at this point.

It's not stupid; it's just incompetent at tool use and makes bad mistakes. It constantly gets itself into weird dysfunctional loops when doing basic things like editing files.

I'm not sure what GOOG employees are using internally, but I hope they're not being saddled with Gemini 3.1. It's miles behind.

scosman•47m ago
Check out z.ai coder plan. The $27/mo plan is roughly the same usage as the 20x $200 Claude plan. I have both and Claude is a little better, but GLM 5.1 is much better value.
rustyhancock•35m ago
Agreed. I use Z.ai and the usage is fantastic; the only thing tempering that recommendation is that it's often unreliable. Perhaps a few times per week it's unresponsive, and maybe more often it seems to become flaky.

It's very variable, though; recently I'm noticing it's more reliable, but there was a patch where it was nearly unusable some days.

I guess I won't complain for the price and YMMV.

scosman•20m ago
Agreed. They had a rough patch around the 4.7 to 5 upgrade; the new architecture required a hardware migration. The 5 to 5.1 upgrade was much smoother (same architecture, new weights). As you say, a little rough around the edges, but still great value. A trick I learned is that it's a max of 2 parallel requests per user. You can put a billion tokens a month through it, but you need to manage your parallelism.
mickeyp•4m ago
If you're ok with a model provider that goes down all the time and has such a poor inference engine setup that once you get past 50k tokens you're going to get stuck in endless reasoning loops.
Skunkleton•1h ago
The title is misleading and not in the article. This change is for business/enterprise accounts. Also, these are still credit based. The change is that credits now operate on tokens like the API rather than on messages as they used to.
petcat•1h ago
> Customers on existing Plus, Pro and Enterprise/Edu plans should continue to use the legacy rate card. We’ll migrate you to the new rates in the upcoming weeks.
ccmcarey•1h ago
Nope, they buried the lede a bit, but this is coming for _all_ users, even Pro/Plus subscription plans. So you get ChatGPT Pro/Plus benefits, and then effectively $20/$200 in credits for Codex.
adamtaylor_13•1h ago
Sounds like a death knell to me.

If I recall correctly, Ed Zitron noted in a recent article that one of the horsemen of his AI-pocalypse would be price hikes from providers.

hn_throwaway_99•1h ago
Literally every VC funded consumer product has switched from a "growth at all costs" phase to a "Now we hike prices, make money, and generally enshittify" phase, and tons of those companies are still around (e.g. Uber), so I'm not sure why anyone thinks it would be much different for AI.
cyanydeez•1h ago
yes, but how many succeed without any kind of moat or having destroyed the existing companies?

I'm still running local LLMs and finding perfectly acceptable code gen.

cududa•1h ago
That guy has his own form of AI psychosis
supliminal•1h ago
Every time an Ed Zitron article is posted on HN, it is met with a torrent of vitriol and personal attacks. The articles are okay if not overly wordy but I don’t see how the subject matter elicits that strong of a response.

At any rate, this observation is not unique to Ed, lots of people have made the same conclusion that the math doesn’t add up from a business profitability perspective.

SlinkyOnStairs•24m ago
> The articles are okay if not overly wordy but I don’t see how the subject matter elicits that strong of a response.

Hot take, but really it's more of an observation than a take: We saw this exact response in Blockchain & crypto circles a few years ago. (Though HN wasn't quite as culturally "central" to those)

Economic bubbles are subject to the Tinkerbell Effect. They exist so long as people believe in them, and collapse when either 1) they become so financially unsustainable as to collapse, having consumed all the money the economy could possibly give them, or 2) people stop believing in the bubble and stop feeding it money.

In this regard, the statement "NFTs are stupid" was not merely ridiculing those who bought them, but a direct attack on the bubble and those invested in it. And this is something the people involved in the bubble understand instinctively, even if they aren't consciously aware of it. (There's a psychological mechanism to that, but it's not relevant)

So consequently, they react aggressively to dissent. They seek to enforce their narrative, because not doing so is a threat to the bubble and their financial interests.

---

AI's not much different to that. It's clearly a bubble to everyone including the AI execs saying it out loud.

And people react aggressively to dissent like Ed's, because if the wider public stops believing in AI's future, the bubble bursts. They'll stop tolerating datacenter construction, they'll sell their Nvidia shares, they'll demand regulators restrict AI.

(And to those who can feel their aggression rising reading this comment. Hi, yes. I see you. If I were wrong, nothing I said would matter. You'd be wasting your time engaging with it, history would simply prove me wrong. But by all means, type up that reply or click that button.)

PhilippGille•1h ago
Is this not just about extra credit? So what's included in the subscription doesn't change - just extra credits are now token based instead of message based? (For Plus/Pro)
raincole•59m ago
Yes.

> This format replaces average per-message estimates with a direct mapping between token usage and credits.

It's to replace the opaque, per-message calculation, not the subscription plan.

liuliu•49m ago
It does feel like it also impacts the usage meter for subscription plans?
raincole•46m ago
Usage meter has always been completely opaque anyway. They could (and probably did) shrink the limit whenever they like.
mrtesthah•34m ago
Ostensibly this makes usage meter rate changes more transparent?
liuliu•3m ago
It is a bit insidious that the price hike coincides with the end of the 2x promotion, which makes the usage change a bit more obscure.
nba456_•45m ago
God, every single title I read about AI on this site ends up being a straight-up lie.
sixtyj•39m ago
I miss “BREAKING NEWS” as it is used at X /s
camdenreslink•35m ago
I think this might also impact how usage is calculated for subscription plans, not just overages (using tokens instead of messages for calculating usage). But the message from OpenAI seems vague.
alkonaut•1h ago
Not only do I not keep up with the tech itself, I don’t even keep up with how to pay for it.
kvanbeek•1h ago
So migrate to gemini now?
matt_heimer•1h ago
If you use Google's tooling, yes; but not if you need API access. API access is not in the subscriptions and uses token based pricing. For development I find that the Gemini IDE plugins, which have good free usage and are included in the subscriptions, aren't great. The Gemini plug-in under IntelliJ is often broken, etc. The best experience is with other tools like Cline, where you have to use a developer account which is API usage based already.

But Gemini's API based usage also has a free tier, and if that doesn't work for you (they train on your data) and you've never signed up before, you get several hundred dollars in free credits that expire after 90 days. 3 months of free access is a pretty good deal.

adi_kurian•1h ago
Makes sense. Right now the subscriptions are like Uber as I remember it in NYC in 2014.
AstroBen•1h ago
Things must be bad if they're doing this before their IPO
rvnx•43m ago
Billions of USD in debt, a business model bleeding cash with no profit in sight, a high-competition environment, a sub-par product, free-to-use offline models taking off, potential regulatory issues, some investor commitments pulling out... tricky.

But let's not cry for the founders, they managed to get away with tons of money. The problem is for the fools holding the bag.

AstroBen•41m ago
Unfortunately the fools holding the bag are going to be those who own index funds when these companies are inserted into them.
convexly•1h ago
This pricing only really makes sense if users can predict their usage; if not, people who use this heavily are just going to be hamstrung and start rationing their usage.
jamesu•1h ago
The current pricing model (for Plus) feels deliberately confusing to me; I can never really tell if I'm nearing any kind of limit with my account, since nothing really seems to tell me.
supliminal•1h ago
Any takes on how Codex compares to Claude? I mostly use it to run ahead, document, investigate and prep the actual implementation for Claude.

Gemini burned me too many times but maybe the situation has improved since.

RobinL•39m ago
5.4 is great. I use it for python professionally and for typescript/front-end games and educational apps recreationally. In my experience it's roughly as good as opus, just a lot cheaper. It's amazing how much usage you get for $20/mo
mrtesthah•36m ago
gpt-5.4 is unmatched. Claude is possibly better in web UI tasks, but not much else.
fabian2k•1h ago
Is this something that is likely to also change the way Github Copilot bills? Right now the billing is message-based, not token-based. And OpenAI and Microsoft are rather opaquely intertwined in the AI space.
phainopepla2•48m ago
Hard to say, but GitHub Copilot also allows access to Anthropic, Google and Grok models, so I don't know that a change from a single provider would necessarily change how they bill
anuramat•59m ago
from what they wrote, they're just changing how they measure the usage; might even be a good thing if you manage your context right:

> This format replaces average per-message estimates for your plan with a direct mapping between token usage and credits. It is most useful when you want a clearer view of how input, cached input, and output affect credit consumption.
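
As a rough illustration of why managing your context matters under such a mapping (the rates below are hypothetical placeholders, not the actual rate card; only the input/cached/output split mirrors the categories quoted above): re-sending a long history as fresh input costs far more than hitting the prompt cache.

```python
# Hypothetical credits per 1M tokens, by rate-card category.
RATE_INPUT = 1.25
RATE_CACHED = 0.125
RATE_OUTPUT = 10.0

def turn_credits(fresh: int, cached: int, out: int) -> float:
    """Credits consumed by one conversational turn."""
    return (fresh * RATE_INPUT + cached * RATE_CACHED + out * RATE_OUTPUT) / 1e6

# Ten turns over a ~50k-token context with 1k-token replies:
# a naive client re-sends the whole context fresh every turn,
# a cache-aware client only sends the 2k tokens that changed.
naive = sum(turn_credits(50_000, 0, 1_000) for _ in range(10))
cache_aware = sum(turn_credits(2_000, 48_000, 1_000) for _ in range(10))
```

Under these placeholder rates the cache-aware session consumes a small fraction of the naive one's credits, which is the "good thing if you manage your context right" point.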

mrweasel•46m ago
Why not just attach a real dollar amount, rather than using "credits"?

Well, I know why. I just wanted to be snarky. It's just that trying to hide the actual price is getting a bit old. Just tell me that generating this much code will cost me $10.

hmry•43m ago
Pay 100 Gold or 15 Gems to generate this feature
toddmorey•29m ago
You joke but as a parent, I’m so sick of the gem packs, etc. they try to push on the kids to obfuscate your actual spend on games in real world money.

And now it feels like they are gamifying the compute we use for work, for all the same reasons.

devmor•5m ago
I hate that pattern so much. It’s also not just to obfuscate the spending - it’s also to ensure you already have some amount left over in your account, so that it feels like you’re not spending as much to just “top up” and afford that one thing you want this time.

If you have some left over that you can’t spend, it feels like you’ve “wasted” them.

LeafItAlone•21m ago
What is snarky about that?

The answer is so that they can charge different prices per credit. If you buy low amounts, they can charge one price. If you buy in bulk, they can offer a discount. The usage is the same, but they can differentiate price per usage to give people a more favorable price if they are better customers.

Is there anything wrong with that?

SlinkyOnStairs•17m ago
A fundamental architectural problem is that they genuinely do not know what a query will cost ahead of time.

Even for a single standalone LLM that's the case, and the 'agentic' layers thrown on top just make that problem exponentially worse.

One would need to entirely switch away from LLMs to fix this problem.

babyshake•16m ago
Isn't this an orthogonal issue that doesn't affect whether billing is done with credits or money?
gigatexal•46m ago
good. just like the Claude model. getting the pricing to be in line with costs is the only way this remains sustainable.
sdevonoes•37m ago
I would prefer if it actually explodes sooner rather than later
flufluflufluffy•40m ago
wouldn’t it be “usage based pricing” not “pricing based usage”
felixbraun•14m ago
5h and weekly resets remain, but the quotas are now ‘filled’ differently?
DeathArrow•13m ago
Well, Alibaba, Z.ai and MiniMax have quite generous coding plans. And their models are not far from OpenAI's and Anthropic's.