
Nvidia with unusually fast coding model on plate-sized chips

https://arstechnica.com/ai/2026/02/openai-sidesteps-nvidia-with-unusually-fast-coding-model-on-plate-sized-chips/
23•Bender•4d ago

Comments

greyskull•1h ago
Missing "OpenAI sidesteps" from the beginning of the title article title
sorenbs•1h ago
Yeah. Completely changes the meaning of the article. I thought Nvidia was now competing with Cerebras. That's not the case...
jeron•1h ago
Very excited for Cerebras. Hopefully Nvidia/AMD will have fewer AI sales and bring back more consumer options when they realize they have abandoned/neglected the market that made them who they are.
krackers•1h ago
Nvidia bought groq, so they might be working on their own answer to low-latency serving. (I found this good explanation of groq compared to TPU [1])

[1] https://reddit.com/r/LocalLLaMA/comments/1pw8nfk/nvidia_acqu...

reliabilityguy•1h ago
I have a question for those who closely follow Cerebras: do they have a future beyond being an inference platform based on their (unusual) in-house silicon?
bob1029•1h ago
They can also train models using this silicon. They're advertising 24T parameter models on their site right now.
adgjlsfhk1•1h ago
tldr is possibly. their packaging does offer inherent advantages in giving you maximal compute without external communication, and that seems likely to remain true unless 3d stacking advances a lot further.
sorenbs•1h ago
If chip manufacturing advances allow them to eventually run leading edge models at speeds much faster than competition, that seems a really bright future all on its own. Their current chip is reportedly 5nm already, and much too small for the real 5.3-codex: https://www.cerebras.ai/press-release/cerebras-announces-thi...
dgacmu•1h ago
My mental model of cerebras is that they have a way of giving you 44GB of SRAM (and then more compute than you'll probably need relative to that). So if you have applications where the memory access patterns would benefit massively from basically having 44GB of L3-ish-speed SRAM, and it's worth $1-3m to get that, then it's a win.

Honestly not sure what else fits that bill. Maybe some crazy radar applications? The amount of memory is awfully small for traditional HPC.

wmf•27m ago
Do they need any future beyond inference? It's going to be a huge market.
reliabilityguy•24m ago
In principle? No. In practice? I think others, e.g. the TPUs and Trainiums of the world, will cannibalize a lot of Cerebras's share. I am not an expert though; that's why I'm asking for others' opinions.
uniclaude•1h ago
Previous discussion on 5.3 codex Spark (sharing as the article doesn’t add tremendous value to it): https://news.ycombinator.com/item?id=46992553
gortok•1h ago
Ever since the recent revelation that Ars has used AI-hallucinated quotes in their articles, I have to wonder whether any of these quotes are AI-hallucinated, or if the piece itself is majority or minority AI generated.

If so, I have to ask: If you aren’t willing to take the time to write your own work, why should I take the time to read your work?

I didn’t have to worry about this even a week ago.

what•1h ago
>I didn’t have to worry about this even a week ago

No, you didn’t realize you had to worry about this until a week ago.

cyanydeez•55m ago
I'm actually very concerned people have yet to realize they don't need to put truth values on internet content.

Once you default to 'doesn't matter if true' you end up being a lot more even-keeled.

Havoc•1h ago
> On Thursday, OpenAI released its first production AI model to run on non-Nvidia hardware,

They used AMD GPUs before - MI300X via Azure, a year-plus ago.

nguyentran03•1h ago
the hardware diversification story here is more interesting than the speed numbers. OpenAI going from a planned $100B Nvidia deal to "actually we're unsatisfied with your inference speed" within a few months is a pretty dramatic shift. AMD deal, Amazon cloud deal, custom TSMC chip, and now Cerebras. that's not hedging, that's a full migration strategy.

1,000 tok/s sounds impressive but Cerebras has already done 3,000 tok/s on smaller models. so either Codex-Spark is significantly larger/heavier than gpt-oss-120B, or there's overhead from whatever coding-specific architecture they're using. the article doesn't say which.

the part I wish they'd covered: does speed actually help code quality, or just help you generate wrong code faster? with coding agents the bottleneck isn't usually token generation, it's the model getting stuck in loops or making bad architectural decisions. faster inference just means you hit those walls sooner.
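
To put rough numbers on what the speed difference actually buys in an agent loop (illustrative figures, not from the article): a step that emits ~1,500 tokens takes about 1.5 s to generate at 1,000 tok/s versus ~15 s at a more typical GPU-served ~100 tok/s, so a 50-step session spends roughly 75 s instead of 12.5 min on pure generation. It compresses the wait per step; it doesn't reduce the number of steps. The step and session sizes below are assumptions.

    # Back-of-envelope generation time per agent session (all numbers illustrative)
    tokens_per_step = 1_500      # assumed size of one agent response / tool call
    steps_per_session = 50       # assumed number of agent turns
    for tok_per_s in (100, 1_000, 3_000):
        total_s = tokens_per_step * steps_per_session / tok_per_s
        print(f"{tok_per_s:>5} tok/s -> {total_s / 60:5.1f} min of pure generation")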

conception•1h ago
With agent teams I've found CC significantly better at catching its own mistakes before it finishes its task. Having several agents challenge the implementation agents seems to produce better results. If so, faster is always better, since you can then run more adversarial/verification passes before finishing.
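
A minimal sketch of the pattern being described: one implementation pass followed by repeated adversarial review passes, where faster inference just means more review rounds fit in the same wall-clock budget. call_model, the prompts, and the "LGTM" convention are hypothetical placeholders, not any specific agent framework.

    import time

    def call_model(prompt: str) -> str:
        # Hypothetical stand-in for one LLM call; swap in a real client here.
        return "LGTM (stub response)"

    def run_with_reviewers(task: str, budget_s: float, n_reviewers: int = 3) -> str:
        start = time.monotonic()
        draft = call_model(f"Implement this task:\n{task}")
        # Keep running adversarial review passes until the time budget is spent.
        while time.monotonic() - start < budget_s:
            critiques = [call_model(f"Find bugs or bad design in:\n{draft}")
                         for _ in range(n_reviewers)]
            if all("LGTM" in c for c in critiques):
                break
            draft = call_model("Revise to address:\n" + "\n---\n".join(critiques)
                               + "\n\nChange:\n" + draft)
        return draft
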
aurareturn•58m ago
If you are OpenAI, why wouldn't you naturally want more than a single supplier? Especially at a time when no one can get enough chips.
irishcoffee•52m ago
> OpenAI going from a planned $100B Nvidia deal to "actually we're unsatisfied with your inference speed" within a few months is a pretty dramatic shift.

A different way to read this might be: "Nvidia isn't going to agree to that deal, so we now need to save face by dumping them first."

I imagine sama doesn't like rejection.

nerdsniper•20m ago
I'm 99% sure this 20-hour old user is an LLM posting on HN. Specifically, ChatGPT.
RobotToaster•1h ago
One thing I don't get about Cerebras: they say it's wafer-scale, but the chips they show are square. I thought wafers were circular?
cyanydeez•52m ago
I believe the discs are a product of the manufacturing process, since they have to spin them. The entire disc isn't usable anyway, so the circular shape isn't really what "wafer" refers to. If the entire chip comes from the wafer, it's wafer scale.
wmf•28m ago
They cut off the sides. It's the largest square you can make from a wafer.
gpm•28m ago
Their chips aren't actually square; they get an extra 2.9mm in both dimensions by having slightly rounded corners. They are wasting the rest of the circle, though, yes.
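
The geometry checks out, assuming a standard 300 mm wafer (an assumption; the wafer size isn't stated in the thread): the largest square you can inscribe in that circle is 300/√2 ≈ 212 mm per side, and letting the corners round off slightly is what buys the extra ~2.9 mm in each dimension.

    import math

    wafer_diameter_mm = 300.0  # assumed standard wafer size
    inscribed_side = wafer_diameter_mm / math.sqrt(2)
    print(f"largest inscribed square: {inscribed_side:.1f} mm per side")     # ~212.1 mm
    print(f"with slightly rounded corners: ~{inscribed_side + 2.9:.1f} mm")  # ~215.0 mm
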
ElijahLynn•53m ago
Title is currently: "OpenAI sidesteps Nvidia with unusually fast coding model on plate-sized chips"
AndrewKemendo•43m ago
Mark my words:

The era of “Personal computing” is over

Large-scale Capital is not gonna make any more investments in microelectronics going forward.

Capital is incentivized to build large data centers and very high speed private Internet - not public Internet, but private Internet like Starlink.

So the same way the 1970s were the mainframe era of server-side computing, which turned into server-side rendering, which then turned into client-side rendering, which culminated in the era of the personal computer in your home and then finally in your pocket,

we're going back to server-side model communication, and that's effectively going to become the gateway to all other information, which will be increasingly compartmentalized into remote data centers and high-speed access.

Dark web agent spotted bedroom wall clue to rescue girl from abuse

https://www.bbc.com/news/articles/cx2gn239exlo
81•colinprince•55m ago•30 comments

Study: Self-generated Agent Skills are useless

https://arxiv.org/abs/2602.12670
248•mustaphah•4h ago•107 comments

14-year-old Miles Wu folded origami pattern that holds 10k times its own weight

https://www.smithsonianmag.com/innovation/this-14-year-old-is-using-origami-to-design-emergency-s...
402•bookofjoe•7h ago•80 comments

AI is destroying Open Source, and it's not even good yet

https://www.jeffgeerling.com/blog/2026/ai-is-destroying-open-source/
79•VorpalWay•1h ago•47 comments

Show HN: Andrej Karpathy's microgpt.py to C99 microgpt.c – 4,600x faster

https://github.com/enjector/microgpt-c
38•Ajay__soni•1h ago•2 comments

Rise of the Triforce

https://dolphin-emu.org/blog/2026/02/16/rise-of-the-triforce/
89•max-m•4h ago•7 comments

Show HN: Scanned 1927-1945 Daily USFS Work Diary

https://forestrydiary.com/
46•dogline•2h ago•7 comments

Show HN: Free Alternative to Wispr Flow, Superwhisper, and Monologue

https://github.com/zachlatta/freeflow
91•zachlatta•4h ago•45 comments

Testing Postgres race conditions with synchronization barriers

https://www.lirbank.com/harnessing-postgres-race-conditions
65•lirbank•5h ago•29 comments

What your Bluetooth devices reveal

https://blog.dmcc.io/journal/2026-bluetooth-privacy-bluehood/
307•ssgodderidge•11h ago•121 comments

Running NanoClaw in a Docker Shell Sandbox

https://www.docker.com/blog/run-nanoclaw-in-docker-shell-sandboxes/
59•four_fifths•3h ago•19 comments

Visual Introduction to PyTorch

https://0byte.io/articles/pytorch_introduction.html
122•0bytematt•3d ago•12 comments

PascalABC.net

https://pascalabc.net:443/en
26•andsoitis•2d ago•5 comments

State of Show HN: 2025

https://blog.sturdystatistics.com/posts/show_hn/
58•kianN•6h ago•11 comments

PCB Rework and Repair Guide [pdf]

https://www.intertronics.co.uk/wp-content/uploads/2017/05/PCB-Rework-and-Repair-Guide.pdf
93•varjag•2d ago•28 comments

Turing Labs (YC W20) Is Hiring – Founding GTM Sales Hacker

1•turinglabs•4h ago

Suicide Linux (2009)

https://qntm.org/suicide
80•icwtyjj•5h ago•50 comments

Qwen3.5: Towards Native Multimodal Agents

https://qwen.ai/blog?id=qwen3.5
379•danielhanchen•16h ago•180 comments

Show HN: Jemini – Gemini for the Epstein Files

https://jmail.world/jemini
260•dvrp•20h ago•48 comments

Neurons outside the brain

https://essays.debugyourpain.com/p/you-are-not-just-your-brain
55•yichab0d•7h ago•21 comments

Show HN: 2D Coulomb Gas Simulator

https://simonhalvdansson.github.io/2D-Coulomb-Gas-Tools/index_gpu.html
30•swesnow•6h ago•5 comments

Ghidra by NSA

https://github.com/NationalSecurityAgency/ghidra
310•handfuloflight•2d ago•175 comments

Show HN: Maths, CS and AI Compendium

https://github.com/HenryNdubuaku/maths-cs-ai-compendium
54•HenryNdubuaku•10h ago•14 comments

The long tail of LLM-assisted decompilation

https://blog.chrislewis.au/the-long-tail-of-llm-assisted-decompilation/
45•knackers•7h ago•13 comments

Building for an audience of one: starting and finishing side projects with AI

https://codemade.net/blog/building-for-one/
5•lorisdev•1h ago•0 comments

LCM: Lossless Context Management [pdf]

http://papers.voltropy.com/LCM
25•ClintEhrlich•7h ago•14 comments

Hear the "Amati King Cello", the Oldest Known Cello in Existence

https://www.openculture.com/2021/06/hear-the-amati-king-cello-the-oldest-known-cello-in-existence...
5•tesserato•3d ago•2 comments

How to take a photo with scotch tape (lensless imaging) [video]

https://www.youtube.com/watch?v=97f0nfU5Px0
103•surprisetalk•9h ago•5 comments

Privilege is bad grammar

https://tadaima.bearblog.dev/privilege-is-bad-grammar/
219•surprisetalk•7h ago•220 comments

Building a model that visualizes strategic golf

https://golfcoursewiki.substack.com/p/i-spent-the-last-month-and-a-half
15•scoofy•8h ago•6 comments