frontpage.

Nvidia Launches Vera CPU, Purpose-Built for Agentic AI

https://nvidianews.nvidia.com/news/nvidia-launches-vera-cpu-purpose-built-for-agentic-ai
74•lewismenelaws•2h ago

Comments

jauntywundrkind•1h ago
Given the price of these systems, the ridiculously expensive network cards aren't such a huge deal, but I can't help wondering at the absurdly amazing bandwidth hanging off Vera: the brags about "7x more bandwidth than PCIe Gen 6" (amazing), and then having to go through PCIe to the network to talk to anyone else. It might be 800GbE, but it's still so many hops; PCIe is weighty.

I keep expecting to see fabric gains: something where the host chip has a better way to talk to other host chips.

It's hard to deny the advantages of central switching as something easy and effective to build, but on the flip side, the high-radix systems Google has been building have just been amazing. Microsoft's Maia 200 put a gobsmacking 2.8 Tbps of Ethernet on chip, but it still feels like so little, such a bare start. For reference, PCIe 6.0 x16 is a bit shy of 1 Tbps, so that's vaguely ~45 lanes' worth.

It will be interesting to see what other bandwidth-massive workloads evolve over time, or if this throughput era really ends up serving AI alone. Hoping CXL or someone else slims down the overhead and latency of attachment, soon-ish.

Maia 200: https://www.techpowerup.com/345639/microsoft-introduces-its-...
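A quick back-of-the-envelope check of the figures above (a rough sketch using nominal per-lane signaling rates, ignoring FLIT/encoding and protocol overhead):

```python
# Sanity check: PCIe 6.0 runs 64 GT/s per lane, roughly 64 Gbps raw.
pcie6_lane_gbps = 64
pcie6_x16_tbps = 16 * pcie6_lane_gbps / 1000   # bandwidth of a full x16 slot
maia200_gbps = 2800                            # Maia 200's quoted 2.8 Tbps on-chip Ethernet

print(pcie6_x16_tbps)                   # 1.024 Tbps raw for x16 (effective is a bit under 1)
print(maia200_gbps / pcie6_lane_gbps)   # 43.75 -- i.e. roughly ~45 lanes' worth
```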

babelfish•1h ago
Most of the big AI/HPC clusters these systems are aimed at aren't running regular PCIe Ethernet between nodes; they're usually wired up with InfiniBand fabrics (HDR/NDR now, XDR soon).
bob1029•1h ago
> It might be 800GbE, but it's still so many hops; PCIe is weighty.

Once you need to reach beyond L2/L3 it is often the case that perfectly viable experiments cannot be executed in reasonable timeframes anymore. The current machine learning paradigm isn't that latency sensitive, but there are other paradigms that can't be parallelized in the same way and are very sensitive to latency.

d_silin•1h ago
It is an 88-core ARMv9 chip, for a somewhat more detailed spec.
mixmastamyk•1h ago
Hmm, the 128-core Ampere Altra CPU is already available, and in a case from System76. I wonder what else differentiates it.

If they're going to build CPUs, I wish they had used RISC-V instead. They're already using it somewhat.

PeterCorless•14m ago
Vera does what NVIDIA calls Spatial Multithreading, "physically partitioning each core’s resources rather than time slicing them, allowing the system to optimize for performance or density at runtime." A kind of static hyperthreading; you get two threads per core.

It's somewhat different from how x86 chips do simultaneous multithreading (SMT).

dmitrygr•1h ago
> Purpose-Built for Agentic AI

From the "fridge purpose-built for storing only yellow tomatoes" and "car only built for people whose last name contains the letter W" series.

When can this insanity end? It is a completely normal, garden-variety ARM SoC; it'll run Linux, same as every other ARM SoC does. It is as related to "Agentic $whatever" as your toaster is.

dpe82•1h ago
The power and importance of marketing is deeply underappreciated by us technical types.
LogicFailsMe•51m ago
And yet more than a little Gavin Belson "Box III" vibes here. Fortunately, no signature edition.
dwb•32m ago
I don’t underappreciate it, but I do despise it.
pdpi•53m ago
> It is as related to "Agentic $whatever" as your toaster is related to it

These things have hardware FP8 support, and a 1.8TB/s full mesh interconnect between CPUs and GPUs. We can argue about the "agentic" bit, but those are features that don't really matter for any workload other than AI.
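For a concrete sense of what "hardware FP8 support" means, here is a minimal software sketch of decoding the OCP FP8 E4M3 format (1 sign bit, 4 exponent bits, 3 mantissa bits, bias 7); the hardware does the equivalent natively at full throughput. This is an illustration of the format, not NVIDIA's implementation:

```python
def fp8_e4m3_decode(b: int) -> float:
    """Decode one OCP FP8 E4M3 byte: 1 sign, 4 exponent, 3 mantissa bits, bias 7."""
    sign = -1.0 if (b >> 7) & 1 else 1.0
    exp = (b >> 3) & 0xF
    mant = b & 0x7
    if exp == 0xF and mant == 0x7:
        return float("nan")                     # E4M3 has no infinities; S.1111.111 is NaN
    if exp == 0:
        return sign * (mant / 8) * 2.0 ** -6    # subnormal: no implicit leading 1
    return sign * (1 + mant / 8) * 2.0 ** (exp - 7)

print(fp8_e4m3_decode(0x38))  # 1.0
print(fp8_e4m3_decode(0x7E))  # 448.0 -- the E4M3 maximum normal value
```

The tiny dynamic range (max 448) is why FP8 is really only useful for neural-network weights and activations, which supports the point above.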

kibibu•48m ago
Would cloud gaming platforms benefit from the interconnect?
pdpi•44m ago
Don't think they would. Games aren't nearly as hungry for memory bandwidth as LLMs are. Also, I expect that the VRAM/GPU/CPU balance would be completely out of whack. Something would be twiddling its thumbs waiting for the rest of the hardware.
dmitrygr•45m ago
Memory bandwidth between cores matters for... literally all workloads that are not single-core (read: all of them). And FP8 matters not at all, because inference on a CPU is too slow to be of any use whatsoever in the days of proper accelerators.
pwg•17m ago
> It is a completely normal garden-variety ARM SoC

To mis-quote the politician quip:

How can you tell a marketer is lying?

Answer: His/her mouth is moving.

tencentshill•1h ago
So does this cut out Intel/x86 from all the massive new datacenter buildouts entirely? They've already lost Apple as a customer and are not competitive in the consumer space. I don't see how they can realistically grow at all with x86.
alecco•1h ago
Even Apple hardware looks inexpensive compared to Nvidia's huge premium. And never mind the order backlog.

x86 and Apple already sell CPUs with integrated memory and high bandwidth interconnects. And I bet eventually Intel's beancounter board will wake up and allow engineering to make one, too.

But competition is good for the market.

storus•53m ago
Apple went from a high-end PC maker to a low-end AI provider by blocking Nvidia on their platform.
mikrl•48m ago
>are not competitive in the consumer space

AFAIK they still dominate on clock rate, which I was surprised to see when doing some back of the envelope calculations regarding core counts.

I felt my 8-core i9-9900K was inadequate, so I shopped around for something from AMD, and IIRC the core-count multiplier of the chip I found was outweighed by the clock-rate multiplier, so it's possible that at full utilization my i9 is still about the best I can get at the price.

Not sure if I’m the typical consumer in this case however.

kllrnohj•36m ago
Your 9900K at 5 GHz does work slower than a Ryzen 9800X3D at 5 GHz. A lot slower (1700 single-core Geekbench vs 3300, and just about any benchmark will tell the same story). Clock speed alone doesn't mean anything.
wmf•36m ago
A 9700X is twice the performance of a 9900K, and an M5 Max is almost 3x the performance. The megahertz myth really is a myth.
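To put numbers on the megahertz-myth point: performance is clock rate times work-per-clock (IPC), so at roughly equal ~5 GHz clocks the Geekbench scores quoted above imply a near-2x IPC gap. A rough illustration using the scores from the comment, not a benchmark:

```python
# Single-core Geekbench scores quoted in the thread
gb_9900k = 1700
gb_9800x3d = 3300

# perf = clock * IPC, so at equal clocks the score ratio is the IPC ratio
ipc_ratio = gb_9800x3d / gb_9900k
print(round(ipc_ratio, 2))   # 1.94 -- nearly twice the work per cycle
```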
rishabhaiover•1h ago
I'm assuming this is for tool calls and orchestration. I didn't know we needed more exploitable parallelism from the hardware; we had software bottlenecks (you're not running 10,000 agents concurrently, or that many downstream tool calls).

Can someone explain what is Vera CPU doing that a traditional CPU doesn't?

urig•1h ago
Lots and lots of CPUs pooled, plus faster, more power-efficient RAM accessible to both GPU and CPU. IIUC.
rishabhaiover•1h ago
But at what stage are we asking for that RAM? If it's the inference stage, doesn't that belong to the GPU<->memory path, which has nothing to do with the CPU?

I did see they have unified CPU/GPU memory, which may reduce the cost of host/kernel transfers, especially now that we're probably moving more and more memory with longer-context tasks.

kibibu•49m ago
> you're not running 10,000 agents concurrently or downstream tool calls

Cursor seems to be doing exactly that, though.

urig•1h ago
What the heck is agentic inference and how is it supposed to be different from LLM inference? That's a rhetorical question. Screw marketing and screw hype.
BoredPositron•1h ago
Who wants general computing anyways?
KnuthIsGod•1h ago
China will beat this....

Seems like a triumph of hype over reality.

China can do breathless hype just as well as Nvidia.

gcanyon•1h ago
Anyone know how this compares to Apple’s M5 chips? Or is that comparison <takes off sunglasses> apples to oranges.
d_silin•55m ago
M5 chips are 9-18 cores and optimized for power efficiency; these are more like Xeons, with 200-300 W TDPs, I'd bet.
kllrnohj•45m ago
If an M5 has 9-18 cores and takes ~20 W, that's ~1-2 W per CPU core. If these are 200-300 W with ~100-200 CPU cores, then guess what? That's also ~1-2 W per CPU core.

Xeons, Epycs, whatever this is - they are all also typically optimized for power efficiency. That's how they can fit so many CPU cores in 200-300W.
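The per-core arithmetic above, spelled out (TDP and core-count figures are the thread's rough estimates, not published specs):

```python
def watts_per_core(tdp_watts: float, cores: int) -> float:
    """Naive per-core power budget: package TDP spread evenly across cores."""
    return tdp_watts / cores

print(watts_per_core(20, 10))    # 2.0 -- an M5-class part at ~20 W, ~10 cores
print(watts_per_core(250, 125))  # 2.0 -- a 250 W server part with ~125 cores
```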

storus•54m ago
Grace GB10, Vera's predecessor, had single-core performance comparable to the M3, so I guess we can expect at least M4-level performance now.
porphyra•38m ago
Isn't the GB10 a Mediatek chip and not directly related to the Grace datacenter CPU?
wtallis•7m ago
More fair to say it's completely unrelated to the Grace data center CPU.
pdpi•50m ago
Features like hardware FP8 support definitely make it apples-to-oranges.
FridgeSeal•57m ago
Are we rapidly careening towards a world where _only_ AI “computing” is possible?

Wanted to do general-purpose stuff? Too bad: we drove the price of everything up, and then started producing only chips designed to run "AI" workloads.

Oh you wanted a local machine? Too bad, we priced you out, but you can rent time with an ai!

Feels like another ratchet in the "war on general-purpose computing," but from a rather different direction.

baal80spam•50m ago
Say what you want about NVIDIA (to me they are just doing what every company would do in their place), but they create engineering marvels.
kibibu•47m ago
Am I crazy, or is Jensen's statement a copy-paste from ChatGPT?

(Could be both)

wmf•44m ago
If AI is so great why should he not use it?
recvonline•34m ago
Does this mean their gaming GPUs are becoming less in demand, and therefore cheaper/more available again?
TheRoque•33m ago
It means it will be profitable to mine crypto again
wmf•23m ago
No.
anesxvito•27m ago
The philosophy of knowing exactly what's on your system translates directly to how you think about software you build. Local-first, no telemetry, minimal dependencies. FreeBSD instilled that mindset in a generation of developers that now pushes back hard against cloud-everything SaaS. Tauri over Electron is the same argument applied to desktop apps.
brazukadev•24m ago
> Tauri over Electron is the same argument applied to desktop apps.

You lost me here, but you still got my upvote. Tauri and Electron are pretty much the same when compared to local-first vs. cloud SaaS.

rka128•27m ago
"democratize access to AI and accelerating innovation."

So they make inference cheaper and the models get even worse. Or Jensen Huang has AI psychosis. Or both.

Here is a new business idea for Nvidia: Give me $3000 in a circular deal which I will then spend on a graphics card.

PeterCorless•20m ago
This is the related benchmark blog from Redpanda [disclosure: I work for Redpanda and I helped write this. Credit to Travis Downs & others at Redpanda for the heavy lifting on the testing and analysis.]

https://www.redpanda.com/blog/nvidia-vera-cpu-performance-be...

akomtu•14m ago
They should've called it Vega: https://doom.fandom.com/wiki/VEGA
yalogin•10m ago
This isn't the Groq acquisition yet, so is there another update coming with that, claiming more improvements?
nilstycho•6m ago
https://developer.nvidia.com/blog/inside-nvidia-groq-3-lpx-t...

The bottleneck now is reviewing code, not writing it

https://mystudentfailedtheirmid.substack.com/p/the-bottleneck-now-is-reviewing-code
1•darkhorse13•4m ago•0 comments

Rigor in Analysis: From Newton to Cauchy (2005) [pdf]

https://homsigmaa.net/wp-content/uploads/2025/03/2005-Collingwood-Rigor.pdf
1•3willows•4m ago•1 comments

The flip side of AI ingenuity (DeepMind, 2020)

https://deepmind.google/blog/specification-gaming-the-flip-side-of-ai-ingenuity/
1•ramoz•5m ago•1 comments

Has anyone critically examined Michael Levin's sorting algorithm claims?

1•vladiim•5m ago•0 comments

Ask HN: How to improve my OSINT middle east monitor

1•zarathustra333•7m ago•1 comments

How Iranians are evading internet blocks to contact family abroad

https://www.bbc.com/news/articles/ckgl58y5943o
1•derbOac•7m ago•0 comments

Google kept featuring this Chrome extension for months after it turned malicious

https://www.xda-developers.com/google-featuring-chrome-extension-months-malicious/
1•Doublentender•8m ago•0 comments

I built an AI tool that analyzes contracts and flags legal risks

https://contractshieldai.com
1•Mihir_97•8m ago•0 comments

Show HN: AllMy Ledger – Desktop accounting software, one-time purchase, no cloud

https://allmy.software/ledger/
2•cdmackie•9m ago•1 comments

Orb.Farm

https://orb.farm/
1•onestay42•12m ago•0 comments

My custom agent used 87% fewer tokens when I gave it Skills for its MCP tools

https://seroter.com/2026/03/16/my-custom-agent-used-87-fewer-tokens-when-i-gave-it-skills-for-its...
1•richards•14m ago•0 comments

Ending the Sugar Rush

https://civic.io/2026/03/16/ending-the-sugar-rush/
1•cdrnsf•15m ago•0 comments

Show HN: ThresholdIQ – Browser-based anomaly detection Engine

https://thresholdiq.app
1•vigneshj•15m ago•1 comments

The Freedom Stack

https://www.ianbetteridge.com/the-freedom-stack/
1•cdrnsf•17m ago•0 comments

Hydropower Line from Quebec to Queens Could Power a Million NYC Homes

https://www.nytimes.com/2026/03/16/nyregion/hydro-power-nyc.html
2•JumpCrisscross•21m ago•0 comments

Redpanda pushes the envelope on Nvidia Vera

https://www.redpanda.com/blog/nvidia-vera-cpu-performance-benchmark
1•PeterCorless•21m ago•0 comments

Solving Problems by Writing Out Questions and Answers

https://nguyenhuythanh.com/posts/problem-solving-qnas/
1•thanhnguyen2187•24m ago•0 comments

The day Point Loma launched a ship made of concrete

https://timesofsandiego.com/military/2026/03/14/the-day-point-loma-launched-a-ship-made-of-concrete/
1•gscott•25m ago•0 comments

Appt Helper – Skip the Global Entry Interview Backlog

https://appthelper.com/en
1•Roberto_guido•25m ago•0 comments

Iranians Use an App to Map Military Bases and Missile Sites – and So Does Israel

https://www.thefp.com/p/iranians-use-an-app-to-map-military
2•mhb•29m ago•0 comments

Ford Now Sells a Supercharger Kit to Make the F-150 Lobo a Real Street Truck

https://www.thedrive.com/news/ford-now-sells-a-supercharger-kit-to-make-the-f-150-lobo-a-real-str...
1•PaulHoule•30m ago•0 comments

AI agents framework for TypeScript and Deno

https://github.com/a7ul/vibes
1•atulanand94•31m ago•0 comments

Humanities in the Machine

https://blainsmith.com/essays/humanities-in-the-machine/
1•birdculture•31m ago•0 comments

Benjamin Netanyahu is struggling to prove he's not an AI clone

https://www.theverge.com/tech/895453/ai-deepfake-netanyahu-claims-conspiracy
5•amrrs•32m ago•0 comments

Cognitive Security

https://ghuntley.com/cogsec/
2•ghuntley•33m ago•0 comments

AI as Economic Warfare

https://ghuntley.com/warfare/
1•ghuntley•33m ago•0 comments

Theorem_ledger.md

https://github.com/affectively-ai/aeon/blob/main/docs/ebooks/145-log-rolling-pipelined-prefill/co...
1•taylorbuley•35m ago•0 comments

Show HN: LynString – Translate Android Strings.xml with AI

https://www.lynstring.dev/
1•jharteg•36m ago•1 comments

AI is helping choose targets in Iran war – now it's a target too

https://www.abc.net.au/news/2026-03-15/iran-war-ai-technology-data-centres/106443004
3•breve•37m ago•0 comments

Show HN: Live-Editable Svelte Pages

https://svedit.dev
3•_mql•39m ago•0 comments