OpenAI and Broadcom unveil LLM-optimized inference chip

https://openai.com/index/openai-broadcom-jalapeno-inference-chip/
128•meetpateltech•4h ago

Comments

kilroy123•3h ago
I hope to see something like this, but in a small form factor like the NVIDIA spark.

I want a super fast LLM that is Opus 4.6+, like, in ability.

wmf•1h ago
Memory bandwidth is the bottleneck in the Spark. If you replace the SoC with an optimized ASIC but keep the same 256-bit LPDDR5 the performance will be the same. You can increase performance by using wider memory but that's also more expensive.
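
The bandwidth argument above can be sketched as a back-of-envelope bound. This is only a rough model, not a benchmark: the 8533 MT/s transfer rate, the 70B dense model, and the 1 byte per weight are assumptions picked for illustration, and the function names are hypothetical. It assumes every weight is read once per generated token and ignores KV-cache traffic, batching, and compute time.

```python
def peak_bandwidth_gbs(bus_width_bits: int, transfers_per_sec: float) -> float:
    """Peak DRAM bandwidth in GB/s: bytes per transfer times transfers per second."""
    return bus_width_bits / 8 * transfers_per_sec / 1e9


def decode_tokens_per_sec_bound(
    bandwidth_gbs: float, params_billion: float, bytes_per_param: float
) -> float:
    """Upper bound on single-stream decode speed for a dense model.

    Assumes all weights are streamed from memory once per token, so the
    bound is simply bandwidth divided by model size in GB.
    """
    model_gb = params_billion * bytes_per_param
    return bandwidth_gbs / model_gb


# Assumed figures for illustration only: a 256-bit bus at 8533 MT/s and a
# hypothetical 70B-parameter dense model stored at 1 byte per weight.
bw = peak_bandwidth_gbs(256, 8533e6)
bound = decode_tokens_per_sec_bound(bw, 70, 1.0)
print(f"peak bandwidth ~{bw:.0f} GB/s, decode bound ~{bound:.1f} tokens/s")
```

Under these assumptions the bound depends only on the memory system and the model size, not on the compute die, so swapping the SoC for an ASIC leaves it unchanged, while doubling the bus width doubles it.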
smith7018•45m ago
Unfortunately Sam Altman won't be the one to deliver us at-home hardware that can run Opus-level models
fibonacci112358•3h ago
So this is where all the memory they bought is going to.
babelfish•1h ago
that's not really how it works
shellcromancer•3h ago
Probably obvious but still omitted in the OpenAI post: chips are being made by TSMC [1]. Wasn't sure if Intel got it.

1. https://www.investing.com/news/stock-market-news/openai-unve...

a_conservative•1h ago
I recently put 2+2 together.

Broadcom has become wealthy by being Google's TPU hardware partner, including sharing their TSMC capacity with Google, and evidently now they are doing the same thing with OpenAI. What a brilliant way to take advantage of the AI gold rush!

I wish they weren't using their piles of money to extort money out of the software industry like they are with VMware and Bitnami.

alephnerd•1h ago
> Broadcom has become wealthy by being Google's TPU hardware partner...

Kinda, but not exactly.

Broadcom cornered the enterprise infra and security market in the late 2010s after acquiring CA Technologies, BMC, Symantec, and VMware, and was able to build a strong cybersecurity story during that period's cybersecurity and SaaS boom.

That gave them plenty of cashflow that helped subsidize their hardware business when hardware was not viewed as hot as it is today.

Additionally, Broadcom is GCP's marquee customer and has been for a little under a decade, so they were able to make a sweetheart deal where all the software businesses at Broadcom would exclusively use GCP and in return GCP would work with Broadcom to design its silicon and source the infra needed for its DC buildouts.

Ironically, the US government blocking Broadcom's acquisition of Qualcomm was the best thing it ever could have done for Broadcom, because it gave Broadcom the dry powder to dominate enterprise SaaS and build a strong niche in the cybersecurity space.

> piles of money to extort money out of the software industry

From personal experience, executives and leadership who started off in the electronics and hardware industry are much more vicious and cutthroat than their peers who started in software.

Working in an industry that historically had to deal with high commodification, low margins, and long tail sales leads to leadership that can execute. Additionally, no one climbs the leadership ladder without having spent years as a line-level engineer.

maz1b•2h ago
Pretty huge move. Google and their TPUs are looking infinitely more prescient as I think they are on their 7th generation, along with the offshoots it inspired like the LPU and even others, perhaps like Cerebras and their Wafer Scale Engine.

However, based on first impressions, it seems like this is meant for the inference side and not training, which is also an interesting choice.

forrestthewoods•1h ago
Inference costs are higher than training now. I think.

Nvidia is king of general purpose training chips. But inference can be specialized.

skeledrew•1h ago
Training is pretty much a 1x cost, and that cost is already on the way down with architectural improvements. Inference though is an ongoing cost which over time takes orders of magnitude more resources, so focusing on making that far more efficient means way greater gains over time.
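
The one-time versus ongoing cost argument above can be put in numbers with a small sketch. Every figure here is hypothetical and unit-less, chosen only to show the shape of the reasoning, and the function names are made up for this example. It assumes a constant daily serving cost, which real deployments will not have.

```python
def breakeven_days(training_cost: float, inference_cost_per_day: float) -> float:
    """Days of serving after which cumulative inference cost equals the one-time training cost."""
    return training_cost / inference_cost_per_day


def inference_savings(
    inference_cost_per_day: float, days: int, efficiency_gain: float
) -> float:
    """Cumulative savings if each day of serving costs 1/efficiency_gain of the baseline."""
    return inference_cost_per_day * days * (1 - 1 / efficiency_gain)


# Hypothetical numbers, not taken from any real deployment.
TRAINING_COST = 100.0          # paid once
INFERENCE_COST_PER_DAY = 1.0   # paid every day the model is served

days_to_match = breakeven_days(TRAINING_COST, INFERENCE_COST_PER_DAY)
saved = inference_savings(INFERENCE_COST_PER_DAY, days=730, efficiency_gain=2.0)
print(f"inference matches training cost after {days_to_match:.0f} days")
print(f"a 2x inference efficiency gain saves {saved:.0f} units over 730 days")
```

With these made-up inputs, two years of serving at doubled efficiency saves more than the entire training run cost, which is the sense in which inference efficiency compounds over time.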
zer00eyz•1h ago
> early testing shows that Jalapeño will deliver performance per watt substantially better than current state-of-the-art

We're starting to see what really matters here, and though this is hand wavy the TPU makes similar claims.

I think Google's memo about having no moat still stands (see: https://newsletter.semianalysis.com/p/google-we-have-no-moat... if you are unaware). It kind of makes sense that all of this is looking more like 60's to 90's IBM, DEC, Cray, Sun and the hardware race that happened then. History doesn't repeat but it often rhymes and I suspect that these efforts will follow the same trajectory.

jerojero•2h ago
One thing I don't like about California based companies is how cringe the names always are.

"Jalapeño" is such a bad name, having an "ñ" already makes it difficult and annoying to deal with in so many little ways. Good luck with that.

But also, there's the sort of "yes let's use Mexican related things because we're California" thought that I just really hate. I don't know, it's like corporate Memphis to me. You see a product like this, you know it's an uppity California based firm that came up with it.

qsxfthnkp2322•2h ago
Jalapeño

Jalapeño

Jalapeño

Really has a… ring to it

thewebguyd•1h ago
No worse, I suppose, than the obsession with Lord of the Rings that the authoritarian surveillance companies have. Palantir, Anduril. Then we have the not defense/surveillance ones: Mithril, Valar, Narya, Erebor
skeledrew•1h ago
What kinds of names would you suggest?
thewebguyd•1h ago
None, probably. Just saying Jalapeño is no worse than any other non-descriptive company name. Although at least Palantir and Anduril are aptly named for what they do. The VC firms less so.
utopiah•1h ago
dadoum•2h ago
> May we scale smoothly, exponentially and uneventfully through A[SI]

That sentence sounds weird to me. I can't really put my finger on why, maybe the combination of adverbs, or just the fact of writing the desire of scaling as a company so directly. It feels (to me) like openly claiming their selfish goals. Or maybe I am just misinterpreting and they are referring to all of humanity as "We" (but knowing Broadcom's and, to a lesser extent, OpenAI's doings, I am not convinced).

qsxfthnkp2322•2h ago
aw shucks nvda has some spicy competition

Make sure you all use that fancy ñ

boarush•1h ago
They don't have true competition; what they lose out on is market share with hyperscalers, since OpenAI would have no plans to share inference hardware with any other company right now. Plus, I don't know how NVIDIA's investment equation pans out long term, given OpenAI will be investing in a more purpose-built inference stack for the future.
Legend2440•1h ago
The only surprising thing about this is that they didn't do it three years ago.
satvikpendem•1h ago
I'm assuming they used LLMs to (help humans) do custom circuit design. Even pre LLM there were various computer optimizations that didn't require humans like genetic algorithms. It'd be cool to see a paper on how they did it.
jabedude•1h ago
how much does this chip help with inference speed?
wmf•1h ago
It's probably the same speed but cheaper.
gravypod•1h ago
I wonder how close OpenAI is getting to using the memory they purchased. Are they planning to stack a huge amount of HBM2 into these chips?
wmf•1h ago
I assume OpenAI has been buying memory and "giving" it to Nvidia in exchange for a discount.
v5v3•1h ago
>designed for initial deployment by the end of 2026 and expanding in the years ahead,

So after the IPO and will be featured heavily in the IPO sales brochure as a future promise?

I'm sceptical over any pre-IPO announcements.

frandroid•30m ago
Whose IPO? Broadcom and Google are already listed, obviously.
fennecbutt•1h ago
I mean I'd love to be able to buy something like the 17k tps taalas chip as a PCIe or M.2 card.

Imagine when we can roar along at that speed, low power. Can just have the model reason for a while about anything and everything. It reminds me of the "race to idle" for MCUs etc.

MichaelNolan•1h ago
The current Taalas chip is for an 8B param model (Llama 3.1 8B). I so hope that they can get that up to the 30B range. Just imagine Gemma 4 or Qwen 3.6 at 17k tps.
ipdashc•1h ago
> 17k tps taalas chip

It's odd to me that I haven't heard anything about this approach (baking LLMs/weights into silicon directly) since. It seems almost common-sense that we're going to end up there eventually. And it feels like that point is drawing ever closer now that model capabilities, if not quite plateauing out, are at least getting to a "good enough" point for a LOT of use cases.

I wonder if it's being worked on in secret, if there's something about it that makes it infeasible, or if companies are really too nervous to lock in one model like that because the next one down the line could be a huge improvement. Re. infeasibility, I have heard that the Taalas demonstration chip ran Llama 3.1 8B (a pretty horrible model) and that even that took a massive number of transistors / die area. So it might just be the case that the good models are too big to fit on silicon?

wmf•1h ago
Good models will require multiple Taalas chips but Groq and Cerebras also require a lot of chips and that hasn't stopped them.
theowaway213456•1h ago
This seems like more competition for Cerebras? Am I understanding correctly?
HarHarVeryFunny•49m ago
This is just an uncut wafer - I don't think it's intended to be a wafer-scale chip.

Cerebras etch memory onto the wafer alongside the processing elements, but AFAIK OpenAI are going to be using HBM memory and a conventional chiplet design.

digitaltrees•33m ago
We’ve entered the “if you care about software, build hardware” phase of AI
zwarag•25m ago
What are the other phases? Or what are you referring to in general?
some-guy•21m ago
I have been eyeing what Taalas is doing [1] by making pure hardware models. The speed is absurd.

[1] https://taalas.com/products/

wmf•15m ago
“People who are really serious about software should make their own hardware.” ― Alan Kay
Mistletoe•26m ago
The similarities between the AI world and the crypto world are so much closer than any AI fanboy would ever admit.
flyinglizard•4m ago
I call BS. It’s probably a white label around existing Broadcom IP, impossible to go from zero to this kind of chip in nine months. I doubt OpenAI had any significant contribution.
vb-8448•13m ago
Did they also acquire BMC?
a_conservative•7m ago
Good information, Broadcom is a playa, lots and lots of acquisitions! (a quick google search turns up a very eventful history for Broadcom)

> From personal experience, executives and leadership who started off in the electronics and hardware industry are much more vicious and cutthroat than their peers who started in software.

Only The Paranoid Survive [0] is quite a name for a management book. It implies surviving in the world you are speaking about.

[0] https://www.goodreads.com/book/show/66863.Only_the_Paranoid_...

HarHarVeryFunny•1h ago
I just read a claim on Twitter that the reason these companies (Google and Amazon as well as OpenAI) are using Broadcom isn't just for design expertise, but because Broadcom have allocation agreements in place with TSMC and the memory manufacturers.
alephnerd•1h ago
Most design partners have allocation agreements. The thing is Broadcom is an absolute GIANT in the ASIC design space, and its closest competitor Marvell is a fraction of its size.

There are a lot of large tech companies that most of HN has never heard about that completely dominate entire segments.

Strawberry was too complicated as a codename.
