frontpage.

OpenCiv3: Open-source, cross-platform reimagining of Civilization III

https://openciv3.org/
521•klaussilveira•9h ago•146 comments

The Waymo World Model

https://waymo.com/blog/2026/02/the-waymo-world-model-a-new-frontier-for-autonomous-driving-simula...
855•xnx•14h ago•515 comments

How we made geo joins 400× faster with H3 indexes

https://floedb.ai/blog/how-we-made-geo-joins-400-faster-with-h3-indexes
68•matheusalmeida•1d ago•13 comments

Show HN: Look Ma, No Linux: Shell, App Installer, Vi, Cc on ESP32-S3 / BreezyBox

https://github.com/valdanylchuk/breezydemo
176•isitcontent•9h ago•21 comments

Monty: A minimal, secure Python interpreter written in Rust for use by AI

https://github.com/pydantic/monty
177•dmpetrov•9h ago•78 comments

Show HN: I spent 4 years building a UI design tool with only the features I use

https://vecti.com
288•vecti•11h ago•130 comments

Dark Alley Mathematics

https://blog.szczepan.org/blog/three-points/
67•quibono•4d ago•11 comments

Microsoft open-sources LiteBox, a security-focused library OS

https://github.com/microsoft/litebox
342•aktau•15h ago•167 comments

Sheldon Brown's Bicycle Technical Info

https://www.sheldonbrown.com/
336•ostacke•15h ago•90 comments

Show HN: If you lose your memory, how to regain access to your computer?

https://eljojo.github.io/rememory/
236•eljojo•12h ago•143 comments

Hackers (1995) Animated Experience

https://hackers-1995.vercel.app/
431•todsacerdoti•17h ago•224 comments

Unseen Footage of Atari Battlezone Arcade Cabinet Production

https://arcadeblogger.com/2026/02/02/unseen-footage-of-atari-battlezone-cabinet-production/
6•videotopia•3d ago•0 comments

PC Floppy Copy Protection: Vault Prolok

https://martypc.blogspot.com/2024/09/pc-floppy-copy-protection-vault-prolok.html
40•kmm•4d ago•3 comments

An Update on Heroku

https://www.heroku.com/blog/an-update-on-heroku/
369•lstoll•15h ago•252 comments

Delimited Continuations vs. Lwt for Threads

https://mirageos.org/blog/delimcc-vs-lwt
12•romes•4d ago•1 comments

Show HN: ARM64 Android Dev Kit

https://github.com/denuoweb/ARM64-ADK
14•denuoweb•1d ago•2 comments

How to effectively write quality code with AI

https://heidenstedt.org/posts/2026/how-to-effectively-write-quality-code-with-ai/
218•i5heu•12h ago•162 comments

Why I Joined OpenAI

https://www.brendangregg.com/blog/2026-02-07/why-i-joined-openai.html
87•SerCe•5h ago•74 comments

Female Asian Elephant Calf Born at the Smithsonian National Zoo

https://www.si.edu/newsdesk/releases/female-asian-elephant-calf-born-smithsonians-national-zoo-an...
17•gmays•4h ago•2 comments

Introducing the Developer Knowledge API and MCP Server

https://developers.googleblog.com/introducing-the-developer-knowledge-api-and-mcp-server/
38•gfortaine•7h ago•10 comments

Learning from context is harder than we thought

https://hy.tencent.com/research/100025?langVersion=en
162•limoce•3d ago•81 comments

Show HN: R3forth, a ColorForth-inspired language with a tiny VM

https://github.com/phreda4/r3
60•phreda4•8h ago•11 comments

I spent 5 years in DevOps – Solutions engineering gave me what I was missing

https://infisical.com/blog/devops-to-solutions-engineering
126•vmatsiiako•14h ago•51 comments

Understanding Neural Network, Visually

https://visualrambling.space/neural-network/
261•surprisetalk•3d ago•35 comments

I now assume that all ads on Apple news are scams

https://kirkville.com/i-now-assume-that-all-ads-on-apple-news-are-scams/
1027•cdrnsf•18h ago•428 comments

FORTH? Really!?

https://rescrv.net/w/2026/02/06/associative
54•rescrv•17h ago•18 comments

WebView performance significantly slower than PWA

https://issues.chromium.org/issues/40817676
16•denysonique•5h ago•2 comments

I'm going to cure my girlfriend's brain tumor

https://andrewjrod.substack.com/p/im-going-to-cure-my-girlfriends-brain
106•ray__•6h ago•51 comments

Evaluating and mitigating the growing risk of LLM-discovered 0-days

https://red.anthropic.com/2026/zero-days/
44•lebovic•1d ago•14 comments

Show HN: Smooth CLI – Token-efficient browser for AI agents

https://docs.smooth.sh/cli/overview
83•antves•1d ago•60 comments

AMD's AI Future Is Rack Scale 'Helios'

https://morethanmoore.substack.com/p/amds-ai-future-is-rack-scale-helios
133•rbanffy•7mo ago

Comments

halJordan•7mo ago
Honestly that was a hard read. I hope that guy gets an mi355 just for writing this.

AMD deserves exactly zero of the credulity this writer heaps onto them. They just spent four months not supporting their RDNA4 lineup in ROCm after launch; AMD is apparently only capable of day-120 support. None of the benchmarks disambiguated where the performance is coming from. 100% they are lying on some level, comparing their FP4 performance against FP8/16.

pclmulqdq•7mo ago
AMD doesn't care about you being able to do computing on their consumer GPUs. The datacenter GPUs have a pretty good software stack and great support.
caycep•7mo ago
this is ROCm?
fooblaster•7mo ago
Yes, the MI300X/MI250 are best supported, as they directly compete with datacenter GPUs from Nvidia, which actually make money. Desktop is a rounding error by comparison.
fc417fc802•7mo ago
I'm inclined to believe it but that difference is exactly how nvidia got so far ahead of them in this space. They've consistently gone out of their way to put their GPGPU hardware and software in the hands of the average student and professional and the results speak for themselves.
zombiwoof•7mo ago
Just look at the disaster of ROCm, where you need to spend 300k on software engineers to get anything to work
tormeh•7mo ago
I wouldn't say so. Nvidia bet on machine learning a decade or so before AMD got the memo. That was a good bet on Nvidia's part. In 2015 you just had to have an Nvidia card if you wanted to do ML research. Sure, Nvidia did hand them out in some cases, but even if you bought an AMD card it just wouldn't work. It was Nvidia or go home. Even if AMD now did everything right (and they don't), there's a decade+ of momentum in Nvidia's favor.
viewtransform•7mo ago
AMD is offering AMD Developer Cloud (https://www.amd.com/en/blogs/2025/introducing-the-amd-develo...)

"25 complimentary GPU hours (approximately $50 US of credit for a single MI300X GPU instance), available for 10 days. If you need additional hours, we've made it easy to request additional credits."

archerx•7mo ago
If they care about their future they should. I am a die hard AMD supporter and even I am getting over their mediocrity and what seems to be constant self sabotage in the GPU department.
zombiwoof•7mo ago
It’s the AMD management. They just keep recycling 20-year VP lifers at AMD to take over key projects
archerx•7mo ago
They could have slapped 48GB of VRAM on their new Radeon cards and they would have instantly sold out, but that would cut into their cousin's profit margin at Nvidia, so that's obviously a no-go.
booder1•7mo ago
I have trained on both large AMD and Nvidia clusters, and you're right, AMD support is good. I never had to talk to Nvidia support. That was better.

They should care about the availability of their hardware so large customers don't have to find and fix their bugs. Let consumers do that...

echelon•7mo ago
> AMD doesn't care about you being able to do computing on their consumer GPUs

Makes it a little hard to develop for without consumer GPU support...

stingraycharles•7mo ago
Yes, but then they fail to understand that a lot of "long tail" home projects, open-source work, etc. is done on consumer GPUs at home, which is tremendously important for ecosystem support.
cma•7mo ago
Nvidia started removing NVLink with the 4000 series; they aren't heavily focused on it anymore either, and want to sell the workstation cards for uses like training models at home.
7speter•7mo ago
As a beginner, you can get a lot done on a 4060ti/5060ti 16gb, much less a 4090/5090.

Heck, I’ve been able to work through the early chapters of the FastAI book using a lowly Quadro P1000

wmf•7mo ago
What if they understand that and they don't care? Getting one hyperscaler as a customer is worth more than the entire long tail.
selectodude•7mo ago
Then they’re fools. Every AI maestro knows CUDA because they learned it at home.
jiggawatts•7mo ago
It’s the same reason there’s orders of magnitude more code written for Linux than for mainframes.
stingraycharles•7mo ago
The problem is that this is short-term thinking. You need students and professionals playing around with your tools at home and/or on their work computers to drive hyperscale demand in the long term.

This is why it’s so important AMD gets their act together quickly, as the benefits of these kind of things are measured in years, not months.

danielheath•7mo ago
Why would a hyperscaler pick the technology that’s harder to hire for (because there’s no hobbyist-to-expert pipeline)?
moffkalast•7mo ago
Then they will stay irrelevant in the GPU space like they have been so far.
lhl•7mo ago
On the corp side you have FB w/ PyTorch, xformers (still pretty iffy on AMD support tbh) and MS w/ DeepSpeed. But let's see about some others:

Flash Attention: academia, 2y behind for AMD support

bitsandbytes: academia, 2y behind for AMD support

Marlin: academia, no AMD support

FlashInfer: academia/startup, no AMD support

ThunderKittens: academia, no AMD support

DeepGEMM, DeepEP, FlashMLA: ofc, nothing from China supports AMD

Without the long tail AMD will continue to always be in a position where they have to scramble to try to add second tier support years later themselves, while Nvidia continues to get all the latest and greatest for free.

This is just off the top of my head on the LLM side, where I'm focused, btw. Whenever I look at image/video it's even more grim.

jimmySixDOF•7mo ago
Modular says Max/Mojo will change this and make refactoring between different vendors (and different lines from the same vendor) less of a showstopper, but that's TBD for now
pjmlp•7mo ago
The jury is still out on whether Max/Mojo is going to be something that the large majority cares about.
littlestymaar•7mo ago
Why should we care about them if they don't care?

I mean, if they want to stay at a fraction of the market value and profit of their direct competitor, good for them.

dummydummy1234•7mo ago
I want a competitive market so I can have cheaper gpus.

It's Nvidia, AMD, and maybe Intel.

shmerl•7mo ago
Aren't they addressing it with the unified UDNA architecture? That's going to be a thing in future GPUs, making consumer and datacenter ones share the same arch.

Different architectures were probably a big reason for the above issue.

fooker•7mo ago
It’s the same software stack.
pjmlp•7mo ago
Except they forget people get to adopt technologies by learning them on their consumer hardware.
jchw•7mo ago
I still find their delay in properly investing in ROCm on client to be rather shocking, but in fairness they did finally announce that they would be supporting client cards on day 1[1]. Of course, AMD has to keep the promise for it to matter, but they really do seem to have, for whatever reason, finally realized just how important it is that ROCm is well-supported across their entire stack (among many other investments they've announced recently).

It's baffling that AMD is the same company that makes both Ryzen and Radeon, but the year-to-date for Radeon has been very good, aside from the official ROCm support for RDNA4 taking far too long. I wouldn't get overly optimistic; even if AMD finally committed hard to ROCm and Radeon it doesn't mean they'll be able to compete effectively against NVIDIA, but the consumer showing wasn't so bad so far with the 9070 XT and FSR4, so I'm cautiously optimistic they've decided to try to miss some opportunities to miss opportunities. Let's see how long these promises last... Maybe longer than a Threadripper socket, if we're lucky :)

[1]: https://www.phoronix.com/news/AMD-ROCm-H2-2025

roenxi•7mo ago
Is this day 1 support a claim about the future or something they've demonstrated? Because if it involves the future it is safer to just assume AMD will muck it up somehow when it comes to their AI chips. It isn't like their failure in the space is a weird one-off - it has been confusingly systemic for years. It'd be nice if they pull it off, but it could easily be day 1 support for a chip that turns out to crash the computer.

I dunno; I suppose they can execute on server parts. But regardless, a good plan here is to let someone else go first and report back.

jchw•7mo ago
They've been able to execute well for Ryzen, EPYC, and Radeon in the data center. I don't really think there's any reason to believe they can't or even wouldn't be able to do ROCm on client cards, but up until recently they wouldn't commit.
zombiwoof•7mo ago
Exactly.

AMD is a marketing company now

ethbr1•7mo ago
> I hope that guy gets an mi355 just for writing this. AMD deserves exactly zero of the credulity this writer heaps onto them.

You mean Ryan Smith of late AnandTech fame?

https://www.anandtech.com/author/85/

kombine•7mo ago
I hope AMD can produce a chip that matches the H100 in training workloads.
moralestapia•7mo ago
You mean a slower chip?

Their MI300s already beat it, and the 400s are coming soon.

Vvector•7mo ago
Chip speed isn't as important as good software
moralestapia•7mo ago
The software is the same, AMD is not doing its own LLMs.
jjice•7mo ago
I think the software they were referring to is CUDA and the developer experience around the nvidia stack.
moralestapia•7mo ago
???

Know any LLMs that are implemented in CUDA?

wmf•7mo ago
Ultimately all of them except Gemini.
moralestapia•7mo ago
Wrong.

Show me one single CUDA kernel in Llama's source code.

(and that's a really easy one, if one knows a bit about it)

rnrn•7mo ago
removing comment since I regret attempting to engage in this thread
moralestapia•7mo ago
Wrong.

It is the same PyTorch whether it runs on an AMD or an NVIDIA GPU.

The exact same PyTorch, actually.

Are you trying to suggest that the machine code that runs on the GPU is what's different?

If you knew a bit more, you would know that this is the case even between different generations of GPUs from the same vendor, which makes that argument completely absurd.
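
To make that concrete, here's a minimal sketch (assuming either an official CUDA or ROCm PyTorch wheel is installed): the user-facing code is identical on both stacks, and only the build metadata tells you which backend you're on.

    import torch

    # Both the CUDA and ROCm builds of PyTorch expose the same torch.cuda API;
    # the ROCm wheel simply routes it through HIP under the hood.
    print(torch.__version__)       # e.g. "2.x.x+cu121" or "2.x.x+rocm6.x"
    print(torch.version.cuda)      # CUDA toolkit version, or None on a ROCm build
    print(torch.version.hip)       # HIP version on a ROCm build, or None on a CUDA build

    if torch.cuda.is_available():  # True on either vendor's build with a working GPU
        x = torch.randn(1024, 1024, device="cuda")
        print((x @ x).device, torch.cuda.get_device_name(0))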

rnrn•7mo ago
removing comment since I regret attempting to engage in this thread
imtringued•7mo ago
The average consumer uses llama.cpp. So here is your list of kernels: https://github.com/ggml-org/llama.cpp/tree/master/ggml/src/g...

And here is pretty damning evidence that you're full of shit: https://github.com/ggml-org/llama.cpp/blob/master/ggml/src/g...

The ggml-hip backend references the ggml-cuda kernels. The "software is the same" (as in, it is CUDA) and yet AMD is still behind.

moralestapia•7mo ago
@tomhow personal attack
lhl•7mo ago
Last year I had issues using MI300X for training, and when it did work, it was about 20-30% slower than an H100. But I'm doing some OpenRLHF (transformers/DeepSpeed-based) DPO training atm w/ the latest ROCm and PyTorch and it seems to be doing OK, roughly matching GPU-hour perf w/ an H200 for small ~12h runs.

Note: previous testing I did was on a single (8x) MI300X node, currently I'm doing testing on just a single MI300X GPU, so not quite apples-to-apples, multi-GPU/multi-node training is still a question mark, just a single data point.

fooker•7mo ago
It's even more jarring when you consider that the H100 is about three years old now.
aetherspawn•7mo ago
I hear [“Atropos log, abandoning Helios”](https://returnal.fandom.com/wiki/Helios) and have an emotional reaction every time this comes up in the news.
zombiwoof•7mo ago
AMD's future should be figuring out how to reproduce the performance numbers they “claim” they are getting
user____name•7mo ago
Is Bob Page leading the effort?
alecco•7mo ago
Jensen knows what he is doing with the CUDA stack and workstations. AMD needs to beat that more than thinking about bigger hardware. Most people are not going to risk years learning an arcane stack for an architecture that is used by less than 10% of the GPGPU market.
rbanffy•7mo ago
Indeed. The stories I hear about software support for their entry-level hardware aren't great. Having a good on-ramp is essential.

OTOH, by emphasizing datacenter hardware, they can cover a relatively small portfolio and maximize access to it via cloud providers.

As much as I'd love to see an entry-level MI350-A workstation, that's not something that will likely happen.

pjmlp•7mo ago
Additionally, when people discuss CUDA they always think about C, ignoring that it has been C++-first since CUDA 3.0 and also has Fortran support, and that NVidia has always embraced having multiple languages able to play in PTX land as well.

And as of 2025, there is a Python CUDA JIT DSL as well.

Also, even if not on the very latest version, the fact that the CUDA SDK works on any consumer laptop with NVidia hardware means anyone can slowly get into CUDA, even if their hardware isn't that great.
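
As a rough illustration of Python-level CUDA JIT, here's a sketch using Numba's cuda.jit as a stand-in (the newer NVIDIA Python DSL mentioned above has its own API):

    import numpy as np
    from numba import cuda

    @cuda.jit
    def saxpy(a, x, y, out):
        i = cuda.grid(1)              # global thread index
        if i < out.size:
            out[i] = a * x[i] + y[i]

    n = 1 << 20
    x = np.random.rand(n).astype(np.float32)
    y = np.random.rand(n).astype(np.float32)
    out = np.zeros_like(x)

    threads = 256
    blocks = (n + threads - 1) // threads
    saxpy[blocks, threads](2.0, x, y, out)  # Numba handles host<->device copies here
    print(out[:4])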

hyperbovine•7mo ago
I'm willing to bet almost nobody you know calls the CUDA API directly. What AMD needs to focus on is getting the ROCm backend going for XLA and PyTorch. That would unlock a big slice of the market right there.

They should also be dropping free AMD GPUs off helicopters, as Nvidia did a decade or so ago, in order to build up an academic userbase. Academia is getting totally squeezed by industry when it comes to AI compute. We're mostly running on hardware that's 2 or 3 generations out of date. If AMD came in with a well-supported GPU that cost half what an A100 sells for, voilà: you'd have cohort after cohort of grad students training models on AMD and then taking that know-how into industry.
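
To illustrate the point about nobody calling the CUDA API directly: a typical training step is written entirely at the framework level, so which vendor's backend runs it depends on the installed PyTorch build, not the user code. A minimal sketch:

    import torch
    import torch.nn as nn

    # Nothing below names CUDA, ROCm, or XLA explicitly.
    device = "cuda" if torch.cuda.is_available() else "cpu"

    model = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 10)).to(device)
    opt = torch.optim.AdamW(model.parameters(), lr=3e-4)
    loss_fn = nn.CrossEntropyLoss()

    x = torch.randn(64, 512, device=device)
    target = torch.randint(0, 10, (64,), device=device)

    opt.zero_grad()
    loss = loss_fn(model(x), target)
    loss.backward()
    opt.step()
    print(f"loss={loss.item():.4f} on {device}")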

bwfan123•7mo ago
Indeed. The user-facing software stack componentry - PyTorch and JAX/XLA - is owned by Meta and Google and open-sourced. Further, the open-source models (Llama/DeepSeek) are largely hardware-agnostic. There is really no user or ecosystem lock-in. Also, clouds are highly incentivized to have multiple hardware alternatives.
pjmlp•7mo ago
HN keeps forgetting that game development and VFX exist.
hyperbovine•7mo ago
What fraction of Nvidia revenue comes from those applications?
pjmlp•7mo ago
Let's put it this way: they need graphics cards, and CUDA is now relatively common.

For example, OTOY OctaneRender, one of the key renderers in Hollywood.

akshayt•7mo ago
About 0.1% from professional visualization in Q1 this year
aseipp•7mo ago
There already is ROCm support for PyTorch. Then there's stuff like this: https://semianalysis.com/2024/12/22/mi300x-vs-h100-vs-h200-b...

They have improved since that article, by a decent amount from my understanding. But by now, it isn't enough to have "a backend". The historical efforts have spoiled that narrative so badly that it won't be enough to just have a pytorch-rocm pypi package; some of that flak is unfair, though not completely unsubstantiated. Frankly, they need to deliver better software, across all their offerings, for multiple successive generations before the bad optics around their software stack start fading. Their competitors have already moved on to their next-gen architectures since that article was written.

You are correct that people don't really invoke CUDA APIs much, but that's partially because those APIs actually work and deliver good performance, so things can actually be built on top of them.

7speter•7mo ago
AMD isn’t doing what you’re proposing, but it seems intel is a few months out from this.
cedws•7mo ago
At this point it looks to me like something is seriously broken internally at AMD resulting in their software stack being lacklustre. They’ve had a lot of time to talk to customers about their problems and spin up new teams, but as far as I’ve heard there’s been very little progress, despite the enormous incentives. I think Lisa Su is a great CEO but perhaps not shaking things up enough in the software department. She is from a hardware background after all.
bwfan123•7mo ago
There used to be a time when hw vendors begrudgingly put out sample driver code consisting of one file with 5,000 lines of C code that just barely worked. The quality of software was not really a priority, as most of the revenue was from hw sales. That was reflected in the quality of hires and incentive structures.
AlexanderDhoore•7mo ago
Can someone with more knowledge give me a software overview of what AMD is offering?

Which SDKs do they offer that can do neural network inference and/or training? I'm just asking because I looked into this a while ago and felt a bit overwhelmed by the number of options. It feels like AMD is trying many things at the same time, and I’m not sure where they’re going with all of it.

Minks•7mo ago
ROCm really is hit or miss depending on the use case.

Plus their consumer card support is questionable, to say the least. I really wish it were a viable alternative, but swapping to CUDA saved me some headaches and a ton of time.

Having to run MIOpen benchmarks for HIP can take forever.

m_mueller•7mo ago
Exactly the same has been said over and over again, ever since CUDA took off for scientific computing around 2010. I don't really understand why, 15 years later, AMD still hasn't been able to copy the recipe, and frankly it may be too late now with all that mindshare in NVIDIA's software stack.
bigyabai•7mo ago
It's just not easy. Even if AMD was willing to invest in the required software, they would need a competitive GPU architecture to make the most of it. It's a lot easier to split 'cheap raster' and 'cheap inference' into two products, despite Nvidia's success.
7speter•7mo ago
Well, AMD is supposed to be releasing UDNA next year, which will presumably ‘unite’ capabilities like raster and inference within one architecture.
bayindirh•7mo ago
Just remember that 4 of the top 10 Top500 systems run on AMD Instinct cards, based on the latest June 2025 list announced at ISC Hamburg.

NVIDIA has a moat for smaller systems, but that is not true for clusters.

As long as you have a team to work with the hardware you have, performance beats mindshare.

wmf•7mo ago
HPC has probably been holding AMD back from the much larger AI market.
pjmlp•7mo ago
Custom builds with top paid employees to make the customer happy.
bayindirh•7mo ago
What do you mean?
convolvatron•7mo ago
Presumably that in HPC you can dump enough money into individual users to make the platform useful in a way that is impossible in a more horizontal market. In HPC it used to be fairly common to get one of only five machines with a processor architecture that had never existed before, dump a bunch of energy into making it work for you, and then throw it all out after six years.
pjmlp•7mo ago
Besides sibling comment, HPC labs are the kind of customers that get hardware companies to fly in engineers when there is a problem bringing down the compute cluster.
aseipp•7mo ago
The Top500 is an irrelevant comparison; of course AMD is going to give direct support to single institutions that give them hundreds of millions of dollars and help make their products work acceptably. They would be dead if they didn't. Nvidia also does the same thing to their major clients, and yet they still make their products actually work day 1 on consumer products, too.

Nvidia of course has a shitload more money, and they've been doing this for longer, but that's just life.

> smaller systems

El Capitan is estimated to cost around $700 million or something with like 50k deployed MI300 GPUs. xAI's Colossus cluster alone is estimated to be north of $2 billion with over 100k GPUs, and that's one of ~dozens of deployed clusters Nvidia has developed in the past 5 years. AI is a vastly bigger market in every dimension, from profits to deployments.

pjmlp•7mo ago
What really matters is how much of "Software++: ROCm 7 Released" can I use on a regular consumer laptop, like I can with CUDA.
numpad0•7mo ago
fyi: ROCm support status currently isn't crucial for casual AI users - standard proprietary AMD drivers include Vulkan API support going back ~10 years. It's slower, but llama.cpp supports it, and so do many one-click automagic LLM apps like LM Studio.
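For instance, a minimal sketch using the llama-cpp-python bindings, assuming the underlying llama.cpp was built with the Vulkan backend and using a hypothetical local GGUF file (details vary by build):

    from llama_cpp import Llama

    # n_gpu_layers=-1 offloads all layers to whatever GPU backend the build has
    # (Vulkan in this scenario); the Python-level API is the same either way.
    llm = Llama(
        model_path="./models/example-7b.Q4_K_M.gguf",  # hypothetical model file
        n_gpu_layers=-1,
        n_ctx=4096,
    )

    out = llm("Q: Why do consumer GPUs matter for ML ecosystems?\nA:", max_tokens=64)
    print(out["choices"][0]["text"])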
Paradigma11•7mo ago
Don't call us, we will call you when that future is the present.