Nvidia DGX Spark: When Benchmark Numbers Meet Production Reality

https://publish.obsidian.md/aixplore/Practical+Applications/dgx-lab-benchmarks-vs-reality-day-4
57•RyeCatcher•2h ago

Comments

RyeCatcher•1h ago
Would love to hear from others using the Spark for model training and development.
stuckinhell•59m ago
I'm utterly shocked at the article saying GPU inference (PyTorch/Transformers) isn't working: numerical instability produces bad outputs, it's not viable for real-time serving, wait for driver/CUDA updates!

My job just got me and our entire team a DGX Spark. I'm impressed at the ease of use for ollama models I couldn't run on my laptop. gpt-oss:120b is shockingly better than what I expected based on running the 20b model on my laptop.

The DGX has changed my mind about the future being small specialized models.

jasonjmcghee•55m ago
You're shocked because that isn't your experience? From the article it sounds like ollama runs CPU inference, not GPU inference. Is that the case for you?
RyeCatcher•43m ago
Totally agree. I’ve been training nanochat models all morning. Hit some speed bumps; I’ll share more later in another article. But it’s absolutely amazing. I fine-tuned a Gemma3 model in a day yesterday.
jsheard•50m ago
No mention of the monstrous 200GbE NIC, seems like a waste if people aren't finding a use for it.
RyeCatcher•42m ago
Need to buy 2 and connect em. :-)
RyeCatcher•49m ago
I absolutely love it. I’ve been up for days playing with it, but there are some bleeding-edge issues. I tried to write a balanced article. I would highly recommend it for people who love to get their hands dirty. It blows away any consumer GPU.
furyofantares•44m ago
Since the text is obviously LLM output, how much prompting and editing went into this post? Did you have to correct anything it got wrong or added that was incorrect?
enum•17m ago
+1

I have H100s to myself, and access to more GPUs than I know what to do with in national clusters.

The Spark is much more fun, and I’m more productive. With two of them you can debug shallow NCCL/MPI problems before hitting a real cluster. I sincerely love Slurm, but there’s nothing like a personal computer.
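For what it's worth, here's a minimal sketch of the kind of two-Spark NCCL sanity check I mean, assuming PyTorch with CUDA on each box; the hostname, port, and filename are placeholders, not anything official:

  # Minimal two-node NCCL smoke test (hypothetical hostname/port/filename).
  # Run on each Spark, e.g.:
  #   MASTER_ADDR=spark-0 MASTER_PORT=29500 WORLD_SIZE=2 RANK=<0|1> python nccl_check.py
  import torch
  import torch.distributed as dist

  def main():
      # Reads MASTER_ADDR, MASTER_PORT, WORLD_SIZE and RANK from the environment.
      dist.init_process_group(backend="nccl")
      rank = dist.get_rank()
      torch.cuda.set_device(0)  # one GPU per Spark
      x = torch.ones(1 << 20, device="cuda") * (rank + 1)
      dist.all_reduce(x, op=dist.ReduceOp.SUM)  # with two ranks, every element should be 3.0
      torch.cuda.synchronize()
      print(f"rank {rank}: all_reduce ok, x[0] = {x[0].item()}")
      dist.destroy_process_group()

  if __name__ == "__main__":
      main()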

veber-alex•39m ago
The llama.cpp issues are strange.

There are official benchmarks of the Spark running multiple models just fine on llama.cpp:

https://github.com/ggml-org/llama.cpp/discussions/16578

RyeCatcher•37m ago
Cool, I’ll have a look. All the reflections I made were first-pass stuff.
CaptainOfCoit•24m ago
There weren't any instructions on how the author got ollama/llama.cpp; could it possibly be something Nvidia shipped with the DGX Spark, and an old version?
eadwu•31m ago
There are bleeding-edge issues; everyone dials into transformers, so that path is generally pain-free.

I haven't exactly bisected the issue, but I'm pretty sure convolutions are broken on sm_121 past a certain size: I'm getting a 20x memory blowup from a convolution after a 2x batch size increase, _only_ on the DGX Spark (a rough repro sketch is below).
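Roughly the kind of repro I have in mind, assuming PyTorch on CUDA; the layer shapes and batch sizes here are placeholders, not the exact configuration where I saw the blowup:

  # Rough repro: compare peak CUDA memory for the same convolution at 1x vs 2x batch size.
  import torch
  import torch.nn as nn

  def peak_mem_mib(batch_size):
      torch.cuda.empty_cache()
      torch.cuda.reset_peak_memory_stats()
      conv = nn.Conv2d(64, 128, kernel_size=3, padding=1).cuda()
      x = torch.randn(batch_size, 64, 512, 512, device="cuda")
      _ = conv(x)
      torch.cuda.synchronize()
      return torch.cuda.max_memory_allocated() / 2**20

  for b in (8, 16):
      print(f"batch {b}: peak {peak_mem_mib(b):.0f} MiB")
  # On well-behaved hardware/drivers the peak should roughly double here,
  # not blow up by ~20x as described above.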

I haven't had any problems with inference, but I also don't use the transformers library that much.

llama.cpp was working for gpt-oss last time I checked, and on release; not sure if something broke along the way.

I don't know whether memory fragmentation is something fixable on the driver side. This might just be a problem with the kernel's policy and the GPL, which keep them from automatically interfering with the memory subsystem at the granularity they'd like (see ZFS and its page-table antics), or so my thinking goes.

If you've done stuff on WSL you'll have hit similar issues, and you can work around them by running a service that periodically compacts and cleans memory; I have it run every hour (a sketch is below). Note that this does impact at the very least CPU performance and memory allocation speed, but I haven't had any issues with long training runs (24h+) with it in place. Assuming that is even the issue: I've never tried without the service, since I put it in place as soon as I got the machine because of my experience on WSL.
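A minimal sketch of the compaction service I mean, assuming a Linux kernel that exposes /proc/sys/vm/compact_memory and drop_caches and a process running with root privileges; the hourly interval is what I use, the rest is illustrative:

  # Hourly memory compaction/cleanup loop; run as root (e.g. from a systemd unit).
  # Writes to the standard Linux vm knobs; interval and logging are illustrative.
  import os
  import time

  def compact_and_drop_caches():
      os.sync()  # flush dirty pages before dropping caches
      with open("/proc/sys/vm/drop_caches", "w") as f:
          f.write("3\n")  # drop page cache, dentries and inodes
      with open("/proc/sys/vm/compact_memory", "w") as f:
          f.write("1\n")  # ask the kernel to compact fragmented physical memory

  if __name__ == "__main__":
      while True:
          compact_and_drop_caches()
          print("memory compacted", flush=True)
          time.sleep(3600)  # once an hour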

Using Atomic State to Improve React Performance in Deeply Nested Component Trees

https://runharbor.com/blog/2025-10-26-improving-deeply-nested-react-render-performance-with-jotai...
1•18nleung•2m ago•0 comments

Seeking Work: Greater Bay Area Remote: Yes (US-Based)

1•kimhoffman•3m ago•0 comments

Alpha launch – .well-known/avatar – feedback wanted

https://shkspr.mobi/blog/2025/10/alpha-launch-well-known-avatar-feedback-wanted/
2•birdculture•4m ago•0 comments

Show HN: A Minimal Playwright Skill for Claude Code

https://github.com/noiv/skill-playwright-minimal
1•noiv•6m ago•0 comments

Looking for a Winter/Spring 2026 SWE Internships

https://drive.google.com/file/d/1EmNsncIcqVbfh4c41SIVQPU3A1mFcj8E/view?usp=sharing
1•ambha21•8m ago•1 comments

Collective Communication for 100k+ GPUs

https://arxiv.org/abs/2510.20171
1•ingve•12m ago•0 comments

Show HN: AI Models Group Chat

https://chat.llmgateway.io/group
1•smakosh•14m ago•0 comments

Chinese and U.S. Officials Reach Framework of a Trade Deal

https://www.nytimes.com/2025/10/26/business/china-us-trade.html
2•aspenmayer•14m ago•1 comments

The MP3.com Rescue Barge Barge

https://blog.somnolescent.net/2025/09/mp3-com-rescue-barge-barge/
1•CharlesW•18m ago•0 comments

Trump and Xi will 'consummate' TikTok deal on Thursday, treasury secretary says

https://techcrunch.com/2025/10/26/trump-and-xi-will-consummate-tiktok-deal-on-thursday-treasury-s...
2•aspenmayer•19m ago•2 comments

Mechanize AI: Life after work

https://www.mechanize.work/blog/life-after-work/
1•colesantiago•19m ago•0 comments

The mysterious figure accused of masterminding a $14B crypto scam

https://www.bbc.com/news/articles/c70jz8e00g1o
2•paulpauper•19m ago•0 comments

Climbing Gyms Took over the World

https://thehustle.co/originals/how-climbing-gyms-took-over-the-world
1•paulpauper•21m ago•0 comments

Kebabs Are Consequential

https://www.lrb.co.uk/the-paper/v47/n19/adam-mars-jones/kebabs-are-consequential
1•paulpauper•21m ago•0 comments

H.P. Lovecraft: The King of Weird (1996)

https://www.nybooks.com/articles/1996/10/31/the-king-of-weird/
3•mitchbob•25m ago•1 comments

Relational Charades: Turning Movies into Tables

https://duckdb.org/2025/10/27/movies-in-databases
1•chmaynard•27m ago•0 comments

Practical Defenses Against Technofascism

https://micahflee.com/practical-defenses-against-technofascism/
2•HotGarbage•28m ago•0 comments

Valkey 9.0 Released with Ability to Achieve One Billion Requests / Second

https://www.phoronix.com/news/Valkey-9.0-Released
2•ksec•30m ago•0 comments

The Ethics in Our Algorithms: When Code Contradicts Conduct

https://blog.thecodejedi.online/2025/10/code-of-conduct-hidden-moral-frameworks.html
1•eddealmeida•30m ago•1 comments

Ask HN: Amazon kindle can't update daylight saving time

1•zeristor•32m ago•1 comments

Can a new blood test detect ME/CFS? An expert unpacks new research

https://theconversation.com/can-a-new-blood-test-really-detect-me-cfs-an-expert-unpacks-new-resea...
2•PaulHoule•39m ago•0 comments

EPYC Turin vs. Xeon 6 Granite Rapids vs. Graviton4 AWS M8 Instance Benchmarks

https://www.phoronix.com/review/aws-m8a-m8g-m8i-benchmarks
2•ksec•45m ago•0 comments

Solarized – A Break Down

https://ethanschoonover.com/solarized/
1•dduplex•47m ago•0 comments

The 1920s Immigration Mistake America May Repeat

https://www.bloomberg.com/opinion/articles/2025-10-25/the-1920-s-immigration-mistake-america-may-...
3•wslh•48m ago•3 comments

NeuroMark – Yet another bookmark organizer for Firefox

https://addons.mozilla.org/en-GB/firefox/addon/neuromark/
1•dwamei•51m ago•0 comments

How to Use Zorn's Lemma

https://gowers.wordpress.com/2008/08/12/how-to-use-zorns-lemma/
1•perihelions•52m ago•0 comments

How indexes make your queries fast

https://wizardzines.com/comics/indexes/
1•chmaynard•52m ago•0 comments

Show HN: Typegraph – type-level graphs of Rust types

https://github.com/nicksenger/typegraph
1•bietroi•56m ago•0 comments

Nanoimprint Lithography: Stop Saying It Will Replace EUV

https://newsletter.semianalysis.com/p/nanoimprint-lithography-stop-saying
2•cpard•1h ago•0 comments

Fintech will hire you if you're a bad writer

https://eleanorwarnock.substack.com/p/why-this-fintech-fires-bad-writers
2•itoshinoeri•1h ago•0 comments