frontpage.

Don’t Look Up: Sensitive internal links in the clear on GEO satellites [pdf]

https://satcom.sysnet.ucsd.edu/docs/dontlookup_ccs25_fullpaper.pdf
222•dweekly•5h ago•54 comments

NanoChat – The best ChatGPT that $100 can buy

https://github.com/karpathy/nanochat
1100•huseyinkeles•16h ago•212 comments

A Copy-and-Patch Tutorial

https://transactional.blog/copy-and-patch/tutorial
29•todsacerdoti•2h ago•3 comments

Why Study Programming Languages

https://people.csail.mit.edu/rachit/post/why-study-programming-languages/
20•bhasi•1h ago•7 comments

Palisades Fire suspect's ChatGPT history to be used as evidence

https://www.rollingstone.com/culture/culture-news/chatgpt-palisades-fire-suspect-1235443216/
92•quuxplusone•5d ago•66 comments

Dutch government takes control of Chinese-owned chipmaker Nexperia

https://www.cnbc.com/2025/10/13/dutch-government-takes-control-of-chinese-owned-chipmaker-nexperi...
424•piskov•21h ago•343 comments

No science, no startups: The innovation engine we're switching off

https://steveblank.com/2025/10/13/no-science-no-startups-the-unseen-engine-were-switching-off/
443•chmaynard•18h ago•323 comments

Sony PlayStation 2 fixing frenzy

https://retrohax.net/sony-playstation-2-fixing-frenzy/
110•ibobev•8h ago•39 comments

First device based on 'optical thermodynamics' can route light without switches

https://phys.org/news/2025-10-device-based-optical-thermodynamics-route.html
136•rbanffy•5d ago•16 comments

vali, a C library for Varlink

https://emersion.fr/blog/2025/announcing-vali/
26•GalaxySnail•3d ago•6 comments

New York Times, AP, Newsmax and others say they won't sign new Pentagon rules

https://apnews.com/article/pentagon-press-access-defense-department-rules-95878bce05096912887701e...
144•baobun•4h ago•39 comments

Show HN: SQLite Online – 11 years of solo development, 11K daily users

https://sqliteonline.com/
376•sqliteonline•18h ago•125 comments

Modern iOS Security Features – A Deep Dive into SPTM, TXM, and Exclaves

https://arxiv.org/abs/2510.09272
157•todsacerdoti•13h ago•3 comments

America is getting an AI gold rush instead of a factory boom

https://www.washingtonpost.com/business/2025/10/13/manufacturing-artificial-intelligence/
188•voxleone•16h ago•191 comments

DDoS Botnet Aisuru Blankets US ISPs in Record DDoS

https://krebsonsecurity.com/2025/10/ddos-botnet-aisuru-blankets-us-isps-in-record-ddos/
118•JumpCrisscross•8h ago•93 comments

LLMs are getting better at character-level text manipulation

https://blog.burkert.me/posts/llm_evolution_character_manipulation/
83•curioussquirrel•11h ago•53 comments

JIT: So you want to be faster than an interpreter on modern CPUs

https://www.pinaraf.info/2025/10/jit-so-you-want-to-be-faster-than-an-interpreter-on-modern-cpus/
117•pinaraf•1d ago•23 comments

Smartphones and being present

https://herman.bearblog.dev/being-present/
250•articsputnik•17h ago•162 comments

All in on MatMul? Don’t Put All Your Tensors in One Basket!

https://www.sigarch.org/dont-put-all-your-tensors-in-one-basket-hardware-lottery/
3•matt_d•5d ago•0 comments

Strudel REPL – a music live coding environment living in the browser

https://strudel.cc
148•birdculture•12h ago•26 comments

Why did containers happen?

https://buttondown.com/justincormack/archive/ignore-previous-directions-8-devopsdays/
108•todsacerdoti•19h ago•120 comments

A series of debugging sessions for Strimzi

https://github.com/fvaleri/strimzi-debugging
4•fvaleri•5d ago•0 comments

NVIDIA DGX Spark In-Depth Review: A New Standard for Local AI Inference

https://lmsys.org/blog/2025-10-13-nvidia-dgx-spark/
24•yvbbrjdr•6h ago•17 comments

Passt – Plug a Simple Socket Transport

https://passt.top/passt/about/
13•zdw•1w ago•1 comment

America's future could hinge on whether AI slightly disappoints

https://www.noahpinion.blog/p/americas-future-could-hinge-on-whether
120•jxmorris12•14h ago•106 comments

JSON River – Parse JSON incrementally as it streams in

https://github.com/rictic/jsonriver
179•rickcarlino•5d ago•80 comments

Abstraction, not syntax

https://ruudvanasseldonk.com/2025/abstraction-not-syntax
88•unripe_syntax•22h ago•45 comments

StreamingVLM: Real-Time Understanding for Infinite Video Streams

https://arxiv.org/abs/2510.09608
22•badmonster•7h ago•0 comments

Optery (YC W22) – Hiring Tech Lead with Node.js Experience (U.S. & Latin America)

https://www.optery.com/careers/
1•beyondd•14h ago

Scaling request logging with ClickHouse, Kafka, and Vector

https://www.geocod.io/code-and-coordinates/2025-10-02-from-millions-to-billions/
125•mjwhansen•5d ago•18 comments

NVIDIA DGX Spark In-Depth Review: A New Standard for Local AI Inference

https://lmsys.org/blog/2025-10-13-nvidia-dgx-spark/
24•yvbbrjdr•6h ago

Comments

SethTro•3h ago
The article doesn't seem to mention the price, which is $4,000. That makes it comparable to a 5090, but with 128GB of unified LPDDR5x vs. the 5090's 32GB of GDDR7.
CamperBob2•3h ago
And about 1/4 the memory bandwidth, which is what matters for inference.
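A back-of-envelope roofline makes the point. A minimal sketch in Python; the bandwidth figures and the active-parameter count are assumptions for illustration, not numbers from the thread:

  # Decode reads every active weight once per token, so memory bandwidth
  # sets the ceiling. All constants are assumptions for illustration.
  ACTIVE_PARAMS = 5.1e9        # assumed active params/token for gpt-oss-120b (MoE)
  BYTES_PER_PARAM = 4.25 / 8   # MXFP4 averages ~4.25 bits per weight

  def decode_ceiling_tps(bandwidth_gb_s: float) -> float:
      """Upper bound on decode tokens/s from weight traffic alone."""
      return bandwidth_gb_s * 1e9 / (ACTIVE_PARAMS * BYTES_PER_PARAM)

  print(f"~273 GB/s (Spark-class):   <= {decode_ceiling_tps(273):.0f} tok/s")
  print(f"~1000 GB/s (dGPU-class):   <= {decode_ceiling_tps(1000):.0f} tok/s")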
nialse•2h ago
Well, that’s disappointing, since the Mac Studio 128GB is $3,499. If Apple happens to launch a Mac Mini with 128GB of RAM, it would eat the Nvidia Spark's lunch every day.
newman314•1h ago
Agreed. I also wonder why they chose to test against a Mac Studio with only 64GB instead of 128GB.
yvbbrjdr•1h ago
Hi, author here. I crowd-sourced the devices for benchmarking from my friends. It just happened that one of my friends has this device.
ggerganov•1h ago
FYI you should have used llama.cpp to do the benchmarks. It performs almost 20x faster than ollama for the gpt-oss-120b model. Here are some sample results on my Spark:

  ggml_cuda_init: found 1 CUDA devices:
    Device 0: NVIDIA GB10, compute capability 12.1, VMM: yes
  | model                          |       size |     params | backend    | ngl | n_ubatch | fa |            test |                  t/s |
  | ------------------------------ | ---------: | ---------: | ---------- | --: | -------: | -: | --------------: | -------------------: |
  | gpt-oss 20B MXFP4 MoE          |  11.27 GiB |    20.91 B | CUDA       |  99 |     2048 |  1 |          pp4096 |       3564.31 ± 9.91 |
  | gpt-oss 20B MXFP4 MoE          |  11.27 GiB |    20.91 B | CUDA       |  99 |     2048 |  1 |            tg32 |         53.93 ± 1.71 |
  | gpt-oss 120B MXFP4 MoE         |  59.02 GiB |   116.83 B | CUDA       |  99 |     2048 |  1 |          pp4096 |      1792.32 ± 34.74 |
  | gpt-oss 120B MXFP4 MoE         |  59.02 GiB |   116.83 B | CUDA       |  99 |     2048 |  1 |            tg32 |         38.54 ± 3.10 |
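
For reference, a command along these lines should reproduce rows like the ones above (these are llama-bench's standard options; the model path is a placeholder):

  llama-bench -m gpt-oss-120b-mxfp4.gguf -ngl 99 -ub 2048 -fa 1 -p 4096 -n 32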
yvbbrjdr•1h ago
I see! Do you know what's causing the slowdown for ollama? They should be using the same backend...
__mharrison__•1h ago
Curious how this compares to running on a Mac.
rajatgupta314•53m ago
Is this the full-weight model or a quantized version? The GGUFs distributed on Hugging Face and labeled as MXFP4 quantization have layers quantized to int8 (q8_0) instead of bf16, as suggested by OpenAI.

Example: looking at blk.0.attn_k.weight, it's q8_0, among other layers:

https://huggingface.co/ggml-org/gpt-oss-20b-GGUF/tree/main?s...

Example: the same weight on Ollama is BF16:

https://ollama.com/library/gpt-oss:20b/blobs/e7b273f96360
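
A quick way to check this yourself; a minimal sketch assuming the gguf-py package that ships with the llama.cpp repo (pip install gguf), with a placeholder file path:

  # List each tensor's quantization type in a GGUF file.
  from gguf import GGUFReader

  reader = GGUFReader("gpt-oss-20b.gguf")  # placeholder path
  for tensor in reader.tensors:
      # tensor_type is a GGMLQuantizationType enum, e.g. Q8_0, BF16, MXFP4
      print(f"{tensor.name}: {tensor.tensor_type.name}")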

moondev•1h ago
Just don't try to run NCCL.
EnPissant•1h ago
A 5090 is $2000.
pixelpoet•2h ago
I wonder why they didn't test against the broadly available Strix Halo, with 128GB of 256 GB/s memory bandwidth and a 16-core full-fat Zen 5 with AVX-512 at $2k... it is a mystery...
yvbbrjdr•1h ago
Hi, author here. I crowd-sourced the devices for benchmarking from my friends. It just happened that none of my friends has this device.
EnPissant•49m ago
Something is wrong with your numbers: gpt-oss-20b and gpt-oss-120b should be much, much faster than what you are seeing. I would suggest you familiarize yourself with llama-bench instead of ollama.

Running gpt-oss-120b with an RTX 5090 and 2/3 of the experts offloaded to system RAM (less than half the memory bandwidth of this thing), my machine gets ~4100 tps prefill and ~40 tps decode.

Your spreadsheet shows the Spark getting ~94 tps prefill and ~11 tps decode.

Now, it's expected that my machine should slaughter this thing in prefill, but decode should be very similar, or the Spark a touch faster.

yvbbrjdr•28m ago
We actually profiled one of the models and saw that the final GEMM, which is completely memory-bound, takes a lot of time, which reduces the token speed by a lot.
EnPissant•1h ago
Strix Halo has the problem that prefill is incredibly slow unless your context is very small.

The only thing that might be interesting about this DGX Spark is that its prefill manages to be faster due to better compute. I haven't compared the numbers yet, but they are included in the article.
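
Rough arithmetic on why prefill leans on compute rather than bandwidth; a minimal sketch with illustrative assumed figures, not measured ones:

  # Prefill costs roughly 2 * active_params FLOPs per prompt token, and the
  # weights are reused across the whole batch, so arithmetic throughput sets
  # the ceiling rather than memory bandwidth.
  ACTIVE_PARAMS = 5.1e9  # assumed active params/token for gpt-oss-120b

  def prefill_ceiling_tps(tflops: float) -> float:
      """Upper bound on prompt tokens/s given usable compute throughput."""
      return tflops * 1e12 / (2 * ACTIVE_PARAMS)

  print(f"at an assumed 50 TFLOPS:  <= {prefill_ceiling_tps(50):,.0f} tok/s")
  print(f"at an assumed 200 TFLOPS: <= {prefill_ceiling_tps(200):,.0f} tok/s")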

hank808•43m ago
You guys who keep comparing the DGX Spark to the Mac Studios, please remember two things:

1. Virtually every model that you'd run was developed on Nvidia gear and will run on the Spark.

2. The Spark has fast-as-hell interconnects, the sort one would want in an actual AI DC, so you can use more than one Spark at the same time, with RDMA, and actually start to figure out how things work the way they do and why. You can do a lot with 200 Gb of interconnect.