https://www.scan.co.uk/products/asus-ascent-gx10-desktop-ai-...
Asus makes some really useful things, but the v1 Tinker Board, for example, was fairly problem-ridden. This is similarly way out on the edge of their expertise; I'm not sure I'd buy an out-there v1 Asus product this expensive.
(Edit: GB of course, not MB, thanks buildbot)
I've fine-tuned diffusion models streaming from an SSD without a noticeable speed penalty at a high enough batch size.
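For anyone curious what that looks like in practice, here is a minimal sketch (not the commenter's actual setup) of streaming pre-encoded latents off disk with a PyTorch IterableDataset; the file layout and names are assumptions for illustration:

    import glob
    import numpy as np
    import torch
    from torch.utils.data import IterableDataset, DataLoader

    class LatentStream(IterableDataset):
        """Streams pre-encoded .npy latents from the SSD instead of holding them in RAM."""
        def __init__(self, pattern="latents/*.npy"):   # hypothetical file layout
            self.paths = sorted(glob.glob(pattern))

        def __iter__(self):
            info = torch.utils.data.get_worker_info()
            # shard the file list across DataLoader workers
            paths = self.paths if info is None else self.paths[info.id::info.num_workers]
            for path in paths:
                arr = np.load(path, mmap_mode="r")     # only touched pages hit RAM
                yield torch.from_numpy(np.array(arr))

    # Big batches plus background workers hide most of the SSD latency,
    # so the GPU rarely waits on the disk.
    loader = DataLoader(LatentStream(), batch_size=64, num_workers=4,
                        pin_memory=True, prefetch_factor=4)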
Either build a single-socket system and give it some DDR5 to work alongside, or go dual socket with a bit less DDR5 memory.
So it is severely underbaked, but the base gameplay is there. Roughly what you would expect out of an LLM given only the high-level objective. I would expect an hour or so of vibe coding would probably result in something reasonably complete before you started bumping up against the context window. I'm honestly kind of impressed that it worked at all given the minuscule amount of human input that went into that prompt.
I'm not quite sure how impressed to be by the LLM's output here. Surely there are quite a few simple Space Invaders implementations that made it into the training corpus. So the amount of work the LLM did here may have been relatively small; more of a simple regurgitation?
What do you think?
That is how Space Invaders originally worked: it used strips of colored cellophane to give the B&W graphics color, and the aliens moved behind a different colored strip on each level down. So, maybe not a whoops?
Edit: After some reading, I guess it was the second release of Space Invaders which had the aliens change color as they dropped; the first version only used the cellophane for a couple of parts of the screen.
Should you get one?
It’s a bit too early for me to provide a confident recommendation concerning this machine. As indicated above, I’ve had a tough time figuring out how best to put it to use, largely through my own inexperience with CUDA, ARM64 and Ubuntu GPU machines in general.
The ecosystem improvements in just the past 24 hours have been very reassuring though. I expect it will be clear within a few weeks how well supported this machine is going to be.

From rough memory, it was something along the lines of "it's an RTX, not RTX Pro, class of GPU", so the core layout is different from what he was basing his initial expectations on.
The Spark's GPU gets ~4x the FP16 compute performance of an M3 Ultra GPU on less than half the Mac Studio's total TDP.
Can you be a bit more specific about which technology you're actually referring to? "Unified memory" is just a marketing term; you could mean unified address space, dual-use memory controllers, SoC integration, or Northbridge coprocessors. All are technologies that Nvidia has shipped in consumer products at one point or another, though (Nintendo Switch, Tegra infotainment, 200X MacBook, to name a few).
And neither were really consumer offerings.
You do get about twice as much memory bandwidth out of the Mac though.
Nvidia cooperates with Khronos, implements open-source and proprietary APIs simultaneously, documents their GPU hardware, and directly supports community reverse-engineering projects like nouveau and NOVA with their salaried engineers.
Pretty much the only proprietary part is CUDA, and Nvidia emphatically supports the CUDA alternatives. Apple doesn't even let you run them.
1- I vote with my wallet: do I want to pay a company to be my digital overlord, doing everything it can to keep me inside its ecosystem? I put in too much effort to earn my freedom to give it up that easily.
2- Software: Almost certainly, I would want to run Linux on this. Do I want something that has, or will eventually have, great mainstream Linux support, or something with closed specs that the people behind Asahi try to support with incredible skill and effort? I prefer the system with openly available specs.
I've extensively used Macs, iPhones, and iPads over time. The only Apple device I ever bought was an iPad, and I would never have bought it if I had known they deliberately disable multitasking on it.
https://github.com/apple/container
> container is a tool that you can use to create and run Linux containers as lightweight virtual machines on your Mac. It's written in Swift, and optimized for Apple silicon.
Container does nothing to progress the state of supporting Linux on Apple Silicon. It does not replace macOS, iBoot or the other proprietary, undocumented or opaque software blobs on the system. All it does is keep people using macOS and purchasing Apple products and viewing Apple advertisements.
How can a serious company not notice these glaring issues in their websites?
It's not that they don't notice.
They don't care.
But I'm not surprised, this is ASUS. As a company, they don't really seem to care about software quality.
I learned the hard way that ASUS translates to "don't buy ever again".
>> What is the memory bandwidth supported by Ascent GX10?
> AI applications often require a bigger memory. With the NVIDIA Blackwell GPU that supports 128GB of unified memory, ASUS Ascent GX10 is an AI supercomputer that enables faster training, better real-time inference, and support larger models like LLMs.
You don't have to wonder: I bet they're using generative AI to speed up delivery velocity.
If they wanted to do that, they should have just omitted the question from their FAQ. An evasive answer in a FAQ is a giant footgun, because it just calls attention to the evasion.
> What is the memory bandwidth supported by Ascent GX10? AI applications often require a bigger memory. With the NVIDIA Blackwell GPU that supports 128GB of unified memory, ASUS Ascent GX10 is an AI supercomputer that enables faster training, better real-time inference, and support larger models like LLMs.
Which is appropriate, given the applications!
I see that they mention it uses LPDDR5x, so bandwidth will not be nearly as fast as something using HBM or GDDR7, even if bus width is large.
Edit: I found elsewhere that the GB10 has a 256-bit LPDDR5X-9400 memory interface, allowing for ~300GB/s of memory bandwidth.
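A quick back-of-envelope check of that figure, and what it implies for dense-model decode speed (the 70B-at-4-bit example is just an illustration):

    bus_bits = 256
    transfers_per_sec = 9.4e9                  # LPDDR5X-9400
    bandwidth = bus_bits / 8 * transfers_per_sec / 1e9
    print(f"peak bandwidth ~{bandwidth:.0f} GB/s")           # ~301 GB/s

    # Dense decode is roughly bandwidth-bound: each token reads every weight once.
    model_bytes = 70e9 * 0.5                   # a 70B model at 4-bit is ~35 GB
    print(f"decode upper bound ~{bandwidth * 1e9 / model_bytes:.1f} tok/s")   # ~8.6 tok/s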
File this one in the blue folder like the DGX
I think it's the reverse: the use case for these boxes is basically training and fine-tuning, not inference.
There are newer models, called "Mixture of Experts", that are, say, 120B parameters but only use about 5B parameters per token (the specific parameters are chosen by a much smaller routing model). That is the kind of model that excels on this machine. Unfortunately, again, those models work really well when doing hybrid inference, because the GPU can handle the small-but-computationally-complex fully connected layers while the CPU can handle the large-but-computationally-easy expert layers.
This product doesn't really have a niche for inference. Training and prototyping are another story, but I'm a noob on those topics.
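To make the routing idea above concrete, here is a toy sketch of top-k expert routing in PyTorch; the dimensions and expert count are made up and much smaller than any real 120B model:

    import torch
    import torch.nn.functional as F

    d_model, n_experts, top_k, d_ff = 2048, 64, 4, 8192    # toy numbers

    router = torch.nn.Linear(d_model, n_experts, bias=False)
    experts = torch.nn.ModuleList([
        torch.nn.Sequential(torch.nn.Linear(d_model, d_ff),
                            torch.nn.GELU(),
                            torch.nn.Linear(d_ff, d_model))
        for _ in range(n_experts)])

    def moe_forward(x):                         # x: (tokens, d_model)
        scores = router(x)                      # the tiny router runs for every token
        weights, idx = scores.topk(top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for k in range(top_k):                  # only top_k of n_experts are touched per token
            for e in idx[:, k].unique().tolist():
                mask = idx[:, k] == e
                out[mask] += weights[mask, k, None] * experts[e](x[mask])
        return out

    y = moe_forward(torch.randn(8, d_model))

Because only a few experts run per token, offloading the big expert weight matrices to CPU memory (as described above) costs relatively little compute.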
> What is the memory bandwidth supported by Ascent GX10?
> AI applications often require a bigger memory. With the NVIDIA Blackwell GPU that supports 128GB of unified memory, ASUS Ascent GX10 is an AI supercomputer that enables faster training, better real-time inference, and support larger models like LLMs.
I've never seen anything like that before. I wonder if this product page is actually finished and was meant to be public yet.
Fortunately, their products are also easy to crack open and probe.
It's a feature that requires two different _companies_ to collaborate to build. Mayhem.
You're free to lift the kernel and any drivers/libraries and run them on your distribution of choice, it'll just be hacky.
The kernel is patched (and maintained by Canonical, not Nvidia) but the patches hanging off their 6.17-next branch didn't look outrageous to me. The main hitch right now is that upstream doesn't have a Realtek r8127 driver for the ethernet controller. There were also some mediatek-related patches that were probably relevant as they designed the CPU die.
Overall it feels close to full upstream support (to be clear: you CAN boot this system with a fully upstream kernel, today). And booting with UEFI means you can just use the nvidia patches on $YOUR_FAVORITE_DISTRO and reboot, no need to fiddle with or inject the proper device trees or whatever.
Based on my experience it feels quite different and much closer to a normal x86 machine, probably intentionally. Maybe it helped that Nvidia did not design the full CPU complex; MediaTek did that.
[1] They even claim that Thor is now fully SBSA compliant (Xavier had UEFI, Orin had better UEFI, and now this), which would imply it has full UEFI + ACPI like the Spark. But when I looked at the kernel in their Thor L4T release, it looked like it was still loaded with Jetson-specific SoC drivers on top of a heavy fork of the PREEMPT_RT patch series for Linux 6.8; I did not look too hard, but it still didn't seem ideal. You can probably boot a "normal" OS, missing most of the actual Jetson-specific peripherals, I guess.
/etc/os-release:
PRETTY_NAME="Ubuntu 24.04.3 LTS"
NAME="Ubuntu"
VERSION_ID="24.04"
VERSION="24.04.3 LTS (Noble Numbat)"
VERSION_CODENAME=noble
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=noble
LOGO=ubuntu-logo
and /etc/dgx-release:
DGX_NAME="DGX Spark"
DGX_PRETTY_NAME="NVIDIA DGX Spark"
DGX_SWBUILD_DATE="2025-09-10-13-50-03"
DGX_SWBUILD_VERSION="7.2.3"
DGX_COMMIT_ID="833b4a7"
DGX_PLATFORM="DGX Server for KVM"
DGX_SERIAL_NUMBER="Not Specified"
While other Linux distros have already been reported to work, some tools provided by Nvidia won't work with Fedora or NixOS. Not yet!

I couldn't get Nvidia AI Workbench to start on KDE Neon even after changing DISTRIB_ID to Ubuntu in /etc/lsb-release. Neon is based on Ubuntu Noble too.
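For context, the kind of gate that lsb-release edit is trying to get past looks roughly like this (a hypothetical sketch, not NVIDIA's actual check):

    def read_release(path):
        fields = {}
        with open(path) as f:
            for line in f:
                if "=" in line:
                    key, _, value = line.strip().partition("=")
                    fields[key] = value.strip('"')
        return fields

    os_release = read_release("/etc/os-release")
    lsb = read_release("/etc/lsb-release")
    supported = (os_release.get("ID") == "ubuntu"
                 or lsb.get("DISTRIB_ID", "").lower() == "ubuntu")
    if not supported:
        raise SystemExit("unsupported distribution")   # what Fedora/NixOS/Neon users hit

If the tool also inspects /etc/os-release (as in this sketch), editing lsb-release alone wouldn't be enough, which could explain the Neon failure.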
Plus, of course, the software stack for gaming on this isn't available.
1) This still has raster hardware, even ray tracing cores. It's not technically an "AI focused card" like the AMD Instinct hardware or Nvidia's P40-style cards.
2) It kinda does have a stack. ARM is the hardest part to work around, but Box86 will get the older DirectX titles working. The GPU is Vulkan compliant too, so it should be able to leverage Proton/DXVK to accommodate the modern titles that don't break on ARM.
The tough part is the price. I don't think ARM gaming boxes will draw many people in with worse performance at a higher price.
But if gaming is what you're actually interested in, then it's a pretty terrible buy. You can get a much cheaper x86-based system with a discrete GPU that runs circles around this.
I am still trying to think of a use case that a Ryzen AI Max/MacBook or a plain gaming GPU cannot cover.
The last few jobs I've had were for binaries compiled to target ARM AArch64 SBC devices, and cross-compiling was sometimes annoying; you couldn't truly eat your own dogfood on workstations, as there are subtle things around atomics and memory-consistency guarantees that differ between ISAs.
Mac M-series machines are an option, except that then you're not running Linux, except in a VM, and that's awkward too. Or Asahi, which comes with its own constraints.
Having a beefy ARM machine at my desk natively running Linux would have pleased me greatly. Especially if my employer was paying for it.
At least with this, you get to pay both the Nvidia and the Asus tax!
- Price: $3k / $5k
- Memory: same (128GB)
- Memory bandwidth: ~273GB/s / ~546GB/s
- SSD: same (1 TB)
- GPU advantage: ~5x-10x depending on memory bottleneck
- Network: same 10GbE (via TB)
- Direct cluster: 200Gb / 80Gb
- Portable: No / Yes
- Free Mac included: No / Yes
- Free monitor: No / Yes
- Linux out of the box: Yes / No
- CUDA dev environment: Yes / No
W.r.t. IP networking, the fastest I'm aware of is 25Gb/s via TB5 adapters, like those from Sonnet.
The Asus clustering speed is not limited to p2p.
How is the monitor "free" if the Mac costs more?
For homelab use, this is the only thing that matters to me.
GMKtec EVO-X2 vs GX10 vs MacBook Pro M4 Max
Price: $2,199.99 / $3,000 / $5,000
CPU: Ryzen AI Max+ 395 (Strix Halo, 16C/32T) / NVIDIA GB10 Grace Blackwell Superchip (20-core Arm v9.2) / Apple M4 Max (12C)
GPU: Radeon 8060S (RDNA 3.5 iGPU) / Integrated Blackwell GPU (up to 1 PFLOP FP4) / 40-core integrated GPU
Memory: 128GB LPDDR5X / 128GB LPDDR5X unified / 128GB unified
Memory bandwidth: ???GB/s / ~273GB/s / ~546GB/s
SSD: 1TB PCIe 4.0 / 4TB PCIe 5.0 / 1TB NVMe
GPU advantage: Similar (EVO-X2 trades blows with GB10 depending on model and framework)
Network: 2.5GbE / 10GbE / 10GbE (via TB)
Direct cluster: 40Gb (USB4/TB4) / 200Gb / 80Gb
Portable: Semi (compact desktop) / No / Yes
Free Mac included: No / No / Yes
Free monitor: No / No / Yes
Linux out of the box: Yes / Yes / No
CUDA dev environment: No (ROCm) / Yes / No

From NVIDIA's perspective, they need an answer to the growing segment of SoCs with decent-sized GPUs and unified memory; their existing solutions at the far end of a PCIe link with a small pool of their own memory just can't work for some important use cases, and providing GPU chiplets to be integrated into other SoCs is how they avoid losing ground in these markets without the expense of building their own full consumer hardware platform and going to war with all of Apple, Intel, AMD, and Qualcomm at once.
They're not the best choice for anyone who wants to run LLMs as fast and cheap as possible at home. Think of it like a developer tool.
These boxes are confusing the internet because they've let the marketing teams run wild (or at least the marketing LLMs run wild) trying to make them out to be something everyone should want.
How great would it be if, instead of shoving these bots at us to help decipher the marketing speak, they just had the specs right up front?
No, I don't want to use your assistant, and you are forcing me to pointlessly click on the close button. Sometimes they even hide vital information behind their popup.
They seem to be the reincarnation of 2000s popups; there to satisfy a business manager versus actually being a useful tool.
What is the purpose of this thing?
Please just give me a good old HTML table with specs, will ya?
The Asus Ascent GX10 a Nvidia GB10 Mini PC with 128GB of Memory and 200GbE - https://news.ycombinator.com/item?id=43425935 - March 2025 (50 comments)
Edit: added via wmf's comment below:
"DGX Spark has only half the advertised performance" - https://news.ycombinator.com/item?id=45739844 - Oct 2025 (24 comments)
Nvidia DGX Spark: When benchmark numbers meet production reality - https://news.ycombinator.com/item?id=45713835 - Oct 2025 (117 comments)
Nvidia DGX Spark and Apple Mac Studio = 4x Faster LLM Inference with EXO 1.0 - https://news.ycombinator.com/item?id=45611912 - Oct 2025 (20 comments)
Nvidia DGX Spark: great hardware, early days for the ecosystem - https://news.ycombinator.com/item?id=45586776 - Oct 2025 (111 comments)
NVIDIA DGX Spark In-Depth Review: A New Standard for Local AI Inference - https://news.ycombinator.com/item?id=45575127 - Oct 2025 (93 comments)
Nvidia DGX Spark - https://news.ycombinator.com/item?id=45008434 - Aug 2025 (207 comments)
Nvidia DGX Spark - https://news.ycombinator.com/item?id=43409281 - March 2025 (10 comments)
To get a sense for use cases, see the playbooks on this website: https://build.nvidia.com/spark.
Regarding limited memory bandwidth: my impression is that this is part of the onramp for the DGX Cloud. Heavy lifting/production workloads will still need to be run in the cloud.
Fill up the memory with a large model, and most of your memory bandwidth will be waiting on compute shaders. Seems like a waste of $5,000 but you do you.
Also, the software NVIDIA has you install on another machine to use it is garbage. They tried to make it sort of appliance-y, but most people would rather just have SSH work out of the box and go from there. IMO just totally unnecessary. The software aspect was what put me over the edge.
Maybe the gen 2 will be better, but unless you have a really specific use case that this solves well, buy credits or something somewhere else.
That's all I want. It does not have to be fast, but it must be capable of doing all of that.
Oh, and it should be energy efficient. Very important for a 24/7 machine.
You'll need a model that can work with tools, like Llama 3.2 (https://huggingface.co/meta-llama); serve it, hook up MCPs, include an STT interface, and you're cooking.
Plus, you need to keep the card in a "ready" state; you can't idle/standby it completely.
You may have better results with semi-templated responses though.
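A minimal sketch of the tool-calling loop described a couple of comments up, assuming a local OpenAI-compatible server (e.g. llama-server) on port 8080; the tool name and schema are invented for illustration:

    import json
    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:8080/v1", api_key="unused")

    tools = [{
        "type": "function",
        "function": {
            "name": "set_light",                    # hypothetical home-automation tool
            "description": "Turn a light on or off",
            "parameters": {
                "type": "object",
                "properties": {"room": {"type": "string"},
                               "on": {"type": "boolean"}},
                "required": ["room", "on"],
            },
        },
    }]

    resp = client.chat.completions.create(
        model="llama-3.2",                          # whichever tool-capable model the server loaded
        messages=[{"role": "user", "content": "Turn off the kitchen light"}],
        tools=tools,
    )
    for call in resp.choices[0].message.tool_calls or []:
        print(call.function.name, json.loads(call.function.arguments))

An MCP client or an STT front end would slot in on either side of that call.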
https://www.servethehome.com/nvidia-dgx-spark-review-the-gb1...
If (and in the case of Nvidia that's a big if at the moment) they get their software straight on Linux for once, this piece of hardware seems to be something to keep an eye on.
from https://www.gmktec.com/blog/evo-x2-vs-nvidia-dgx-spark-redef... (text taken from https://wccftech.com/forget-nvidia-dgx-spark-amd-strix-halo-... since the GMKtec table was an image, but wccftech converted it to an HTML table. EDIT: reformatted to make the table look nicer in a monospace font w/o tabs)
Test Model      Metric                          EVO-X2    NVIDIA GB10   Winner
Llama 3.3 70B   Generation Speed (tok/sec)      4.90      4.67          AMD
                First Token Response Time (s)   0.86      0.53          NVIDIA
Qwen3 Coder     Generation Speed (tok/sec)      35.13     38.03         NVIDIA
                First Token Response Time (s)   0.13      0.42          AMD
GPT-OSS 20B     Generation Speed (tok/sec)      64.69     60.33         AMD
                First Token Response Time (s)   0.19      0.44          AMD
Qwen3 0.6B      Generation Speed (tok/sec)      163.78    174.29        NVIDIA
                First Token Response Time (s)   0.02      0.03          AMD

https://frame.work/nl/en/desktop?tab=machine-learning
So to me, the only thing that seems interesting about the Spark atm is the ability to daisy-chain several units together, so you can create an InfiniBand-ish network of Sparks at InfiniBand speeds.
But overall for just plain development and experimentation, and since I don't work at Big AI, I'm pretty sure I would not purchase Nvidia at the moment.
For comparison, as of right now, I can run GPT-OSS 120b @ 59 tok/sec, using llama.cpp (revision 395e286bc) and Unsloth dynamic 4-bit quantized models.[1] GPT-OSS 20b @ 88 tok/sec [2]. The MXFP4 variant comes in the same, at ~89 tok/sec[3]. It's probably faster on other frameworks, llama.cpp is known to not be the fastest. I don't know what LM Studio backend they used. All of these numbers put the GB10 well ahead of Strix Halo, if only going by the numbers we see here.
If the AMD software wasn't also comparatively optimized by the same amount in the same timeframe, then the GB10 would be faster now. Maybe it was optimized just as much; I don't have a Strix Halo part to compare. But my point is, don't just compare numbers from two different points in time; it's going to be very misleading.
[1]: https://huggingface.co/unsloth/gpt-oss-120b-GGUF/tree/main/U... [2]: https://huggingface.co/unsloth/gpt-oss-20b-GGUF/resolve/main... [3]: https://huggingface.co/unsloth/gpt-oss-20b-GGUF/resolve/main...
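The numbers above came straight from llama.cpp, but if you want to reproduce a rough tok/sec measurement yourself, here is a sketch using llama-cpp-python (the GGUF filename is an assumption; point it at whatever local quant you have):

    import time
    from llama_cpp import Llama

    llm = Llama(model_path="gpt-oss-20b-Q4_K_M.gguf",    # assumed local file
                n_gpu_layers=-1, n_ctx=4096, verbose=False)

    start = time.time()
    out = llm.create_completion("Explain unified memory in one paragraph.",
                                max_tokens=256)
    elapsed = time.time() - start
    tokens = out["usage"]["completion_tokens"]
    # Includes prompt processing, so it slightly understates pure decode speed.
    print(f"{tokens} tokens in {elapsed:.1f}s -> {tokens / elapsed:.1f} tok/s")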
Given Strix Halo is so much cheaper, I'd expect more people to work on improving it, but the NVIDIA tools are better, so it's unclear which has more headroom.
The pricing is definitely by far the worst part of all of this. I suspect the GB10 still has more perf left on the table, Blackwell has been a rough launch. But I'm not sure it's $2000 better if you're just looking to get a fun little AI machine to do embeddings/vision/LLMs on?
What exactly isn't working for you? The last two/three months I've been almost exclusively doing ML work (+CUDA) with a NVIDIA card on Linux, and everything seems to work out of the box, including debugging/introspection tools and everything else I've tried. As an extra plus, everything runs much faster on Linux than the very same hardware and software does on Windows.
1 petaFLOP using FP4, that's 4 petaFLOPS using FP1 and infinite petaFLOPS using FP0.
To turn your petaFLOP into petaSLOP.
Dell:
https://www.dell.com/en-us/shop/desktop-computers/dell-pro-m...
- $3,998.99 4TB SSD
- $3,699.00 2TB SSD
Lenovo:
https://www.lenovo.com/us/en/p/workstations/thinkstation-p-s...
- $3,999.00 4TB SSD
https://www.lenovo.com/us/en/p/workstations/thinkstation-p-s...
- $3,539.00 1TB SSD
At least the M5 Ultra should finally balance things, given the significant improvements to prompt processing in the M5 from what we've seen. Apple has had significantly higher memory bandwidth since the M1 series, now approaching 5 years old. Surely an Nvidia machine like this could have, at bare minimum, 500GB/s+ if they cared in the slightest about the competition.