Arm's Cortex A725 Ft. Dell's Pro Max with GB10

https://chipsandcheese.com/p/arms-cortex-a725-ft-dells-pro-max

61•pixelpoet•1w ago

Comments

crest•1w ago

I would love to see a comparison between the A725 and X925 cores.

geerlingguy•1w ago

Not quite in the same depth, but there are some more general benchmarks across all cores and latencies here: https://github.com/geerlingguy/sbc-reviews/issues/92

arjie•1w ago

Wow, this repo and the ai-benchmarks repo are the ones I wanted https://github.com/geerlingguy/ai-benchmarks/issues/34

Thank you for doing these. Earned a star and a watch from me on both! Minor sponsor donation as gratitude.

Would be sick to have an RSS feed for your data releases.

geerlingguy•1w ago

Will consider that at some point; a lot of the time is just spent getting the data, heh.

ksec•1w ago

Note to myself: Cortex X925 was originally called X5. The Current Generation X930 is now called C1-Ultra used in Mediatek 9500.

pinnochio•1w ago

Apologies for the tangent, but isn't this like saying "sliced tomato featuring BLT sandwich"?

trynumber9•1w ago

No. It's trying to analyze the CPU core but clarifies the device under test as that may have performance implications. There is cooling and possibly manufactured configured power limits.

pinnochio•1w ago

I get what they're doing. I've never seen that phrasing before.

cmrdporcupine•1w ago

This is awesome. I'm going to have to spend some time digging over this.

I got one of these GB10s, but the ASUS variety. So far fairly happy with it. Most days I don't remember I'm on ARM.

It's pretty performant, snappy, about the same speed as my other mini PC, a Ryzen 9 7940HS Minisforum UM 790 Pro, but with double the amount of cores and many times the amount of RAM.

storystarling•1w ago

Have you tried running any local LLMs via llama.cpp? I am curious if that high RAM is effectively usable as unified memory for larger models. I wonder if the memory bandwidth is sufficient to get decent performance on something like a 70b model or if it bottlenecks.

justaboutanyone•1w ago

You can run large-ish MoE model at good speeds, like gpt-oss-120b, it's snappy enough even with big context.

But large and dense at the same time is a bit slow.

Running a local LLM will be a load of money for something much slower than the api providers though.

storystarling•1w ago

Makes sense regarding the MoE performance. I am not sure the cost argument holds up for high volume workloads though. If you are running batch jobs 24/7 the hardware pays for itself in a few months compared to API opex. It really just comes down to utilization.

storystarling•1w ago

Do you have specific t/s numbers for those dense models? I'm curious just how severe the memory bandwidth bottleneck gets in practice.

I'm not sure I agree on the cost aspect though. For high-volume production workloads the API bills scale linearly and can get painful fast. If you can amortize the hardware over a year and keep the data local for privacy, the math often works out in favor of self-hosting.

justaboutanyone•1w ago

For Qwen2.5-72B-Instruct-Q5_K_M at 32k context, I fed it a 26k token file (truncated fiction novel) asking it to summarize, and it input processed at 224 tok/s and output generated at 3 tok/s. Not really good enough for interactive use without frustration. Not just from watching it reply, but also the long wait for it to actually read the book.

On the same hardware gpt-oss-120b at 128k context, I fed it a longer version of the input (a whole novel, 97k tok), and it input processed at 1650 tok/s and output generated at 27 tok/s. Just fast enough IMO

cmrdporcupine•1w ago

I bought it primarily so I could learn some of the toolchain for fine-tuning / training stuff, not so much for running inference, which its only "ok" at.

If I was primarily interested in that, I would have probably bought one of the cheaper Strix Halo machines.

It's also just a decent non-Mac ARM64 workstation, with large quantities of RAM. Which in 2026 is a bit of unicorn.

Cppsp v1.4.5–custom pattern-driven, nested, namespace-scoped templates

The next frontier in weight-loss drugs: one-time gene therapy

At Age 25, Wikipedia Refuses to Evolve

Show HN: ReviewReact – AI review responses inside Google Maps ($19/mo)

Why AlphaTensor Failed at 3x3 Matrix Multiplication: The Anchor Barrier

Ask HN: How much of your token use is fixing the bugs Claude Code causes?

Show HN: Agents – Sync MCP Configs Across Claude, Cursor, Codex Automatically

Hello

FSD helped save my father's life during a heart attack

Show HN: Writtte – Draft and publish articles without reformatting, anywhere

Portuguese icon (FROM A CAN) makes a simple meal (Canned Fish Files) [video]

Brookhaven Lab's RHIC Concludes 25-Year Run with Final Collisions

Transcribe your aunts post cards with Gemini 3 Pro

.72% Variance Lance

ReKindle – web-based operating system designed specifically for E-ink devices

Encrypt It

NextMatch – 5-minute video speed dating to reduce ghosting

Personalizing esketamine treatment in TRD and TRBD

SpaceKit.xyz – a browser‑native VM for decentralized compute

NotebookLM: The AI that only learns from you

Show HN: An open-source starter kit for developing with Postgres and ClickHouse

Game Boy Advance d-pad capacitor measurements

South Korean crypto firm accidentally sends $44B in bitcoins to users

Apache Poison Fountain

Web.whatsapp.com appears to be having issues syncing and sending messages

Google in Your Terminal

Shannon: Claude Code for Pen Testing: #1 on Github today

Anthropic: Latest Claude model finds more than 500 vulnerabilities

Brooklyn cemetery plans human composting option, stirring interest and debate

Why the 'Strivers' Are Right

Cppsp v1.4.5–custom pattern-driven, nested, namespace-scoped templates

The next frontier in weight-loss drugs: one-time gene therapy

At Age 25, Wikipedia Refuses to Evolve

Show HN: ReviewReact – AI review responses inside Google Maps ($19/mo)

Why AlphaTensor Failed at 3x3 Matrix Multiplication: The Anchor Barrier

Ask HN: How much of your token use is fixing the bugs Claude Code causes?

Show HN: Agents – Sync MCP Configs Across Claude, Cursor, Codex Automatically

Hello

FSD helped save my father's life during a heart attack

Show HN: Writtte – Draft and publish articles without reformatting, anywhere

Portuguese icon (FROM A CAN) makes a simple meal (Canned Fish Files) [video]

Brookhaven Lab's RHIC Concludes 25-Year Run with Final Collisions

Transcribe your aunts post cards with Gemini 3 Pro

.72% Variance Lance

ReKindle – web-based operating system designed specifically for E-ink devices

Encrypt It

NextMatch – 5-minute video speed dating to reduce ghosting

Personalizing esketamine treatment in TRD and TRBD

SpaceKit.xyz – a browser‑native VM for decentralized compute

NotebookLM: The AI that only learns from you

Show HN: An open-source starter kit for developing with Postgres and ClickHouse

Game Boy Advance d-pad capacitor measurements

South Korean crypto firm accidentally sends $44B in bitcoins to users

Apache Poison Fountain

Web.whatsapp.com appears to be having issues syncing and sending messages

Google in Your Terminal

Shannon: Claude Code for Pen Testing: #1 on Github today

Anthropic: Latest Claude model finds more than 500 vulnerabilities

Brooklyn cemetery plans human composting option, stirring interest and debate

Why the 'Strivers' Are Right

Arm's Cortex A725 Ft. Dell's Pro Max with GB10

Comments