Benchmarking rolvsparse on DeepSeek-R1 and Llama 4 – up to 82x vs. cuBLAS

https://rolv.ai/
1•heggenhougen•1h ago

Comments

heggenhougen•1h ago
We have been running rolvsparse, a sparse matrix library, on real model weights downloaded directly from Hugging Face, measuring throughput and energy against cuBLAS on an NVIDIA B200. Here are the results for five models so far.

DeepSeek-R1: all 256 MoE experts stacked into a 524,288 x 7,168 matrix. 78.9x throughput vs cuBLAS, 98.7% energy reduction, 5,294 effective TFLOPS. Operator build time 0.11 seconds.
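
For readers checking the arithmetic, here is a minimal Python sketch of how the stacked shape and an "effective TFLOPS" figure could be related. It assumes "effective" means dense-equivalent FLOPs (2 x rows x cols x batch) divided by wall-clock time; the post does not define the term, and the batch size and timing arguments are hypothetical parameters, not measured values.

```python
def rows_per_expert(total_rows: int, num_experts: int) -> int:
    """Rows contributed by each expert when experts are stacked row-wise."""
    if total_rows % num_experts != 0:
        raise ValueError("total_rows is not an even multiple of num_experts")
    return total_rows // num_experts


def dense_matmul_flops(rows: int, cols: int, batch: int) -> int:
    """FLOPs a dense (rows x cols) @ (cols x batch) product performs.

    Each output element needs `cols` multiplies and `cols` adds,
    hence the factor of 2.
    """
    return 2 * rows * cols * batch


def effective_tflops(rows: int, cols: int, batch: int, seconds: float) -> float:
    """Dense-equivalent throughput in TFLOPS (assumed definition).

    A sparse kernel does fewer real operations, so this number can
    exceed the hardware's dense peak.
    """
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return dense_matmul_flops(rows, cols, batch) / seconds / 1e12


if __name__ == "__main__":
    # Shape stated in the post: 256 experts stacked into 524,288 x 7,168.
    print(rows_per_expert(524_288, 256))
    print(dense_matmul_flops(524_288, 7_168, batch=1))
```

With the stated shape, each of the 256 experts contributes 2,048 rows. Under this definition the reported TFLOPS figure depends on the batch size used, which the post does not give.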

Llama 4 Scout: MoE FFN weights, 81.7x throughput, 98.8% energy reduction.

Mixtral 8x22B: 55.1x throughput across all 56 MoE layers, 98.2% energy reduction.

Qwen3-235B-A22B: 22.4x throughput, 95.5% energy reduction.

Llama 4 Maverick: 20.7x throughput, 81.5% energy reduction.
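
To compare the energy percentages with the throughput multipliers on the same scale, a percentage reduction can be converted to a ratio. This is a small sketch, assuming "energy reduction" means 1 - E_new / E_baseline for the same workload; the post does not spell out the definition.

```python
def energy_ratio(reduction_percent: float) -> float:
    """Convert an energy reduction percentage into a baseline/new ratio.

    Assumes reduction = 1 - E_new / E_baseline, so
    E_baseline / E_new = 1 / (1 - reduction).
    """
    if not 0 <= reduction_percent < 100:
        raise ValueError("reduction_percent must be in [0, 100)")
    return 1.0 / (1.0 - reduction_percent / 100.0)


if __name__ == "__main__":
    # (throughput multiplier, energy reduction %) as quoted in the post.
    reported = {
        "DeepSeek-R1": (78.9, 98.7),
        "Llama 4 Scout": (81.7, 98.8),
        "Mixtral 8x22B": (55.1, 98.2),
        "Qwen3-235B-A22B": (22.4, 95.5),
        "Llama 4 Maverick": (20.7, 81.5),
    }
    for model, (speedup, reduction) in reported.items():
        print(f"{model}: {speedup}x throughput, "
              f"{energy_ratio(reduction):.1f}x less energy")
```

Under that assumption, 98.7% works out to roughly 77x less energy, close to the 78.9x throughput figure, while 81.5% for Llama 4 Maverick is roughly 5.4x against a 20.7x throughput figure.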

Each result is SHA-256 verified against a normalized output hash. The same hash has been reproduced independently by the University of Miami across NVIDIA B200, AMD MI300X, Intel Xeon, and Apple M4 Pro hardware, published on Zenodo in December 2025.
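
One way a cross-hardware output hash could work is sketched below. This is not rolvsparse's published procedure: the normalization step (rounding to a fixed number of decimals and packing as little-endian float64) and the function name are assumptions made for illustration.

```python
import hashlib
import struct
from typing import Iterable


def normalized_output_hash(values: Iterable[float], decimals: int = 4) -> str:
    """SHA-256 over a normalized byte encoding of an output vector.

    Hypothetical normalization: round each value to `decimals` places so
    that small floating-point differences between devices do not change
    the digest, then pack as little-endian float64.
    """
    digest = hashlib.sha256()
    for value in values:
        rounded = round(float(value), decimals)
        if rounded == 0.0:
            rounded = 0.0  # collapse -0.0 and +0.0 to one encoding
        digest.update(struct.pack("<d", rounded))
    return digest.hexdigest()


if __name__ == "__main__":
    device_a = [1.00001, 2.0, -0.0]
    device_b = [1.00002, 2.0, 0.0]
    # Both vectors normalize to the same bytes, so the digests match.
    print(normalized_output_hash(device_a) == normalized_output_hash(device_b))
```

Rounding is a coarse choice: two values that straddle a rounding boundary still hash differently, so a real scheme would need a stated tolerance policy.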

The library works without model retraining, quantization, or hardware changes. It operates on the weight matrices directly.

We are happy to answer questions about methodology, the hardware counters, or anything else.


Nvidia's Groq Plot Thickens – The Chip Letter

https://thechipletter.substack.com/p/nvidias-groq-plot-thickens
1•rbanffy•2m ago•0 comments

The Latest Republican Efforts to Make It Harder to Vote in the Midterms

https://www.newyorker.com/news/the-lede/the-latest-republican-efforts-to-make-it-harder-to-vote-i...
1•mitchbob•3m ago•1 comment

The Dark Factory Is a .dot file

https://2389.ai/posts/the-dark-factory-is-a-dot-file/
1•paulsmith•3m ago•0 comments

Uber uses AI for development: inside look

https://newsletter.pragmaticengineer.com/p/how-uber-uses-ai-for-development
1•tmsh•3m ago•0 comments

Iowa Payphone Defends Itself (Associated Press, 1984)

https://www.payphone-project.com/iowa-payphone-defends-itself-ap-story-from-october-1984.html
1•TigerUniversity•4m ago•0 comments

Show HN: Quick Look Source Code in Finder on macOS

https://anybox.ltd/source-code-preview
1•francisfeng•6m ago•0 comments

Against Vibes: When Is a Generative Model Useful

https://www.williamjbowman.com/blog/2026/03/05/against-vibes-when-is-a-generative-model-useful/
1•takira•7m ago•0 comments

Show HN: KaraMagic – automatic karaoke video maker

https://karamagic.com/
1•godot•8m ago•0 comments

What comes after agents? AI employees

https://www.ycombinator.com/launches/Pf7-beyond-agents-the-era-of-ai-employees
1•karissaho•8m ago•0 comments

Photocopier No More: The Reckoning with AI Creativity Has Arrived

https://reviews.ofb.biz/safari/article/1401.html
1•trbutler•9m ago•0 comments

Inverse Occam's Razor

https://arxiv.org/abs/2204.08284
1•jerlendds•10m ago•0 comments

Tell HN: Apple development certificate server seems down?

4•strongpigeon•10m ago•1 comment

Mother of All Grease Fires

https://milk.com/wall-o-shame/bucket.html
2•xk3•10m ago•0 comments

6-Axis Milling for Enhancing Quality of Fused Granular Fabrication Parts

https://www.mdpi.com/2073-4360/18/5/608
1•PaulHoule•11m ago•0 comments

Working to Decentralize FedCM

https://atproto.com/blog/working-to-decentralize-fedcm
1•sgoto•11m ago•0 comments

Agent-sync – sync between Claude Code and Codex configs

https://github.com/matanabudy/agent-sync
1•matanabudy•12m ago•0 comments

Helix 02 living room tidy

https://www.youtube.com/watch?v=CAdTjePDBfc
1•hheikinh•13m ago•0 comments

Don't let LLMs write for you

https://justismills.substack.com/p/dont-let-llms-write-for-you
1•c-oreills•14m ago•0 comments

Deep Learning: Our Year 1990-1991

https://people.idsia.ch/~juergen/deep-learning-miraculous-year-1990-1991.html
1•untilted•16m ago•0 comments

Ask HN: I built an AI-native codebase framework–could you evaluate it?

1•xodn348•20m ago•1 comment

The Slowest Viral Thing

https://pilgrima.ge/p/the-slowest-viral-thing
1•momentmaker•21m ago•0 comments

SoftBank eyes up to $40B loan to fund OpenAI investment

https://www.reuters.com/business/media-telecom/softbank-seeks-up-40-billion-loan-finance-openai-i...
4•devonnull•21m ago•0 comments

SEIA Solar Market Insight Report 2025 Year in Review

https://seia.org/research-resources/us-solar-market-insight/
1•toomuchtodo•22m ago•0 comments

A vertical tab companion app for aerospace window manager

https://github.com/raghavendra-talur/aeromux
1•rtalur•23m ago•1 comment

Uber rolls out women-only option in the US

https://www.bbc.com/news/articles/cx2gvrzwdr7o
2•alephnerd•23m ago•0 comments

Meta Is Buying Moltbook

https://lifehacker.com/tech/meta-is-buying-moltbook
1•umangsehgal93•23m ago•1 comment

GoT Timeline – a daily timeline game to test your Game of Thrones skills

https://www.got-timeline.com
1•onion92•23m ago•0 comments

Claude Code makes local LLMs 90% slower

https://unsloth.ai/docs/basics/claude-code
4•telotortium•27m ago•1 comment

Eventbrite Enters into Definitive Agreement to Be Acquired by Bending Spoons

https://www.businesswire.com/news/home/20251202408560/en/Eventbrite-Enters-into-Definitive-Agreem...
5•DocFeind•28m ago•1 comment

Why doesn't V8 fit on my microcontroller? (2021)

https://medium.com/the-toit-take/why-doesnt-v8-fit-on-my-microcontroller-71dc6e2d8f5c
1•tosh•29m ago•0 comments