Show HN: A Zero-Copy 1.58-bit LLM Engine hitting 117 Tokens/s on single CPU core

https://github.com/r3-engine/r3-engine
2•dhilipsiva•1h ago
The Project: I am building R3-Engine, a from-scratch, local AI inference engine for Microsoft's bitnet-b1.58-2B-4T. It is written in 100% safe Rust, cross-compiles natively to Wasm SIMD128, and performs zero heap allocations in the execution loop.

The Physics: By mapping a 64-byte-aligned .r3 file directly from NVMe to CPU L3 cache (zero-copy) and using AVX-512 VPOPCNTDQ for branchless math, the engine reaches a throughput of 117 tokens/second on a Ryzen 9950X3D.
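For readers unfamiliar with popcount-based ternary math, here is a minimal scalar sketch of the idea. It is not taken from the R3-Engine codebase: the two-bitplane layout, the `TernaryWord` type, and the function names are all assumptions made for illustration, and it assumes both operands are ternary. `u64::count_ones` stands in for the per-lane bit count that VPOPCNTDQ performs on wider vectors.

```rust
/// Hypothetical packing: 64 ternary values stored as two bitplanes.
/// Bit i of `pos` set means +1, bit i of `neg` set means -1, neither means 0.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct TernaryWord {
    pub pos: u64,
    pub neg: u64,
}

/// Pack up to 64 values from {-1, 0, +1} into one word (setup code, not the hot loop).
pub fn pack_ternary(values: &[i8]) -> TernaryWord {
    assert!(values.len() <= 64, "one word holds at most 64 values");
    let mut word = TernaryWord { pos: 0, neg: 0 };
    for (i, &v) in values.iter().enumerate() {
        match v {
            1 => word.pos |= 1u64 << i,
            -1 => word.neg |= 1u64 << i,
            0 => {}
            _ => panic!("value outside the ternary range"),
        }
    }
    word
}

/// Dot product of two packed ternary vectors.
/// Matching signs contribute +1, opposite signs contribute -1, and any zero
/// contributes 0, so the loop body needs only AND, popcount, and add/sub.
pub fn ternary_dot(w: &[TernaryWord], x: &[TernaryWord]) -> i32 {
    assert_eq!(w.len(), x.len());
    let mut acc = 0i32;
    for (a, b) in w.iter().zip(x.iter()) {
        let agree = (a.pos & b.pos).count_ones() + (a.neg & b.neg).count_ones();
        let differ = (a.pos & b.neg).count_ones() + (a.neg & b.pos).count_ones();
        acc += agree as i32 - differ as i32;
    }
    acc
}
```

The result is a raw integer accumulator with no scale attached, which is why the scaling question raised later in the post matters.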

The Problem: The AI is mute (outputting <unk>). The matrix multiplication pipeline is mathematically complete, but the output is stuck at Token ID 0 (<unk>). The issue lies in the transition between the quantized weights and the float-based non-linear activations.

Where I need expert input:

    Weight Tying in BitNet: Microsoft's 2B model ties Embeddings with the LM Head. I am cloning the embedding matrix for the output projection, but I suspect a scaling factor is missing.

    RMSNorm & SiLU in 1.58-bit: How should the raw integer accumulators (from the VPOPCNTDQ loop) be scaled before entering the SiLU activation and the subsequent layer?

GitHub Repo: https://github.com/r3-engine/r3-engine

If you know the physics of LLM Logit Sampling or ternary activation math, I would love your eyes on the codebase.
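As a starting point for the scaling question, here is a hedged sketch of the generic dequantize-then-activate step. It assumes a scheme in which each weight is approximately `weight_scale * w_t` and each activation approximately `act_scale * x_q`, so an integer accumulator over `w_t * x_q` is rescaled by the product of the two scales. The function names, the scale parameters, and the greedy `argmax` are illustrative assumptions, not the engine's actual code or Microsoft's reference implementation.

```rust
/// Rescale a raw integer accumulator to a float.
/// Assumption: w ≈ weight_scale * w_t and x ≈ act_scale * x_q,
/// so sum(w * x) ≈ weight_scale * act_scale * sum(w_t * x_q).
pub fn dequantize(acc: i32, weight_scale: f32, act_scale: f32) -> f32 {
    acc as f32 * weight_scale * act_scale
}

/// SiLU activation: x * sigmoid(x), applied to the rescaled float value.
pub fn silu(x: f32) -> f32 {
    x / (1.0 + (-x).exp())
}

/// In-place RMSNorm with a learned per-channel gain.
pub fn rms_norm(x: &mut [f32], gain: &[f32], eps: f32) {
    assert_eq!(x.len(), gain.len());
    let mean_sq = x.iter().map(|v| v * v).sum::<f32>() / x.len() as f32;
    let inv_rms = 1.0 / (mean_sq + eps).sqrt();
    for (v, g) in x.iter_mut().zip(gain) {
        *v *= inv_rms * g;
    }
}

/// Greedy sampling: index of the largest logit.
/// Note: this returns 0 when every logit is identical or when logits[0] is NaN.
pub fn argmax(logits: &[f32]) -> usize {
    let mut best = 0;
    for (i, &v) in logits.iter().enumerate() {
        if v > logits[best] {
            best = i;
        }
    }
    best
}
```

One thing this sketch makes visible: a greedy `argmax` written this way lands on index 0 whenever all logits are equal (for example all zeros) or the first logit is NaN. Dumping the logits just before sampling may help tell a missing scale apart from a collapsed or non-finite output vector.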

Show HN: Promo and offer code sharing and discovery for apps

https://proffer.codes/
1•indest•41s ago•0 comments

We've given up on keeping our initial arch docs/spec up to date. Should I worry?

1•songzitheowang•54s ago•0 comments

Show HN: Sentinel – Zero-trust governance for AI Agents

https://github.com/azdhril/Sentinel
1•azdhril•3m ago•0 comments

Autonomous language-image generation loops converge to generic visual motifs

https://www.cell.com/patterns/fulltext/S2666-3899(25)00299-5?_returnURL=https%3A%2F%2Flinkinghub....
1•Thorentis•3m ago•0 comments

Be Skeptical of Solving AI Alignment with Vibes

https://flowerpetals.substack.com/p/the-unproven-art-of-summoning-an
1•nonveumann•5m ago•0 comments

Show HN: Lochle, a Wordle clone for Scottish Lochs

https://lochle.xyz/
1•rexfuzzle•10m ago•0 comments

Show HN: 500-cycle runtime test for long-horizon LLM coherence

https://zenodo.org/records/18369990
1•teugent•12m ago•0 comments

The Possessed Machines: Dostoevsky's Demons and the Coming AGI Catastrophe

https://possessedmachines.com/
2•shishy•14m ago•0 comments

Notes for January 19-25 (My Coding Agent Sandboxing Setup)

https://taoofmac.com/space/notes/2026/01/25/2030
1•rcarmo•14m ago•0 comments

Canada

https://www.jenn.site/on-canada/
1•nsm•15m ago•0 comments

Show HN: Make custom ASCII art t-shirts from your terminal

https://www.asciitee.com/
1•kaniksu•15m ago•0 comments

Stop Saying Boredom Is Good for Kids

https://www.fast.ai/posts/2025-12-03-boredom/
2•Ariarule•16m ago•0 comments

Tim Cook taps John Ternus to oversee Apple's design teams

https://9to5mac.com/2026/01/22/tim-cook-quietly-taps-john-ternus-to-oversee-apples-design-teams-r...
2•brandonb•19m ago•0 comments

Robert Moreno and the use of ChatGPT that defined his time at Sochi

https://www.beinsports.com/en-us/soccer/articles/robert-moreno-and-the-use-of-chatgpt-that-define...
1•frereubu•19m ago•0 comments

AI Tribalism

https://nolanlawson.com/2026/01/24/ai-tribalism/
19•zurvanist•21m ago•2 comments

Another rabbit hole: Paperless-ngx

https://blog.notmyhostna.me/posts/another-rabbit-hole-paperless-ngx
1•dewey•24m ago•0 comments

Ramp vs. Brex: How the underdog won

https://www.productmarketfit.tech/p/ramp-vs-brex-how-the-underdog-won
1•brandonb•26m ago•0 comments

Show HN: Helping my band rehearse remotely without installing a DAW

https://www.singtogether.app
1•vaneyckseme•27m ago•0 comments

Visualize public transit usage in Seattle

https://opentransitsoftwarefoundation.org/2026/01/visualize-transit-usage-across-puget-sound/
1•aaronbrethorst•27m ago•0 comments

Cholesterol levels cut in half with one-time gene editing drug in trial

https://www.nbcnews.com/health/health-news/harmful-cholesterol-levels-cut-half-one-time-gene-edit...
4•brandonb•28m ago•1 comments

Build Your Personal AI Assistant with Claude Code

https://www.ronforbes.com/blog/build-your-personal-ai-assistant-with-claude-code
1•ronforbes•29m ago•0 comments

Nexphone-A phone that runs Android, Linux, and Windows?

https://nexphone.com/blog/the-tale-of-nexphone-one-phone-every-computer
2•andrewjneumann•31m ago•1 comments

Out to Get You (2017)

https://thezvi.wordpress.com/2017/09/23/out-to-get-you/
1•Ariarule•34m ago•0 comments

Printable Grid Paper

https://grid-paper.daverupert.com/
3•samsolomon•34m ago•2 comments

Don't Write Evals for Fast-Moving Systems

https://simon.podhajsky.net/blog/2026/evaluating-clobsidian/
1•sim_pod•35m ago•0 comments

The Great Ministry of Defence-to-Palantir Pipeline (UK)

https://www.opendemocracy.net/en/palantir-ministry-defence-hire-four-officials-2025-record-defenc...
3•mraniki•37m ago•1 comments

Hive – Plugin-based dev orchestrator with zero-downtime deploys

1•mgorunuch•37m ago•0 comments

Show HN: GPU rental marketplace – RTX 4090s $15/HR (BTC payments)

https://cool-paws-lead.loca.lt
1•paco-gpu-empire•38m ago•1 comments

Nearly half of Detroit seniors spend 30% or more of income on housing costs

https://theconversation.com/nearly-half-of-detroit-seniors-spend-at-least-30-of-their-income-on-h...
2•PaulHoule•39m ago•0 comments

Real-Time Electricity Prices and Grid Operations Across the US

https://www.gridstatus.io/live
4•kmax12•40m ago•0 comments