
The Project: I am building R3-Engine, a from-scratch, local AI inference engine for Microsoft's bitnet-b1.58-2B-4T. It is written in 100% safe Rust, cross-compiles natively to Wasm SIMD128, and performs zero heap allocations in the execution loop.

The Physics: By mapping a 64-byte-aligned .r3 file directly from NVMe to CPU L3 cache (zero-copy) and using AVX-512 VPOPCNTDQ for branchless math, the engine reaches a throughput of 117 tokens/second on a Ryzen 9950X3D.

The Problem: The AI is mute (outputting <unk>). The matrix multiplication pipeline is mathematically complete, but the output is stuck at Token ID 0 (<unk>). The issue lies in the transition between the quantized weights and the float-based non-linear activations.
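One way to narrow this down is to inspect the final logits before sampling. A result pinned at index 0 is what a plain argmax returns when every logit is NaN or when all logits are equal. The sketch below is only a diagnostic under that assumption; `LogitReport` and `inspect_logits` are hypothetical names, not functions from the repo.

```rust
/// Summary of one logits vector. Hypothetical helper type, not part of R3-Engine.
#[derive(Debug, PartialEq)]
pub struct LogitReport {
    pub nan_count: usize,
    pub inf_count: usize,
    pub min: f32,
    pub max: f32,
    /// Index of the largest non-NaN logit, or None if every value is NaN.
    pub argmax: Option<usize>,
}

/// Scans the logits once, without allocating, and reports NaN/Inf counts,
/// the value range, and a NaN-aware argmax.
pub fn inspect_logits(logits: &[f32]) -> LogitReport {
    let mut report = LogitReport {
        nan_count: 0,
        inf_count: 0,
        min: f32::INFINITY,
        max: f32::NEG_INFINITY,
        argmax: None,
    };
    for (i, &v) in logits.iter().enumerate() {
        if v.is_nan() {
            report.nan_count += 1;
            continue; // NaN never participates in min/max/argmax
        }
        if v.is_infinite() {
            report.inf_count += 1;
        }
        if v < report.min {
            report.min = v;
        }
        // Strict `>` keeps the first index on ties, so an all-equal
        // vector reports index 0, matching the symptom described above.
        if report.argmax.is_none() || v > report.max {
            report.max = v;
            report.argmax = Some(i);
        }
    }
    report
}
```

If `nan_count` is nonzero or `min == max`, the bug is upstream of sampling (scaling or normalization) rather than in the sampler itself.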

Where I need expert input:

    Weight Tying in BitNet: Microsoft's 2B model ties Embeddings with the LM Head. I am cloning the embedding matrix for the output projection, but I suspect a scaling factor is missing.

    RMSNorm & SiLU in 1.58-bit: How should the raw integer accumulators (from the VPOPCNTDQ loop) be scaled before entering the SiLU activation and the subsequent layer?
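For reference, here is a minimal sketch of the scaling I would expect, assuming the scheme described in the BitNet b1.58 paper: ternary weights with a per-tensor absmean scale, and activations quantized to int8 with a per-token absmax scale. Under that assumption each integer accumulator is multiplied by `weight_scale * act_absmax / 127` before any float non-linearity. All function names below are hypothetical, and whether the .r3 file stores the scales this way is an open question. If the tied embedding matrix is stored in floating point (not verified here), the output projection would be a plain float matmul with no ternary scale at all.

```rust
/// Quantizes one token's activations to int8 with per-token absmax scaling.
/// Writes into `out` (no allocation) and returns the absmax needed later
/// for dequantization.
pub fn quantize_activations(x: &[f32], out: &mut [i8]) -> f32 {
    // Floor the absmax so an all-zero input does not divide by zero.
    let absmax = x.iter().fold(0.0f32, |m, v| m.max(v.abs())).max(1e-5);
    let scale = 127.0 / absmax;
    for (o, v) in out.iter_mut().zip(x) {
        *o = (v * scale).round().clamp(-127.0, 127.0) as i8;
    }
    absmax
}

/// Converts one integer accumulator back to f32.
/// Assumption: `weight_scale` is the per-tensor absmean of the original
/// float weights, `act_absmax` is the value returned by `quantize_activations`.
pub fn dequantize(acc: i32, weight_scale: f32, act_absmax: f32) -> f32 {
    acc as f32 * weight_scale * act_absmax / 127.0
}

/// SiLU on an already dequantized value: x * sigmoid(x).
pub fn silu(x: f32) -> f32 {
    x / (1.0 + (-x).exp())
}

/// In-place RMSNorm over float values, applied before re-quantizing
/// for the next ternary matmul.
pub fn rms_norm(x: &mut [f32], gain: &[f32], eps: f32) {
    let mean_sq = x.iter().map(|v| v * v).sum::<f32>() / x.len() as f32;
    let inv_rms = 1.0 / (mean_sq + eps).sqrt();
    for (v, g) in x.iter_mut().zip(gain) {
        *v *= inv_rms * g;
    }
}
```

The point of the sketch is the ordering: normalize in float, quantize, run the integer loop, dequantize, then apply the activation. Feeding raw accumulators into SiLU or the next norm skips the dequantize step and would produce values that are off by the combined scale.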

GitHub Repo: https://github.com/r3-engine/r3-engine

If you know the physics of LLM Logit Sampling or ternary activation math, I would love your eyes on the codebase.
