Pure Rust ternary inference engine based on BitNet b1.58-2B-4T. No Python, no CUDA, no external ML frameworks. Single executable + model weights = portable AI that runs on any machine.
Zero-multiplication inference: because the weights are ternary {-1, 0, +1}, the inner GEMV loop needs only additions and subtractions, with no floating-point multiplies. Smart system awareness: the engine detects RAM and CPU at startup and adjusts generation limits automatically.
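To make the zero-multiplication claim concrete, here is a minimal sketch of a ternary GEMV in plain Rust. It is not the engine's actual kernel: the function name `ternary_gemv`, the unpacked one-`i8`-per-weight layout, and the row-major ordering are all assumptions made for illustration. A real engine would likely pack the weights more tightly and apply any scale factor once per output rather than per weight.

```rust
/// Sketch of y = W * x where every weight in W is -1, 0, or +1.
/// `weights` is row-major, one i8 per weight (an illustrative layout,
/// not necessarily what the engine uses).
fn ternary_gemv(weights: &[i8], rows: usize, cols: usize, x: &[f32]) -> Vec<f32> {
    assert!(cols > 0, "cols must be non-zero");
    assert_eq!(weights.len(), rows * cols, "weight matrix has the wrong size");
    assert_eq!(x.len(), cols, "input vector has the wrong length");

    let mut y = vec![0.0f32; rows];
    for (row, out) in weights.chunks_exact(cols).zip(y.iter_mut()) {
        let mut acc = 0.0f32;
        for (&w, &xi) in row.iter().zip(x) {
            // No multiply: the weight only selects add, subtract, or skip.
            match w {
                1 => acc += xi,
                -1 => acc -= xi,
                _ => {} // zero weight contributes nothing
            }
        }
        *out = acc;
    }
    y
}
```

For example, with the 2x3 matrix `[1, 0, -1, -1, -1, 1]` and input `[1.0, 2.0, 3.0]`, the first output is `1.0 - 3.0` and the second is `-1.0 - 2.0 + 3.0`.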
blockmandev•1h ago
Zero-multiplication inference: because the weights are ternary {-1, 0, +1}, the inner GEMV loop needs only additions and subtractions, with no floating-point multiplies. Smart system awareness: the engine detects RAM and CPU at startup and adjusts generation limits automatically.
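As a rough illustration of what startup detection of RAM and CPU could look like using only the standard library, here is a hedged sketch. Everything in it is an assumption rather than the engine's real logic: the function names are hypothetical, memory is read from `/proc/meminfo` (so it only reports a value on Linux), and the token thresholds are made-up numbers chosen for the example.

```rust
use std::fs;
use std::thread;

/// Extract the MemAvailable value (in KiB) from /proc/meminfo-style text.
fn parse_mem_available_kib(meminfo: &str) -> Option<u64> {
    meminfo.lines().find_map(|line| {
        let rest = line.strip_prefix("MemAvailable:")?;
        rest.trim()
            .trim_end_matches("kB")
            .trim()
            .parse::<u64>()
            .ok()
    })
}

/// Available memory in KiB, or None where /proc/meminfo is missing
/// (for example on non-Linux systems).
fn available_memory_kib() -> Option<u64> {
    let text = fs::read_to_string("/proc/meminfo").ok()?;
    parse_mem_available_kib(&text)
}

/// Number of threads the OS reports as usable, falling back to 1.
fn detect_threads() -> usize {
    thread::available_parallelism().map(|n| n.get()).unwrap_or(1)
}

/// Pick a generation limit from available memory.
/// The thresholds are hypothetical, for illustration only.
fn max_new_tokens(avail_kib: Option<u64>) -> usize {
    const GIB_IN_KIB: u64 = 1024 * 1024;
    match avail_kib {
        Some(k) if k >= 8 * GIB_IN_KIB => 2048,
        Some(k) if k >= 4 * GIB_IN_KIB => 1024,
        _ => 256, // unknown or low memory: stay conservative
    }
}
```

At startup one would call `max_new_tokens(available_memory_kib())` and `detect_threads()` once and pass the results to the generation loop. Falling back to the smallest limit when memory cannot be read keeps the sketch safe on platforms it does not know about.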