Hi HN, I built OpenGraviton, an open-source AI inference engine that pushes the limits of running extremely large LLMs on consumer hardware. By combining 1.58-bit ternary quantization, dynamic sparsity with Top-K pruning and MoE routing, and mmap-based layer streaming, OpenGraviton can run models far larger than your system RAM, even on a Mac Mini.

Early benchmarks: TinyLlama-1.1B drops from ~2GB (FP16) to ~0.24GB with ternary quantization. At 140B scale, models that normally require ~280GB fit within ~35GB packed. Optimized for Apple Silicon with Metal + C++ tensor unpacking, plus speculative decoding for faster generation.

Check benchmarks, architecture, and details here: https://opengraviton.github.io
GitHub: https://github.com/opengraviton

This project isn't just about squeezing massive models onto tiny hardware. It's about democratizing access to giant LLMs without cloud costs. Feedback, forks, and ideas are very welcome!
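For readers unfamiliar with ternary quantization, here is a minimal Python sketch of the general idea, not OpenGraviton's actual code. It assumes an absmean scale (as popularized by BitNet b1.58) and a storage layout of 2 bits per weight, four weights per byte; the post does not state either choice, though the 280GB to 35GB figure is an 8x reduction from 16-bit weights, which matches 2 bits per weight. All function names are hypothetical.

```python
def ternary_quantize(weights):
    """Map floats to {-1, 0, +1} using an absmean scale (assumed scheme)."""
    # Scale is the mean absolute weight; fall back to 1.0 for an all-zero input.
    scale = sum(abs(w) for w in weights) / len(weights) or 1.0
    # Round to the nearest integer, then clamp into the ternary range.
    trits = [max(-1, min(1, round(w / scale))) for w in weights]
    return trits, scale


def pack_trits(trits):
    """Pack four ternary values per byte, 2 bits each (stored as codes 0, 1, 2)."""
    packed = bytearray((len(trits) + 3) // 4)
    for i, t in enumerate(trits):
        packed[i // 4] |= (t + 1) << (2 * (i % 4))
    return bytes(packed)


def unpack_trits(packed, count):
    """Inverse of pack_trits; returns `count` values in {-1, 0, +1}."""
    return [((packed[i // 4] >> (2 * (i % 4))) & 0b11) - 1 for i in range(count)]


def packed_size_gb(num_params, bits_per_weight=2):
    """Approximate packed size in GB (10^9 bytes), ignoring scales and metadata."""
    return num_params * bits_per_weight / 8 / 1e9
```

Under these assumptions, `packed_size_gb(140e9)` gives 35.0, in line with the ~35GB quoted for the 140B case. A denser encoding closer to the theoretical 1.58 bits per weight is possible, at the cost of more complex unpacking.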

Show HN: VS Code Agent Kanban: Task Management for the AI-Assisted Developer

Show HN: Run autoresearch on a gaming PC (Windows and RTX GPUs fork)

Show HN: Skir – like Protocol Buffer but better

Show HN: I built a real-time OSINT dashboard pulling 15 live global feeds

Show HN: Mcp2cli – One CLI for every API, 96-99% fewer tokens than native MCP

Show HN: Engram – open-source persistent memory for AI agents (Bun and SQLite)

Show HN: I built a site where strangers leave kind voice notes for each other

Show HN: Husky hook that blocks Git push until you do your pushups

Show HN: Reviving a 20-year-old puzzle game Chromatron with Ghidra and AI

Show HN: Eyot, A programming language where the GPU is just another thread

Show HN: cursor-tg – Run Cursor Cloud Agents from Telegram

Show HN: WolfStack – Proxmox-like server management in a single Rust binary

Show HN: Botais (Battle of the AI's) – Competitive Snake Game for LLMs

Show HN: Curiosity – DIY 6" Newtonian Reflector Telescope

Show HN: OpenMeters – A fast and free audio metering/visualization suite

Show HN: Finsight – A Privacy First, AI Credit Card and Bank Statement Analyzer

Show HN: OxiMedia – Pure Rust Reconstruction of FFmpeg and OpenCV

Show HN: Bunway – Express-compatible web framework for Bun

Show HN: Run 500B+ Parameter LLMs Locally on a Mac Mini

Show HN: ANSI-Saver – A macOS Screensaver

Show HN: µJS, a 5KB alternative to Htmx and Turbo with zero dependencies

Show HN: U-Claw – An Offline Installer USB for OpenClaw in China

Show HN: Environment Variable Checker

Show HN: AlphaPerch – Track product execution for companies you follow using AI

Show HN: Compose Launcher – A macOS app to run multiple Docker Compose files

Show HN: Moongate – Ultima Online server emulator in .NET 10 with Lua scripting

Show HN: Kula – Lightweight, self-contained Linux server monitoring tool

Show HN: Claude-replay – A video-like player for Claude Code sessions

Show HN: I open-sourced my Steam game, 100% written in Lua, engine is also open

Show HN: ChatML - Run Claude Code Parallel Sessions in a Desktop app

Show HN: Run 500B+ Parameter LLMs Locally on a Mac Mini
