I built OpenGraviton, an open-source AI inference engine designed to push the limits of running extremely large models on consumer hardware.
The system combines several techniques to drastically reduce memory and compute requirements:
• 1.58-bit ternary quantization ({-1, 0, +1}) for ~10x compression
• dynamic sparsity with Top-K pruning and MoE routing
• mmap-based layer streaming to load weights directly from NVMe SSDs
• speculative decoding to improve generation throughput
Together, these techniques allow models far larger than system RAM to run locally.
In early benchmarks, ternary quantization in OpenGraviton reduced TinyLlama-1.1B from ~2.05GB (FP16) to ~0.24GB. Synthetic stress tests at the 140B scale suggest that models which would normally require ~280GB in FP16 can fit within ~35GB when packed in the ternary format.
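A quick back-of-envelope check on those numbers (decimal GB = 1e9 bytes here; the reported ~0.24GB for TinyLlama is slightly above the pure-weight estimate, which is consistent with some tensors staying in higher precision):

```cpp
// Estimate model size from parameter count and bits per weight.
// FP16 stores 16 bits per weight; packed ternary stores ~1.6-2 bits.
double model_gb(double params, double bits_per_weight) {
    return params * bits_per_weight / 8.0 / 1e9;  // bytes -> decimal GB
}
```

At 2 bits per weight, a 140B model lands at exactly 35GB, matching the stress-test figure (an 8x reduction from 280GB FP16).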
The project is optimized for Apple Silicon and currently uses custom Metal + C++ tensor unpacking.
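For the layer-streaming piece, the core idea is to mmap each layer's weight blob so the OS pages it in from the SSD on demand instead of loading everything up front. A minimal POSIX sketch (the `MappedLayer`/`map_layer` names are illustrative, not OpenGraviton's actual API):

```cpp
#include <cstddef>
#include <cstdint>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// A read-only view of one layer's weights, backed directly by the file on disk.
struct MappedLayer {
    const uint8_t* data = nullptr;
    size_t size = 0;
};

// Map a weight file; resident memory stays bounded by the working set,
// so total model size can exceed system RAM.
MappedLayer map_layer(const char* path) {
    MappedLayer layer;
    int fd = open(path, O_RDONLY);
    if (fd < 0) return layer;
    struct stat st;
    if (fstat(fd, &st) == 0 && st.st_size > 0) {
        void* p = mmap(nullptr, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
        if (p != MAP_FAILED) {
            // Hint that the layer is read front-to-back during a forward pass.
            madvise(p, (size_t)st.st_size, MADV_SEQUENTIAL);
            layer.data = static_cast<const uint8_t*>(p);
            layer.size = (size_t)st.st_size;
        }
    }
    close(fd);  // the mapping outlives the file descriptor
    return layer;
}
```

Pairing this with `madvise(MADV_DONTNEED)` on layers that have already been consumed is one way to keep the page cache from ballooning, though eviction policy is a tuning question.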
Benchmarks, architecture, and details: https://opengraviton.github.io
GitHub: https://github.com/opengraviton
fatihturker•1h ago
The architecture page explains how ternary quantization, dynamic sparsity, and mmap layer streaming work together to push models far beyond normal RAM limits.
Happy to answer questions about the implementation or benchmarks.