"GGUF quantization" is the most popular tech stack for quantizing Llama-like models for CPU. But the documentation is very sparse, and the maintainers made it clear that writing a paper is not their priority. So I spent like a week reading through the code and understanding the various concepts (K-quants, I-quants, importance matrix, etc) and put together this (unofficial) repo with explainers.
It was mostly written by hand, not standard AI slop. I mainly used Claude Code to interrogate the llama.cpp codebase and help me understand it.
It's possible I made mistakes or missed things here and there. If you have in-depth knowledge of the codebase, I'd love your contributions!