Existing OpenAI-compatible servers often require Docker, complex configuration files, or a GPU.
The gap between "I have a .gguf file" and "I have a working API endpoint" is wider than it should be.
To close that gap, we asked Neo to build gguf-serve: a simple CLI tool that serves GGUF models as an API endpoint.
Point it at any .gguf file, run the server, and immediately get OpenAI-compatible endpoints that work with any client library or tool that speaks the OpenAI API format.
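As a sketch of what that workflow might look like (the command-line invocation, default port, and model name below are assumptions, not documented gguf-serve behavior; the request body follows the standard OpenAI chat-completions format):

```shell
# Hypothetical invocation — the exact flags and default port of
# gguf-serve are assumptions for illustration.
gguf-serve ./model.gguf

# Then query it with the standard OpenAI chat-completions request shape:
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "model.gguf",
        "messages": [{"role": "user", "content": "Hello!"}]
      }'
```

Because the endpoint speaks the OpenAI API format, the same server should also work with any OpenAI client library simply by pointing its base URL at the local server.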