After playing around with local AI setups for a while, I kept getting annoyed at having to juggle a separate llama.cpp server for each model. Switching between them was a pain, and I always had to restart things just to load a new model.
So I ended up building something to fix that. It's called FlexLLama -
https://github.com/yazon/flexllama
Basically, it's a tool that lets you run multiple llama.cpp instances easily, spread across CPU and GPUs if you've got 'em. Everything sits behind a single OpenAI-compatible API.
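Here's roughly what that looks like from the client side. This is just a sketch using the standard OpenAI Python client; the port and model name are placeholders, use whatever your FlexLLama config actually exposes:

```python
# Minimal sketch: talking to FlexLLama through its OpenAI-compatible API.
# The base URL/port and the model name below are assumptions, not defaults
# pulled from the project -- swap in your own config values.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1",  # assumed local FlexLLama address
    api_key="not-needed",                 # local server, key is a placeholder
)

response = client.chat.completions.create(
    model="qwen2.5-7b-instruct",  # hypothetical model name from your config
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```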
You can run chat models, embeddings, and rerankers all at once, and the models assigned to each runner are reloaded on the fly.
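So, for example, an embeddings request goes to the same endpoint as chat; FlexLLama routes it to whichever runner owns that model and reloads it if it isn't resident. Again, the address and model name here are just placeholders:

```python
# Sketch: embeddings against the same FlexLLama endpoint. Address and model
# name are assumptions -- use the ones from your own config.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="not-needed")

emb = client.embeddings.create(
    model="nomic-embed-text",  # hypothetical embedding model from your config
    input=["FlexLLama routes this to the embedding runner."],
)
print(len(emb.data[0].embedding))  # dimensionality of the returned vector
```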
There's a little web dashboard to monitor and manage runners.
It's super easy to get started: just pip install from the repo, or grab the Docker image for a speedy setup.
I've been using it myself with things like OpenWebUI and some VS Code extensions (Roo Code, Cline, Continue.dev), and it works flawlessly.