Show HN: docker/model-runner – an open-source tool for local LLMs

18•ericcurtin•3mo ago

Hey Hacker News,

We're the maintainers of docker/model-runner and wanted to share some major updates we're excited about.

Link: https://github.com/docker/model-runner

We are rebooting the community:

https://www.docker.com/blog/rebooting-model-runner-community...

At its core, model-runner is a simple, backend-agnostic tool for downloading and running local large language models. Think of it as a consistent interface to interact with different model backends. One of our main backends is llama.cpp, and we make it a point to contribute any improvements we make back upstream to their project. It also allows people to transport models via OCI registries like Docker Hub. Docker Hub hosts our curated local AI model collection, packaged as OCI Artifacts and ready to run. You can easily download, share, and upload models on Docker Hub, making it a central hub for both containerized applications and the next wave of generative AI.
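To make the OCI registry part concrete, here is a minimal sketch of how a model reference could be split into registry, repository, and tag. It assumes model names follow the usual Docker-style reference conventions (Docker Hub as the default registry, `latest` as the default tag); the post does not confirm this. `ModelRef` and `parse_model_ref` are hypothetical names for illustration, not part of model-runner, and digest references (`@sha256:...`) are not handled.

```python
from dataclasses import dataclass

DEFAULT_REGISTRY = "docker.io"  # assumption: Docker Hub is the default registry
DEFAULT_TAG = "latest"          # assumption: untagged references mean "latest"


@dataclass(frozen=True)
class ModelRef:
    registry: str
    repository: str
    tag: str


def parse_model_ref(ref: str) -> ModelRef:
    """Split 'registry/repo:tag' into parts, filling in defaults."""
    first, sep, rest = ref.partition("/")
    # The first path component names a registry only if it looks like a host
    # (contains '.' or ':') or is 'localhost'.
    if sep and ("." in first or ":" in first or first == "localhost"):
        registry, remainder = first, rest
    else:
        registry, remainder = DEFAULT_REGISTRY, ref
    # The tag is whatever follows the last ':' in the remaining path.
    name, colon, tag = remainder.rpartition(":")
    if not colon or "/" in tag:
        name, tag = remainder, DEFAULT_TAG
    return ModelRef(registry, name, tag)
```

For example, `parse_model_ref("ai/smollm2")` (an illustrative model name) resolves to the `docker.io` registry with the `latest` tag, while a reference that starts with a host such as `localhost:5000/` keeps that host as its registry.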

We've been working hard on a few things recently:

- Vulkan and AMD Support: We've just merged support for Vulkan, which opens up local inference to a much wider range of GPUs, especially from AMD.

- Contributor Experience: We refactored the project into a monorepo. The main goal was to make the architecture clearer and dramatically lower the barrier for new contributors to get involved and understand the codebase.

- It's Fully Open Source: We know that a project from Docker might raise questions about its openness. To be clear, this is a 100% open-source, Apache 2.0 licensed project. We want to build a community around it and welcome all contributions, from documentation fixes to new model backends.

- DGX Spark day-0 support, we've got it!

Our goal is to grow the community. We'll be here all day to answer any questions you have. We'd love for you to check it out, give us a star if you like it, and let us know what you think.

Thanks!

Comments

ericcurtin•3mo ago

Hi everyone, we're the maintainers.

We're rebooting the model-runner community and wanted to share what we've been up to and where we're headed.

When we first built this, the idea was simple: make running local models as easy as running containers. You get a consistent interface to download and run models from different backends (llama.cpp being a key one) and can even transport them using familiar OCI registries like Docker Hub.

Recently, we've invested a lot of effort into making it a true community project. A few highlights:

- The project is now a monorepo, making it much easier for new contributors to find their way around.

- We've added Vulkan support to open things up for AMD and other non-NVIDIA GPUs.

- We made sure we have day-0 support for the latest NVIDIA DGX hardware.

shelajev•3mo ago

Nice, I really like the recent Vulkan support.

ericcurtin•3mo ago

Thanks very much. It worked well for you? Which hardware? :) Any other feedback, keep it coming!

jkoenig134•3mo ago

Awesome!

ericcurtin•3mo ago

What did you like? Anything stand out?

davidnet•3mo ago

`docker model run` is now part of my demos when deploying ML stack stuff. Pretty sure this removes the need to use multiple tools just to do inference. This is great!

ericcurtin•3mo ago

Are there any new features you think we should add to further improve your usage? Glad you find it useful.

juangcarmona•3mo ago

Really glad to see DMR getting "new life"... I’ve been experimenting with it for local agentic workloads (MAF, Google's ADK, cagent, Docker MCP, etc...) and it’s such a clean foundation...

A few things that could make it even more powerful (maybe some are out of your scope):

- Persistent model settings (context size, temperature, etc.) across restarts. Right now it always resets to 4k, which breaks multi-turn agents.

- HTTP/gRPC interface to let tools and frameworks talk to DMR directly, not only through the CLI. (Here the issue is on Docker MCP side, right?)

- Simple config management (`docker model set` or `docker model config`) so we can tweak GPU, threads, precision, etc. predictably. (There are at least a couple of issues on this topic already...)
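As a rough illustration of the first request, persistent per-model settings could be as simple as a JSON file that is merged on write and read back on start. This is only a sketch under stated assumptions: `ModelSettings`, `save_settings`, and `load_settings` are hypothetical names, the file layout is made up, and nothing here reflects how DMR stores configuration today.

```python
import json
from dataclasses import dataclass, asdict
from pathlib import Path


@dataclass
class ModelSettings:
    context_size: int = 4096   # the 4k default mentioned above
    temperature: float = 0.8   # illustrative default, not taken from the project


def save_settings(path: Path, model: str, settings: ModelSettings) -> None:
    """Merge one model's settings into a JSON file on disk."""
    data = json.loads(path.read_text()) if path.exists() else {}
    data[model] = asdict(settings)
    path.write_text(json.dumps(data, indent=2))


def load_settings(path: Path, model: str) -> ModelSettings:
    """Return saved settings for a model, or defaults if none were saved."""
    if not path.exists():
        return ModelSettings()
    saved = json.loads(path.read_text()).get(model, {})
    return ModelSettings(**saved)
```

With something like this, a 16k context saved for one model would survive a restart, and models without saved settings would still fall back to the defaults.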

TBH, I love how fast the discussion evolved today.

Congrats and good luck with this. I'll try to help, I promise!

ericcurtin•3mo ago

Keep opening pull requests and issues, we need these things, you are right!

nigelpoulton•3mo ago

Love that it's open source and the addition of Vulkan support.
