Show HN: docker/model-runner – an open-source tool for local LLMs

18•ericcurtin•3mo ago

Hey Hacker News,

We're the maintainers of docker/model-runner and wanted to share some major updates we're excited about.

Link: https://github.com/docker/model-runner

We are rebooting the community:

https://www.docker.com/blog/rebooting-model-runner-community...

At its core, model-runner is a simple, backend-agnostic tool for downloading and running local large language models. Think of it as a consistent interface to interact with different model backends. One of our main backends is llama.cpp, and we make it a point to contribute any improvements we make back upstream to their project. It also allows people to transport models via OCI registries like Docker Hub. Docker Hub hosts our curated local AI model collection, packaged as OCI Artifacts and ready to run. You can easily download, share, and upload models on Docker Hub, making it a central hub for both containerized applications and the next wave of generative AI.
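To make the OCI registry part concrete, here is a minimal Python sketch of how a model reference might be split into registry, repository, and tag. This is not model-runner's code. It assumes model references follow the same conventions as Docker image references (Docker Hub as the default registry, `latest` as the default tag), and `ModelRef`, `parse_model_ref`, and the sample names are hypothetical.

```python
from dataclasses import dataclass

# Assumptions for illustration: same defaults as Docker image references.
DEFAULT_REGISTRY = "docker.io"
DEFAULT_TAG = "latest"


@dataclass(frozen=True)
class ModelRef:
    """Hypothetical container for the parts of an OCI model reference."""
    registry: str
    repository: str
    tag: str


def parse_model_ref(ref: str) -> ModelRef:
    """Split 'registry/namespace/name:tag' into parts, filling in defaults."""
    # A colon is a tag separator only if it comes after the last slash,
    # so a registry port such as 'localhost:5000/...' is not read as a tag.
    last_slash = ref.rfind("/")
    colon = ref.rfind(":")
    if colon > last_slash:
        name, tag = ref[:colon], ref[colon + 1:]
    else:
        name, tag = ref, DEFAULT_TAG

    # Treat the first path component as a registry host only if it looks
    # like one: it contains '.' or ':', or it is exactly 'localhost'.
    first, sep, rest = name.partition("/")
    if sep and ("." in first or ":" in first or first == "localhost"):
        return ModelRef(first, rest, tag)
    return ModelRef(DEFAULT_REGISTRY, name, tag)


if __name__ == "__main__":
    for sample in ("example/model", "localhost:5000/example/model:v1"):
        print(sample, "->", parse_model_ref(sample))
```

Under these assumptions, a short name like `example/model` resolves to Docker Hub, while a reference with an explicit host goes to that registry instead.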

We've been working hard on a few things recently:

- Vulkan and AMD Support: We've just merged support for Vulkan, which opens up local inference to a much wider range of GPUs, especially from AMD.

- Contributor Experience: We refactored the project into a monorepo. The main goal was to make the architecture clearer and dramatically lower the barrier for new contributors to get involved and understand the codebase.

- It's Fully Open Source: We know that a project from Docker might raise questions about its openness. To be clear, this is a 100% open-source, Apache 2.0 licensed project. We want to build a community around it and welcome all contributions, from documentation fixes to new model backends.

- DGX Spark day-0 support, we've got it!

Our goal is to grow the community. We'll be here all day to answer any questions you have. We'd love for you to check it out, give us a star if you like it, and let us know what you think.

Thanks!

Comments

ericcurtin•3mo ago

Hi everyone, we're the maintainers.

We're rebooting the model-runner community and wanted to share what we've been up to and where we're headed.

When we first built this, the idea was simple: make running local models as easy as running containers. You get a consistent interface to download and run models from different backends (llama.cpp being a key one) and can even transport them using familiar OCI registries like Docker Hub.

Recently, we've invested a lot of effort into making it a true community project. A few highlights:

- The project is now a monorepo, making it much easier for new contributors to find their way around.

- We've added Vulkan support to open things up for AMD and other non-NVIDIA GPUs.

- We made sure we have day-0 support for the latest NVIDIA DGX hardware.

shelajev•3mo ago

Nice, I really like the recent Vulkan support.

ericcurtin•3mo ago

Thanks very much. It worked well for you? Which hardware? :) Any other feedback, keep it coming!

jkoenig134•3mo ago

Awesome!

ericcurtin•3mo ago

What did you like? Anything stand out?

davidnet•3mo ago

Docker model run is now part of my demos when deploying ML stack stuff. Pretty sure this removes the need to juggle multiple tools just to do inference. This is great!

ericcurtin•3mo ago

Are there any new features you think we should add to further enhance your usage? Glad you find it useful.

juangcarmona•3mo ago

Really glad to see DMR getting "new life"... I’ve been experimenting with it for local agentic workloads (MAF, Google's ADK, cagent, Docker MCP, etc...) and it’s such a clean foundation...

A few things that could make it even more powerful (maybe some are out of your scope):

- Persistent model settings (context size, temperature, etc.) across restarts; right now it always resets to 4k, which breaks multi-turn agents.

- HTTP/gRPC interface to let tools and frameworks talk to DMR directly, not only through the CLI. (Here the issue is on Docker MCP side, right?)

- Simple config management (`docker model set` or `docker model config`) so we can tweak GPU, threads, precision, etc. predictably. (There are at least a couple of issues on this topic already...)

TBH, I love how fast the discussion evolved today.

Congrats and good luck with this. I'll try to help, I promise!

ericcurtin•3mo ago

Keep opening pull requests and issues, we need these things, you are right!

nigelpoulton•3mo ago

Love that it's open source and the addition of Vulkan support.
