Show HN: Docker pulls more than it needs to - and how we can fix it

3•a_t48•1h ago

Hi all!

I've built a small tool to visualize how inefficient `docker pull` is, in preparation for standing up a new Docker registry + transport. It's bugged me for a while that updating one dependency with Docker drags along many other changes. It's a huge problem with Docker+robotics. With dozens or hundreds of dependencies, there's no "right" way to organize the layers that doesn't end up invalidating a bunch of layers on a single dependency update - and this is ignoring things like compiled code, embedded ML weights, etc. Even worse, many robotics deployments are on terrible internet, either due to being out in the boonies or due to customer shenanigans. I've been up at 4AM before supporting a field tech who needs to pull 100MB of mostly unchanged Docker layers to 8 robots on a 1Mbps connection. (And I don't think that robotics is the only industry that runs into this, either - see the ollama example, that's a painful pull.)

What if Docker were smarter and knew which files were already on disk? How many copies of `python3.10` do I have floating around `/var/lib/docker`? For that matter, how many copies of it does DockerHub have? A registry that could address and deduplicate at the file level rather than just the layer level is surely cheaper to run.
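To make the duplicate-file question concrete, here is a minimal sketch of file-level deduplication by content hash. It assumes the layers have already been unpacked into in-memory mappings of path to bytes; `dedup_stats` and the sample data are hypothetical and are not part of the tool or of Docker.

```python
import hashlib
from collections import defaultdict


def dedup_stats(layers):
    """layers: {layer name: {path: file bytes}} (an assumed, simplified shape).

    Returns (total_bytes, unique_bytes, copies), where copies maps each
    content digest to every (layer, path) location holding that content.
    """
    total_bytes = 0
    size_by_digest = {}
    copies = defaultdict(list)
    for layer_name, files in layers.items():
        for path, data in files.items():
            digest = hashlib.sha256(data).hexdigest()
            total_bytes += len(data)
            size_by_digest[digest] = len(data)  # identical content counted once
            copies[digest].append((layer_name, path))
    unique_bytes = sum(size_by_digest.values())
    return total_bytes, unique_bytes, dict(copies)


# Stand-in for a real interpreter binary: 1000 bytes of made-up content.
python_binary = b"\x7fELF" + b"\x00" * 996
layers = {
    "image-a/layer-1": {"usr/bin/python3.10": python_binary, "app/a.py": b"print('a')\n"},
    "image-b/layer-1": {"usr/bin/python3.10": python_binary, "app/b.py": b"print('b')\n"},
}

total, unique, copies = dedup_stats(layers)
duplicated = {d: locs for d, locs in copies.items() if len(locs) > 1}
print(f"{total} bytes stored per layer, {unique} bytes if stored per file")
for digest, locations in duplicated.items():
    print(digest[:12], "appears in", locations)
```

The two layers have different digests, so a layer-addressed store keeps both copies of the binary; a file-addressed store keeps one.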

This tool:

    - Given two docker images, one you have and one you are pulling, finds how much data docker pull would use, as well as how much data is _actually_ required to pull

    - Shows an estimate for how much time you will save on various levels of cruddy internet

    - Includes a bunch of examples of situations where more intelligent pulls would help, but the two image names are free text, so feel free to enter your own and try it out (one pair at a time though, since there's a work queue for analyzing new image pairs)
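The comparison described in the list above can be sketched roughly as follows. This is not the tool's actual code: the function names and the manifest shapes (layer digest to size, path to content digest and size) are assumptions for illustration, and the time estimate ignores compression, latency, and protocol overhead.

```python
def layer_pull_bytes(have_layers, want_layers):
    """Bytes a layer-level pull fetches: every wanted layer not already present.

    Both arguments map layer digest -> compressed size in bytes.
    """
    return sum(size for digest, size in want_layers.items() if digest not in have_layers)


def file_pull_bytes(have_files, want_files):
    """Bytes a file-level pull would fetch: only content not already on disk.

    Both arguments map path -> (content digest, size in bytes).
    """
    have_digests = {digest for digest, _size in have_files.values()}
    needed = {}
    for digest, size in want_files.values():
        if digest not in have_digests:
            needed[digest] = size  # each missing blob is fetched once
    return sum(needed.values())


def transfer_seconds(num_bytes, megabits_per_second):
    """Idealized transfer time, taking 1 Mbps as 1,000,000 bits per second."""
    return num_bytes * 8 / (megabits_per_second * 1_000_000)


# Hypothetical image pair: one small library changed inside a 100 MB layer.
have_layers = {"sha256:base": 30_000_000, "sha256:deps-v1": 100_000_000}
want_layers = {"sha256:base": 30_000_000, "sha256:deps-v2": 100_000_000}
# File listings for the dependency layer only, to keep the example short.
have_files = {"usr/lib/libbig.so": ("f1", 99_000_000), "usr/lib/libsmall.so": ("f2", 1_000_000)}
want_files = {"usr/lib/libbig.so": ("f1", 99_000_000), "usr/lib/libsmall.so": ("f3", 1_000_000)}

layer_bytes = layer_pull_bytes(have_layers, want_layers)
file_bytes = file_pull_bytes(have_files, want_files)
for mbps in (1, 10, 100):
    saved = transfer_seconds(layer_bytes, mbps) - transfer_seconds(file_bytes, mbps)
    print(f"{mbps} Mbps: layer pull {transfer_seconds(layer_bytes, mbps):.0f}s, saves {saved:.0f}s")
```

Under these assumptions a 100 MB layer pull at 1 Mbps takes about 800 seconds per machine, while the file-level pull of the one changed 1 MB file takes about 8.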

The one thing I wish it had but haven't gotten around to fitting in the UI somehow is a visualization of the files that _didn't_ change but are getting pulled anyhow.

It was written entirely in Claude Code, which is a new experience for me. I don't know nextjs at all, I don't generally write frontends. I could have written the backend maybe a little slower than Claude, but the frontend would have taken me 4x as long and wouldn't have been as pretty. It helped that I knew what I wanted on the backend, I think.

The registry/transport/snapshotter(?) I'm building will allow sharing files across docker layers both on your local machine and in the registry. There's a bit of prior art with this, but only on the client side. The eStargz format allows splitting apart the metadata for a filesystem and the contents, while still remaining OCI compliant - but it does lazy pulls of the contents, and has no deduplication. I think what I'm building could easily compete with other image providers both on cost (due to using less storage and bandwidth...everywhere) as well as speed.

If you'd be interested, please reach out.

Comments

PaulHoule•1h ago

Back in the early 2010s I couldn't bring up Docker images at all on my 2mbps DSL because any attempt to download images would time out.

theamk•1h ago

Reminds me of OSTree and casync.

danudey•1h ago

If you're interested in implementing this directly into your dockerfiles with some minimal changes, Docker already supports this to a degree:

https://docs.docker.com/reference/dockerfile/#copy---link

The TL;DR:

If you change your dockerfile to use `COPY --link <foo> <bar>`, then docker will create a layer containing only the files that would be copied, and that layer is treated as independent of layers coming before it. The only caveat is that you need to have a build cache with previous builds and use --cache-from to specify it, which means saving build state.

That said, there are a lot of benefits you can get very quickly if you can implement it. For example, if you have a dockerfile which creates a container, builds your golang application in it, and then copies the result into a fresh alpine:3.23.3 image, and you use a local cache for that build, then when you update to alpine 3.23.4 it will see that the build layers have not changed, therefore the `COPY --link` layer has not changed. Thus, it can just directly apply that on top of the new alpine image without doing any extra work.

Apparently it can even be smart enough to realize that it doesn't need to pull down the new alpine:3.23.4 image; it can just create a manifest that references its layers and upload the manifest; the new alpine image layers are there, the original 'my application' layers are already there, so it just creates a new manifest and publishes it. No bandwidth used at all!
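A rough sketch of the rebase idea in the paragraph above, modeling an image as nothing more than an ordered list of layer digests. That is a simplification assumed here for illustration: a real OCI manifest also references an image config that would need updating, the digests are placeholders, and `rebase` and `layers_to_upload` are hypothetical names rather than Docker or BuildKit APIs.

```python
def rebase(image_layers, old_base_layers, new_base_layers):
    """Swap the base layers under an image whose remaining layers are independent.

    Only valid when the application layers do not depend on the layers
    beneath them, which is the property `COPY --link` layers are meant to have.
    """
    n = len(old_base_layers)
    if image_layers[:n] != old_base_layers:
        raise ValueError("image is not built on the given old base")
    app_layers = image_layers[n:]
    return new_base_layers + app_layers


def layers_to_upload(image_layers, registry_layers):
    """Layers the registry does not already hold (registry_layers is a set)."""
    return [digest for digest in image_layers if digest not in registry_layers]


# Placeholder digests, not real ones.
old_base = ["sha256:alpine-old"]
new_base = ["sha256:alpine-new"]
image = old_base + ["sha256:app"]
registry = {"sha256:alpine-old", "sha256:alpine-new", "sha256:app"}

rebased = rebase(image, old_base, new_base)
missing = layers_to_upload(rebased, registry)
print(rebased, "layers to upload:", missing)
```

When `missing` is empty, publishing the rebased image only requires pushing new metadata, since every layer blob is already in the registry.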

> How many copies of `python3.10` do I have floating around `/var/lib/docker`.

Well, if you use 'FROM python:3.10' for your images then only one.

If you're careful, you can sort of pull together contents of multiple images by using `COPY --link`, and then even if you have 10 layers then changing from python:3.10 to python:3.14 only changes one of them.

Again, this does require that you maintain a cache, but that cache can live in a lot of places; it doesn't have to be on the local filesystem: https://docs.docker.com/reference/cli/docker/buildx/build/#c...

a_t48•59m ago

I'm well aware of `COPY --link`, it doesn't solve the problem. I'm a heavy heavy user of it, combined with throwaway build stages. `COPY --link` won't help my `apt install` commands.

The use case here isn't `FROM python:3.10`, it's `FROM ubuntu; RUN apt install -y vim wget curl software-properties-common python3.10`/`RUN rosdep install`/`RUN --mount=type=cache,target=/root/.cache/uv --mount=type=bind,source=uv.lock,target=uv.lock --mount=type=bind,source=pyproject.toml,target=pyproject.toml uv sync --locked --no-install-project`. All of those dependencies get merged onto a single layer that isn't shared with anything else. You'd better hope something like tensorflow isn't one of those dependencies.
