That is, in a way that tightly controls which assets they pull in, so that things can be rebuilt later without internet access: for air-gapped networks, or simply to avoid inadvertently pulling in upstream changes while making a small tweak to a private script.
It works by starting a local HTTP/HTTPS proxy server, then launching a subprocess with the appropriate proxy environment variables and certificate files set. It has special support for passing these into the RUN context when building images, so that existing Dockerfiles can be used without modification.
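To make that concrete, here's a rough Go sketch of just the wrapping step - not htvend's actual code - showing the general mechanism: the wrapped command only needs to honour the conventional proxy environment variables and trust the proxy's CA certificate. The proxy address, CA path and build command below are placeholder assumptions.

    // Illustration only (not htvend's code): run a build command so its
    // HTTP(S) traffic is routed through a local intercepting proxy.
    package main

    import (
        "log"
        "os"
        "os/exec"
    )

    func main() {
        proxyURL := "http://127.0.0.1:8080" // assumed local proxy address
        caBundle := "/tmp/proxy-ca.pem"     // assumed path to the proxy's CA cert

        cmd := exec.Command("make", "build") // placeholder build command to wrap
        cmd.Stdout, cmd.Stderr = os.Stdout, os.Stderr
        cmd.Env = append(os.Environ(),
            "HTTP_PROXY="+proxyURL,
            "HTTPS_PROXY="+proxyURL,
            // Which CA variables matter depends on the tooling in the build:
            "SSL_CERT_FILE="+caBundle,  // Go, OpenSSL-based tools
            "CURL_CA_BUNDLE="+caBundle, // curl
        )
        if err := cmd.Run(); err != nil {
            log.Fatal(err)
        }
    }

For the Docker case, docker build already treats HTTP_PROXY/HTTPS_PROXY as predefined build args, so they can be passed with --build-arg without touching the Dockerfile; getting the CA certificate trusted inside RUN steps is the fiddlier part, which is presumably what the special RUN support takes care of.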
Let me know what you think.
compressedgas•6mo ago
Why does it not retain the request and response headers, and support more than one response per URL, as a proper archiving proxy would?
aeijdenberg•6mo ago
The intent was to support basic build systems accessing package ecosystems that tend to always serve the same response for the same URL.
Docker registries do this reasonably well, as do Maven repos.
It wasn't intended to be a proper, full-on archiving proxy (and I'll admit I hadn't heard that term - I'll look into it and see what else exists in that space).
The main use case I had in mind is private projects that are developed on workstations with internet access, but deployed to other environments using CI/CD systems with less network access. If both have access to a common blob store, it can be populated with htvend build on a workstation and replayed at build time with htvend offline.
For that, there's no need to capture additional request information, as the focus wasn't on taking a manifest file and reliably re-downloading all the blobs from the internet (those responses may well have changed in the interim). For the same reason, I'd expect to need only one response per URL, per assets.json file.
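In other words, the replay side can be as simple as: look each requested URL up in the manifest and serve the matching blob from the store. A minimal Go sketch of that idea (the assets.json shape and blob layout here are assumptions for illustration, not htvend's actual format):

    // Sketch of the offline-replay idea, not htvend's implementation.
    package main

    import (
        "encoding/json"
        "log"
        "net/http"
        "os"
        "path/filepath"
    )

    func main() {
        // Assumed manifest shape: {"https://example.com/pkg.tgz": "<sha256>"}
        raw, err := os.ReadFile("assets.json")
        if err != nil {
            log.Fatal(err)
        }
        manifest := map[string]string{}
        if err := json.Unmarshal(raw, &manifest); err != nil {
            log.Fatal(err)
        }

        blobDir := "blobs" // assumed content-addressed store: blobs/<sha256>

        handler := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
            // Exactly one stored response per URL, keyed by the full URL.
            hash, ok := manifest[r.URL.String()]
            if !ok {
                http.Error(w, "not in manifest", http.StatusNotFound)
                return
            }
            http.ServeFile(w, r, filepath.Join(blobDir, hash))
        })

        // Plain HTTP proxy requests carry the absolute URL; HTTPS needs
        // CONNECT handling plus the CA trick above, omitted here.
        log.Fatal(http.ListenAndServe("127.0.0.1:8080", handler))
    }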
Does that make sense?