- Set up a vector DB
- Write chunking logic
- Run an indexer
- Keep it in sync with git on any change
For just "answer this question about this repo", it felt like too much. So I built a small API instead: you send a repo + question, it sends back the files an LLM actually needs (not sure how novel this is, honestly).
What it does:
You call an HTTP endpoint with a GitHub repo URL + a natural-language question (specific questions work best, but things like "How does auth work?" or "What validates webhook signatures?" work too).
The API returns JSON with 1–10 ranked files:
- `path`, `language`, `size`, full `content`
- plus a small `stats` object (token estimate, rough cost savings)
You plug those files into your own LLM / agent / tool. There are no embeddings, no vector DB, no background indexing job. It works on the very first request.
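A minimal sketch of the flow from Python — the endpoint path and field names (`repo_url`, `question`, `files`) are simplified placeholders here, not the documented API:

```python
# Sketch of the request/response flow. Endpoint path and field names
# are illustrative placeholders, not the documented API contract.
import requests

resp = requests.post(
    "https://contextpacker.com/api/pack",  # illustrative endpoint
    json={
        "repo_url": "https://github.com/org/some-repo",
        "question": "What validates webhook signatures?",
    },
    timeout=30,
)
pack = resp.json()

# Each returned file carries path, language, size, and full content,
# so you can drop them straight into your own LLM prompt.
context = "\n\n".join(f"# {f['path']}\n{f['content']}" for f in pack["files"])
print(pack["stats"])  # token estimate, rough cost savings
```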
Why I built this:
I just wanted to ask this repo a question without:
- Standing up Pinecone/Weaviate/Chroma
- Picking chunk sizes and overlap
- Running an indexer for every repo
- Dealing with sync jobs when code changes
This API skips all of that. It's meant for:
- One-off questions on random repos
- Agents / tools that hop across many repos
- Internal tools where you don't want more infra
Does it work at all?
On a small internal eval (177 questions across 14 repos, mix of Python, TS, monorepos + private ones):
- A cross-model LLM judge rated answers roughly on par with a standard embeddings + vector DB setup
- Latency is about 2–4 seconds on the first request per repo (shallow cloning + scanning), then faster from cache
- No indexing step: new repos work immediately
Numbers are from our own eval, so treat them as directional, not a paper. Happy to share the setup if anyone wants to dig in.
How it works:
1. On first request, it shallow clones the repo and builds a lightweight index: file paths, sizes, languages, and top-level symbols where possible.
2. It gives an LLM the file tree + question and asks it to pick the most relevant files.
3. It ranks, dedupes, and returns a pack of files that fits in a reasonable context window.
Basically: let an LLM read the file tree and pick files, instead of cosine-searching over chunks.
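A stripped-down sketch of that idea (the actual service adds symbol extraction, ranking/dedup, and caching on top; the prompt and model here are just placeholders):

```python
# Stripped-down version of the approach: shallow clone, hand the file tree
# to an LLM, ask it to pick the most relevant files. Illustrative only.
import json
import subprocess
import tempfile
from pathlib import Path

from openai import OpenAI  # any LLM client works; OpenAI shown for brevity

def pick_files(repo_url: str, question: str, max_files: int = 10) -> list[str]:
    workdir = tempfile.mkdtemp()
    # Shallow clone: we only need the current tree, not history.
    subprocess.run(["git", "clone", "--depth", "1", repo_url, workdir], check=True)

    tree = [
        str(p.relative_to(workdir))
        for p in Path(workdir).rglob("*")
        if p.is_file() and ".git" not in p.parts
    ]

    client = OpenAI()
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model
        messages=[{
            "role": "user",
            "content": (
                f"Question about this repo: {question}\n\n"
                "File tree:\n" + "\n".join(tree) + "\n\n"
                f"Return a JSON list of up to {max_files} file paths "
                "most relevant to the question."
            ),
        }],
    )
    # In practice you'd want structured output / validation here.
    return json.loads(resp.choices[0].message.content)
```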
Limitations:
- Eval is relatively small (177 questions / 14 repos), all hand-written – directional, not research-grade
- Works best on repos with sane structure and filenames
- First request per repo pays the clone cost (cached after)
Try it:
- Live demo: https://contextpacker.com
- DM me for an API key – keeping it free while I validate the idea.
If you're building code agents, "explain this repo" tools, or internal AI helpers over your company's repos – I'd love to hear how you'd want to integrate something like this (or where you think it will fall over). Very open to feedback and harsh benchmarks.