The problem
When sharing repository context with LLMs, we often give them everything: full files, implementation details, comments, etc. But for many tasks (like understanding architecture or navigating a repo), the model doesn't actually need most of that.
This creates two problems.
* unnecessary token usage
* noisy context that can obscure the structure of the codebase
The idea
Instead of sharing the full implementation, what if we only shared the interface surface of the code?
Function signatures, types, imports, and documentation. Basically the parts that describe how the system is structured rather than how every function is implemented.
The experiment
I built a small CLI tool called Brf.it to test this idea. It uses Tree-sitter to parse source code and extract structural information, producing a compact representation of the repository.
Example output:
<file path="src/api.ts">
<function>fetchUser(id: string): Promise<User></function>
<doc>Fetches user from API, throws on 404</doc>
</file>
In one simple comparison from a repo:
* original function: ~50 tokens
* extracted interface: ~8 tokens
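To make the extraction step concrete, here is a minimal sketch of the idea. It is not Brf.it's actual implementation: it swaps Tree-sitter for Python's standard-library `ast` module, handles only top-level Python functions, and just mimics the output format shown above. The names `extract_interface` and `SAMPLE` are made up for illustration.

```python
import ast

# Hypothetical input file: the body references an undefined helper, which is
# fine because the source is only parsed, never executed.
SAMPLE = '''
def fetch_user(user_id: str) -> dict:
    """Fetches user from API, raises on 404."""
    response = http_get(f"/users/{user_id}")
    if response.status == 404:
        raise LookupError(user_id)
    return response.json()
'''


def extract_interface(source: str, path: str) -> str:
    """Summarize top-level functions as signature plus first docstring line."""
    tree = ast.parse(source)
    lines = [f'<file path="{path}">']
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # Rebuild the signature from the AST and drop the function body.
            args = ast.unparse(node.args)
            returns = f" -> {ast.unparse(node.returns)}" if node.returns else ""
            lines.append(f"<function>{node.name}({args}){returns}</function>")
            doc = ast.get_docstring(node)
            if doc:
                # Keep only the first docstring line to stay compact.
                lines.append(f"<doc>{doc.splitlines()[0]}</doc>")
    lines.append("</file>")
    return "\n".join(lines)


if __name__ == "__main__":
    print(extract_interface(SAMPLE, "src/api.py"))
```

The body of `fetch_user` never reaches the output, which is where the savings come from. A real tool would also need to cover classes, methods, types, and imports, and do it per language, which is the part Tree-sitter grammars handle.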
The goal isn't to replace sharing full code, but to provide a lightweight context layer that can help with things like:
* architecture understanding
* repo navigation
* initial prompt context for AI agents
The idea was partly inspired by tools like repomix, but Brf.it takes a slightly different approach. Instead of compressing the full repository, it extracts only the API-level structure.
Language support so far:
Go, TypeScript, JavaScript, Python, Rust, C, C++, Java, Swift, Kotlin, C#, Lua
Project:
https://github.com/indigo-net/Brf.it
Docs:
https://indigo-net.github.io/Brf.it/
Curious if others have experimented with similar ideas.
What information do you think is actually essential for LLM code understanding?
Are function signatures and docs enough for architecture reasoning?
Are there formats that work better for LLM consumption than XML or Markdown?
guerython•1h ago