NERDs (Networked Entity Representation Documents) are Wikipedia-style entity pages that LLM agents build for themselves by reading a large corpus chunk-by-chunk. Instead of reprocessing the full text at query time, a downstream agent searches and reasons over these entity documents.
The idea comes from a pattern that keeps showing up: brains, human cognition, knowledge bases, and transformer internals all organize complex information around entities and their relationships. NERDs apply that principle as a preprocessing step for long-context understanding.
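To make the preprocessing idea concrete, here is a minimal toy sketch of the chunk-by-chunk build and the query-time lookup. It is not the pipeline from the paper. Every function name here is hypothetical, and `extract_entity_notes` is a crude capitalized-word heuristic standing in for the LLM agent that would actually read each chunk and write the entity pages.

```python
import re
from collections import defaultdict


def split_into_chunks(text, chunk_size=200):
    """Split text into chunks of at most `chunk_size` words."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size])
            for i in range(0, len(words), chunk_size)]


def extract_entity_notes(chunk):
    """Stand-in for the LLM agent (assumption, not the real method).

    Treats any capitalized word that is not sentence-initial as an
    entity name and uses the containing sentence as the note about it.
    """
    notes = []
    for sentence in re.split(r"(?<=[.!?])\s+", chunk):
        for token in sentence.split()[1:]:  # skip the sentence-initial word
            name = token.strip(".,;:!?\"'")
            if re.fullmatch(r"[A-Z][a-z]+", name):
                notes.append((name, sentence.strip()))
    return notes


def build_nerds(text, chunk_size=200):
    """Read the corpus one chunk at a time, appending notes to entity pages."""
    nerds = defaultdict(list)
    for chunk in split_into_chunks(text, chunk_size):
        for name, note in extract_entity_notes(chunk):
            if note not in nerds[name]:  # avoid duplicate notes
                nerds[name].append(note)
    return dict(nerds)


def answer_context(nerds, question):
    """Return only the entity pages whose name appears in the question."""
    q = question.lower()
    return {name: notes for name, notes in nerds.items() if name.lower() in q}


sample = ("The ship left with Ishmael aboard. Later Ahab appeared on deck. "
          "The crew feared Ahab greatly.")
nerds = build_nerds(sample)
context = answer_context(nerds, "Who feared Ahab?")
print(context)
```

The point of the sketch is the shape of the data flow: the expensive pass over the full text happens once, and a question only pulls in the pages for the entities it mentions, so the context handed to the downstream agent does not grow with the length of the book.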
We tested on NovelQA (86 novels, avg 200K+ tokens). On entity-tracking questions (characters, relationships, plot, settings), NERDs match full-context performance while using ~90% fewer tokens per question, and token usage stays flat regardless of document length. To highlight the method's limitations, we also tested it on counting tasks and on locating specific passages (neither of which is entity-centered), where it did not perform as well.
nerdviewer.com lets you browse all the entity docs we generated across the 86 novels. Click through them like a fan-wiki. It's a good way to build intuition for what the agent produces.
Paper: https://www.techrxiv.org/users/1021468/articles/1381483-thin...
elevaes•6h ago
tdaltonc•6h ago
Only later did we adapt the technique to work on long books. The existing long-book benchmarks seemed like the most appropriate way to show the core idea to a wider audience.
So ya, I'm confident that this central idea can be applied in many different domains.