Show HN: NERDs – Entity-centered long-term memory for LLM agents

12•tdaltonc•6h ago

Long-running agents struggle to attend to relevant information as context grows, and eventually hit the wall when the context window fills up.

NERDs (Networked Entity Representation Documents) are Wikipedia-style entity pages that LLM agents build for themselves by reading a large corpus chunk-by-chunk. Instead of reprocessing the full text at query time, a downstream agent searches and reasons over these entity documents.
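
To make the chunk-by-chunk idea concrete, here is a minimal sketch rather than the actual implementation. The function names (`split_into_chunks`, `stub_extract`, `build_entity_docs`, `search`) are hypothetical, and the regex-based `stub_extract` is only a stand-in for the LLM that writes the entity pages.

```python
import re
from collections import defaultdict

# Assumption: a tiny stopword list so sentence-initial words are not treated as entities.
STOPWORDS = {"The", "A", "An", "He", "She", "It", "They", "In", "On", "At"}
SENTENCE_BREAK = r"(?<=[.!?])\s+"


def split_into_chunks(text, sentences_per_chunk=2):
    """Split the corpus into small chunks that are read one at a time."""
    sentences = [s.strip() for s in re.split(SENTENCE_BREAK, text) if s.strip()]
    return [
        " ".join(sentences[i:i + sentences_per_chunk])
        for i in range(0, len(sentences), sentences_per_chunk)
    ]


def stub_extract(chunk):
    """Stand-in for the LLM writer: map each capitalized name to the sentences mentioning it."""
    notes = defaultdict(list)
    for sentence in re.split(SENTENCE_BREAK, chunk):
        names = set(re.findall(r"\b[A-Z][a-z]+\b", sentence)) - STOPWORDS
        for name in names:
            notes[name].append(sentence)
    return notes


def build_entity_docs(text, sentences_per_chunk=2):
    """Read the corpus chunk by chunk, appending new notes to each entity's page."""
    docs = defaultdict(list)
    for chunk in split_into_chunks(text, sentences_per_chunk):
        for entity, sentences in stub_extract(chunk).items():
            docs[entity].extend(sentences)
    return dict(docs)


def search(docs, query):
    """Query time: return only the pages whose title appears in the query."""
    query_words = set(re.findall(r"\w+", query.lower()))
    return {title: page for title, page in docs.items() if title.lower() in query_words}


corpus = "Alice met Bob in Paris. Bob disliked the rain. Alice later moved to Rome."
docs = build_entity_docs(corpus)
hits = search(docs, "Where did Alice move?")
```

In this toy run, `hits` contains only the "Alice" page, so the downstream step reads two sentences instead of the whole corpus. A real system would replace both `stub_extract` and the keyword `search` with LLM-driven steps.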

The idea comes from a pattern that keeps showing up: brains, human cognition, knowledge bases, and transformer internals all organize complex information around entities and their relationships. NERDs apply that principle as a preprocessing step for long-context understanding.

We tested on NovelQA (86 novels, avg 200K+ tokens). On entity-tracking questions (characters, relationships, plot, settings), NERDs match full-context performance while using ~90% fewer tokens per question, and token usage stays flat regardless of document length. To highlight the method's limitations, we also tested it on counting tasks and on locating specific passages (which aren't entity-centered), where it did not perform as well.

nerdviewer.com lets you browse all the entity docs we generated across the 86 novels. Click through them like a fan-wiki. It's a good way to build intuition for what the agent produces.

Paper: https://www.techrxiv.org/users/1021468/articles/1381483-thin...

Comments

elevaes•6h ago

This is fascinating. I'm wondering if it works as well with other use cases like papers, conversations, or any other human-written text.

tdaltonc•6h ago

We originally developed NERDs at my last startup for monitoring the progress of solar developments. There are many different multi-modal event feeds that you need to monitor for a holistic view of a project. NERDs helped glue the events together around entities.

Only later did we adapt the technique to work on long books. The existing long-book benchmarks seemed like the most appropriate way to show the core idea to a wider audience.

So yeah, I'm confident that this central idea can be applied in many different domains.

mmayberry•5h ago

If the agent builds entity pages incrementally while reading, how do you prevent early incorrect assumptions about relationships or attributes from propagating through the entity graph? Is there support for belief revision?

tdaltonc•4h ago

Yes, this sort of auto-regressive error propagation is a real concern, for the same reason it's a real concern with LLMs in general.

If you force the output of an LLM to begin with an error, the LLM tends to continue down that erroneous path.

In practice, we didn't see much of this kind of error propagation. One solution would be to give some agent the task of occasionally reviewing the NERDs for contradictions, along with the ability to search through the source material as needed. That, of course, creates the possibility of catastrophic forgetting, where the agent rewrites a NERD in an effort to remove a contradiction and ends up deleting something important.

We didn't see a lot of error propagation, but here is one example where we did: in Harry Potter, Prof Dumbledore is introduced as a mysterious hooded character. So the NERD-writer would create a NERD for "mysterious hooded man." There's no tool for the agent to change the title of a NERD, so the system is stuck with that title. Sometimes the system would build the entire Dumbledore entry under "mysterious hooded man"; sometimes it would make a new Dumbledore entity and link it back to the "mysterious hooded man" entity; and sometimes it wouldn't link them. None of those outcomes are great.
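
For illustration only: a rename/alias tool is not part of the system described above, but a hypothetical one could look roughly like the sketch below. `EntityStore`, `resolve`, `add_note`, and `rename` are all made-up names, and the notes are placeholder strings.

```python
from dataclasses import dataclass, field


@dataclass
class EntityStore:
    # Canonical title -> list of notes on that entity's page.
    pages: dict = field(default_factory=dict)
    # Retired title -> the title that replaced it.
    aliases: dict = field(default_factory=dict)

    def resolve(self, title):
        """Follow alias links until a canonical title is reached."""
        while title in self.aliases:
            title = self.aliases[title]
        return title

    def add_note(self, title, note):
        """Append a note to the page, even if the writer used a retired title."""
        self.pages.setdefault(self.resolve(title), []).append(note)

    def rename(self, old, new):
        """Merge the page for `old` into `new` and keep `old` as an alias."""
        old, new = self.resolve(old), self.resolve(new)
        if old == new:
            return
        earlier_notes = self.pages.pop(old, [])
        target = self.pages.setdefault(new, [])
        target[:0] = earlier_notes  # keep the earlier page's notes first
        self.aliases[old] = new


store = EntityStore()
store.add_note("mysterious hooded man", "placeholder note from an early chunk")
store.add_note("Dumbledore", "placeholder note from a later chunk")
store.rename("mysterious hooded man", "Dumbledore")
store.add_note("mysterious hooded man", "placeholder note added after the rename")
```

After the rename there is a single "Dumbledore" page holding all three notes, and later writes to the old title land on it. Because `rename` resolves both titles first and only ever aliases a canonical title to a different canonical title, the alias links cannot form a cycle.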

rnunery13•3h ago

I agree with elevaes: this was absolutely fascinating, and I love the use of books to help understand the concepts. I could relate right away. The potential token usage reduction is massive, especially when it comes to enterprise usage and costs. Many companies are experiencing sticker shock because they weren't prepared for, or didn't anticipate, the usage. Better costing and estimation with this process could have a widespread (and positive) impact on financials and allow for more accurate pricing estimates and models.
