Polygraph: A Meta-Harness for Maximum Agent Autonomy

https://nx.dev/blog/announcing-polygraph

38•cheald•1h ago

Comments

kstenerud•47m ago

> Space. An agent is stuck in one repo. It can't see how a change fits the wider system, and it can only write to one repo at a time.

Huh? How can it not see multiple repos? They're just directories.

> Time. An agent has no episodic memory. Every session starts blank, so a human carries the memory context.

The memory comes from the research, design, specification, and planning documents.

> We no longer think about where the work happens or what repos are involved. We describe the work in a prompt and let Polygraph figure out what's relevant.

Err... that doesn't sound safe.

> Every decision is on record. So even though our team is distributed, I can ask my agent why a coworker chose one approach over another.

AFTER the fact...

victorsavkin•30m ago

Thank you for your comment.

> Huh? How can it not see multiple repos? They're just directories.

Relevant repos need to be discovered. They have to be set up correctly (some worktrees, most clones), dependencies installed, and the relationships between them made clear, etc. In a sense, once you've done all that, they do become directories. Turning them into directories, and doing it ergonomically, is the tricky part.

Consider scale: Take the repos you own plus the OSS repos they depend on. It's many thousands. A real team has more. That's a lot to deal with.

> The memory comes from the research, design, specification, and planning documents.

This isn't episodic memory. You'll have high-level documents you can reference, and they're useful for overviews. But only a tiny fraction of decisions ever make it into them. Most decisions get made in the act of implementing something. And the "docs rot, code doesn't" rule applies here too.

> Err... that doesn't sound safe.

It just picks the repos (you have access to) and helps you plan the work. It has no effect on safety.

> AFTER the fact...

Yes :) But say I'm reviewing their PR. I can ask my agent why the PR ended up the way it did, and every decision they made along the way is in the session. It's "after the fact", but useful. It doesn't mean every conversation with a human being can be replaced by this :) but a lot of conversations can be.

kstenerud•13m ago

> Relevant repos need to be discovered. They have to be set up correctly (some worktrees, most clones), dependencies installed, and the relationships between them made clear, etc.

This is what Sourcegraph and GitHub Code Search and Zoekt do, isn't it?

> You'll have high-level documents you can reference, and they're useful for overviews. But only a tiny fraction of decisions ever make it into them. Most decisions get made in the act of implementing something.

Er... In the age of AI the decisions need to be made (and documented) extensively before it starts writing any code. Otherwise you get slop.

> But say I'm reviewing their PR. I can ask my agent why the PR ended up the way it did, and every decision they made along the way is in the session.

That doesn't make the decision set good. And if the only documentation produced came from the implementation phase, then it's going to be self-defending regardless of how good the design actually is (and your review agent, lacking the context, won't know the difference). Multiply that by the many parallel PRs in parallel repos you get with some features, and that's just asking for trouble.

jenniferli23•47m ago

How are you thinking about permissions/revocation if Polygraph’s “memory” becomes a shared layer across repos?

victorsavkin•21m ago

Great question.

Polygraph knows what repos every dev (and therefore their agents) has access to. If a session touches repos you don't have access to, you'll only see the parts you're allowed to: PRs to a repo you can see, for instance. You won't see the logs or high-level descriptions, which can contain info you shouldn't see.

If a dev loses access to a repo, they also lose access to the sessions associated with it.

In other words, although Polygraph has one repo graph and one session graph under the hood, every dev has access to only a subset of each.
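For illustration only, here is a minimal sketch of how that kind of read-time filtering could work. Every name in it (`Session`, `SessionView`, `view_for`, the field names) is hypothetical and not taken from Polygraph. It also assumes two rules that the description above only implies: logs and the high-level description are shown only when you can access every repo the session touched, and a session is hidden entirely when you can access none of them.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Session:
    # Hypothetical record of one agent session; not Polygraph's data model.
    session_id: str
    description: str
    logs: tuple[str, ...]
    prs_by_repo: dict[str, tuple[str, ...]]  # repo name -> PR identifiers


@dataclass(frozen=True)
class SessionView:
    # What a single dev is allowed to see of a session.
    session_id: str
    prs_by_repo: dict[str, tuple[str, ...]]
    description: Optional[str] = None
    logs: Optional[tuple[str, ...]] = None


def view_for(session: Session, accessible_repos: set[str]) -> Optional[SessionView]:
    """Compute the view at read time from the dev's current repo access."""
    visible = {
        repo: prs
        for repo, prs in session.prs_by_repo.items()
        if repo in accessible_repos
    }
    if not visible:
        # Assumption: no accessible repo means the session is not visible at all.
        return None
    # Assumption: logs and description may mention any touched repo,
    # so they are shown only when every touched repo is accessible.
    full_access = len(visible) == len(session.prs_by_repo)
    return SessionView(
        session_id=session.session_id,
        prs_by_repo=visible,
        description=session.description if full_access else None,
        logs=session.logs if full_access else None,
    )


session = Session(
    session_id="s1",
    description="Rename the auth token field across api and web",
    logs=("planned change", "opened PRs"),
    prs_by_repo={"api": ("api#1",), "web": ("web#2",)},
)

full = view_for(session, {"api", "web"})
partial = view_for(session, {"api"})
revoked = view_for(session, set())
```

Because the view is derived from the current access set on every read, revoking a repo needs no separate cleanup step in this sketch: the next call to `view_for` simply returns less, or nothing.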

jeffbcross•24m ago

lukekarrys, how long would it take you to build this?
