It's not a new agent or tool; it slots into whatever you already use. npx typed-nl init adds a workflow stanza to your CLAUDE.md / AGENTS.md / GEMINI.md, scaffolds a tnl/ directory, and optionally wires up a PreToolUse hook and an MCP server. The minimum product is a stanza plus a folder. Hooks, MCP, and tnl verify (a CI gate for path and test-binding integrity) are optional layers.
We ran a controlled A/B test on an existing 16 KLOC Python codebase: event-driven triggers, a 35-scenario behavioural matrix, and a deliberately ambiguous prompt. Both the Baseline and TNL conditions got the same coding discipline in their instruction file. Same agent, same model, same base commit.
Results:
Agent                  TNL    Baseline  Gap
Claude Opus 4.7 (R1)   35/35  29/35     +6
Claude Opus 4.7 (R2)   31/35  27/35     +4
Claude Opus 4.7 (R3)   30/35  25/35     +5
Codex GPT-5.4 (R1)     32/35  26/35     +6
Codex GPT-5.4 (R2)     31/35  26/35     +5
No overlap: TNL's lowest paired cell is 86% (30/35); the baseline's highest is 83% (29/35).
Other signals:
Follow-up work: on round-2 tasks in the same worktrees, TNL agents edited the existing contract (4/4 samples); baseline agents re-read the source.
Caveats: small n, LLM sessions are noisy, and we built the tool. Every script, prompt, raw JSON file, and session transcript is committed.
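If you want to sanity-check the percentages and gaps quoted in the results, here is a minimal sketch that recomputes them from the pass counts in the table. The variable and function names are ours for illustration; this is not one of the committed analysis scripts.

```python
# Pass counts out of 35 scenarios, copied from the results table: (TNL, baseline).
# Names below are illustrative only, not taken from the repo.
TOTAL = 35
results = {
    "Claude Opus 4.7 (R1)": (35, 29),
    "Claude Opus 4.7 (R2)": (31, 27),
    "Claude Opus 4.7 (R3)": (30, 25),
    "Codex GPT-5.4 (R1)": (32, 26),
    "Codex GPT-5.4 (R2)": (31, 26),
}


def pct(passed, total=TOTAL):
    """Pass rate as a whole-number percentage."""
    return round(100 * passed / total)


# Per-run gap in scenarios passed (TNL minus baseline).
gaps = {name: tnl - base for name, (tnl, base) in results.items()}

# Worst TNL run and best baseline run, for the "no overlap" check.
tnl_min = min(tnl for tnl, _ in results.values())
base_max = max(base for _, base in results.values())

for name, (tnl, base) in results.items():
    print(f"{name}: TNL {pct(tnl)}%, baseline {pct(base)}%, gap +{gaps[name]}")

print(f"lowest TNL: {pct(tnl_min)}%, highest baseline: {pct(base_max)}%")
print("no overlap:", tnl_min > base_max)
```

With only five paired runs, this confirms the arithmetic and nothing more; it is not a significance test.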
We dogfooded it: every feature of the tool itself has its own TNL in tnl/.
Install: npx typed-nl init
Repo: https://github.com/janaraj/tnl
npm: https://www.npmjs.com/package/typed-nl
Happy to answer questions, especially from people who've tried plan-mode workflows and want to know where this differs.