
We built meta-agent: an open-source library that automatically and continuously improves agent harnesses from production traces.

Point it at an existing agent, a stream of unlabeled production traces, and a small labeled holdout set.

An LLM judge scores the unlabeled production traces as they stream in.

A proposer reads failed traces and writes one targeted harness update at a time, such as changes to prompts, hooks, tools, or subagents. The update is kept only if it improves holdout accuracy.
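The judge, propose, and keep-if-better loop described above can be sketched roughly as follows. This is a minimal illustration, not meta-agent's actual API: `improve_once`, `holdout_accuracy`, `StepResult`, and the `judge`, `propose`, and `run_agent` callables are hypothetical names, and the harness is modeled as a plain dict.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Hypothetical stand-ins: a harness is a dict of settings, a trace is a dict.
Harness = dict
Trace = dict


@dataclass
class StepResult:
    harness: Harness   # harness to keep using after this step
    accuracy: float    # holdout accuracy of that harness
    accepted: bool     # True if the proposed update was kept


def holdout_accuracy(
    harness: Harness,
    holdout: List[Tuple[str, str]],
    run_agent: Callable[[Harness, str], str],
) -> float:
    """Fraction of labeled holdout tasks the agent gets right."""
    correct = sum(1 for task, expected in holdout
                  if run_agent(harness, task) == expected)
    return correct / len(holdout)


def improve_once(
    harness: Harness,
    traces: List[Trace],
    holdout: List[Tuple[str, str]],
    judge: Callable[[Trace], bool],
    propose: Callable[[Harness, List[Trace]], Harness],
    run_agent: Callable[[Harness, str], str],
) -> StepResult:
    """One iteration: judge traces, propose one update, keep it only if it helps."""
    baseline = holdout_accuracy(harness, holdout, run_agent)

    # The judge (assumed to be an LLM call in practice) flags failed traces.
    failed = [t for t in traces if not judge(t)]
    if not failed:
        return StepResult(harness, baseline, False)

    # The proposer writes a single targeted change based on the failures.
    candidate = propose(harness, failed)
    score = holdout_accuracy(candidate, holdout, run_agent)

    # Keep the update only if holdout accuracy strictly improves.
    if score > baseline:
        return StepResult(candidate, score, True)
    return StepResult(harness, baseline, False)
```

Gating every update on the labeled holdout set is what lets the judge operate on unlabeled traces: a mistaken judgment can lead to a bad proposal, but that proposal is discarded unless it measurably helps.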

On tau-bench v3 airline, meta-agent improved holdout accuracy from 67% to 87%.

meta-agent is open source. It currently supports the Claude Agent SDK, with more frameworks coming soon.

Try it here: https://github.com/canvas-org/meta-agent

