I built AgentCheck, an open-source testing tool for LLM agents. It lets you:
Snapshot full agent runs (prompt, LLM calls, tool outputs, final answer)
Replay the trace locally — no API calls, no token costs
Diff agent behavior over time
Assert outputs to catch regressions
Why? Because today, most AI agents are tested by spot-checking outputs or rerunning flaky evals — which breaks CI, costs money, and misses edge cases. AgentCheck works more like Jest or VCR.py, but for LLM workflows. It records and replays traces so you can test agents like real software.
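If you haven't used a record/replay tool before, here's the basic pattern in a few lines. This is a generic sketch of the idea, not AgentCheck's actual code or API: hash each LLM call's arguments, save the response to a JSON trace on the first run, and serve it from the trace on every run after that.

```python
import hashlib
import json
from pathlib import Path


def replayable(call_llm, trace_path: Path):
    """Wrap an LLM call so the first run records responses to a JSON trace
    and every later run replays them with no API calls or token costs."""
    trace = json.loads(trace_path.read_text()) if trace_path.exists() else {}

    def wrapped(**kwargs):
        # Key each call by a hash of its arguments so identical requests replay identically.
        key = hashlib.sha256(json.dumps(kwargs, sort_keys=True).encode()).hexdigest()
        if key not in trace:  # record mode: hit the real API once
            trace[key] = call_llm(**kwargs)
            trace_path.write_text(json.dumps(trace, indent=2))
        return trace[key]  # replay mode: serve the saved response

    return wrapped
```

For example, `replayable(lambda **kw: client.chat.completions.create(**kw).model_dump(), Path("trace.json"))` gives you a chat function that behaves identically in CI with the network unplugged. AgentCheck applies the same idea to the whole agent run, not just individual LLM calls.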
It’s CLI-first, dev-friendly, and designed to plug into LangChain/OpenAI workflows.
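And here's roughly what a regression check against a recorded trace can look like in a test suite. Again, this is an illustrative sketch: the trace path and field names below are made up for the example and are not AgentCheck's real schema.

```python
import json
from pathlib import Path

# Hypothetical trace file and layout, purely for illustration.
TRACE = Path("traces/support_agent.json")


def test_agent_final_answer_is_stable():
    """Replay-style regression check: no live LLM calls, just the recorded trace."""
    trace = json.loads(TRACE.read_text())
    final_answer = trace["final_answer"]  # field name is an assumption

    # Assert on stable properties of the output rather than exact wording,
    # so minor phrasing changes don't break CI.
    assert "refund" in final_answer.lower()
    assert len(trace["llm_calls"]) <= 3  # guard against runaway tool loops
```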
It's still early, so I'd love feedback, contributors, and use cases from folks building agentic systems. The code's here: https://github.com/hvardhan878/agentcheck
Thanks!