Show HN: Dochia – automated API testing for agentic build-test-fix loops

https://dochia.dev/

1•ludovicianul•1h ago

I've enhanced Dochia with the ability to generate agent skills. Being a CLI already, it makes it really simple to plug it into agentic workflows.

Run `dochia init-skills` and the coding agent(s) can trigger tests as it builds:

  1. Agent writes endpoint and the OpenAPI spec (or that get's generated from code)
  2. Agent runs: dochia test -c api.yml -s localhost:3000
  3. Dochia produces dochia-summary-report.json + per-endpoint test files
  4. Agent reads errors, fixes code, re-runs
  5. Loop

The JSON output is structured specifically so agents can read and act on it directly, not just humans parsing logs.

It's a native binary (GraalVM), so it's fast on all platforms.

Would love feedback on: is it something you will integrate into your flow, which test playbooks are missing, whether the report format is actually useful in agentic loops, any edge cases you'd expect a tool like this to catch?

GitHub: https://github.com/dochia-dev/dochia-cli Docs: https://docs.dochia.dev

For background, Dochia takes your OpenAPI spec and runs 120+ test playbooks: deterministic negative and boundary scenarios plus chaos testing. No test cases to write, no configuration beyond pointing it at your OpenAPI spec and a running server.

Comments

yohann_senthex•1h ago

Solid approach on the agent readable JSON. The real value here isn't just "catch bugs faster" it's that agents can now run comprehensive negative testing as part of the build loop, which is increasingly critical with agentic workflows.

One angle worth exploring: injection attacks. When Dochia fuzzes your endpoints, does it specifically test for prompt injection vectors if the API backend is an LLM? Most security tools miss that because they're generic but if an agent is writing an API that wraps Claude or GPT, you'd want fuzzing that includes things like "what if the user input gets templated into a system prompt?"

Same with token limits if the fuzzer generates massive payloads, does it distinguish between "API can't handle large inputs" vs "you'll bankrupt the user with token costs if this endpoint is exposed"?

Might be niche, but seems like low hanging fruit in the agentic testing space.

Missile defenses aren't failing–they're running out

I Talked to 500GB of Retail Data with Zero Domain Knowledge

Samsung brings Perplexity-powered AI and agentic capabilities to browser

Show HN: Open-source distributed quantum compute network

Change your Google account username in a few simple steps

Index providers should not bend the rules for Elon Musk

Design systems are platform problems, not feature problems

100M years ago an 'evolutionary fuse' sparked squid diversification

Pgmicro: In-process reimplementation of Postgres on SQLite compatible engine

AI Models Lie to Protect Each Other from Deletion, UC Berkeley Finds

Typed uncertainty instead of confidence scores

SugarHigh: Super Lightweight Syntax Highlighter

CDC Pauses Testing for Rabies and Pox Viruses

Nono.sh: Kernel-enforced runtime safety for AI agents

Australia's "Red Centre" Turns Green

How to Do an Agent Experience Audit

Show HN: Reinhardt – Django/DRF-inspired full-stack web framework for Rust

"I'm trying to report an exposed cloud bucket to a GenAI system"

Show HN: An MCP server for Devops automation

Show HN: Projects Calendar for GitHub – A stateless iCal feed for your projects

Watching a human life rise toward the Moon

BurnerNet v1.0.0: A Zero-Trust C++20 HTTP Client Engine

Nvidia rolls out its fix for PC gaming's "compiling shaders" wait times

Show HN: ContentForge – Content scoring API (42 endpoints, 12 platforms)

I think local-hosted LLMs are better than the cloud-hosted ones

LogCrush: Lossless log compression beating ZSTD-19 by 60%+

Artemis II Flight Map – printable pdf

China surfaces details of spacecraft to land humans on the Moon by 2030

Ruckus: Racket for iOS

Favorite Thing: Zeiss Mobile Screen Wipes