Show HN: Dari-docs – Optimize your docs using parallel coding agents

https://github.com/mupt-ai/dari-docs
14•byhong03•6h ago
It’s well known at this point that documentation needs to be optimized for AI agents: we’re all pointing our Claude Code / Codex / Pi agents at documentation and expecting the models to figure out how to implement a product.

This changes the optimization problem when writing documentation. Good documentation becomes more objective, because you are solving a very concrete problem: can a dumb harness running the dumbest model implement this reliably?

Humans can typically compensate for inconsistent terminology or context scattered across pages, but agents often waste time on these problems or get completely confused.

We’ve been building a small project around this called dari-docs. You upload your documentation via the website or CLI, provide a list of tasks, and ask agents across different providers, with varying intelligence / cost levels, to complete those tasks in parallel so you can see where they falter. When a run is complete, you get back a list of feedback markdown files, one from each agent run, and can apply changes based on that feedback.
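To make the shape of that workflow concrete, here is a rough sketch of a task/model matrix run in parallel, with one feedback file per run. This is not dari-docs code: `run_agent`, the task list, the model names, and the feedback file naming are all hypothetical placeholders, and the stub only returns the kind of text a real agent run might report.

```python
import tempfile
from concurrent.futures import ThreadPoolExecutor
from itertools import product
from pathlib import Path

# Hypothetical inputs: placeholder tasks and model tiers, not dari-docs defaults.
TASKS = ["Create a checkout session", "Rotate an API key"]
MODELS = ["small-model", "large-model"]


def run_agent(task: str, model: str, docs_dir: Path, workdir: Path) -> str:
    """Stand-in for a real agent run (hypothetical).

    A real run would read the docs and attempt the task inside workdir.
    This stub only checks that workdir starts empty and returns feedback text.
    """
    started_clean = not any(workdir.iterdir())
    return (
        "# Feedback\n\n"
        f"- model: {model}\n"
        f"- task: {task}\n"
        f"- docs: {docs_dir.name}\n"
        f"- clean start: {started_clean}\n"
    )


def run_one(task: str, model: str, docs_dir: Path, out_dir: Path) -> Path:
    # Each run gets its own fresh scratch directory, removed when the run ends.
    with tempfile.TemporaryDirectory() as sandbox:
        feedback = run_agent(task, model, docs_dir, Path(sandbox))
    slug = task.lower().replace(" ", "-")
    path = out_dir / f"{model}__{slug}.md"
    path.write_text(feedback, encoding="utf-8")
    return path


def run_matrix(docs_dir: Path, out_dir: Path) -> list[Path]:
    out_dir.mkdir(parents=True, exist_ok=True)
    pairs = list(product(TASKS, MODELS))  # every task/model combination
    with ThreadPoolExecutor(max_workers=4) as pool:
        futures = [pool.submit(run_one, t, m, docs_dir, out_dir) for t, m in pairs]
        return sorted(f.result() for f in futures)
```

Calling `run_matrix(Path("docs"), Path("feedback"))` would write one markdown file per task/model pair into `feedback/`. A temporary directory is only a stand-in for real isolation; it keeps runs from sharing files but does not restrict what a process can reach.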

Managed service: https://optimize.dari.dev/, repo link: https://github.com/mupt-ai/dari-docs

The agents try to use the product end-to-end: they search through the docs, follow instructions, run commands, try examples, and attempt to debug failures. Importantly, this is not a static LLM review of the documentation; the agents are attempting the integration themselves.

You can also enable live verification with test credentials so the agents can actually verify workflows against real APIs:

  dari-docs check . --live-verify --secret-env DARI_TEST_API_KEY --task "Create a checkout session"
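If you have several tasks to verify, one cautious approach is to invoke the command above once per task. The sketch below only builds the argument lists, using exactly the flags shown above. Whether the CLI accepts more than one task per invocation is not stated here, so it assumes one `--task` per call. The helper name `build_check_command` and the second task are illustrative.

```python
import shlex


def build_check_command(
    task: str,
    secret_env: str = "DARI_TEST_API_KEY",
    docs_dir: str = ".",
) -> list[str]:
    """Build the argv for one live-verified check (hypothetical helper)."""
    return [
        "dari-docs", "check", docs_dir,
        "--live-verify",
        "--secret-env", secret_env,
        "--task", task,
    ]


# Illustrative task list; only the first task comes from the post.
tasks = ["Create a checkout session", "Refund a payment"]
commands = [build_check_command(t) for t in tasks]

for argv in commands:
    # shlex.join renders the argv as a copy-pasteable shell command.
    print(shlex.join(argv))
```

To execute a command rather than print it, pass the list to `subprocess.run(argv, check=True)`. Passing a list instead of a single string avoids shell quoting problems with task descriptions that contain spaces.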
If you’re building a CLI, API, MCP server, or SDK and actively maintaining docs for humans or agents, we’d love to work with you and test this on real workflows!

Comments

Aleesha_hacker•2h ago
Cool approach. Actually letting agents test the docs makes debugging way more practical than just reading them.
slipheen•1h ago
I read the GitHub repo, but still don't quite understand-

What exactly is the advantage of doing this vs just running a prompt in my existing coding agent?

I don't understand why this is a harness/project vs. just, for example, a skill.

I'm confident there's a good reason, I just don't understand.

avyvar•1h ago
Totally fair question. If you only want one agent to sanity-check one doc change, a skill/prompt is probably enough.

We actually aren’t rebuilding a harness here; it’s Pi with several LLM options to select from. The reason this is a project is that the useful workflow is more like a docs test suite: run realistic user tasks across multiple models, isolate each run in a greenfield sandbox, keep the transcripts/results, and make failures reproducible in CI.

You could ask an existing coding agent to spawn subagents for every task/model pair, but once that matrix grows, running hundreds of subagents on your computer gets messy. It’s also the wrong isolation boundary: for docs testing, you usually want the agent to start from a clean environment with access only to the docs/product surface you’re testing, not your whole working tree or local setup.

anish_m•1h ago
Nice! I want to use this for my product at ngram.com. Btw, I also created a sample teaser video: https://www.ngram.com/watch/dari-explainer-video-brief-d7991.... Feel free to use it on your social media

Show HN: CPU-only transcription for YouTube, TikTok, X, Instagram videos

https://github.com/kouhxp/yapsnap
14•mrkn1•2h ago•1 comment

Show HN: Lance – image/video generation and understanding in one model

https://github.com/bytedance/Lance
47•cleardusk•8h ago•14 comments

Show HN: I built Istanbul live transit map

https://tarif.ist/
6•berkaycubuk•4h ago•0 comments

Show HN: I made a tool for learning scales, chords, and how to combine them

https://projects.alesh.com/intervalkit/
11•aleshh•6h ago•10 comments

Show HN: Forge – Guardrails take an 8B model from 53% to 99% on agentic tasks

https://github.com/antoinezambelli/forge
644•zambelli•1d ago•234 comments

Show HN: Superlog (YC P26) – Observability that installs itself and fixes bugs

https://superlog.sh/
70•Magnanten•1d ago•45 comments

Show HN: Hocuspocus 4 – self-hosted Yjs collaboration backend

https://github.com/ueberdosis/hocuspocus
31•philipisik•8h ago•3 comments

Show HN: expo-callkit-telecom – easily integrate CallKit/Core-Telecom

https://github.com/mfairley/expo-callkit-telecom
2•mfairley•4h ago•0 comments

Show HN: Gaussian Splat of a Strawberry

https://superspl.at/scene/84df8849
516•danybittel•1d ago•195 comments

Show HN: Open-Source Agentic QA Harness with Memory

https://vostride.com/agent-qa
16•pranshuchittora•12h ago•2 comments

Show HN: Number Gacha, a gacha game distilled to its essence

https://isabisabel.com/gacha/
253•babel16•1w ago•140 comments

Show HN: I made a 3D pose maker for artists

https://setpose.com/
85•augustvdv•1d ago•32 comments

Show HN: IgniteMS – batch text embeddings at 253K msg/s on 8x A100

https://github.com/Artain-AI/ignite-ms
2•ddayanov•6h ago•0 comments

Show HN: Files.md – Open-source alternative to Obsidian

https://github.com/zakirullin/files.md
709•zakirullin•2d ago•346 comments

Show HN: Haystack – Review the PRs that need human attention

https://haystackeditor.com/
43•akshaysg•2d ago•16 comments

Show HN: Pg_deltax, Apache-licensed alternative to TimescaleDB

https://github.com/xataio/deltax
37•tee-es-gee•1d ago•1 comment

Show HN: Yt-x v0.8.0 – Browse, play, and download YouTube from the terminal

https://github.com/Benexl/yt-x
27•Benex254•1d ago•4 comments

Show HN: Id-agent – Token efficient UUID alternative for AI agents

https://github.com/vostride/id-agent
40•pranshuchittora•1d ago•54 comments

Show HN: Hsrs – Type-Safe Haskell Bindings Generator for Rust

https://github.com/harmont-dev/hsrs
53•suis_siva•1d ago•7 comments

Show HN: InsForge – Open-source Heroku for coding agents

https://github.com/InsForge/InsForge
59•mrcoldbrew•2d ago•7 comments

Show HN: Every Lego minifigure ranked, from over 1.3M user votes

https://brickelo.com
3•gpattle•10h ago•1 comment

Show HN: Rust Database from Scratch

https://github.com/ayoubnabil/aiondb
7•ayoubnabil•12h ago•4 comments

Show HN: IResearch – C++ search that beat Lucene and Tantivy on their benchmark

https://github.com/serenedb/serenedb/tree/main/libs/iresearch
11•gnusi•13h ago•3 comments

Show HN: Semble – Code search for agents that uses 98% fewer tokens than grep

https://github.com/MinishLab/semble
441•Bibabomas•3d ago•150 comments

Show HN: Rocksky – Music scrobbling and discovery on the AT Protocol

https://tangled.org/rocksky.app/rocksky
117•tsiry•4d ago•44 comments

Show HN: Mezz, a curl-able WiFi sandbox for IoT pentesting

https://github.com/ABGEO/mezz
40•ABGEO•5d ago•10 comments

Show HN: Watch a neural net learn to play Snake

https://ppo.gradexp.xyz/
203•c1b•6d ago•47 comments

Show HN: Javalamp – A glowing terminal screensaver that keeps your Mac awake

https://github.com/breschio/javalamp
3•tbreschi•20h ago•4 comments

Show HN: Auto-identity-remove – Automated data broker opt-out runner for macOS

https://github.com/stephenlthorn/auto-identity-remove
324•stephenlthorn•2d ago•135 comments