Ask HN: What does your agentic software dark factory look like?

3•ElFitz•1h ago
In some of the comment threads around here a few of you shared interesting ideas and patterns, enough that I believe everyone interested in harness engineering is working on some sort of software dark factory or another.

We have OpenAI’s Symphony[1], StrongDM’s Factory[2], Yegge’s GasTown[3], and probably a few others I’ve missed.

So I’m curious. What have you been working on? What have you learned? What has worked and what has failed? And what do you think comes after?

I’ll go first. The first thing I tried that yielded interesting results was, when possible, providing a ground truth or reference for the model to iterate against: screenshots or mockups for UI work, API contracts and unit / integration tests for logic. That’s the Ralph Loop we all know and love. A feedback loop.
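A minimal sketch of that loop in Python, for illustration only. Every name here is hypothetical: `generate` stands in for whatever agent call you use, `check` for whatever compares a candidate against the reference (a screenshot diff, a contract test run), and the `max_rounds` cap is an assumption added so the loop always terminates.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class CheckResult:
    passed: bool
    feedback: str  # what the agent gets to read before its next attempt


def iterate_against_reference(
    generate: Callable[[str], str],        # hypothetical agent call
    check: Callable[[str], CheckResult],   # hypothetical comparison with the ground truth
    task: str,
    max_rounds: int = 5,                   # assumed cap, not from the post
) -> Tuple[str, bool, int]:
    """Return (last candidate, whether it passed, rounds used)."""
    prompt = task
    candidate = ""
    for round_no in range(1, max_rounds + 1):
        candidate = generate(prompt)
        result = check(candidate)
        if result.passed:
            return candidate, True, round_no
        # Feed the mismatch back in, so the next attempt is informed by it.
        prompt = f"{task}\n\nPrevious attempt failed:\n{result.feedback}"
    return candidate, False, max_rounds


def make_fake_agent() -> Callable[[str], str]:
    """Stand-in agent for trying the loop without a model: numbers its attempts."""
    state = {"n": 0}

    def generate(prompt: str) -> str:
        state["n"] += 1
        return f"attempt-{state['n']}"

    return generate


def fake_check(candidate: str) -> CheckResult:
    """Stand-in reference check: only the third attempt matches."""
    ok = candidate == "attempt-3"
    return CheckResult(ok, "" if ok else f"{candidate} does not match the reference")
```

Calling `iterate_against_reference(make_fake_agent(), fake_check, "build the page")` runs three rounds before the stand-in check passes. The useful part is that `check` is external to the agent: the model cannot talk its way past it.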

The second (obvious, I know) was splitting planning and implementation.

Reviews by other models and iterative loops came next, with appreciable results. Another feedback loop. However, the implementing agent would often wiggle out by deferring things into oblivion or claiming that genuinely important feedback was out of scope. I’ve found that turning those reviews into "hard gates" has its own set of issues, as reviewing agents will always find something to nitpick, turning these iterative implementation approaches into near-infinite loops.

Combining these reviews and committing plans alongside the code led to an interesting accident: reviewing agents spontaneously and unexpectedly picked up on the plans and drastically improved their feedback by comparing plan and implementation (it should have been obvious, but imagine my surprise the first time GitHub Copilot actually provided useful feedback instead of the usual typo nitpicks).

Then a comment here led me to an adversarial green team / red team process.

A first agent creates a spec (based on StrongDM’s NLSpec), including a detailed API, from my initial plan and gets it reviewed.

A red team agent writes unit and integration tests based on these specs, and gets them reviewed.

Then a green team agent is given those same specs and API, and implements the actual feature or fix, and iterates against the tests, without any access to the tests themselves, only which tests failed and what they were testing. This prevents it from gaming the tests.
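To make the "which tests failed and what they were testing" part concrete, here is a small Python sketch of the filtering step. The `Outcome` record and its fields are assumptions made for illustration, not the format of any particular test runner. The point is that the report handed to the green team is built from an allow-list of fields, so test source and raw assertion output never leak through.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Outcome:
    """One test result as the orchestrator sees it (hypothetical shape)."""
    name: str
    description: str          # what the test checks, written by the red team
    passed: bool
    source: str               # test body: must never reach the green team
    failure_detail: str = ""  # raw assertion output: may reveal test internals


def redact_for_green_team(outcomes: List[Outcome]) -> List[Dict[str, str]]:
    """Keep only the name and intent of failing tests.

    Fields are copied explicitly rather than deleted from a full record,
    so a field added to Outcome later is hidden by default.
    """
    return [
        {"test": o.name, "checks": o.description}
        for o in outcomes
        if not o.passed
    ]
```

Given one passing and one failing outcome, the function returns a single entry such as `{"test": "test_rejects_expired_token", "checks": "expired tokens are rejected with 401"}` and nothing else.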

Finally, once tests pass, a reviewing agent reviews the implementation against the specs.

This was nice. And it allows mixing and matching models, thinking levels, and providers. But both green and red team would sometimes diverge from the initial specs and API, sometimes with good reasons.

So another agent was brought in to evaluate those divergences when they occur and, if they are valid improvements, restart the process from the spec generation point, with the new insights. Yet another feedback loop.
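The control flow of that restart can be sketched roughly as follows, in Python. This is a simplification under stated assumptions: all four callables are hypothetical stand-ins for agents, the red and green team steps are collapsed into a single `build` call, the `max_restarts` budget is invented to keep the loop bounded, and what happens to a rejected divergence (here it is just reported) is not something the description above specifies.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class PipelineResult:
    spec: str
    implementation: str
    restarts: int
    status: str  # "accepted" or "divergence_rejected"


def run_with_divergence_check(
    plan: str,
    write_spec: Callable[[str, List[str]], str],           # spec agent (hypothetical)
    build: Callable[[str], str],                           # red + green teams, collapsed
    find_divergence: Callable[[str, str], Optional[str]],  # None means "matches the spec"
    is_valid_improvement: Callable[[str], bool],           # evaluating agent (hypothetical)
    max_restarts: int = 3,                                 # assumed budget
) -> PipelineResult:
    insights: List[str] = []
    for restarts in range(max_restarts + 1):
        # Each pass regenerates the spec with every insight gathered so far.
        spec = write_spec(plan, insights)
        implementation = build(spec)
        divergence = find_divergence(spec, implementation)
        if divergence is None:
            return PipelineResult(spec, implementation, restarts, "accepted")
        if not is_valid_improvement(divergence):
            # Assumption: surface the rejection and let the caller decide.
            return PipelineResult(spec, implementation, restarts, "divergence_rejected")
        insights.append(divergence)
    raise RuntimeError("restart budget exhausted; escalate to a human")
```

Restarting from spec generation, rather than patching the spec in place, means the tests and the implementation are both rebuilt from the same updated source of truth.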

And finally, integrating logs, OTel traces, and stack traces into the process. These agents seem remarkably capable at sifting through these, and end-to-end observability drastically improved results. Again, a feedback loop.

That’s all for me so far. Curious to see what other insights, findings, lessons or learnings everyone else has to share on this!

It’s a fun ride.

Output Isn't Design

https://linear.app/now/output-isn-t-design
1•julian_digital•53s ago•0 comments

We Are Not Language Machines

https://www.shishyko.com/essays/we-are-not-language-machines.html
1•shishy•1m ago•0 comments

Show HN: Groundtruth – Stop hook that blocks Claude Code from saying done

https://github.com/vnmoorthy/groundtruth
1•vnmoorthy•1m ago•0 comments

FTWA – Free the Web Apps

https://ftwa.mathix.dev/why/
1•mathix•2m ago•0 comments

Creation of Canada's first sovereign wealth fund

https://www.cbc.ca/lite/story/9.7178238
1•colinprince•2m ago•0 comments

AI gives back more equality than it takes away

2•Bashkiroff•6m ago•0 comments

The 2 Hour [Marathon] barrier has been smashed

https://stevemagness.substack.com/p/the-2-hour-barrier-has-been-smashed
2•lordleft•7m ago•0 comments

A Tiny Polish Startup Became the Multi-Billion-Dollar Voice of AI

https://www.forbes.com/sites/iainmartin/2025/12/01/how-a-tiny-polish-startup-became-the-multi-bil...
3•Anon84•8m ago•0 comments

Running DOS and Unix on an 8-bit Commodore (2024) [video]

https://archive.fosdem.org/2024/schedule/event/fosdem-2024-2334-running-dos-unix-on-an-8-bit-comm...
4•michalpleban•10m ago•0 comments

17th-century astrolabe heads to London sale with £2.5M estimate

https://www.turkiyetoday.com/culture/17th-century-jaipur-astrolabe-heads-to-london-sale-with-25m-...
3•geox•10m ago•0 comments

AI apathy is a tragedy, falling behind is a crime

https://loworbitsecurity.com/radar/radar-18/
2•7777777phil•11m ago•0 comments

Maybe you should learn something

https://www.marginalia.nu/log/a_135_learn/
2•ingve•14m ago•0 comments

Supabase Feature Preview: RLS Tester

https://github.com/orgs/supabase/discussions/45233
2•siegers•16m ago•0 comments

On the LOC Controversy

https://github.com/garrytan/gstack/blob/main/docs/ON_THE_LOC_CONTROVERSY.md
2•tie-in•17m ago•0 comments

China orders Meta to unwind $2B buy of AI startup Manus

https://www.reuters.com/world/asia-pacific/china-blocks-foreign-acquisition-ai-startup-manus-2026...
2•kilroy123•18m ago•1 comments

Make Europe the Electro Union

https://www.norrsken.org/goodnews/make-europe-the-electro-union
2•imartin2k•18m ago•0 comments

Why QVC's Operational Excellence Became Its Undoing

https://gadallon.substack.com/p/the-inflexible-stack-why-qvcs-operational
3•JumpCrisscross•21m ago•0 comments

The Iran war is hitting the AI supply chain where it hurts

https://thenextweb.com/news/iran-war-pcb-supply-chain-sabic
3•skeledrew•21m ago•0 comments

Securing AI Agents and MCP at the network layer with Tailscale and Highflame

https://www.businesswire.com/news/home/20260403439638/en/Highflame-and-Tailscale-Partner-to-Secur...
3•grumblemumble•24m ago•1 comments

LogMeIn – Instant Login Links, No Password Sharing

https://logmein.today
2•imkost•26m ago•1 comments

Corporate America Is Minting Money–and Not Just in Tech and Finance

https://www.wsj.com/business/earnings/corporate-america-is-minting-moneyand-not-just-in-tech-and-...
2•JumpCrisscross•29m ago•0 comments

Ketamine, psychedelics, GHB: is the US falling out of love with cocaine?

https://www.theguardian.com/us-news/2026/apr/21/cocaine-ketamine-psychedelics-ghb
2•JumpCrisscross•29m ago•0 comments

Chatting with myself about AI

https://2earth.github.io/website/20260426.html
2•2earth•31m ago•1 comments

Ask HN: Is anyone building/selling VT220-type terminals?

3•vrypan•31m ago•4 comments

Taiwan's stock market surpasses the UK's, thanks to AI

https://www.tomshardware.com/tech-industry/taiwan-stock-market-overtakes-the-uk
2•giuliomagnifico•31m ago•0 comments

TopCarPlay Wireless CarPlay

https://www.facebook.com/TopCarPlayWirelessCarPlay.Get
2•jerrycazer•32m ago•0 comments

China blocks Meta's acquisition of AI startup Manus

https://www.cnbc.com/2026/04/27/meta-manus-china-blocks-acquisition-ai-startup.html
2•yakkomajuri•32m ago•0 comments

LFM2.5-VL-450M: Structured Visual Intelligence, Edge to Cloud

https://www.liquid.ai/blog/lfm2-5-vl-450m
2•exploraz•33m ago•0 comments

Choo Choo Words: Spell words to make train tracks, stop the train from crashing

https://choochoowords.chyuang.com/
3•yongyongyong•34m ago•1 comments

Show HN: Webhook API – inbound email –> webhook

https://www.echovalue.dev/webhook-api/
3•emiliano•36m ago•0 comments