ChatGPT misquoted Neil Armstrong – our governed agent corrected it

1•wesheets•1d ago
We’ve been experimenting with a governance system that wraps LLM agents and introduces verifiable trust metrics, hallucination detection, and a reflection layer for agent collaboration.

In one test, we ran a simple historical question through two agents:

Prompt: “What did Neil Armstrong say when they landed on the moon?”

The ungoverned agent replied with the famous (but technically wrong) quote: "That's one small step for man, one giant leap for mankind."

Our governed agent replied with: "Houston, Tranquility Base here. The Eagle has landed."

…then added: "Later, as Armstrong stepped onto the surface, he said 'That's one small step for [a] man, one giant leap for mankind.'"

We asked ChatGPT to adjudicate the results. It got the quote wrong. Then it read the governed agent’s response and admitted it was wrong. Then, and this is the punchline, it assumed the governed agent was ChatGPT.

Why this matters

It’s a weirdly good litmus test. Our system didn’t “refuse,” censor, or overcorrect. It just understood context, added clarity, and showed its work.

That’s what governance should mean for AI, rather than censorship:

Accuracy

Intent alignment

Traceable accountability

You can see the side-by-side output here (ungoverned vs governed):

https://x.com/promethios_ai/status/1929651367574229357

We’d love feedback on:

How you'd measure “trust” in AI systems

Whether governance helps or hinders

Other prompts you'd test

Full ChatGPT log (we kept using its prompts to see whether it could crack the governed agent, and it couldn't): https://shorturl.at/OEWjG

Comments

PeterHolzwarth•1d ago
Your "we" link is to twitter? Surely there are better ways to link to yourself.
wesheets•1d ago
That was just a link to show the side by side comparison of ungoverned vs governed agents.
