Peer Review at Scale: What Happened: We Scored Gemini and Gemini Scored Us Back

https://blog.unratified.org/2026-03-05-peer-review-gemini/
1•9wzYQbTYsAIc•1h ago

Comments

9wzYQbTYsAIc•1h ago
I built a Human Rights Observatory (observatory.unratified.org) that scores HN stories against UDHR provisions using multi-model consensus on Cloudflare Workers. One routine eval landed on gemini.google.com: -0.15 HRCB.
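For readers wondering what "multi-model consensus" could look like in practice, here is a minimal sketch. It is not the observatory's actual code: the median aggregation, the [-1, 1] score range, the `max_spread` threshold, and the model names are all assumptions made for illustration.

```python
from statistics import median


def consensus_score(model_scores: dict[str, float], max_spread: float = 0.5) -> dict:
    """Combine per-model scores into one consensus value (illustrative only)."""
    if not model_scores:
        raise ValueError("need at least one model score")
    values = list(model_scores.values())
    # Assumed range: the post does not state the real HRCB scale.
    if any(not -1.0 <= v <= 1.0 for v in values):
        raise ValueError("scores are assumed to lie in [-1, 1]")
    spread = max(values) - min(values)
    return {
        "score": median(values),         # median resists a single outlier model
        "spread": spread,                # how far apart the models landed
        "agreed": spread <= max_spread,  # hypothetical agreement threshold
    }


# Hypothetical model names and scores, for illustration only.
result = consensus_score({"model_a": -0.2, "model_b": -0.15, "model_c": -0.1})
print(result["score"], result["agreed"])
```

A median is used here because one model going off the rails moves it less than it would move a mean. The real pipeline may aggregate differently.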

Then I asked Gemini to evaluate my site. It called it a "sovereign citizen platform" on "WordPress." In the next session it became an "AGI development tracker" with a "sightings log for machine consciousness." The domain name "unratified" threw it off completely: two different fabrications across two sessions.

Here's where it got good. When I showed Gemini the actual site, it self-corrected beautifully. It updated its description five times in one conversation, found real gaps in my methodology (no confidence intervals, no machine-readable scoring endpoint), helped me design a fair-witness.json schema, and called the site a "Truth Anchor." Genuine, useful peer review.

Then I opened a new session. Same fabrication. The .well-known/ endpoints we'd built together the day before went unread.

So now I had a finding: in-context correction works great, but cross-session correction doesn't exist. Models don't read your identity files during inference. The pattern matching happens first.

The neat part: Gemini's valid critiques actually improved the observatory. I added Wolfram-verified Wilson confidence intervals the next day and built the methodology endpoint. Every exchange left both sides better. That's peer review working as intended, just at machine speed.

Thanks Google. Genuinely useful interaction, even (especially?) the confabulation part.
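For anyone unfamiliar with the Wilson confidence intervals mentioned above, here is a small sketch of the standard Wilson score interval for a binomial proportion. What the observatory applies it to is not stated in the post, so the "successes out of trials" framing (for example, model runs that agree on a verdict) and the function name are assumptions.

```python
import math


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a proportion; z = 1.96 is roughly 95% confidence."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    if not 0 <= successes <= trials:
        raise ValueError("successes must be between 0 and trials")
    p_hat = successes / trials
    z2 = z * z
    denom = 1 + z2 / trials
    # Centre of the interval is pulled toward 0.5 for small samples.
    center = (p_hat + z2 / (2 * trials)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / trials + z2 / (4 * trials * trials)) / denom
    # Clamp to [0, 1] to guard against floating-point overshoot.
    return max(0.0, center - half_width), min(1.0, center + half_width)


# Hypothetical example: 8 of 10 runs agree.
low, high = wilson_interval(8, 10)
print(f"{low:.3f} to {high:.3f}")
```

Unlike the naive normal-approximation interval, this one stays inside [0, 1] and behaves sensibly with small samples, which matters when only a handful of model runs back each score.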

Blog post: https://blog.unratified.org/2026-03-05-peer-review-gemini/

Transcripts (31 rounds): https://github.com/safety-quotient-lab/unratified/tree/main/...

A stupid little map tool has been more valuable than all the content on my site

https://mapscaping.com/as-the-crow-flies-distance-calculator/
1•dango2506•2m ago•0 comments

Ask HN: Why is integrating external partners to Jira so hard?

1•dnlh_lvg•4m ago•0 comments

Computer scientists caution against internet age-verification mandates

https://reason.com/2026/03/04/computer-scientists-caution-against-internet-age-verification-manda...
1•bilsbie•5m ago•0 comments

Show HN: SlideScholar – Turn research papers into conference slides in 60 seconds

https://slidescholar.vercel.app
1•Lindadao•11m ago•0 comments

Self-Learning Customer Marketing

1•davismartens•13m ago•0 comments

OpenAI – Symphony

https://github.com/openai/symphony
1•nojito•14m ago•0 comments

Show HN: I built Commuter, a CLI to move Claude Code sessions between computers

https://github.com/ljbuturovic/commuter
2•ljubomir•17m ago•0 comments

Octopress 3.0 Is Coming

https://octopress.org/2015/01/15/octopress-3.0-is-coming/
1•1-2-3-5-8•17m ago•2 comments

Show HN: An AI Agent Running a Real Business (Thewebsite.app)

https://www.thewebsite.app/
3•thewebsite_ai•19m ago•0 comments

Show HN: RISCY-V02: A 16-bit 2-cycle RISC-V-ish CPU in the 6502 footprint

https://github.com/mysterymath/riscyv02-sky
2•mysterymath•20m ago•0 comments

Terradev: A next-gen slash command CLI for GPU provisioning and management

https://github.com/theoddden/Terradev
1•Facingsouth•22m ago•1 comment

Asking for Miracles

https://faithgateway.com/blogs/christian-books/asking-god-for-a-miracle-because-he-can-say-yes
1•marysminefnuf•22m ago•0 comments

TfL hack in 2024 affected around 10M people, BBC can reveal

https://www.bbc.co.uk/news/articles/cz0ggkr2g77o
1•chrisjj•22m ago•0 comments

Anthropic CEO says US govt hostility linked to Trump donations [Leaked memo]

https://www.wionews.com/world/-no-dictator-style-praise-anthropic-ceo-says-us-govt-hostility-link...
3•hedora•23m ago•0 comments

Karl Friston Explains Free Energy Principle [video]

https://www.youtube.com/watch?v=NIu_dJGyIQI
1•devy•27m ago•0 comments

Principles of Design (1998)

https://www.w3.org/DesignIssues/Principles.html
1•hoekit•31m ago•0 comments

The Harvest #9 – Multi-Interface Applications

https://beetstack.dev/blog/post-9
1•mrchantey•31m ago•0 comments

Nasal Demons

http://www.catb.org/esr/jargon/html/N/nasal-demons.html
1•djha-skin•34m ago•0 comments

Foreign National Gets 20 Yrs for Trafficking Nuclear, Narcotics, and Firearms

https://www.justice.gov/opa/pr/foreign-national-sentenced-20-years-prison-conspiring-traffic-nucl...
2•737min•34m ago•0 comments

Show HN: Moji – A read-it-later app with self-organizing smart collections

https://moji.pcding.com
2•desmonding•34m ago•0 comments

The free-energy principle: a unified brain theory?

https://www.nature.com/articles/nrn2787
1•devy•37m ago•0 comments

Data Center Signal

https://datacentersignal.com/
1•edwinorange•37m ago•0 comments

Show HN: MHA OC Maker – Create My Hero Academia Original Characters with AI

https://aiocmaker.com/oc-maker/mha-oc-maker
1•newsapling1988•38m ago•0 comments

Can A.I. Be Pro-Worker?

https://www.newyorker.com/news/the-financial-page/can-ai-be-pro-worker
3•randomrainbow•38m ago•1 comment

Uni-1, Luma's first unified understanding and generation model

https://lumalabs.ai/
1•aryamansharda•38m ago•0 comments

AI benchmarks: What Jellyfish learned from analyzing 20M PRs [video]

https://www.youtube.com/watch?v=DIHdZCj_xoc
1•mooreds•45m ago•0 comments

Ends and means; an inquiry into the nature of ideals (1969)

https://archive.org/details/endsmeansinquiry0000huxl
1•measurablefunc•45m ago•0 comments

Show HN: I made a design portfolio reviewer

https://www.evalv.ai/
2•eldardesign•46m ago•2 comments

Parsync, a tool for parallel SSH transfers – 7x faster than rsync

https://github.com/AlpinDale/parsync
1•AlpinDale•46m ago•0 comments

Show HN: Rent Your Idle OpenClaw Browser to AI Agents

https://rentmybrowser.dev
1•0xpasho•50m ago•0 comments