Codex's precision and attention to detail is *crazy* when set up correctly

3•ditchfieldcaleb•1h ago
Lately I've been working on a Tower Defense game with Codex, in part to learn how game development works and in part to see how far I can get using just Codex, no manual coding at all. I've got my AGENTS.md & my CODESTYLE.md & six other ALLCAPS .md files, and am working on some refactoring to keep the codebase clean & file sizes low.

And then I see this in the ExecPlan for my latest refactor:

---

# Observations

- Observation: The refactor made the screenshots pixel-identical after the baseline was recaptured correctly.

Evidence: sha256sum screenshots/before-implementation-x.png screenshots/after-implementation-x.png reported matching hashes for before/after pairs 1, 2, and 3.

---

Which is crazy! I've never told Codex to do a SHA compare on before/after screenshots of the app, but I do have instructions in my PLANS.md to take before & after screenshots of the webapp for the game to make sure we avoid frontend regressions (it uses GPT-Image-2 for analysis). So for changes that don't touch the frontend, of course nothing should differ between screenshots taken at identical timestamps after game start.

But doing an explicit SHA compare - that's just...not something I would've ever thought of. Wild.
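
For anyone who wants to reproduce the check, here is a minimal Python sketch of a before/after hash comparison. The helper names `sha256_of` and `compare_pairs` and the numbered filename pattern are assumptions for illustration, loosely patterned on the filenames in the ExecPlan excerpt; this is not what Codex actually ran.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_pairs(directory: Path, indices) -> dict:
    """Map each index to True when the before/after files are byte-identical.

    Assumes hypothetical names before-implementation-<i>.png and
    after-implementation-<i>.png inside `directory`.
    """
    results = {}
    for i in indices:
        before = directory / f"before-implementation-{i}.png"
        after = directory / f"after-implementation-{i}.png"
        results[i] = sha256_of(before) == sha256_of(after)
    return results
```

Calling `compare_pairs(Path("screenshots"), [1, 2, 3])` would return a dict of booleans, one per pair. A matching hash only says the files are byte-identical; a mismatch still needs a real image diff to see what changed.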

Comments

TacticalCoder•1h ago
> But doing an explicit SHA compare - that's just...not something I would've ever thought of. Wild.

If I'm not mistaken, SGI (Silicon Graphics, Inc.) was already doing that to prevent regressions 40 years ago: maybe not SHA, but they were taking "screenshots" of the entire screen at a time t and computing some kind of checksum to verify (without having to compare every single pixel in the happy case) that an enhancement/optimization to their rendering pipeline that wasn't supposed to change the output did indeed generate the exact same image as before.

It's basically a 40-year-old technique: not too sure what's so wild about it.

ditchfieldcaleb•1h ago
Sure, it's been done before, and I'm sure not just limited to SGI, but no one does this for regular apps these days - never heard of it before. I just find it neat that Codex came up with this - not something I ever would have.

kay_o•54m ago
> but no one does this for regular apps these days - never heard of it before

Everyone does this to check that files are identical, be it SHA, MD5, or something else. I cannot imagine any other method that would come to mind first for checking whether two files are the same.

I don't mean to offend, but I quite literally mean everyone does this. Every software updater, every game patcher, anything that checks whether two binary files are identical (pixel perfect/lossless in this case: BMP, or PNG created by the same encoder from the same inputs, would qualify; JPG likely would not): all of them do exactly this.

GPT analysis, or similarity and image-chunk hashing, would not be the first thing you turn to if what you wanted was an exact, pixel-perfect match. I am curious what your background is if this is the case.
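
To illustrate the "same encoder, same inputs" caveat, here is a small Python sketch. PNG compresses pixel data with DEFLATE, so `zlib` at two compression levels is used as a stand-in for two different PNG encoder settings; the `pixels` buffer is made-up data, not a real screenshot.

```python
import hashlib
import zlib

# Stand-in for raw pixel data; a real screenshot would be decoded image bytes.
pixels = bytes(range(256)) * 64

# Two hypothetical "encoders" that differ only in compression level.
encoded_fast = zlib.compress(pixels, 1)
encoded_best = zlib.compress(pixels, 9)


def sha256_hex(data: bytes) -> str:
    """Return the hex SHA-256 digest of a bytes object."""
    return hashlib.sha256(data).hexdigest()


# Compare the encoded files by hash, as a file-level check would.
same_file_bytes = sha256_hex(encoded_fast) == sha256_hex(encoded_best)

# Compare the decoded content, which is what "pixel-identical" really means.
same_pixels = zlib.decompress(encoded_fast) == zlib.decompress(encoded_best)

# The same encoder settings on the same input should reproduce the same bytes.
same_encoder_repeatable = zlib.compress(pixels, 9) == encoded_best

print(same_file_bytes, same_pixels, same_encoder_repeatable)
```

With a standard zlib build, the file hashes should differ while the decoded content matches. So a hash mismatch does not prove the pixels changed, but a hash match between files from the same pipeline is a cheap, strong signal that nothing did.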

tomjakubowski•40m ago
https://en.wikipedia.org/wiki/Checksum
