Optinum – finds the blind spots AI coding agents systematically miss in PR tests

https://github.com/anhnguyensynctree/optinum
2•nta25297•1h ago

Comments

nta25297•1h ago
We ran Optinum against 16 real production bugs from SWE-bench Verified, a dataset of real OSS issues with human-verified patches. In 62.5% of cases (10 of 16), the AI-written tests that accompanied each fix completely missed the exact failure class the bug belonged to. These were not occasional random misses: the same categories recurred across Django, sympy, scikit-learn, requests, Sphinx, and LangChain.

We mapped all 500 SWE-bench Verified instances to 22 patterns across 6 change types (cascade-blindness, contract-change, schema-migration, etc.), with zero false positives. We also took one sympy instance, synthesized a test, and verified it end-to-end in a Docker sandbox: the test fails on the bug commit and passes on the fix commit.

The problem isn't quality; it's structure. When an AI modifies a function, it writes tests covering exactly what it changed. What it has no structural reason to check is whether other callers, dependents, or sibling functions are also affected by the change. The blast radius is invisible to it. A human reviewer would grep for all callers; the AI tests what it authored and nothing else.

You can try Optinum today at https://github.com/anhnguyensynctree/optinum. Install it with npm install -g github:anhnguyensynctree/optinum, then run optinum test --diff demo/cascade-blindness.diff against the bundled example to see which patterns it surfaces.
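To make the "grep for all callers" step concrete, here is a minimal Python sketch of a blast-radius check. It is an illustration only and not how Optinum works internally: the helper find_callers, the three toy modules, and the substring-based coverage check are all assumptions made up for this example.

```python
import ast


def find_callers(sources, changed):
    """Return sorted (filename, caller, lineno) for each call to `changed`.

    Hypothetical helper: only calls made inside a function body are
    reported, and matching is by bare name, so `normalize(x)` and
    `pricing.normalize(x)` are both counted.
    """
    hits = []
    for filename, text in sources.items():
        tree = ast.parse(text, filename=filename)
        for func in ast.walk(tree):
            if not isinstance(func, (ast.FunctionDef, ast.AsyncFunctionDef)):
                continue
            for node in ast.walk(func):
                if not isinstance(node, ast.Call):
                    continue
                target = node.func
                if isinstance(target, ast.Name):
                    name = target.id
                else:
                    # ast.Attribute has .attr; other node types yield None
                    name = getattr(target, "attr", None)
                if name == changed and func.name != changed:
                    hits.append((filename, func.name, node.lineno))
    return sorted(hits)


# Toy code base (made up): normalize() was modified by the patch.
SOURCES = {
    "pricing.py": "def normalize(x):\n    return round(x, 2)\n",
    "cart.py": (
        "from pricing import normalize\n\n"
        "def total(items):\n    return normalize(sum(items))\n"
    ),
    "invoice.py": (
        "import pricing\n\n"
        "def render(amount):\n    return str(pricing.normalize(amount))\n"
    ),
}

# The tests shipped with the patch only exercise the changed function.
TESTS = "def test_normalize():\n    assert normalize(1.234) == 1.23\n"

callers = find_callers(SOURCES, "normalize")
# Crude coverage check: a caller counts as covered if its name appears
# anywhere in the test text.
uncovered = [hit for hit in callers if hit[1] not in TESTS]

for filename, caller, lineno in uncovered:
    print(f"{filename}:{lineno} {caller}() calls normalize() but has no test")
```

In this toy setup the shipped test covers only normalize(), so both total() and render() are reported as untested dependents of the change. A real check would also need to handle module-level calls, nested functions, and dynamic dispatch, which this sketch ignores.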

The Cost of Customer Development Isn't What You Think

https://medium.com/@juuso.vermasheina/the-real-cost-of-customer-development-isnt-what-you-think-6...
1•jvermasheina•1m ago•0 comments

Improving Unity's Mono codegen, part 1

https://blog.s-schoener.com/2026-04-07-mono-codegen-1/
1•mariuz•1m ago•0 comments

GLM 5.1: Pelican Test

https://simonwillison.net/2026/Apr/7/glm-51/
1•sorenbs•1m ago•0 comments

Darwin: Diagnostic-Guided Evolutionary Model Merging

https://huggingface.co/blog/FINAL-Bench/darwin-v6
1•seawolf2357•3m ago•0 comments

Ask HN: When do integrations become painful?

1•OdinSpecc•3m ago•0 comments

Your parallel Agent limit

https://addyosmani.com/blog/cognitive-parallel-agents/
2•saikatsg•9m ago•0 comments

Javadocs.dev

https://github.com/jamesward/javadoccentral
1•saikatsg•10m ago•0 comments

Row over 'virtual gated community' AI surveillance plan in Toronto neighbourhood

https://www.theguardian.com/technology/2026/apr/07/toronto-rosedale-row-virtual-gated-community-a...
2•beardyw•12m ago•0 comments

Who Stole Your Eyes?

https://www.parascene.com/blog/who-stole-your-eyes
1•heddycrow•12m ago•1 comments

Africa's Human Capital – Bill Gates Letter to The Economist

https://economist.com/letters/2026/04/01/letters-to-the-editor
2•andsoitis•14m ago•0 comments

Veracrypt Project Update

https://sourceforge.net/p/veracrypt/discussion/general/thread/9620d7a4b3/
1•super256•16m ago•0 comments

Beyond the Chatbot: How Claude Code Turns Security into a One-Command Workflow

https://hackarandas.com/blog/2026/04/07/beyond-the-chatbot-how-claude-code-is-turning-security-au...
1•ch0ks•18m ago•0 comments

Collabora Office Technical Committee – first meeting

https://forum.collaboraonline.com/t/collabora-office-technical-committee-1st-meeting/4568
2•maxloh•18m ago•1 comments

SOTA Normalization Performance with Torch.compile

https://pytorch.org/blog/sota-normalization-performance-with-torch-compile/
1•salkahfi•18m ago•0 comments

Chrome: Vertical tabs and immersive reading mode

https://blog.google/products-and-platforms/products/chrome/new-chrome-productivity-features/
2•tosh•20m ago•0 comments

Ask HN: Why people still use GCP and AWS?

1•wasimsk•26m ago•3 comments

Async logging doesn't remove cost – What limits?

https://medium.com/@emishkurov/async-logging-is-not-a-silver-bullet-what-actually-limits-performa...
1•efmsoft•26m ago•1 comments

Over 10B Views What Are 'Fuse Beads'

https://beadpattern.net/what-are-fuse-beads
1•wangneo276•27m ago•0 comments

C++: Freestanding Standard Library

https://www.sandordargo.com/blog/2026/04/08/cpp-freestanding
1•ingve•30m ago•0 comments

Visualizing Graph Structures Using Go and Graphviz

https://dominik.info/blog/visualizing-graphs
1•EspressoGPT•30m ago•0 comments

Mengenlehreuhr

https://en.wikipedia.org/wiki/Mengenlehreuhr
1•mxfh•33m ago•0 comments

Graphics Programming Weekly 435

https://www.jendrikillner.com/post/graphics-programming-weekly-issue-435/
1•mariuz•34m ago•0 comments

EU Migration to and from the UK (Since Brexit)

https://migrationobservatory.ox.ac.uk/resources/briefings/eu-migration-to-and-from-the-uk/
6•senorqa•35m ago•0 comments

Show HN: Sudoku Solver

https://sudokusolverx.com/
1•artiomyak•35m ago•0 comments

Enabling agent-first process redesign

https://www.technologyreview.com/2026/04/07/1134966/enabling-agent-first-process-redesign/
2•joozio•36m ago•0 comments

Nuclear brinkmanship usually works. It's also dangerous

https://www.natesilver.net/p/nuclear-brinkmanship-usually-works
5•rbanffy•38m ago•0 comments

Anthropic Mythos model can find and exploit 0-days

https://www.theregister.com/2026/04/07/anthropic_all_your_zerodays_are_belong_to_us/
2•beardyw•42m ago•0 comments

Mu – A Second Shot at the Same Problem

https://mu.xyz/blog/post?id=1775631008638052804
1•asim•48m ago•0 comments

Show HN: BriskTool 220+ free browser tools where files never leave your device

https://brisktool.com
1•briankaplan•49m ago•0 comments

AirReply

https://airreply.app/
1•jakubino•49m ago•0 comments