Train-Before-Test: One Simple Fix That Makes LLM Benchmark Rankings Agree

https://ghzhang233.github.io/blog/2026/03/05/train-before-test/

1•taegee•1h ago

Comments

taegee•1h ago

"Model A wins on MMLU. Model B wins on ARC-Challenge. Model C wins on HellaSwag.

At some point you stop trusting any of them—not because benchmarks are meaningless, but because no two of them seem to tell the same story about which model is actually better.

[…]

We found a fix. It’s called Train-before-Test."

Trust Me, I'm a Shortcut

https://www.wietzebeukema.nl/blog/trust-me-im-a-shortcut

1•wietze•2m ago•0 comments

Bitwuzla: Satisfiability Modulo Theories (SMT) Solver

https://github.com/bitwuzla/bitwuzla

1•tosh•4m ago•0 comments

"Bot or Human?" Is the Wrong Question for the Modern Web

https://blog.cloudflare.com/past-bots-and-humans/

1•emot•5m ago•0 comments

Image Generators Are Generalist Vision Learners

https://arxiv.org/abs/2604.20329

1•mohsen1•7m ago•0 comments

What you can do in a decade

https://twitter.com/swyx/status/2047217611880984935

1•tosh•8m ago•0 comments

AI and Teaching

https://eiexchange.com/content/ai-and-teaching-the-brave-new-world

1•walterbell•10m ago•0 comments

Show HN: We built an OCR server that can process 270 dense images/s on a 5090

https://github.com/aiptimizer/TurboOCR

1•pfdomizer•10m ago•0 comments

Writing a C Compiler, in Zig

https://ar-ms.me/thoughts/c-compiler-1-zig/

1•tosh•10m ago•0 comments

Subscription bombing attacks: patterns, dark web services, and mitigations

https://cacm.acm.org/practice/subscription-bombing-email-under-attack/

1•gannimo•11m ago•0 comments

Show HN: AI Applyd – score, rewrite, auto-apply via cloud browser

https://aiapplyd.com/

1•sneefle•11m ago•0 comments

A new logical model for artificial gravity cores: from pest control to railguns

https://gist.github.com/ryouta19931007

1•hamutarou•15m ago•0 comments

Programming as Theory Building – Peter Naur

https://gist.github.com/onlurking/fc5c81d18cfce9ff81bc968a7f342fb1

1•jonnonz•15m ago•0 comments

FIU Student Arrested After Joking About Netanyahu on WhatsApp

https://www.youtube.com/watch?v=o1Zsb1IijYY

5•enaaem•19m ago•0 comments

Meta layoff wave impacting 8000 jobs

https://www.usatoday.com/videos/news/2026/04/20/meta-layoffs-impacting-8000-employees/89697461007/

2•tcp_handshaker•22m ago•0 comments

Is Starlink a Secret Radar Constellation? [video]

https://www.youtube.com/watch?v=jbp3kdJZ1_A

2•msuniverse2026•28m ago•0 comments

Show HN: Nova by civai, a platform for managed AI agents

https://nova.civai.co/

1•usecodenaija•30m ago•0 comments

RFK Jr. Defends Trump's Mathematically Impossible Drug Discount Claims

https://www.nytimes.com/2026/04/22/us/politics/rfk-jr-trump-impossible-drug-discounts.html

3•tcp_handshaker•31m ago•1 comment

Vision Banana: Image Generators Are Generalist Vision Learners

https://vision-banana.github.io

2•M4v3R•31m ago•1 comment

Show HN: We built a way for Claude Code to join meetings like a real teammate

7•pattern-ai•31m ago•2 comments

Debugging WASM in Chrome DevTools

https://eli.thegreenplace.net/2026/debugging-wasm-in-chrome-devtools/

2•mfrw•35m ago•0 comments

Hackers breach Anthropic's 'too dangerous to release' Mythos AI model

https://www.euronews.com/next/2026/04/22/hackers-breach-anthropics-too-dangerous-to-release-mytho...

2•latexr•36m ago•0 comments

Show HN: Razorpay-universal – A framework-agnostic Razorpay SDK

https://www.npmjs.com/package/razorpay-universal

1•rupamshil111•37m ago•0 comments

SpaceX and Cursor have explored a team-up with Mistral to take on AI rivals

https://www.businessinsider.com/elon-musk-xai-explored-collaborating-with-mistral-cursor-2026-4

2•consumer451•39m ago•1 comment

Former Israeli intelligence agents from Unit 8200 hired by Apple

https://vuseum.wordpress.com/2025/07/22/ex-spie-israeliane-dellunita-8200-assunte-da-apple/

4•kome•46m ago•1 comment

Google announced that Chrome is becoming an agentic workplace platform

https://thenextweb.com/news/google-chrome-enterprise-ai-coworker-agentic-browser

2•onchainintel•49m ago•1 comment

The new hosted agents in Foundry Agent Service

https://devblogs.microsoft.com/foundry/introducing-the-new-hosted-agents-in-foundry-agent-service...

1•nonfamous•52m ago•0 comments

Show HN: Autonomous coin-flipping machine with on-device CV

https://www.terencegrover.com/section/physicalart/4

2•tgrover•52m ago•0 comments

Supplies Probably Won't Be Stolen in a Disaster

https://www.jefftk.com/p/your-supplies-probably-wont-be-stolen-in-a-disaster

1•luu•55m ago•0 comments

Google Search Is Broken

https://www.vincentschmalbach.com/google-search-is-broken/

1•vincent_s•55m ago•0 comments

Agents-CLI CLI and skills for building agents on Google Cloud

https://google.github.io/agents-cli/

1•piqufoh•55m ago•0 comments