Today, we're launching the Open Benchmarks Grants: a $3M commitment to fund open-source and academic teams building benchmarks for AI agents. In partnership with HuggingFace, PrimeIntellect, FactoryHQ, Together, Harbor, and PyTorch, the grants provide funding, data development support, and research collaboration.

Our ability to measure AI has been outpaced by our ability to develop it, and we believe this evaluation gap is one of the most important problems in AI. Open benchmarks are a key lever for advancing AI safely and responsibly, but the academic and open-source teams driving them often hit resource constraints, especially as the complexity of what tomorrow’s benchmarks need to cover keeps growing.

We think the next wave of benchmarks needs to push on three axes:

- Environment complexity: How realistic is the operating environment?
- Autonomy horizon: How far can an agent operate independently?
- Output complexity: How sophisticated is the work product?

Happy to answer questions about the grants and the framework, and would love to hear more about what you’re building!

Show HN: Open Benchmarks Grants – a $3M commitment to close the AI eval gap