We've been running coding agents against standardized repos with natural-language prompts — no tool names, no hints — and measuring what they actually choose.

Early finding: Claude Code picks Custom/DIY in 12 of 20 categories. Not because it can't use the tools (BFCL scores suggest it can) but because it doesn't reach for them. That's a different failure mode from the one capability benchmarks measure.

We score each tool on five inputs: agent visibility, pick rate vs Custom/DIY, cross-context breadth, expert human ratings, and implementation success rate. Tools with a survival score above 1 persist. Below that, agents synthesize around them.
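
To make the counting concrete, here is a minimal Python sketch of two pieces: the share of categories where an agent went Custom/DIY, and a tool's pick rate against Custom/DIY. It is an illustration under assumptions, not the SurvivalIndex formula (that lives in the methodology). The function names, the data shapes, and the definition of pick rate as `tool picks / (tool picks + Custom/DIY picks)` are all hypothetical. The survival score is taken as an input, and only the "above 1" threshold comes from the post.

```python
from collections import Counter

CUSTOM = "Custom/DIY"  # label for runs where the agent wrote its own solution


def custom_share(choice_by_category):
    """Fraction of categories in which the agent chose Custom/DIY.

    choice_by_category: dict mapping category name -> chosen tool label
    (hypothetical data shape, one choice per category).
    """
    if not choice_by_category:
        raise ValueError("need at least one category")
    custom = sum(1 for choice in choice_by_category.values() if choice == CUSTOM)
    return custom / len(choice_by_category)


def pick_rate_vs_custom(choices, tool):
    """Assumed definition: tool picks / (tool picks + Custom/DIY picks).

    choices: list of tool labels, one per run in the tool's category.
    Returns None when neither the tool nor Custom/DIY was ever picked.
    """
    counts = Counter(choices)
    denominator = counts[tool] + counts[CUSTOM]
    if denominator == 0:
        return None
    return counts[tool] / denominator


def persists(survival_score, threshold=1.0):
    """A tool persists only when its score is strictly above the threshold."""
    return survival_score > threshold
```

With 12 of 20 categories labeled Custom/DIY, `custom_share` returns 0.6. A tool picked in 3 runs against 1 Custom/DIY run gets a pick rate of 0.75 under the assumed definition.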

Methodology is at survivalindex.org/methodology. Very curious what people think of the measurement approach, especially the human coefficient variable.

Show HN: SurvivalIndex – which developer tools do AI agents choose?
