LOAB is an open-source benchmark for evaluating whether AI agents can follow regulated lending processes, not just produce the right final answer.
The motivation is simple: in mortgage lending, regulators don't care if you got the right answer. They care whether you followed the right process. Skip a KYC check, pull a credit bureau report before getting privacy consent, or approve a loan without the required policy lookup — that's a compliance failure even if the outcome was correct.
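Rules like these can be modeled as precedence constraints over the agent's action trace: certain actions are only valid if their prerequisites occurred earlier. Here is a minimal sketch of that idea; the action names and rules are illustrative placeholders, not LOAB's actual schema.

```python
# Hypothetical sketch: check precedence constraints over an agent's action trace.
# Action names and prerequisite rules are illustrative, not LOAB's actual schema.

REQUIRED_BEFORE = {
    "pull_credit_report": ["obtain_privacy_consent"],
    "approve_loan": ["kyc_check", "policy_lookup"],
}

def process_violations(trace: list[str]) -> list[str]:
    """Return a message for each action whose prerequisites
    did not occur earlier in the trace."""
    seen: set[str] = set()
    violations = []
    for action in trace:
        for prereq in REQUIRED_BEFORE.get(action, []):
            if prereq not in seen:
                violations.append(f"{action} before {prereq}")
        seen.add(action)
    return violations

# A trace that pulls the credit report before consent fails,
# even though the final approval had all its prerequisites:
trace = ["kyc_check", "pull_credit_report", "obtain_privacy_consent",
         "policy_lookup", "approve_loan"]
print(process_violations(trace))  # ['pull_credit_report before obtain_privacy_consent']
```

The point of the trace-based view is that the grader never needs to inspect the agent's reasoning, only the ordered log of tool calls it actually made.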
Current AI benchmarks don't measure this. They evaluate what the agent decided, not how it got there.
LOAB simulates a fictional Australian lender with mock regulatory APIs, multi-agent roles mirroring real bank operations, and a five-dimension scoring rubric derived from actual lending law. A run only passes if the outcome is correct AND the process was correct.
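The pass rule is a strict conjunction: a run fails if either the outcome or any process dimension fails. A minimal sketch, assuming a boolean score per dimension; the dimension names below are placeholders, not the rubric's actual five categories.

```python
from dataclasses import dataclass, field

@dataclass
class RunResult:
    outcome_correct: bool
    # One boolean per process dimension. The names used here are
    # placeholders; the real rubric defines five dimensions derived
    # from lending law.
    process_scores: dict[str, bool] = field(default_factory=dict)

def passes(run: RunResult) -> bool:
    """A run passes only if the outcome is correct AND every
    process dimension passes."""
    return run.outcome_correct and all(run.process_scores.values())

# Correct decision, but one process dimension failed -> the run fails.
run = RunResult(outcome_correct=True,
                process_scores={"consent": True, "sequencing": False})
print(passes(run))  # False
```

This conjunction is exactly what opens the gap in the results below: a model can keep its outcome score high while losing runs to any single process dimension.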
The main finding: frontier models achieve 67-75% outcome accuracy, but only 25-42% of runs pass when process compliance is also required. It's surprisingly hard to get an AI agent to follow a prescribed sequence of steps even when it clearly "knows" the right answer.