The core thesis: Models suffer from "Parametric Hubris". They default to answering from their training data instead of invoking search tools, even when browsing is enabled.
Data: GPT-5 only triggers search in ~31% of prompts.
The Fix: A pipeline called "Veritas" that enforces 100% retrieval, so no answer is allowed to come from parametric memory alone (a rough sketch of that loop follows the repo link below).
Results: Achieves 89.1% F-Score on SimpleQA Verified (vs 51.6% for GPT-5 and 72.1% for Gemini 3 Pro).
Cost/Model: Built on Gemini 2.5 Flash Lite (one of the cheapest available models) at ~$0.002 per query.
Trade-off: It’s slow (~115s per query), but accurate.
The paper argues that hallucination isn't a capability problem, but an architectural discipline problem. Code and data are open source.
Paper/Repo: https://github.com/lamLumae/Project-Lutum-Veritas
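To make the "no parametric memory allowed" idea concrete, here is a minimal sketch of what a forced-retrieval loop like Veritas could look like. This is not the repo's actual code: the `search`/`llm` callables, the `REFUSE` protocol, and the query-rewrite step are illustrative assumptions; the only point carried over from the summary is that the model must ground every answer in retrieved snippets or decline.

```python
"""Minimal sketch of a forced-retrieval QA loop in the spirit of Veritas.

Assumptions (not taken from the repo): `search` and `llm` are placeholders
you would wire to a real search API and a real model client. The key
constraint is that the model only ever answers from retrieved snippets,
never from its parametric memory alone.
"""

from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Snippet:
    url: str
    text: str


# The two external dependencies this sketch assumes.
SearchFn = Callable[[str], List[Snippet]]  # query -> ranked snippets
LlmFn = Callable[[str], str]               # prompt -> completion


def answer_with_forced_retrieval(
    question: str,
    search: SearchFn,
    llm: LlmFn,
    max_rounds: int = 3,
) -> str:
    """Answer `question` using only retrieved evidence.

    Each round: search, then ask the model to answer strictly from the
    snippets or reply REFUSE if the evidence is insufficient. A refusal
    triggers a reformulated query rather than a parametric guess.
    """
    query = question
    for _ in range(max_rounds):
        snippets = search(query)
        evidence = "\n\n".join(
            f"[{i + 1}] {s.url}\n{s.text}" for i, s in enumerate(snippets)
        )
        prompt = (
            "Answer the question using ONLY the numbered sources below. "
            "Cite sources as [n]. If the sources do not contain the answer, "
            "reply exactly REFUSE.\n\n"
            f"Sources:\n{evidence}\n\nQuestion: {question}\nAnswer:"
        )
        answer = llm(prompt).strip()
        if answer != "REFUSE" and "[" in answer:
            return answer  # grounded answer with at least one citation
        # Evidence was insufficient: ask for a more specific search query.
        query = llm(
            f"Rewrite this as a more specific web search query: {question}"
        ).strip()
    return "Unable to answer from retrieved evidence."
```

The real pipeline is presumably more involved (the ~115s latency suggests multiple search and verification rounds), but the discipline is the same: no retrieval, no answer.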