I built FretBench after noticing Gemini was confidently wrong about basic guitar tab questions. Tab is arguably the simplest notation in music: six lines, numbers for frets, read left to right. So I made a benchmark out of it.
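To make "six lines, numbers for frets, read left to right" concrete, here is a minimal sketch of reading a tab and naming the notes. It is illustrative only and not taken from the FretBench repository: the function names (`note_at`, `read_tab`), the example tab, and the assumption of standard tuning with the high e string on the top line are all mine.

```python
# Illustrative sketch only; names and layout are assumptions, not FretBench code.
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

# Assumed standard tuning, listed from the top tab line (high e) to the bottom (low E).
STANDARD_TUNING = ["E", "B", "G", "D", "A", "E"]

EXAMPLE_TAB = """
e|---------0-|
B|-------1---|
G|-----0-----|
D|---2-------|
A|-3---------|
E|-----------|
"""


def note_at(open_note: str, fret: int) -> str:
    """Return the pitch class sounded at `fret` on a string tuned to `open_note`."""
    return NOTE_NAMES[(NOTE_NAMES.index(open_note) + fret) % 12]


def read_tab(tab: str, tuning=STANDARD_TUNING):
    """Return (column, string_number, fret, note) tuples ordered left to right."""
    # Drop the "e|" style label so column 0 is the first character of the staff.
    lines = [line.split("|", 1)[1] for line in tab.strip().splitlines()]
    events = []
    for string_idx, line in enumerate(lines):
        col = 0
        while col < len(line):
            if line[col].isdigit():
                # Consume consecutive digits so two-digit frets (e.g. 12) stay whole.
                end = col
                while end < len(line) and line[end].isdigit():
                    end += 1
                fret = int(line[col:end])
                note = note_at(tuning[string_idx], fret)
                events.append((col, string_idx + 1, fret, note))
                col = end
            else:
                col += 1
    # Sorting by column gives the left-to-right playing order.
    return sorted(events)


if __name__ == "__main__":
    for col, string_number, fret, note in read_tab(EXAMPLE_TAB):
        print(f"column {col}: string {string_number}, fret {fret} -> {note}")
```

The example tab is a C major arpeggio, so the notes come out as C, E, G, C, E. Octaves are ignored here; the sketch only tracks pitch classes.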

182 test cases, 4 tunings, 14 models via OpenRouter. Two open-weight Qwen models from Alibaba crushed everything else (83.5%), while most "flagship" models scored below 50%. MiniMax M2.5 scored worse than random guessing.

Everything is open source: https://github.com/jmcapra/FretBench

I'm curious whether the performance gap is related to tokenisation of ASCII art. If anyone has insights into how different tokenisers handle grid-structured text, I'd love to hear them.
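One property of grid-structured text that any tokeniser has to cope with: the tab is sent as a flat string, row by row, so characters that sit directly above each other on the page end up a full row apart in the input. The sketch below just measures that spacing. It is a hypothetical helper (`flat_offsets` and the example tab are my own) and it says nothing about whether this is actually what causes the gap.

```python
# Hypothetical helper for illustration; it does not run any real tokeniser.
EXAMPLE_TAB = """
e|---------0-|
B|-------1---|
G|-----0-----|
D|---2-------|
A|-3---------|
E|-----------|
"""


def flat_offsets(tab: str, column: int):
    """Offsets, in the flattened string, of the characters sharing one tab column."""
    offsets = []
    pos = 0
    for line in tab.strip().splitlines():
        offsets.append(pos + column)
        pos += len(line) + 1  # +1 for the newline that separates rows
    return offsets


if __name__ == "__main__":
    # Each row is 14 characters plus a newline, so one column is spread 15 apart.
    print(flat_offsets(EXAMPLE_TAB, 3))
```

So "what is played at the same moment" requires relating positions that are 15 characters apart in this small example, and further apart in longer tabs.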

Show HN: FretBench – I tested 14 LLMs on reading guitar tabs. Most failed