The pipeline is straightforward at a high level: transcribe episodes with faster_whisper running locally on an RTX 3060, run the text through GPT-5-mini to pull out structured book mentions, and store everything in Azure SQL and Blob storage. The frontend is a small Flask app using HTMX, Tailwind, and some D3 for the visualizations.
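The transcription step looks roughly like this (model size, options, and the helper name are illustrative, not the exact config):

```python
# Minimal sketch of the transcription step with faster_whisper
# (model size and options here are illustrative).
from faster_whisper import WhisperModel

# Runs locally on the GPU; fp16 keeps VRAM usage manageable.
model = WhisperModel("large-v3", device="cuda", compute_type="float16")

def transcribe_episode(audio_path: str) -> str:
    # transcribe() returns a lazy generator of segments plus language/duration info
    segments, _info = model.transcribe(audio_path, vad_filter=True)
    return " ".join(segment.text.strip() for segment in segments)
```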
The part that turned out to be far more time consuming than expected was deduping. Everything else scaled nicely, but normalizing book titles is still the one piece I can’t fully automate without quality drifting. Fuzzy matching gets you most of the way, but the long tail of book titles is huge. I ended up building a tiny internal Flask UI just to confirm or split fuzzy matches by hand; it also lets me review the surrounding context for each mention to check accuracy. It's the only place in the system where a human is still in the loop.
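To give a sense of the first pass, it's essentially this kind of scoring, with the review UI sitting on top of the candidates it produces (rapidfuzz here is a stand-in, and the cutoff is illustrative):

```python
# Sketch of first-pass candidate matching before human review
# (library choice, scorer, and cutoff are illustrative).
from rapidfuzz import fuzz, process, utils

def candidate_matches(new_title: str, known_titles: list[str], cutoff: float = 90.0):
    """Return (title, score, index) tuples that plausibly refer to the same book."""
    return process.extract(
        new_title,
        known_titles,
        scorer=fuzz.token_sort_ratio,
        processor=utils.default_process,  # lowercases and strips punctuation consistently
        score_cutoff=cutoff,
        limit=5,
    )
```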
A few other unexpected issues came up: some podcast RSS feeds randomly duplicate or link to broken episodes, CUDA can crash if I’m not careful with garbage collection between Whisper runs, and LLM extraction occasionally fails if the model doesn’t return exactly the JSON shape I expect.
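For the JSON issue, the guard amounts to shape-checking the response before anything touches the database; something along these lines (the expected fields are illustrative, not the real schema):

```python
# Sketch of the shape check on the LLM's extraction output
# (required fields here are illustrative).
import json

REQUIRED_FIELDS = {"title", "author", "context"}

def parse_mentions(raw_response: str) -> list[dict] | None:
    """Return the list of mention dicts, or None if the response isn't the expected shape."""
    try:
        data = json.loads(raw_response)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, list):
        return None
    for item in data:
        if not isinstance(item, dict) or not REQUIRED_FIELDS.issubset(item):
            return None
    return data
```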
One surprising pattern emerged: the long tail is enormous. A handful of books are mentioned constantly, but thousands more appear exactly once.
If you want to see the current state of it, the reports and visualizations are here: https://www.mavensignal.com
Happy to answer anything about the pipeline, LLM prompting, dedupe logic, or the stack in general.