Removing 95% of podcast ads with transcript segmentation and LLMs

https://benbowler.com/how-i-built-an-llm%e2%80%91assisted-pipeline-that-removes-95-of-podcast-ads-podcastadblock-app/
2•benbowler•1h ago

Comments

benbowler•1h ago
I’ve been listening to podcasts for 15+ years. Ads used to be short and host-read. Now, some shows I follow have 15+ minutes of loud, compressed ads per hour.

I built a system to strip them out automatically. It takes a podcast feed, processes each episode, and outputs an ad-free feed compatible with any player.

What didn’t work:

Full-transcript one-shot prompting: LLMs would return a few timestamps, then stop—context was too broad.

Keyword-based detection: Too many false positives and false negatives, especially with “house ads” and blended sponsor mentions.

What worked:

Segmentation + local scoring: Split transcripts into overlapping windows. Ask the LLM for “ad likelihood” per window—short prompts keep context tight.

Multi-head prompting: Separate prompts for (a) brand ads (URLs, promo codes, sponsor language) and (b) cross-promos. The cross-promo path compares segments to the show’s own notes/description to spot “subscribe to X podcast” segments.

Feedback loop: Users can flag missed ads; reported brand/podcast names bias future runs.

Post-processing: Merge adjacent detections, ignore <10s blips, smooth cut boundaries.

Speaker diarization (WhisperX): Detects voice/tone shifts to distinguish “host in-topic” from “host reading copy.”
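To make the segmentation step concrete, here is a minimal sketch of splitting a timestamped transcript into overlapping windows. It is an illustration, not the code from the actual pipeline: the `Segment` type, the `overlapping_windows` helper, and the 60-second window with a 30-second step are all assumptions chosen for the example. Each resulting window would then be sent to the LLM on its own for an "ad likelihood" score.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One timestamped transcript segment (hypothetical shape)."""
    start: float  # seconds from the start of the episode
    end: float
    text: str


def overlapping_windows(segments, window_s=60.0, step_s=30.0):
    """Group segments into overlapping time windows.

    Returns a list of (window_start, window_end, text) tuples.
    window_s and step_s are illustrative defaults, not values from the post.
    """
    if window_s <= 0 or step_s <= 0:
        raise ValueError("window_s and step_s must be positive")
    if not segments:
        return []

    last_end = max(s.end for s in segments)
    windows = []
    t = segments[0].start
    while t < last_end:
        hi = t + window_s
        # Keep every segment that overlaps the half-open interval [t, hi).
        chunk = [s for s in segments if s.start < hi and s.end > t]
        if chunk:
            text = " ".join(s.text for s in chunk)
            windows.append((t, min(hi, last_end), text))
        t += step_s
    return windows
```

Because the step is smaller than the window, an ad read that straddles a window boundary still appears whole in at least one neighboring window.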

Across interviews, daily news, and narrative shows, this consistently removes ~95% of ads. The remaining 5% are sponsor mentions woven directly into content—hard by design.
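The post-processing step described above (merge adjacent detections, ignore blips under 10 seconds) can be sketched roughly as follows. This is an assumption-laden illustration rather than the real implementation: `merge_detections` is a hypothetical helper, and the 2-second gap tolerance is invented for the example; only the 10-second minimum comes from the description. Boundary smoothing is not covered here.

```python
def merge_detections(spans, max_gap_s=2.0, min_len_s=10.0):
    """Merge nearby ad spans and drop very short ones.

    spans: iterable of (start, end) pairs in seconds, in any order.
    max_gap_s: spans separated by at most this gap are merged (assumed value).
    min_len_s: merged spans shorter than this are discarded.
    """
    merged = []
    for start, end in sorted(spans):
        if merged and start - merged[-1][1] <= max_gap_s:
            # Close enough to the previous span: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    # Filter after merging so short fragments of a longer ad break survive.
    return [(s, e) for s, e in merged if e - s >= min_len_s]
```

Filtering only after the merge matters: two 6-second detections one second apart form a 13-second span that is kept, whereas filtering first would discard both.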

Infra: hosted on DigitalOcean; inference runs on Modal.com.

Full write-up (with prompts, heuristics, and some failure cases): https://PodcastAdBlock.app/blog/building-podcast-adblock

Curious if others have tackled similar problems—especially around hard-to-detect “native” ads or more efficient diarization approaches.
