Show HN: Octopoddy – iOS Podcast App Using Transcripts and LLMs to Skip Ads

https://apps.apple.com/us/app/octopoddy-alpha/id6753860890
1•spellbind-dare•1h ago
TL;DR: I'm a fan of podcasts and I despise ads, so I built an iOS app that detects and skips in-audio ad content.

Motivation: I love podcasts, especially multi-hour ones that go into detail on niche topics. One thing that puts me off some podcasts is having the flow interrupted, especially mid-sentence, by dynamically inserted ads. Last year this led me down a rabbit hole of experimenting with removing ads from the podcasts I listen to.

Experimentation: At first I tried using Whisper to generate a transcript of an episode, then fed it to ChatGPT and asked it to find the ad timestamps. This worked surprisingly well. I had a proof of concept working but I wanted something I could actually use on my phone.
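
A minimal sketch of the transcript -> LLM step described above, shown without any network calls. The function names, the prompt wording, and the JSON reply format are all hypothetical and are not taken from the app; the model reply in the example is a hand-written stand-in.

```python
import json

def build_prompt(segments):
    """Render timestamped transcript lines and ask for ad ranges as JSON."""
    lines = [f"[{s['start']:.1f}-{s['end']:.1f}] {s['text']}" for s in segments]
    return (
        "Below is a podcast transcript with timestamps in seconds.\n"
        "Reply with only a JSON list of [start, end] pairs, one per ad segment.\n\n"
        + "\n".join(lines)
    )

def parse_ad_ranges(reply, duration):
    """Parse the model reply; drop anything malformed or outside the episode."""
    try:
        raw = json.loads(reply)
    except json.JSONDecodeError:
        return []
    if not isinstance(raw, list):
        return []
    ranges = []
    for item in raw:
        if not (isinstance(item, list) and len(item) == 2):
            continue
        if not all(isinstance(v, (int, float)) for v in item):
            continue
        # Clamp to the episode and discard empty or inverted ranges.
        start = max(0.0, float(item[0]))
        end = min(float(duration), float(item[1]))
        if start < end:
            ranges.append((start, end))
    return sorted(ranges)

# Hypothetical transcript segments (times in seconds).
segments = [
    {"start": 0.0, "end": 4.2, "text": "Welcome back to the show."},
    {"start": 4.2, "end": 9.8, "text": "This episode is brought to you by..."},
]
prompt = build_prompt(segments)
# Hand-written stand-in for a model reply, not real model output.
ad_ranges = parse_ad_ranges("[[4.2, 9.8], [10, 5]]", duration=3600)
```

Validating the reply before trusting it matters here because a bad range would skip real content rather than an ad.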

Productionization: Next I wondered: if I were to productionize my prototype, how would I do it? From my experiments, the main issue would be the volume of audio transcription required to satisfy a moderate-to-heavy podcast listener. I estimated that ~100 hours of audio per month would be required per user. In testing I used OpenAI's hosted Whisper, charged at $0.006/min. That sounds quite cheap. What would that be for 100 hours? $0.006/min -> $0.36/hour -> $36.00/100 hours. S%#t. $36/user/month is way too expensive. If you were to turn this into a business at that rate you'd probably need to be charging at least $50/month. No one's going to pay that.
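
The arithmetic above can be checked with a few lines. This is only a sketch: the helper name is made up for illustration, and the price and the 100 hours/month figure are the ones quoted in the post.

```python
from decimal import Decimal

def monthly_transcription_cost(price_per_minute, hours_per_month):
    """Cost of transcribing hours_per_month of audio at a per-minute price."""
    return price_per_minute * 60 * hours_per_month

# $0.006/min hosted Whisper price and ~100 hours/user/month, both from the post.
whisper_api_cost = monthly_transcription_cost(Decimal("0.006"), 100)
print(f"${whisper_api_cost:.2f} per user per month")
```

Decimal is used instead of float so the per-minute price does not pick up binary rounding noise.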

What if we did everything on device? In iOS 26 there are on-device APIs for both an LLM and speech-to-text. I got the podcast audio -> transcript -> LLM-detected ad segment pipeline working. Excellent! The next problem is that an iPhone is not a data-center-grade GPU. The pipeline was significantly slower than my first attempt. Before, it would take <= ~4 mins, while the iPhone pipeline could take up to 10 minutes for multi-hour podcasts. The on-device approach would be too slow for a good UX. Not to mention that each time I ran a test of the iPhone-based pipeline my phone would get really hot and the battery would drain fast.

Back to square one. The only other approach (at least that I could think of) would be to manage the transcription infra myself. Given this is just a side project, I wanted simple infra. Ideally I would be able to use something like AWS Lambda with GPUs (does not exist, I checked). My research showed GCP has serverless Cloud Run with a GPU option. Now we were starting to cook. I built a spike with GCP and had the ad detection working. Just as I was starting to get excited, I ran a load test on Cloud Run, which revealed a new problem.

GPUs are in hot demand. Who knew? GCP is (or at least was) limiting the number of GPUs per customer. My account was only allotted ~3 GPUs (I tried raising a support ticket for a higher limit but no luck). This was a huge bottleneck: transcribing an episode would saturate the resources of one GPU, so 3 GPUs meant a pitiful 3 episodes being transcribed at once :(

Further into the rabbit hole, research led me to Runpod, which offers serverless GPUs. The low-end GPUs go for ~$0.50/hour (that's per hour of GPU time, not per hour of audio transcribed), depending on the GPU used. Now, with more reliable access to enough GPUs, I could run a load test again. It worked out to be ~$0.02 per hour of audio, or ~$2.00 per 100 hours of audio transcribed. At $2 per user per month this is looking a lot more reasonable. $2 is a 94% decrease compared to using the OpenAI API at $36. To be fair to OpenAI, the transcripts I would get from their API would be more accurate. When tuning the Runpod implementation I was optimizing for speed and low cost. For the ad detection I found that a slightly less accurate transcript did not matter too much when getting the LLM to pick out the ad segments, so trading accuracy for speed + cost made sense here.
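
A quick check of the comparison above, using only the approximate figures quoted in the post. The implied speedup is a derived number that the post does not state: it assumes the GPU is billed only while transcribing and that the ~$0.50 GPU rate is the one behind the ~$0.02 result.

```python
from decimal import Decimal

api_cost = Decimal("36")       # $/user/month via the hosted API (from the post)
selfhost_cost = Decimal("2")   # $/user/month on serverless GPUs (from the post)
gpu_rate = Decimal("0.50")     # $/GPU-hour for a low-end GPU (approximate)
audio_rate = Decimal("0.02")   # $/hour of audio transcribed (approximate)

# Percentage saved by moving off the hosted API.
saving_pct = (api_cost - selfhost_cost) / api_cost * 100

# Hours of audio transcribed per GPU-hour, under the assumptions above.
implied_speedup = gpu_rate / audio_rate

print(f"saving: {saving_pct:.1f}%")
print(f"implied speed: {implied_speedup}x real time")
```

Since both rates are rounded estimates, the speedup is best read as an order of magnitude rather than a benchmark.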

Anyway, that is my story of building the Octopoddy ad detection pipeline. Please try it out; I'd love to hear what you think. I'd be happy to provide more details on any of this in the comments if you'd like :)

CSS subgrid is super good

https://dbushell.com/2026/04/02/css-subgrid-is-super-good/
1•speckx•44s ago•0 comments

Vtables Aren't Slow (Usually)

https://louis.co.nz/2026/01/24/vtable-overhead.html
1•hmpc•1m ago•0 comments

Gloamy: An open source Claude Cowork alternative

https://github.com/iBz-04/gloamy
1•Ibz04•3m ago•1 comments

Aragorn's Tax Policy and Other Weird Shibboleths

https://reactormag.com/aragorns-tax-policy-and-other-weird-shibboleths/
1•baud147258•3m ago•0 comments

Apple at 50: My journey to the Mac

https://anderegg.ca/2026/04/01/apple-at-50-my-journey-to-the-mac
1•Brajeshwar•3m ago•0 comments

Oil prices soar and shares drop after Trump threatens more Iran strikes

https://www.bbc.com/news/articles/ce8lzd4v7zdo
1•tartoran•3m ago•0 comments

Show HN: Topical.so - structural SEO audits for AI-generated blogs

1•adriaanb•4m ago•0 comments

A life insurance fraud ring built on fake restaurants

https://connordempsey.substack.com/p/how-to-commit-insurance-fraud
1•cdempsey44•4m ago•0 comments

The Self-Cancelling Subscription

https://predr.ag/blog/the-self-cancelling-subscription/
1•birdculture•7m ago•0 comments

Peaky Peek – Local-first debugger for AI agents

https://github.com/acailic/agent_debugger
1•ilkehimself•8m ago•0 comments

Andon (Manufacturing)

https://en.wikipedia.org/wiki/Andon_(manufacturing)
1•debo_•10m ago•0 comments

You can use AI every day and still not get better

https://www.kevinlondon.com/2026/03/12/ai-every-day-and-not-get-better/
2•Kaedon•10m ago•0 comments

Congressional scrutiny of Kalshi, Polymarket explodes

https://www.politico.com/news/2026/04/01/congress-kalshi-polymarket-regulation-00852370
1•1vuio0pswjnm7•11m ago•0 comments

Artemis computer running two instances of MS outlook; they can't figure out why

https://bsky.app/profile/nikigrayson.com/post/3miik2wzosk25
5•mooreds•11m ago•2 comments

As arms agreements fray, China expands its nuclear weapons infra

https://www.cnn.com/2026/04/01/china/investigates-china-secretly-expanding-nuclear-weapons-infras...
1•cwwc•12m ago•0 comments

WebKit Features for Safari 26.4

https://webkit.org/blog/17862/webkit-features-for-safari-26-4/
2•ksec•13m ago•1 comments

Artemis II will use laser beams to live-stream 4K moon footage at 260 Mbps

https://www.tomshardware.com/networking/artemis-ii-will-use-laser-beams-to-live-stream-4k-moon-fo...
4•speckx•15m ago•0 comments

In a thunderous launch, Artemis II astronauts leave Earth. Here's what's next

https://text.npr.org/nx-s1-5770599
1•mooreds•16m ago•0 comments

Delve allegedly forked an open-source tool and sold it as its own

https://techcrunch.com/2026/04/01/the-reputation-of-troubled-yc-startup-delve-has-gotten-even-worse/
5•nickvec•16m ago•0 comments

There Is No Standard EM Role

https://leadership.garden/there-is-no-standard-em-role/
2•speckx•18m ago•0 comments

Best Enterprise Claude Code Gateway

https://www.npmjs.com/package/@maximhq/bifrost
1•aanthonymax•21m ago•0 comments

Node.js can host a new language. Interpreter is the easiest thing

https://github.com/dominexmacedon-dev/starlight-cli-script
1•dominexmacedon•21m ago•0 comments

Startup funding shatters all records in Q1

https://techcrunch.com/2026/04/01/startup-funding-shatters-all-records-in-q1/
1•Brajeshwar•22m ago•1 comments

Japanese X is now America's favorite corner of the internet

https://www.japantimes.co.jp/commentary/2026/04/01/japan/japanese-x-now-americas-favorite/
2•mikhael•23m ago•0 comments

Rare Apple Prototypes for iPod, iPhone, Watch [video]

https://www.youtube.com/watch?v=74qPQt_5DdM
1•dzonga•23m ago•0 comments

The Beep at Meta

https://k2xl.substack.com/p/the-beep-at-meta
3•k2xl•24m ago•0 comments

Stand-Alone Complex or Vibercrime? Exploring GenAI in Cybercrime Ecosystems

https://arxiv.org/abs/2603.29545
2•susan_segfault•25m ago•0 comments

Goodbye, Apple Photos

https://sethw.xyz/blog/2024/03/29/goodbye-apple-photos/
1•speckx•27m ago•0 comments

Ask HN: What percentage of HN is simply promotional content?

1•general_reveal•27m ago•2 comments

BIGA-Bank-of-Infinity-Generating-Automata

https://github.com/Ashioya-ui/BIGA-Bank-of-Infinity-Generating-Automata
1•pb_lightmind•28m ago•0 comments