Show HN: Python Audio Transcription: Convert Speech to Text Locally

https://www.pavlinbg.com/posts/python-speech-to-text-guide
9•Pavlinbg•1h ago

Comments

drewbuschhorn•43m ago
You should throw in some diarization. There are some pretty effective Python libraries that don't need pretraining for voice separation.
Pavlinbg•38m ago
Nice suggestion, I'll look them up.
oidar•43m ago
What's the best solution right now for speech-to-text (STT) that supports speaker diarisation?
makaimc•35m ago
AssemblyAI (YC S17) currently stands out in the WER and accuracy benchmarks (https://www.assemblyai.com/benchmarks). Its models are accessed through a web API rather than hosted locally, though, and speaker diarization is enabled through a parameter in the API call (https://www.assemblyai.com/docs/speech-to-text/pre-recorded-...).
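To make the "parameter in the API call" concrete, here is a minimal sketch using only the standard library. It assumes the v2 REST endpoint and the `speaker_labels` field; the helper names, the placeholder key, and the sample response dict are illustrative. It only builds the request and does not send it, so check the linked docs before relying on it.

```python
import json
import urllib.request

# Assumed endpoint for submitting a pre-recorded transcription job.
API_URL = "https://api.assemblyai.com/v2/transcript"


def build_diarization_request(audio_url: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a transcription request with diarization enabled."""
    # speaker_labels=True is the flag that asks for per-speaker utterances.
    payload = {"audio_url": audio_url, "speaker_labels": True}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"authorization": api_key, "content-type": "application/json"},
        method="POST",
    )


def format_utterances(transcript: dict) -> list[str]:
    """Turn the 'utterances' list of a finished transcript into readable lines."""
    utterances = transcript.get("utterances") or []
    return [f"Speaker {u['speaker']}: {u['text']}" for u in utterances]


if __name__ == "__main__":
    req = build_diarization_request("https://example.com/meeting.mp3", "YOUR_API_KEY")
    print(req.full_url, req.data.decode("utf-8"))
    # Hand-written sample in the shape of a completed transcript, not real output.
    sample = {"utterances": [{"speaker": "A", "text": "Hello."},
                             {"speaker": "B", "text": "Hi there."}]}
    print("\n".join(format_utterances(sample)))
```

Sending the request with `urllib.request.urlopen(req)` returns a job that has to be polled until it completes; that part is left out here.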
xnx•11m ago
I like this version of Whisper which has diarization built in: https://github.com/Purfview/whisper-standalone-win
999900000999•43m ago
Fantastic project.

I have an old project that relies on AWS transcription and I'd love to migrate it to something local.

vunderba•35m ago
Nice job. A while back I made a similar Python script available as a GitHub gist [1]. Given an audio file, it does the following:

- Converts to 16kHz WAV

- Transcribes using native ggerganov whisper

- Calls out to a local LLM to clean the text

- Prints out the final cleaned up transcription

I found that accuracy increased significantly when I added the LLM post-processor, even with modestly sized 12-14B models.

I've been using it with great success to convert very old dictated memos from over a decade ago, despite a lot of background noise (wind, traffic, etc.).

[1] https://gist.github.com/scpedicini/455409fe7656d3cca8959c123...
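For anyone who wants the shape of that pipeline without opening the gist, here is a minimal sketch. It is not the gist's code. It assumes `ffmpeg` and a whisper.cpp build exposing a `whisper-cli` binary are on the PATH, the model path is a placeholder, and `clean` is a hypothetical callable standing in for whatever local LLM client you use.

```python
import subprocess
from pathlib import Path
from typing import Callable


def ffmpeg_to_wav_cmd(src: str, dst: str) -> list[str]:
    """ffmpeg arguments to convert any input to 16 kHz mono 16-bit PCM WAV."""
    return ["ffmpeg", "-y", "-i", src,
            "-ar", "16000", "-ac", "1", "-c:a", "pcm_s16le", dst]


def whisper_cmd(wav: str, model: str, binary: str = "whisper-cli") -> list[str]:
    """whisper.cpp arguments; -otxt writes the transcript to '<wav>.txt'."""
    # The binary name is an assumption: older whisper.cpp builds call it 'main'.
    return [binary, "-m", model, "-f", wav, "-nt", "-otxt"]


def cleanup_prompt(raw: str) -> str:
    """Prompt asking an LLM to tidy a raw transcript without changing content."""
    return (
        "The following is a raw speech-to-text transcript. Fix punctuation, "
        "casing, and obvious mis-hearings without adding or removing content.\n\n"
        + raw
    )


def transcribe(src: str, model: str, clean: Callable[[str], str]) -> str:
    """Run convert -> transcribe -> LLM cleanup and return the cleaned text."""
    wav = str(Path(src).with_name(Path(src).stem + "_16k.wav"))
    subprocess.run(ffmpeg_to_wav_cmd(src, wav), check=True)
    subprocess.run(whisper_cmd(wav, model), check=True)
    raw = Path(wav + ".txt").read_text(encoding="utf-8")
    # 'clean' is whatever sends a prompt to your local LLM and returns its reply.
    return clean(cleanup_prompt(raw))


if __name__ == "__main__":
    # Identity 'clean' just returns the prompt; swap in a real LLM call.
    print(transcribe("memo.mp3", "models/ggml-base.en.bin", clean=lambda p: p))
```

Keeping the command builders separate from `transcribe` makes it easy to print and inspect the exact commands before running anything.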

xnx•9m ago
This tool requires ffmpeg, but don't forget that the latest version of ffmpeg has speech-to-text built in!

I'm sure there are use cases where using Whisper directly is better, but it's a great addition to an already versatile tool.
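A rough sketch of what that looks like, assuming an ffmpeg 8.0+ build compiled with whisper support and its `whisper` audio filter. The option names (`model`, `language`, `queue`, `destination`, `format`) are taken from my reading of the filter documentation, and the file paths are placeholders, so verify against `ffmpeg -h filter=whisper` on your build.

```python
import subprocess


def ffmpeg_whisper_cmd(src: str, model: str, out_srt: str,
                       language: str = "en") -> list[str]:
    """ffmpeg arguments that run the whisper audio filter and write an SRT file."""
    # Paths containing ':' or ',' would need escaping inside the filter string.
    audio_filter = (
        f"whisper=model={model}:language={language}"
        f":queue=3:destination={out_srt}:format=srt"
    )
    # -vn drops video; '-f null -' discards the media output, since the
    # transcript is written by the filter itself.
    return ["ffmpeg", "-i", src, "-vn", "-af", audio_filter, "-f", "null", "-"]


if __name__ == "__main__":
    cmd = ffmpeg_whisper_cmd("input.mp4", "ggml-base.en.bin", "output.srt")
    print(" ".join(cmd))
    subprocess.run(cmd, check=True)
```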
