frontpage.
newsnewestaskshowjobs

Made with ♥ by @iamnishanth

Open Source @Github

fp.

Hoot: Scheme on WebAssembly

https://www.spritely.institute/hoot/
1•AlexeyBrin•42s ago•0 comments

What the longevity experts don't tell you

https://machielreyneke.com/blog/longevity-lessons/
1•machielrey•1m ago•0 comments

Monzo wrongly denied refunds to fraud and scam victims

https://www.theguardian.com/money/2026/feb/07/monzo-natwest-hsbc-refunds-fraud-scam-fos-ombudsman
2•tablets•6m ago•0 comments

They were drawn to Korea with dreams of K-pop stardom – but then let down

https://www.bbc.com/news/articles/cvgnq9rwyqno
2•breve•8m ago•0 comments

Show HN: AI-Powered Merchant Intelligence

https://nodee.co
1•jjkirsch•11m ago•0 comments

Bash parallel tasks and error handling

https://github.com/themattrix/bash-concurrent
2•pastage•11m ago•0 comments

Let's compile Quake like it's 1997

https://fabiensanglard.net/compile_like_1997/index.html
1•billiob•12m ago•0 comments

Reverse Engineering Medium.com's Editor: How Copy, Paste, and Images Work

https://app.writtte.com/read/gP0H6W5
2•birdculture•17m ago•0 comments

Go 1.22, SQLite, and Next.js: The "Boring" Back End

https://mohammedeabdelaziz.github.io/articles/go-next-pt-2
1•mohammede•23m ago•0 comments

Laibach the Whistleblowers [video]

https://www.youtube.com/watch?v=c6Mx2mxpaCY
1•KnuthIsGod•24m ago•1 comments

Slop News - HN front page right now hallucinated as 100% AI SLOP

https://slop-news.pages.dev/slop-news
1•keepamovin•29m ago•1 comments

Economists vs. Technologists on AI

https://ideasindevelopment.substack.com/p/economists-vs-technologists-on-ai
1•econlmics•31m ago•0 comments

Life at the Edge

https://asadk.com/p/edge
3•tosh•37m ago•0 comments

RISC-V Vector Primer

https://github.com/simplex-micro/riscv-vector-primer/blob/main/index.md
4•oxxoxoxooo•40m ago•1 comments

Show HN: Invoxo – Invoicing with automatic EU VAT for cross-border services

2•InvoxoEU•41m ago•0 comments

A Tale of Two Standards, POSIX and Win32 (2005)

https://www.samba.org/samba/news/articles/low_point/tale_two_stds_os2.html
2•goranmoomin•45m ago•0 comments

Ask HN: Is the Downfall of SaaS Started?

3•throwaw12•46m ago•0 comments

Flirt: The Native Backend

https://blog.buenzli.dev/flirt-native-backend/
2•senekor•47m ago•0 comments

OpenAI's Latest Platform Targets Enterprise Customers

https://aibusiness.com/agentic-ai/openai-s-latest-platform-targets-enterprise-customers
1•myk-e•50m ago•0 comments

Goldman Sachs taps Anthropic's Claude to automate accounting, compliance roles

https://www.cnbc.com/2026/02/06/anthropic-goldman-sachs-ai-model-accounting.html
3•myk-e•52m ago•5 comments

Ai.com bought by Crypto.com founder for $70M in biggest-ever website name deal

https://www.ft.com/content/83488628-8dfd-4060-a7b0-71b1bb012785
1•1vuio0pswjnm7•53m ago•1 comments

Big Tech's AI Push Is Costing More Than the Moon Landing

https://www.wsj.com/tech/ai/ai-spending-tech-companies-compared-02b90046
4•1vuio0pswjnm7•55m ago•0 comments

The AI boom is causing shortages everywhere else

https://www.washingtonpost.com/technology/2026/02/07/ai-spending-economy-shortages/
2•1vuio0pswjnm7•57m ago•0 comments

Suno, AI Music, and the Bad Future [video]

https://www.youtube.com/watch?v=U8dcFhF0Dlk
1•askl•59m ago•2 comments

Ask HN: How are researchers using AlphaFold in 2026?

1•jocho12•1h ago•0 comments

Running the "Reflections on Trusting Trust" Compiler

https://spawn-queue.acm.org/doi/10.1145/3786614
1•devooops•1h ago•0 comments

Watermark API – $0.01/image, 10x cheaper than Cloudinary

https://api-production-caa8.up.railway.app/docs
1•lembergs•1h ago•1 comments

Now send your marketing campaigns directly from ChatGPT

https://www.mail-o-mail.com/
1•avallark•1h ago•1 comments

Queueing Theory v2: DORA metrics, queue-of-queues, chi-alpha-beta-sigma notation

https://github.com/joelparkerhenderson/queueing-theory
1•jph•1h ago•0 comments

Show HN: Hibana – choreography-first protocol safety for Rust

https://hibanaworks.dev/
5•o8vm•1h ago•1 comments
Open in hackernews

Ask HN: What Speaker Diarization tools should I look into?

11•justforfunhere•6mo ago
Hi,

I am making a tool that needs to analyze a conversation (non-English) between two people. The conversation is provided to me in audio format. I am currently using OpenAI Whisper to transcribe and feed the transcription to ChatGPT-4o model through the API for analysis.

So far, it's doing a fair job. Sometimes, though, reading the transcription, I find it hard to figure out which speaker is speaking what. I have to listen to the audio to figure it out. I am wondering if ChatGPT-4o would also sometimes find it hard to follow the conversation from the transcription. I think that adding a speaker diarization step might make the transcription easier to understand and analyze.

I am looking for Speaker Diarization tools that I can use. I have tried using pyannote speaker-diarization-3.1, but I find it does not work very well. What are some other options that I can look at?

Comments

nemima•6mo ago
Hi, I'm an engineer at Speechmatics. Our speech-to-text software handles speaker diarization very reliably, and we're a go-to choice for non-English languages. https://www.speechmatics.com/

How long is the audio file? If it's under 2 hours, you can upload the file and transcribe it with diarization for free using our web portal: https://portal.speechmatics.com/jobs/create/batch

Hope it helps for your use case! If it does, and you encounter any issues, drop us an email at devrel@speechmatics.com :)

EDIT: typo

justforfunhere•6mo ago
Hi, yes, it is well under two hours. The longest audio that I have had to handle as of now is around 10 minutes.

I will give your portal a try soon. Thanks

hildekominskia•6mo ago
Skip pyannote 3.1; two battle-tested upgrades:

1. NVIDIA NeMo’s `diar_msdd_telephonic` (8 kHz) or `diar_msdd_mic` (16 kHz) — one-line Python install, GPU optional, beats pyannote on cross-talk. 2. AssemblyAI’s async `/v2/transcript` endpoint — gives you `words[].speaker` + Whisper-level accuracy for 40+ languages. Free tier: 3 h / month.

Glue either to your existing Whisper pipeline and feed ChatGPT-4o with speaker-tagged text. The jump in clarity is night-and-day.

I use the same combo to auto-caption interviews, then drop the synced footage into Veo 3 (https://veo-3.app) for instant talking-head explainers—works even for non-English audio.

hbredin•6mo ago
Hey, I am the creator of pyannote open-source toolkit.

I just created a company around it that serves much better diarization models through an API.

You can test it by creating an account on https://dashboard.pyannote.ai. You'll get 150h of diarization for free.

There is also a playground where you can simply upload a file and visualize the diarization results.

satvikpendem•6mo ago
Seems like this only diarizes, is there a transcription interface as well? The prices are a bit high for only diarization as something like Soniox is also ~13 cents for real-time diarization with transcription included.
satvikpendem•6mo ago
Google Gemini and ElevenLabs are quite good at transcription with diarization if you already have the audiofile. For real-time, I like Soniox, you can use their comparison page that runs all the major transcription services at once [0]. Note that their Google model is not Gemini, it's their older Chirp model.

[0] https://soniox.com/compare/

vismit2000•6mo ago
Elevenlabs does speaker diarization really well in my experience: https://elevenlabs.io/ (First came to know about this from Lex-Modi podcast)
meerab•6mo ago
I am building VideoToBe.com - I have found that whisperX works the most reliable.

https://github.com/m-bain/whisperX

It is built on top of OpenAI Whisper, so speech recognition is good, the transcript gives speaker tags as 'SPEAKER_00' and 'SPEAKER_01' etc.

Here is how the transcript may look like

https://videotobe.com/play/media/1b02f75a-9503-43aa-8956-d18...