Ask HN: What Speaker Diarization tools should I look into?

11•justforfunhere•6mo ago
Hi,

I am making a tool that needs to analyze a conversation (non-English) between two people. The conversation is provided to me in audio format. I am currently using OpenAI Whisper to transcribe it, and I feed the transcription to the ChatGPT-4o model through the API for analysis.

So far, it's doing a fair job. Sometimes, though, reading the transcription, I find it hard to figure out which speaker is saying what. I have to listen to the audio to figure it out. I am wondering if ChatGPT-4o would also sometimes find it hard to follow the conversation from the transcription. I think that adding a speaker diarization step might make the transcription easier to understand and analyze.

I am looking for Speaker Diarization tools that I can use. I have tried using pyannote speaker-diarization-3.1, but I find it does not work very well. What are some other options that I can look at?

Comments

nemima•6mo ago
Hi, I'm an engineer at Speechmatics. Our speech-to-text software handles speaker diarization very reliably, and we're a go-to choice for non-English languages. https://www.speechmatics.com/

How long is the audio file? If it's under 2 hours, you can upload the file and transcribe it with diarization for free using our web portal: https://portal.speechmatics.com/jobs/create/batch

Hope it helps for your use case! If it does, and you encounter any issues, drop us an email at devrel@speechmatics.com :)

EDIT: typo

justforfunhere•6mo ago
Hi, yes, it is well under two hours. The longest audio that I have had to handle as of now is around 10 minutes.

I will give your portal a try soon. Thanks

hildekominskia•6mo ago
Skip pyannote 3.1; two battle-tested upgrades:

1. NVIDIA NeMo’s `diar_msdd_telephonic` (8 kHz) or `diar_msdd_mic` (16 kHz): one-line Python install, GPU optional, beats pyannote on cross-talk.
2. AssemblyAI’s async `/v2/transcript` endpoint: gives you `words[].speaker` + Whisper-level accuracy for 40+ languages. Free tier: 3 h / month.

Glue either to your existing Whisper pipeline and feed ChatGPT-4o with speaker-tagged text. The jump in clarity is night-and-day.
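For anyone wondering what the "glue" step could look like, here is a minimal sketch, not tied to any particular library. It assumes you already have Whisper-style transcript segments and diarization turns, both with start/end times in seconds. The `Segment` and `Turn` classes, the function names, and the sample data are all hypothetical. Each transcript segment gets the speaker whose turns overlap it the most in time.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One transcript segment (hypothetical shape: times in seconds)."""
    start: float
    end: float
    text: str


@dataclass
class Turn:
    """One diarization turn (hypothetical shape: times in seconds)."""
    start: float
    end: float
    speaker: str


def overlap(a_start, a_end, b_start, b_end):
    """Length in seconds of the intersection of two intervals (0 if disjoint)."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def tag_segments(segments, turns, unknown="UNKNOWN"):
    """Label each segment with the speaker that overlaps it the longest."""
    tagged = []
    for seg in segments:
        totals = {}
        for turn in turns:
            shared = overlap(seg.start, seg.end, turn.start, turn.end)
            if shared > 0:
                totals[turn.speaker] = totals.get(turn.speaker, 0.0) + shared
        # Fall back to a placeholder when no diarization turn covers the segment.
        speaker = max(totals, key=totals.get) if totals else unknown
        tagged.append((speaker, seg.text))
    return tagged


def render(tagged):
    """Format the tagged segments as speaker-prefixed lines for an LLM prompt."""
    return "\n".join(f"{speaker}: {text}" for speaker, text in tagged)


# Made-up sample data, for illustration only.
segments = [
    Segment(0.0, 2.5, "Hello, how are you?"),
    Segment(2.6, 5.0, "Fine, thanks. And you?"),
    Segment(5.1, 6.0, "Good."),
]
turns = [
    Turn(0.0, 2.4, "SPEAKER_00"),
    Turn(2.4, 5.2, "SPEAKER_01"),
    Turn(5.2, 6.0, "SPEAKER_00"),
]
transcript = render(tag_segments(segments, turns))
print(transcript)
```

Segment-level matching is the simplest option, but it mislabels a segment in which both people speak. If your transcriber gives word-level timestamps, running the same overlap logic per word should handle cross-talk better.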

I use the same combo to auto-caption interviews, then drop the synced footage into Veo 3 (https://veo-3.app) for instant talking-head explainers—works even for non-English audio.

hbredin•6mo ago
Hey, I am the creator of the pyannote open-source toolkit.

I just created a company around it that serves much better diarization models through an API.

You can test it by creating an account on https://dashboard.pyannote.ai. You'll get 150h of diarization for free.

There is also a playground where you can simply upload a file and visualize the diarization results.

satvikpendem•6mo ago
Seems like this only diarizes; is there a transcription interface as well? The prices are a bit high for diarization alone, as something like Soniox is also ~13 cents for real-time diarization with transcription included.

satvikpendem•6mo ago
Google Gemini and ElevenLabs are quite good at transcription with diarization if you already have the audio file. For real-time, I like Soniox; you can use their comparison page, which runs all the major transcription services at once [0]. Note that their Google model is not Gemini; it's their older Chirp model.

[0] https://soniox.com/compare/

vismit2000•6mo ago
ElevenLabs does speaker diarization really well in my experience: https://elevenlabs.io/ (I first came to know about this from the Lex-Modi podcast)

meerab•6mo ago
I am building VideoToBe.com - I have found that whisperX works the most reliably.

https://github.com/m-bain/whisperX

It is built on top of OpenAI Whisper, so speech recognition is good, and the transcript gives speaker tags such as 'SPEAKER_00' and 'SPEAKER_01'.

Here is what the transcript may look like:

https://videotobe.com/play/media/1b02f75a-9503-43aa-8956-d18...
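
Once segments carry tags like 'SPEAKER_00', it can help to merge consecutive segments from the same speaker into turns before handing the text to an LLM. Below is a minimal sketch of that step. It assumes each segment is a dict with `speaker` and `text` keys; that shape, the function names, and the sample data are assumptions for illustration, so check what your tool actually outputs.

```python
from itertools import groupby


def merge_turns(segments, unknown="UNKNOWN"):
    """Collapse runs of consecutive same-speaker segments into single turns."""
    turns = []
    # groupby only groups adjacent items, which is what we want here:
    # a speaker who talks again later starts a new turn.
    for speaker, run in groupby(segments, key=lambda s: s.get("speaker", unknown)):
        text = " ".join(s["text"].strip() for s in run)
        turns.append((speaker, text))
    return turns


def format_turns(turns):
    """Render turns as one speaker-prefixed line each."""
    return "\n".join(f"{speaker}: {text}" for speaker, text in turns)


# Made-up sample data in the assumed shape.
sample = [
    {"speaker": "SPEAKER_00", "text": " Hi."},
    {"speaker": "SPEAKER_00", "text": " How are you?"},
    {"speaker": "SPEAKER_01", "text": " Fine, thanks."},
    {"text": " (inaudible)"},  # a segment with no speaker assigned
    {"speaker": "SPEAKER_00", "text": " Great."},
]
merged = merge_turns(sample)
print(format_turns(merged))
```

Segments without a speaker tag are kept under a placeholder label instead of being dropped, so nothing from the transcript goes missing.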