We initially tried sherpa-onnx. It works, but running the diarization and transcription models together slowed down older devices, and CPU-only inference isn't ideal for near real-time workloads, so we wanted the option to offload segmentation and speaker embedding to the GPU or ANE. Supporting M1 Macs in particular meant pushing more of the workload onto the ANE.
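For context, CoreML lets you express that preference when loading a model. A minimal sketch (the `segmentation.mlpackage` filename is just a placeholder for whichever converted model you load):

```python
import coremltools as ct

# Ask CoreML to schedule ops on the Neural Engine where possible,
# falling back to the CPU for any unsupported layers.
mlmodel = ct.models.MLModel(
    "segmentation.mlpackage",  # placeholder path to a converted model
    compute_units=ct.ComputeUnit.CPU_AND_NE,
)
```

`ct.ComputeUnit.ALL` would also allow the GPU; CoreML decides the actual placement per-op at runtime.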
Instead of shoehorning the ONNX models into CoreML through C++, we converted the original PyTorch models directly to CoreML. That required some monkey-patching of the PyTorch and pyannote code, but the initial benchmarks look promising.
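For the curious, the conversion path looks roughly like this. It's a minimal sketch that assumes the pyannote segmentation checkpoint traces cleanly, which is exactly where the monkey-patching comes in; the checkpoint name and I/O shapes are illustrative:

```python
import torch
import coremltools as ct
from pyannote.audio import Model

# Load a pretrained segmentation model (checkpoint name is illustrative;
# pyannote checkpoints are gated and need a HuggingFace token).
model = Model.from_pretrained("pyannote/segmentation-3.0")
model.eval()

# Trace with a dummy 10 s mono waveform at 16 kHz: (batch, channel, samples)
example = torch.randn(1, 1, 160_000)
traced = torch.jit.trace(model, example)

# Convert to an ML Program, letting CoreML target the ANE with CPU fallback.
mlmodel = ct.convert(
    traced,
    inputs=[ct.TensorType(name="waveform", shape=example.shape)],
    convert_to="mlprogram",
    compute_units=ct.ComputeUnit.CPU_AND_NE,
)
mlmodel.save("segmentation.mlpackage")
```

The speaker embedding model goes through the same trace-and-convert path.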
We’d love feedback! We're currently working on adding VAD and integrating Parakeet for transcription, but we're still wrestling with the CoreML model conversion.