Sumi – Open-source voice-to-text with local AI polishing

2•alkd•2h ago
I'm based in Taiwan and run 3-4 Claude Code agents in parallel most of the day. Typing instructions to all of them was the actual bottleneck, so I built a voice-to-text tool that runs both STT and LLM polish locally.

Architecture: two-stage pipeline. Stage 1 is speech recognition via Whisper (whisper-rs, 7 model variants, DTW timestamps) or Qwen3-ASR. I quantized the Qwen3-ASR model myself and wrote the inference pipeline in pure Rust. It handles accented speech and dialects better than Whisper in my testing, likely because of broader training data. Silero VAD pre-filters audio before either engine runs.

Stage 2 is text polish via candle (HuggingFace's Rust ML framework). Available models: Phi 4 Mini (2.5 GB), Ministral 3B/14B, Qwen 3 4B/8B. All Q4_K_M GGUF. Metal on macOS, CUDA on Windows.
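To make the flow above concrete, here is a minimal sketch of a VAD-gated two-stage pipeline. It is not Sumi's code: `run_pipeline`, `energy_vad`, `fake_stt`, and `fake_polish` are hypothetical names, and the energy threshold is a toy stand-in for Silero VAD. The real engines (Whisper or Qwen3-ASR, then a candle-hosted LLM) would replace the two fake stages.

```python
from typing import Callable, List, Sequence

Frame = Sequence[float]  # one non-empty chunk of audio samples


def run_pipeline(
    frames: List[Frame],
    is_speech: Callable[[Frame], bool],
    transcribe: Callable[[List[Frame]], str],
    polish: Callable[[str], str],
) -> str:
    """Gate frames with a VAD, transcribe what remains, then polish the text."""
    voiced = [f for f in frames if is_speech(f)]
    if not voiced:
        return ""  # only silence: skip both model calls
    raw = transcribe(voiced)
    return polish(raw)


# Toy stand-ins so the sketch runs without any model (all hypothetical).
def energy_vad(frame: Frame) -> bool:
    # Mean squared amplitude above a fixed threshold counts as speech.
    return sum(x * x for x in frame) / len(frame) > 0.01


def fake_stt(voiced: List[Frame]) -> str:
    return f"um {len(voiced)} voiced frames"


def fake_polish(text: str) -> str:
    # Drop one filler word and add terminal punctuation.
    return text.replace("um ", "", 1) + "."


frames = [[0.0, 0.0], [0.5, -0.5], [0.001, 0.0], [0.3, 0.2]]
result = run_pipeline(frames, energy_vad, fake_stt, fake_polish)
```

Passing the stages in as callables mirrors the idea that either STT engine and any of the polish models can be swapped without touching the gating logic.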

The polish step does context detection: reads the active app and URL (NSWorkspace + osascript on Mac, GetForegroundWindow on Windows) and selects a prompt accordingly. You can define custom rules keyed on app name, bundle ID, or URL regex.
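A rough sketch of how rule selection like this could work, assuming first-match-wins ordering and a fallback prompt. The `Rule` fields, `select_prompt`, the example bundle IDs, and the prompts are all made up for illustration and are not taken from Sumi's source.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Rule:
    """One hypothetical per-app rule; every field name is illustrative."""
    prompt: str
    app_name: Optional[str] = None   # exact app name, e.g. "Slack"
    bundle_id: Optional[str] = None  # exact bundle ID
    url_regex: Optional[str] = None  # searched in the active URL

    def matches(self, app_name: str, bundle_id: str, url: Optional[str]) -> bool:
        # A rule matches when every key it sets agrees with the context.
        if self.app_name is not None and self.app_name != app_name:
            return False
        if self.bundle_id is not None and self.bundle_id != bundle_id:
            return False
        if self.url_regex is not None:
            if url is None or re.search(self.url_regex, url) is None:
                return False
        return True


DEFAULT_PROMPT = "Clean up the transcript without changing its meaning."


def select_prompt(rules: List[Rule], app_name: str, bundle_id: str,
                  url: Optional[str] = None) -> str:
    """Return the prompt of the first matching rule, else the default."""
    for rule in rules:
        if rule.matches(app_name, bundle_id, url):
            return rule.prompt
    return DEFAULT_PROMPT


rules = [
    Rule(prompt="Format as a pull request comment.",
         url_regex=r"^https://github\.com/.+/pull/\d+"),
    Rule(prompt="Write a terse instruction for a coding agent.",
         bundle_id="com.example.terminal"),
    Rule(prompt="Keep it casual.", app_name="Slack"),
]
```

Putting the most specific rules first keeps the behavior predictable; a rule with no keys set would match everything, so it only makes sense as the last entry.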

Other things:
- Meeting mode: background transcription to SQLite. Start before a call, stop when done.
- Edit by Voice: select text, speak an instruction ("translate to English", "make this shorter"), LLM rewrites in place
- Two local STT engines with 100+ languages, automatic code-switching
- Optional BYOK cloud: STT via Groq/OpenAI/Deepgram/Azure, polish via OpenRouter/Groq/Gemini/SambaNova
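For a sense of what meeting-mode storage might look like, here is a small sketch that appends transcript segments to SQLite as they arrive. The `segments` table, its columns, and the function names are assumptions for illustration, not Sumi's actual schema.

```python
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a transcript store; the schema is hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS segments (
               id INTEGER PRIMARY KEY,
               meeting_id TEXT NOT NULL,
               started_at REAL NOT NULL,  -- seconds since the meeting began
               text TEXT NOT NULL
           )"""
    )
    return conn


def append_segment(conn, meeting_id: str, started_at: float, text: str) -> None:
    """Persist one transcribed segment as soon as it is available."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO segments (meeting_id, started_at, text) VALUES (?, ?, ?)",
            (meeting_id, started_at, text),
        )


def full_transcript(conn, meeting_id: str) -> str:
    """Join a meeting's segments in chronological order."""
    rows = conn.execute(
        "SELECT text FROM segments WHERE meeting_id = ? ORDER BY started_at",
        (meeting_id,),
    )
    return " ".join(text for (text,) in rows)


conn = open_store()
# Segments may be written out of order; the query sorts them by start time.
append_segment(conn, "standup", 4.2, "Second item is the release.")
append_segment(conn, "standup", 0.0, "Let's get started.")
transcript = full_transcript(conn, "standup")
```

Writing each segment in its own transaction means a crash mid-call loses at most the segment in flight, which suits a recorder that runs in the background for the length of a meeting.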

I built this because the existing tools (Wispr Flow, SuperWhisper) are cloud-only for AI processing and subscription-based. I wanted local inference for both stages, custom prompt rules per app, and source code I could actually read.

Rust, GPLv3.

Website: https://sumivoice.com/en/?utm_source=hackernews&utm_medium=forum&utm_campaign=launch_2026q1&utm_content=show_hn

Source: https://github.com/alan890104/sumi

Linux kernel proposal to drop IPv6 as a module

https://lore.kernel.org/lkml/20260309022013.5199-1-fmancera@suse.de/
1•kyshovkk•55s ago•0 comments

InfluxDB

https://www.influxdata.com/lp/influxdb-database/
1•handfuloflight•2m ago•0 comments

Human brain cells on a chip learned to play Doom in a week

https://www.newscientist.com/article/2517389-human-brain-cells-on-a-chip-learned-to-play-doom-in-...
1•cobbzilla•3m ago•0 comments

Show HN: Wiggly border generator (somewhat responsive)

https://marcusmichaels.com/wiggly-border-generator/
1•marcusmichaels•4m ago•0 comments

TCS, Google Cloud Launch Gemini Experience Centre for Manufacturing AI

https://menafn.com/1110835624/TCS-Google-Cloud-Launch-Gemini-Experience-Centre-For-Manufacturing-AI
1•01-_-•4m ago•0 comments

America and Public Disorder

https://walkingtheworld.substack.com/p/america-and-public-disorder
1•jackyli02•6m ago•0 comments

OpenBrushograph – A Brush Painting Robot

https://github.com/openBrushograph/openBrushograph_hardware
1•Klaster_1•7m ago•0 comments

LA Cotidianidad

1•jgalera•10m ago•0 comments

SlowQL – stop bad SQL before it reaches production

https://github.com/makroumi/slowql
1•makroumi•14m ago•1 comment

Cursor-goes-to-war-for-AI-coding-dominance

https://www.forbes.com/sites/annatong/2026/03/05/cursor-goes-to-war-for-ai-coding-dominance/
1•royka118•14m ago•0 comments

Ask HN: Seeking a Lobste.rs Invitation

1•eouzoe•15m ago•0 comments

Using AI Agents in Software Development 2026 [audio]

https://overcommitted.dev/using-ai-agents-in-software-development-2026-current-uses-and-future-po...
1•mooreds•17m ago•0 comments

The Eye of the Mathematician

https://aeon.co/essays/how-should-we-define-mathematical-beauty-in-the-ai-age
2•rifish•17m ago•0 comments

Making art with CSS gradients and corner-shape and skew

https://cassidoo.co/post/css-wavy-art/
2•mooreds•17m ago•0 comments

Author of the Cicada 3301 Mystery

https://wondrousnet.blogspot.com/2022/11/cicada-3301-solution.html
1•morethenthis•19m ago•0 comments

Frequently-used FFmpeg recipes (2025)

https://henry.codes/writing/frequently-used-ffmpeg-recipes/
2•mooreds•20m ago•0 comments

Show HN: Real Browser MCP – your AI agent can see your real browser

1•ofershapira•21m ago•0 comments

Actuarial Warfare: How Seven Insurance Letters Closed the Strait of Hormuz

https://shanakaanslemperera.substack.com/p/actuarial-warfare-how-seven-insurance
2•throw0101c•22m ago•0 comments

Blue Origin Starts 800,000sqft 'Project Horizon' Expansion Process

https://talkoftitusville.com/2026/03/05/blue-origin-starts-800000sqft-project-horizon-expansion-p...
1•bookmtn•22m ago•0 comments

My distance from web development prepared me for the age of AI agents

https://write.as/xbetpmu6t4u4p
2•_spyro_•26m ago•2 comments

Agentcontainer: A standard way to declare agent containers for your projects

https://github.com/jpmelos/agentcontainer
1•jpmelos•27m ago•0 comments

Show HN: How many working days in 2026? And your income in pizzas

https://gettti.me/tools/working-days
1•v_b•33m ago•0 comments

Show HN: Free market intelligence tool, analyze HN, find users pain points

https://whatstechin.com/
1•losalah•33m ago•0 comments

How I Use Claude Code as a Designer at Shopify [video]

https://www.youtube.com/watch?v=aVDAhJ3PtLg
1•benr•35m ago•1 comment

Show HN: Engram – open-source persistent memory for AI agents (Bun and SQLite)

https://github.com/zanfiel/engram
1•zanfiel•36m ago•1 comment

The Complete Guide to Building Skills for Claude [pdf]

https://resources.anthropic.com/hubfs/The-Complete-Guide-to-Building-Skill-for-Claude.pdf
2•Terretta•37m ago•0 comments

Agentic coding doesn't = technical debt

https://inmydata.ai/blog/agentic-coding-discipline/
1•nfinch•38m ago•0 comments

A 10% traffic spike took down a stable system in 3 minutes and 47 seconds

https://www.orchenginex.com/publications/queue-collapse-traffic-spike
1•Mlondy•40m ago•0 comments

Show HN: This is what I Want from the Internet

https://jetzt.cx/about
1•krickelkrackel•44m ago•1 comment

Nvidia backs AI data center startup Nscale as it hits $14.6B valuation

https://www.cnbc.com/2026/03/09/nscale-ai-data-center-nvidia-raise.html
7•voxadam•46m ago•1 comment