Training custom wake words like "Hey Alexa" is often a resource-intensive task, demanding powerful hardware and complex manual tuning.
NanoWakeWord is an open-source framework designed to solve this. It features an intelligent engine that automates the ML pipeline, making it possible to build high-performance, production-ready wake word models with minimal effort.
What makes it different:
Train Anywhere, On Anything: The core architecture is built for extreme efficiency. You can train on massive, terabyte-scale datasets using a standard laptop or even a low-spec machine, all without needing a GPU. This is achieved through memory-mapped files that stream data directly from disk, keeping RAM usage roughly constant regardless of dataset size.
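The memory-mapping idea can be sketched in a few lines with NumPy (this is a generic illustration of the technique, not NanoWakeWord's internal code; the file path and batch size are made up for the example):

```python
import os
import tempfile

import numpy as np

# Stand-in for a large on-disk feature archive.
path = os.path.join(tempfile.mkdtemp(), "features.npy")
n_samples, n_features = 10_000, 96
np.save(path, np.random.rand(n_samples, n_features).astype(np.float32))

# Memory-map the file: the OS pages data in on demand, so resident
# memory stays small no matter how large the file on disk is.
data = np.load(path, mmap_mode="r")

def batches(arr, batch_size=256):
    """Yield training batches directly from disk, one slice at a time."""
    for start in range(0, len(arr), batch_size):
        # Slicing a memmap only reads the touched pages from disk.
        yield np.asarray(arr[start:start + batch_size])

total = sum(len(b) for b in batches(data))
print(total)  # every sample visited with only one small batch in RAM
```

The same pattern scales to terabyte files because the working set is always just the current batch plus whatever pages the OS chooses to cache.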
Intelligent Automation: The framework analyzes your data to automatically configure an optimal model architecture, learning schedule, and training parameters. It removes the guesswork from building a robust model.
Total Flexibility and Control: While it automates everything, it also offers deep customization. You can choose from 11+ built-in architectures (from lightweight DNNs to SOTA Conformers) or easily extend the framework to add your own custom architecture. Every single parameter generated by the engine can be manually overridden for full control.
Smarter Data Processing: It moves beyond generic negatives. The system performs phonetic analysis on your wake word to synthesize acoustically confusing counter-examples, which drastically reduces the false positive rate in real-world use.
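As a toy illustration of the counter-example idea (this is not the framework's actual phonetic analysis, and the substitution table below is a hypothetical, heavily simplified one), acoustically confusing negatives can be generated by swapping similar-sounding phones:

```python
# Crude map of acoustically similar substitutions, operating on
# graphemes as a stand-in for a real phoneme-level analysis.
CONFUSABLE = {
    "b": ["p"], "p": ["b"],  # voiced/unvoiced bilabial stops
    "d": ["t"], "t": ["d"],  # voiced/unvoiced alveolar stops
    "m": ["n"], "n": ["m"],  # nasals
    "s": ["z"], "z": ["s"],  # sibilants
}

def confusable_variants(word: str) -> set[str]:
    """Generate near-miss words via one similar-sounding substitution."""
    variants = set()
    for i, ch in enumerate(word):
        for sub in CONFUSABLE.get(ch, []):
            variants.add(word[:i] + sub + word[i + 1:])
    return variants

print(sorted(confusable_variants("nano")))  # -> ['mano', 'namo']
```

Training the model to reject such near-misses is what pushes down the false positive rate on phrases that merely sound like the wake word.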
Ready for the Edge: Models are exported to the standard ONNX format. The framework also includes a lightweight, stateful streaming inference engine designed for low-latency performance on devices like the Raspberry Pi.
Try It in Your Browser (No Install Needed):
A single Google Colab notebook serves as a playground for training your first model. Inside, you can select and experiment with any of the available architectures in just a few clicks.
Launch the Training Notebook: https://colab.research.google.com/github/arcosoph/nanowakewo...
The goal is to produce models with an extremely low false positive rate (in testing, fewer than one false activation per 16-28 hours of audio on average).
The project is actively developed by Arcosoph, and all feedback or questions are highly welcome!
Key Links:
GitHub Repo: https://github.com/arcosoph/nanowakeword
PyPI Package: https://pypi.org/project/nanowakeword/
Pre-trained Models: https://huggingface.co/arcosoph/nanowakeword-models#pre-trai...
Discord Community: https://discord.gg/rYfShVvacB