No echo cancellation, so you will need a headset.
"Not everything needs to be spoken" - this is the idea I wanted to capture with this project. It generates text for TTS (usually short) and displays relevant information on the screen (longer sections).
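A rough sketch of that split, not the project's actual code: it assumes the LLM is asked to answer in JSON with a short `speech` field and a longer `display` field. The `AssistantReply` class, the `parse_reply` function, and both field names are made up for illustration.

```python
import json
from dataclasses import dataclass


@dataclass
class AssistantReply:
    speech: str   # short text handed to TTS
    display: str  # longer text shown on screen, never spoken


def parse_reply(raw: str) -> AssistantReply:
    """Split a model reply into a spoken part and an on-screen part.

    Assumes (hypothetically) that the model returns a JSON object with
    "speech" and "display" keys. Anything else is treated as plain text
    and spoken as-is.
    """
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return AssistantReply(speech=raw.strip(), display="")
    if not isinstance(data, dict):
        return AssistantReply(speech=raw.strip(), display="")
    return AssistantReply(
        speech=str(data.get("speech", "")).strip(),
        display=str(data.get("display", "")).strip(),
    )


if __name__ == "__main__":
    reply = parse_reply('{"speech": "Found three flights.", "display": "1. 08:10\\n2. 11:45\\n3. 19:30"}')
    print("TTS:", reply.speech)
    print("Screen:", reply.display)
```

The fallback branch means a reply that is not valid JSON still gets spoken instead of being dropped.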
A few things I learned along the way:
- Collecting jargon from previous chats and passing it to the LLM-based ASR system greatly improves its ability to handle that jargon during transcription.
- OpenRouter is awesome!
- End-to-end speech-to-speech systems aren't all that great once tool calling is involved, and any serious use case will involve tool calling. So the pipeline has to go speech -> text, text processing, text -> speech anyhow.
- Once you are serious about a project, Claude Code will consume the weekly quota rather quickly. I ended up with opencode + Kimi 2.5. About 90% of the code was written by chatbots.
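
The jargon point above could look roughly like this. It is a minimal sketch under assumptions, not how this project does it: the heuristic (tokens with a digit or an inner capital letter count as jargon), the function names `collect_jargon` and `build_asr_hint`, and the idea of passing the terms as a text hint are all hypothetical. How the terms actually reach the ASR depends on the backend.

```python
import re
from collections import Counter

# Words that may contain digits, dots, or hyphens (e.g. "gpt-4o", "OpenRouter").
TOKEN_RE = re.compile(r"[A-Za-z][A-Za-z0-9.\-]*[A-Za-z0-9]|[A-Za-z]")


def looks_like_jargon(token: str) -> bool:
    """Crude heuristic: a digit anywhere, or a capital after the first letter."""
    has_digit = any(c.isdigit() for c in token)
    has_inner_capital = any(c.isupper() for c in token[1:])
    return has_digit or has_inner_capital


def collect_jargon(chats: list[str], top_n: int = 20) -> list[str]:
    """Return the most frequent jargon-looking tokens from previous chats."""
    counts: Counter[str] = Counter()
    for text in chats:
        for token in TOKEN_RE.findall(text):
            if looks_like_jargon(token):
                counts[token] += 1
    # Sort by frequency (descending), then alphabetically for a stable order.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:top_n]]


def build_asr_hint(terms: list[str]) -> str:
    """Turn the term list into a plain-text hint for an LLM-based ASR prompt."""
    if not terms:
        return ""
    return "The speaker may use these terms: " + ", ".join(terms) + "."


if __name__ == "__main__":
    previous_chats = [
        "I switched to OpenRouter for the ASR model.",
        "OpenRouter makes it easy to try TTS and ASR backends.",
    ]
    print(build_asr_hint(collect_jargon(previous_chats)))
```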
Usable, tested, vibe-coded PRs are welcome!