(Yes, I appreciate that some people have disabilities, e.g. motor problems, that make voice assistants a sensible choice.)
If a light cannot be on automatically when I need it (like with a motion sensor) or controlled with a dedicated button within arm's reach (like a remote on my desk), then the third best option is one that lets me control it without interrupting what I'm doing, moving from where I am, using my hands, or possessing anything (a voice assistant).
I don't have just one light per room, though. Some spaces, like my workshop or living room, have a lot of lighting options, and flitting around the room flipping a bunch of switches is clumsy and unnecessary. My preference is always automation (e.g. when I play a movie in Jellyfin, the lights dim), but there are situations where I just need to ask for the workbench light.
This is not just flip-a-switch territory.
I mostly set timers because it’s one of the few things that always works.
It’s why I haven’t enabled Gemini and won’t, and I’ll likely chuck my Nest Minis once I’m forced into an LLM-based experience. Hopefully they’ll at least still function as dumb Bluetooth speakers, but I’m not holding out hope on that end.
The thing that kills this for me (and they even mentioned it) is wake word detection. I have both the HA voice preview and FPH Satellite1 devices, plus have experimented with a few other options like a Raspberry Pi with a conference mic.
Somehow nothing is even 50% as good as my Echo devices at picking up the wake word. The assistant itself is far better, but that doesn't matter if it takes 2-3 tries to get it to listen to you. If someone solves this problem with open hardware, I'll immediately buy several.
I'd rather physically press a button on an intercom box than have something churning away, constantly processing sound.
I'm looking forward to whenever my Pebble ships so I can recreate that experience with this: https://github.com/skylord123/pebble-home-assistant-ws
Also I have all my voice assistant devices mounted to the ceiling
Could be pressed even if your hands were busy.
Hopefully the new boards will be here soon, but another issue is that I don't really have anything that can measure microamp consumption, so any testing takes days of waiting for the battery to run down :(
I do think these clones are the issue, though. They had an LED I couldn't turn off, so they'd literally shine forever. They don't seem engineered for low quiescent current, so fingers crossed with the new ones.
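If all you have is a run-down test, you can at least back out the average draw from battery capacity and runtime, which gives a rough number to compare boards against. A minimal sketch, assuming a known usable capacity in mAh and ignoring self-discharge and voltage cutoff effects; the function names and the figures in the usage note are made up for illustration, not measurements from these boards:

```python
def average_current_ua(capacity_mah: float, runtime_hours: float) -> float:
    """Average current in microamps implied by draining a battery of
    `capacity_mah` over `runtime_hours` (capacity / time, scaled to uA)."""
    if runtime_hours <= 0:
        raise ValueError("runtime_hours must be positive")
    return capacity_mah / runtime_hours * 1000.0


def expected_runtime_days(capacity_mah: float, current_ua: float) -> float:
    """Runtime in days for a battery of `capacity_mah` at a steady
    average draw of `current_ua` microamps."""
    if current_ua <= 0:
        raise ValueError("current_ua must be positive")
    hours = capacity_mah / (current_ua / 1000.0)
    return hours / 24.0
```

For example, `average_current_ua(220, 72)` treats a hypothetical 220 mAh cell that died after three days as a steady load, and `expected_runtime_days(220, 50)` shows what the same cell would manage at an assumed 50 µA. It doesn't replace a proper microamp measurement, but it turns "days of waiting" into a number.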
I haven't tried training my own wake word, though. I'm tempted to see if it improves things.
Funky chicken for Gemini
Penguin dance for OpenAI
Claude?
The wake word detection isn't great, and the audio quality is abysmal (for voice responses, not music).
Amazon has ruined their Alexa and Echo devices with ads and annoying nag messages.
I'd really like an open alternative, but the basics are lacking right now.
Some of the devices contain browsers, and people have set up hacky ways to turn them into thin clients through that, but it’s not particularly reliable IME.
I've heard that some Chinese brands that made similar hardware for Chinese consumers don’t lock their devices down, letting you flash an open install of Android on them, but I haven’t seen anyone try that IRL.
[0] https://www.home-assistant.io/voice_control/worlds-most-priv...
the core issue is prosody: kokoro and piper are trained on read speech, but conversational responses have shorter breath groups and different stress patterns on function words. that's why numbers, addresses, and hedged phrases sound off even when everything else works.
the fix is training data composition. conversational and read speech have different prosody distributions and models don't generalize across them. for self-hosted, coqui xtts-v2 [1] is worth trying if you want more natural english output than kokoro.
btw i'm lily, cofounder of rime [2]. we're solving this for business voice agents at scale, not really the personal home assistant use case, but the underlying problem is the same.
dewey•5h ago
> Understands when it is in a particular area and does not ask “which light?” when there is only one light in the area, but does correctly ask when there are multiple of the device type in the given area.
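The area-aware behaviour in that quote can be sketched roughly as follows. This is a hypothetical illustration of the logic, not Home Assistant's actual implementation; `Device`, `resolve_target`, and the area and device names are all invented for the example:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    name: str
    kind: str  # e.g. "light"
    area: str  # e.g. "workshop"


def resolve_target(devices, kind, satellite_area):
    """Pick the device a command refers to, based on where it was spoken.

    Returns (device, None) when exactly one device of `kind` is in the
    area, otherwise (None, message) with a clarifying question or error.
    """
    candidates = [
        d for d in devices if d.kind == kind and d.area == satellite_area
    ]
    if len(candidates) == 1:
        # Only one match in this area: act without asking.
        return candidates[0], None
    if not candidates:
        return None, f"I couldn't find a {kind} in the {satellite_area}."
    # Several matches: ask instead of guessing.
    names = ", ".join(d.name for d in candidates)
    return None, f"Which {kind}? I see: {names}."
```

So "turn on the light" spoken in a room with one light resolves immediately, while the same phrase in a workshop with a workbench light and an overhead light comes back with a question listing both.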
alex_young•3h ago
I set 2 timers for the same thing somehow. I then tried to cancel one of them.
Eventually they both rang and she listened when I said stop.
0_____0•3h ago
abroadwin•3h ago
Skidaddle•3h ago
sanswork•1h ago
jazzyjackson•1h ago
“There’s nothing to stop”
> me, suddenly aware of how the AI takeover will happen
xp84•8m ago
Me: "Text Jane Would you mind dropping down the robe and underpants"
Siri: Sends Jane "Would you mind dropping down"
Me: rolls eyes "Text Jane robe and underpants"
Siri: "I don't see a Jane Robe in your contacts."
Me: wishes I could drown Siri in the bathtub
It's wild to me that Apple got the ability to do the actual speech-to-text part pretty much 100% solved more than half a decade ago, yet struggles in 2026 to turn streams of very simple, correctly-transcribed text into intents in ways that even a local model can figure out. Siri is good STT, a bunch of serviceable APIs that can control lots of stuff, with the digital equivalent of a brain-damaged cat sitting at the center of it guaranteeing the worst possible experience.