Typing out complex prompts is too slow when you're vibe coding or generating media. I built a full-stack Voice Mode component (UI + logic + transcription) for React/Next.js. It handles the awkward browser audio stuff so you don't have to.
Also used Gemini 3 to generate that entire page in one prompt. :-)
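
For anyone wondering what "awkward browser audio stuff" usually involves, here is a rough TypeScript sketch of two typical pieces: a small state machine for the record/transcribe flow, and a helper that picks a recording format the current browser supports. This is not the component's actual code. Every name here (`VoiceState`, `VoiceEvent`, `voiceReducer`, `pickMimeType`) and the candidate MIME list are made up for illustration.

```typescript
// Illustrative sketch only: these names are hypothetical, not the component's API.

type VoiceState =
  | { status: "idle" }
  | { status: "requesting" } // waiting for mic permission
  | { status: "recording"; mimeType: string }
  | { status: "transcribing" }
  | { status: "done"; transcript: string }
  | { status: "error"; message: string };

type VoiceEvent =
  | { type: "START" }
  | { type: "GRANTED"; mimeType: string }
  | { type: "STOP" }
  | { type: "TRANSCRIBED"; transcript: string }
  | { type: "FAIL"; message: string }
  | { type: "RESET" };

// Pure reducer: events that arrive out of order leave the state unchanged.
function voiceReducer(state: VoiceState, event: VoiceEvent): VoiceState {
  switch (event.type) {
    case "START":
      // Ignore a repeated START while a session is already in flight.
      return state.status === "idle" ||
        state.status === "done" ||
        state.status === "error"
        ? { status: "requesting" }
        : state;
    case "GRANTED":
      return state.status === "requesting"
        ? { status: "recording", mimeType: event.mimeType }
        : state;
    case "STOP":
      return state.status === "recording" ? { status: "transcribing" } : state;
    case "TRANSCRIBED":
      return state.status === "transcribing"
        ? { status: "done", transcript: event.transcript }
        : state;
    case "FAIL":
      return { status: "error", message: event.message };
    case "RESET":
      return { status: "idle" };
  }
}

// Assumed candidate list; browsers differ in which containers they can record.
const CANDIDATE_MIME_TYPES: readonly string[] = [
  "audio/webm;codecs=opus",
  "audio/webm",
  "audio/mp4",
];

// Returns the first candidate the support check accepts, or undefined if none.
// The check is injected so the function also runs outside a browser.
function pickMimeType(
  isSupported: (mimeType: string) => boolean,
  candidates: readonly string[] = CANDIDATE_MIME_TYPES,
): string | undefined {
  return candidates.find(isSupported);
}
```

In a browser you would call `pickMimeType((t) => MediaRecorder.isTypeSupported(t))`, and the reducer drops straight into React's `useReducer`. Keeping both pieces pure means they can be unit tested without a microphone.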