I built a voice-to-text tool that runs entirely in your browser. No account required for the free tier, no data sent to my servers.
Try it: https://voicetotextonline.com
Why I built this:
- Existing tools require signups, have minute limits, or cost money
- Google Docs voice typing requires a Google account
- Dragon costs $150-500
- Otter.ai has free tier limits
(A) Free Features (no account required):
1/ Core Transcription:
- Real-time voice-to-text using Web Speech API
- 55+ languages supported
- Auto-punctuation & sentence case options
- Works offline after first load (PWA)
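For anyone curious what a sentence case option can look like, here is a minimal sketch. The function name `toSentenceCase` and the regex approach are my own illustration, not necessarily how the site does it, and it only handles lowercase ASCII letters after `.`, `!` or `?`.

```typescript
// Capitalize the first letter of the text and the first letter after ., ! or ?.
// Illustrative only: ignores abbreviations, quotes, and non-ASCII letters.
function toSentenceCase(text: string): string {
  return text.replace(
    /(^\s*|[.!?]\s+)([a-z])/g,
    (_match: string, prefix: string, letter: string) => prefix + letter.toUpperCase(),
  );
}
```

Calling `toSentenceCase("hello world. how are you?")` capitalizes the "h" in both sentences and leaves everything else untouched.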
2/ AI Enhance (added based on user survey – 80% voted yes):
- Auto-fix grammar, punctuation & formatting
- One-click cleanup of transcripts
3/ My Projects (local storage):
- Save transcripts to browser localStorage
- Organize with folders (Notes, Work, Personal, etc.)
- Custom folders & tags
- Search across all transcripts
- Edit, copy, download as TXT
- 100% private – never leaves your device
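A rough sketch of how saving and searching transcripts in localStorage could work. The `Transcript` shape, the `"transcripts"` key, and all function names are assumptions for illustration; the post doesn't describe the real schema. The store is typed as a small subset of the browser `Storage` interface so the same code runs against `window.localStorage` or an in-memory stand-in.

```typescript
// Subset of the browser Storage interface (getItem/setItem only).
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// Hypothetical record shape, not the app's actual schema.
interface Transcript {
  id: string;
  folder: string;
  tags: string[];
  text: string;
}

const STORAGE_KEY = "transcripts"; // assumed key name

function loadTranscripts(store: KeyValueStore): Transcript[] {
  const raw = store.getItem(STORAGE_KEY);
  if (raw === null) return [];
  try {
    const parsed: unknown = JSON.parse(raw);
    return Array.isArray(parsed) ? (parsed as Transcript[]) : [];
  } catch {
    return []; // corrupted JSON: treat as empty instead of crashing
  }
}

// Insert a transcript, replacing any existing one with the same id.
function saveTranscript(store: KeyValueStore, transcript: Transcript): void {
  const others = loadTranscripts(store).filter((t) => t.id !== transcript.id);
  store.setItem(STORAGE_KEY, JSON.stringify([...others, transcript]));
}

// Case-insensitive substring search over transcript text and tags.
function searchTranscripts(store: KeyValueStore, query: string): Transcript[] {
  const q = query.trim().toLowerCase();
  if (q === "") return [];
  return loadTranscripts(store).filter(
    (t) => t.text.toLowerCase().includes(q) || t.tags.some((tag) => tag.toLowerCase().includes(q)),
  );
}

// In-memory stand-in for localStorage, handy for tests or server-side rendering.
function createMemoryStore(): KeyValueStore {
  const data = new Map<string, string>();
  return {
    getItem: (key) => data.get(key) ?? null,
    setItem: (key, value) => {
      data.set(key, value);
    },
  };
}
```

In a browser you would pass `window.localStorage` as the store. Keeping everything under one key is simple but rewrites the whole list on every save, so a real implementation might split it up as the list grows.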
4/ Export:
- Copy to clipboard
- Download as TXT or DOCX
(B) Pro Features ($10/month or $1/hour pay-per-use):
1/ File Upload & Transcription:
- Upload audio/video files (MP3, WAV, M4A, MP4, MOV, AVI, MKV)
- Up to 500MB per file
- Batch upload (10 files at once)
- Powered by AssemblyAI (95%+ accuracy)
- 150 hours/month transcription
2/ Advanced Features:
- Real-time progress with ETA
- Speaker labels
- In-browser audio recording (5 min with pause/resume)
- Translation to 25+ languages (GPT-4o)
3/ Export Formats:
- TXT, SRT, VTT, JSON with timestamps
- Segment-level timestamp precision
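To make the SRT/VTT export concrete, here is a small sketch of turning timestamped segments into both formats. The `Segment` shape (start and end in milliseconds) and the function names are assumptions for illustration, not the site's actual code. The main difference between the formats is that SRT numbers each cue and uses a comma before the milliseconds, while WebVTT starts with a `WEBVTT` header and uses a period.

```typescript
// Hypothetical segment shape: start/end in milliseconds plus the spoken text.
interface Segment {
  startMs: number;
  endMs: number;
  text: string;
}

const pad = (n: number, width: number): string => String(n).padStart(width, "0");

// Format milliseconds as HH:MM:SS<sep>mmm ("," for SRT, "." for WebVTT).
function formatTimestamp(ms: number, separator: "," | "."): string {
  const total = Math.max(0, Math.round(ms));
  const hours = Math.floor(total / 3600000);
  const minutes = Math.floor((total % 3600000) / 60000);
  const seconds = Math.floor((total % 60000) / 1000);
  const millis = total % 1000;
  return `${pad(hours, 2)}:${pad(minutes, 2)}:${pad(seconds, 2)}${separator}${pad(millis, 3)}`;
}

// SRT: numbered cues separated by a blank line.
function toSrt(segments: Segment[]): string {
  return segments
    .map(
      (s, i) =>
        `${i + 1}\n${formatTimestamp(s.startMs, ",")} --> ${formatTimestamp(s.endMs, ",")}\n${s.text}\n`,
    )
    .join("\n");
}

// WebVTT: "WEBVTT" header, a blank line, then unnumbered cues.
function toVtt(segments: Segment[]): string {
  const cues = segments.map(
    (s) => `${formatTimestamp(s.startMs, ".")} --> ${formatTimestamp(s.endMs, ".")}\n${s.text}\n`,
  );
  return ["WEBVTT\n", ...cues].join("\n");
}
```

For a single segment from 0 ms to 1500 ms with the text "Hello", `toSrt` produces a cue numbered 1 with the time range `00:00:00,000 --> 00:00:01,500`, and `toVtt` produces the same range with periods under the `WEBVTT` header.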
4/ Cloud Storage:
- Transcription history in the cloud
- 10 GB storage, 1,000 files/month
(C) Data & Privacy:
Free tier:
- All transcripts stored in browser localStorage only
- Never touches our servers
- 100% private
Pro tier:
- Audio files stored in Supabase (encrypted)
- Files retained for 30 days for re-download, then auto-deleted
- Transcripts stored permanently in your account
- You can delete any transcript or your entire account anytime
- We don't use your data for training
Tech stack:
- Next.js 14 (App Router)
- Web Speech API (free real-time transcription)
- AssemblyAI (Pro file transcription, 95%+ accuracy)
- OpenAI GPT-4o (AI Enhance & translation)
- Supabase (auth & storage)
- Stripe (payments)
- Tailwind CSS
- Hosted on Vercel
Limitations:
- Real-time transcription doesn't work in Firefox (Web Speech API not supported)
- Free tier accuracy depends on Chrome's speech engine
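The Firefox limitation is usually handled with feature detection before starting a recording. A minimal sketch, assuming the usual pattern of checking for `SpeechRecognition` and Chrome's prefixed `webkitSpeechRecognition` on the global object; the function names and the message text are made up for illustration.

```typescript
type RecognitionName = "SpeechRecognition" | "webkitSpeechRecognition";

// Returns the name of the available recognition constructor, or null if none exists.
// Takes the global object as a parameter so it can be exercised outside a browser.
function detectSpeechRecognition(globalObj: object): RecognitionName | null {
  const candidates: RecognitionName[] = ["SpeechRecognition", "webkitSpeechRecognition"];
  const g = globalObj as Record<string, unknown>;
  for (const name of candidates) {
    if (typeof g[name] === "function") return name;
  }
  return null;
}

// Hypothetical user-facing message based on the detection result.
function supportMessage(globalObj: object): string {
  return detectSpeechRecognition(globalObj) === null
    ? "Real-time transcription is not supported in this browser. Try Chrome."
    : "Real-time transcription is available.";
}
```

In a browser you would call `detectSpeechRecognition(window)` and show the fallback message, or hide the record button, when it returns `null`.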
Would love feedback on UX, pricing, or feature ideas. Considering open-sourcing the core transcription component.