SpeakEasy – Voice-to-Text with File Context for AI Agents

https://speakeasydev.com
1•oFlyingPanda•2h ago

Comments

oFlyingPanda•2h ago
I've been using AI agents quite a bit over the last year or so, and one thing I've noticed is that I'm spending 4-5x more time typing than I did before I started using them (back when I was hand coding...gasp!).

Mind you, I type in Colemak and still manage to reach 140 WPM on my best days, so I'm no slow typist. My first thought was to give Windows' built-in dictation tool a try, but the formatting was awful, and I quickly found how much I missed being able to provide file context to ensure my agent had the best resources possible to fix my problem.

I figured, what better way to build my skills as an 'AI engineer' than undertaking a project in a tech stack that I know absolutely nothing about (Electron)? Well, I picked up Windsurf, spent many, many late nights over the course of a couple of weeks coding and QA testing...and I came up with SpeakEasy.

SpeakEasy is a desktop application that integrates with OpenAI's Whisper API to transform your dictated speech into wonderfully and correctly formatted text that agents are more likely to understand. I didn't stop there, though: we were still missing the ability to add file context during the voice-to-text process. So I simply added it.

Another few late nights, some edge-case testing, and voilà...a simple, easy-to-use voice-to-text tool that can provide file context to your AI agent (Windsurf and Claude Code are currently supported, using '@' syntax to add file context).

I know the entire thing was built by AI, but I'm a senior software engineer with 10+ years of experience. I like to think that even though I've lost some of the precise details of what exactly is going on under the hood, the foundational architecture this application was stood up on will hold strong, and my thorough QA testing will help prevent any disastrous bugs!

So if you're like me and tired of typing all day, I'd love for you to consider trying my app. I honestly use it every day, and sharing what I've built with other people who can appreciate it makes me incredibly happy. There's a pretty generous free tier available (100 transcriptions/month), no sign-up needed.

The only caveat here is that before using my app, you'll need to make sure you've got an OpenAI API key (stored locally and only used for Whisper API calls), otherwise you won't be able to make any transcriptions.

Once you've installed the app and gone through the onboarding process, it's pretty simple to get started. Just use your toggle hotkey (Ctrl + Shift + Space) or push-to-talk (Ctrl + `) to start a recording. Once you've finished speaking, your speech will be transcribed and inserted at your current cursor position, and the app will automatically press Enter (to let you interact fluidly with your AI agent).
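A rough sketch of the post-transcription step described above, written in Python purely for illustration (the app itself is built on Electron, and none of this is its actual code). The `Action` type, the `plan_insertion` name, and the `press_enter` switch are hypothetical; only the clipboard copy is stated to be configurable. The function just produces an ordered list of steps that some keyboard or clipboard backend would then carry out.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Action:
    """One step to perform after transcription (hypothetical type)."""
    kind: str   # "clipboard", "type", or "key"
    value: str  # text to copy or type, or the name of the key to press


def plan_insertion(transcript: str,
                   copy_to_clipboard: bool = True,
                   press_enter: bool = True) -> List[Action]:
    """Return the ordered steps for inserting a finished transcript.

    Assumption: an empty or whitespace-only transcript produces no steps,
    so a silent recording never sends a stray Enter to the agent.
    """
    text = transcript.strip()
    if not text:
        return []

    actions: List[Action] = []
    if copy_to_clipboard:
        # Keep a copy in case the cursor was not where the user wanted it.
        actions.append(Action("clipboard", text))
    # Insert the text at the current cursor position.
    actions.append(Action("type", text))
    if press_enter:
        # Submit the prompt to the agent right away.
        actions.append(Action("key", "enter"))
    return actions
```

Keeping the plan separate from the code that actually sends keystrokes makes the ordering easy to test without touching the real keyboard or clipboard.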

The transcribed text will also be saved to your clipboard (configurable) in case you weren't quite ready to insert the text. To try out the file context, just clearly say the file name while you're speaking, and the app will translate it from 'file.tsx' to '@file.tsx' and press Tab. Translating the file name into this '@' syntax and pressing Tab is what allows files to be added to context using just your voice!
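The file-name translation could look something like the following minimal sketch, assuming a simple regex over the transcript. This is not SpeakEasy's actual implementation: the extension list, the `add_file_context` name, and the use of a literal tab character to stand in for the Tab key press are all assumptions made for illustration.

```python
import re

# Hypothetical set of extensions to treat as file names; a real tool would
# likely make this configurable.
EXTENSIONS = "tsx|ts|jsx|js|py|json|md"

# Match a bare file name such as "file.tsx" that is not already preceded by
# '@' and is not part of a longer word or path.
FILE_PATTERN = re.compile(
    r"(?<![\w@./-])([\w-]+(?:\.[\w-]+)*\.(?:" + EXTENSIONS + r"))(?![\w-])"
)

# Stand-in for the Tab key press that confirms the file in the agent's picker.
TAB = "\t"


def add_file_context(transcript: str) -> str:
    """Rewrite spoken file names as '@file' mentions, each followed by a tab."""
    return FILE_PATTERN.sub(lambda m: "@" + m.group(1) + TAB, transcript)
```

For example, `add_file_context("update file.tsx please")` returns `"update @file.tsx\t please"`, while a name that already carries the '@' prefix is left untouched.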

Anyways, I hope someone at least finds what I've built cool and useful. I'd love to hear any and all feedback if you have it.

Thanks for reading, FlyingPanda

sigmaprimus•1h ago
Sounds like a pretty neat program. Have you played around with Voice Access, the application included with Windows? I'm using it right now to type this message. Unfortunately I can't use programs that have push-to-talk, as I am paralyzed from the neck down... But believe me when I say it's nice to see someone working on text to speech, and that any improvement or new application can only make my life, and the lives of people like me, better. I haven't really done much with the OpenAI offerings, but I do use Gemini CLI and it is exhausting, to the point where my throat gets dry and I have to drink some water if I want to keep working. **Edit** drinking water involves me calling a care aid to assist me!
