Show HN: VoxConvo – "X but it's only voice messages"

https://voxconvo.com
9•siim•5h ago
Hi HN,

I saw this tweet: "Hear me out: X but it's only voice messages (with AI transcriptions)" - and couldn't stop thinking about it.

So I built VoxConvo.

Why this exists:

AI-generated content is drowning social media. ChatGPT replies, bot threads, AI slop everywhere.

When you hear someone's actual voice, with its tone, hesitation, and excitement, you know it's real. That authenticity is what we're losing.

So I built a simple platform where voice is the ONLY option.

The experience:

Every post is voice + transcript with word-level timestamps:

- Read mode: scan the transcript like normal text

- Listen mode: hit play and words highlight in real time

You get the emotion of voice with the scannability of text.
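
As a rough illustration of how listen-mode highlighting could work with word-level timestamps, here is a minimal sketch. The transcript shape (`word`, `start`, `end` in seconds) and the function name `active_word_index` are assumptions made for this example, not VoxConvo's actual code.

```python
from bisect import bisect_right

# Hypothetical transcript: one entry per word, times in seconds.
WORDS = [
    {"word": "so", "start": 0.00, "end": 0.18},
    {"word": "i", "start": 0.25, "end": 0.32},
    {"word": "built", "start": 0.32, "end": 0.70},
    {"word": "voxconvo", "start": 0.75, "end": 1.40},
]


def active_word_index(words, t):
    """Return the index of the word being spoken at time t, or None in a gap."""
    # Words are assumed sorted by start time, so a binary search finds
    # the last word that started at or before t.
    starts = [w["start"] for w in words]
    i = bisect_right(starts, t) - 1
    if i >= 0 and t < words[i]["end"]:
        return i
    return None


if __name__ == "__main__":
    for t in (0.1, 0.2, 0.5, 1.0):
        print(t, active_word_index(WORDS, t))
```

A player would call this on each playback tick with the current audio position and highlight the returned word. In practice the `starts` list would be built once per post rather than on every call.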

Key features:

- Voice shorts

- Real-time transcription

- Visual voice editing: clicking a word in the transcript deletes that audio segment, so you can remove filler words, mistakes, and pauses

- Word-level timestamp sync

- No LLM content generation
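
To make the visual editing idea concrete, here is a minimal sketch of turning deleted words into the audio ranges that should be kept. The data shape and the name `keep_segments` are hypothetical, and the actual audio cutting (done by whatever audio tool the backend uses) is out of scope.

```python
# Hypothetical transcript with word-level timestamps in seconds.
WORDS = [
    {"word": "um", "start": 0.0, "end": 0.3},
    {"word": "hello", "start": 0.4, "end": 0.8},
    {"word": "uh", "start": 0.9, "end": 1.1},
    {"word": "world", "start": 1.2, "end": 1.6},
]


def keep_segments(words, deleted, duration):
    """Return (start, end) audio ranges that remain after cutting deleted words."""
    # Each deleted word index becomes a cut range taken from its timestamps.
    cuts = sorted((words[i]["start"], words[i]["end"]) for i in deleted)
    segments = []
    cursor = 0.0
    for start, end in cuts:
        if start > cursor:
            segments.append((cursor, start))  # audio before this cut survives
        cursor = max(cursor, end)  # skip past the cut, tolerating overlaps
    if cursor < duration:
        segments.append((cursor, duration))  # tail after the last cut
    return segments


if __name__ == "__main__":
    # Clicking "um" (index 0) and "uh" (index 2) in a 2-second clip.
    print(keep_segments(WORDS, {0, 2}, 2.0))
```

The surviving ranges can then be concatenated into the edited clip, and the remaining words' timestamps shifted left by the total duration cut before them.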

Technical details:

Backend running on Mac Mini M1:

- TypeGraphQL + Apollo Server

- MongoDB + Atlas Search (community mongo + mongot)

- Redis pub/sub for GraphQL subscriptions

- Docker containerization, so it is ready to scale

Transcription:

- VOSK real-time GigaSpeech model, which uses about 7 GB of RAM

- WebSocket streaming for real-time partial results

- Word-level timestamp extraction plus punctuation model
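
As a sketch of how a client might consume the streamed results, the snippet below folds recognizer messages into a transcript state. It assumes JSON messages shaped like `{"partial": "..."}` for interim text and `{"text": "...", "result": [...]}` for finalized words, which resembles what VOSK recognizers emit when word output is enabled. The exact field names, the state layout, and the `apply_message` helper are assumptions for illustration.

```python
import json


def apply_message(state, raw):
    """Fold one recognizer message (a JSON string) into the transcript state."""
    msg = json.loads(raw)
    if "partial" in msg:
        # Interim hypothesis: shown to the user, replaced by the next message.
        state["pending"] = msg["partial"]
    elif "text" in msg:
        # Final result: commit the timestamped words and clear the interim text.
        state["words"].extend(msg.get("result", []))
        state["pending"] = ""
    return state


if __name__ == "__main__":
    state = {"words": [], "pending": ""}
    apply_message(state, '{"partial": "hello wor"}')
    apply_message(
        state,
        '{"text": "hello world", "result": ['
        '{"word": "hello", "start": 0.1, "end": 0.4, "conf": 1.0},'
        '{"word": "world", "start": 0.5, "end": 0.9, "conf": 0.98}]}',
    )
    print(state)
```

Keeping interim text separate from committed words means the UI can render partial results immediately without ever storing unconfirmed timestamps.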

Storage:

- Audio files are stored in AWS S3

- Everything else is local

Why Mac Mini for MVP? Validation first, scaling later. Architecture is containerized and ready to migrate. But I'd rather prove demand on gigabit fiber than burn cloud budget.

Comments

cdrini•5h ago
Neat idea! Not sure if I'm willing to register just to try it, though. Having the main feed public would be nice! Or even a sample feed.
1bpp•4h ago
How would this prevent someone from just plugging ElevenLabs into it? Or the inevitable more realistic voice models? Or just a prerecorded spam message? It's already nearly impossible to tell if some speech is human or not. I do like the idea of recovering the emotional information lost in speech -> text, but I don't think it'd help the LLM issue.
layman51•4h ago
Or also a genuine human voice reading a script that’s partially or almost entirely LLM written? I think there must be some video content creators who do that.
SrslyJosh•4h ago
Detecting "human speech" means shutting out people who cannot speak and rely on TTS for verbal communication.
estimator7292•1h ago
Also speech impediments, accents, physical disabilities, etc etc.

Tech culture just refuses to even be aware of people as physical beings. It's just spherical users in a vacuum and if you don't fit the mold, tough.

cjflog•4h ago
Did you ever use AirChat?
esafak•4h ago
So you're going to reject recordings detected as computer generated, or human recorded from a computer-generated script?

I feel like you are making your users jump through hoops to do bot and slop detection, when you ought to be investing in technology to do the same. Here is a focusing question: would you still demand audio recordings if you had that technology?

Maybe you will court an interesting set of users when you do this? I just know I will not be one of them; ain't got time for that. Good luck.

zahlman•4h ago
> I saw this tweet: "Hear me out: X but it's only voice messages (with AI transcriptions)" - and couldn't stop thinking about it.

> Why this exists: AI-generated content is drowning social media.

> Real-time transcription

... So you want to filter out AI content by requiring users to produce audio (not really any harder for AI than text), and you add AI content afterward (the transcriptions) anyway?

I really think you should think this through more.

The "authenticity" problem is fundamentally about how users discover each other. You get flooded with AI slop because the algorithm is pushing it in front of you. And that algorithm is easily gamed, and all the existing competitors are financially incentivized to implement such an algorithm and not care about the slop.

Also, I looked at the page source and it gives a strong impression that you are using AI to code the project and also that your client fundamentally works by querying an LLM on the server. It really doesn't convey the attitude supposedly motivating the project.

Nice tech demo though, I guess.

jagged-chisel•3h ago
“Sign in with Google”

:grimace:

Sorry, but I have to pass.

oulipo2•3h ago
Idea is cool, but the STT is bad (at least with an accent), and the fact that you need to edit each word is too cumbersome
teunlao•1h ago
Impressive tech execution, but the format has fundamental scaling issues.

Clubhouse lost 93% of users from peak. WhatsApp sends 7 billion voice messages daily - but those are DMs, not feeds.

The math doesn't work: reading is 50-80% faster than listening. You can skim 50 text posts in 100 seconds. 50 voice posts? 15 minutes.

Voice works async 1-to-1. You built Twitter where every tweet is a 30-second voicemail nobody has time to listen to.

The transcription proves it - users will read, not listen. Which makes this "text feed with worse UX"
