Show HN: A luma dependent chroma compression algorithm (image compression)

https://www.bitsnbites.eu/a-spatial-domain-variable-block-size-luma-dependent-chroma-compression-...
20•mbitsnbites•3d ago•1 comment

Show HN: I saw this cool navigation reveal, so I made a simple HTML+CSS version

https://github.com/Momciloo/fun-with-clip-path
37•momciloo•5h ago•5 comments

Show HN: Kappal – CLI to Run Docker Compose YML on Kubernetes for Local Dev

https://github.com/sandys/kappal
40•sandGorgon•2d ago•17 comments

Show HN: Look Ma, No Linux: Shell, App Installer, Vi, Cc on ESP32-S3 / BreezyBox

https://github.com/valdanylchuk/breezydemo
293•isitcontent•1d ago•38 comments

Show HN: If you lose your memory, how to regain access to your computer?

https://eljojo.github.io/rememory/
361•eljojo•1d ago•217 comments

Show HN: I spent 4 years building a UI design tool with only the features I use

https://vecti.com
373•vecti•1d ago•170 comments

Show HN: PalettePoint – AI color palette generator from text or images

https://palettepoint.com
2•latentio•2h ago•0 comments

Show HN: Smooth CLI – Token-efficient browser for AI agents

https://docs.smooth.sh/cli/overview
97•antves•2d ago•70 comments

Show HN: R3forth, a ColorForth-inspired language with a tiny VM

https://github.com/phreda4/r3
85•phreda4•1d ago•17 comments

Show HN: I built a <400ms latency voice agent that runs on a 4GB VRAM GTX 1650

https://github.com/pheonix-delta/axiom-voice-agent
2•shubham-coder•4h ago•1 comment

Show HN: Stacky – certain block game clone

https://www.susmel.com/stacky/
3•Keyframe•5h ago•0 comments

Show HN: Artifact Keeper – Open-Source Artifactory/Nexus Alternative in Rust

https://github.com/artifact-keeper
155•bsgeraci•1d ago•64 comments

Show HN: BioTradingArena – Benchmark for LLMs to predict biotech stock movements

https://www.biotradingarena.com/hn
29•dchu17•1d ago•12 comments

Show HN: A toy compiler I built in high school (runs in browser)

https://vire-lang.web.app
3•xeouz•5h ago•1 comment

Show HN: Slack CLI for Agents

https://github.com/stablyai/agent-slack
55•nwparker•2d ago•12 comments

Show HN: ARM64 Android Dev Kit

https://github.com/denuoweb/ARM64-ADK
18•denuoweb•2d ago•2 comments

Show HN: Gigacode – Use OpenCode's UI with Claude Code/Codex/Amp

https://github.com/rivet-dev/sandbox-agent/tree/main/gigacode
23•NathanFlurry•1d ago•10 comments

Show HN: Nginx-defender – realtime abuse blocking for Nginx

https://github.com/Anipaleja/nginx-defender
3•anipaleja•7h ago•0 comments

Show HN: MCP App to play backgammon with your LLM

https://github.com/sam-mfb/backgammon-mcp
3•sam256•9h ago•1 comment

Show HN: Micropolis/SimCity Clone in Emacs Lisp

https://github.com/vkazanov/elcity
173•vkazanov•2d ago•49 comments

Show HN: I'm 75, building an OSS Virtual Protest Protocol for digital activism

https://github.com/voice-of-japan/Virtual-Protest-Protocol/blob/main/README.md
9•sakanakana00•10h ago•2 comments

Show HN: I built Divvy to split restaurant bills from a photo

https://divvyai.app/
3•pieterdy•10h ago•1 comment

Show HN: Horizons – OSS agent execution engine

https://github.com/synth-laboratories/Horizons
27•JoshPurtell•1d ago•5 comments

Show HN: Daily-updated database of malicious browser extensions

https://github.com/toborrm9/malicious_extension_sentry
14•toborrm9•1d ago•8 comments

Show HN: Falcon's Eye (isometric NetHack) running in the browser via WebAssembly

https://rahuljaguste.github.io/Nethack_Falcons_Eye/
7•rahuljaguste•1d ago•1 comment

Show HN: Slop News – HN front page now, but it's all slop

https://dosaygo-studio.github.io/hn-front-page-2035/slop-news
22•keepamovin•15h ago•6 comments

Show HN: I Hacked My Family's Meal Planning with an App

https://mealjar.app
2•melvinzammit•12h ago•0 comments

Show HN: I built a free UCP checker – see if AI agents can find your store

https://ucphub.ai/ucp-store-check/
2•vladeta•13h ago•2 comments

Show HN: Local task classifier and dispatcher on RTX 3080

https://github.com/resilientworkflowsentinel/resilient-workflow-sentinel
25•Shubham_Amb•1d ago•2 comments

Show HN: Which chef knife steels are good? Data from 540 Reddit threads

https://new.knife.day/blog/reddit-steel-sentiment-analysis
2•p-s-v•6h ago•0 comments

Show HN: VoxConvo – "X but it's only voice messages"

https://voxconvo.com
10•siim•3mo ago
Hi HN,

I saw this tweet: "Hear me out: X but it's only voice messages (with AI transcriptions)" - and couldn't stop thinking about it.

So I built VoxConvo.

Why this exists:

AI-generated content is drowning social media. ChatGPT replies, bot threads, AI slop everywhere.

When you hear someone's actual voice - their tone, hesitation, excitement - you know it's real. That authenticity is what we're losing.

So I built a simple platform where voice is the ONLY option.

The experience:

Every post is voice + transcript with word-level timestamps:

Read mode: scan the transcript like normal text. Listen mode: hit play and the words highlight in real time.

You get the emotion of voice with the scannability of text.
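
For the curious, the listen-mode highlighting is conceptually simple. A minimal sketch (field names are illustrative, not the actual client code), assuming one <span> per transcript word:

    // One entry per transcript word, as produced by the transcriber.
    interface WordStamp {
      word: string;
      start: number; // seconds
      end: number;   // seconds
    }

    // Toggle an "active" class on whichever word the playhead is currently inside.
    function syncHighlight(audio: HTMLAudioElement, words: WordStamp[], spans: HTMLSpanElement[]): void {
      audio.addEventListener("timeupdate", () => {
        const t = audio.currentTime;
        words.forEach((w, i) => {
          spans[i].classList.toggle("active", t >= w.start && t < w.end);
        });
      });
    }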

Key features:

- Voice shorts

- Real-time transcription

- Visual voice editing - clicking a word in the transcript deletes that audio segment, so you can remove filler words, mistakes, and pauses (see the sketch after this list)

- Word-level timestamp sync

- No LLM content generation
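
Visual editing is basically the inverse of the highlighting mapping: deleting a word means dropping its [start, end] slice from the audio. A hypothetical sketch (not the actual implementation) that collapses the surviving words into keep-ranges you could feed to a player or an ffmpeg trim job:

    interface EditableWord {
      word: string;
      start: number;     // seconds
      end: number;       // seconds
      deleted?: boolean; // set when the user clicks the word in the transcript
    }

    // Merge the non-deleted words into contiguous [start, end] ranges to keep.
    function keepRanges(words: EditableWord[]): Array<[number, number]> {
      const ranges: Array<[number, number]> = [];
      for (const w of words) {
        if (w.deleted) continue;
        const last = ranges[ranges.length - 1];
        if (last && w.start - last[1] < 0.05) {
          last[1] = w.end;               // adjacent word: extend the current range
        } else {
          ranges.push([w.start, w.end]); // gap (a deleted word): start a new range
        }
      }
      return ranges;
    }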

Technical details:

Backend running on Mac Mini M1:

- TypeGraphQL + Apollo Server

- MongoDB + Atlas Search (community mongo + mongot)

- Redis pub/sub for GraphQL subscriptions (see the sketch after this list)

- Docker containerization, ready to scale
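
A minimal sketch of how the Redis-backed subscriptions can be wired up, assuming TypeGraphQL 1.x plus the graphql-redis-subscriptions package; the Post type, topic name, and connection settings are illustrative, not the real schema:

    import "reflect-metadata";
    import { buildSchema, Field, ObjectType, Query, Resolver, Root, Subscription } from "type-graphql";
    import { RedisPubSub } from "graphql-redis-subscriptions";

    @ObjectType()
    class Post {
      @Field()
      id!: string;

      @Field()
      transcript!: string;
    }

    // Redis-backed pub/sub keeps subscriptions working across multiple containers.
    const pubSub = new RedisPubSub({ connection: { host: "localhost", port: 6379 } });

    @Resolver()
    class PostResolver {
      @Query(() => [Post])
      posts(): Post[] {
        return []; // placeholder; the real resolver would query MongoDB
      }

      // Delivers a Post to every subscriber whenever "NEW_POST" is published.
      @Subscription(() => Post, { topics: "NEW_POST" })
      newPost(@Root() post: Post): Post {
        return post;
      }
    }

    export async function createSchema() {
      return buildSchema({ resolvers: [PostResolver], pubSub });
    }

    // After saving a new post elsewhere:
    //   await pubSub.publish("NEW_POST", { id: "123", transcript: "hello world" });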

Transcription:

- VOSK real-time Gigaspeech model, eats about 7GB of RAM

- WebSocket streaming for real-time partial results (see the sketch after this list)

- Word-level timestamp extraction plus punctuation model
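
A minimal sketch of the client side of that loop, assuming the stock vosk-server WebSocket protocol (default port 2700) and raw 16 kHz, 16-bit mono PCM; the file name and chunk size are placeholders:

    import WebSocket from "ws";
    import { createReadStream } from "fs";

    const ws = new WebSocket("ws://localhost:2700");

    ws.on("open", () => {
      // Tell the server the sample rate, then stream raw PCM chunks.
      ws.send(JSON.stringify({ config: { sample_rate: 16000 } }));
      const pcm = createReadStream("clip.pcm", { highWaterMark: 8000 });
      pcm.on("data", (chunk) => ws.send(chunk));
      pcm.on("end", () => ws.send(JSON.stringify({ eof: 1 })));
    });

    ws.on("message", (data) => {
      const msg = JSON.parse(data.toString());
      if (msg.partial !== undefined) {
        console.log("partial:", msg.partial); // low-latency partial hypothesis
      } else if (msg.result) {
        // Final segment: msg.result holds { word, start, end, conf } entries
        // when the recognizer has word-level output enabled.
        console.log("final:", msg.text, msg.result);
      }
    });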

Storage:

- Audio files are stored in AWS S3 (see the sketch after this list)

- Everything else is local
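
And the S3 piece, roughly (AWS SDK v3; the bucket name, key layout, and content type are placeholders):

    import { S3Client, PutObjectCommand } from "@aws-sdk/client-s3";
    import { readFile } from "fs/promises";

    const s3 = new S3Client({ region: "us-east-1" });

    // Upload a finished recording; everything else (transcripts, metadata) stays local.
    export async function uploadAudio(postId: string, path: string): Promise<void> {
      await s3.send(new PutObjectCommand({
        Bucket: "voxconvo-audio",   // placeholder bucket name
        Key: `posts/${postId}.m4a`, // placeholder key layout
        Body: await readFile(path),
        ContentType: "audio/mp4",
      }));
    }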

Why a Mac Mini for the MVP? Validation first, scaling later. The architecture is containerized and ready to migrate, but I'd rather prove demand on gigabit fiber than burn cloud budget.

Comments

cdrini•3mo ago
Neat idea! Not sure if I'm willing to register just to try it, though. Having the main feed public would be nice! Or even a sample feed.
siim•3mo ago
That's a good call. While there's no general public feed, individual profiles are public. For example, here's mine: https://voxconvo.com/siim
1bpp•3mo ago
How would this prevent someone from just plugging ElevenLabs into it? Or the inevitable more realistic voice models? Or just a prerecorded spam message? It's already nearly impossible to tell if some speech is human or not. I do like the idea of recovering the emotional information lost in speech -> text, but I don't think it'd help the LLM issue.
layman51•3mo ago
Or also a genuine human voice reading a script that’s partially or almost entirely LLM written? I think there must be some video content creators who do that.
SrslyJosh•3mo ago
Detecting "human speech" means shutting out people who cannot speak and rely on TTS for verbal communication.
estimator7292•3mo ago
Also speech impediments, accents, physical disabilities, etc etc.

Tech culture just refuses to even be aware of people as physical beings. It's just spherical users in a vacuum and if you don't fit the mold, tough.

siim•3mo ago
True. However, recording voice input has higher friction than typing "ChatGPT, write me a reply."
cjflog•3mo ago
Did you ever use AirChat?
esafak•3mo ago
So you're going to reject recordings detected as computer-generated, or human-recorded from a computer-generated script?

I feel like you are making your users jump through hoops to do bot and slop detection, when you ought to be investing in technology to do the same. Here is a focusing question: would you still demand audio recordings if you had that technology?

Maybe you will court an interesting set of users when you do this? I just know I will not be one of them; ain't got time for that. Good luck.

zahlman•3mo ago
> I saw this tweet: "Hear me out: X but it's only voice messages (with AI transcriptions)" - and couldn't stop thinking about it.

> Why this exists: AI-generated content is drowning social media.

> Real-time transcription

... So you want to filter out AI content by requiring users to produce audio (not really any harder for AI than text), and you add AI content afterward (the transcriptions) anyway?

I really think you should think this through more.

The "authenticity" problem is fundamentally about how users discover each other. You get flooded with AI slop because the algorithm is pushing it in front of you. And that algorithm is easily gamed, and all the existing competitors are financially incentivized to implement such an algorithm and not care about the slop.

Also, I looked at the page source and it gives a strong impression that you are using AI to code the project and also that your client fundamentally works by querying an LLM on the server. It really doesn't convey the attitude supposedly motivating the project.

Nice tech demo though, I guess.

siim•3mo ago
Curious what made you think the backend uses LLMs for content generation?

To clarify:

1. transcription is local VOSK speech-to-text via WebSocket

2. live transcript post-processing has an optional Gemini Flash-Lite pass turned on, which tries to fix obvious transcription mistakes, nothing else. The real fix here is a more accurate transcriber.

3. backend: TypeGraphQL + MongoDB + Redis

The anti-AI stance isn't "zero AI anywhere", it's about requiring human input.

AI-generated audio is either too bad or too perfect. Real recorded voice has human imperfections.

jagged-chisel•3mo ago
“Sign in with Google”

:grimace:

Sorry, but I have to pass.

oulipo2•3mo ago
Idea is cool, but the STT is bad (at least with an accent), and having to edit each word is too cumbersome.
teunlao•3mo ago
Impressive tech execution, but the format has fundamental scaling issues.

Clubhouse lost 93% of users from peak. WhatsApp sends 7 billion voice messages daily - but those are DMs, not feeds.

The math doesn't work: reading is 50-80% faster than listening. You can skim 50 text posts in 100 seconds. 50 voice posts? 15 minutes.

Voice works async 1-to-1. You built Twitter where every tweet is a 30-second voicemail nobody has time to listen to.

The transcription proves it - users will read, not listen. Which makes this a "text feed with worse UX".

siim•3mo ago
Speaking > typing for creation.

Reading > listening for consumption.

Talk to create, read to consume.

monadoid•3mo ago
Cool idea! You should make it so that only one audio message can play at a time (currently if I click to start two, they both play simultaneously).