Neutts-air – Open-source, on device TTS

https://github.com/neuphonic/neutts-air
105•nopelynopington•4mo ago

Comments

nopelynopington•4mo ago
If this lives up to the demo it's a huge development for anyone looking to do realistic tts without paying to use an API
kristopolous•4mo ago
There are quite a number of pretty low-overhead models around that do that in real time these days.
foofoo12•4mo ago
no
MarsIronPI•3mo ago
But how many of them support voice cloning?

(Genuine question; I haven't seen any other than this one.)

nickthegreek•3mo ago
Microsoft's VibeVoice.
MarsIronPI•3mo ago
VibeVoice (according to the repo description) is currently unavailable due to "misuse". But my impression was that it required a significant (>8GB) amount of VRAM? Or at least that it wasn't suitable for on-device use on low-spec hardware.
nickthegreek•3mo ago
It's unavailable from their repo, but it was released under an open license and mirrors exist. I'm not sure what the VRAM requirements are.
MarsIronPI•3mo ago
According to this issue[0] the 1.5B model needs 6GB of VRAM. Meanwhile it looks like NeuTTS is designed to be able to run on CPU, which is nice for older/lower-spec hardware.

0: https://github.com/microsoft/VibeVoice/issues/26#issuecommen...

gardnr•4mo ago
The model weighs 1.5GB [1] (the q4 quant is ~500MB)

The demo is impressive. It uses reference audio at inference time, and it looks like the training code is mostly available [2][3] with a reference dataset [4] as well.

From the README:

> NeuTTS Air is built off Qwen 0.5B

1. https://huggingface.co/neuphonic/neutts-air/tree/main

2. https://github.com/neuphonic/neutts-air/issues/7

3. https://github.com/neuphonic/neutts-air/blob/feat/example-fi...

4. https://huggingface.co/datasets/neuphonic/emilia-yodas-engli...

curioussquirrel•4mo ago
Could we finally get a decent open-source TTS app for Android? This project is very cool.
hsjdbsjeveb•4mo ago
SherpaTTS?

On Fdroid

deknos•4mo ago
I thought this uses Coqui, which is not really open source?
noman-land•4mo ago
SherpaTTS is decent.
curioussquirrel•3mo ago
Will check it out, thx!
mewmix•3mo ago
https://github.com/mewmix/nabu

You could try out nabu and let me know what you think. I am working on adding more TTS models in the future. It features all the Kokoro voices, style mixing to create your own blend from their voices, basic Kitten TTS support, an audiobook / screen reader, LLMs, and more :)

ks2048•4mo ago
Every couple of weeks I see a new TTS model showcased here and it’s always difficult to see how they differ from one another. Why don’t they describe the architecture and details of the training data?

My cynical side thinks people just take the state-of-the-art open source model, use an LLM to alter the source, do minimal fine-tuning to change the weights, and then claim “we built our own state of the art tts”.

I know it’s open source, so I can dig into the details myself, but are there any good high-level overviews of modern TTS, comparing/contrasting the top models?

DecoPerson•4mo ago
Without the resources to do a study to see if the quality is actually better or worse than other options, these open-TTS models must be judged by what you think of their output. (That is, do your own study.)

I've found some of them to be surprisingly good. I keep a list of them, as I have future project ideas that might need a good one, and each has its own merits.

I've yet to find one that does good spoken informal Chinese. I'd appreciate it if anyone can suggest one!

popalchemist•4mo ago
The special sauce here is that it is built on a very small LLM (Qwen) which means this can run on CPU-only, or even on micro devices like Raspberry Pi or a mobile phone.

Architecturally it's similar to other LLM-based TTS models (like OuteTTS), but the underlying LLM lets them release it under an Apache 2 license.

joshstrange•4mo ago
This is really neat. I cloned my voice and can generate text, but I can't seem to generate longer clips. The README.md says:

> Context Window: 2048 tokens, enough for processing ~30 seconds of audio (including prompt duration)

But it's cutting off for me before even that point. I fed it a paragraph of text and it gets part of the way through it before skipping a few words ahead, saying a few words more, then cutting off at 17 seconds. Another test just cut off after 21 seconds (no skipping).

Lastly, I'm on a MBP M3 Max with 128GB running Sequoia. I'm following all the "Guidelines for minimizing Latency" but generating a 4.16 second clip takes 16.51s for me. Not sure what I'm doing wrong or how you would use this in practice since it's not realtime and the limit is so low (and unclear). Maybe you are supposed to cut your text into smaller chunks and run them in parallel/sequence to get around the limit?
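A minimal sketch of that chunking idea, assuming sentence boundaries are acceptable cut points. `chunk_text`, `synthesize_long`, and the `synthesize` callable are hypothetical names, not part of the NeuTTS Air API; `synthesize` stands in for whatever call produces audio for one short string, and the 200-character budget is an arbitrary placeholder to tune against the context window:

```python
import re
from typing import Any, Callable, List


def chunk_text(text: str, max_chars: int = 200) -> List[str]:
    """Split text into sentence-aligned chunks of at most max_chars characters.

    A single sentence longer than max_chars is kept whole rather than cut mid-word.
    """
    # Split after '.', '!' or '?' when followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # Adding this sentence would exceed the budget: close the chunk.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def synthesize_long(
    text: str,
    synthesize: Callable[[str], Any],  # hypothetical: one short string in, one audio clip out
    max_chars: int = 200,
) -> List[Any]:
    """Run the caller-supplied synthesize() on each chunk, in order."""
    return [synthesize(chunk) for chunk in chunk_text(text, max_chars)]
```

The per-chunk clips would still need to be concatenated afterwards, and prosody may not carry smoothly across chunk boundaries.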

oidar•4mo ago
I really wish these voice-cloning TTS models would incorporate some sort of prosody control.
miki123211•4mo ago
> Install espeak (required dependency)

This means using this TTS in a commercial project is very dicey due to GPLv3.

mlla•4mo ago
If only English support is required, eSpeak could be replaced with MisakiSwift, which is under Apache 2.0: https://github.com/mlalma/MisakiSwift
diggan•4mo ago
Unfortunately, it seems to be Mac/iPhone only. Any cross-platform alternatives?
baby•4mo ago
BTW I was looking to train a TTS on my voice. What's the best way to do that locally today?
aitchnyu•4mo ago
Tangentially, how easy is it to verify the watermark with a smartphone, and how easy is it to erase it?
ottah•3mo ago
Removing the watermark looks trivial: https://github.com/neuphonic/neutts-air/blob/d9761a3d938b06c...

Watermarking is usually very fragile and generally relies on an adversary not knowing about it. I honestly don't know why anyone bothers with it.

kanwisher•4mo ago
Need to hook this up to Home Assistant
mrklol•4mo ago
The model says it only supports English; it seems the demos on their page for other languages use an older model, as the quality is worse.

But the current one seems really good. I tested it quite a bit with multiple kinds of inputs.