Launch HN: Uplift (YC S25) – Voice models for under-served languages

49•zaidqureshi•3h ago
Hi HN, we are Zaid, Muhammad and Hammad, the co-founders of Uplift AI (https://upliftai.org). We build models that speak underserved languages — today: Urdu, Sindhi, and Balochi.

A billion people worldwide can't read. In countries like Pakistan – the 5th most populous country – 42% of adults are illiterate. This holds back the entire economy: patients can't read medical reports, parents can't help with homework, banks can't go fully digital, farmers can't research best practices, and people memorize smartphone app button sequences. Voice AI interfaces can fix all of this, and we think this will perhaps be one of the great benefits of modern AI.

Right now, existing voice models barely work for these languages, and big tech is moving slowly.

Uplift AI was originally a side project to make datasets for translation and voice models. For us it was a "cool side-thing" to work on, not an "important full-time thing" to work on. With some initial data we hacked together an Urdu voice bot on WhatsApp and gave it to one domestic worker. In two days 800 people were using it. When we dived deeper into understanding the users, we learned that text interfaces don't work for sooo many. So we started Uplift AI to solve this problem full-time.

The most challenging part is that all the building blocks needed for great voice models are broken for these languages. For example, if you are creating a speech synthesis model, you will scrape a lot of data from YouTube and auto-label it using a transcription model… all very easy to do in English. But it doesn't work in under-served languages because the transcription models are not accurate.
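
To make the auto-labeling problem concrete: a common way to measure how far a transcription model's output is from a human reference is word error rate (WER). The sketch below is only an illustration and not a description of Uplift's pipeline. The function names and the 0.2 threshold are hypothetical, and whitespace tokenization is a simplifying assumption.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word missing from hypothesis
                curr[j - 1] + 1,     # extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


def keep_clip(human_transcript: str, auto_transcript: str, max_wer: float = 0.2) -> bool:
    """Hypothetical filter: keep an auto-labeled clip only if it is close to a human check."""
    return word_error_rate(human_transcript, auto_transcript) <= max_wer
```

With an accurate transcription model most clips pass a filter like this. When the model is inaccurate, nearly everything is rejected, and auto-labeling stops being a shortcut.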

There are many other challenges. Like when you hire human transcribers to label the data, often they don't have any spell correctors for their languages, and this creates lots of noise in the data… making it hard to train models with low data. There are many more challenges in phonemes, silence detection, diacritization etc.

We solve these problems by making great internal tooling to help with data labeling. Also, we source our own data and don't buy it. This is counterintuitive, but a big advantage over companies buying data and then training. By sourcing our own data we create the right data distributions and get much better models with much less data. By doing the entire thing in-house (data, labeling, training, deploying), we are able to make much faster progress.

Today we publicly offer text-to-speech APIs for Urdu, Sindhi, and Balochi. Here's a video which shows this: https://www.loom.com/share/dcd5020967444c228e9c127151e7a9f5.

Khan Academy is using our tech to dub videos to Urdu (https://ur.khanacademy.org).

Our models excel at informational use cases (like AI bots) but need more work in emotive use-cases like poetry.

We have been giving a lot of people private access in beta mode, and today are launching our models publicly. We believe this will be the fastest way for us to learn about areas that are not performing well so we can fix them quickly.

We'd love to hear from all of you, especially around your experiences with under-served languages (not just the Pakistani ones we're starting with) and your comments in general.

Comments

akshayp29•2h ago
Pretty cool! Do you think the model would be good at other under-served languages as well? Or is it hypertuned to just these?
zaidqureshi•2h ago
The model itself can work well for new languages, it's just the process of data gathering and maintaining high quality of data is what we have to figure out as we scale across languages.

Currently the model is only given data for these languages so it doesn't know anything else.

akshayp29•2h ago
Cool - makes sense!
mandeepj•1h ago
> just the process of data gathering and maintaining high quality of data is what we have to figure out as we scale across languages.

A crawler and data ingestion pipeline will not help with that?

zaidqureshi•58m ago
Gathering audio data online is not that hard, but getting it accurately labelled is challenging, as the speech understanding systems for those languages aren't there either, so we can't automatically do that
pavlov•2h ago
Nice! Clearly a big and underserved market for voice AI solutions.

Would be nice to have some code examples for using your TTS API with Pipecat.

zaidqureshi•2h ago
I still have to make that. I did make one for LiveKit, which uses our WebSocket API designed for real-time conversation:

https://docs.upliftai.org/tutorials/livekit-voice-agent

zaidqureshi•1h ago
btw I did first try to make it with Pipecat and was having some annoying Windows issues with getting libraries installed for daily etc., so I posted something that was easily reproducible for the tutorial...
sanman8119•2h ago
Would love to see Malayalam here one day!
zaidqureshi•2h ago
Yes! I will keep track of this comment for the day we do :P
yorwba•2h ago
Unless that happens within a week or so, this thread will be locked and you won't be able to reply anymore.

It would be good to have a company blog with an RSS feed that people can subscribe to for updates.

zaidqureshi•2h ago
ah, created a quick google form for language requests! https://forms.gle/XA6nZbmBNK5K7GJv5
moinism•1h ago
Congrats on the launch! Having support for regional voices is going to open up so many opportunities.
zaidqureshi•1h ago
Agreed!
nojs•1h ago
Nice, this is really needed. Would be cool to see some of the less common regional Chinese dialects, which are widely spoken and often the only language older people speak. And even just more accurate regional accents for Mandarin.
zaidqureshi•1h ago
Wow, did not know that! Do you feel there is a gap in speech understanding here, or is personalization missing with current TTS?
_waqas_ali_•44m ago
As a Sindhi speaker myself, amazing stuff. The output is so good. This unlocks the vastness of the internet for millions of people. I am imagining something like NotebookLM but for under-served languages, or a hotline where people can call and talk/learn about anything. Do you guys have plans to create B2C products yourself?
zaidqureshi•38m ago
At the moment we are focused on making the models available through API so developers can make some cool things. We are actively monitoring to see if there is an opportunity that we will be better positioned to solve.

We are planning on hosting an online hackathon soon, so will suggest these things as ideas!

_waqas_ali_•13m ago
Fair enough. I don’t have a use case for the API yet but I am looking forward to the products that come out of this
zaidqureshi•5m ago
Maybe we will make another post in a month about all the cool products that have come out so far :)
Bilal_io•30m ago
Congratulations on the launch! I really hope it doesn't get used to launch misinformation campaigns against the country.

Are you aware of any effort to educate and fight against misinformation in Pakistan?

zaidqureshi•8m ago
Hope so! It is great that it overall has a big impact on making knowledge more accessible (e.g. Khan Academy using it to dub their content in minutes instead of weeks). But there are lots of other areas where it applies as well.
jnmandal•2m ago
Looks really cool, exciting to see. I have two questions around this:

1. Given that you are concerned with providing access to a class of folks that are traditionally ignored by technologists, do you plan to make these models usable for offline purposes? For example, an illiterate person I know from Uttarakhand: his home village is not connected to a road. Interestingly he does speak Hindi, but his native language I believe is something more obscure. To get home, he walks five hours from the terminus of a road. Connectivity is obviously both limited and intermittent. A usable device might want the voice interface embedded on it. Any plans for this?

2. I have minimal understanding of this, but as someone who has learned Hindi/Urdu as a foreign language in the US, I am often in mixed conversation w/ both Indians and Pakistanis. There never seem to be any issues with communication. I have heard that certain terms (like for example "khub suraat", "shukria", "kitaab") are more Urdu than Hindi. I also studied Arabic, Farsi, and Swahili, so I am familiar with these as loanwords from Arabic and/or Persian, but in practice I hear Hindi speakers using these terms often. Is the primary value add here political? Is it an accent thing? Thanks in advance for any explanation. This is still very much a mystery to me.
