frontpage.

Surveillance data challenges what we thought we knew about location tracking

https://www.lighthousereports.com/investigation/surveillance-secrets/
235•_tk_•3h ago•41 comments

How bad can a $2.97 ADC be?

https://excamera.substack.com/p/how-bad-can-a-297-adc-be
170•jamesbowman•6h ago•92 comments

Why Is SQLite Coded in C and not Rust

https://www.sqlite.org/whyc.html
53•plainOldText•3h ago•30 comments

How AI hears accents: An audible visualization of accent clusters

https://accent-explorer.boldvoice.com/
135•ilyausorov•7h ago•50 comments

Hacking the Humane AI Pin

https://writings.agg.im/posts/hacking_ai_pin/
26•agg23•6d ago•4 comments

Unpacking Cloudflare Workers CPU Performance Benchmarks

https://blog.cloudflare.com/unpacking-cloudflare-workers-cpu-performance-benchmarks/
58•makepanic•3h ago•6 comments

GrapheneOS is finally ready to break free from Pixels and it may never look back

https://www.androidauthority.com/graphene-os-major-android-oem-partnership-3606853/
60•MaximilianEmel•1h ago•35 comments

What Americans die from vs. what the news reports on

https://ourworldindata.org/does-the-news-reflect-what-we-die-from
315•alphabetatango•5h ago•180 comments

SmolBSD – build your own minimal BSD system

https://smolbsd.org
85•birdculture•6h ago•3 comments

A 12,000-year-old obelisk with a human face was found in Karahan Tepe

https://www.trthaber.com/foto-galeri/karahantepede-12-bin-yil-oncesine-ait-insan-yuzlu-dikili-tas...
218•fatihpense•1w ago•91 comments

Ultrasound is ushering a new era of surgery-free cancer treatment

https://www.bbc.com/future/article/20251007-how-ultrasound-is-ushering-a-new-era-of-surgery-free-...
361•1659447091•6d ago•101 comments

Meta Erases Gaza Journalist's Instagram

https://twitter.com/DropSiteNews/status/1977795050206576763
27•cramsession•26m ago•1 comments

Astronomers 'image' a mysterious dark object in the distant Universe

https://www.mpg.de/25518363/1007-asph-astronomers-image-a-mysterious-dark-object-in-the-distant-u...
184•b2ccb2•9h ago•99 comments

AppLovin nonconsensual installs

https://www.benedelman.org/applovin-nonconsensual-installs/
94•jhap•3h ago•35 comments

Show HN: An open source access logs analytics script to block bot attacks

https://github.com/tempesta-tech/webshield
16•krizhanovsky•4h ago•1 comments

Why your boss isn't worried about AI – "can't you just turn it off?"

https://boydkane.com/essays/boss
151•beyarkay•5h ago•142 comments

ADS-B Exposed

https://adsb.exposed/
268•keepamovin•13h ago•68 comments

Beyond the SQLite Single-Writer Limitation with Concurrent Writes

https://turso.tech/blog/beyond-the-single-writer-limitation-with-tursos-concurrent-writes
47•syrusakbary•1w ago•23 comments

AI and Home-Cooked Software

https://mrkaran.dev/posts/ai-home-cooked-software/
22•todsacerdoti•1w ago•13 comments

Zoo of array languages

https://ktye.github.io/
140•mpweiher•12h ago•41 comments

Show HN: Wispbit - Linter for AI coding agents

https://wispbit.com
21•dearilos•4h ago•10 comments

CSS for Styling a Markdown Post

https://webdev.bryanhogan.com/miscellaneous/styling-markdown/
3•bryanhogan•1w ago•0 comments

Prefix sum: 20 GB/s (2.6x baseline)

https://github.com/ashtonsix/perf-portfolio/tree/main/delta
72•ashtonsix•7h ago•28 comments

Testing a compiler-driven full-stack web framework

https://wasp.sh/blog/2025/10/07/how-we-test-a-web-framework
41•franjo_mindek•6d ago•9 comments

U.S. Sanctions Cambodian Conglomerate, Citing Role in 'Pig-Butchering' Scams

https://www.wsj.com/business/u-s-sanctions-cambodian-conglomerate-citing-role-in-pig-butchering-s...
55•paulpauper•3h ago•12 comments

New lab-grown human embryo model produces blood cells

https://www.cam.ac.uk/research/news/new-lab-grown-human-embryo-model-produces-blood-cells
80•gmays•5h ago•20 comments

Automatic K8s pod placement to match external service zones

https://github.com/toredash/automatic-zone-placement
76•toredash•6d ago•31 comments

Kyber (YC W23) Is Hiring an Enterprise AE

https://www.ycombinator.com/companies/kyber/jobs/BQRRSrZ-enterprise-account-executive-ae
1•asontha•11h ago

Preparing for AI's economic impact: exploring policy responses

https://www.anthropic.com/research/economic-policy-responses
10•grantpitt•4h ago•7 comments

The day my smart vacuum turned against me

https://codetiger.github.io/blog/the-day-my-smart-vacuum-turned-against-me/
201•codetiger•1w ago•92 comments

How AI hears accents: An audible visualization of accent clusters

https://accent-explorer.boldvoice.com/
135•ilyausorov•7h ago

Comments

dereknelson•7h ago
really fun discovery clicking a dot and hearing the accent. neat visualization, lots to think about!
tmshapland•7h ago
Fascinating! How did you decouple the speaker-specific vocal characteristics (timbre, pitch range) from the accent-defining phonetic and prosodic features in the latent space?
oscarfree•7h ago
We didn't explicitly. Because we finetuned this model for accent classification, the later transformer layers appear to ignore non-accent vocal characteristics. I verified this for gender for example.
JakeLester•7h ago
Thank you for sharing! The 3D visual was an interesting application of the UMAP technique.

Is there a way to subscribe to these blog posts for auto-notification?

nosrepa•3h ago
Yeah, if only there was a protocol for that.
bheadmaster•3h ago
It would have taken you a second more to type out "RSS", and turn a sarcastic comment into an informative one.

Obligatory xkcd: https://xkcd.com/1053/

ahstilde•7h ago
Why is Spanish so distributed?
ilyausorov•6h ago
Good question! It's likely because there are lots of different accents of Spanish that are distinct from each other. Our labels only capture the native language of the speaker right now, so they're all grouped together but it's definitely on our to-do list to go deeper into the sub accents of each language family!
bikeshaving•1h ago
Spanish is one of those languages I would love to see as a breakdown by country. I’m sure Chilean Spanish looks very different from Catalonian Spanish.
rkomorn•1h ago
Did you mean Catalan (which is not Spanish) or Castilian Spanish?
bikeshaving•1h ago
Yes the Spanish spoken in Spain, especially the one that’s like /ˈɡɾaθjas/ and /baɾθeˈlona/.
djmips•58m ago
But Spanish sounds very different in Spain depending on what region of the country you are talking about.
oscarfree•6h ago
Not sure, could be the large number of Spanish dialects represented in the dataset, label noise, or something else. There may just be too much diversity in the class to fit neatly in a cluster.

Also, the training dataset is highly imbalanced and Spanish is the most common class, so the model predicts it as a sort of default when it isn't confident -- this could lead to artifacts in the reduced 3d space.

zaouiamine•3h ago
This is a fascinating look at how AI interprets accents! It reminds me of some recent advancements in speech recognition tech, like Google's Dialect Recognition feature, which also attempts to adapt to different accents. I wonder how these models could be improved further to not just recognize but also appreciate the nuances of regional
afiodorov•3h ago
Apparently Persian and Russian are close. Which is surprising to say the least. I know people keep getting confused about how Portuguese from Portugal and Russian sound close yet the Persian is new to me.
zehaeva•3h ago
When I went to Portugal I was struck by how much Portuguese there does sound like Spanish with a Russian accent!
oscarfree•2h ago
Part of this is the "dark L" sound
BalinKing•2h ago
I’d guess that the sibilants, consonant clusters, and/or vowel reduction would play a big role.
binary132•2h ago
I thought I was the only one who perceived an audible similarity between Portuguese and Russian.
mh-•1h ago
I speak neither, and both also sound similar to me depending on the accents of the speakers.
djmips•1h ago
I had that too but it was Brazilian Portuguese where I noticed it.
CGMthrowaway•2h ago
Idea: Farsi and Russian both have a simple list of vowel sounds and no diphthongs, making it hard/obvious when attempting to speak English, which is rife with them and has many different vowel sounds.
ilyausorov•2h ago
Yeh they seem to be in the same "major" cluster, although Serbian/Croatian, Romanian, Bulgarian, Turkish, Polish and Czech are all close.

Turkish and Persian seem to be the nearest neighbors.

zman0225•3h ago
Going mono-tonal to that of an expressive ebook increased my "American English" score from a 52% to 92%.

I'd suggest training a little less on audio books.

djmips•1h ago
What does mono-tonal mean, and what is an expressive ebook? I assume you are not American born? I had been of the understanding that rhythm was more important than the exact sounds in comprehension.
bikeshaving•3h ago
The source code for this is unminified and very readable if you’re one of the rare few who has interesting latent spaces to visualize.

https://accent-explorer.boldvoice.com/script.js?v=5

ilyausorov•3h ago
Nothing too secret in there! We anonymized everything and anyway it's just a basic Plotly plot. Feel free to check it out.
3abiton•2h ago
Good catch. I really hate JavaScript so I never got into d3js, so Plotly was such a life saver.
ilyausorov•2h ago
Plotly is great! Much love.
agrnet•36m ago
could you explain what it means for someone to “have interesting latent spaces”? curious how you’re using that metaphor here
dcreater•3h ago
What's the dimensionality of the latent space? How were the 3 visualized dimensions selected?
oscarfree•3h ago
12 layers of 768-dim each. The 3 dimensions visualized are chosen by UMAP.
lynchdt•2h ago
Irish accent appears to break it.
oscarfree•2h ago
We are working on this - we don't have quite enough Irish speech data.
diegolas•2h ago
It would've been nice to be able to visualize the differences between the different accents in the Spanish language, really cool tho
ilyausorov•2h ago
Yeh, we would've loved to see that too. It's on our roadmap for sure. Same for some of the other languages with a large amount of unique accents like e.g. French, Chinese, Arabic, etc...
johnwatson11218•2h ago
I just got a project running whereby I used python + pdfplumber to read in 1100 pdf files, most of my humble bundle collection. I extracted the text and dumped it into a 'documents' table in postgresql. Then I used sentence transformers to reduce each 1K chunk to a single 384D vector which I wrote back to the db. Then I averaged these to produce a document level embedding as a single vector.

Then I was able to apply UMAP + HDBSCAN to this dataset and it produced a 2D plot of all my books. Later I put the discovered topic back in the db and used that to compute tf-idf for my clusters from which I could pick the top 5 terms to serve as a crude cluster label.

It took about 20 to 30 hours to finish all these steps and I was very impressed with the results. I could see my cookbooks clearly separated from my programming and math books. I could drill in and see subclusters for baking, bbq, salads etc.

Currently I'm putting it into a 2 container docker compose file, base postgresql + a python container I'm working on.
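
For anyone who wants to try the same idea, here is a rough sketch of just the averaging and cluster-labelling steps described above. It is not the actual project code: the function names `mean_embedding` and `top_terms_per_cluster` are made up for illustration, whitespace tokenization is an assumption, and the embedding, UMAP and HDBSCAN steps are left out entirely. Each cluster's concatenated text is treated as one "document" for tf-idf purposes.

```python
import math
from collections import Counter


def mean_embedding(chunk_vectors):
    """Average equal-length chunk vectors into one document-level vector."""
    if not chunk_vectors:
        raise ValueError("need at least one chunk vector")
    dim = len(chunk_vectors[0])
    if any(len(v) != dim for v in chunk_vectors):
        raise ValueError("all chunk vectors must have the same length")
    n = len(chunk_vectors)
    return [sum(v[i] for v in chunk_vectors) / n for i in range(dim)]


def top_terms_per_cluster(cluster_texts, k=5):
    """Rank terms per cluster by tf-idf and return the top k as a crude label.

    cluster_texts maps a cluster id to the concatenated text of its documents.
    Tokenization is a plain lowercase whitespace split (an assumption).
    """
    tokenized = {cid: text.lower().split() for cid, text in cluster_texts.items()}
    n_clusters = len(tokenized)

    # Document frequency: in how many clusters does each term appear?
    df = Counter()
    for tokens in tokenized.values():
        df.update(set(tokens))

    labels = {}
    for cid, tokens in tokenized.items():
        tf = Counter(tokens)
        total = len(tokens)
        scores = {
            term: (count / total) * math.log(n_clusters / df[term])
            for term, count in tf.items()
        }
        # Highest score first; ties broken alphabetically for stable output.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        labels[cid] = ranked[:k]
    return labels
```

Terms that appear in every cluster get a score of zero with this plain `log(N / df)` weighting, so they sink to the bottom of the ranking. A real run would probably also want stop-word removal and a smoothed idf.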

mertbozkir•1h ago
i love boldvoice
ilyausorov•1h ago
Thanks, we love you too
ccheever•1h ago
Very interesting
double_espresso•52m ago
this is super cool!
AprilArcus•46m ago
The Australian-Vietnamese continuum is well-explained by Australia being the geographically nearest region which can supply native English language teachers to English language learners in Vietnam, rather than by any intrinsic phonetic resemblance between Vietnamese and Australian English.
gmurphy•45m ago
Since our own accents generally sound neutral to ourselves, I would love someone to make an accent-doubler - take the differences between two accents and expand them, so an Australian can hear what they sound like to an American, or vice-versa
efskap•45m ago
BERT still making headlines in 2025, you love to see it.
pinkmuffinere•33m ago
Why do the voices all sound so similar? I'm not talking about accent, I'm talking about the pitch, timbre, and other qualities of the voices themselves. For instance, all the phrases I heard sounded like they were said by a medium-set 45 year old man. Nothing from kids, the elderly, or people with lower / higher-pitch voices. I assume this is expected from the dataset for some reason, but am really curious about that reason. Did they just get many people with similar vocal qualities but wide ranges of accents?
dwohnitmok•30m ago
From the article:

> By clicking or tapping on a point, you will hear a standardized version of the corresponding recording. The reason for voice standardization is two-fold: first, it anonymizes the speaker in the original recordings in order to protect their privacy. Second, it allows us to hear each accent projected onto a neutral voice, making it easier to hear the accent differences and ignore extraneous differences like gender, recording quality, and background noise. However, there is no free lunch: it does not perfectly preserve the source accent and introduces some audible phonetic artifacts.

> This voice standardization model is an in-house accent-preserving voice conversion model.

crazygringo•26m ago
This is fascinating in theory, but I'm confused in practice.

When I play the different recordings, which I understand have the accent "re-applied" to a neutral voice, it's very difficult to hear any actual differences in vowels, let alone prosody. Like if I click on "French", there's something vaguely different, but it's quite... off. It certainly doesn't sound like any native French speaker I've ever heard. And after all, a huge part of accent is prosody. So I'm not sure what vocal features they're considering as "accent"?

I'm also curious what the three dimensions are supposed to represent? Obviously there's no objective answer, but if they've listened to all the samples, surely they could explain the main contrasting features each dimension seems to encode?

retrac•19m ago
I'm deaf. Something close to standard Canadian English is my native language. Most native English speakers claim my speech is unmarked but I think they're being polite; it's slightly marked as unusual and some with a good ear can easily tell it's because of hearing loss.

Using the accent guesser, I have a Swedish accent. Danish and Australian English follow as a close tie.

It's not just the AI. Non-native speakers of English often think I have a foreign accent, too. Often they guess at English or Australian. Like I must have been born there and moved here when I was younger, right? I've also been asked if I was Scandinavian.

Interestingly I've noticed that native speakers never make this mistake. They sometimes recognize that I have a speech impediment but there's something about how I talk that is recognized with confidence as a native accent. That leads me to the (probably obvious) inference that whatever it is that non-native speakers use to judge accent and competency, it is different from what native speakers use. I'm guessing in my case, phrase-length tone contour. (Which I can sort of hear, and presumably reproduce well, even if I have trouble with the consonants.)

AI also really has trouble with transcribing my speech. I noticed that as early as the '90s with early speech recognition software. It was completely unusable. Even now AI transcription has much more trouble with me than with most people. Yet aside from a habit of sometimes mumbling, I'm told I speak quite clearly, by humans.

Hearing different things, as it were.

dmevich1•15m ago
Fascinating work — especially how geography and history influence accent clustering more than language families. Brilliant visualization!