
OpenCiv3: Open-source, cross-platform reimagining of Civilization III

https://openciv3.org/
624•klaussilveira•12h ago•182 comments

The Waymo World Model

https://waymo.com/blog/2026/02/the-waymo-world-model-a-new-frontier-for-autonomous-driving-simula...
927•xnx•18h ago•548 comments

What Is Ruliology?

https://writings.stephenwolfram.com/2026/01/what-is-ruliology/
32•helloplanets•4d ago•24 comments

How we made geo joins 400× faster with H3 indexes

https://floedb.ai/blog/how-we-made-geo-joins-400-faster-with-h3-indexes
109•matheusalmeida•1d ago•27 comments

Jeffrey Snover: "Welcome to the Room"

https://www.jsnover.com/blog/2026/02/01/welcome-to-the-room/
9•kaonwarb•3d ago•7 comments

Unseen Footage of Atari Battlezone Arcade Cabinet Production

https://arcadeblogger.com/2026/02/02/unseen-footage-of-atari-battlezone-cabinet-production/
40•videotopia•4d ago•1 comment

Show HN: Look Ma, No Linux: Shell, App Installer, Vi, Cc on ESP32-S3 / BreezyBox

https://github.com/valdanylchuk/breezydemo
219•isitcontent•13h ago•25 comments

Monty: A minimal, secure Python interpreter written in Rust for use by AI

https://github.com/pydantic/monty
210•dmpetrov•13h ago•103 comments

Show HN: I spent 4 years building a UI design tool with only the features I use

https://vecti.com
322•vecti•15h ago•143 comments

Sheldon Brown's Bicycle Technical Info

https://www.sheldonbrown.com/
370•ostacke•18h ago•94 comments

Microsoft open-sources LiteBox, a security-focused library OS

https://github.com/microsoft/litebox
358•aktau•19h ago•181 comments

Hackers (1995) Animated Experience

https://hackers-1995.vercel.app/
477•todsacerdoti•20h ago•232 comments

Show HN: If you lose your memory, how to regain access to your computer?

https://eljojo.github.io/rememory/
272•eljojo•15h ago•160 comments

An Update on Heroku

https://www.heroku.com/blog/an-update-on-heroku/
402•lstoll•19h ago•271 comments

Dark Alley Mathematics

https://blog.szczepan.org/blog/three-points/
85•quibono•4d ago•20 comments

Vocal Guide – belt sing without killing yourself

https://jesperordrup.github.io/vocal-guide/
14•jesperordrup•2h ago•7 comments

Delimited Continuations vs. Lwt for Threads

https://mirageos.org/blog/delimcc-vs-lwt
25•romes•4d ago•3 comments

PC Floppy Copy Protection: Vault Prolok

https://martypc.blogspot.com/2024/09/pc-floppy-copy-protection-vault-prolok.html
56•kmm•5d ago•3 comments

Start all of your commands with a comma

https://rhodesmill.org/brandon/2009/commands-with-comma/
3•theblazehen•2d ago•0 comments

Was Benoit Mandelbrot a hedgehog or a fox?

https://arxiv.org/abs/2602.01122
12•bikenaga•3d ago•2 comments

How to effectively write quality code with AI

https://heidenstedt.org/posts/2026/how-to-effectively-write-quality-code-with-ai/
244•i5heu•15h ago•189 comments

Introducing the Developer Knowledge API and MCP Server

https://developers.googleblog.com/introducing-the-developer-knowledge-api-and-mcp-server/
52•gfortaine•10h ago•21 comments

I spent 5 years in DevOps – Solutions engineering gave me what I was missing

https://infisical.com/blog/devops-to-solutions-engineering
140•vmatsiiako•17h ago•63 comments

Understanding Neural Network, Visually

https://visualrambling.space/neural-network/
280•surprisetalk•3d ago•37 comments

I now assume that all ads on Apple news are scams

https://kirkville.com/i-now-assume-that-all-ads-on-apple-news-are-scams/
1058•cdrnsf•22h ago•433 comments

Why I Joined OpenAI

https://www.brendangregg.com/blog/2026-02-07/why-i-joined-openai.html
132•SerCe•8h ago•117 comments

Show HN: R3forth, a ColorForth-inspired language with a tiny VM

https://github.com/phreda4/r3
70•phreda4•12h ago•14 comments

Female Asian Elephant Calf Born at the Smithsonian National Zoo

https://www.si.edu/newsdesk/releases/female-asian-elephant-calf-born-smithsonians-national-zoo-an...
28•gmays•8h ago•11 comments

Learning from context is harder than we thought

https://hy.tencent.com/research/100025?langVersion=en
176•limoce•3d ago•96 comments

FORTH? Really!?

https://rescrv.net/w/2026/02/06/associative
63•rescrv•20h ago•22 comments

Our new SAM audio model transforms audio editing

https://about.fb.com/news/2025/12/our-new-sam-audio-model-transforms-audio-editing/
168•ushakov•1mo ago

Comments

ajcp•1mo ago
Given TikTok's insane creator-adoption rate, is Meta developing these models to build out a competing content-creation platform?
mgraczyk•1mo ago
I doubt it. Although it's possible these models will be used for creator tools, I believe the main idea is to use them for data labeling.

At the time the first SAM was created, Meta was already spending over $2B/year on human labelers. Surely that number is higher now, and research like this can dramatically increase data-labeling volume.

embedding-shape•1mo ago
> I doubt it. Although it's possible these models will be used for creator tools, I believe the main idea is to use them for data labeling.

How is creating 3D objects and characters (and something resembling bones/armature, but not quite) supposed to help with data labeling? As synthetic data for training other models, maybe, but this new release seems aimed at improving their own tooling for content creators; that's hard to deny considering their demos.

For the original SAM releases, I agree, that was probably the purpose. But these new ones that generate stuff and do effects and what not, clearly go beyond that initial scope.

yjftsjthsd-h•1mo ago
> Visual prompting: Click on the person or object in the video that’s making a sound to isolate their audio.

How does that work? Correlating sound with movement?

yodon•1mo ago
Think about it conceptually:

Could you watch a music video and say "that's the snare drum, that's the lead singer, keyboard, bass, that's the truck that's making the engine noise, that's the crowd that's cheering, oh and that's a jackhammer in the background"? So can AI.

Could you point out who is lead guitar and who is rhythm guitar? So can AI.

scarecrowbob•1mo ago
I mean, sometimes I'm mixing a show and I couldn't tell you where a specific sound is coming from....
yodon•1mo ago
> sometimes I'm mixing a show and I couldn't tell you where a specific sound is coming from

And in those situations it won't work. Is any of this really a surprise?

recursive•1mo ago
I thought about it. Still seems kind of pointless.

That doesn't seem any better than typing "rhythm guitar". In fact, it seems worse and with extra steps. Sometimes the thing making the sound is not pictured. This thing is going to make me scrub through the video until the bass player is in frame instead of just typing "bass guitar". Then it will burn some power inferring that the thing I clicked on was a bass.

yjftsjthsd-h•1mo ago
To be fair, it's one of 3 ways to prompt
janalsncm•1mo ago
If it’s anything like the original SAM, thousands of hours of annotator time.

If I had to do it synthetically, take single subjects with a single sound and combine them together. Then train a model to separate them again.
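The synthetic approach described above can be sketched in a few lines of plain Python. The signals here are toy stand-ins; a real pipeline would mix waveform arrays at varied gains and offsets, keeping the isolated clips as the training targets:

```python
def mix(sources, gains=None):
    """Mix single-subject clips into one track; the originals become the targets."""
    if gains is None:
        gains = [1.0] * len(sources)
    n = max(len(s) for s in sources)  # pad to the longest clip
    mixture = [0.0] * n
    for src, g in zip(sources, gains):
        for i, sample in enumerate(src):
            mixture[i] += g * sample
    return mixture

# Two "isolated" clips (stand-ins for e.g. a voice recording and an engine recording)
voice = [0.5, -0.5, 0.5, -0.5]
engine = [0.1, 0.1, 0.1, 0.1, 0.1, 0.1]

x = mix([voice, engine], gains=[1.0, 0.8])
# (x, [voice, engine]) is one supervised training pair: input mixture, target stems
```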

ac2u•1mo ago
I wonder if the segmentation would work with a video of a ventriloquist and a dummy?
m3kw9•1mo ago
Can I create a continuous “who farted” detector? Would be great at parties
IncreasePosts•1mo ago
Each person's unique fartprint is yet another way big tech will be tracking us
BoorishBears•1mo ago
They're already analyzing poop, what's a mic to go with your toilet camera?

https://www.kohlerhealth.com/dekoda/

samat•1mo ago
And ads based on a fart! I guess you could throw in some spectrography for content aware ads too!! ‘Hmm, I sense you like onions, you would love French soup in the restaurant downstairs today!’
rmnclmnt•1mo ago
Bighead is back! « Fart Alert »!
teeray•1mo ago
I wonder if this would be nice for hearing aid users for reducing the background restaurant babble that overwhelms the people you want to hear.
ks2048•1mo ago
I recently discovered Audacity includes plug-ins for audio separation that work great (e.g. split into vocals track and instruments track). The model it uses also originated at Facebook (demucs).
tantalor•1mo ago
Is "demucs" a pun on demux (demultiplexer)?
ipsum2•1mo ago
Yes.
TylerE•1mo ago
Audacity is very very very far from state of the art in that respect.
wellthisisgreat•1mo ago
What’s a good alternative ?
5-0•1mo ago
I suppose that depends on the use case.

For mash-ups specifically, using yt-dlp to download music and split into stems with Demucs, using the UVR frontend, before importing into a DAW is effortless. The catch is that you can't expect to get OK-ish separation on anything other than vocals and "other", which really isn't a problem for mash-ups.

https://github.com/Anjok07/ultimatevocalremovergui
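For illustration, the download-then-separate workflow described above might be wired up like this in Python. The yt-dlp and demucs flags shown are the commonly used ones; check each tool's `--help` before relying on them:

```python
import subprocess

def stem_split_commands(url, audio_path="track.wav"):
    """Build the two commands for the download-then-separate workflow."""
    # -x extracts audio only; -o sets the output path
    download = ["yt-dlp", "-x", "--audio-format", "wav", "-o", audio_path, url]
    # --two-stems gives just vocals + "other", matching the mash-up use case above
    separate = ["demucs", "--two-stems=vocals", audio_path]
    return download, separate

dl, sep = stem_split_commands("https://example.com/watch?v=...")
# subprocess.run(dl, check=True); subprocess.run(sep, check=True)
```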

nartho•1mo ago
Are there any DAW plugins that do that?
5-0•1mo ago
There are several. I've only tried one of them (free, can't remember which) but went back to UVR5.

While it's convenient not having to split stems into separate files beforehand, by using a VST, you usually end up doing so anyway while editing and arranging.

embedding-shape•1mo ago
If you're already in the Ableton ecosystem, their newly released stem separation is actually very good, at least for the small amount of testing I've done so far. Much better than demucs, which shouldn't come as a surprise I suppose.
TylerE•1mo ago
I use RipX DAW personally. It very cleanly separates vocals, guitar, bass, and drums.
vhcr•1mo ago
This new SAM model actually competes against SOTA models.

https://www.reddit.com/r/LocalLLaMA/comments/1pp9w31/ama_wit...

embedding-shape•1mo ago
Their answer:

> If you are interested in how well we do compared to demucs in particular, we can use the MUSDB18 dataset since that is the domain that demucs is trained to work well on. There our net win rate against demucs is ~17%, meaning we do perform better on the MUSDB18 test set. There are actually stronger competitors on both this domain and our "in-the-wild" instrument stem separation domain that we built for SAM Audio Bench, but we either match or beat all of the ones we tested (AudioShake, LalalAI, MoisesAI, etc.)

So a ~17% net win rate against demucs, better than the ones they tested, but they acknowledge there are better models out there even today. So not sure "competes against SOTA models" is right; "getting close to competing with SOTA models" might be more accurate.

embedding-shape•1mo ago
> for audio separation that work great

What did you compare it to? Ableton recently launched an audio separation feature too, and it's probably the best simple/useful/accurate tradeoff I've tried so far; other solutions have been lacking on at least one of those points.

yunwal•1mo ago
This is hilariously bad with music. Like I can type in the most basic thing like "string instruments" which should theoretically be super easy to isolate. You can generally one-shot this using spectral analysis libraries. And it just totally fails.
duped•1mo ago
what in theory makes those "super easy" to isolate? Humans are terrible at this to begin with, it takes years to train one of them to do it mildly well. Computers are even worse - blind source separation and the cocktail party problem have been the white whale of audio DSP for decades (and only very recently did tools become passable).
yunwal•1mo ago
The fact that you can do it with spectral analysis libraries, no LLM required.

This is much easier than full source separation. It would be different if I were asking to isolate a violin from a viola or another violin; you'd have to get much more specific about the timbre of each instrument and potentially understand what each instrument's part was.

But a vibration made from a string makes a very unique wave that is easy to pick out in a file.

duped•1mo ago
Are you making this up? What spectral analysis libraries or tools?

String instruments create similar harmonic series to horns, winds, and voice (because everything is a string in some dimension) and the major differences are in the spectral envelope, something that STFT tools are just ok at approximating because of the time/frequency tradeoff (aka: the uncertainty principle).

This is a very hard problem "in theory" to me, and I'm just above casually versed in it.

613style•1mo ago
He's not making it up and there's no reason for that tone. Strings are more straightforward to isolate compared to vocals/horns/etc because they produce a near-perfect harmonic series in parallel lines in a spectrogram. The time/frequency tradeoff exists, but it's less of a problem for strings because of their slow attack.

You can look up HPSS and python libraries like Essentia and Librosa.
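For the curious, HPSS rests on a simple observation: harmonic content forms horizontal ridges in a magnitude spectrogram while percussive content forms vertical ones, so median-filtering along time versus along frequency pulls them apart. A toy pure-Python sketch of the idea (not Librosa's actual implementation):

```python
def median(xs):
    s = sorted(xs)
    return s[len(s) // 2]

def hpss_masks(S, k=3):
    """Toy harmonic/percussive split of a magnitude spectrogram S (rows=freq, cols=time).
    Median-filter along time -> harmonic estimate; along frequency -> percussive."""
    F, T = len(S), len(S[0])
    h = [[median([S[f][max(0, min(T - 1, t + d))] for d in range(-k, k + 1)])
          for t in range(T)] for f in range(F)]
    p = [[median([S[max(0, min(F - 1, f + d))][t] for d in range(-k, k + 1)])
          for t in range(T)] for f in range(F)]
    # hard masks: a bin counts as "harmonic" where the horizontal ridge dominates
    return [[h[f][t] >= p[f][t] for t in range(T)] for f in range(F)]

# A sustained string note = one horizontal line; a drum hit = one vertical line
S = [[1.0 if f == 2 else 0.0 for t in range(8)] for f in range(6)]
for f in range(6):
    S[f][4] += 1.0  # percussive column at t=4
mask = hpss_masks(S)
# mask is True along the sustained row, False along the drum-hit column
```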

IndySun•1mo ago
Hmmm... was 'tone' a pun?

Why mention a string's 'slow attack' as less of a problem? No isolation software considers this an easy route.

Vocals are more effectively isolated by virtue of the fact that they are unique-sounding. Strings (and other sounds) are similar in some ways but far more generic. All software out there indicates this, including the examples mentioned.

mrob•1mo ago
All wind instruments and all bowed string instruments produce a perfect harmonic series while emitting a steady tone. The most important difference between timbres of different instruments is in the attack, where inharmonic tones are also generated. Several old synths used this principle to greatly increase realism, by adding brief samples of attack transients to traditional subtractive synthesis, e.g.:

https://en.wikipedia.org/wiki/Linear_arithmetic_synthesis

dleeftink•1mo ago
I might misremember, but iZotope RX and Melodyne were pretty useful in this regard.
jb1991•1mo ago
If you look at the actual harmonics of a string and of horn, you will see how wrong you are. There is a reason why they sound different to the ear.

It’s because of this that you can have a relatively inexpensive synthesizer (not sample or PCM based) that does a crude job of mimicking these different instruments by just changing the harmonics.

mrob•1mo ago
There is one important difference between the harmonics of string and wind instruments: it's possible to build a wind instrument that suppresses (although not entirely eliminates) the even harmonics, e.g. a stopped organ pipe. If it sounds like a filtered square wave it's definitely a wind instrument. But if it sounds like a filtered sawtooth wave it could be either.
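The square-vs-sawtooth distinction above is easy to verify numerically. A sketch using a plain DFT evaluated at the harmonic bins of toy, unwindowed signals:

```python
import cmath
import math

def harmonic_mags(samples, fundamental_bin, n_harmonics):
    """DFT magnitude at integer multiples of the fundamental bin."""
    N = len(samples)
    mags = []
    for h in range(1, n_harmonics + 1):
        k = fundamental_bin * h
        X = sum(s * cmath.exp(-2j * math.pi * k * n / N)
                for n, s in enumerate(samples))
        mags.append(abs(X) / N)
    return mags

N, f0 = 1024, 8  # 8 cycles across the analysis window
square = [1.0 if math.sin(2 * math.pi * f0 * n / N) >= 0 else -1.0 for n in range(N)]
saw = [2.0 * ((f0 * n / N) % 1.0) - 1.0 for n in range(N)]

sq = harmonic_mags(square, f0, 4)
sw = harmonic_mags(saw, f0, 4)
# square: strong odd harmonics, even ones near zero; sawtooth: all harmonics present
```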
coldtea•1mo ago
>what in theory makes those "super easy" to isolate? Humans are terrible at this to begin with,

Humans are amazing at it. You can discern the different instruments way better than any stem separating AI.

photon_garden•1mo ago
I had the same experience. It did okay at isolating vocals but everything else it failed or half-succeeded at.
embedding-shape•1mo ago
Like most models released for publicity rather than usefulness, they'll do great at benchmarks and single specific use cases, but no one seems to be able to release actually generalized models today.
lomase•1mo ago
Like everything AI, you just have to lie a little and people with zero clue about SOTA in audio will think this is amazing.
hamza_q_•1mo ago
Use Demucs bruh https://github.com/adefossez/demucs
yunwal•1mo ago
Hilarious that this is maintained by facebook and yet SAM fails so badly
throwaw12•1mo ago
This is super cool. Of course, it is possible to separate instrument sounds using specialized tools, but I can't wait to see how people use this model for a bunch of other use cases where it's not trivial to use those specialized tools:

* remove background noise from tech products, but keep the nature sounds

* isolate the voice of a single person and feed into STT model to improve accuracy

* isolating the sound of events in games, and many more

7734128•1mo ago
Finally a way to perhaps remove laugh tracks in the near future.
sefrost•1mo ago
There are examples on YouTube of laughter tracks being removed and there are lots of awkward pauses, so I think you'd need to edit the video to cut the pauses out entirely.

- https://www.youtube.com/watch?v=23M3eKn1FN0

- https://www.youtube.com/watch?v=DgKgXehYnnw

embedding-shape•1mo ago
Cutting the pauses will change the beats and rhythm of the scene, so you probably need to edit some of the voice lines and actual scenes too then. In the end, if you're not interested in the original performance and work, you might as well read the script instead and imagine it however you want, read it at the pace you want and so on.
vintermann•1mo ago
And have a video model render an entirely new version for you, I guess.
samuell•1mo ago
I tried this to extract some speech from an audio track with heavy wind noise (filmed on a windy seashore without a mic windscreen), and the result unfortunately was less intelligible than the original.

I got much better results, though still not perfect, with the voice isolator in ElevenLabs.

AkshatJ27•1mo ago
You can try it out in the playground: https://aidemos.meta.com/segment-anything/gallery/ There seem to be many more fun little demos by Meta here, like automatic video masking, making 3D models from 2D images, etc.
theflyestpilot•1mo ago
sample anything model?
Oras•1mo ago
To try: https://aidemos.meta.com/segment-anything/editor/segment-aud...

Github: https://github.com/facebookresearch/sam-audio

I quite like adding effects such as making the isolated speech studio-quality or broadcast-ready.

keepamovin•1mo ago
FB has been a pioneer in voice and audio, somehow. A couple of years ago FB-Research had a little repo on GitHub that was the best noise-removal / voice-isolation out there. I wanted to use it in Wisprnote and politely emailed the authors. Never heard back (that's okay), but I was so impressed with the perceptual quality and "wind removal" (so hard).
websiteapi•1mo ago
I wonder if it works for speaker diarization out of the box. I've found that open source speaker diarization that doesn't require a lot of tweaking is basically non-existent.
hamza_q_•1mo ago
Yeah I was frustrated by slow and hard to use OSS diarization too; recently released a library to address that, check it out: https://github.com/narcotic-sh/senko

Also https://zanshin.sh, if you'd like speaker diarization when watching YouTube videos

websiteapi•1mo ago
looks interesting. will check it out.
noman-land•1mo ago
Hey, thanks for this. Been trying it out and it's very fast but seems to hear more speakers than are in the audio. I didn't see a way to tweak speaker similarity settings or merge speakers in some way. Any advice?
hamza_q_•1mo ago
Thanks for checking it out!

Yeah unfortunately, since the diarization is acoustic features based, it really does require high recorded voice fidelity/quality to get the best results. However, I just added another knob to the Diarizer class called mer_cos, which controls the speaker merging threshold. The default is 0.875, so perhaps try lowering to 0.8. That should help.

I'll also get around to adding an oracle/min/max speakers feature at some point, for cases where you know the exact number of speakers ahead of time, or want to set upper/lower bounds. I've gotten busy with another project, so haven't done it yet. PRs welcome though! haha
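For readers unfamiliar with the knob: a merging threshold like mer_cos typically compares speaker-embedding centroids by cosine similarity and folds together any pair above the threshold, so lowering it merges more aggressively. A toy illustration of that mechanic, not Senko's actual implementation:

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def merge_speakers(centroids, mer_cos=0.875):
    """Greedy merge: fold each centroid into an existing speaker if similar enough."""
    merged = []
    for c in centroids:
        for m in merged:
            if cosine(c, m) >= mer_cos:
                break  # same speaker, drop the duplicate centroid
        else:
            merged.append(c)
    return merged

# Three clusters from a 2-speaker recording; the first two are near-duplicates
clusters = [[1.0, 0.0], [0.99, 0.1], [0.0, 1.0]]
two = merge_speakers(clusters)                   # default threshold merges the pair
three = merge_speakers(clusters, mer_cos=0.999)  # stricter threshold keeps all three
```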

noman-land•1mo ago
Thanks, `mer_cos` definitely gets me closer. I appreciate that. Yeah, I was thinking providing a param for the expected number of speakers would be nice. I'll check out the codebase and see if that's something I can contribute :).
hamza_q_•1mo ago
Yeah would love contributions! Here's a brief overview of how I think it can be done:

Senko has two clustering types: (1) spectral for audio < 20 mins in length, and (2) UMAP+HDBSCAN for >= 20 mins. In the clustering code, spectral actually already supports oracle/min/max speakers, but UMAP+HDBSCAN doesn't. However, someone forked Senko and added min/max speakers to that here (for oracle, I guess min = max): https://github.com/DedZago/senko/commit/c33812ae185a5cd420f2...

So I think all that's required is basically just testing this thoroughly to make sure it doesn't introduce any regressions in clustering quality. And then just wiring the oracle/min/max parameters to the Diarizer class, or diarize() func.

IndySun•1mo ago
A lot of comments here exhibit the Gell-Mann amnesia effect writ large.
AlexeyBelov•1mo ago
Your comment is just a meta-comment and that's just as bad. I suggest gently correcting people instead of just pointing out very non-specifically that someone is wrong.
IndySun•1mo ago
I have. I did. I do. But like so many cocktail sticks launched towards the mammoth, eventually one lobs a final ineffectual remark.

But also agreed (with you, yes), for the vast majority of moments, ignore and don't add more noise. But sometimes... human after all.