
Hi all, this is an early version of a side project of mine. Would love some feedback and comments.

I like karaoke, and I grew up with Asian-style karaoke, where the music video plays in the background and the lyrics run along the bottom.

Sometimes I want to sing a song and there is no karaoke video like that for it.

A few years ago I came across ML models that cleanly separate the vocals from the instrumental track of a song. That gave me the idea to chain tools together into a pipeline that takes an input music video file, extracts the audio (ffmpeg), separates the tracks (ML), transcribes the lyrics (ML), burns the timed lyrics back into the video (ffmpeg), and outputs a karaoke version of the video.
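To make the pipeline a bit more concrete, here is a rough Python sketch of the two ffmpeg ends of it plus the glue in between. This is not the app's actual code, just an illustration under some assumptions: the function names and file names are made up, the transcription step is assumed to return `(start_seconds, end_seconds, text)` tuples, the lyrics are written as an SRT file, and the final video is assumed to use the instrumental track as its audio. The two ML steps are left out since they depend on which models you pick. The `subtitles` filter also needs an ffmpeg build with libass.

```python
import subprocess
from typing import Iterable, List, Tuple

# Hypothetical shape of one transcribed lyric line: (start_s, end_s, text).
Segment = Tuple[float, float, str]


def extract_audio_cmd(video: str, wav: str) -> List[str]:
    """Build the ffmpeg command that drops the video stream and writes a WAV."""
    return ["ffmpeg", "-y", "-i", video, "-vn", "-ac", "2", "-ar", "44100", wav]


def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    ms = int(round(seconds * 1000))
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def to_srt(segments: Iterable[Segment]) -> str:
    """Turn timed lyric lines into the text of an SRT subtitle file."""
    blocks = []
    for i, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{i}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text.strip()}\n"
        )
    return "\n".join(blocks)


def burn_lyrics_cmd(video: str, instrumental: str, srt: str, out: str) -> List[str]:
    """Build the ffmpeg command that keeps the original picture, swaps in the
    instrumental audio, and burns the subtitles into the frames.

    Assumption: `srt` is a simple relative file name; paths with special
    characters need extra escaping inside the filter string.
    """
    return [
        "ffmpeg", "-y",
        "-i", video,
        "-i", instrumental,
        "-map", "0:v:0",          # picture from the original video
        "-map", "1:a:0",          # audio from the separated instrumental
        "-vf", f"subtitles={srt}",
        "-c:v", "libx264",
        "-c:a", "aac",
        "-shortest",
        out,
    ]


def run(cmd: List[str]) -> None:
    """Run one pipeline step and raise if ffmpeg exits with an error."""
    subprocess.run(cmd, check=True)
```

With something like this, the whole flow would be `run(extract_audio_cmd(...))`, then the separation and transcription models, then writing `to_srt(segments)` to a file and calling `run(burn_lyrics_cmd(...))`.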

This is an early version of the app. It's Mac only so far: it's an Electron app, but I use a Mac, and I do eventually want to make a Windows build. I've only let a few friends try it. Let me know what you think!

Show HN: KaraMagic – automatic karaoke video maker