Show HN: Gemma 4 Multimodal Fine-Tuner for Apple Silicon

https://github.com/mattmireles/gemma-tuner-multimodal
73•MediaSquirrel•2h ago
About six months ago, I started working on a project to fine-tune Whisper locally on my M2 Ultra Mac Studio with a limited compute budget. I got into it. The problem at the time was that I had 15,000 hours of audio data in Google Cloud Storage, far too much to fit on my local machine, so I built a system to stream data from GCS to my machine during training.

Gemma 3n came out, so I added that. Kinda went nuts, tbh.

Then I put it on the shelf.

When Gemma 4 came out a few days ago, I dusted it off, cleaned it up, broke out the Gemma part from the Whisper fine-tuning and added support for Gemma 4.

I'm presenting it for you here today to play with, fork and improve upon.

One thing I have learned so far: It's very easy to OOM when you fine-tune on longer sequences! My local Mac Studio has 64GB RAM, so I run out of memory constantly.

Anywho, there's a lot of interest in Gemma 4 and, frankly, you can't really do audio fine-tuning with MLX. That's really the reason this exists (in addition to my personal interest). I would have preferred to use MLX and not have had to make this, but here we are. Welcome to my little side quest.

And so I made this. I hope you have as much fun using it as I had fun making it.

-Matt

Comments

dsabanin•2h ago
Thanks for doing this. Looks interesting, I'm going to check it out soon.
MediaSquirrel•1h ago
you are welcome! It was a fun side quest
craze3•1h ago
Nice! I've been wanting to try local audio fine-tuning. Hopefully it works with music vocals too
LuxBennu•1h ago
I run Whisper large-v3 on an M2 Max with 96GB and even with just inference the memory gets tight on longer audio, so I can only imagine what fine-tuning looks like. Does 64GB vs 96GB make a meaningful difference for Gemma 4 fine-tuning, or does it just push the OOM wall back a bit? I've been wanting to try local fine-tuning on Apple Silicon, but the tooling gap has kept me on inference only so far.
MediaSquirrel•1h ago
Memory usage increases quadratically with sequence length. Therefore, using shorter sequences during fine-tuning can prevent memory explosions. On my 64GB RAM machine, I'm limited to input sequences of about 2,000 tokens, considering my average output for the fine-tuning task is around 1,000 tokens (~3k tokens total).
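To make the quadratic growth concrete, here is a rough back-of-the-envelope sketch. It assumes plain attention that materializes the full score matrix for each head, and it counts only that one matrix for a single layer. The head count (8) and 2-byte elements are hypothetical placeholders, not Gemma 4's actual configuration, and the function names are made up for illustration. Real training memory also includes weights, gradients, optimizer state and other activations, so treat this as an illustration of the scaling, not a memory predictor.

```python
def attention_scores_bytes(seq_len, num_heads=8, batch_size=1, bytes_per_element=2):
    """Bytes needed for one layer's full (seq_len x seq_len) attention score matrix.

    Assumes naive attention that stores every query-key score.
    num_heads and bytes_per_element are illustrative defaults only.
    """
    return batch_size * num_heads * seq_len * seq_len * bytes_per_element


def growth_factor(short_len, long_len, **kwargs):
    """How many times more score memory long_len needs compared to short_len."""
    return attention_scores_bytes(long_len, **kwargs) / attention_scores_bytes(short_len, **kwargs)


if __name__ == "__main__":
    # Doubling the sequence length quadruples the score-matrix memory.
    for n in (1_000, 2_000, 3_000, 6_000):
        mib = attention_scores_bytes(n) / 2**20
        print(f"{n:>6} tokens: {mib:8.1f} MiB per layer (score matrix only)")
    print("3k -> 6k growth:", growth_factor(3_000, 6_000))
```

Under these assumptions, going from ~3k to ~6k total tokens multiplies the score-matrix memory by four rather than two, which is why trimming sequence length is the first lever to pull when you hit OOM.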
yousifa•1h ago
This is super cool, will definitely try it out! Nice work
pivoshenko•56m ago
nice!
