I followed ARM's instructions for converting the stable-audio-open-small model to TensorFlow Lite. The instructions are Android-oriented, but they apply to iOS as well.
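I won't duplicate ARM's steps here, but as a rough sketch of what a PyTorch-to-TFLite conversion can look like with Google's ai-edge-torch (one common route; ARM's guide may use a different path, and the module and shapes below are stand-ins, not the real submodels):

    import torch
    import torch.nn as nn
    import ai_edge_torch

    # Stand-in for one of the pipeline's submodels; the real conversion
    # traces the actual pretrained module with its real input shapes.
    submodel = nn.Conv1d(64, 64, kernel_size=3, padding=1).eval()
    sample_input = (torch.randn(1, 64, 1024),)

    edge_model = ai_edge_torch.convert(submodel, sample_input)
    edge_model.export("submodel.tflite")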
The first issue I encountered was that TensorFlow Lite's Metal/GPU delegates don't seem able to run the three submodels, because many of the models' operations are unsupported. That left the regular CPU delegate with XNNPack as the only route to reasonable performance. Even with XNNPack enabled, though, the autoencoder, the final stage in the pipeline, was both slow and memory-hungry; its transient memory usage alone precluded running the app on older devices with less RAM.
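Rather than discovering this on-device, you can ask TFLite up front which ops the GPU delegate rejects, using its model analyzer. A minimal sketch; the file name is a placeholder for one of the converted submodels:

    import tensorflow as tf

    # Lists every op in the converted graph and flags the ones the
    # GPU delegate cannot run. "dit.tflite" is a placeholder name.
    tf.lite.experimental.Analyzer.analyze(
        model_path="dit.tflite",
        gpu_compatibility=True,
    )

That doesn't change the outcome here, though: the CPU path was the only viable one, and the autoencoder was still the bottleneck.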
To work around this, I used Apple's coremltools to convert just the original PyTorch autoencoder to a Core ML model. I was happy to find that this worked: performance and memory usage improved significantly, enabling the app to run on older devices. The bundle size also shrank, though it remains on the large side.
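The conversion itself is only a few lines with coremltools. Here is a rough sketch assuming a traceable decoder module; the module, tensor names, and shapes below are illustrative stand-ins, not the real stable-audio-open-small ones:

    import torch
    import torch.nn as nn
    import coremltools as ct

    # Stand-in for the pretrained autoencoder decoder; the real conversion
    # loads and traces the actual module with its true latent shapes.
    decoder = nn.ConvTranspose1d(64, 2, kernel_size=4, stride=2, padding=1).eval()

    example_latents = torch.randn(1, 64, 1024)          # (batch, channels, frames)
    traced = torch.jit.trace(decoder, example_latents)  # TorchScript for coremltools

    mlmodel = ct.convert(
        traced,
        inputs=[ct.TensorType(name="latents", shape=example_latents.shape)],
        minimum_deployment_target=ct.target.iOS16,      # emits an ML Program
        compute_units=ct.ComputeUnit.ALL,               # CPU, GPU, or Neural Engine
    )
    mlmodel.save("Autoencoder.mlpackage")

Leaving compute_units at ALL lets Core ML schedule work on the GPU or Neural Engine, which is presumably where much of the speed and memory win over the CPU-only TFLite path comes from.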
The model can seem weak where high fidelity is expected, but some of its outputs are quite unusual and unexpected, which might be valuable for creative use cases. Being able to share the audio clips as ringtones is a terrific, long overdue iOS 26 feature.
My app is called Diffuzion and is available on the App Store globally. Exporting audio is a premium feature gated by an in-app purchase, but you can use the promo code DIFFUZION4HN to unlock it for free; the code is good until November 14th. I would appreciate feedback on possible improvements and features. I have my own ideas, but you may have more compelling ones!