Hi HN,
I put together some info on repurposing the AMD BC-250, a board built around an APU (Zen 2 + RDNA 1.5 "Cyan Skillfish") originally made for blockchain appliances. These boards have been showing up on AliExpress for around $150.
Since it's an unusual hybrid chip (GFX1013), I ended up running the AI compute stack entirely through Vulkan (via Mesa/RADV).
Out of the box, the Linux kernel's TTM memory manager caps GPU-visible allocations, causing OOMs on larger models. I found that passing ttm.pages_limit=4194304 on the kernel command line (4,194,304 × 4 KiB pages = 16 GiB) lifts the cap and exposes the board's full 16GB of unified memory (UMA) to the GPU.
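For anyone who wants the setting to survive reboots, here's a rough sketch of persisting it via GRUB. This assumes a Debian/Ubuntu-style setup (`/etc/default/grub` plus `update-grub`); adapt it for your distro's bootloader.

```shell
# Append ttm.pages_limit to the default kernel command line.
# 4194304 pages x 4 KiB/page = 16 GiB, i.e. the board's whole UMA pool.
sudo sed -i 's/GRUB_CMDLINE_LINUX_DEFAULT="/&ttm.pages_limit=4194304 /' /etc/default/grub
sudo update-grub
sudo reboot

# After the reboot, confirm the parameter actually took effect:
grep -o 'ttm\.pages_limit=[0-9]*' /proc/cmdline
```

The `sed` line just splices the parameter in after the opening quote of `GRUB_CMDLINE_LINUX_DEFAULT`; editing the file by hand works just as well.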
With the memory unlocked, it makes for a fun piece of inference hardware. Currently it runs:
* Qwen3.5-35B-A3B MoE via Ollama at ~38 tok/s. (Interestingly, 27B dense models crash because the GPU lacks matrix cores, but the 35B MoE runs fine since only ~3B parameters are active per token.)
* FLUX.2-klein-9B for local image generation.
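The MoE-vs-dense throughput gap is easy to sanity-check with a bandwidth back-of-envelope: decode speed is roughly bounded by the weight bytes streamed from memory per generated token. The numbers below are my assumptions, not measurements from this board (~448 GB/s GDDR6, ~4-bit quantized weights):

```shell
# Upper bound on tok/s if every active weight is read once per token.
# Assumed: 448 GB/s memory bandwidth, 0.5 bytes/param (4-bit quant).
awk 'BEGIN {
  bw = 448e9; bytes_per_param = 0.5
  printf "MoE, 3B active:    <= %.0f tok/s\n", bw / (3e9  * bytes_per_param)
  printf "Dense, 27B active: <= %.0f tok/s\n", bw / (27e9 * bytes_per_param)
}'
```

Even as a crude ceiling, this shows why a big MoE can be comfortably interactive on this class of hardware while a similarly sized dense model would be bandwidth-starved before any compute limits kick in.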
I documented the driver workarounds, memory settings, and some benchmarks in the repo in case anyone else has one of these boards and wants to tinker with it (and not just repurpose it as a poor man's GabeCube ;)).