Something like eNPU or eTPU seems more appropriate here.
I hope it'll work on an M4 Mac Mini. Does anyone know what hardware to get? You'll need a full ATX PSU to supply power, right? And then tinygrad can do LLM inference on it?
It would work just like a discrete GPU when doing CPU+GPU inference: you'd run a few shared layers on the discrete GPU and place the rest in unified memory. You'd want to minimize CPU/GPU transfers even more than usual, since a Thunderbolt connection only gives you equivalent throughput to PCIe 4.0 x4.
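A back-of-envelope sketch of why the link speed matters less for per-token activations than for shuffling weights (all sizes and bandwidths below are illustrative assumptions, not measurements of any particular setup):

```python
# Rough estimate of link cost in a split CPU+GPU inference setup.
# Numbers are illustrative assumptions: hidden size of a large model,
# fp16 activations, and the PCIe 4.0 x4-equivalent throughput mentioned above.

def transfer_ms(bytes_moved: float, gb_per_s: float) -> float:
    """Milliseconds to move `bytes_moved` over a link of `gb_per_s` GB/s."""
    return bytes_moved / (gb_per_s * 1e9) * 1e3

hidden_size = 8192                         # assumed model width
activation_bytes = hidden_size * 2         # one fp16 hidden-state vector per token
layer_weights = 5e9                        # assumed ~5 GB chunk of weights

tb_link = 7.9   # GB/s, roughly PCIe 4.0 x4
pcie4_x16 = 31.5

# Crossing the boundary once per token is cheap...
print(f"activations over TB : {transfer_ms(activation_bytes, tb_link):.4f} ms/token")
# ...but re-streaming weights over the link is not:
print(f"5 GB weights over TB: {transfer_ms(layer_weights, tb_link):.0f} ms")
print(f"5 GB weights, x16   : {transfer_ms(layer_weights, pcie4_x16):.0f} ms")
```

So a single CPU/GPU handoff per token costs microseconds even over Thunderbolt; the slow link only bites when weights or large tensors have to move repeatedly.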
How big a bottleneck is Thunderbolt 5 compared to an SSD? Is the 120 Gbps mode only available when linked to a monitor?
That's why all the projects streaming models into the GPU from an SSD popped up recently.
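Most of those projects boil down to memory-mapping the weight file so the OS pages data in on demand instead of loading everything up front. A minimal sketch of the idea (the dummy file here stands in for a real GGUF/safetensors blob):

```python
import mmap
import os
import tempfile

# Stand-in "weights" file; real projects mmap multi-GB model files.
path = os.path.join(tempfile.mkdtemp(), "weights.bin")
with open(path, "wb") as f:
    f.write(bytes(range(256)) * 4096)  # 1 MiB of dummy data

with open(path, "rb") as f:
    mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    # Nothing is read from disk until a slice is touched; the OS pages it
    # in lazily, which is how a model larger than RAM/VRAM can still run
    # (slowly, bounded by SSD throughput).
    layer_bytes = mm[4096:8192]   # "load" one layer's worth on demand
    print(len(layer_bytes))       # 4096
    mm.close()
```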
Takes a standard PSU. However, Mac Minis don't have OCuLink, so you might be a bit limited by whatever USB-C/Thunderbolt can do.
Now if Intel can get their Arc drivers in order, we'll see some real budget fun.
https://www.newegg.com/intel-arc-pro-b70-32gb-graphics-card/...
32 GB of VRAM for $1,000, plus a $500 Mac Mini.
Article mentions: "Apple finally approved our driver for both AMD and NVIDIA"
Does not mention Intel (GPUs). Select AMD GPUs work on macOS, but...
Macs (both Intel and ARM) support TB, but eGPUs only work on Intel Macs, and basically only with AMD cards.
The good news is that for mid-range gaming the choices are solid, and CUDA workloads can run on AMD these days.
I own one of these; the cage is just a piece of plastic. Anyway, I don't think $80 is that big of a difference here. I can't really afford a 4k Nvidia GPU. Intel is my only hope.
Brand is TH3P4G3. Egpu.io has decent eGPU comparisons.
I wouldn't want all that dust in my GPU fans; I'd rather have it near my case fans. I also don't like it since I have cats and want to be able to box up the hardware. I keep the eGPU by the fuse box. If I had a larger house, I'd use a server rack.
I was recently in the market for an eGPU enclosure, but for a different niche: not eGPU/eNPU/eTPU, but putting an HBA on Thunderbolt to connect an LTO-6 drive via SAS. I went with a Sonnet, very low-profile and small. I also bought an Asus one: slightly bigger, with more fans, and TB4 instead of the Sonnet's TB3. The cages are aluminium. Both were second-hand (no warranty, but quicker shipping than waiting out Chinese New Year) and came with PSUs, which matters since you otherwise have to buy a PSU separately. For me that was no biggie, as I had a decent PSU lying around.
I hooked up a Radeon RX 9060 XT to my Fedora KDE laptop (Yoga Pro 7 14ASP9) using a Razer Core X Chroma (40 Gbps), and the performance when using the eGPU was very similar to using the Radeon 880M built into the laptop's Ryzen AI 9 365 APU.
So at least with my setup, performance is not great at all.
On paper, TB4 can push about 5 GB/s, which lands somewhere between a PCIe 3.0 x4 and x8 link, while a PCIe 4.0 x16 link can do ~31.5 GB/s.
For numbers about all PCIe generations and lane counts, see the "History and revisions" section here: https://en.wikipedia.org/wiki/PCI_Express
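The per-lane figures from that table can be turned into a quick calculator (the constants below are the approximate usable per-lane GB/s after encoding overhead):

```python
# Approximate usable bandwidth per lane (GB/s) by PCIe generation,
# matching the Wikipedia "History and revisions" table.
PCIE_GBPS_PER_LANE = {
    1: 0.25,    # 2.5 GT/s, 8b/10b encoding
    2: 0.5,     # 5 GT/s, 8b/10b
    3: 0.985,   # 8 GT/s, 128b/130b
    4: 1.969,   # 16 GT/s
    5: 3.938,   # 32 GT/s
}

def pcie_bandwidth(gen: int, lanes: int) -> float:
    """Total link bandwidth in GB/s for a given generation and lane count."""
    return PCIE_GBPS_PER_LANE[gen] * lanes

print(f"PCIe 3.0 x4 : {pcie_bandwidth(3, 4):.2f} GB/s")   # below TB4's ~5 GB/s
print(f"PCIe 3.0 x8 : {pcie_bandwidth(3, 8):.2f} GB/s")   # above it
print(f"PCIe 4.0 x16: {pcie_bandwidth(4, 16):.2f} GB/s")  # the ~31.5 figure above
```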
Edit to add: the performance I measured is in gaming workloads, not compute
Apple's decision is not constrained by server logic or ballooning costs; it is purely a client-side policy choice not to sign CUDA drivers.
Apple has a monopoly over the "M-chip" personal computer market. They have a monopoly over the iOS market with the app store. They have a monopoly over the driver market on macOS.
Like, Microsoft was found guilty of exploiting its monopoly by installing IE by default while still allowing other browser engines. On iOS, Apple bundles Safari by default and doesn't allow other browser engines.
If we apply the same standard that found MS a monopolist, then Apple obviously is one too; at the very least, I think it's fair to say reasonable people can disagree about whether Apple is a monopoly.
[0]: https://en.wikipedia.org/wiki/United_States_v._Microsoft_Cor....
Microsoft was found guilty, so clearly the bar is not what you're trying to claim.
If we have a right to repair (we broadly do not, AFAICT), then that doesn't necessarily mean that we have a right to modify and/or add new functionality.
When I repair a widget that has become broken, I merely return it to its previous non-broken state. I might also decide to upgrade it in some capacity as part of this repair process, but the act of repairing doesn't imply upgrades. At all.
> No OS provider should be allowed to dictate what software you can or not run on your own device and / or OS you have paid for.
I agree completely, but here we are anyway. We've been here for quite some time.
The machine I'm using now represents my choices and matches what matters to me, and it works closer to perfectly than any of my machines in the past.
And yes, I have worked with Macs, and no, the UX and the general tyranny of the Apple ecosystem were not something I could live with.
And yes, this machine is fast, predictable, a joy to work with, and a tool I control, not a tool that controls me. If something happens to it, I can order the part instead of paying the price of a new machine, and keep using my laptop.
Like, for phones, I want a phone which runs Linux, has NFC support, and also has iMessage so my friend who only communicates with blue-bubbles and will never message a green-bubble will still talk to me. I also want it to have regulatory approval in the country I live in so I can legally use it to make calls.
Because Apple has closed the iMessage ecosystem such that a Linux phone can't use it, such a device is impossible. I cannot vote for it.
As such, I will complain about every phone I own for the foreseeable future.
Because of that, you need an apple device around to be able to deal with iMessage users.
Thanks to Apple co-opting phone numbers, there's literally no need for anyone to ever have iMessage: messages to non-iMessage users just fall back to SMS/RCS.
Isn't that the whole point of the walled garden, that they approve things? How could they aim for, and realize, a walled garden without making things like that pass through them?
For the same reason that Microsoft requires Windows driver signing?
Drivers run with root permissions.
Because third-party drivers are usually utter dogshit. That's how Apple managed to get double the battery life over comparable Windows offerings, even in the Intel era.
Modern Macs are Macintosh descendants, while PCs, by contrast, are IBM PC descendants (technically "PC clones", but since the IBM PC no longer exists, the "clone" part has been dropped).
And with Apple Silicon Macs, the two are again very different: for example, Macs don't use NVMe drives, they use bare NAND (the controller is integrated into the SoC), and they don't use UEFI or a BIOS but a combination of Boot ROM, LLB, and iBoot.
Or I could have totally misunderstood the role of Docker in this.
Since that’s definitely a big enough use case all on its own, I wonder if such a product should really just double down on LLMs.
You want to use an Nvidia GPU for LLMs? Just buy a basic second-hand PC (the GPU is the primary cost anyway). You want a Mac for a good amount of VRAM? Buy a Mac.
With this proposed solution you get a half-baked system: on one hand the GPU is limited by the Thunderbolt port and you don't have access to all of Nvidia's tools and libraries, and on the other hand you have a system that lacks the integration of native solutions like MLX, plus the risk of breakage in future macOS updates.
If Apple was in the high-end server market, I see no reason why the company I was working for would not be running macOS on Apple hardware as servers, instead of the fleet of Linux based servers they had.