
April 2026 TLDR Setup for Ollama and Gemma 4 26B on a Mac mini

https://gist.github.com/greenstevester/fc49b4e60a4fef9effc79066c1033ae5
54•greenstevester•2h ago

Comments

greenstevester•2h ago
Right. So Google released Gemma 4, a 26B mixture-of-experts model that only activates 4B parameters per token.

It's essentially a model that's learned to do the absolute minimum amount of work while still getting paid. I respect that enormously.

It scores 1441 on Arena Elo — roughly the same as Qwen 3.5 at 397B and Kimi k2.5 at 1100B.

Ollama v0.19 switched to Apple's MLX framework on Apple Silicon. 93% faster decode.

They've also improved caching so your coding agents don't have to re-read the entire prompt every time. About time, I'd say.

The gist covers the full setup: install, auto-start on boot, keep the model warm in memory.

It runs on a 24GB Mac mini, which means the most expensive part of your local AI setup is still the desk you put it on.
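Not quoting the gist itself, but a minimal sketch of what that setup usually looks like with Homebrew (the model tag and keep-alive value here are illustrative assumptions, not taken from the gist):

```shell
# Install Ollama and let Homebrew's service integration start it at login
brew install ollama
brew services start ollama

# Pull the model (tag is an assumption; check what's actually published)
ollama pull gemma4:26b

# Keep the model resident so the first request isn't a cold load.
# OLLAMA_KEEP_ALIVE takes a duration; a negative value means "never unload".
export OLLAMA_KEEP_ALIVE=-1
ollama run gemma4:26b "warm-up"
```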

krzyk•36m ago
By desk, you mean that "Mac mini"? Because it is pricey. In my country it is 1000 USD (from Apple, for a basic M4 with 24GB). My desk was 1/5th of that price.

And considering that this Mac mini won't be doing anything else, is there a reason not to just buy a subscription from Claude, OpenAI, Google, etc.?

Are those open models more performant than Sonnet 4.5/4.6? Or do they at least have a bigger context?

redrove•1h ago
There is virtually no reason to use Ollama over LM Studio or the myriad of other alternatives.

Ollama is slower, and they started out as a shameless llama.cpp ripoff without giving credit. Now they've "ported" it to Go, which means they're just vibe-code-translating llama.cpp, bugs included.

iLoveOncall•1h ago
> There is virtually no reason to use Ollama over LM Studio or the myriad of other alternatives.

Hmm, the fact that Ollama is open-source, can run in Docker, etc.?
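For what it's worth, the Docker route really is short; this follows the image's documented usage (the model tag is illustrative):

```shell
# Start the server, persisting pulled models in a named volume
docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama

# Run a model inside the container
docker exec -it ollama ollama run gemma4:26b
```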

alifeinbinary•1h ago
I really like LM Studio when I can use it under Windows, but for people like me with Intel Macs + AMD GPUs, Ollama is the only option, because it can (unofficially) leverage the GPU via MoltenVK, a.k.a. Vulkan. We're still testing it and hoping to get Vulkan support into the main branch soon. It works perfectly for single GPUs, but some edge cases with multiple GPUs are unsupported until upstream support from MoltenVK comes through. But yeah, I agree: it wasn't cool to repackage Georgi's work like that.
lousken•1h ago
LM Studio is not open source, and you can't run it on a server and connect clients to it?
jedisct1•45m ago
LM Studio can absolutely run as a server.
walthamstow•12m ago
IIRC it does so by default, too. I have loads of stuff pointing at LM Studio on localhost.
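Assuming a model is already loaded in the app, LM Studio's local server exposes an OpenAI-compatible API (port 1234 by default), so "pointing stuff at it" amounts to something like this (the model name is illustrative):

```shell
# Query LM Studio's OpenAI-compatible endpoint on localhost; the model name
# is whatever you've loaded in the app
curl -s http://localhost:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "gemma4-26b", "messages": [{"role": "user", "content": "hello"}]}'
```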
meltyness•47m ago
I feel like the READMEs for these three large, popular packages already illustrate the tradeoffs better than a Hacker News argument.
gen6acd60af•45m ago
LM Studio is closed source.

And didn't Ollama independently ship a vision pipeline for some multimodal models months before llama.cpp supported it?

faitswulff•21m ago
Does LM Studio have an equivalent to the ollama launch command? i.e. `ollama launch claude --model qwen3.5:35b-a3b-coding-nvfp4`
easygenes•1h ago
Why is ollama so many people’s go-to? Genuinely curious, I’ve tried it but it feels overly stripped down / dumbed down vs nearly everything else I’ve used.

Lately I’ve been playing with Unsloth Studio and think that’s probably a much better “give it to a beginner” default.

polotics•1h ago
Ollama got some first-mover advantage back when actually building and git-pulling llama.cpp was a bit of a moat. The devs' Docker past probably made them overestimate how much mindshare they could lay claim to. However, no one really could have known how quickly things would evolve... Now I mostly recommend LM Studio to people.

What does unsloth-studio bring on top?

easygenes•1h ago
LM Studio has been around longer. I've used it for three years. I'd also agree it is generally a better beginner choice, then and now.

Unsloth Studio is more featureful (well-integrated tool calling, web search, and code execution being the headline features), and it comes from the people consistently making some of the best GGUF quants of all popular models. It's also well documented, easy to set up, and has good fine-tuning support.

diflartle•25m ago
Ollama is good enough to dabble with, and getting a model is as easy as ollama pull <model name>, vs. figuring it out yourself on Hugging Face, trying to make sense of all the goofy letters and numbers across the forty different names of a model, and needing a Hugging Face account to download.

So you start there, and eventually you want to get off the happy path; then you need to learn more about the server, and it's all so much more complicated than just using Ollama. You just want to try models, not learn the intricacies of hosting LLMs.

robotswantdata•1h ago
Why are you using Ollama? Just use llama.cpp

brew install llama.cpp

Use the built-in CLI, server, or chat interface, and hook it up to any other app.
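Concretely, that workflow looks something like this (the GGUF path is a placeholder; llama-cli and llama-server ship with the Homebrew formula):

```shell
brew install llama.cpp

# One-off prompt from the CLI (model path is a placeholder)
llama-cli -m ./gemma-4-26b-q4_k_m.gguf -p "hello"

# Or serve an OpenAI-compatible API plus a built-in web chat UI
llama-server -m ./gemma-4-26b-q4_k_m.gguf --port 8080
```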

Bigsy•28m ago
For MLX I'd guess.
boutell•39m ago
Last night I had to install the v0.20 pre-release of Ollama to use this model. So I'm wondering if these instructions are accurate.
logicallee•11m ago
In case someone would like to know what these are like on this hardware: I tested Gemma 4 32b (the ~20 GB model, the largest Gemma model Google published) and gemma4:e4b (the ~10 GB model) on this exact setup (Mac Mini M4 with 24 GB of RAM using Ollama), and I livestreamed it:

https://www.youtube.com/live/G5OVcKO70ns

The ~10 GB model is super speedy, loading in a few seconds and giving responses almost instantly. If you just want to see its performance, it says hello around the 2-minute mark in the video (and fast!), and the ~20 GB model says hello around 5 minutes 45 seconds in. The difference in their loading times and speed is substantial.

I also had each of them complete a difficult coding task; they both got it correct, but the 20 GB model was much slower. It's a bit too slow to use on this setup day to day, plus it would take almost all the memory. The 10 GB model fits comfortably on a 24 GB Mac Mini with plenty of RAM left for everything else, and it seems usable for small-size useful coding tasks.
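If you'd rather collect numbers than watch a video: Ollama reports per-request timings itself. With streaming off, `/api/generate` returns timing fields alongside the response (model tag as used above):

```shell
# Non-streaming request; the JSON response includes load_duration,
# eval_count, and eval_duration (durations are in nanoseconds)
curl -s http://localhost:11434/api/generate -d '{
  "model": "gemma4:e4b",
  "prompt": "hello",
  "stream": false
}'

# `ollama run --verbose` prints the same timing stats after each reply
```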

Show HN: European alternatives to Google, Apple, Dropbox and 120 US apps

https://only-eu.eu/en/
279•madman_dev•2h ago•105 comments

Show HN: Apfel – The free AI already on your Mac

https://apfel.franzai.com
156•franze•3h ago•24 comments


Google releases Gemma 4 open models

https://deepmind.google/models/gemma/gemma-4/
1546•jeffmcjunkin•20h ago•423 comments

Decisions that eroded trust in Azure – by a former Azure Core engineer

https://isolveproblems.substack.com/p/how-microsoft-vaporized-a-trillion
872•axelriet•20h ago•373 comments

ESP32-S31: 320MHz 2C RV32IMAFCP+CLIC, 512KB SRAM, GbE, 802.11ax, 61 GPIO

https://www.espressif.com/en/news/ESP32_S31_Release
77•topspin•5d ago•43 comments

NHS staff refusing to use FDP over Palantir ethical concerns

https://www.freevacy.com/news/financial-times/nhs-staff-refusing-to-use-fdp-over-palantir-ethical...
103•chrisjj•2h ago•23 comments

What Category Theory Teaches Us About DataFrames

https://mchav.github.io/what-category-theory-teaches-us-about-dataframes/
40•mchav•5d ago•4 comments

'Fatal decision': EU slammed for caving to US pressure on digital rules

https://www.politico.eu/article/fatal-decision-eu-slammed-for-caving-to-us-pressure-on-digital-ru...
38•nickslaughter02•1h ago•19 comments

The True Shape of Io's Steeple Mountain

https://www.weareinquisitive.com/news/hidden-in-the-shadow
68•carlosjobim•5d ago•1 comments

Tailscale's new macOS home

https://tailscale.com/blog/macos-notch-escape
486•tosh•17h ago•235 comments

Cursor 3

https://cursor.com/blog/cursor-3
446•adamfeldman•18h ago•340 comments

Intel Assured Supply Chain Product Brief

https://www.intel.com/content/www/us/en/content-details/850997/intel-assured-supply-chain-product...
9•aw-engineer•3d ago•1 comments

Artemis II's toilet is a moon mission milestone

https://www.scientificamerican.com/article/artemis-iis-toilet-is-a-moon-mission-milestone/
265•1659447091•1d ago•114 comments

Qwen3.6-Plus: Towards real world agents

https://qwen.ai/blog?id=qwen3.6
547•pretext•21h ago•187 comments

C89cc.sh – standalone C89/ELF64 compiler in pure portable shell

https://gist.github.com/alganet/2b89c4368f8d23d033961d8a3deb5c19
144•gaigalas•2d ago•45 comments

Proton Meet Isn't What They Told You It Was

https://www.sambent.com/proton-meet-isnt-what-they-told-you/
127•bundie•3h ago•107 comments

Good ideas do not need lots of lies in order to gain public acceptance (2008)

https://blog.danieldavies.com/2004/05/d-squared-digest-one-minute-mba.html
277•sedev•18h ago•122 comments

H.264 Streaming Fees: What Changed, Who's Affected, and What It Means

https://www.streamingmedia.com/Articles/ReadArticle.aspx?ArticleID=173935
23•phantomathkg•57m ago•11 comments

Vector Meson Dominance

https://johncarlosbaez.wordpress.com/2026/03/29/vector-meson-dominance/
40•chmaynard•5d ago•3 comments

New Rowhammer attacks give complete control of machines running Nvidia GPUs

https://arstechnica.com/security/2026/04/new-rowhammer-attacks-give-complete-control-of-machines-...
61•01-_-•4h ago•4 comments

Show HN: Home Maker: Declare Your Dev Tools in a Makefile

https://thottingal.in/blog/2026/03/29/home-maker/
69•sthottingal•5d ago•40 comments

LinkedIn is searching your browser extensions

https://browsergate.eu/
1779•digitalWestie•23h ago•722 comments

Switzerland hosts 'CERN of semiconductor research'

https://www.swissinfo.ch/eng/swiss-ai/switzerland-hosts-cern-of-semiconductor-research/91015332
23•teleforce•2h ago•4 comments

Maze Algorithms (1997)

https://www.astrolog.org/labyrnth/algrithm.htm
66•marukodo•2d ago•17 comments

Significant progress made on Xbox 360 recompilation

https://readonlymemo.com/rexglue-xbox-360-recompilation-interview/
129•tetrisgm•4d ago•25 comments

George Goble has died

https://www.legacy.com/us/obituaries/wlfi/name/george-goble-obituary?id=61144779
154•finaard•17h ago•32 comments

ParadeDB (YC S23) Is Hiring Database Internal Engineers (Rust)

https://paradedb.notion.site/
1•philippemnoel•14h ago

I Built an SMS Gateway with a $20 Android Phone – Jonno.nz

https://jonno.nz/posts/built-an-sms-gateway-with-a-20-dollar-android-phone/
57•jonno-nz•12h ago•13 comments

JSON Canvas Spec (2024)

https://jsoncanvas.org/spec/1.0/
120•tobr•4d ago•33 comments