I'm not bullish on MCP, but at the least this approach gives a good way to experiment with it for free.
You gotta help me out. What do you see holding it back?
Are you sharing any of your revenue from that $79 license fee with the https://ollama.com/ project that your app builds on top of?
Nice to have a local option, especially for some prompts.
I have a 48GB MacBook Pro and Gemma3 (one of the abliterated ones) fits my non-code use case perfectly (generating crime stories in which the reader tries to guess the killer).
For code, I still call Google to use Gemini.
What I like about ollama is that it provides a self-hosted AI provider that can be used by a variety of things. LM Studio has that too, but you have to have the whole big chonky Electron UI running. Its UI is powerful but a lot less nice than e.g. BoltAI for casual use.
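To make the "self-hosted provider" point concrete, here is a minimal sketch of another program talking to a local Ollama server over HTTP. It assumes Ollama is listening on its default port 11434 and that a model has already been pulled; the model tag and the helper names `build_request` and `generate` are placeholders for illustration, not anything from the thread.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming generate request for a local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model: str, prompt: str, timeout: float = 120.0) -> str:
    """Send the prompt and return the generated text from the JSON reply."""
    request = build_request(model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read())["response"]
```

With the server up, something like `generate("gemma3:12b", "Write a two-sentence whodunit premise.")` should return the text, assuming that tag is the one you pulled. Any client that can make an HTTP POST can do the same, which is what makes the headless setup useful.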
Upon installing, the first model offered is google/gemma-3-12b, which in fairness is pretty decent compared to others.
It's not obvious how to show the right sidebar they're talking about; it's the flask icon, which turns into a collapse icon when you click it.
I set the MCP up with playwright, asked it to read the top headline from HN and it got stuck in an infinite loop of navigating to Hacker News, but doing nothing with the output.
I wanted to try it out with a few other models, but figuring out how to download new models isn't obvious either; it turned out to be the search icon. Anyway, other models didn't fare much better; some outright ignored the tools despite having the capacity for 'tool use'.
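For anyone driving tool calls from their own script rather than the GUI, the infinite-navigation problem can be contained with a step cap and a repeat check. This is only a sketch of that guard, not how LM Studio or any MCP client actually does it; `next_call` and `execute` are hypothetical callbacks standing in for "ask the model for its next tool call" and "run the tool".

```python
from typing import Callable, Optional

ToolCall = tuple[str, str]  # (tool name, serialized arguments)


def run_with_loop_guard(
    next_call: Callable[[list[str]], Optional[ToolCall]],
    execute: Callable[[ToolCall], str],
    max_steps: int = 8,
    max_repeats: int = 2,
) -> list[str]:
    """Run tool calls until the model stops, repeats itself, or runs out of steps."""
    seen: dict[ToolCall, int] = {}
    outputs: list[str] = []
    for _ in range(max_steps):
        call = next_call(outputs)
        if call is None:  # the model gave a final answer instead of a tool call
            break
        seen[call] = seen.get(call, 0) + 1
        if seen[call] > max_repeats:
            # Same tool with identical arguments again: stop instead of looping.
            outputs.append(f"aborted: {call[0]} repeated with identical arguments")
            break
        outputs.append(execute(call))
    return outputs
```

A model that keeps asking to navigate to the same page gets cut off after `max_repeats` identical calls, and `max_steps` bounds the run even when the arguments vary slightly each time.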
chisleu•4h ago
Can't wait for it to arrive and crank up LM Studio. It's literally the first install. I'm going to download it with safari.
LM Studio is newish, and it's not a perfect interface yet, but it's fantastic at what it does, which is bringing local LLMs to the masses w/o them having to know much.
There is another project that people should be aware of: https://github.com/exo-explore/exo
Exo is this radically cool tool that automatically clusters all hosts on your network running Exo and uses their combined GPUs for increased throughput.
Like HPC environments, you are going to need ultra-fast interconnects, but it's just IP-based.
dchest•4h ago
Probably should just use llama.cpp server/ollama and not waste a gig of memory on Electron, but I like GUIs.
minimaxir•3h ago
karmakaze•4h ago
incognito124•3h ago
Oof you were NOT joking
noman-land•2h ago
teaearlgraycold•1h ago
sneak•3h ago
I haven’t been using it much. All it has on it is LM Studio, Ollama, and Stats.app.
> Can't wait for it to arrive and crank up LM Studio. It's literally the first install. I'm going to download it with safari.
lol, yup. same.
chisleu•3h ago
I'm considering ordering one of these today: https://www.newegg.com/p/N82E16816139451?Item=N82E1681613945...
It looks like it will hold 5 GPUs with a single slot open for InfiniBand.
Then local models might be lower quality, but it won't be slow! :)
kristopolous•2h ago
evo_9•45m ago
Just wondering if Claude 3.7 has seemed different lately for anyone else? Was my go-to for several months, and I'm no fan of OpenAI, but o3 has been rock solid.
teaearlgraycold•3h ago
chisleu•3h ago
I'm interested in using models for code generation, but I'm not expecting much in that regard.
I'm planning to attempt fine tuning open source models on certain tool sets, especially MCP tools.
prettyblocks•2h ago
truemotive•2h ago
prophesi•2h ago
LM Studio isn't FOSS though.
I did enjoy hooking up OpenWebUI to Firefox's experimental AI Chatbot. (Set browser.ml.chat.hideLocalhost to false and browser.ml.chat.provider to localhost:${openwebui-port}.)
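If you'd rather not click through about:config, the same two prefs can go in a `user.js` file in the Firefox profile directory. A small sketch that renders those lines; the function name is made up, and the `http://` scheme in front of localhost is my assumption, since the comment above only gives `localhost:${openwebui-port}`.

```python
import json


def render_user_js(openwebui_port: int) -> str:
    """Render user.js lines for the two prefs mentioned above."""
    prefs = {
        "browser.ml.chat.hideLocalhost": False,
        # Assumption: the provider pref wants a full URL including the scheme.
        "browser.ml.chat.provider": f"http://localhost:{openwebui_port}",
    }
    # json.dumps gives JavaScript-compatible literals (false, quoted strings).
    return "\n".join(
        f'user_pref("{name}", {json.dumps(value)});' for name, value in prefs.items()
    )
```

Call it with whatever port your OpenWebUI instance listens on and paste the result into `user.js`.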
s1mplicissimus•1h ago
noman-land•2h ago
imranq•2h ago
zackify•2h ago
Get the RTX Pro 6000 for 8.5k with double the bandwidth. It will be way better