And, I'm getting worried that someday Anthropic will say "Hey, yeah, about that Max plan which is $100/mo. Sorry, we decided we need to charge you $5000/mo. Oh, and LOL, btws, that's if you commit to an annual plan."
Or, a Google rep will email me saying "Sundar (it wasn't me!) says you were too critical of Google on HN that one time four years ago (Sundar verified it isn't a gemini hallucination, but I can't really question it). So, your gemini cli is cut off immediately."
Then I'll be stuck, with no more software engineering work, because my brain will have rotted away.
For this reason, I want to run LLMs locally, using llama.cpp/ollama, and use tools like Aider. But running a "big" model on my hardware is tough. The quality of output and all the things that make claude and gemini so powerful just aren't there with the combination of local LLMs and tools like Aider, at least when everything runs locally. Perhaps I'm doing it wrong?
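For context, my setup is roughly this (qwen2.5-coder is just a model I happened to try, and I may have the exact aider model prefix or flags wrong):

    ollama pull qwen2.5-coder:7b               # any local coding model will do
    ollama serve                               # if it isn't already running
    export OLLAMA_API_BASE=http://127.0.0.1:11434
    aider --model ollama/qwen2.5-coder:7b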
I wonder why I can't find a model that does only Python, is good at just that, and can run locally. When I need to do zig, I can switch to a zig model and unload the Python one from memory. If it only does a single language, and it doesn't need to know about US presidential elections, couldn't it be very small, something I could run on my macOS M1 laptop with 16GB of RAM?
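In my head the workflow would look something like this (the per-language model names are made up, and I'm assuming ollama will unload one model and load the other on demand):

    ollama stop python-only-coder              # unload the Python model (newer ollama versions have this)
    aider --model ollama/zig-only-coder        # the zig model gets loaded on the first request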
I feel like models get big when they get generalized. I am never working on a codebase that has Rails and FastAPI and Elixir and React and Svelte and Go and Rust and COBOL. I might work on a repo with TypeScript and Python, but rarely more than that, and I'm usually focused on either the frontend or the backend.
If this is the solution, are language foundations building their own models? Is this already happening on huggingface or somewhere else?
This seems like an approach where a language foundation could train and certify its own model, and it would be safe and "open source" and "open weights."
Is there a big stupid assumption I'm making here that makes this idea impossible?
ben_w•1h ago
I also wonder this.
My suspicion — based on what I experienced with local image generating models, but otherwise poorly educated — is that they need all of the other stuff besides programming languages just to understand what your plain English prompt means in the first place, and they need to be quite bulky models to have any kind of coherency over token horizons longer than one single function.
Of interest: Apple does ship a coding LLM in Xcode that's (IIRC) 2 GB and it really just does feel like fancy Swift-only autocomplete.