I hear the reason for this is that llama.cpp keeps breaking basic things, which has made it an unreliable partner. It seems this is what Ollama is trying to address by loosening its ties to llama.cpp and contacting the companies training these models directly to arrange simultaneous releases (e.g. GPT-OSS).
They do release high-quality inference code, e.g. https://github.com/mistralai/mistral-inference
baggiponte•4h ago