https://huggingface.co/unsloth/MiniMax-M2.7-GGUF
I've been using M2.7 through the Alibaba coding plan for a bit now, and am quite impressed with its coding ability, and even more impressed when I see how small it is. Fascinating really; it makes me wonder how big the frontier models are.
How does it compare to z.ai GLM?
GLM-5 (which is all I have access to on it, not the newer GLM-5.1) is slightly better for the coding tasks I'm using them for, in that it is accurate slightly more often. Both are very good, and very close to one another in practice.
Qwen3.5-plus is also quite excellent: all of these models feel pretty similar to Sonnet 4.5 in practice, though GLM-5 can have "Opus"-like reasoning through surprisingly long context chains, I've found.
Composer 2, M2.7, and Qwen 3.6 are all capable of executing those plans just fine.
What the article describes is that the model was able to tweak its own deployment harness (memory, skills, experimentation loop, etc.) to improve performance on benchmarks. While impressive, it's not making any modifications to its own weights by, e.g., modifying the training code.
Far more people are interested only in model inference, for which an open-weights model is sufficient: it avoids the uncertainties and costs of a subscription, and it lets them build and use harnesses better suited to their specific needs than those offered commercially. Far fewer also want to do model training, for which an open-source model would be needed.
> Non-commercial use permitted based on MIT-style terms; commercial use requires prior written authorization.
And calling the non-commercial usage "MIT-style terms" is a stretch - they come with a bunch of extra restrictions about prohibited uses.
It's open weights, not open source.
There's no copyright on model weights themselves (because they are produced purely mechanically, without human creativity, the same way there's no copyright on the compiled artifacts of a piece of software or on an H.264-encoded movie file). For software and movies, copyright covers the source material, not the resulting binary, and for LLMs the source material can also be protected by copyright. The problem is that LLM makers don't own most of the copyright on the source material, and worse, they claim the training process is transformative enough to erase the copyright of the source material, so even the part of the training data they do own couldn't extend copyright protection to the weights.
It's very likely that these licenses are entirely devoid of legal value (and I don't think Meta has taken any legal action, not even a DMCA takedown, against any of the bazillion Llama finetunes on Hugging Face that violate the Llama license).
I am pretty sure MiniMax M2.7 would be much better.
I had a really bad time with it. I use (real) Claude Code for work so I know what a good model feels like. MiniMax's token plan is nice but the quality is really far from Claude models.
I needed to constantly "remind" it to get things done. Even for a four-sentence prompt in a session well below the context window, MiniMax would ignore half of it. This happens all the time. (This is Claude Code + MiniMax API, set up using the official instructions.)
Basically, if I say get A, B and C done, it will only do A and B. I say, you still need to do C, so it does C but reverts the code for A.
Things that Claude can usually one-shot take five iterations with MiniMax.
I ended up switching to Claude to get one of my personal projects done.
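For context, the "official instructions" referred to above generally amount to pointing Claude Code at a third-party Anthropic-compatible endpoint via environment variables. A minimal sketch; the endpoint URL and variable usage here are assumptions, so check MiniMax's current docs before relying on them:

```shell
# Route Claude Code to an Anthropic-compatible third-party endpoint.
# The base URL below is an assumption; substitute the one from MiniMax's docs.
export ANTHROPIC_BASE_URL="https://api.minimax.io/anthropic"
export ANTHROPIC_AUTH_TOKEN="your-minimax-api-key"   # placeholder key
claude   # launch Claude Code as usual; requests now go to the MiniMax backend
```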
Ridiculous; my company has committed to $200k annual plans and they changed the deal midway. We'll have to see about a refund.
tiresome
zozbot234•44m ago
That difference is actually pretty surprising. Is Claude that much more expensive to host? The end-user pricing seems to be pretty similar, or better for OpenAI.
adrian_b•33m ago
The most important advantage of using open-weights models is having perfectly predictable performance and costs in the future. When you can run the model on your own hardware, you are protected from price increases, reduced subscription limits, or quality reductions in the provided models, as has already happened to users of Claude Code.
The disadvantage is that if you also want high speed, you need more expensive hardware. You can defer the cost of buying better hardware by using an external provider for now, while keeping in reserve the option of hosting the models you use yourself if anything makes the external providers worse.
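The self-hosting fallback described here can be tried with llama.cpp against the GGUF quantizations linked at the top of the thread. A sketch, assuming you have llama.cpp built and that a `Q4_K_M` quantization exists in that repo (quantization names vary per upload):

```shell
# Fetch a quantization directly from Hugging Face and serve it locally
# with an OpenAI-compatible API (llama-server's -hf flag handles the download).
llama-server -hf unsloth/MiniMax-M2.7-GGUF:Q4_K_M \
  --port 8080 -c 32768   # context length: adjust to your RAM/VRAM budget
# Then point any OpenAI-compatible client at http://localhost:8080/v1
```

The trade-off the comment describes shows up here directly: a larger quantization or longer context raises the hardware bar, but pricing and availability are entirely under your control.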
aand16•1h ago
After logging into my shiny new Nvidia account, I'm presented with a banner saying "contact support to verify your account at help@build.nvidia.com".
I've contacted Nvidia support and haven't heard back. But they did send me a newsletter...
rvz•44m ago
"free" does not mean what you think it means.
To Downvoters: I hope you have read the NVIDIA API Trial Terms of Service [0] before signing up. It clearly has restrictions and limitations.
From [0]:
> Unless you purchase a Subscription from NVIDIA or a Service Provider (as applicable), you may only use the API Service for internal testing and evaluation purposes, not in production. The terms and conditions of your Subscription will govern your production use of the API Service.
[0] https://assets.ngc.nvidia.com/products/api-catalog/legal/NVI...