Looks like it doesn't get close to GPT-5, Claude 4, or GLM-4.5, but still does reasonably well compared to other open weight models. Benchmarks are rarely the full story though, so time will tell how good it is in practice.
seunosewa•41m ago
The DeepSeek R1 in that list is the old model that's been replaced.
yorwba•16m ago
Yes, and 31.3% is given in the announcement as the performance of the new v3.1, which would put it in sixteenth place.
seunosewa•3m ago
So it's not that great for shell terminals.
coliveira•22m ago
My personal experience is that it produces high quality results.
amrrs•17m ago
Any example or prompt you used to make this statement?
YetAnotherNick•13m ago
Depends on the agent. Ranks 5 and 15 are both Claude 4 Sonnet, and this lands close to 15th.
seunosewa•44m ago
It's a hybrid reasoning model. It's good with tool calls and doesn't overthink everything, but it regularly falls back to outdated tool-call formats instead of the standard JSON format. I guess the V3 training set has a lot of those.
Those Qwen3 2507 models are the local crème de la crème right now. If you've got any sort of GPU and ~32 GB of RAM to play with, the A3B one is great for pair-programming tasks.
pdimitar•17m ago
Do you happen to know if it can be run via an eGPU enclosure with e.g. an RTX 5090 inside, under Linux?
I've been considering buying a Linux workstation lately and I want it to be all-AMD. But if I can just plug in an NVIDIA card via an eGPU enclosure for self-hosting LLMs, that would be amazing.
gunalx•2m ago
You would still need the drivers and everything else that's difficult about NVIDIA on Linux, on top of the eGPU. (It's not necessarily terrible, just suboptimal.) I'd rather just add a second GPU to the workstation, or just run the LLM on your AMD GPU.
hodgehog11•48m ago
https://www.tbench.ai/leaderboard