Qwen3.7-Max: The Agent Frontier

55•kevinsimper•2h ago

Comments

goyozi•1h ago

These are very good numbers. I still don’t get why these posts don’t compare against the latest competitor versions; it’s not like we won’t all notice.

hmokiguess•39m ago

this puzzles me too, I want to know

htrp•22m ago

I think it's part of the expectation setting (with a side of "we did our distillation/eval harness on a specific model").

If they say it's comparable to 4.7, that anchors it in your head as the model to evaluate against.

maelito•18m ago

Marketing.

bratao•40m ago

It is super strange that for the last (3?) releases they keep comparing against older models such as Opus-4.6.

vessenes•31m ago

Some of it’s probably timing. Some of it is wanting to look good. That said, I just went to the claw-eval site, and neither 4.7 nor 5.5 from oAI is listed on the benchmarks. So there’s also just the time it takes others to get benchmarking done and published.

tarruda•29m ago

Looking forward to more open weight releases from Qwen, especially 122B and 397B.

smcleod•25m ago

Yeah, that ~60-150B range is such a sweet spot for current 'prosumer' hardware. I'd love to see something like a 120B-A14B or thereabouts.

gcr•21m ago

What’s the price point for getting into that sweet spot?

I’m on an M1 Max with 32GB VRAM, so I’m looking forward to the 27B or 35B-A3B models. Is dropping $5k for an RTX 6000 or a DGX Spark really the best option?

tarruda•13m ago

> What’s the price point for getting into that sweet spot?

In October 2024 I got my Mac Studio M1 Ultra with 128G; IIRC it was ~$2500. With the recent price explosion, it has certainly gotten more expensive. https://frame.work/ is selling a 128G Strix Halo mainboard for $2700, but you have to add storage and a case.

ttoinou•10m ago

An M5 Max with 64GB (the sweet spot) or 128GB (only 1000 USD more, and better to keep for the future) has the best price-to-quality ratio: future-proof, reliable, resellable, and flexible across workloads. Being harder to use as a server might be the only drawback.

roger_•6m ago

M5 Max 128GB for $1k?

throwaw12•5m ago

What do you recommend for a non-Mac setup? I am a Mac user, but it's getting expensive, and I'm not seeing a reason to jump to the latest M5.

anonym29•9m ago

Strix Halo at $2k with similar TG and about half the PP of DGX Spark was a pretty good deal IMO, especially considering it's also a full x86 system... 16c/32t Zen 5, 40 CU RDNA 3.5, 128 GB unified memory at ~220 GB/s real-world speeds (256 GB/s theoretical) - that runs full tilt at 140W in performance mode and idles at ~10W.

Unfortunately, the prices rose on these a lot, but unevenly. Beelink GTR 9 Pro is $4400, Framework Desktop is ~$3500, for what is basically the exact same mainboard as a Bosgame M5 for $2800.

Apple's M5 Max is another attractive option. Apple silicon traditionally had great MBW and was good at TG, but struggled with PP, but the new neural engines in those GPU cores have made a big difference in a good way here.

Gorgon Halo is rumored for a June announcement and a Q4'26 release, with basically +100 MHz clocks on Strix Halo, LPDDR5X-8533 instead of LPDDR5X-8000, but more importantly, 192 GB max instead of 128 GB.

I'd say it's better to wait for Gorgon Halo than to grab Strix Halo now. However, Medusa Halo, rumored for H2'27, is slated to have 24c/48t Zen 6, 48 CU of RDNA 5 instead of 40 CU RDNA 3.5, and a 384 bit bus w/ LPDDR6, which should make 256 GB at more like ~490-600 GB/s MBW, which will really make Strix and Gorgon Halo obsolete.

Also worth keeping an eye out for Serpent Lake (intel CPU + nvidia iGPU on a single board with unified memory, rumored for 2028-2029 iirc), and on the 160 GB Crescent Island Intel dGPU.
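For readers checking the bandwidth figures quoted above, here is a rough sketch of the usual peak-bandwidth arithmetic (bus width in bytes times transfer rate). The 256-bit bus width for Strix Halo and Gorgon Halo is an assumption not stated in the comment, and the function names are made up for illustration. For Medusa Halo the LPDDR6 speed is not given, so the sketch only back-solves the transfer rate that the rumored ~490-600 GB/s would imply on a 384-bit bus.

```python
def peak_bandwidth_gbs(bus_width_bits: int, transfer_rate_mts: float) -> float:
    """Theoretical peak memory bandwidth in decimal GB/s.

    bus_width_bits / 8 gives bytes moved per transfer; multiplying by the
    transfer rate (mega-transfers per second) gives bytes per second.
    """
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * transfer_rate_mts * 1e6 / 1e9


def implied_rate_mts(bus_width_bits: int, bandwidth_gbs: float) -> float:
    """Transfer rate (MT/s) needed to reach a given bandwidth on a given bus."""
    bytes_per_transfer = bus_width_bits / 8
    return bandwidth_gbs * 1e9 / bytes_per_transfer / 1e6


# Assumption: Strix Halo and Gorgon Halo both use a 256-bit LPDDR5X bus.
strix_halo = peak_bandwidth_gbs(256, 8000)
gorgon_halo = peak_bandwidth_gbs(256, 8533)

# Medusa Halo: 384-bit bus from the comment; memory speed is back-solved.
medusa_low = implied_rate_mts(384, 490)
medusa_high = implied_rate_mts(384, 600)

print(f"Strix Halo  (LPDDR5X-8000): {strix_halo:.1f} GB/s")
print(f"Gorgon Halo (LPDDR5X-8533): {gorgon_halo:.1f} GB/s")
print(f"Medusa Halo implied rate: {medusa_low:.0f}-{medusa_high:.0f} MT/s")
```

Under that bus-width assumption the Strix Halo result matches the 256 GB/s theoretical figure in the comment, and the LPDDR5X-8533 bump works out to a gain of under 7%, which fits the view that the larger memory ceiling is the more important change.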

tarruda•19m ago

I have a 128G mac studio and even 397B was a happy surprise to me due to its high quantization resilience.

I've created a 2.54BPW quant that fits on my hardware with 128k context, 20 tps tg and 200 tps pp, while maintaining high scores on many benchmarks: https://huggingface.co/tarruda/Qwen3.5-397B-A17B-GGUF/discus...
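As a back-of-the-envelope check on why a 2.54 BPW quant of a 397B model can fit in 128G, a minimal sketch of the weight-size arithmetic follows. It counts weights only; the KV cache for the 128k context and runtime overhead are ignored, and the function name is hypothetical.

```python
def quantized_weights_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of quantized weights in decimal GB.

    Total bits = parameters * average bits per weight; divide by 8 for
    bytes and by 1e9 for GB. KV cache and runtime overhead are excluded.
    """
    return n_params * bits_per_weight / 8 / 1e9


# 397B parameters at an average of 2.54 bits per weight (from the comment).
weights_gb = quantized_weights_gb(397e9, 2.54)

# For comparison, the same model at 16 bits per weight.
full_precision_gb = quantized_weights_gb(397e9, 16)

print(f"2.54 BPW weights: {weights_gb:.1f} GB")
print(f"16-bit weights:   {full_precision_gb:.1f} GB")
```

The weights alone come to roughly 126 GB, so the fit is tight and depends on how much memory the context and the OS need on top of that.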

ttoinou•12m ago

better than antirez ds4 ?

tarruda•5m ago

I only tried a very early version of that when it was just a llama.cpp fork and Qwen was certainly better in my tests.

But I was not super impressed with DeepSeek 4 Flash when using it from the official API either, so it doesn't seem to be the quantization's fault. It is a good model, but nothing out of the ordinary in the few benchmarks I ran on it (with full awareness that benchmarks are biased).

mixtureoftakes•12m ago

I'm more excited for qwen3.7 9b and 72b, these are usually so good for their size

bsenftner•24m ago

Any reports from people using their coding agent(s)?
