GLM-5.2 is a step change for open agents

https://www.interconnects.ai/p/glm-52-is-the-step-change-for-open

75•vantareed•1d ago

Comments

Balinares•1d ago

I can't help wondering what kind of models we'll see coming out of China once it gets its own chip fabs up and running. Right now it sounds like the US's export ban is not slowing them down a whole lot.

ceejayoz•1h ago

> Right now it sounds like the US's export ban is not slowing them down a whole lot.

It may wind up being a massive boost to them in the long run, even.

Necessity is the mother of invention.

pkroll•1h ago

If this pans out, you're not at all kidding: https://www.youtube.com/watch?v=8ekndZwyOzo

jerojero•1d ago

Open weight models from Chinese labs tend to be significantly cheaper.

I think they're absolutely needed. I can't afford 200 USD a month for personal use of coding AI, and I don't think such prices are reasonable for most of the world economy anyway. Not to mention US firms might be giving their employees a lot more than that.

It's increasingly feeling, to me, that there's a gap building up between haves and have-nots. But then, we get news of these open weight models that are reasonably priced in inference with reasonable capabilities. Yes, they take maybe 6-9 months to get there, but tbh, that's not a bad trade-off at all.

ttoinou•1d ago

200 is much less than the value you’re supposed to get out of it. If it’s not, then yeah, go ahead and use cheaper models with worse quality.

Dayshine•1d ago

I'm not sure how I'm supposed to get $200 of value out of personal use!

LPisGood•1h ago

Note that 200 dollars of value is different than 200 dollars of profit.

devmor•1h ago

I personally don’t find it that useful for most tasks, but if say, you get paid $50/hr for your work and it saves you more than 4 hours of work in a month, there you go.

holoduke•53m ago

Here most of my colleagues have $200+ hourly rates, so it's really a no-brainer. Sure, in South America or some Asian countries it may be a harder call. But most devs need it anyway, in the poorer regions too.

HDBaseT•22m ago

$200/h is on the extreme end and I would argue most people here aren't anywhere close to that.

The median hourly wage in the US is $28/h, so $200 equates to just over 7 hours of work. That's nearly a full day of work a month for the average person to use Claude with reasonable limits.

Yes, the people on $28/h may not be the software development types, so their income might not be as high, but these are the people who would probably be vibe coding the most since they aren't day to day programmers!
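
For what it's worth, the break-even arithmetic in this subthread can be sketched in a few lines of Python. This is only an illustration: the function name `break_even_hours` is made up here, the $200 plan price and the $28, $50, and $200 hourly rates are just the figures quoted in the comments above, and it assumes every hour saved is worth the full hourly rate.

```python
def break_even_hours(monthly_cost: float, hourly_rate: float) -> float:
    """Hours of work a subscription must save per month to pay for itself.

    Assumption: an hour saved is valued at exactly the hourly rate.
    """
    if hourly_rate <= 0:
        raise ValueError("hourly_rate must be positive")
    return monthly_cost / hourly_rate


# Figures quoted in the thread, not verified data.
PLAN_COST = 200
for rate in (28, 50, 200):
    hours = break_even_hours(PLAN_COST, rate)
    print(f"${rate}/h -> {hours:.1f} h/month to break even")
```

At $28/h that comes to about 7.1 hours a month, at $50/h exactly 4, and at $200/h a single hour.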

themgt•1h ago

I just tested GLM 5.2 out via Z.ai in pi for a little one-off project that was already scoped. It actually did a relatively decent job starting out, and figured important things out from context.

But the reasoning traces became increasingly hilarious, with it getting confused and going in loops, doubting itself. I began to feel almost sad; it was like listening to the internal monologue of someone with an anxiety disorder.

It made pretty good progress but wound up going in a lot of goofy loops and doing things a bit "off" from standards I'd hoped it would infer, and finally started going a bit nuts, "This is very confusing.", "OH WAIT", seemingly hallucinating a whole side-quest that didn't make sense and looking at making internal system changes to try to achieve its (now very confused) goal when I pulled the plug.

Without seeing the reasoning traces from Claude/GPT it's hard to really know, but it definitely didn't feel like the same quality of reasoning, even if dogged persistence does wind up actually working eventually.

jauntywundrkind•58m ago

I think the self-doubt might actually be a very crucial part of its capability. I often feel compelled to interrupt when I'm watching it think (which thank the stars it lets us do, unlike the big American models!!), but usually it makes the right pick!

Being willing and able to reconsider seems very good. Going around and around, pulling in more thinking, integrating it: maybe that's why it is as good as it is.

I want to emphasize again how excellent it is that we can see the thinking. I think this makes GLM so much better an experience for me. It gives me such insight into what is being considered, helps me see where things go wrong. It grounds me, gives me the notion of where the results come from. It was so jarring to switch to GPT and Opus and find that they won't discuss with me, won't reveal their thinking: that feels fundamentally unsafe, for me, for society, to have such a severe black box. I don't think it should be allowed, honestly.

Many thanks to this recent submission, which is the first time I've seen anyone blog about this core difference: The text in Claude Code’s “Extended Thinking” output is not authentic. https://patrickmccanna.net/the-text-in-claude-codes-extended... https://news.ycombinator.com/item?id=48630535

citizenpaul•57m ago

I've been using glm5 since its release and still prefer it to glm5.1 and, so far, to glm5.2.

Perhaps it is just my harness and workflow, but the older model still seems to work better. Also, the token cost is significantly lower. I rarely spend more than $20 a week with a $50 cap. That's not even half of Claude's ambiguous minimum $200-a-month plan.

timcobb•21m ago

Can people share their GLM and open model setups in general, please? What provider do you use? Why do you trust it with serving full quality? What harness do you use? Why do you trust it not to have malware (most harnesses are TS apps)? I am just trying GLM 5.1 from Nvidia build in opencode and would love to hear how you all do it, thanks.

rainmaking•12m ago

GLM 5.2 coding plan. I'll post the agent as soon as I can! But opencode works and their own zcode is really good as well.

aunty_helen•12m ago

I signed up to a z.ai max account, $144. Hardly been able to use it as it 429s on most requests. They’re also refusing to refund me.

wuhhh•27m ago

Your post made me laugh because I experienced the same as you but the other way around. I switched from Claude to a multi model harness a couple of days ago and the first model I tried was GLM5.2.

I gave it some simple code porting exercises and watched dumbfounded at the reasoning, which was more like the ravings of a lunatic - but lo and behold, after much confusion and a dizzying number of eureka moments the task was completed very successfully.

I tried Kimi on a similar task, much faster, a little more reassuring somehow in its ramblings, also surprisingly good results.

To be clear, I’m surprised the results were good not because these models aren't GPT or Claude, but because the line of reasoning was so bonkers. Coming from Claude, I was just not used to seeing this, but I’ll bet it’s just as nuts with the frontier models and we’re just not allowed to see it (I’m about to read the links you shared).

Agree wholeheartedly that transparency is of grave importance.

rainmaking•8m ago

Yeah isn't that thinking weird?

Now I see the issue clearly! But wait... now I have the full picture! But wait... Found it!

I gave up a few times because of it at first until I realized I just had to let GLM get on with it and what came out was great!

But once it was outright endearing: on a challenging bug, it said, "I have been very thorough." Then it escalated where to look and aced it. Built-in Confucian values.
