
Jyllands-Posten Muhammad cartoons controversy (2005)

https://en.wikipedia.org/wiki/Jyllands-Posten_Muhammad_cartoons_controversy
1•simonebrunozzi•2m ago•0 comments

How to Sparkle in Conversation with Strangers

https://www.newscientist.com/article/2530034-how-to-sparkle-in-conversation-with-strangers/
1•Anon84•2m ago•0 comments

Multiplayer Surf_ski_2 in the Browser

https://www.surfski2.com/
1•possiblelion•3m ago•1 comments

Need a Co-Founder

1•gangaplains•6m ago•0 comments

Captured Logs Reveal Hackers Using Claude and Codex to Breach Companies

https://research.openanalysis.net/claude/codex/hacking/ai%20hacking/llm/redteam/policy%20violatio...
1•Tiberium•7m ago•1 comments

The discovery that changed how scientists think about memory – IBM

https://www.ibm.com/think/news/discovery-changed-how-scientists-think-about-memory-kavli-prize
1•rbanffy•7m ago•0 comments

How the UK government is using AI to speed up the planning system

https://takes.jamesomalley.co.uk/p/build-gemini-build
1•writerJames•8m ago•0 comments

Url.computer – client side URL parser and cURL query builder

https://url.computer/
1•interweb_tube•9m ago•1 comments

Kepp – save anything in one tap, no folders (iOS/Android)

https://kepp.io/
2•palpalych•9m ago•0 comments

Smarter Charging, An AI controller treats batteries differently as they age

https://spectrum.ieee.org/ev-charging-strategy
1•oldnetguy•9m ago•0 comments

How to Hack a Superyacht

https://thewalrus.ca/how-to-hack-a-superyacht/
1•pseudolus•11m ago•0 comments

Entrepreneurs in Nairobi make the case for going solar

https://www.technologyreview.com/2026/06/17/1138600/entrepreneurs-nairobi-case-for-going-solar/
1•joozio•13m ago•0 comments

IBM Turns 115 Today

https://www.threads.com/@therab/post/DZqsiiWjI-k
1•rbanffy•13m ago•0 comments

AIs on their own found ways to exploit regulations and evade current safeguards

https://www.science.org/content/article/ai-models-have-troubling-knack-discovering-legal-loopholes
1•pseudolus•13m ago•0 comments

Show HN: Digital inspection reports for any rental property

https://kamerinspectie.nl/en
1•tjardo•15m ago•0 comments

Huall, autonomous AI agents

https://huall.dev
1•Kreshnik•16m ago•1 comments

Scientists Find Intriguing Link Between Ozempic and Violent Behavior

https://gizmodo.com/scientists-find-intriguing-link-between-ozempic-and-violent-behavior-2000772629
1•akyuu•19m ago•0 comments

Structural steel estimating: the steps were never the hard part

https://bidferra.com/blog/the-honest-guide-to-structural-steel-estimating
2•fazlerocks•22m ago•0 comments

Lenovo releases new 14-inch ThinkPad with 64 GB RAM and built-in pen

https://www.notebookcheck.net/Lenovo-releases-new-14-inch-ThinkPad-with-64-GB-RAM-and-built-in-pe...
1•teleforce•24m ago•0 comments

RFC 10008: The HTTP Query Method

https://www.rfc-editor.org/info/rfc10008/
2•schappim•25m ago•0 comments

From Combinatorial Mess to Linear Elegance: Architecting a Conversion Engine

https://blog.minimal.app/conversion-engine/
1•arthurofbabylon•28m ago•0 comments

Could Earth have sent life to Jupiter's moon Europa?

https://phys.org/news/2026-06-earth-life-jupiter-moon-europa.html
1•pseudolus•30m ago•1 comments

Color Picking OKLCH for Mortals

https://hugodaniel.com/posts/color-picking-oklch/
1•hugodan•30m ago•0 comments

Is MCP a sign of the reopening of the internet?

https://bakkenbaeck.com/tech/is-mcp-the-reopening-of-the-internet
1•_n_nym__s•31m ago•1 comments

Zlib-Rs in Firefox

https://trifectatech.org/blog/zlib-rs-in-firefox/
1•mcraiha•32m ago•0 comments

Ask HN: Does your mind drift while waiting for AI prompts to finish?

1•cryptoSympozium•39m ago•9 comments

The MRV engine for carbon removal

https://www.cula.tech/
1•doener•40m ago•0 comments

Against essential and accidental complexity (2020)

https://danluu.com/essential-complexity/
1•pramodbiligiri•40m ago•0 comments

Magnetically Hovering Guitar Strings

https://www.youtube.com/watch?v=ueCO4spGNPs
1•SweetSoftPillow•40m ago•0 comments

Ask HN: How much we change since LLM era?

1•modinfo•43m ago•1 comments

GLM-5.2 is the new leading open weights model on Artificial Analysis

https://artificialanalysis.ai/articles/glm-5-2-is-the-new-leading-open-weights-model-on-the-artificial-analysis-intelligence-index
125•himata4113•2h ago

Comments

Tiberium•47m ago
It seems to really be a nice step-up and is getting quite close to the frontier. I wish they'd start focusing on the reasoning efficiency now, though. I have a simple (relatively) test task to evaluate LLMs: writing a simple math evaluator library in Nim (it's about 400-600 lines total max), and GLM 5.2 (xhigh which maps to max effort) spent over 15 minutes (!) reasoning, spending about 45k tokens, before it finally wrote the first file.

I know it's hard to improve on that, but now that their models are good enough at raw intelligence, I think this should become a higher priority task.

Currently on https://artificialanalysis.ai/#output-tokens GPT 5.5 xhigh spends 16k tokens total on average, GPT 5.5 high is 10k, Fable 5 33k, Opus 4.8 41k, GLM 5.2 is 42k. GPT 5.5 is extremely reasoning efficient.

Of course if you convert those values to actual request cost, GLM 5.2 will probably beat GPT 5.5/Opus 4.8, but speed matters for a lot of people, I think.
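The conversion from token counts to request cost can be sketched in a few lines of shell. This is only a rough illustration: the function names are made up for this example, no real prices are filled in, and it treats the averages quoted above as if every token were billed at a single per-million rate.

```shell
# Cost in USD of one request: tokens times price per million tokens.
# Both arguments are supplied by the caller; no real prices are assumed here.
request_cost() {
  awk -v tokens="$1" -v price_per_million="$2" \
    'BEGIN { printf "%.4f\n", tokens * price_per_million / 1000000 }'
}

# Highest price ratio (verbose model / efficient model) at which the
# verbose model still costs no more per request than the efficient one.
breakeven_price_ratio() {
  awk -v efficient="$1" -v verbose="$2" \
    'BEGIN { printf "%.2f\n", efficient / verbose }'
}

# Token averages quoted above: GPT 5.5 xhigh 16k, GLM 5.2 42k.
breakeven_price_ratio 16000 42000   # prints 0.38
```

Under those assumptions, GLM 5.2 is cheaper per request as long as its per-token price is below roughly 38% of GPT 5.5 xhigh's.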

bertili•39m ago
This is GLM 5.2 Max. GLM 5.2 High uses less than half[1] the tokens.

[1] https://z.ai/blog/glm-5.2

Tiberium•37m ago
Yes, but the Artificial Analysis result is also from GLM 5.2 (max), not high.
andai•19m ago
They have this with a lot of models, measuring only the max setting, while the one you'd actually want to use for most tasks is much lower.
epolanski•1m ago
For the brief period we had Fable, I never had to use it above medium.

Low nailed the overwhelming majority of mundane tasks on its own, medium was good for more complex stuff.

vorticalbox•16m ago
This is a problem I find with Opus: it will spend so long thinking, then going “but wait, what if”

To the point where I stop it and simply tell it to “start writing code, you can work it out as you go along”

Seems writer's block also affects LLMs

epolanski•2m ago
Fable was 20 times worse on that.

It's clear it was the vibe coding model, as it, like no other model before, fully turned you into its assistant instead of the other way around.

Havoc•43m ago
It’s pretty good. More talkative than 5.1. Reminds me of deepseek 4

Their servers are melting though - getting more timeouts etc

unrvl22•40m ago
Why aren't more people talking about this? It's literally Opus 4.7 quality at stupid prices. I know providers who are offering this at unlimited tokens for $50 a month. Some are even offering API rates 3x lower than the official ZAI API rates, which are already like 10x cheaper than Opus. (Crof and Umans btw)

This is a huge blow to Anthropic/OpenAI/Google and a massive win for the rest of the world. The official API prices and speeds mean nothing for open source models.

unrvl22•39m ago
I cancelled my claude sub after realizing I can burn 300m tokens a day of this quality, for $50 a month.
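As a back-of-the-envelope check on those figures, the effective per-token price of a flat-rate plan can be worked out as below. The function name is invented for illustration, and the 30-day month and sustained 300M tokens per day are assumptions taken at face value from the comment, not measured usage.

```shell
# Effective USD per million tokens for a flat-rate plan.
# Arguments: monthly price in USD, tokens used per day, days per month.
flat_rate_price_per_million() {
  awk -v usd="$1" -v tokens_per_day="$2" -v days="$3" \
    'BEGIN { printf "%.4f\n", usd / (tokens_per_day * days / 1000000) }'
}

# $50 a month, 300M tokens a day, assuming a 30-day month.
flat_rate_price_per_million 50 300000000 30   # prints 0.0056
```

That is about half a cent per million tokens, if the plan really sustains that volume every day.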
Hamuko•37m ago
I’m not that interested in models that I can’t run on my desktop for ~0€, which is my AI budget.
igravious•30m ago
Cool beans. You're not the target audience then.
Hamuko•22m ago
Did I claim I was? I just said why I and people like me are not talking about it.
simianwords•7m ago
and he said it's cool
nh43215rgb•40m ago
> GLM-5.2 sits off the most attractive quadrant on the Intelligence vs Output Tokens chart.

That is unfortunate...

CuriouslyC•39m ago
I've been playing with this model a fair amount over the last 24 hours, and I can confirm it's quite capable, while being a little bit verbose (I've seen it reconsider things 3-4 times in thinking traces before deciding on a path forward), and not being quite as good as GPT5.5 at working through complex abstract requirements.

Honestly it's good enough that I feel comfortable recommending a Z.AI sub + a $20/mo OpenAI sub for all but the most AI pilled multi-orchestrators, or the die hard Claude fans. GLM writing + GPT reviewing/debugging feels pretty unlimited and minimally worse than just doing everything in GPT with the $200/mo plan.

igravious•22m ago
After having got a taste of Fable 5, for me Opus 4.8 doesn't cut it any more -- and I don't know how to put this, I don't know if it's just me, but its rhetorical flourishes are starting to really grate on me, never mind that it is at times deliberately weasel-wordy and economical with the truth until pressed. Opus 4.8 is definitely a stronger coding agent than DeepSeek 4.0 or Kimi 2.7, succeeding where they flounder and fail, but its way of expressing itself conversationally is making me reconsider my subscription …
elwebmaster•9m ago
You are not alone. How about GPT 5.5? Does it come close to Fable 5?
fragmede•4m ago
5.5 is pretty good. It's no Fable though. It is definitely better than opus tho.
andai•18m ago
This is my workflow. And then once a day I copy paste the code into the free Claude Sonnet so it comes out actually readable.
kingstnap•31m ago
According to many benchmarks this model is straight up frontier level and Zai seriously cooked. Some of these numbers are incredible.

Excited to see if this turns out to be an Open Weight Opus 4.5 or better.

andai•3m ago
I've seen models at the top of AA do really stupid things. These harder benchmarks match my experience more closely:

https://deepswe.datacurve.ai/

and https://cognition.ai/blog/frontier-code

That being said, I've had a reasonably pleasant time with GLM-5.2 so far. (And have had an OK time with DeepSeek as well.)

By the time I'm done testing all the Chinese models, they'll be obsolete :)

davidwritesbugs•28m ago
I like their models, super cheap - I'm a Lite plan subscriber, and subjective performance seems to be the same as the lower Anthropic models, useful for lots of grunt work. The problem is that Zhipu really __really__ struggles with capacity - everyone is complaining of timeouts or very slow speeds. I can't get direct access to the model, though I see it is on OpenRouter so I may play. But the capacity issues mean DeepSeek is my main provider these days.
mohsen1•24m ago
I don't know if it is the harness or the model is really not at the level those benchmarks are showing, because based on my own "feelings" after using it, I felt it's not Opus 4.5 level. It can't figure things out in my project (https://tsz.dev), or maybe tsz is at a stage where things are getting too difficult even for frontier models to be productive. I had the most productive time during the weekend Fable was available, and since then it's been pretty slow to make progress.
tensegrist•20m ago
> On the Intelligence vs. Cost per Task Pareto Frontier: GLM-5.2 is on the Pareto frontier of the Intelligence vs Cost per Task chart, with the lowest cost per task among models at its intelligence level. GLM-5.2 costs ~$0.46 per task, compared to GLM-5.1 ($0.25), Kimi K2.6 ($0.31), MiniMax-M3 ($0.18) and DeepSeek V4 Pro (max, $0.05)

am i missing something?

xiaoyu2006•10m ago
Some models are heavily subsidized. Total params & active params are a better measurement of inference cost.
simianwords•9m ago
No models are subsidised -- there are lots of third party hosting services that will still run at breakeven/profit. (except Deepseek after discount)
rahidz•19m ago
Correct me if I'm wrong, but neither DeepSeek nor GLM have image input modality. This makes them less useful when looking at UIs, photos, screenshots, etc. doesn't it? Or do they have alternate ways of doing so?
creamyhorror•13m ago
It's a real step forward, getting closer to SOTA. It seems to be very epistemically cautious in its reasoning. I hope Deepseek and the other open-weights labs stay in the game and catch up too.
xiaoyu2006•12m ago
This open source model is quite near SOTA with only 700B/40B MoE. Truly efficient.
lousken•12m ago
Cerebras really needs to have this on their API list (if they even still exist).
Marciplan•7m ago
they went public a few weeks ago
ramon156•8m ago
I've made a comment before that 5.1 will sometimes get stuck looping over a simple decision or statement. It will basically contradict itself and then not realize that one option is the definite option. Sometimes it's two statements that aren't even exclusive. Either way, a lot of tokens get wasted on this.

I haven't extensively used 5.2 yet, but it seems a lot better.

_pdp_•5m ago
I am helpful.

DeepSeek V4 has been quite amazing in our workloads and it operates at a fraction of the cost. I have not tried GLM 5.2 but it seems that it hits a sweet spot.

andai•7m ago
Electricity cost seems to be about $30/month for a 32B model on a GPU. It's probably better on Apple hardware.

https://github.com/QuantiusBenignus/Zshelf/discussions/2

Not accounting for hardware, of course :)

CuriouslyC•35m ago
Be careful about unofficial providers, a lot of them misconfigure models or stealth quantize them. For a while the difference between Kimi on the official API and most third party providers was 20-40%.
unrvl22•33m ago
the 2 I mentioned both have a fairly large following, who run benchmarks and absolutely will spot issues.
cedws•18m ago
OpenRouter should be penalising or banning for this.
embedding-shape•31m ago
> Why aren't more people talking about this?

Wasn't this released like 2 days ago? Everyone is still evaluating and playing around with it, and things like the submission are just starting to come out. Give it some days at least before jumping to conclusions, ideally weeks.

Schiendelman•28m ago
To answer the question in your first sentence - because it's VERY computationally (ha) expensive as a human being to keep up with all the options. It's also very hard to figure out how to run a model like this. There's no installer. If you really really care, which 99% of people do not, you have to google a guide, and then find out it's out of date...

I've tried a number of these, and the learning curve is very steep compared to "install Claude Code and pay $100/mo". There is no way saving me $50/month matters compared to figuring that out.

andai•15m ago
But it just works with Claude Code? They have a guide on their website.

https://docs.z.ai/devpack/tool/claude

Here's my setup. I add this to my .bashrc

export ZAI_API_KEY="your_key_here"

alias claudez='ANTHROPIC_AUTH_TOKEN="$ZAI_API_KEY" ANTHROPIC_BASE_URL="https://api.z.ai/api/anthropic" ANTHROPIC_DEFAULT_OPUS_MODEL="glm-5.2[1m]" ANTHROPIC_DEFAULT_SONNET_MODEL="glm-4.7" ANTHROPIC_DEFAULT_HAIKU_MODEL="glm-4.7" claude'

Then I just run claudez
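A variant of the same setup as a shell function, as a minimal sketch: it uses the same environment variables and values as the alias above, but refuses to start when the key is missing and passes extra arguments through. `CLAUDEZ_BIN` is a hypothetical override invented for this example (handy for dry runs); it is not something Claude Code or Z.AI defines.

```shell
# Function form of the claudez alias: same variables, plus a guard
# against an unset key and pass-through of command-line arguments.
claudez() {
  if [ -z "${ZAI_API_KEY:-}" ]; then
    echo "claudez: ZAI_API_KEY is not set" >&2
    return 1
  fi
  # The assignments apply only to this one command invocation.
  # CLAUDEZ_BIN is a hypothetical override; it defaults to the claude CLI.
  ANTHROPIC_AUTH_TOKEN="$ZAI_API_KEY" \
  ANTHROPIC_BASE_URL="https://api.z.ai/api/anthropic" \
  ANTHROPIC_DEFAULT_OPUS_MODEL="glm-5.2[1m]" \
  ANTHROPIC_DEFAULT_SONNET_MODEL="glm-4.7" \
  ANTHROPIC_DEFAULT_HAIKU_MODEL="glm-4.7" \
  "${CLAUDEZ_BIN:-claude}" "$@"
}
```

Running `CLAUDEZ_BIN=env claudez` prints the environment the CLI would see, which is a quick way to check the values without starting a session.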

pro tip the same thing works with deepseek https://api-docs.deepseek.com/guides/anthropic_api

Even more pro tip: Claude Code can set this up for you haha

Schiendelman•9m ago
Sure, I'm not saying I, a software engineer, cannot do this. I'm saying it's significant onboarding friction.

Unless this were a massive differentiator, people aren't going to be "talking about it" the way GP suggests!

cedws•18m ago
In my org everyone is extremely Claude-pilled to the point you’d think it’s the only LLM that exists, purely because it caters to non-engineers within enterprises.
stanac•18m ago
> Some are even offering API rates at 3x lower than the official ZAI api rates

Looking at openrouter [1], some of the cheaper offerings are for quantized models. Not sure how much intelligence is lost in quantization. And they are not 3 times cheaper. Where did you find 3x lower prices for APIs? I am considering skipping open router and using them directly for that price.

edit:

I see, croft [2] 8bit for $0.50/$0.08/$2.20

[1]: https://openrouter.ai/z-ai/glm-5.2

[2]: https://ai.nahcrof.com/pricing

anuramat•15m ago
> unlimited tokens for $50 a month

link?

> Why

imho everything but opus produces unusable code (fable was even better...), eg gpt5.5 seems to write the absolute worst code that still technically solves the problem; tbh I'd be totally willing to trade "raw intelligence" for "code taste"

more labs need to figure out whatever anthropic did to destroy everybody else on frontiercode bench