GPT-5.5 Codex reasoning-token clustering may be leading to degraded performance

https://github.com/openai/codex/issues/30364
76•maille•1h ago

Comments

maille•1h ago
tldr:

The GPT-5.5 Codex model exhibits a clustering phenomenon in which reasoning_output_tokens counts pile up at fixed values spaced 518 apart.

These stuck responses at fixed thresholds are strongly correlated with errors in complex tasks.

The observed phenomenon is specific to GPT-5.5; it is much less prevalent in GPT-5.4 and almost absent in GPT-5.2 and 5.3.
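
For anyone who wants to check their own logs for this pattern, here is a minimal sketch. It assumes you have already collected the reasoning_output_tokens value from each response into a plain list. The function name `dominant_residue` and the sample numbers are hypothetical and made up for illustration; they are not measurements from the linked issue. The idea: if counts sit at fixed values spaced 518 apart, then `count % 518` should land on one residue far more often than chance would suggest.

```python
from collections import Counter


def dominant_residue(token_counts, spacing=518):
    """Return (residue, share): the most common value of count % spacing
    and the fraction of samples that land on it."""
    if not token_counts:
        raise ValueError("token_counts must not be empty")
    residues = Counter(count % spacing for count in token_counts)
    residue, hits = residues.most_common(1)[0]
    return residue, hits / len(token_counts)


# Synthetic counts, invented purely for illustration (not real data):
# six values sit on a grid of 100 + k * 518, two do not.
sample = [100, 618, 1136, 618, 100, 1654, 733, 1290]
residue, share = dominant_residue(sample)
print(f"residue={residue} share={share:.2f}")
```

With evenly spread counts the share should hover around 1/518, so a share well above that on a reasonably large sample would be consistent with the clustering described above. It would not by itself say anything about the cause.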

ProofHouse•56m ago
Personally, I would say very likely, to be honest. I gotta go through this a little more, but I actually use 5.5 codex an obscene amount, and I almost never use it for reasoning anymore. It's not even in the same galaxy as far as actually taking out the thinking and using GPT-5.5 or even Claude and then coming back and giving it the reasoning. Blah blah blah, it's the same model. Well, let me tell you, no, it's not, for several reasons, and the delta on intelligence is pretty staggering.
m101•54m ago
What?
benjiro29•53m ago
Care to explain what you mean by that?
dimitrios1•29m ago
I know that these types of comments are not really popular here, but this struck a chord with me because I feel the same. They aren't remotely close.

I have codex right now purely because they gave me a month free of ChatGPT Pro, so I have been using it in between my usage resets with claude. Since it's "free money" for me I have been using it exclusively on xHigh.

One of my most frequent prompts is "hey codex worked on ____, but it didn't quite hit the mark, can we review the work..."

Yes, part of this is normal even within the same model -- you have the highest power model review the work for correctness, refactoring opportunities, and so on. But man, I tell you, I don't know what it is about codex. This is obviously one guy's anecdote -- same prompting style, same repository documentation à la MD files, same skills, way different results.

All that to say, maybe the bug report is on to something here, and it can be fixed.

kleton•30m ago
Clearly they are batching reasoning inference in a few multiples of 512 tokens as a throughput optimization
zenapollo•21m ago
I’ve definitely experienced step jumps down in quality on an almost daily basis. I usually used xhigh. The experience of relying on codex’s outstandingly thorough coding earlier in the year has evaporated for me. I’m seeing incredibly stupid implementations intermittently, and have simply switched to Claude until OpenAI takes the issue seriously. As far as I could tell they haven’t taken it seriously for the several months I’ve been personally seeing it.
siva7•16m ago
I switched to Codex 3 months ago because Claude got incredibly stupid. 6 months ago it was vice versa. It doesn't matter if you use Codex or Claude. Both will fuck with you at some point. Though Codex probably less.
siva7•8m ago
I swear some days ago someone here claimed OpenAI succeeded in cutting their compute cost in half with a breakthrough optimization. So this is it?

The Quest to Make Humanoid Robots Safe Enough for Humans

https://www.wsj.com/tech/the-quest-to-make-humanoid-robots-safe-enough-for-humans-4887c123
1•bookofjoe•7m ago•1 comments

Recovering garbled Bitcoin addresses (2024)

https://purplesyringa.moe/blog/recovering-garbled-bitcoin-addresses/
1•zX41ZdbW•8m ago•0 comments

AI Authentication and Authorization

https://fusionauth.io/articles/ai/ai-authentication-authorization
1•mooreds•9m ago•0 comments

The Ugly Phase

https://trishagee.com/2026/07/04/the-ugly-phase/
1•mooreds•10m ago•0 comments

Mantissa, a distributed workload orchestration system

https://mantissa.io/blog/init/
1•artursapek•12m ago•0 comments

Scientists discover a surprising link between Vitamin C and brain health

https://www.sciencedaily.com/releases/2026/06/260626030428.htm
1•OutOfHere•18m ago•0 comments

Ask HN: When will the stock market crash?

2•roschdal•24m ago•2 comments

DeFi yield comparison, read before depositing

https://hduynam99.substack.com/p/how-to-find-the-best-defi-yield
1•hoangthuytrang•28m ago•0 comments

Birdsong data from Merlin ID app to help global biodiversity project

https://www.theguardian.com/environment/2026/jul/04/merlin-app-birdsong-identification-ebird-biod...
2•andsoitis•29m ago•0 comments

Microsoft Copilot OS revealed in LEAKED video: built on Copilot and agentic AI

https://www.windowscentral.com/microsoft/windows-11/microsoft-copilot-os-revealed-in-leaked-video...
1•type0•29m ago•0 comments

Tell HN: Megalodon.jp is faster than archive.today and doesn't require reCAPTCHA

1•Cider9986•30m ago•0 comments

Out-of-core LLM inference engine written from scratch in Rust

https://github.com/Vage91/Kortex
1•Vage91•32m ago•0 comments

Simulation Game: Can you Terraform Mars?

https://www.nature.com/immersive/d41586-026-01978-8/index.html
1•cybermango•34m ago•0 comments

Tool that stops iCloud from eating your Mac's SSD

https://github.com/rexbrahh/icloud-guard/tree/main
2•rexbrahh•34m ago•1 comments

I just tired of killing AI slop

https://www.surgeos.app/
1•yernururu•37m ago•1 comments

RFC: Stopping runaway AI agent spend with atomic budget reservations

https://github.com/iamapsrajput/agent-budget-protocol/blob/main/RFC.md
1•iamapsrajput•38m ago•0 comments

Freedom from NPM. Happy 4th

https://www.npmjs.com/package/donobu
2•vasusen•39m ago•1 comments

Check my temp mail with 18 custom domain

https://mytempmail.pro
1•Asdfghjkkzxcv•41m ago•0 comments

Cells, boundaries, and the emergence of biological order

https://www.embl.org/news/science-technology/cells-boundaries-and-the-emergence-of-biological-order/
1•hhs•42m ago•0 comments

Mystery of India's red-haired child unlocks hidden colour genes

https://www.nature.com/articles/d44151-026-00124-7
2•cybermango•43m ago•0 comments

Peekdiff – review GitHub PRs without the diff touching my server

https://peekdiff.codebyram.dev
1•sriram-52•43m ago•0 comments

Jellyfish can heal wounds in minutes. Scientists want their secrets

https://www.mbl.edu/news/jellyfish-can-heal-wounds-minutes-scientists-want-their-secrets
2•hhs•43m ago•0 comments

FlashAttention-4: Algorithm and Kernel Pipelining Co-Design

https://research.colfax-intl.com/flashattention-4-algorithm-and-kernel-pipelining-co-design-for-a...
1•skidrow•45m ago•0 comments

New bacterial species discovered in NASA's cleanrooms

https://www.nature.com/articles/d44151-025-00219-7
4•cybermango•45m ago•0 comments

Toward Better Hip Kernel Generation for AMD GPUs

https://scalingintelligence.stanford.edu/blogs/hipkernels/
1•skidrow•46m ago•0 comments

Researchers affirm long-held belief that viruses can trigger Parkinson's disease

https://stories.tamu.edu/news/2026/06/29/researchers-affirm-long-held-belief-that-viruses-can-tri...
2•hhs•47m ago•0 comments

China Is Devastating the Last Stronghold of German Industry

https://www.wsj.com/economy/china-is-devastating-the-last-stronghold-of-german-industry-c7a98514
6•impish9208•50m ago•3 comments

Four Corners – a spin on Connections-like games

https://fourcorners.smol.quest/
1•ens0•51m ago•1 comments

Show HN: WifeBench – My wife vibes LLM rankings

https://www.wifebench.com/
1•fristovic•53m ago•0 comments

Dermatology is wrong about the sun

https://twitter.com/MattZirwas/status/2050586857868591306
1•bilsbie•55m ago•0 comments