GPT-5.5 Codex reasoning-token clustering may be leading to degraded performance

https://github.com/openai/codex/issues/30364
76•maille•1h ago

Comments

maille•1h ago
tldr:

The GPT-5.5 Codex model exhibits a clustering phenomenon in which reasoning_output_tokens counts cluster at fixed values spaced 518 apart.

Responses stuck at these fixed thresholds are strongly correlated with errors in complex tasks.

The phenomenon is largely specific to GPT-5.5; it is much less prevalent in GPT-5.4 and almost absent in GPT-5.2 and 5.3.
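
For anyone who wants to check their own logs, here is a minimal sketch of one way to look for this kind of clustering. It assumes you have already collected reasoning_output_tokens counts from your own responses. The function name and the sample numbers are made up for illustration and are not taken from the issue. The idea: if counts pile up at fixed values spaced 518 apart, they all share the same remainder modulo 518, so one remainder will dominate.

```python
from collections import Counter


def residue_clustering(token_counts, spacing=518):
    """Return (residue, share) for the most common value of count % spacing.

    If counts were spread evenly, each residue would hold roughly
    1 / spacing of the samples (about 0.2% for spacing=518), so a large
    share on a single residue suggests clustering at fixed values.
    """
    if not token_counts:
        return None, 0.0
    residues = Counter(n % spacing for n in token_counts)
    residue, hits = residues.most_common(1)[0]
    return residue, hits / len(token_counts)


# Hypothetical sample: four counts sit on the same 518-spaced grid,
# two do not. These are NOT real measurements.
sample = [516, 1034, 1552, 2070, 300, 777]
residue, share = residue_clustering(sample)
print(f"most common residue mod 518: {residue}, share of samples: {share:.2f}")
```

This only tells you whether the counts line up on a grid. It says nothing about why, or whether the clustered responses are the ones with errors; for that you would need to compare error rates between on-grid and off-grid responses.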

ProofHouse•57m ago
Personally, I would say very likely, to be honest. I gotta go through this a little more, but I actually use 5.5 codex an obscene amount, and I almost never use it for reasoning anymore. It's not even in the same galaxy as far as actually taking out the thinking and using GPT-5.5 or even Claude and then coming back and giving it the reasoning. Blah blah blah, it's the same model. Well, let me tell you, no, it's not, for several reasons, and the delta on intelligence is pretty staggering.
m101•54m ago
What?
benjiro29•53m ago
Care to explain what you mean by that?
dimitrios1•29m ago
I know that these types of comments are not really popular here, but this struck a chord with me because I feel the same. They aren't remotely close.

I have Codex right now purely because they gave me a month free of ChatGPT Pro, so I have been using it in between my usage resets with Claude. Since it's "free money" for me I have been using it exclusively on xHigh.

One of my most frequent prompts is "hey codex worked on ____, but it didn't quite hit the mark, can we review the work..."

Yes, part of this is normal even within the same model: you have the highest-power model review the work for correctness, refactoring opportunities, and so on. But man, I tell you, I don't know what it is about Codex. This is obviously one guy's anecdote, but with the same prompting style, the same repository documentation à la MD files, and the same skills, I get way different results.

All that to say, maybe the bug report is on to something here, and it can be fixed.

kleton•30m ago
Clearly they are batching reasoning inference in a few multiples of 512 tokens as a throughput optimization
zenapollo•21m ago
I’ve definitely experienced step drops in quality on an almost daily basis. I usually used xhigh. The experience of relying on Codex’s outstandingly thorough coding earlier in the year has evaporated for me. I’m seeing incredibly stupid implementations intermittently, and have simply switched to Claude until OpenAI takes the issue seriously. As far as I can tell, they haven’t taken it seriously for the several months I’ve been personally seeing it.
siva7•17m ago
I switched to Codex 3 months ago because Claude got incredibly stupid. 6 months ago it was the other way around. It doesn't matter if you use Codex or Claude. Both will fuck with you at some point. Though Codex probably less.
siva7•9m ago
I swear some days ago someone here claimed OpenAI succeeded in cutting their compute cost in half with a breakthrough optimization. So this is it?

Command and Conquer Generals natively ported to macOS, iPhone, iPad using Fable

https://github.com/ammaarreshi/Generals-Mac-iOS-iPad/tree/main
270•asronline•3h ago•116 comments

Leaking YouTube creators' private videos

https://javoriuski.com/post/youtube
434•javxfps•6h ago•230 comments

Google Books (or similar) all book scans – $200k bounty (2025)

https://software.annas-archive.gl/AnnaArchivist/annas-archive/-/work_items/234
278•Cider9986•6h ago•147 comments

Better Models: Worse Tools

https://lucumr.pocoo.org/2026/7/4/better-models-worse-tools/
60•leemoore•3h ago•14 comments

Potential session/cache leakage between workspace instances or consumer accounts

https://github.com/anthropics/claude-code/issues/74066
262•chatmasta•9h ago•120 comments

Verizon is About to Break our Watches

https://www.jefftk.com/p/verizon-is-about-to-break-our-watches
116•jefftk•5h ago•52 comments

Explanation of everything you can see in htop/top on Linux (2019)

https://peteris.rocks/blog/htop/
360•theanonymousone•11h ago•50 comments

Zig: All Package Management Functionality Moved from Compiler to Build System

https://ziglang.org/devlog/2026/#2026-06-30
108•tosh•6h ago•22 comments

Drone Physics

https://iahmed.me/post/drone-physics/
64•wrxd•4d ago•19 comments

Windows CE Dreamcast Community Edition (wince-dc)

https://github.com/maximqaxd/wince-dc
82•msephton•8h ago•17 comments

Protocol Prying: Vulnerability Research in AirDrop and Quick Share

https://arxiv.org/abs/2606.26967
7•logickkk1•2h ago•0 comments

Curveball

https://mightyburger.net/projects/curveball/
43•toilet•7h ago•11 comments

Astrophysicists Puzzle over Webb’s New Universe

https://www.quantamagazine.org/astrophysicists-puzzle-over-webbs-new-universe-20260702/
182•jnord•14h ago•116 comments

It's not me, it's the compiler

https://parsa.wtf/cast/
37•SVI•3d ago•8 comments

The Vespa at 80

https://www.cbc.ca/news/world/vespa-italy-postwar-design-9.7252641
135•cf100clunk•3d ago•127 comments

Fable created novel 4D splat format

https://adamraudonis.github.io/splats4D/
80•adamraudonis•7h ago•20 comments

Can you build a recognizable World Map in under 500 bytes?

https://www.experimentlog.com/blog/building-a-world-map-with-only-500-bytes
6•iweczek•3d ago•11 comments

Neural Render Proxies for Interactive and Differentiable Lighting

https://studios.disneyresearch.com/2026/07/01/neural-render-proxies-for-interactive-and-different...
44•tobr•3d ago•5 comments

EndBASIC 0.14: Are we multimedia yet?

https://www.endbasic.dev/2026/07/endbasic-0.14.html
22•jmmv•6h ago•2 comments

Designing DB partitions you don't have to babysit

https://explainanalyze.com/p/designing-partitioning-you-dont-have-to-babysit/
52•rtolkachev•3d ago•7 comments

Postgres data stored in Parquet on S3: LTAP architecture explained

https://www.databricks.com/blog/lakebase-ltap-rethinking-database-storage
159•andrenotgiant•3d ago•51 comments

Breaking the Bird Barrier: Scientist Decodes Zebra Finch Language

https://www.freepressjournal.in/education/breaking-the-bird-barrier-scientist-decodes-zebra-finch...
80•yyyk•4d ago•24 comments

BareMetal RAM Dumper – Bare-metal x86 tool for Cold Boot Attack experiments

https://github.com/pIat0n/BareMetal-RAM-Dumper
45•liffik•5h ago•31 comments

Finland's last analogue landline phones go silent after 150 years

https://www.euronews.com/next/2026/06/30/finlands-last-analogue-landline-phones-go-silent-after-1...
83•ohjeez•6h ago•22 comments

The .join() that should be a bug

https://kronotop.com/blog/the-join-that-should-be-a-bug/
14•mastabadtomm•4d ago•2 comments

The bottleneck might be the air in the room

https://blog.mikebowler.ca/2026/07/03/co2-and-decision-making/
739•gslin•16h ago•420 comments

Mir Books – Books from the Soviet Era

https://mirtitles.org
165•clmul•4d ago•78 comments

Game Boy Advance Dev: Logging to the Console

https://www.mattgreer.dev/blog/gba-dev-logging/
20•jandeboevrie•6h ago•2 comments

Plein Air

https://art.joonas.wtf/
52•bookofjoe•6h ago•7 comments