I have codex right now purely because they gave me a month free of ChatGPT Pro, so I have been using it in between my usage resets with claude. Since it's "free money" for me I have been using it exclusively on xHigh.
One of my most frequent prompts is "hey codex worked on ____, but it didn't quite hit the mark, can we review the work..."
Yes, part of this is normal even within the same model -- you have the highest power model review the work for correctness, refactoring opportunities, and so on, but man I tell you, I don't know what it is about codex, this is obviously one guy's anecdote -- same prompting style, same repository documentation ala MD files, same skills, way different results.
All that to say, maybe the bug report is on to something here, and it can be fixed.
maille•1h ago
GPT-5.5 Codex model exhibits a clustering phenomenon in which reasoning_output_tokens cluster at fixed values spaced 518 apart.
These stuck responses at fixed thresholds are strongly correlated with errors in complex tasks.
Observed phenomenon is specific to GPT-5.5; it is much less prevalent in GPT-5.4 and almost absent in GPT-5.2 and 5.3