It is slower, but the results are much more often correct and it doesn't rush into half-baked solutions/dumb approaches as eagerly.
I'd much rather wait 5 minutes than have to clean up manually or try to coax a model into doing things differently.
I also wouldn't be surprised if the slowness were partly due to OpenAI being quite resource-constrained. They have repeatedly complained about not having sufficient compute.
Bigger picture: I think all the AI coding environments are incredibly immature. There are many improvements to be unlocked.
Rather, the real reason Codex takes longer is that it does more work: it reads more context.
IMO the results are much better with codex, not even close
As for which one is better, it's highly dependent on what each of us is doing. But I will say that I have one project where a bare "make" won't work and a script needs to be run instead. I have instructions to call that script in multiple .md files, and Codex is able to call the script instead of make, but it keeps forgetting that and tries to run make, which fails and leaves it confused. (Claude Code runs on the macOS host, but the build happens in a Linux VM.) I could work around it, but that really takes the "shiny" factor off Codex + GPT-5 for me.
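For reference, the instruction in those .md files looks roughly like this (the path and exact wording are made up for illustration, not my actual setup):

    ## Building
    Do NOT run "make" on the host; it only works inside the Linux VM.
    Always build through the wrapper script:
    ./scripts/vm-build.sh

It couldn't be spelled out more plainly, and it still drifts back to bare make after a few turns.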
- Codex-medium is better if you have a well-articulated plan you "merely" need to execute on, need help finding a bug, have some specific complex piece of logic to tweak, or truly need a ton of long-range context to reason about an issue. It's great, and the usage limits are very generous!
- Sonnet 4.5 is better for everything else. For me that means: non-coding CLI ops, git ops, writing code with it as a pair programmer, OOD tasks, big new chunks of functionality that are highly conceptual, architectural discussion, etc. I generally approve every edit and often interrupt it. Fast iteration and feedback are key.
I probably use CC 80% of the time with Codex the other 20%. My company pays for CC and I don't even look at the cost. Most of my coworkers use CC over Codex. We do find the Codex PR reviewer to be the best of any tool out there.
Codex also gets a lot of play on Twitter because many of the most prolific voices there are solo devs who are "building in public". A greenfield, solo project is the ideal (only?) use case for running 5 agents in parallel or whatever. Codex is probably amazing at that. But it's not practical for building in enterprise contexts IMO.
I would argue this is the wrong way to use these tools. Writing out a defined plan in plain English and then having Codex/Claude write it out is better, since that way we understand the intention. You can always have Codex come up with an abstract plan first, iterate on it, and then implement. Kind of like how we would build software in real life.
There seems to be a constant stream of not terribly interesting or unique "my Claude Code/Codex success story" blog posts that manage to solicit so many upvotes.
If you haven’t been vocal about your support of products in general, you wouldn’t show up on the radar for these “opportunities.”
What does this mean? What do you mean by unique shit? What do you mean when you say you're trying to draft on the sentiment? What is "them" referring to?
Genuinely. I’m not being (deliberately) obtuse, just trying to follow. Thanks
This is obviously pure conjecture, but perhaps the OE folks had automated their multiple roles and now need to be more involved.
As I got better around June/July, I finally found the energy to try it out. It was working incredibly well at the time. It was so much fun (for me) that I basically kept playing with it every day after finishing work. So for roughly 1.5 months, basically every free minute of every day, along with side explorations during work hours when I could get away with it.
Then I had to take another business trip in mid-August. When I finally came back in September, it was unrecognizable, and from my perspective it definitely hasn't recovered to how ultrathink+opus performed back then.
You can definitely still use it, but you need to take a massively more hands-on approach.
At least my opinion isn't swayed by their reduced quota ... But to stay in line with the sentiment analysis this article is about: I haven't tried Codex to this point either. I will, eventually.
I've been coding for 30 years.
Using Codex, I'm finally enjoying it again for the first time in maybe 15 years. Outsourcing all the annoying parts? Heck yeah - bring it on.
And I tell everyone I can how transformational it has been for me.
I hate working with Codex. It feels like a machine. You tell it to do something, and it just does it. No pretense of being human, or enthusiastic, or anything really.
But Codex almost always does it right. And the comments are right: I never run into random usage limits. Codex doesn't arbitrarily decide to shrink the context window or start compacting again after 3 messages.
The Codex client sucks; the Claude Code client is much better. But the Codex client is consistent, which is much more important. Claude was amazing 3 months ago. The model is still fine, but the quality of the experience has degraded so far that it's hard to consider using it.
I take issue with the AI industry in general and its hand-wavy approach to risk, but OpenAI really is on another level in my book. While I don't trust the industry's approach to AI development, with OpenAI I don't trust the leadership's intentions.
Me too, so much so that I doubt this is legitimate. This blog post is the only place I've seen people 'raving' about codex.
Claude Code is the current standard all others are measured against.
The biggest thing I use agents for is getting good search with less context.
Codex just struggles when the model needs to search too much because of this. Codex also struggles with too much context: there have been a number of times when it ran up against the context limit and couldn't compact, so you lose everything since your last message. That has cost me a lot of context and work.