I wonder if a shared Claude Code instance has the same effect?
Doesn't CC sometimes take twenty, thirty minutes to return an attempt? I wouldn't know, because I'm not rich and my employer has decided CC is too expensive, but I wonder what you would do with your pair programming partner while you wait.
The bosses would like to think we'd start working on something else, maybe start up a different Claude instance, but can you really change contexts and back before the first one is done? You AND your partner?
Nah, just go play air hockey until your boss realizes Claude is what they need, not you.
This is a depressing comment.
I am apprehensive about the future of software development in this milieu. I've pumped out a ~15,000 line application heavily utilizing Claude Code over a few days that seems to work, but I don't know how much to trust it.
Certainly part of the fun of building something was missing during that project, but it was still fun to see something new come to life.
Maybe I should say I am cautiously optimistic but also concerned: I don't feel confident in the best ways to use these tools to build good software, and I'm not sure exactly what skills are useful in order to get them there.
Can I ask what you built?
https://gs.statcounter.com/os-market-share/mobile/worldwide
So maybe they'd rather please the home market, I guess.
[1]: https://www.techtarget.com/searchcio/definition/Instagram
Android is simply a much worse platform to make money on. Users spend <25% as much as iOS users. Why would they prioritize that?
Globally in dollars spent, not human heads. iOS is over 2x larger than Android globally, and the gap is widening year over year.
iOS spending growth outpaces Android's, which even shrank during covid while iOS spending continued to grow.
https://api.backlinko.com/app/uploads/2024/03/iphone-vs-andr...
Anthropic makes money off product sales, not ad revenue, so wallets count more than eyes here. Free users who are less than 25% as likely to spend are a burden, not a priority, for a product business with a free tier. They need to spend much more to get a paying user on Android.
If Android were the bigger market, they'd prioritize it
https://www.reddit.com/r/applesucks/comments/1k6m2fi/why_do_...
- Claude Code has been added to iOS
- Claude Code on the Web allows for seamless switching to Claude Code CLI
- They have open sourced an OS-native sandboxing system which limits file system and network access _without_ needing containers
However, I find the emphasis on limiting outbound network access somewhat puzzling, because the allowlists invariably include domains like gist.github.com and dozens of others which effectively act as public CMSes and would still permit exfiltration with just a bit of extra effort.
Using a proxy through the `HTTP_PROXY` or `HTTPS_PROXY` environment variables has its own issues: it relies on the application respecting those variables, and if it doesn't, the connection simply fails. Sure, in this case that failure is somewhat protective, since all other network connections are dropped, but it also means an application that ignores those variables just won't work.
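As a minimal sketch of that dependence on the client (the proxy address is a placeholder): `requests` picks up the proxy from the environment, while a raw socket ignores it entirely and would just get dropped in this kind of sandbox.

```python
import os
import socket

import requests

os.environ["HTTPS_PROXY"] = "http://127.0.0.1:8080"  # placeholder proxy address

# requests honors HTTPS_PROXY by default and tunnels through the proxy.
resp = requests.get("https://example.com", timeout=10)
print(resp.status_code)

# A raw socket never consults HTTPS_PROXY; it tries a direct connection,
# which in a sandbox that drops all other outbound traffic simply fails.
try:
    with socket.create_connection(("example.com", 443), timeout=10) as sock:
        print(sock.getpeername())
except OSError as exc:
    print("direct connection blocked:", exc)
```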
You can also have some fun with `DYLD_INSERT_LIBRARIES`, but that often requires creating shims to make it work with codesigned binaries
Just to be clear, I'm excited for the capability to use Claude Code entirely within the browser. However, I've heard reports of Max users experiencing throttled usage limits in recent months, and am concerned as to whether this will exacerbate that issue or not.
EDIT: I had meant defer, which is the first time I've made a /r/boneappletea in a while
I'm using Claude Code locally a lot, occasionally with a couple of parallel sessions.
I was very happy when they made the GitHub Action - I used it quite a bit, but in practice I got frustrated that I effectively only get a single back-and-forth out of it; I can't really "continue the conversation without losing context". Sure, I can respond to it in the PR it makes, but that will be a fresh session with a fresh, empty context.
So, as much as I don't like moving out of my standard development workflow with my tools, I think this could be quite useful. The ability to interrupt and/or continue a conversation should be very nice.
My main worry is - usually my unit tests and integration tests rely on a Postgres database running on the machine, and it's not obvious to me whether I can spin that up here.
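For what it's worth, if the environment lets you run arbitrary processes and has the Postgres binaries available (both assumptions on my part), a throwaway instance per test session is doable without a system service - something along these lines:

```python
import shutil
import subprocess
import tempfile

import pytest


@pytest.fixture(scope="session")
def pg_dsn():
    """Spin up a disposable Postgres on a unix socket for the test session."""
    datadir = tempfile.mkdtemp(prefix="pgtest-data-")
    sockdir = tempfile.mkdtemp(prefix="pgtest-sock-")
    subprocess.run(["initdb", "-D", datadir, "-U", "test"], check=True)
    # No TCP listener; only a unix socket in a temp directory.
    subprocess.run(
        ["pg_ctl", "-D", datadir, "-w", "-o",
         f"-c listen_addresses= -c unix_socket_directories={sockdir}", "start"],
        check=True,
    )
    yield f"postgresql://test@/postgres?host={sockdir}"
    subprocess.run(["pg_ctl", "-D", datadir, "-m", "fast", "stop"], check=True)
    shutil.rmtree(datadir, ignore_errors=True)
    shutil.rmtree(sockdir, ignore_errors=True)
```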
With Happy, I managed to turn one of these Claude Code instances into a replacement for Claude that has all the MCP goodness I could ever want and more.
It's really solid. It's effectively a web (and native mobile) UI over Claude Code CLI, more specifically "claude --dangerously-skip-permissions".
Anthropic have recognized that Claude Code where you don't have to approve every step is massively more productive and interesting than the default, so it's worth investing a lot of resources in sandboxing.
Here's an example from this morning, getting CUDA working on a NVIDIA Spark: https://simonwillison.net/2025/Oct/20/deepseek-ocr-claude-co...
I have a few more in https://github.com/simonw/research
So increasingly I let it run, give it a proper review when it stops, and then let it run until it stops again. It wastes far less of my time, and finishes new code much faster. At least for the things I've made it do.
Their "restricted network access" setting looks questionable to me - it allow-lists a LOT of stuff: https://docs.claude.com/en/docs/claude-code/claude-code-on-t...
If you configure your own allow-list you can restrict to just domains that you trust - which is enforced by a separate HTTP/HTTPS proxy, described here: https://docs.claude.com/en/docs/claude-code/claude-code-on-t...
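To be clear about the mechanism (this is just a sketch of the general approach, not Anthropic's actual implementation, and the domains are placeholders): a filtering proxy simply refuses any request whose host isn't on the allow-list, e.g. as a mitmproxy addon:

```python
from mitmproxy import http

ALLOWED_DOMAINS = {"api.anthropic.com", "github.com", "registry.npmjs.org"}


def _allowed(host: str) -> bool:
    # Match the domain itself or any subdomain of it.
    return any(host == d or host.endswith("." + d) for d in ALLOWED_DOMAINS)


def request(flow: http.HTTPFlow) -> None:
    # Called for every proxied request (run with: mitmdump -s allowlist.py).
    if not _allowed(flow.request.pretty_host):
        flow.response = http.Response.make(
            403, b"domain not on allow-list", {"Content-Type": "text/plain"}
        )
```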
Pretty cool though! Will need to use it for some more isolated work/code edits. Claude Code is now my workhorse for a ton of stuff including non-coding work (esp. with the right MCPs)
I don't have any relationship with any AI company, and honestly I was rooting for Anthropic, but Codex CLI is just way way better.
Also Codex CLI is cheaper than Claude Code.
I think Anthropic are going to have to somehow leapfrog OpenAI to regain the position they were in around June of this year. But right now they're being handed their hat.
Kinda sick of Codex asking for approval to run tests for each test instance
Also while not quite as smart, it's a better pair programmer. If I'm feeling out a new feature and am not sure how exactly it should work yet, I prefer to work with Sonnet 4.5 on it. It typically gives me more practical and realistic suggestions for my codebase. I've noticed that GPT-5 can jump right into very sophisticated solutions that, while correct, are probably not appropriate.
Sonnet 4.5: "Why don't we just poll at an interval with exponential backoff?"
GPT-5: "The correct solution is to include the data in the event stream...let us begin by refactoring the event system to support this..."
That said, if I do want to refactor the event system, I definitely want to use Codex for that.
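For context, the Sonnet-style suggestion is the boring-but-practical one; a minimal sketch of it (the fetch function is a hypothetical stand-in):

```python
import time


def poll_until_done(fetch_status, initial_delay=1.0, max_delay=60.0, timeout=600.0):
    """Poll fetch_status() until it returns a truthy result, backing off exponentially."""
    delay = initial_delay
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        result = fetch_status()
        if result:
            return result
        time.sleep(min(delay, max_delay))
        delay *= 2  # exponential backoff, capped by max_delay above
    raise TimeoutError("gave up waiting for a result")
```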
Sonnet 4 was a coding companion, I could see what it was doing and it did what I asked.
Sonnet 4.5 is like Opus, it generates massive amounts of "helper scripts" and "bootstrap scripts" and all kinds of useless markdown documentation files even for the tiniest PoC scripts.
It's as if there are two vendors saying they can give you incredible superpowers for an affordable price, and only one of them actually delivers the full package. The other vendor's powers only work on Tuesdays, and when you're lucky. With that situation, in an environment as competitive as things currently stand, and given the trajectory we're on, Claude is an absolute non-starter for me. Without question.
Codex says “This is a lot of work, let me plan really well.”
Claude says “This is a lot of work, let me step back and do something completely different that you didn’t ask for.”
Edit: I'd like to reply to this comment in particular but can't in a threaded reply, so will do that here: "Ah, super secret problem domains that have been thoroughly represented in the LLM training data. Nice."
This exhibits a fundamental misunderstanding of why coding agents powered by LLMs are such a game changer.
The assumption this poster is making is that LLMs are regurgitating whole cloth after being trained on whole cloth.
This is a common mistake among lay people and non-practitioners. The reality is that LLMs have gained the ability to program, by learning from the code of others. Much like a human would learn from the code of others, and then be able to create a completely novel application.
The difference between a human programmer and an agentic coder is that the agent has much broader and deeper expertise across more programming languages, and understands more design patterns, more operating systems, more about programming history, etc etc and it uses all this knowledge to fulfill the task you've set it to. That's not possible for any single human.
It's important for the poster to take two realities on board: Firstly, agentic coding agents are not regurgitating whole cloth from whole cloth. Instead they are weaving new creations because they have learned how to program. Secondly, agentic coding agents have broader and deeper knowledge than any human that will ever exist, and they never tire, and their mood and energy level never changes. In fact that improves on a continuous basis as the months go by and progress continues. This means we can, as individual practitioners or fast moving teams, create things that were never before possible for us without raising huge amounts of money and hiring large very expensive teams, and then having the overhead of lining everyone up behind a goal AND dealing with the human issues that arise, including communication overhead.
This is a very exciting time. Especially if you're curious, energetic, and are willing to suspend disbelief to go and take a look.
The easier threading-focused approach to the conversation might be to add the additional comment as an edit at the end of the original and reply to the child https://news.ycombinator.com/item?id=45649068 directly. Of course, I've broken the ability to do that by responding to you now about it ;).
Well, that, and it’s just a bit annoying to claim that you’ve found some amazing new secret but that you refuse to share what the secret is. It doesn’t contribute to an interesting discussion whatsoever.
Claude, especially 4.5 Sonnet, is a lot nicer to interact with, so it may be a better choice in cases where you are co-working with the agent. Its output is nicer, it "improvises" really well even if you give it only vague prompts. That's valuable for interactive use.
But for delegating complete tasks, Codex is far better. The benchmarks indicate that, as do most practitioners I talk to (and it is indeed my own experience).
In my own work, I use Codex for complete end-to-end tasks, and Claude Sonnet for interactive sessions. They're actually quite different.
The output of Codex is also not as great. Codex is great at the planning and investigation portion but sucks at execution and code quality.
Sora 4.5 tends to randomly hallucinate odd/inappropriate decisions and goes to make stupid changes that have to be patched up manually.
Also CC tool usage is so much better! Many, many times I’ve seen Codex writing a python script to edit a file which seems to bypass the diff view so you don’t really know what’s going on.
100% of the time Codex has done a far better job according to both Codex and Claude Code when reviewing. Meeting all the requirements where Claude would leave things out, do them lazily or badly and lose track overall.
Codex high just feels much smarter and more capable than Claude currently and even though it's quite a bit slower, it's work that I don't have to go over again and again to get it to the standards I want.
Now, I will concede that for non-coding long-horizon tasks, GPT-5 is marginally worse than Sonnet 4.5 in my own scaffolds. But GPT-5 is cheaper, and Sonnet 4.5 is about 2 months newer. However, for coding in a CLI context, GPT-5-Codex is night-and-day better. I don't know how they did it.
Initially, I found Codex CLI with GPT-5 to be a substitute for Claude Code - now GPT-5 Codex materially surpasses it in my line of work, with a huge asterisk. I work in a niche industry, and Codex has generally poor domain understanding of many of the critical attributes and concepts. Claude happens to have better background knowledge for my tasks, so I've found that Sonnet 4.5 with Claude Code generally does a better job at scaffolding any given new feature. Then, I call in Codex to implement actual functionality since Codex does not have the "You're absolutely right" and mocked/placeholder implementation issues of CC, and just generally writes clean, maintainable, well-planned code. It's the first time I've ever really felt the whole "it's as good as a senior engineer" hype - I think, in most cases, GPT5-Codex finally is as good as a senior engineer for my specific use case.
I think Codex is a generally better product with better pricing, typically 40-50% cheaper for about the same level of daily usage for me compared to CC. I agree that it will take a genuinely novel and material advancement to dethrone Codex now. I think the next frontier for coding agents is speed. I would use CC over Codex if it was 2x or 3x as fast, even at the same quality level. Otherwise, Codex will remain my workhorse.
They put all of their eggs in the coding basket, with the rest of their mission couched as "effective altruism," or "safetyism," or "solving alignment," (all terms they are more loudly attempting to distance themselves from[0], because it's venture kryptonite).
Meanwhile, all OpenAI had to do was point their training cannon at it for a run, and suddenly Anthropic is irrelevant. OpenAI's focus as a consumer company (and growing as a tool company) is a safe, venture-backable bet.
Frontier AI doesn't feel like a zero-sum game, but for now, if you're betting on AI at all, you can really only bet on OpenAI, like Tesla being a proxy for the entire EV industry.
[0] https://forum.effectivealtruism.org/posts/53Gc35vDLK2u5nBxP/...
Claude Code is just a lot more comfortable being in your workflow and being a companion or going full 'agent(s)' and running for 30 minutes on one ticket. It's also a lot happier playing with Agents from other APIs.
There's nothing wrong with Anthropic wanting to completely own that segment and not have aspirations of world domination like OpenAI. I don't see how that's a negative.
If anything, the more ChatGPT becomes an 'everything app' the less likely I am to hold on to my $20 account after cancelling the $200 account.
- Good bash command permission system
- Rollbacks coupled with conversation and code
- Easy switching between approval modes (Claude has a keybind that makes this easy)
- Ability to send messages while it’s working (Codex just queues them up for after it’s done, Claude injects them into the current task)
- Codex is very frustrating when I have to keep allowing it to run the same commands over and over; in Claude this works well once I approve it to run a command for the session
- Agents (these are very useful for controlling context)
- A real plan mode (crucial)
- Skills (these are basically just lazy loaded context and are amazing)
- The sandboxing in Codex is so confusing; commands fail all the time because they try to log to some system directory or use internet access, which is blocked by default and hard to figure out
- Codex prefers python snippets to bash commands which is very hard to permission and audit
When Codex gets to feature parity, I’ll seriously look at switching, but until then it’s just a really good model wrapped in an okay harness
The only edge Claude has is context window, which we do sometimes hit, but I’m sure that gap will close.
Would love any suggestions if anyone has a similar story.
I don't want to leak data either way by using some "let's throw SSO from a sketchy adtech company into the trust loop" approach.
I don't want to wait a minute for Anthropic's login-by-email link, and have the process slam the brakes on my workflow and train of thought.
I don't want to wait a minute for OpenAI's MFA-by-email code (even though I disabled that in the account settings, it still did it).
I don't want to deal with desktop clients I don't trust, or that might not keep up with feature improvements. Nor have to kludge up a clumsy virtualization sandbox for an untrusted client, just to ask an LLM questions that could just be in a Web browser.
I wish the standard were for companies to check new passwords against leaked password lists, e.g. what https://haveibeenpwned.com uses.
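For reference, that check can stay privacy-preserving: the Pwned Passwords range API only ever sees the first five hex characters of the SHA-1 hash. A minimal sketch:

```python
import hashlib

import requests


def pwned_count(password: str) -> int:
    """Return how many times the password appears in known breaches (0 = not found)."""
    sha1 = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    prefix, suffix = sha1[:5], sha1[5:]
    # Only the 5-character prefix leaves the machine (k-anonymity range query).
    resp = requests.get(f"https://api.pwnedpasswords.com/range/{prefix}", timeout=10)
    resp.raise_for_status()
    for line in resp.text.splitlines():
        candidate, _, count = line.partition(":")
        if candidate == suffix:
            return int(count)
    return 0


if __name__ == "__main__":
    print(pwned_count("password123"))  # nonzero count: reject at signup
```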
I use a similar workflow and have found that websites that allow passkey-based login can avoid the friction of waiting for TOTP codes or magic links.
The current claude.ai signin mechanism is rather annoying.
I have set up a little workflow where, given Linear tags, it sets up a worktree on my dev box, installs deps, and starts the implementation so I can take it over. I prefer this workflow to the fully managed cloud-based solutions.
This kind of fits in for issues where I’m basically sure I won’t have to take it over (and it can do it fully on its own). Which aren’t that many.
Very simple example: there was a warning popup on something where I thought there shouldn't be one; now it's fixed fully automatically from my phone in 5 mins. I quite like that these small changes become so easy.
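Roughly, the tag-triggered setup looks something like this (the repo path, branch naming, deps command, and issue-key plumbing are all hypothetical, and the Linear webhook handling is omitted):

```python
import subprocess
from pathlib import Path

REPO = Path("~/src/myapp").expanduser()  # hypothetical repo location


def start_work(issue_key: str, base_branch: str = "main") -> None:
    """Create a worktree for the issue, install deps, and kick off Claude Code."""
    branch = f"agent/{issue_key.lower()}"
    worktree = REPO.parent / f"myapp-{issue_key.lower()}"
    subprocess.run(
        ["git", "-C", str(REPO), "worktree", "add", "-b", branch,
         str(worktree), base_branch],
        check=True,
    )
    subprocess.run(["yarn", "install"], cwd=worktree, check=True)
    # Hand the issue to Claude Code non-interactively; take it over in the worktree later.
    subprocess.run(
        ["claude", "-p", f"Implement {issue_key} as described in the issue."],
        cwd=worktree,
        check=True,
    )
```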
I got my environment working well with Codex's Cloud Task. Trying the same repo with Claude Code Web (which started off with Claude Code CLI, mind you), and the yarn install just hangs with no debuggable output.
I'll try this, but the grounding seems crucial for these LLMs to deliver results that are fewer shot than otherwise.
AI coding should be tightly in the inner dev loop! PRs are a bad way to review and iterate on code. They are a last line of defense, not the primary way to develop.
Give me an isolated environment that is one click hooked up to Cursor/VSCode Remote SSH. It should be the default. I can't think of a single time that Claude or any other AI tool nailed the request on the first try (other than trivial things). I always need to touch it up or at least navigate around and validate it in my IDE.
Also it isn't always about editing. It is about seeing the surrounding code, navigating around, and ensuring the AI did the right thing in all of the right places.
[1] https://ona.com/
I want to run a prompt that operates in an isolated environment that is open in my IDE where I can iterate with the AI. I think maybe it can do this?
idk, we’ve (humans) gotten this far with them. I don’t think they are the right tool for AI generated code and coding agents though, and that these circles are being forced to fit into those squares. imho it’s time for an AI-native git or something.
I like the idea of background agents running in the cloud, but it has to be a more persistent environment. It also has to run with a GUI so it can develop web applications or run the programs we are developing, and run them properly with the GUI, clicking around, typing things etc. Computer use is what we need. But that would probably be too expensive to serve to the masses with the current models
They want to turn everything into a bootstrap framework, which is probably the limit of their mental horizon. And many people maintain that the emperor is fully clothed and that the scam works.
Seeing comments like this all over the place. I switched to CC from Cursor in June / July because I saw the same types of comments. I switched from VSCode + Copilot about 8 months before that for the same reason. I remember being skeptical that this sort of thing was guerilla marketing, but CC was in fact better than Cursor. Guess I'll try Codex, and I guess that it's good that there are multiple competing products making big strides.
Never would have imagined myself ditching IDEs and workflows 3x in a few months. A little exhausting
I use CC and Codex somewhat interchangeably, but I have to agree with the comments. Codex is a complete monster, and there really isn’t any competition right now.
Wouldn't work for my case since I need a lot of HDD space, GPUs etc. to run the thing I'm working on, but it would be great if I could run a Claude Code server in my server, expose the port and then connect via web or iOS interface.
Sure I can use tmux/ssh but it's very impractical, especially on mobile.
It can run in a front-end only mode (I'll put up a hosted version soon), and then you need to specify your OpenCode API server and it'll connect to it. Alternatively, it can spin up the API server itself and proxy it, and then you just need to expose (securely) the server to the internet.
The UI is responsive and my main idea was that I can easily continue directing the AI from my phone, but it's also of course possible to just spin up new sessions. So often I have an idea while I'm away from my keyboard, and being able to just say "create an X" and let it do its thing while I'm on the go is quite exciting.
It doesn't spin up a special sandbox environment or anything like that, but you're really free to run it inside whatever sandboxing solution you want. And unlike Claude Code, you're of course free to choose whatever model you want.
It’s Sonnet 4.5 + GPT-5 working together.
Codex just isn’t as good as people make it out to be. OpenAI seems to train on a lot of JavaScript/Tailwind to make visuals look more impressive, but when it comes to actual backend work it just fails more than it succeeds. Sonnet is much better at chewing through tasks and GPT-5 is great at consulting, planning and analysis.
Using Amp and asking it to check everything with the oracle leads to superior results.
But no one on HN has heard of it. I’m guessing HN hates twitter?