* New native VS Code extension
* Fresh coat of paint throughout the whole app
* /rewind a conversation to undo code changes
* /usage command to see plan limits
* Tab to toggle thinking (sticky across sessions)
* Ctrl-R to search history
* Unshipped claude config command
* Hooks: Reduced PostToolUse "tool_use ids were found without tool_result blocks" errors
* SDK: The Claude Code SDK is now the Claude Agent SDK
* Add subagents dynamically with --agents flag
[1] https://github.com/anthropics/claude-code/blob/main/CHANGELO...
cl --version 1.0.44 (Claude Code)
as expected … liar! ;)
cl update
Wasn't that hard, sorry for bothering
[1] https://github.com/marckrenn/cc-mvp-prompts/compare/v1.0.128...
[2] https://x.com/CCpromptChanges/status/1972709093874757976
The bot is based on Mario Zechner's excellent work[1] - so all credit goes to him!
Why do you think these aren't legit?
I wrote about one tool for doing that here: https://simonwillison.net/2025/Jun/2/claude-trace/
Interesting. This was in the old 1.x prompt, removed for 2.0. But CC would pretty much always add comments in 1.x, something I would never request, and would often have to tell it to stop doing (and it would still do it sometimes even after being told to stop).
- like all documentation, they are prone to code rot (going out of date)
- ideally code should be obvious; if you need a comment to explain it, perhaps it's not as simple as it could be, or perhaps we're doing something hacky that we shouldn't
An example of this: assume you live in a world where the formula for the circumference of a circle has not been derived. You end up deriving the formula yourself and write a function which returns 2 * pi * radius. This is as simple as it gets, not hacky at all, and you would /definitely/ want to include a comment explaining how you arrived at your weird and arbitrary-looking "3.1415" constant.
I've considered just leaving the comments in, considering maybe they provide some value to future LLMs working in the codebase, but the extra human overhead in dealing with them doesn't seem worth it.
It's cognitively taxing, but it benefits juniors and developers new to the codebase just as much as senior developers, by reducing the mental overhead for the reader.
It's always good to spend an extra minute thinking how to avoid a comment.
Of course there are exceptions, but the mental exercise trying to avoid having that exception is always worth it.
Comments are instant technical debt.
Junior developers especially will be extremely confused and slowed down by having to read both the comment and then the code, which was refactored in the meantime and now does the opposite of what the comment says.
I think a happy medium of "comment brevity, and try thinking of a clearer way to do something instead of documenting the potentially unnecessary complexity with a comment" would be good.
I don't know where this "comments are instant technical debt" meme came from, because it's frankly fucking stupid, especially in the age of being able to ask the LLM "please find any out-of-date comments in this code and update them". Even the AI-averse would probably not object to it commenting code more correctly than the human did.
Docstring comments are even worse, because it's so easy for someone to update the function and not the docstring, and it's very easy to miss in PR review
Good and up to date comments are good and up to date. Bad and outdated comments are bad and outdated. If you let your codebase rot then it rots. If you don't then it doesn't. It's not the comment's fault you didn't update it. It's yours.
Guard rails should be there to prevent inexperienced developers (or overworked, tired ones) from committing bad code.
"Try to think how to refactor functions into smaller ones and give them meaningful names so that everyone knows immediately what's going on" is a good enough guard rail.
That's exactly what I wrote, phrased slightly differently.
We both agree at the core.
I'm wondering if tsdoc/jsdoc tags like @link would help even more for context
So far Claude Code's comments on my code were completely useless. They just repeated what you could figure out from the names of the called functions anyway.
Edit: an obvious exception is public libraries to document public interfaces, and use something like JavaDoc, or docstrings, etc.
I assume it comes from the myriad tutorial content on medium or something.
gpt-oss is the most egregious emoji user: it uses emoji for numbers in section headings in code, which was clearly a stylistic choice finetuned into the model and it fights you on removing them.
I’ve noticed Claude likes to add them to log messages and prints and with 4.5 seems to have ramped up their use in chat.
what in the world?
Here's how it works in detail: https://mariozechner.at/posts/2025-08-03-cchistory/
Here's how it works: https://mariozechner.at/posts/2025-08-03-cchistory/
I should probably include that in my CLAUDE.md instead, I guess?
I hope this is the case.
That said, having a single option that rewinds LLM context and code state is better than having to do both separately.
Your tools should work for you, and git is no exception. Commit early and commit often. Before you (or an LLM) go on a jaunt through the code, changing whatever, commit the wip to git as you go along. That way, if something goes awry, it's an easy git reset HEAD^ to go backwards just a little bit and undo your changes.
Later on, when it's time to share your work, git rebase -i main (or wherever your branching off point was). This will bring up your editor with a list of commits. Reorder them to make more sense, and then also merge commits together by changing the first word on the line to be "fixup". exit your editor and git will rewrite history for outside consumption. Then you can push and ask someone else to review your commits, which hopefully is now a series of readable smaller commits and not one giant commit that does everything, because those suck to review.
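A rough sketch of that loop in commands (branch and commit names here are just placeholders):

    git checkout -b wip-feature             # work off main on a scratch branch
    # ...let the LLM hack away, committing as you go...
    git add -A && git commit -m "wip: first pass"
    # something went sideways? uncommit the last commit (keeps the changes in the working tree):
    git reset HEAD^
    # ...or add --hard to discard the changes entirely
    # when it's time to share, tidy the history for review:
    git rebase -i main                      # reorder lines, change "pick" to "fixup" to merge commits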
- you DO want your prompts and state synced (going back to a point in the prompt <=> going back to a point in the code).
Git is a non-starter then. At least the repo's own git is.
Plus, you probably don’t want the agent to run mutating git commands, just in case it decides to hallucinate a push --force
> Our new checkpoint system automatically saves your code state before each change, and you can instantly rewind to previous versions by tapping Esc twice or using the /rewind command.
https://www.anthropic.com/news/enabling-claude-code-to-work-...
Lots of us were doing something like this already with a combination of WIP git commits and rewinding context. This feature just links the two together and eliminates the manual git stuff.
> Checkpoints apply to Claude’s edits and not user edits or bash commands, and we recommend using them in combination with version control
Hey Claude... uh... unlaunch those
https://news.ycombinator.com/item?id=45426787
Avoids having to do any jj command at all!
https://news.ycombinator.com/item?id=45426787
Avoids even having to do "jj new"!
some pretty neat jj tricks I just learned about!
Though I will see how this pans out.
that's generally my workflow, and I have the results saved into a CLAUDE-X-plan.md. Then I review the plan and incrementally change it if the initial plan isn't right.
To be honest, Claude is not great about moving cards when it's done with a task, but this workflow is very helpful for getting it back on track if I need to exit a session for any reason.
### Development Process
All work must be done via TODO.md. If the file is empty, then we need to write our next todo list.
When TODO.md is populated:
1. Read the entire TODO.md file first
2. Work through tasks in the exact order listed
3. Reference specific TODO.md sections when reporting progress
4. Mark progress by checking off todos in the file
5. Never abbreviate, summarize, or reinterpret TODO.md tasks
A TODO file is done when every box has been checked off due to completion of the associated task.
WTF. Terrible decision if true. I don't see that in the changelog though
They just changed it so you can't set it to use Opus in planning mode... it uses Sonnet 4.5 for both.
Which makes sense if it really is a stronger and cheaper model.
If you have run your own benchmarks or have convincing anecdotes to the contrary, that would be an interesting contribution to the discussion.
If I hit shift-Tab twice I can still get to plan mode
I use Opus to write the planning docs for 30 min, then use Sonnet to execute them for another 30 min.
This isn't true, you just need to use the usual shortcut twice: shift+tab
I do like the model selection with opencode though
- supports every LLM provider under the sun, including Anthropic
- has built-in LSP support https://opencode.ai/docs/lsp
This is pretty funny, given that Cursor shipped their own CLI.
https://www.reddit.com/r/ClaudeAI/comments/1mlhx2j/comment/n...
Pardon my ignorance, but what does this mean? It's a terminal app that has always expanded to the full terminal, no? I've not noticed any difference in how it renders in the terminal.
What am I misunderstanding in your comment?
I just downgraded to v1 to confirm this.
Wonder what changed that I'm not seeing? Do you think it's a regression or intentional?
pretty sure your old behavior was the broken one tho - I vaguely remember fiddling with this to "fullscreen correctly" for a claude-in-docker-in-cygwin-via-MSYS2 setup a while ago
Sonnet 4.5 is beating Opus 4.1 on many benchmarks. Feels like it's a change they made not to 'remove options', but because it's currently universally better to just let Sonnet rip.
I've always been curious. Are tags like that one: "<system-reminder>" useful at all? Is the LLM training altered to give a special meaning to specific tags when they are found?
Can a user just write those magic tags (if they knew what they are) and alter the behavior of the LLM in a similar manner?
You can just make them up, and ask it to respond with specific tags, too.
Like “Please respond with the name in <name>…</name> tags and the <surname>.”
It’s one of the approaches to forcing structured responses, or making it role-play multiple actors in one response (having each role in its tags), or asking it to do a round of self-critique in <critique> tags before the final response, etc.
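A toy example of that (the tags are made up; nothing about the words themselves is special to the model):

    Prompt:   Reply only with <name>...</name> and <surname>...</surname> tags.
              Text: "The bot is based on Mario Zechner's excellent work."
    Response: <name>Mario</name><surname>Zechner</surname>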
Okay, I know I shouldn't anthropomorphize, but I couldn't prevent myself from thinking that this was a bit of a harsh way of saying things :(
- Circuit breakers when it seems like it's stuck in a loop
- Warnings about running low on context
- Reminders about task lists (or anything)
- All sorts of warnings about whatever

I haven’t fully tested it yet, but I found it because it supports JetBrains IDE integration. It has MCPs as well.
I wish it was maintained by a larger team though. It has a single maintainer and they seem to be backlogged or working on other stuff. If there was an aider fork that ran forward with capabilities I'd happily switch.
That said, I haven't tried Claude Code firsthand, only saw friends using it. I'm not comfortable letting agents loose on my production codebase.
Why?
'This project aims to be compatible with upstream Aider, but with priority commits merged in and with some opportunistic bug fixes and optimizations'
The thing is, a lot of software jobs boil down to work that's not difficult, just time consuming.
There’s a great sweet spot though around stuff like “make me this CRUD endpoint and a migration and a model with these fields and an admin dashboard”.
I spend most of my time making version files with the prompt, but pretty impressed by how far I've gotten on an idea that would have never seen the light of day....
The thoughts of having to write input validation, database persistence, and all the other boring things I've had to write a dozen times in the past....
Claude Code, Codex CLI etc can effectively do anything that a human could do by typing commands into a computer.
They're incredibly dangerous to use if you don't know how to isolate them in a safe container but wow the stuff you can do with them is fascinating.
Also, another important factor (as in everything) is to do things in many small steps, instead of giving one big complicated prompt.
Also, I think shellagent sounds cooler.
I expect the portion of Claude Code users who have a dedicated user setup like this is pretty tiny!
Not the exact setup, but also pretty solid.
Instead I run it in bubblewrap sandbox: https://blog.gpkb.org/posts/ai-agent-sandbox/
As long as the supply chain is safe and the data it accesses doesn't trigger some kind of jailbreak.
It does read instructions from files on the file system; I'm pretty sure it's not complex to poison its prompt and make it suggest building a program infected with malicious intent. It's just one copy-pasta away from a prompt suggestion found on the internet.
True but all it will take is one report of something bad/dangerous actually happening and everyone will suddenly get extremely paranoid and start using correct security practices. Most of the "evidence" of AI misalignment seems more like bad prompt design or misunderstanding of how to use tools correctly.
The gap between coding agents in your terminal and computer agents that work on your entire operating system is just too narrow and will be crossed over quick.
I would like a friendlier interface than the terminal, though. It looks like the “Imagine with Claude” experiment they announced today is a step in that direction. I’m sure many other companies are working on similar products.
Note that there need to be open source libraries and tooling. It can’t do a Dolby Atmos master, for example. So you still need a DAW.
They still don't have good integration with the web browser, if you are debugging frontend you need to carry screenshots manually, it cannot inspect the DOM, run snippets of code in the console, etc.
I've seen Codex CLI install Playwright Python when I asked it to do this and it found it wasn't yet available in the environment.
It's pretty new, but so far it's been a lifesaver.
[0]: https://ricardoanderegg.com/posts/control-shell-permissions-...
- something general-purpose (not specific to LLMs; I myself don't use agents--just duck.ai when I want to ask an LLM a question)
- something focused on sandboxing (bells and whistles like git and nix integration sound like things I'd want to use orthogonal tools for)
I have no way of really guaranteeing that it will do exactly what it proposed and nothing more, but so far I haven't seen it deviate from a command I approved.
I've used it to troubleshoot some issues on my linux install, but it's also why the folder sandbox gives me zero confidence that it can't still brick my machine. It will happily run system wide commands like package managers, install and uninstall services, it even deleted my whole .config folder for pulseaudio.
Of course I let it do all these things, briefly inspecting each command, but hopefully everyone is aware that there is no real sandbox if you are running claude code in your terminal. It only blocks some of the tool usages it has, but as soon as it's using bash it can do whatever it wants.
Maybe something like bubblewrap could help
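A minimal, untested bwrap sketch along those lines - only the current project directory is writable, and you'd still need to bind whatever node and claude's own config require:

    bwrap \
      --ro-bind /usr /usr --ro-bind /etc /etc \
      --symlink usr/bin /bin --symlink usr/lib /lib \
      --proc /proc --dev /dev --tmpfs /tmp \
      --bind "$PWD" "$PWD" --chdir "$PWD" \
      --unshare-all --share-net \
      --die-with-parent \
      claude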
One criticism of the current generation of AI is that they have no real-world experience. Well, they have an enormous amount of digital-world experience. That, actually, has more economic value.
Clearly not. Just put an LLM into some basic scaffolding and you get an agent. And as capabilities of those AI agents grow, so would the degree of autonomy people tend to give them.
That is still very much the case; the danger comes from what you do from the text that is generated.
Put a developer in a meeting room with no computer access, no internet, etc., and let him scream instructions through the window. If he screams "delete prod DB", what do you do? If you end up having to restore a backup, that's on you, but the dude inherently didn't do anything remotely dangerous.
The problem is that the scaffolding people put around LLMs is very weak, the equivalent of saying "just do everything the dude is telling you, no questions asked, no double checks in between, no logging, no backups". There's a reason our industry has development policies, four-eyes principles, ISO/SOC standards. There already are ways to massively improve the safety of code agents; just put Claude Code in a BSD jail and you already have a much safer environment than what 99% of people are doing, and it's not that tedious to set up. Other safer execution environments (command whitelisting, argument judging, ...) will be developed soon enough.
But are all humans in jails? No, the practical reason being that it limits their usefulness. Humans like it better when other humans are useful.
The same holds for AI agents. The ship has sailed: no one is going to put every single AI agent in jail.
The "inherent safety" of LLMs comes only from their limited capabilities. They aren't good enough yet to fail in truly exciting ways.
LLMs are in jail: an LLM outputting {"type": "function", "function": {"name": "execute_bash", "parameters": {"command": "sudo rm -rf /"}}} isn't unsafe. The unsafe part is the scaffolding around the LLM that will fuck up your entire filesystem. And my whole point is that there are ways to make that scaffolding safe. There is a reason why we have permissions on a filesystem, why we have read-only databases, etc.
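To make that concrete, here's a minimal sketch of the idea (my own illustration, not anything Claude Code ships): the scaffolding only executes a tool call after checking it against an allowlist.

    # hypothetical dispatcher the scaffolding calls instead of piping the
    # model's command straight into a shell
    run_tool() {
      cmd="$1"
      case "$cmd" in
        "git status"|"git diff"*|"ls "*|"cat "*)   # read-only allowlist
          bash -c "$cmd" ;;
        *)
          echo "blocked: '$cmd' needs human approval" >&2
          return 1 ;;
      esac
    }

    run_tool "sudo rm -rf /"   # -> blocked, never reaches a shell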
For scaffolding to be "safe", you basically need that scaffolding to know exactly what the LLM is being used for, and outsmart it at every turn if it misbehaves. That's impractical-to-impossible. There are tasks that need access for legitimate reasons - like human tasks that need hammer access - and the same access can always be used for illegitimate reasons.
It's like trying to engineer a hammer that can't be used to bludgeon someone to death. Good fucking luck.
I’ve been using Claude code since launch, must have used it for 1000 hours or more by now, and it’s never done anything I didn’t want it to do.
Why would I run it in a sandbox? It writes code for me and occasionally runs a build and tests.
I’m not sure why you’re so fixated on the “danger”, when you use these things all the time you end up realizing that the safety aspect is really nowhere near as bad as the “AI doomers” seem to make out.
Just yesterday my Cursor agent made some changes to a live Kubernetes cluster even over my specific instruction not to. I gave it kubectl to analyze and find the issues with a large Prometheus + Alertmanager configuration, then switched windows to work on something else.
When I got back, the MF was patching live resources to try and diagnose the issue.
In my own career, when I was a junior, I fucked up a prod database... which is why we generally don't give junior/associate people too much access to critical infra. Junior engineers aren't "dangerous", but we just don't give them too much access/authority too soon.
Claude Code is actually way smarter than a junior engineer in my experience, but I wouldn't give it direct access to a prod database or servers, it's not needed.
My way of explaining that to people is to say that it's dangerous to do things like that.
If it is not dangerous to give them this access, why not grant it?
I have a cursor rule stating it should never make changes to clusters, and I have explicitly told it not to do anything behind my back.
I don't know what happened in the meantime, maybe it blew up its own context and "forgot" the basic rules, but when I got back it was running `kubectl patch` to try some changes and see if it works. Basically what a human - with the proper knowledge - would do.
Thing is: it worked. The MF found the templating issue that was breaking my Alertmanager by patching and comparing the logs. All by itself, however by going over an explicit rule I had given it a couple times.
So to summarize: it's useful as hell, but it's also dangerous as hell.
(Having said that, I'm just a kibitzer.)
Problem is: I also force it to run `kubectl --context somecontext`, as to avoid it using `kubectl config use-context` and pull a hug on me (if it switches the context and I miss it, I might then run commands against the wrong cluster by mistake). I have 60+ clusters so that's a major problem.
Then I'd need a way to allowlist `kubectl get --context`, `kubectl logs --context` and so on. A bit more painful, but hopefully a lot safer.
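Claude Code's permission rules in .claude/settings.json can express that kind of prefix allowlist; here's a sketch (double-check the exact matcher syntax against the docs):

    {
      "permissions": {
        "allow": [
          "Bash(kubectl get --context:*)",
          "Bash(kubectl logs --context:*)",
          "Bash(kubectl describe --context:*)"
        ],
        "deny": [
          "Bash(kubectl config use-context:*)",
          "Bash(kubectl patch:*)",
          "Bash(kubectl delete:*)"
        ]
      }
    }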
I too use it extensively. But they’re very, very capable models, and the command line contains a bunch of ways to exfiltrate data off your system if it wants to.
Yes, it was a legit safety issue and worth being aware of, but it's not like it was the general case. Red teamers worked hard to produce that result.
Was it a paper or something? Would you happen to remember the reference?
This is nowhere near the contortions red teams sometimes go through. They noted in general that overly emphasizing initiative was taken ... seriously.
I use Sonnet and Opus all the time through Claude. But I don't generally use them with --dangerously-skip-permissions on my main laptop.
You (and many, many others) likely won't take this threat seriously until adversarial attacks become common. Right now, outside of security researcher proof of concepts, they're still vanishingly rare.
You ask why I'm obsessed with the danger? That's because I've been tracking prompt injection - and our total failure to find a robust solution for it - for three years now. I coined the name for it!
The only robust solution for it that I trust is effective sandboxing.
I share your worries on this topic.
I saw you experiment a lot with python. Do you have a python-focused sandboxed devcontainer setup for Claude Code / Codex you want to share? Or even a full stack setup?
Claude's devcontainer setup (https://github.com/anthropics/claude-code/tree/main/.devcont...) is focused on JS with npm.
I actually preferred running stuff in containers to keep my personal system clean anyway so I like this better than letting claude use my laptop. I'm working on hosting devcontainer claude code in kubernetes too so I dont need my laptop at all.
I wrote a bit about that in a new post this morning, but I'm still looking for an ideal solution: https://simonwillison.net/2025/Sep/30/designing-agentic-loop...
- create a separate linux user, put it in an 'appshare' group, set its umask to 002 (default rwxrwxr-x)
- optional: set up some symlinks from its home dir to mine, such as various ~/.config/... so it can use my installed packages and opencode config, etc. I have the option to give it limited write access with chgrp to appshare and chmod g+w (e.g. julia's cache)
- optional: set up firewall rules
- if it only needs read-only access to my git history, it can work in a git worktree. I can then make git commits with my user account from the worktree. Or I can chgrp/chown my main working copy. Otherwise it needs a separate checkout
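Roughly, those steps in commands (the 'agent' user name and paths are just placeholders for illustration):

    sudo groupadd appshare
    sudo useradd -m -G appshare agent                      # dedicated user for the agent
    echo 'umask 002' | sudo tee -a /home/agent/.profile    # new files come out group-writable
    # optional: share some of my config read-only via symlinks
    sudo mkdir -p /home/agent/.config
    sudo ln -s "$HOME/.config/opencode" /home/agent/.config/opencode
    # give it write access to one shared directory only:
    chgrp -R appshare ~/projects/foo && chmod -R g+w ~/projects/foo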
I feel this is overly exaggerated here.
There are more issues currently being leveraged to attack via VS Code extensions than via AI prompt injection, which requires a VERY VERY complex chain of attack to get any leaks.
Lots of ways this could happen. To name two: third-party software dependencies, and HTTP requests for documentation (if your agent queries the internet for information).
If you don't believe me, setup a MITM proxy to watch network requests and ask your AI agent to implement PASETO in your favorite programming language, and see if it queries https://github.com/paseto-standard/paseto-spec at all.
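For example, something like this (mitmproxy is just one option; the agent's HTTP stack also has to honor the proxy variables, and inspecting HTTPS needs mitmproxy's CA trusted):

    mitmproxy --listen-port 8080        # terminal 1: watch the requests live
    # terminal 2: route the agent through it
    HTTP_PROXY=http://127.0.0.1:8080 HTTPS_PROXY=http://127.0.0.1:8080 codex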
It reads more like a buzz article about how it could happen. This is very complicated to exploit compared to classic supply chains, and very narrow!
????
What does "This" refer to in your first sentence?
But that's a very big if. I've seen Claude Code attempt to debug a JavaScript issue by running curl against the jsdelivr URL for a dependency it's using. A supply chain attack against NPM (and those aren't exactly rare these days) could add comments to code like that which could trigger attacks.
Ever run Claude Code in a folder that has a downloaded PDF from somewhere? There are a ton of tricks for hiding invisible malicious instructions in PDFs.
I run Claude Code and Codex CLI in YOLO mode sometimes despite this risk because I'm basically crossing my fingers that a malicious attack won't slip in, but I know that's a bad idea and that at some point in the future these attacks will be common enough for the risk to no longer be worth it.
Again, you likely use VS Code. Are you checking each extension you download? There are already a lot of reported attacks using VS Code.
A lot of noise over hypothetical MCP or tool attacks. The attack surface is very narrow compared to what we already run before reaching Claude Code.
Yes, Claude Code uses curl, and I find it quite annoying that we can't turn off the internal tools and replace them with MCPs that have filters, for better logging and the ability to proxy/block actions with more in-depth analysis.
Maybe it will never happen? I find that extremely unlikely though. I think the reason it hasn't happened yet is that widespread use of agentic coding tools only really took off this year (Claude Code was born in February).
I expect there's going to be a nasty shock to the programming community at some point once bad actors figure out how easy it is to steal important credentials by seeding different sources with well crafted malicious attacks.
The researcher has gotten actual shells on oai machines before via prompt injection
https://gitlab.com/txlab/ai/sandcastle/
Check it out if you're experimental - but probably better in a few weeks when it's more stable.
Nice job for coining the name for something but it’s irrelevant here.
How is someone going to prompt inject my local code repo? I’m not scraping random websites to generate code.
This sort of baseless fear mongering doesn’t help the wider ai community.
See comment here for more: https://news.ycombinator.com/item?id=45427324
You may think you're not going to be exposed to malicious instructions, but there are so many ways bad instructions might make it into your context.
The fact that you're aware of this is the thing that helps keep you safe!
And yes, these are all "skill issues" - as in, if they had known better this wouldn't have happened to them. However, I think it's fair to call these possibilities out to counterbalance the "AI is amazing and everyone should use it for everything" type narratives, so as to instil at least a little caution.
i.e. quite dangerous, but people do it anyway
You know what neighbors of serial killers say to the news cameras right?
"He was always so quiet and polite. Never caused any issues"
I suppose they’re dangerous in the same way any terminal shell is dangerous, but it seems a bit of a moral panic. All tools can be dangerous if misused.
Even with approvals humans will fall victim to dialog fatigue, where they'll click approve on everything without reading it too closely.
What are we even talking about? I think life itself grants us the right to get high or pet wild animals or swim the Atlantic or sudo rm -rf... Or yes-and-accept-edits at 3AM with 50 hours of uptime (yes, guilty), but then we don't get to complain that it's dangerous. We surely were warned.
I used gpt5-codex inside codex-cli to produce this fork of DOSBox (https://github.com/pmarreck/dosbox-staging-ANSI-server) that adds a little telnet server, letting me screen-scrape VGA textmode data and issue virtual keystrokes - full round-trip scripting, which I ended up needing for a side project to solve a Y2K+25 bug in a DOS app still in production use (yes, these still exist!). That came to 4000+ lines of C++ (I took exactly one class in C++), it passes all tests, and it's non-blocking. I was then able to turn around and, within the very same session, have it help me price the work for the client with full justification, along with a history of previous attempts to solve the problem (all of which took my billable time, of course). Since it had the full work history both in git and in its conversation history, it was able to help me generate a killer invoice.
So (if all goes well) I may be getting $20k out of this one, thanks to its help.
Does the C++ code it made pass muster with an experienced C++ dev? Probably not (I'd be happy to accept criticism, lol, although I think I need to dress up the PR a bit more first), but it does satisfy the conditions that 1) it builds, 2) it passes all its own tests as well as DOSBox's, 3) it's non-blocking (commands enter a queue and are processed one set of instructions at a time per tick), and 4) it works as well as I need it to for the main project. That still makes it suitable for one-off tasks, of which there is a ton of need.
This is a superpower in the right hands.
Excellent article in this vein: https://jxnl.co/writing/2025/09/04/context-engineering-rapid...
So I can opt out of training, but they still save the conversation? Why can't they just not use my data when I pay for things. I am tired of paying, and then them stealing my information. Tell you what, create a free tier that harvests data as the cost of the service. If you pay, no data harvesting.
Storing the data is not the same as stealing. It's helpful for many use cases.
I suppose they should have a way to delete conversations though.
Even that is debatable. There are a lot of weasel words in their text. At most they're saying "we're not training foundation models on your data", which is not to say "we're not training reward models" or "we're not testing our other-data models on your data" and so on.
I guess the safest way to view this is to consider anything you send them as potentially in the next LLMs, for better or worse.
When they ask "How is Claude doing this session?", that appears to be a sneaky way for them to harvest the current conversation based on the terms-of-service clause you pointed out.
Looks great, but it's kind of buggy:
- I can't figure out how to toggle thinking
- Have to click in the text box to write, not just anywhere in the Claude panel
- Have to click to reject edits
[0] https://cognition.ai/blog/devin-sonnet-4-5-lessons-and-chall...
I told it to crop the video to just her and remove the obscured portion and that I had ffmpeg and imagemagick installed and it looked at the video, found the crop dimensions, then ran ffmpeg and I had a video of her all cleaned up! Marvelous experience.
My only complaint is that sometimes I want high speed. Unfortunately Cerebras and Groq don't seem to have APIs that are compatible enough for someone to have put them into Charm Crush or anything. But I can't wait for that.
https://github.com/grafbase/nexus/
If Groq speaks the OpenAI API, you enable the Anthropic protocol and an OpenAI provider with a base URL pointing to Groq. Set ANTHROPIC_BASE_URL to that endpoint and start claude.
I haven't tested Groq yet, but this could be an interesting use case...
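Untested sketch of that wiring (the port and token are placeholders; the real Groq key would live in the proxy config):

    # a local proxy (e.g. nexus) speaks the Anthropic protocol on :8000 and
    # forwards to an OpenAI-compatible provider such as Groq
    export ANTHROPIC_BASE_URL=http://localhost:8000
    export ANTHROPIC_AUTH_TOKEN=placeholder-token
    claude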
Auth conflict: Both a token (ANTHROPIC_AUTH_TOKEN) and an API key (/login managed key) are set. This may lead to unexpected behavior.
• Trying to use ANTHROPIC_AUTH_TOKEN? claude /logout
• Trying to use /login managed key? Unset the ANTHROPIC_AUTH_TOKEN environment variable.
Probably just another flag to find.

EDIT: For anyone coming here from elsewhere, Crush from Charm supports Cerebras/Groq natively!
Crush is also not a good assistant. It does not integrate scrollback with iTerm2 so I can't look at what the assistant did. The pane that shows the diff side by side is cool but in practice I want to go see the diff + reasoning afterwards so I can alter sections of it more easily and I can't do that.
But you're right, they have an OpenAI compatible API https://inference-docs.cerebras.ai/resources/openai so perhaps I can actually use this in the CLI! Thanks for making me take another look.
EDIT: Woah, Charm supports this natively. This is great. I am going to try this now.
Inpainting is harder on videos than on images, but there are plenty of models that can do it. Google's Veo 3 can remove objects from videos: https://deepmind.google/models/veo/
I feel like there's so many bugs. The / commands for add-dir and others I used often are gone.
I logged in, it still says "Login"
Is this going to be the way forward? Switching to whichever is better at a task, code base or context?
# ~/.jjconfig.toml
[core]
fsmonitor = "watchman"
[core.watchman]
# Newer docs use the hyphenated key below:
register-snapshot-trigger = true

I also use jj to checkpoint. When working on a change, each time I get to a stable point I squash and start fresh with an empty change.
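The checkpoint loop itself is tiny - roughly:

    # at a stable point, fold the working-copy change into its parent:
    jj squash
    # @ is now a fresh empty change on top; keep hacking from there
    # (with the watchman trigger above, jj snapshots file changes automatically)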
You can absolutely continue doing that.
- Need better memory management and controls (especially across multi-repos)
- /upgrade needs better management
If Claude Code was a car it'd be the ideal practical vehicle for all kinds of uses.
If OpenAI Codex was a car, it'd be a cauldron with wheels.
The reason I say this is CC offers so many features: plan mode, hooks, escape OR ctrl-c to interrupt it, and today added quick rewind. Meanwhile Codex can't even wrap text to the width of the terminal; you can't type to it while it's working to queue up messages to steer it (you have to interrupt with Ctrl-C then type), and it doesn't show you clearly when it's editing files or what edits it's making. It's the ultimate expression of OpenAI's "the agent knows what to do, silly human" plan for the future - and I'm not here for that. I want to steer my agent, and be able to have it show me its plan before it edits anything.
I really wish the developers of Codex spent more time using Claude Code.
You can use it for writing, data processing, admin work, file management, etc.
I compiled a list of non-coding use cases for Claude Code here:
The UX is definitely better because it uses the Bubble Tea library, which is probably the best TUI framework ever.
And you can use a ton of different providers and models
Who's Using Ink?
Codex - An agentic coding tool made by OpenAI.
I guess they had initial versions written in TS?

UPD: They switched 3 months ago.
Why would they?
Specifically, the Input Method Editors needed for CJK input (especially for C and J), which convert ambiguous semi-readable forms into proper readable text, use Enter to finalize after candidates have been iterated with the spacebar. While IME engines aren't interchangeable between different languages, I believe basically all of them roughly follow this pattern.
Unless you specifically want to exclude CJK users, you have to detect the presence of an IME and work with it so that Enter does nothing in the app unless conditions are met. Switching to shift+enter works too.
1: https://github.com/anthropics/claude-code/issues/8405
2: https://www.youtube.com/watch?v=mY6cg7w2eQU
1: https://github.com/anthropics/claude-code/issues/8405#issuec...
https://www.anthropic.com/news/context-management
Anyone know if these are used in Claude-Code?
So I've been able to use shift-enter. I'm using iTerm2 and zsh with CC (if that's relevant).
Others here say that option/alt-enter may work? Not sure why shift-enter couldn't, though.
[ { "key": "shift+enter", "command": "workbench.action.terminal.sendSequence", "args": { "text": "\u001b\n" }, "when": "terminalFocus" }, ]
It will allow you to get new lines without any strange output.
1: https://block.github.io/goose/
I think I lack the social skills to community-drive a fix, probably through some undiagnosed disorder or something, so I've been trying to soldier on alone with some issues I've had for years.
The issues are things like focus jacking in the window manager I'm using on Xorg, where the keyboard and the mouse get separate focuses.
Goose has been somewhat promising, but still not great.
I mean overall, I don't think any of these coding agents have given me useful insight into my long-vexing problems.
I think there has to be some type of perception gap or knowledge asymmetry to be really useful - for instance, with foreign languages.
I've studied a few but just in the "taking classes at the local JC" way. These LLMs are absolutely fantastic aids there because I know enough to frame the question but not enough to get the answer.
There's some model for dealing with this I don't have yet.
Essentially I can ask the right question about a variety of things but arguably I'm not doing it right with the software.
I've been writing software for decades, is it really that I'm not competent enough to ask the right question? That's certainly the simplest model but it doesn't check out.
Maybe in some fields I've surpassed the point where LLMs are useful?
It all circles back to an existential fear of delusional competency.
I've hit this point while designing developer UX for a library I'm working on. LLMs can nail boilerplate, but when it comes to dev UX they seem to not be very good. Maybe that's because I have a specific vision and some pretty tight requirements? Dunno. I'm in the same spot as you for some stuff.
For throwaway code they're pretty great.
They seem autonomous but often aren’t.