Problem: Claude Code 2.1.0 crashes with "Invalid Version: 2.1.0 (2026-01-07)" because the CHANGELOG.md format changed to include dates in version headers (e.g., "## 2.1.0 (2026-01-07)"). The code parses these headers as object keys and tries to sort them using semver's .gt() function, which can't parse version strings with date suffixes.

Affected functions: W37, gw0, and an unnamed function around line 3091 that fetches recent release notes.

Fix: wrap version strings with semver.coerce() before comparison. Run these 4 sed commands on cli.js:

```
CLI_JS="$HOME/.nvm/versions/node/$(node -v)/lib/node_modules/@anthropic-ai/claude-code/cli.js"
# Backup first
cp "$CLI_JS" "$CLI_JS.backup"
# Patch 1: Fix ve2.gt sort (recent release notes)
sed -i 's/Object\.keys(B)\.sort((Y,J)=>ve2\.gt(Y,J,{loose:!0})?-1:1)/Object.keys(B).sort((Y,J)=>ve2.gt(ve2.coerce(Y),ve2.coerce(J),{loose:!0})?-1:1)/g' "$CLI_JS"
# Patch 2: Fix gw0 sort
sed -i 's/sort((G,Z)=>Wt\.gt(G,Z,{loose:!0})?1:-1)/sort((G,Z)=>Wt.gt(Wt.coerce(G),Wt.coerce(Z),{loose:!0})?1:-1)/g' "$CLI_JS"
# Patch 3: Fix W37 filter
sed -i 's/filter((\[J\])=>!Y||Wt\.gt(J,Y,{loose:!0}))/filter(([J])=>!Y||Wt.gt(Wt.coerce(J),Y,{loose:!0}))/g' "$CLI_JS"
# Patch 4: Fix W37 sort
sed -i 's/sort((\[J\],\[X\])=>Wt\.gt(J,X,{loose:!0})?-1:1)/sort(([J],[X])=>Wt.gt(Wt.coerce(J),Wt.coerce(X),{loose:!0})?-1:1)/g' "$CLI_JS"
```

Note: If installed via a different method, adjust the CLI_JS path accordingly (e.g., /usr/lib/node_modules/@anthropic-ai/claude-code/cli.js).
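For anyone wondering why coerce() is enough here, a quick sanity check (a sketch, assuming you have the semver npm package installed somewhere node can resolve it; the older version string is made up):

```
node -e '
  const semver = require("semver");
  const a = "2.1.0 (2026-01-07)";  // new header format that triggers the crash
  const b = "2.0.76";              // made-up older version to compare against
  try {
    semver.gt(a, b, { loose: true });
  } catch (e) {
    console.log(e.message);        // Invalid Version: 2.1.0 (2026-01-07)
  }
  console.log(String(semver.coerce(a)));  // 2.1.0
  console.log(semver.gt(semver.coerce(a), semver.coerce(b), { loose: true }));  // true
'
```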
An alternative workaround is to replace the cached changelog with a stub and make it read-only:

```
rm -rf ~/.claude/cache
mkdir -p ~/.claude/cache
echo "# Changelog" > ~/.claude/cache/changelog.md
chmod 444 ~/.claude/cache/changelog.md
```

edit: it seems changelog.md is assumed to be structured data and parsed at startup, and there are no tests to enforce the changelog structure: https://github.com/anthropics/claude-code/issues/16671
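A check along these lines would have flagged the header format change before it shipped (a sketch, assuming node and the semver package are available, run against the repo's CHANGELOG.md):

```
grep -E '^## ' CHANGELOG.md | sed 's/^## //' | node -e '
  const semver = require("semver");
  const headers = require("fs").readFileSync(0, "utf8").trim().split("\n").filter(Boolean);
  const bad = headers.filter((h) => !semver.valid(h, { loose: true }));
  if (bad.length > 0) {
    console.error("Version headers that are not plain semver:", bad);
    process.exit(1);  // 2.1.0 (2026-01-07) lands here
  }
'
```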
But testing the new version would have meant downloading the not-yet-updated changelog, which still worked.
There are ways to deal with this of course, and I'm not defending the very vibey way that claude-code is itself developed.
We need to get marketshare by going fast!
I've been using AI codegen for months now, but on large projects. Turns out, the productivity multiplier that agentic AI can be scales at least partially in proportion to project size. Read that again, because I don't mean "inverse proportion".
When a codebase is small, every change touches a majority of the codebase, making parallel work difficult or impossible. Once it gets large enough to have functional areas, you can have multiple tasks running at once with little or no merge conflicts.
I was giving Cursor a shot because it's the tool that's most popular at my new company. Prior to this, I was using OpenHands. I've used Claude Code quite a bit for my personal stuff, but I wanted some hands-on experience with local tooling and Cursor was the default choice.
Now that I've got this app to the point where frontend and backend concerns are separate and the interfaces are defined, I'm realizing that Cursor doesn't seem to have anything approaching Claude Code's parallel subagent support. That's... limiting.
So now I get to decide if the improvement in velocity I'll get from switching to CC will offset the time it'll take me to make the change before I have a deadline to meet.
There's something so unnerving about the people pushing the AI frontier being sloppy about testing. I know, it's just a CLI wrapped around the AI itself, but it suggests to me that the culture around testing there isn't as tight and thorough as I'd like it to be.
"[Person who is financially incentivized to make unverifiable claims about the utility of the tool they helped build] said [tool] [did an unverified and unverifiable thing] last month"
Is anyone with or without AI approaching anywhere near that speed of delivery?
I don’t think my whole company matches that amount. It sounds super unreasonable, just doing a sanity check.
https://steve-yegge.medium.com/welcome-to-gas-town-4f25ee16d...
> It’s also 100% vibe coded. I’ve never seen the code, and I never care to, which might give you pause. ‘Course, I’ve never looked at Beads either, and it’s 225k lines of Go code that tens of thousands of people are using every day. I just created it in October. If that makes you uncomfortable, get out now.
Humans writing code is slow, no doubt, but humans reading code ain't that much faster.
I'm not affiliated with Claude or the project linked.
https://www.amazon.com/Vibe-Coding-Building-Production-Grade...
He also has some other agent-coordination software. https://github.com/steveyegge/vc
Don't know whether it's helpful, or what the difference is.
> Gas Town is also expensive as hell. You won’t like Gas Town if you ever have to think, even for a moment, about where money comes from. I had to get my second Claude Code account, finally; they don’t let you siphon unlimited dollars from a single account, so you need multiple emails and siphons, it’s all very silly. My calculations show that now that Gas Town has finally achieved liftoff, I will need a third Claude Code account by the end of next week. It is a cash guzzler.
https://steve-yegge.medium.com/welcome-to-gas-town-4f25ee16d...
Which could mean that code was refactored and then built on top of. Or it could just mean that Claude had to correct itself multiple times over those 459 commits.
Does correcting your mistakes from yesterday’s ChatGPT binge episode count as progress…maybe?
I can easily imagine constant churn in the code because it switches between five different implementations when run five times, going back to the first one on the sixth time and repeating the process.
I gotta ask, though, why exactly is that much code needed for what CC does?
It's a specialised wrapper.
#!/usr/bin/env bash
while true; do
  printf "> "
  read -r USER_INPUT || exit 0
  RESPONSE=$(curl -s https://api.openai.com/v1/chat/completions \
    -H "Authorization: Bearer $OPENAI_API_KEY" \
    -H "Content-Type: application/json" \
    -d "{
      \"model\": \"gpt-5.2\",
      \"messages\": [
        {\"role\": \"user\", \"content\": \"$USER_INPUT\"}
      ]
    }")
  echo "$RESPONSE" | jq -r '.choices[0].message.content'
done

That's an awfully presumptuous tone to take :-)
I'm not deciding "This is how many lines they are allowed"; I'm trying to get an idea of exactly what functionality CC provides that requires that sort of volume.
I mean, it's a high-level language being used, it's pulling in a lot of dependencies, etc. It literally is glue code.
Bearing in mind that it appears to be (at this point anyway) purely vibe-coded, I am wondering just how much of the code is dead weight - generated by the LLM and never removed.
Lines of code has always been a questionable metric of velocity, and AI makes that more true than ever.
There's a ton of cruft in code that humans are less inclined to remove because it just works, but imagine having an LLM do the cleanup work instead of the generation work.
- get a feature request/bug
- understand the problem
- think on a solution
- deliver the solution
- test
- submit to code review, including sufficient explanation, and merge when ready
260 PRs a month means the cycle above is happening once per hour, at constant speed, over 60-hour work weeks (260 PRs against roughly 260 working hours a month).
You know the features you'd like to have in advance, or you can see the changes you want to make as you build it.
And a lot of the "deliver the solution - test - submit to code review, including sufficient explanation" can be handled by AI.
This is more what agentic-assisted dev looks like:
1. Get a feature request / bug
2. Enrich the request / bug description with additional details
3. Send AI agents to handle request
4a. In some situations, manually QA results, possibly return to 2.
4b. Otherwise, agents will babysit the code through merge.
The second is that the above steps are performed in parallel across X worktrees. So, the stats are based on the above steps proceeding a handful of times per hour--in some cases completely unassisted.
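In practice, the "X worktrees" part can be as simple as something like this (paths, branch names, and the one-agent-per-directory idea are illustrative, not anyone's specific setup):

```
# One isolated checkout per task, all sharing the same repository
git worktree add ../myapp-issue-101 -b agent/issue-101
git worktree add ../myapp-issue-102 -b agent/issue-102
git worktree add ../myapp-issue-103 -b agent/issue-103
# ...then start one agent session per directory, each fed its own enriched ticket
```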
---
With enough automation, the engineer is only dealing with steps 2 and 4a. You get notified when you are needed, so your attention can focus on finding the next todo or enriching a current todo as per step 2.
---
Babysitting the code through merge means it handles review comments and CI failures automatically.
---
I find communication / consensus with stakeholders, and retooling take the most time.
Lines of code never correlated with quality or even progress. Now they do even less.
I've been working a lot more with coding agents, but my convictions around the core principles of software development have not changed. Just the iteration speed of certain parts of the process.
My understanding of the current state of AI in software engineering is that humans are allowed (and encouraged) to use LLMs to write code. BUT the person opening a PR must read and understand that code. And the code must be read and reviewed by other humans before being approved.
I could easily generate that amount of code and make it write and pass tests. But I don't think I could have it reviewed by the rest of my team - while I am also taking part in reviewing code written by other people on my team at that pace.
Perhaps they just aren't human reviewing the code? Then it is feasible to me. But it would go against all of the rules that I have personally encountered at my companies and that peers have told me they have at their companies.
The AI evangelists at my work who say this the loudest are also the ones shipping the most "did anyone actually look at this code?" bugs.
ratatui_ruby % git remote -v
origin https://git.sr.ht/~kerrick/ratatui_ruby (fetch)
origin https://git.sr.ht/~kerrick/ratatui_ruby (push)
ratatui_ruby % git checkout v0.8.0
HEAD is now at dd3407a chore: release v0.8.0
ratatui_ruby % git log --reverse --format="%ci" | head -1 | read first; \
echo "First Commit: $first\nHEAD Commit: $(git show -s --format='%ci' HEAD --)"
First Commit: 2025-12-22 00:40:22 -0600
HEAD Commit: 2026-01-05 08:57:58 -0600
ratatui_ruby % git log --numstat --pretty=tformat: | \
awk '$1 != "-" { \
if ($3 ~ /\./) { ext=$3; sub(/.*\./, "", ext) } else { ext="(no-ext)" } \
if (ext ~ /^(txt|ansi|lock)$/) next; \
add[ext]+=$1; rem[ext]+=$2 \
} \
END { for (e in add) print e, add[e], rem[e] }' | \
sort -k2 -nr | \
awk 'BEGIN { \
print "---------------------------------------"; \
printf "%-12s %12s %12s\n", "EXT", "ADDED", "REMOVED"; \
print "---------------------------------------" \
} \
{ \
sum_a += $2; sum_r += $3; \
printf "%-12s %12d %12d\n", $1, $2, $3 \
} \
END { \
print "---------------------------------------"; \
printf "%-12s %12d %12d\n", "SUM:", sum_a, sum_r; \
print "---------------------------------------" \
}'
---------------------------------------
EXT ADDED REMOVED
---------------------------------------
rb 51705 18913
md 20037 13167
rs 8576 3001
(no-ext) 4072 2157
rbs 2139 569
rake 1632 317
yml 1431 153
patch 894 894
erb 300 30
toml 118 39
gemspec 62 10
gitignore 27 4
css 22 0
yaml 18 2
ruby-version 1 1
png 0 0
gitkeep 0 0
---------------------------------------
SUM: 91034 39257
---------------------------------------
ratatui_ruby % cloc .
888 text files.
584 unique files.
341 files ignored.
github.com/AlDanial/cloc v 2.06 T=0.26 s (2226.1 files/s, 209779.6 lines/s)
--------------------------------------------------------------------------------
Language files blank comment code
--------------------------------------------------------------------------------
Ruby 305 4792 10413 20458
Markdown 60 1989 256 4741
Rust 32 645 530 4400
Text 168 523 0 4358
YAML 8 316 17 961
ERB 3 20 4 246
Bourne Again Shell 2 24 90 150
TOML 5 16 10 53
CSS 1 3 8 11
--------------------------------------------------------------------------------
SUM: 584 8328 11328 35378
--------------------------------------------------------------------------------

https://github.com/anthropics/claude-code/commit/870624fc158...
That actions-user seems to be mostly maintaining the changelog, but the commits don't look consistent with an automated script. I see a few cases of previous changelog entries being rewritten, or entries moved from one version to another, which automation wouldn't be doing. Seems like human error and poor testing.
Also, why would two or three versions be documented in the same commit?
But there's a good chance you are right.
Btw, now it's back and limits are being enforced. Despite the super heavy usage, I'm still at just 50% of my total usage. They did lose some usage tracking for sure.
> While we are always monitoring instances of this error and and looking to fix them, it's unlikely we will ever completely eliminate it due to how tricky concurrency problems are in general.
This is an extraordinary admission. It is perfectly possible (easy, even, relative to many programming challenges) to write a tool like this without getting the design so wrong that the same bug keeps happening in so many different ways that you have to publicly admit you're powerless to fix them all.
I did some testing of configuring Claude CLI sometime ago via .claude json config files - in particular I tested:
- defining MCP servers manually in config (instead of having the CLI auto add them)
- playing with various combinations of `permissions` arrays
What I discovered was that Claude is not only vibe coded, but basic local logic around config reading seems to also work on the basis of "vibes".
- it seemed like different parts of the CLI codebase did or didn't adhere to the permissions arrays.
- at one point it told me it didn't have permission to read the .claude directory & as a result ran bash commands to search my entire filesystem looking for MCP server URLs for it to provide me with a list of available MCP servers
- when restricted to only be able to read from a working directory, at various points it told me I had denied it read permissions to that same working directory & also freely read from other directories on my system without prompting
- restricting webfetch permissions is extremely hit & miss (tested with Little Snitch in alert mode)
---
I have not reported any of the above as Github issues, nor do I intend to. I had a think about why I won't & it struck me that there's a funny dichotomy with AI tools:
1. all of the above are things the typical vibe coder stereotypes I've encountered simply do not really care deeply about
2. people that care about the above things are less likely to care enough about AI tools to commit their personal time to reporting & debugging these issues
There's bound to be exceptions to these stereotypes out there but I doubt there's sufficient numbers to make AI tooling good.
Perhaps I can laugh at the next Equifax of the world as my credit score gets torched and some dude from {insert location} uses my details to defraud some other party. Of which I don’t find out about until some debt collector shows up months later.
This is unacceptable. Why would I patronize a business that hires vibe coders? I would hope their business fails if they have such pitiful security and such open disdain for their clients.
All the AI websites feel extremely clunky and slow.
I highly suspect no one at Claude is concerned about or working on this.
In any case, any blacklist guardrails will fail at some point, because RL seems to make the models very good at finding alternative ways to do what they think they need to do (i.e. if they are blocked, they'll often pipe cat stuff to a bash script and run that). The only sane way to protect for this is to run it in a container / vm.
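Something like this is the rough shape of the container approach (a sketch only, not a hardened recipe; the image, the API-key hand-off, and the default network access are all choices you'd want to revisit):

```
# Mount only the project directory so nothing else on the host is reachable
docker run --rm -it \
  -e ANTHROPIC_API_KEY \
  -v "$PWD":/workspace \
  -w /workspace \
  node:22 \
  bash -c "npm install -g @anthropic-ai/claude-code && claude"
```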
Nothing new under the sun.
"Oh yeah, my AI keeps busting out of its safeguards to do stuff I tried to stop it from doing. Mondays amirite?"
Because indeed, one of the first times I played around with Claude, I asked it to make a change to my emacs config, which is in a non-standard location. It then wanted to search my entire home directory for it (it did ask permission though).
So unless you're also happy about not reporting bugs to project managers and people using low-code tools, I urge you to reconsider the basis for your perspective.
I would actually argue that only a small percentage of programmers know what happens in code on an instruction level, and near none on a micro-op or register level. Vibe-coding is just one more level of abstraction. The new "code" are the instructions to your LLM.
If you do not, why are you vibe coding?
Also there are ways to use a coding agent that are different from this and produce great results, like this:
https://friendlybit.com/python/writing-justhtml-with-coding-...
LLMs are capable of producing junk, and they are capable of writing decent code. It is up to the operator to use them properly.
The prevailing research suggests this is not quicker than just writing it in the first place.
LLMs excel at tasks that are fresh. LLMs are wonderful at getting the first 80% of the way there. -- LLMs are phenomenally good for a first draft or so.
I've had worse experiences getting LLMs / agents to refactor code. I would believe that in many cases it could be quicker to just manually go through and make refinements rather than merely getting the LLM to keep trying.
I’ve noticed the same thing and it frustrates me almost every day.
They have been tricked into a world-view which validates their continual, lazy use of high-tech auto-generators.
They have been tricked into gleefully opting in to their own deskilling.
Expecting an "AI"-addicted developer to file a bug is like expecting an MSNBC or Fox News viewer to attend a town meeting.
The goal of "AI" products is to foster laziness, dependency, and isolation in their users.
Expecting these users to take any sort of action outside of further communication with their LLM chatbots does not square with the social function of these products.
Edit (response to the guy/LLM below me):
Hackernews comments written by fearmongering LLM idiots will tell me to "keep an open mind" about dogshit LLM chatbots until the day I die.
LLM technology is garbage.
If these tools are changing the world, they're only doing so by:
1. Dramatically facilitating the promulgation of idiotic delusions
2. Making enterprise software far, far more vulnerable than it was even in the recent past
part of what we do, as developers is to learn. to have an open mind to new tools and technologies.
these tools are… different, they’re changing the world (fast), and worth trying to understand. your mental rigidity to doing things “the right way” will hold you back and limit your growth. the world is changing. are you?
---
(that it's a big pile of spaghetti that can't be improved without breaking uncountable dependencies)
I use LLMs on a daily basis. With the rules/commands/skills in place the code generated works, the app is functional, and the business is happy it shipped today and not 6 months from now. Now, as a super senior SWE, I have learned through my professional experiences (now an expert?) to double check your work (and that of your team) to make sure the 'logical' flows are implemented to (my personal) standard of what quality software should 'look' like. I say personal standard since my colleagues have their own preferred standard, which we like to bikeshed during company time (a company standard is after all made of the aggregate agreed upon standards of the personal experiences of the experts in the room).
Today, from my own personal (expert) anecdotal experiences, ALL SOTA LLMs generate functional/working code. But the quality of the 'slop' varies on the model, prompts, tooling, rules, skills, and commands. Which boils down to "the tool is only as good as the dev that wields it". Assuming the right tool for the right job. Assuming you have the experiences to determine the right tool for the right job. Assuming you have taken the opportunities to experience multiple jobs to pair the right tool.
Which leads me to, "Vibe coding" was initially coined (IMO) to describe those without any 'expertise' producing working/functional code/apps using an LLM. Nowadays, it seems like vibe coding means ANYONE using LLMs to generate code, including the SWE experts (like myself of course). We've been chasing quality software pre-LLM, and now we adamantly yell and scream and kick and shout about quality software from the comment sections because of LLMs. I'm beginning to think quality software is a mirage we all chase, and like all mirages it's just a little bit further.
All roads that lead to 'shipping' are made with slop. Some roads have slop corners, slop holes, misspelled slop, slop nouns, slop verbs, slop flows and slop data. It's just with LLMs we build the roads to 'shipping' faster.
On the other hand people ask "where is all the amazing software that has been vibe coded, I haven't seen it?". So Claude Code is two things at once (1) incredibly popular and innovative software that's loved by a huge amount of devs (2) vibe coded buggy crap. If you think this bug is the result of vibe coding, frankly you should look at Claude Code as a whole and be impressed with vibe coding. If Claude CLI has been "vibe coded" then vibe coding must be fine because I've been using Claude Code for probably 8 months and it's been a pretty smooth experience, and an incredibly valuable tool.
(Just kidding.) Some of it is unawareness of the 'subscribe' button I believe, occasionally you'll see someone tell people to cut it out and someone else will reply to the effect of wanting to know when it's fixed etc. But it's also just lazy participation, echoing an IRL conversation I suppose, that you see anywhere - replies instead of upvotes on Reddit and to a slightly lesser extent here for example.
It's about adding another "fuck you" to the Claude Code developers on top of the pile, not about incrementing a counter.
So what should one pick? The rocket, the thumbs up?
Also the emoji won't turn into a notification to steal the dev's attention and make him fix the thing lol
We vibing out here.
I don't understand how that would fit the context window. But with prompts like that your workday would be very boring if you had to run one single agent and wait for it to be done.
@jayeshk29 is our hero
Finally i can finish my fizzbuzz for the interview
I like it but I am not too deep into the whole agentic coding business.
I am curious what the logic here is.
"You are Claude Code, Anthropic's official CLI for Claude."
https://github.com/link-assistant/agent/pull/63