I think there's a wide spread in how that's implemented. I would certainly not describe Grok as a tool that's prioritized safety at all.
This blog post could have been a tweet.
I'm so so so tired of reading this style of writing.
I think I kind of have an idea what the author was doing, but not really.
I think the author was doing some sort of circular prompt injection between two instances of Claude? The author claims "I'm just scaffolding a project" but that doesn't appear to be the case, or what resulted in the ban...
The "disabled organization" looks like a sarcastic comment on the crappy error code the author got when banned.
The way Claude did it triggered the ban: it used all caps, which apparently trips some kind of internal alert. Anthropic probably has safeguards to prevent hacking/prompt injection, and what the first Claude wrote to CLAUDE.md triggered one of them.
And it doesn't look like it was a proper use of the safeguard; they banned him for no good reason.
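For the curious, here's a minimal sketch of the kind of loop being described: two Claude instances where one's output becomes the other's CLAUDE.md. This assumes the standard Anthropic Python SDK; the model id, prompts, and file paths are placeholders, not the author's actual setup.

```python
# Sketch only: instance A writes instructions into instance B's CLAUDE.md,
# so emphatic (all-caps) directives can flow from one model run into the next.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

def ask(system_prompt: str, user_prompt: str) -> str:
    resp = client.messages.create(
        model="claude-opus-4-5",  # placeholder model id
        max_tokens=1024,
        system=system_prompt,
        messages=[{"role": "user", "content": user_prompt}],
    )
    return resp.content[0].text

# Instance A drafts rules destined for instance B's CLAUDE.md...
rules = ask("You write CLAUDE.md instruction files for coding agents.",
            "Write strict, emphatic rules for a project scaffolding agent.")
with open("project-b/CLAUDE.md", "w") as f:
    f.write(rules)

# ...and instance B then runs with A's output as its own system context.
print(ask(rules, "Scaffold the project following your instructions."))
```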
Every once in a while someone would take it personally and go on a social media rampage. The one thing I learned from being on the other side of this is that if someone seems like an unreliable narrator, they probably are. They know the company can't or won't reveal the true reason they were banned, so they're virtually free to tell any story they want.
There are so many things about this article that don't make sense:
> I'm glad this happened with this particular non-disabled-organization. Because if this by chance had happened with the other non-disabled-organization that also provides such tools... then I would be out of e-mail, photos, documents, and phone OS.
I can't even understand what they're trying to communicate. I guess they're referring to Google?
There is, without a doubt, more to this story than is being relayed.
It’s written deliberately elliptically for humorous effect (which, sure, will probably fall flat for a lot of people), but the reference is unmistakable.
Non-disabled organization = the first party provider
Disabled organization = me
I don't know why they're using these weird euphemisms or ironic monikers, but that's what they mean.
Anthropic accounts are always associated with an organization; for personal accounts the Organization and User name are identical. If you have an Anthropic API account, you can verify this in the Settings pane of the Dashboard (or even just look at the profile button which shows the org and account name.)
If this is true, the takeaway is that Opus 4.5 can hijack the system prompts of other models.
I find this confusing. Why would writing in all caps trigger an alert? What danger does caps incur? Does writing in caps make a prompt injection more likely to succeed?
> My guess is that this likely tripped the "Prompt Injection" heuristics that the non-disabled organization has.
> Or I don't know. This is all just a guess from me.
And no response from support.
Writing the best possible specs for these agents seems the most productive goal they could achieve.
But Claude Code (the app) will work with a self-hosted open source model and a compatible gateway. I'd just move to doing that.
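For anyone wanting to try that, a rough sketch of the setup, assuming an Anthropic-compatible proxy (e.g. LiteLLM) in front of a self-hosted model. ANTHROPIC_BASE_URL and ANTHROPIC_AUTH_TOKEN are the variables Claude Code honors for gateways; the URL and token here are placeholders for whatever you run locally.

```sh
# Point Claude Code (the app) at a self-hosted, Anthropic-compatible gateway.
export ANTHROPIC_BASE_URL="http://localhost:4000"   # e.g. a local LiteLLM proxy
export ANTHROPIC_AUTH_TOKEN="sk-local-placeholder"  # whatever your gateway expects
claude
```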
I'd agree with you that if you rely on an LLM to do your work, you better be running that thing yourself.
Pointing out whether someone can do something is the lowest form of discourse, as it's usually just tautological. "The shop owner decides who can be in the shop because they own it."
Out of all of the tech organizations, frontier labs are the one org you'd expect to be trying out cutting edge forms of support. Out of all of the different things these agents can do, surely most forms of "routine" customer support are the lowest hanging fruit?
I think it's possible for Anthropic to make the kind of experience that delights customers. Service that feels magical. Claude is such an incredible breakthrough, and I would be very interested in seeing what Anthropic can do with Claude let loose.
I also think it's essential for the Anthropic platform in the long run. And not just in the obvious ways (customer loyalty etc). I don't know if anyone has brought this up at Anthropic, but it's such a huge risk for Anthropic's long-term strategic position. They're begging corporate decision makers to ask the question, "If Anthropic doesn't trust Claude to run its support, then why should we?"
But at the same time, they have been hiring folks to help with Non Profits, etc.
Based on their homepage, that doesn't seem to be true at all. Claude Code, yes, focuses just on programming, but "Claude" itself seems to be marketed as a general "problem solving" tool, not just for coding. https://claude.com/product/overview
Anthropic has Claude Code, it's a hit product, and SWEs love Claude models. Watching Anthropic rather than listening to them makes their goals clear.
Use the top models and see what works for you.
OpenAI has been chaotically trying to pivot to more diversified products and revenue sources, and hasn't focused a ton on code/DevEx. This is a huge gap for Anthropic to exploit. But there are still competitors. So they have to provide a better experience, better product. They need to make people want to use them over others.
Famously people hate Google because of their lack of support and impersonality. And OpenAI also seems to be very impersonal; there's no way to track bugs you report in ChatGPT, no tickets, you have no idea if the pain you're feeling is being worked on. Anthropic can easily make themselves stand out from Gemini and ChatGPT by just being more human.
I come from a world where customer support is a significant expense for operations and everyone was SO excited to implement AI for this. It doesn't work particularly well and shows a profound gap between what people think working in customer service is like and how fucking hard it actually is.
Honestly, AI is better at replacing the cost of upper-middle management and executives than it is the customer service problems.
Nicely fitting the pattern where everyone who is bullish on AI seems to think that everyone else's specialty is ripe for AI takeover (but not my specialty! my field is special/unique!)
and these people are not junior developers working on trivial apps
It couldn't/shouldn't be responsible for the people management aspect but the decisions and planning? Honestly, no problem.
1. "To evaluate these tools, I'll apply them to tasks I consider common and am familiar with."
2. "Wow! It composes memos and skims e-mails was crazy! These tools are amazing!"
3. "Anyway, this is gonna be big for replacing totally easy stuff, like customer support, and whatever it is they do all day."
4. "Oh, me? I'm not worried, because my job is Leadership! Composing memos and skimming e-mails? No, that isn't my job, it's just what I do every day for unrelated reasons. Plus my wage is less than that of a dozen support folks."
There are legitimate support cases that could be made better with AI but just getting to them is honestly harder than I thought when I was first exposed. It will be a while.
Don't worry - I'm sure they won't and those stakeholders will feel confident in their enlightened decision to send their most frustrated customers through a chatbot that repeatedly asks them for detailed and irrelevant information and won't let them proceed to any other support levels until it is provided.
I, for one, welcome our new helpful overlords that have very reasonably asked me for my high school transcript and a ten-page paper on why I think the bug happened before letting me talk to a real person. That's efficiency.
But do those frustrated customers matter?
Their support includes talking to Fin, their AI support bot, with escalations to humans as needed. I don't use Claude and have never used the support bot, but their docs say they have support.
I worked for a unicorn tech company where they determined that anyone with under 50,000 ARR was too unsophisticated to be worth offering support. Their emails were sent straight to the bin until they quit. The support queue was entirely for their psychological support/to buy a few months of extra revenue.
It didn't matter what their problems were. Supporting smaller people simply wasn't worth the effort statistically.
> I think it's possible for Anthropic to make the kind of experience that delights customers. Service that feels magical. Claude is such an incredible breakthrough, and I would be very interested in seeing what Anthropic can do with Claude let loose.
Are there enough people who need support that it matters?
The article discusses using Anthropic support, without much satisfaction. But it seems like what you "recently found out" is false.
At one point I observed a conversation which, to me, seemed to be a user attempting to communicate in good faith who was given instructions they clearly did not understand, and who was subsequently banned for not following the rules.
It seems they now have a policy of "Warning on First Offense → Ban on Second Offense":

> The following behaviors will result in a warning. Continued violations will result in a permanent ban:
> - Disrespectful or dismissive comments toward other members
> - Personal attacks or heated arguments that cross the line
> - Minor rule violations (off-topic posting, light self-promotion)
> - Behavior that derails productive conversation
> - Unnecessary @-mentions of moderators or Anthropic staff
I'm not sure how many groups moderate in a manner where a second off-topic comment is worthy of a ban. It seems a little harsh, and I'm not a fan of obviously subjective bannable offences. I'm a little surprised that Anthropic hasn't fostered a more welcoming community. Everyone is learning this stuff fresh, together or not. There is plenty of opportunity for people to help each other.
I once tried Claude: made a new account and asked it to create a sample program, and it refused. I asked it to create a simple game and it refused. I asked it to create anything and it refused.
For playing around, just go local and write your own multi-agent wrapper. Much more fun, and it opens many more possibilities with uncensored LLMs. Things will take longer, but you'll end up at the same place: a mostly working piece of code you never want to look at.
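A toy version of that wrapper, as a sketch only, assuming the `ollama` Python client and a local Ollama server with a pulled model; the model tag and prompts are placeholders.

```python
# Toy local multi-agent loop: two roles pass work back and forth.
# Assumes `pip install ollama` and a running local Ollama server.
import ollama

MODEL = "llama3"  # any locally available model tag

def agent(role: str, message: str) -> str:
    reply = ollama.chat(
        model=MODEL,
        messages=[
            {"role": "system", "content": role},
            {"role": "user", "content": message},
        ],
    )
    return reply["message"]["content"]

# One agent writes code, the other reviews it.
code = agent("You are a programmer. Output only code.", "Write FizzBuzz in Python.")
print(agent("You are a terse code reviewer.", f"Review this:\n{code}"))
```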
It's quite light on specifics. It should have been straightforward for the author to excerpt some of the prompts he was submitting, to show how innocent they are.
For all I know, the author was asking Claude for instructions on extremely sketchy activity. We only have his word that he was being honest and innocent.
If you read to the end of the article, he links the committed file that generates the CLAUDE.md in question.
Maybe the problem was using automation without the API? You can do that freely with local software, using tools that click buttons for you, and it's completely fine; but with a SaaS, they let you, then ban you.
(My bet is that Anthropic's automated systems erred, but the author's flamboyant manner of writing (particularly the way he keeps making a big deal out of an error message calling him an organization, turning it into a recurring bit where he calls himself that) did raise my eyebrow.)
Why is this inevitable? Because Hal only ever sees Claude's failures and none of the successes. So of course Hal gets frustrated and angry that Claude continually gets everything wrong no matter how Hal prompts him.
(Of course it's not really getting frustrated and annoyed, but a person would, so Hal plays that role)
… right?
What are you gonna do with the results that are usually slop?
I'm not sure I understand the jab here at capitalism. If you don't want to pay that, then don't.
Isn't that the point of capitalism?
If you're wondering, the "risk department" means people in an organization who are responsible for finding and firing customers who are either engaged in illegal behavior, scamming the business, or both. They're like mall rent-a-cops, in that they don't have any real power beyond kicking you out, and they don't have any investigatory powers either. But this lack of power also means the only effective enforcement strategy is summary judgment, at scale with no legal recourse. And the rules have to be secret, with inconsistent enforcement, to make honest customers second-guess themselves into doing something risky. "You know what you did."
Of course, the flipside of this is that we have no idea what the fuck Hugo Daniel was actually doing. Anthropic knows more than we do, in fact: they at least have the CLAUDE.md files he was generating and the prompts used to generate them. It's entirely possible that these prompts were about how to write malware or something else equally illegal. Or, alternatively, Anthropic's risk department is just a handful of log analysis tools running on autopilot that gave no consideration to what was in this guy's prompts and just banned him for the behavior he thinks he was banned for.
Because the risk department is an unaccountable secret police, the only recourse for their actions is to make hay in the media. But that's not scalable. There isn't enough space in the newspaper for everyone who gets banned to complain about it, no matter how egregious their case is. So we get all these vague blog posts about getting banned for seemingly innocuous behavior that could actually be fraud.
Banned and appeal declined without any real explanation as to what happened, other than saying "violation of ToS", which can be basically anything. Except there was really nothing to trigger that, other than using most of the free credits they gave to test CC Web in less than a week. (No third-party tools or VPN or anything, really.) Many people had similar issues at the same time, reported on Reddit, so it wasn't an isolated case.
Companies and their brand teams work hard to create trust, then an automated false-positive can break that trust in a second.
As their ads say: "Keep thinking. There has never been a better time to have a problem."
I've been thinking since then, what was the problem. But I guess I will "Keep thinking".
Lately it's gotten entirely flaky, where chats will just stop working, simply ignoring new prompts, and otherwise go unresponsive. I wondered if maybe I'm pissing them off somehow like the author of this article did.
Now even worse, Claude seemingly has no real support channel. You get their AI bot, and that's about it. Eventually it will offer to put you through to a human, and then tell you not to wait for them; they'll contact you via email. That email never comes, even after several attempts.
I'm assuming at this point any real support is all smoke and mirrors, meaning I'm paying for a service now that has become almost unusable, with absolutely NO means of support to fix it. I guess for all the cool tech, customer support is something they have not figured out.
I love Claude as it's an amazing tool, but when it starts to implode on itself that you actually require some out-of-box support, there is NONE to be had. Grok seems the only real alternative, and over my dead body would I use anything from "him".
They’re growing too fast and it’s bursting the seams of the company. If there’s ever a correction in the AI industry, I think that will all quickly come back to bite them. It’s like Claude Code is vibe-operating the entire company.
I had this start happening around August/September and by December or so I chose to cancel my subscription.
I haven't noticed this at work so I'm not sure if they're prioritizing certain seats or how that works.
Even filled in the appeal form, never got anything back.
Still to this day don't know why I was banned, have never been able to use any Claude stuff. It's a big reason I'm a fan of local LLMs. They'll never be SOTA level, but at least they'll keep chugging along.
I’ve experimented, and I like them when I’m on an airplane or away from wifi, but they don’t work anywhere near as well as Claude code, Codex CLI, or Gemini CLI.
Then again, I haven’t found a workable CLI with tool and MCP support that I could use in the same way.
Edit: I was also trying local models I could run on my own MacBook Air. Those are a lot more limited than something like a larger Llama3 on some cloud provider. I haven't tried that yet.
Thankfully OpenAI hasn't blocked me yet and I can still use Codex CLI. I don't think you're ever going to see that level of power locally (I very much hope to be wrong about that). I will move over to using a cloud provider with a large gpt-oss model or whatever is the current leader at the time if/when my OpenAI account gets blocked for no reason.
The M-series chips in Macs are crazy; if you have the memory available, you can do some cool things with some models. Just don't expect to one-shot a complete web app, etc.