It will tell me a suggested abstraction is probably overkill and just to make a component own the new thing I’m discussing.
What I’m missing from the loop is it later saying without directly prompting, “hey it’s time to revisit that abstraction idea.”
Yes, that's assuming you take time to clean up now and then. If you don't, that's on you.
> It’s just incapable of the thing that makes a real architect valuable: saying “no.”
From my experience Claude is excellent at saying "no". It won't say "no" if the prompt doesn't call for it (it won't say "no" to your direct request to do something, usually). But it offers good critique and happily pushes back if you make it clear that that's a first class option.
So I was blunt, and said "I don't care about the burn-rate on some hypothetical chart that you produced at the start. I care about removing bugs and having a robust product, which this approach is satisfactorily doing. We will continue along this path, if the tests are not showing gain, then the tests are poorly designed".
At which point it got all apologetic, wrote new memories, and we didn't have a problem thereafter.
The issue was that I was attacking a huge bug-surface, and although each bug-fix was valid, correct, and helped move the dial, it didn't move the dial on the test-bed that Claude had created to measure its work against. There were too many inter-connected bugs for a single fix to really make a difference to these higher-level tests. I knew it was going to take a while to get through them, but apparently Claude didn't.
You try changing the size of a pointer from 2 bytes to 3 bytes on a compiler[1] for a 6502 while introducing automatically-tracked bank-switching on your memory-managed pointers, and see how many code-sites that impacts [grin].
[1]: https://atari-xt.com
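To give a rough idea of why that change touches so many code-sites, here is a minimal sketch of a 3-byte "far" pointer with automatic bank tracking. It is only an illustration, not the layout the compiler above actually uses: the window base, the bank size, and the names `FarPointer` and `BankedMemory` are all assumptions made up for the example.

```python
from dataclasses import dataclass

# Hypothetical layout: one 16 KiB window in the 64 KiB address space.
WINDOW_BASE = 0x4000
WINDOW_SIZE = 0x4000


@dataclass(frozen=True)
class FarPointer:
    """Hypothetical 3-byte pointer: 16-bit address plus a bank byte."""
    bank: int
    address: int

    def to_bytes(self) -> bytes:
        # Low byte first, as the 6502 stores 16-bit values, then the bank.
        return bytes([self.address & 0xFF, self.address >> 8, self.bank])

    @classmethod
    def from_bytes(cls, raw: bytes) -> "FarPointer":
        if len(raw) != 3:
            raise ValueError("a far pointer is exactly 3 bytes")
        return cls(bank=raw[2], address=raw[0] | (raw[1] << 8))


class BankedMemory:
    """Toy model of banked RAM that switches banks only when needed."""

    def __init__(self, bank_count: int) -> None:
        self.banks = [bytearray(WINDOW_SIZE) for _ in range(bank_count)]
        self.current_bank = 0
        self.switch_count = 0

    def _select(self, bank: int) -> None:
        # The "automatic tracking": skip the switch if the bank is mapped.
        if bank != self.current_bank:
            self.current_bank = bank
            self.switch_count += 1

    def _offset(self, ptr: FarPointer) -> int:
        offset = ptr.address - WINDOW_BASE
        if not 0 <= offset < WINDOW_SIZE:
            raise ValueError("address is outside the banked window")
        return offset

    def write(self, ptr: FarPointer, value: int) -> None:
        self._select(ptr.bank)
        self.banks[self.current_bank][self._offset(ptr)] = value

    def read(self, ptr: FarPointer) -> int:
        self._select(ptr.bank)
        return self.banks[self.current_bank][self._offset(ptr)]
```

Every place that used to assume a pointer is two bytes (loads, stores, comparisons, arithmetic, struct layouts, calling conventions) now has to carry the extra bank byte and go through something like `_select` before dereferencing, which is where the large number of affected code-sites comes from.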
The thing that I find Claude incredibly good at when I'm designing architecture is working more like a research assistant on briefing decisions. It has the ability to read the entire code base and draw some conclusions. It can pull from lots of best practices and the millions of blog posts about this or that pretty effortlessly, which would take me a lot more time.
And then if asked, it can do a really good job of laying out the landscape around decisions and walking through the trade-offs. Like the author of this post, I found that if you let it, it will certainly be happy to just come up with some architecture and run with it, often in ways that will paint you quite rapidly into a corner.
But if you ask it to present you with all the trade-offs and let you make the judgment calls, it's great for that too.
That's certainly how I use it. And I think, just like anything else, working with AI is a skill, and similar to working with libraries, SaaS providers, service providers, frameworks, or anything else that's a "helper." You learn how something that could work but will fail silently is a problem, or you learn how depending on a fly-by-night SaaS company for a key framework is different than depending on a well-populated open source project, etc.
In the same way, you learn that relying on Claude's judgment is a bad idea, while relying on Claude's ability to summarize, brief, and research can be incredibly efficient.
Irony is using Claude to write a beautifully structured, 2,000-word essay warning the industry about the dangers of letting Claude design things. It’s self-awareness by proxy.
Brainstorm N ways to do X. Sort by probability.
Rather than giving you the average response, your AI tends to sample wider from the input space. Then I can decide which one to go with (or choose something else). Don't outsource all of your thinking.
If you need someone to tell you how stupid your ideas are, either learn to ask in a way that invites criticism, or hire a senior engineer. Don't try to influence LLM makers to make AI less deferential. That's the worst possible direction to go.
I suddenly have new concerns about what my future might be like.
Suggesting it should be 'subservient' is also anthropomorphizing. I think your callout is correct, but you still can't help but refer to it in terms we use for other people or living entities. This is by design from the AI companies.
Not really, you can program a machine to give out orders humans can interpret, so humans can serve a machine that isn't anthropomorphized.
I don't think an inanimate object is capable of "obeying." Or at least that is a very strange way to refer to the act of using a tool.
Most of the conversational skill and perceived intelligence of these models is hidden in RL/system prompts.
And it does get people into a lot of trouble.
I have got into trouble with it when it is extremely confident about something I am not very familiar with (as recently as two weeks ago with Claude). I have also had long, drawn-out "arguments" when I have known it's wrong based on my experience and intuition, and it has steadfastly refused to take my point (last week).
I have learnt to ask it why it was doing something that has turned out to be incorrect, as a post-mortem, and it's all apologetic and subservient and "never going to do that again" (but still does as soon as the context window shifts [eg. run git commands, or, yesterday, kept telling me to use commands that were explicitly communicated to Claude as not being available, and completely wrong - I was shifting from one tech stack to another and Claude kept telling me the original commands, not the new ones])
I'm expecting Claude to be a better search engine - I have spent literal years (if not decades) knowing that asking the right question is what's required to get the right answer, and LLMs' natural language processing is what's supposed to make that easier than using Google or grep, or even Stack Overflow - but the reality is that I still have to be on my toes, especially when I am drifting into territory I am unfamiliar with.
It doesn’t. Computer interfaces had no superfluous subservient text for their entire history prior to LLMs. Some of these interfaces have been highly efficient as tools, arguably more efficient than more recent software in many cases.
When people complain about LLMs being subservient, they’re not complaining about the tool fulfilling their request. They’re complaining about being forced to read a lot of superfluous, overly polite, or even self-deprecating language. There’s nothing in the entire history of tools (going back to Neolithic times) that would indicate that we need that. All of that stuff is an artifact of social interaction between humans in the presence of cultural norms.
When you’re alone in your shop with your tools, you don’t need your bandsaw to apologize to you for nicking your finger.
So... manually learn architecture and security and then vibe code away?
However, the good part: what I had planned for 5 years now looks doable in 6 months. Looking forward to real use by the end of this year.
The article kind of lost me here. Agents are way more than that, today. And the author knows it, as later it says stuff like
> Claude will never do this. It’s trained to be helpful.
But the first phrase just tells me the author has a deep dislike for agents and is looking for rationalizations for that feeling.
Part of the criticism is on point, sure. But if "being trained to be helpful" is a problem, it's fixable. It can "be trained to be more critical".
Later:
> But it wasn’t designed for your team. (..) It was designed for the median of everything Claude has seen. A generic best practice for a generic problem at a generic company. Which is to say, it was designed for nobody.
That's nonsense. Anybody who understands algorithms knows that, sure, at first you have a "good algorithm" that performs well on average, or in the worst case. But then you can design algorithms that are adaptive to the input. The same applies here.
Not really though. They just iterate more and more.
I don’t doubt the problems in this article exist, and I’ve seen them, but in my experience the vast majority of people are still shipping the same quality or better than before they had Claude. Personally, I feel like I’m probably developing at about 1.5x the speed of not using AI tooling. It’s not a silver bullet, but it can be a great assistant.
Tangentially, the use of the word "Architect" sounds out of place here. I don't like saying it, but from what I've seen the industry has gradually destroyed the role of architects over time. There are specialists, but you no longer have generalists who are good at different parts of the system at scale.
retrac•1h ago
When left to its own devices with the instructions "make an assembler for the architecture in ISA.md" -- well it picked Python as the implementation language. Tokens lifted through a bunch of regex. No expression parser! Oh dear. My first assembler was like that too, to be fair.
However, when I described the desired passes and their types:
etc. It was almost one-shot. About 20 minutes until I was happy. Assembles all the test programs correctly. Code is mediocre in many places. But it would have taken me weeks to implement.
joe_mamba•47m ago
As if code written by devs at major corporations isn't mediocre at best.
Nokia's Symbian OS took days to build. Days. With a D. Not hours.
One of our devs shipped code to prod with a memory leak thanks to a library that had "do not use this library in production because it causes a memory leak" written everywhere.
So I don't wanna hear about how poor AI code is when human code is dogshit too. Human laziness and stupidity can beat AI hallucinations.
tquinn35•13m ago
Everyone is aware that humans write poor code and treats the code as such. Not so with AI code. I’ve seen devs and managers cut corners in testing/reviewing code because AI wrote it and they think it’s solid. Sure, you could blame anyone cutting corners, and that would be technically correct, but the notion is so deeply embedded in many managers and higher-ups that it’s hard to fight back. AI companies push this narrative, and many individuals who do not routinely use it believe it. There is a manager at my company who loves to reference a video Anthropic released last year claiming that Claude could build an app start to finish essentially unaided. He believes it’s the lack of user skill that’s the issue and not a false claim by a startup trying to make as much money as possible.
mlinhares•43m ago
Even just saving me the time to deal with CI is worth it.
allthetime•35m ago
tempest_•28m ago
If this is true, how can you confidently make this assertion?
You yourself are not in a position to evaluate it; you are just running it through a couple of times hoping for an "oh wait, you're right to call me out on that, that is not correct at all".
radlad•20m ago
2. Ask for references and read them.
> When done properly your own knowledge should have grown to meet the product you end up with.
cyanydeez•4m ago
bluefirebrand•16m ago
"Here's my idea, go build it please"
"Can I ask you questions about it?"
"Hey, you're the engineer, you figure it out. That's why I pay you"
Tale as old as time
bluegatty•29m ago
Like - it can do the work for us.
It jibes with post-training and verifiable rewards.
The reason AI doesn't do well at 'architecture' is 1) we are bad at it and have given it a lot of mush and 2) we don't have good abstractions for it.
The result is - you stick to 'very strong conventions' and if you walk off that path you're risking a lot.
Toolchains are very deterministic, the AI can take it apart and re-assemble like Lego - and each level of the space is also deterministic. It's perfect for AI.
mpweiher•22m ago
Maybe it's time for an architecture-oriented programming language?
https://objective.st
https://dl.acm.org/doi/10.1145/3689492.3690052
bluegatty•7m ago