Claude Code is all you need

https://dwyer.co.za/static/claude-code-is-all-you-need.html

851•sixhobbits•6mo ago

Comments

doppelgunner•6mo ago

Is Claude Code really that good? I'm currently using Cursor and I let it pick the LLM model to use.

sixhobbits•5mo ago

Most of these are Anthropic models under the hood, so I think 'whatever fits your workflow best' is the main deciding factor. That's definitely Claude Code for me, and I do think there's some 'secret sauce' in the exact prompting and looping logic they use, but I haven't tried Cursor a lot to be sure.

libraryofbabel•5mo ago

any secret sauce in prompting etc could be trivially reverse engineered by the companies building the other agents, since they could easily capture all the prompts it sends to the LLM. If there’s any edge, it’s probably more around them fine-tuning the model itself on Claude Code tasks.

mwigdahl•5mo ago

Interesting that the other vendors haven't done this "trivial" task, then, and have pretty much ceded the field to Claude Code. _Every_ CLI interface I've used from another vendor has been markedly inferior to Claude Code, and that includes Codex CLI using GPT-5.

dangus•5mo ago

Claude code seems like the obvious choice for someone using Vim but even in the context of someone using a graphical IDE like VSCode I keep hearing that Claude is “better” but I just can’t fathom how that can be the case.

Even if the prompting and looping logic is better, the direct integration with the graphical system along with the integrated terminal is a huge benefit, and with graphical IDEs the setup and learning curve is minimal.

mwigdahl•5mo ago

You can run Claude Code in the terminal window of VS Code, and it has IDE integration so you can see diffs inline, etc. It's not fully integrated like Cursor but you get a lot of the benefits of the IDE in that way.

CodingJeebus•5mo ago

If Cursor works for you, then stick with it. Claude Code is great for terminal-based workflows. Whatever makes you more productive is the better tool.

I’m just glad we’re getting past the insufferable “use Cursor or get left behind” attitude that was taking off a year ago.

bicx•5mo ago

I use Cursor with Claude Code running in the integrated terminal (within a dev container in yolo mode). I'll often have multiple split terminals with different Claude Code instances running on their own worktrees. I keep Cursor around because I love the code completions.

vemv•5mo ago

When it comes to diffs (edits), Cursor is batch-oriented, while CC suggests one edit at a time and can be steered in real time.

That's a critical feature for keeping a human in the loop, preventing big detours and token waste.

reasonableklout•5mo ago

How do you stay in the loop with CC? I find myself using it for the exact opposite use case: large features or greenfield projects where I can just let it rip autonomously for a while. I find the TUI awkward for reviewing code.

vemv•5mo ago

I stay in the loop by reviewing each edit and telling it to do something differently asap if something was wrong.

The opposite approach is also possible: just hit the option (or .json setting) to accept all edits. Then you'd review the persisted changes using your favorite Git tool.

perryizgr8•5mo ago

I feel like as tool Claude code is superior to "regular code editor with ai plugin". This method of having the ai do your tasks feels like the future.

cyprien_g•5mo ago

I'm a long-time GitHub Copilot subscriber, but I have also tested many alternatives, such as Cursor.

Recently, I tried using Claude Code with my GitHub Copilot subscription (via unofficial support through https://github.com/ericc-ch/copilot-api), and I found it to be quite good. However, in my opinion, the main difference comes down to your preferred workflow. As someone who works with Neovim, I find that a tool that works in the terminal is more appropriate for me.

threecheese•5mo ago

Isn’t that usage a violation of ToS? In that repo there’s even an issue thread that mentions this. The way I rely on GitHub these days, losing my account would be far more than annoying.

cyprien_g•5mo ago

That’s indeed a valid concern. I set a low rate limit for copilot-api, but I’m not sure if it’s enough.

I may stop my experiment if there is any risk of being banned.

bfeynman•5mo ago

there is no reason to pay for cursor when claude is definitely the best coding model and you are paying essentially for just a bigger middle man.

CuriouslyC•5mo ago

Claude is the best agent model. Its one shot performance and large codebase comprehension both lag Gemini/GPT though.

suninsight•5mo ago

I think if you use Cursor, using Claude Code is a huge upgrade. The problem is that Cursor was a huge upgrade from the IDE, so we are still getting used to it.

The company I work for builds a similar tool - NonBioS.ai. It is in someways similar to what the author does above - but packaged as a service. So the nonbios agent has a root VM and can write/build all the software you want. You access/control it through a web chat interface - we take care of all the orchestration behind the scene.

Its also in free Beta right now, and signup takes a minute if you want to give it a shot. You can actually find out quickly if the Claude code/nonbios experience is better than Cursor.

CuriouslyC•5mo ago

I think the path forward there is slack/teams/discord/etc integration of agents, so you can monitor and control whatever agent software you like via a chat interface just like you would interact with any other teammate.

suninsight•5mo ago

So we tried that route - but problem is that these interfaces aren't suited for asynchronous updates. Like if the agent is working for the next hour or so - how do you communicate that in mediums like these. An Agent, unlike a human, is only invoked when you give it a task.

If you use the interface at nonbios.ai - you will quickly realize that it is hard to reproduce on slack/discord. Even though its still technically 'chat'

CuriouslyC•5mo ago

On Slack I think threads are fine for this. Have an agent work channel, and they can just create a thread for each task and just dump updates there. If an agent is really noisy about its thinking you might need a loglevel toggle but in my experience with Claude Code/Cursor you could dump almost everything they're currently emitting to the UI into a thread.

It's still nice to have a direct web interface to agents, but in general most orgs are dealing with service/information overload and chat is a good single source of truth, which is why integrations are so hot.

irskep•5mo ago

Letting Cursor pick the model for you is inviting them to pick the cheapest model for them, at the cost of your experience. It's better to develop your own sense of what model works in a given situation. Personally, I've had the most success with Claude, Gemini Pro, and o3 in Cursor.

epiccoleman•5mo ago

I love this article just for the spirit of fun and experimentation on display. Setting up a VPS where Claude is just asked to go nuts - to the point where you're building a little script to keep Claude humming away - is a really fun idea.

This sort of thing is a great demonstration of why I remain excited about AI in spite of all the hype and anti-hype. It's just fun to mess with these tools, to let them get friction out of your way. It's a revival of the feelings I had when I first started coding: "wow, I really can do anything if I can just figure out how."

Great article, thanks for sharing!

pyrale•5mo ago

On one hand, I agree with you that there is some fun in experimenting with silly stuff. On the other hand...

> Claude was trying to promote the startup on Hackernews without my sign off. [...] Then I posted its stuff to Hacker News and Reddit.

...I have the feeling that this kind of fun experiments is just setting up an automated firehose of shit to spray places where fellow humans congregate. And I have the feeling that it has stopped being fun a while ago for the fellow humans being sprayed.

the__alchemist•5mo ago

This is an excellent point that will immediately go off-topic for this thread. We are, I believe, committed, into a mire of CG content enveloping the internet. I believe we will go through a period where internet communications (like HN, Reddit, and pages indexed by search engines) in unviable. Life will go on; we will just be offline more. Then, the defense systems will be up to snuff, and we will find a stable balance.

mettamage•5mo ago

I hope you're right. I don't think you will be, AI will be too good at impersonating humans.

lukan•5mo ago

"we will just be offline more"

I think it will be quite some time into the future, before AI can impersonate humans in real life. Neither hardware, nor software is there, maybe something to fool humans for a first glance maybe, but nothing that would be convincing for a real interaction.

johnecheck•5mo ago

Indeed. I worry though. We need those defense systems ASAP. The misinformation and garbage engulfing the internet does real damage. We can't just tune it out and wait for it to get better.

theshrike79•5mo ago

My theory (and hope) is the rise of a web of trust system.

Implemented so that if a person in your web vouches for a specific url (“this is made by a human”) you can see it in your browser.

gowld•5mo ago

"Web of Trust" has been the proposed answer for, what, 30 years now? But everyone is too lazy to implement and abide by it.

vineyardmike•5mo ago

Don’t worry, it’s coming for real this time. The governments have been proposing a requirement that web companies connect accounts to government IDs.

If that isn’t exciting enough, Sam Altman (yea the one who popularized this LLM slop) will gladly sell you his WorldCoin to store your biometric data on the blockchain!

Analemma_•5mo ago

If your solution to this problem is the web of trust, to be blunt, you don't have a solution. I am techie whose social circle is mostly other techies, and I know precisely zero people who have ever used PGP keys or any other WoT-based system, despite 30 years of evangelism. It's just not a thing anybody wants.

theshrike79•5mo ago

It's 99.99% an UI issue.

If Google wouldn't have let perfect be the enemy of good and had added PGP support to Gmail early on (even just the shittiest signatures that are automatically applied and verified), the world would be a completely different place. Scams just wouldn't exist at this scale when signing mails with a known key would be the standard.

The tech is there, now we have Matrix and XMPP and PubSub and god knows how many protocols to share keys. Even Keybase.io still kind of exists.

What is lacking is a browser ecosystem for people to use their known identities to vouch for a specific url (with smart hashing so that changing the content would invalidate the trust).

We have the technology. Someone(tm) "just" needs to build it :)

kbar13•5mo ago

it's annoying but it'll be corrected by proper moderation on these forums

as an aside i've made it clear that just posting AI-written emoji slop PR review descriptions and letting claude code directly commit without self reviewing is unacceptable at work

DrSiemer•5mo ago

I'm not a fan of this option, but it seems to me the only way forward for online interaction is very strong identification on any place where you can post anything.

xnorswap•5mo ago

That can be automated away too.

People will be more than willing to say, "Claude, impersonate me and act on my behalf".

pyrale•5mo ago

> People will be more than willing to say, "Claude, impersonate me and act on my behalf".

I'm now imagining a future where actual people's identities are blacklisted just like some IP addresses are dead to email, and a market develops for people to sell their identity to spammers.

simonw•5mo ago

That's always been the biggest flaw in the Worldcoin idea in my opinion: if you have a billion+ humans get their eyeball scanned in exchange for some kind of cryptographic identity, you can guarantee that a VERY sizable portion of those billion people will happily sell that cryptographic identity (which they don't understand the value of) to anyone who offers them some money.

As far as I can tell the owner of the original iris can later invalidate an ID that they've sold, but if you buy an ID from someone who isn't strongly technically literate you can probably extract a bunch of value from it anyway.

zoeysmithe•5mo ago

I mean, that's fine I guess as long as its respectable and respects the forum.

"Claude write a summary of the word doc I wrote about x and post it as a reply comment," is fine. I dont see why it wouldnt be. Its a good faith effort to post.

"Claude, post every 10 seconds to reddit to spam people to believe my politics is correct," isn't but that's not the case. Its not a good faith effort.

The moderation rules for 'human slop' will apply to AI too. Try spamming a well moderated reddit and see how far you get, human or AI.

antonvs•5mo ago

The problem is speed and quantity. Humans weren't able to fight off the original email spam, it took automated systems. Forums will have to institute much stronger rate limiting and other such measures.

gowld•5mo ago

That's fine, because once someone is banned, the impersonations are also banned.

withinboredom•5mo ago

I do this every time I find myself typing something I could get written up over or even fired for.

1. I'm usually too emotional to write out why I feel that way instead of saying what I feel.

2. I really don't like the person (or their idea) but I don't want to get fired over it.

Claude is really great at this: "Other person said X, I think it is stupid and they're a moron for suggesting this. Explain to them why this is a terrible idea or tell me I'm being an idiot."

Sometimes it tells me I'm being an idiot, sometimes it gives me nearly copy-pasta text that I can use and agree with.

postexitus•5mo ago

Back in FidoNet days, some BBSs required identification papers for registering and only allowed real names to be used. Though not known for their level headed discussions, it definitely added a certain level of care in online interactions. I remember the shock seeing the anonymity Internet provided later, both positive and negative. I wouldn't be surprised if we revert to some central authentication mechanism which has some basic level of checks combined with some anonymity guarantees. For example, a government owned ID service, which creates a new user ID per website, so the website doesn't know you, but once they blacklist that one-off ID, you cannot get a new one.

andoando•5mo ago

id.me?

Not government owned, but even irs.gov uses it

benterix•5mo ago

Honestly, having seen how it can be used against you, retroactively, I would never ever engage in a discussion under my real name.

(The fact that someone could correlate posts[0] based on writing style, as previously demonstrated on HN and used to doxx some people, makes things even more convoluted - you should think twice what you write and where.)

[0] https://news.ycombinator.com/item?id=33755016

postexitus•5mo ago

This is a subset of "I don't have anything to hide" argument - if we use our real names, I think we'll have more responsibility about what we say. Of course, that's assuming our seemingly democratic governments don't turn authoritarian all of a sudden, as a Turkish citizen, I know that's not a given.

sleepybrett•5mo ago

Smaller communities too.

I grew up in... slightly rural america in the 80s-90s, we had probably a couple of dozen local BBSes the community was small enough that after a bit I just knew who everyone was OR could find out very easily.

When the internet came along in the early 90s and I started mudding and hanging out in newsgroups I liked them small where I could get to know most of the userbase, or at least most of the posing userbase. Once mega 'somewhat-anonymous' (i.e. posts tied to a username, not like 4chan madness) communities like slashdot, huge forums, etc started popping up and now with even more mega-communities like twitter and reddit. We lost something, you can now throw bombs without consequence.

I now spend most of my online time in a custom built forum with ~200 people in it that we started building in an invite only way. It's 'internally public' information who invited who. It's much easier to have a civil conversation there, though we still do get the occasional flame-out. Having a stable identity even if it's not tied to a government name is valuable for a thriving and healthy community.

DrSiemer•5mo ago

Sounds good!

A German forum I'm on allows members limited invites based on participation. The catch is, you are responsible for the people you invite. If they get in trouble, you will share a part of the punishment.

bookofjoe•5mo ago

See also: https://news.ycombinator.com/item?id=44860174 (posted 12 hours ago)

epiccoleman•5mo ago

I definitely understand the concern - I don't think I'd have hung out on HN for so long if LLM generated postings were common. I definitely recognize this is something you don't want to see happening at scale.

But I still can't help but grin at the thought that the bot knows that the thing to do when you've got a startup is to go put it on HN. It's almost... cute? If you give AI a VPS, of course it will eventually want to post its work on HN.

It's like when you catch your kid listening to Pink Floyd or something, and you have that little moment of triumph - "yes, he's learned something from me!"

sixhobbits•5mo ago

(author here) I did feel kinda bad about it as I've always been a 'good' HNer until that point but honestly it didn't feel that spammy to me compared to some human generated slop I see posted here, and as expected it wasn't high quality enough to get any attention so 99% of people would never have seen it.

I think the processes etc that HN have in place to deal with human-generated slop are more than adequate to deal with an influx of AI generated slop, and if something gets through then maybe it means it was good enough and it doesn't matter?

felixgallo•5mo ago

That kind of attitude is exactly why we're all about to get overwhelmed by the worst slop any of us could ever have imagined.

The bar is not 'oh well, it's not as bad as some, and I think maybe it's fine.'

taude•5mo ago

well, he was arguing that it's not worse than 99% of the human slop that gets posted, so where do you draw the line?

* well crafted, human only? * Well crafted, whether human or AI? * Poorly crafted, human * well crafted, AI only * Poorly crafted, AI only * Just junk?

etc.

I think people will intuitively get a feel for when content is only AI generated. If people spend time writing a prompt that doesn't make it so wordy, and has personality, and it OK, then fine.

Also, big opportunity going to be out there for AI detected content, whether in forums, coming in inmail inboxes, on your corp file share, etc...

AtlasBarfed•5mo ago

Did you?

Spoiler: no he didn't.

But the article is interesting...

It really highlights to me the pickle we are in with AI: because we are at a historical maximum already of "worse is better" with Javascript, and the last two decades have put out a LOT of javascript, AI will work best with....

Javascript.

Now MAYBE better AI models will be able to equivalently translate Javascript to "better" languages, and MAYBE AI coding will migrate "good" libraries in obscure languages to other "better" languages...

But I don't think so. It's going to be soooo much Javascript slop for the next ten years.

I HOPE that large language models, being language models, will figure out language translation/equivalency and enable porting and movement of good concepts between programming models... but that is clearly not what is being invested in.

What's being invested in is slop generation, because the prototype sells the product.

zoeysmithe•5mo ago

I mean I can spam HN right now with a script.

Forums like HN, reddit, etc will need to do a better job detecting this stuff, moderator staffing will need to be upped, AI resistant captchas need to be developed, etc.

Spam will always be here in some form, and its always an arms race. That doesnt really change anything. Its always been this way.

bongodongobob•5mo ago

The Internet is already 99% shit and always has been. This doesn't change anything.

zanellato19•5mo ago

It's gotten much worse. Before it was shit from people, now it's corporate shit. Corporate shit is so much worse.

georgeburdell•5mo ago

For me, I can’t get into using AI tools like Claude Code. As far as I go is chat style where I’m mostly in control. I enjoy the actual process of crafting code myself. For similar reasons, I could never be a manager.

Agents are a boon for extraverts and neurotypical people. If it gets to the point where the industry switches to agents, I’ll probably just find a new career

starfallg•5mo ago

Pretty sure we can make LLM agents to transform declarative inputs to agentic action.

fsloth•5mo ago

I strongly disagree agents are for extroverts.

I do agree it’s definetly a tool category with a unique set of features and am not surprised it’s offputting to some. But it’s appeal is definetly clear to me as an introvert.

For me LLM:s are just a computer interface you can program using natural language.

I think I’m slightly ADD. I love coding _interesting_ things but boring tasks cause extreme discomfort.

Now - I can offload the most boring task to LLM and spend my mental energy on the interesting stuff!

It’s a great time to be a software engineer!

pron•5mo ago

> For me LLM:s are just a computer interface you can program using natural language.

I wish they were, but they're not that yet because LLMs aren't very good at logical reasonsing. So it's more like an attempt to program using natural language. Sometimes it does what you ask, sometimes not.

I think "programming" implies that the machine will always do what you tell it, whatever the language, or reliably fail and say it can't be done because the "program" is contradictory, lacks sufficient detail, or doesn't have the necessary permissions/technical capabilities. If it only sometimes does what you ask, then it's not quite programming yet.

> Now - I can offload the most boring task to LLM and spend my mental energy on the interesting stuff!

I wish that, too, were true, and maybe it will be someday soon. But if I need to manually review the agent's output, then it doesn't feel like offloading much aside from the typing. All the same concentration and thought are still required, even for the boring things. If I could at least trust the agent to tell me if it did a good job or is unsure that would have been helpful, but we're not even there yet.

That's not to say the tools aren't useful, but they're not yet "programming in a natural language" and not yet able to "offload" stuff to.

yomismoaqui•5mo ago

You can view Claude Code as a non-deterministic compiler where you input english and get functioning code on the other end.

The non-determinism is not as much as a problem because you are reading over the results and validating that what it is created matches what you tell it to do.

I'm not talking about vibe-coding here, I'm grabbing the steering wheel with both hands because this car allows me to go faster than if I was driving myself, but sometimes you have to steer or brake. And the analogy favors Claude Code here because you don't have to react in milliseconds while programming.

TL;DR: if you do the commit you are responsible for the code it contains.

pron•5mo ago

Sure, and that may be valuable, but it's neither "programming" nor "offloading mental effort" (at least not much).

Some have compared it to working with a very junior programmer. I haven't done that in a long while, but when I did, it didn't really feel like I was "offloading" much, and I could still trust even the most junior programmer to tell me whether the job was done well or not (and of any difficulties they encountered or insight they've learnt) much more than I can an agent, at least today.

Trust is something we have, for the most part, when we work with either other people or with tools. Working without (or with little) trust is something quite novel. Personally, I don't mind that an agent can't accomplish many tasks; I mind a great deal that I can't trust it to tell me whether it was able to do what I asked or not.

gf000•5mo ago

There is probably a significant factor of what domain you work in, that might be part of the reason why you have different experiences.

I don't know your current domain, but stuff like Loom requires very complex reasoning capabilities, most of which lives outside the actual codebase itself. Business code on the other hand often has a more direct mapping from idea to code, so LLMs might show more promise there.

fsloth•5mo ago

”it's neither "programming"

Sure it is.

Modern ecosystem is sadly full of API:s like WPF on Windows that are both verbose and configuration heavy. Now, some people may be able to internalize xaml with little effort but not all us - and then you basically move forward iteratively, looking for code example, trying this or that … basically random walking towards something usable.

Or you use an agentic LLM and it does this peeking and poking for you, and with decades old APIs like WPF likely has enough context to do the thing you asked it to do far more competently than you could train yourself to program WPF in a few days.

Of course in the context of this example WPF was your main duty, you _would_ learn the ins and outs of it.

In quite many jobs a task like this may not infact be even among your top ten duties but you need to do it.

In these sort of situations a LLM is really nice. The worst it provides is a good first guess how to do something.

If your contex is something like radiology treatment then no, don’t use LLM!

But there are thoushands of miserable non-critical but necessary components in production for which LLM is just fine.

keeda•5mo ago

> ... LLMs aren't very good at logical reasonsing.

I'm curious about what experiences led you to that conclusion. IME, LLMs are very good at the type of logical reasoning required for most programming tasks. E.g. I only have to say something like "find the entries with the lowest X and highest Y that have a common Z from these N lists / maps / tables / files / etc." and it spits out mostly correct code instantly. I then review it and for any involved logic, rely on tests (also AI-generated) for correctness, where I find myself reviewing and tweaking the test cases much more than the business logic.

But then I do all that for all code anyway, including my own. So just starting off with a fully-fleshed out chunk of code, which typically looks like what I'd pictured in my head, is a huge load off my cognitive shoulders.

pron•5mo ago

The experience was that I once asked an LLM to write a simple function and it produced something very wrong that nothing with good reasoning abilities should ever do. Of course, a drunk or very tired human could have done the same mistake, but they would have at least told me that they were impaired and unsure of their work.

I agree that most of the time it does most simple tasks mostly right, but that's not good enough to truly "offload" my mental effort. Again, I'm not saying it's not useful, but more than working with a junior developer it's like working with a junior developer who may or may not be drunk or tired and doesn't tell you.

But mostly my point is that LLMs seem to do logical reasoning worse than other things they do better, such as generating prose or summarising a document. Of course, even then you can't trust them yet.

> But then I do all that for all code anyway, including my own

I don't, at least not constantly. I review other people's code only towards the very end of a project, and in between I trust that they tell me about any pertinent challenge or insight, precisely so that I can focus on other things unless they draw my attention to something I need to think about.

I still think that working with a coding assistant is interesting and even exciting, but the experience of not being able to trust anything, for me at least, is unlike working with another person or with a tool and doesn't yet allow me to focus on other things. Maybe with more practice I could learn to work with something I can't trust at all.

kenjackson•5mo ago

What was the simple function?

throwaway31131•5mo ago

I’m not sure what their simple function was but I tried to use Claude to recreate C++ code to implement the algorithms in this paper as practice for me in LLM use and it didn’t go well. But I’ll be the first to admit that I’m probably holding it wrong.

https://users.cs.duke.edu/~reif/paper/chen/graph/graph.pdf

pron•5mo ago

Can't remember, but it was something very basic - a 10/15-line routine that a first-year student would write in 3 minutes if they knew the relevant API. The reason I asked the model in the first place is because I didn't know the API. If memory serves, the model inverted an if or a loop condition.

p1esk•5mo ago

Did you use one of the latest frontier reasoning models? If not, how is your experience relevant?

totallykvothe•5mo ago

In what world is this an appropriate thing to say to someone?

guappa•5mo ago

In the creed of "AI is perfect, if you claim otherwise you're broken" that so many here embrace.

p1esk•5mo ago

In the world where you do not claim that LLMs suck today based on your attempt to use some shitty model three years ago.

jama211•5mo ago

So you tried it once and then gave up?

pron•5mo ago

I didn't give up, I just know that I can only use a model when I have the patience to work with something I can't trust at all on anything. So that's what I do.

jama211•5mo ago

Sounds like the spirit of my question remains intact

darkerside•5mo ago

> working with a junior developer who may or may not be drunk or tired and doesn't tell you.

Bad news, friend.

Overall though, I think you're right. It's a lot like working with people. The things you might be missing are that you can get better at this with practice, and that once you are multiplexing multiple Claudes, you can become hyper efficient. These are things I'm looking into now.

Do I know these for a fact? Not yet. But, like any tool, I'm sure that the investment won't pay off right away.

foobarbecue•5mo ago

In your example, you didn't ask the LLM to do any logic. You asked it to translate your logic into code.

Asking an LLM to do logic would be saying something like: "I have a row of a million light switches. They all start off. I start at the beginning and flip on every fourth one. Then I flip on every eighth one, then sixteen, and all the powers of two until I'm over a million. Now I do the same for the powers of three, then four, then five, and so on. How many light switches are on at the end? Do not use any external coding tools for this; use your own reasoning."

Note that the prompt itself is intentionally ambiguous -- a human getting this question should say "I don't understand why you started with every fourth instead of every second. Are you skipping the first integer of every power series or just when the exponent is two?"

When I asked GPT5 to do it, it didn't care about that; instead it complimented me on my "crisp statement of the problem," roughly described a similar problem, and gave a belivable but incorrect answer 270,961 .

I then asked it to write python code to simulate my question. It got the code correct, and said "If you run this, you’ll see it matches the 270,961 result I gave earlier." except, that was a hallucination.

Running the code actually produced 252711.

I guess it went with 270,961 because that was a lexically similar answer to some lexically similar problems in the training data.

cma•5mo ago

I would posit that most randomly selected AGIs (people) you ask this of with no tools allowed (pencil and paper) won't get closer on average. This doesn't prove people don't use logic

And allowing python shells for both I think the randomly selected human would do worse on average.

And I also think e.g. Google's IMO model would be able to pass but I have no way to verify that.

foobarbecue•5mo ago

That's true about people with pencils and paper. But this is a computer, and my point is that with LLMs, you have a computer program that doesn't compute or do logic -- and for some reason people act like it does.

I mean, this is a super basic python task that I could write in 2 minutes. The naiive implementation is a couple of basic for loops. I wouldn't hire a junior dev that couldn't write it.

My point here is that for someone who can program, it's easier and better to write this in python than to write it in english and have an LLM translate it to python (and LLMs themselves are incapable of logic or reasoning, so the only option is for them to translate it to real code).

cma•5mo ago

I added this line to the end of your prompt so it knew it could clarify and didn't need an immediate answer:

"You can ask any clarifying questions first before starting on the problem."

Here's what it asked (Claude Opus, web search off so it didn't find your post):

> Looking at this problem, I want to make sure I understand the process correctly before diving into the solution. Let me clarify a few things:

> When you say "flip on every fourth one" for the powers of two, do you mean:

> We flip switches at positions 4, 8, 12, 16, 20, 24, ... (every 4th position) Then flip switches at positions 8, 16, 24, 32, 40, 48, ... (every 8th position) And so on?

> When you move to "powers of three", do you mean:

> Start with every 3rd switch (3, 6, 9, 12, ...) Then every 9th switch (9, 18, 27, 36, ...) Then every 27th switch, etc.?

> Or does "powers of three" mean we start with 3² = 9 as the first increment, similar to how we started with 2² = 4 for powers of two?

> When you say "then four, then five, and so on" - do you mean we continue this process for all integers k ≥ 2, using powers of k as our increments?

> The final state of each switch depends on whether it gets flipped an odd or even number of times total, so I want to make sure I understand the flipping pattern correctly before calculating the answer.

pron•5mo ago

At work, you can trust people to either get the right answer or tell you they may not have the right answer. If someone is not trustworthy, you don't work with them again. The experience of trying to work with something that is completely not trustworthy on all fronts is novel and entirely dissimilar to working with either people or tools.

keeda•5mo ago

People themselves don't know when they are wrong, and that is why high-functioning organizations have all sorts of guardrails in place. Trivial example, code reviews. Now, code reviews are multi-purpose, and their primary benefit is not just catching bugs, but they do catch bugs pretty often (there are actual studies about this.)

So my experience in working with AI is actually much more similar to working with people, except I have to correct the AI much less frequently.

I always say, AI is technology that behaves like people, and so the trick to working with it effectively is to approach it like working with a colleague, with all their specific quirks and skillsets, rather than a tool.

keeda•5mo ago

That may be the case, but how often do you have to provide these kinds of puzzles as programming tasks? And even then, by your own example, it did product correct code, so it did "reason" through the problem statement!

foobarbecue•5mo ago

The point I'm making is that translating from english to python does not involve logical reasoning or any form of problem solving. You essentially replace english phrases with their python equivalents; it's a fuzzy version of looking things up in a dictionary.

It can translate, but it cannot reason or calculate. The python can calculate.

keeda•5mo ago

I've seen countless examples where it does do a lot of logical reasoning from a very high-level requirement. I only gave that example because that is the more common form of programming logic required, but there is a huge range of logical and coding tasks it handles seamlessly. It figures out the problem, the high-level approach, the appropriate algorithms, any intermediate book-keeping data structures required, any edge cases, what functions and classes from the contextual code matters, and how it should integrate with the rest of the codebase.

Here is one recent exchange (some proprietary details redacted) with Claude Sonnet 3.7 via the Cody VSCode Extension (RIP):

---

Me: At this point, "segments" represents a list of bounding boxes that can overlap. Write a function to coalesce only those boxes that are substantially (i.e. within some tolerance) horizontally aligned along one or both edges or their centers, either: 1) overlapping completely or partially; or 2) very closely spaced like the words of a sentence should be (you can assume the threshold for spacing could be a reasonable fraction of the height of the box).

Claude: I'll create a function to coalesce horizontally aligned boxes based on your requirements. This function will identify boxes that are either overlapping or closely spaced and horizontally aligned (by top edge, bottom edge, or vertical center).

```javascript function coalesceHorizontallyAlignedSegments(segments, options = {}) { // Default options const { verticalAlignmentTolerance = 0.1, // 10% tolerance for vertical alignment horizontalSpacingThreshold = 0.5, // Max horizontal spacing as a fraction of box height overlapThreshold = 0.2 // Minimum overlap required (20% of the smaller box) } = options;

  if (!segments || segments.length <= 1) {
    return segments;
  }

  // Sort segments by x position (left to right)
  const sortedSegments = [...segments].sort((a, b) => a.x - b.x);

  // Track which segments have been merged
  const merged = new Array(sortedSegments.length).fill(false);
  const mergedSegments = [];

  for (let i = 0; i < sortedSegments.length; i++) {
    // Skip if this segment has already been merged
    if (merged[i]) continue;

    const currentSegment = sortedSegments[i];
    let mergedSegment = { ...currentSegment };
    merged[i] = true;

    // Check for potential merges with subsequent segments
    let didMerge = true;

    while (didMerge) {
      didMerge = false;

      for (let j = 0; j < sortedSegments.length; j++) {
        if (merged[j]) continue;

        const nextSegment = sortedSegments[j];

        // Check if horizontally aligned (top, bottom, or center)
        const topAligned = Math.abs(mergedSegment.y - nextSegment.y) <=
                           verticalAlignmentTolerance * Math.min(mergedSegment.height, nextSegment.height);

        const bottomAligned = Math.abs((mergedSegment.y + mergedSegment.height) -
                              (nextSegment.y + nextSegment.height)) <=
                              verticalAlignmentTolerance * Math.min(mergedSegment.height, nextSegment.height);

        const centerAligned = Math.abs((mergedSegment.y + mergedSegment.height/2) -
                              (nextSegment.y + nextSegment.height/2)) <=
                              verticalAlignmentTolerance * Math.min(mergedSegment.height, nextSegment.height);

        const isVerticallyAligned = topAligned || bottomAligned || centerAligned;

        if (!isVerticallyAligned) continue;

        // Check for overlap or close spacing
        const mergedRight = mergedSegment.x + mergedSegment.width;
        const nextLeft = nextSegment.x;

        // Calculate horizontal spacing
        const horizontalSpacing = nextLeft - mergedRight;
        const maxAllowedSpacing = horizontalSpacingThreshold * Math.min(mergedSegment.height, nextSegment.height);

        // Check for overlap
        const isOverlapping = mergedRight >= nextLeft;

        // Calculate overlap percentage if overlapping
        let hasSignificantOverlap = false;
        if (isOverlapping) {
          const overlapWidth = Math.min(mergedRight, nextSegment.x + nextSegment.width) -
                              Math.max(mergedSegment.x, nextSegment.x);
          const smallerWidth = Math.min(mergedSegment.width, nextSegment.width);
          hasSignificantOverlap = overlapWidth / smallerWidth >= overlapThreshold;
        }

        // Merge if overlapping significantly or closely spaced
        if (hasSignificantOverlap || (horizontalSpacing >= 0 && horizontalSpacing <= maxAllowedSpacing)) {
          // Create a new bounding box that encompasses both segments
          const newX = Math.min(mergedSegment.x, nextSegment.x);
          const newY = Math.min(mergedSegment.y, nextSegment.y);
          const newRight = Math.max(mergedSegment.x + mergedSegment.width,
                                   nextSegment.x + nextSegment.width);
          const newBottom = Math.max(mergedSegment.y + mergedSegment.height,
                                    nextSegment.y + nextSegment.height);

          mergedSegment = {
            x: newX,
            y: newY,
            width: newRight - newX,
            height: newBottom - newY,
            // Preserve the virtual path from the leftmost segment
            imagePath: mergedSegment.x <= nextSegment.x ? mergedSegment.imagePath : nextSegment.imagePath
          };

          merged[j] = true;
          didMerge = true;
          break;
        }
      }
    }

    // Add the topLeft and bottomRight properties for consistency
    mergedSegment.topLeft = { x: mergedSegment.x, y: mergedSegment.y };
    mergedSegment.bottomRight = {
      x: mergedSegment.x + mergedSegment.width,
      y: mergedSegment.y + mergedSegment.height
    };

    mergedSegments.push(mergedSegment);
  }

  return mergedSegments;

} ```

This function works as follows:

1. It sorts segments from left to right to process them in reading order 2. For each segment, it looks for potential merges with other segments based on: - Vertical alignment (top edges, bottom edges, or centers within tolerance) - Either significant overlap (controlled by overlapThreshold) or - Close horizontal spacing (controlled by horizontalSpacingThreshold as a fraction of box height) 3. When segments are merged, it creates a new bounding box that encompasses both 4. It continues merging until no more merges are possible

You can integrate this function into your segmentation pipeline by adding it after the segment extraction:

```javascript function segmentImage(ctx, imageData, options = {}) { // ... existing code ...

  const segments = boundingRects.map(toSegment);

  // Coalesce horizontally aligned segments
  const coalescedSegments = coalesceHorizontallyAlignedSegments(segments, options.coalesceOptions);

  return { segments: coalescedSegments, contourTrees };

} ```

Me: Do not write any code yet, but would it help if the segments were first grouped into approximate "rows" and then coalesced, rather than simply sorted by the x-coordinate?

Claude: Yes, grouping segments into approximate "rows" before coalescing would be significantly more effective than simply sorting by x-coordinate. Here's why:

1. *Better handling of multi-column layouts*: When a page has multiple columns of text or elements, sorting purely by x-coordinate would attempt to merge segments across different columns if they happen to be horizontally aligned.

2. *Respecting natural reading flow*: Text typically flows in rows, and grouping by rows first would respect this natural organization of content.

3. *More accurate merging decisions*: By first establishing which segments belong to the same "row" or line of content, we can make more informed decisions about which segments should be considered for merging.

4. *Handling of varying vertical positions*: Text lines on a webpage aren't always perfectly aligned. Pre-grouping into rows would allow us to handle slight variations in vertical positioning within what should be considered the same line.

5. *Efficiency*: By limiting the coalescing operations to segments within the same row, we reduce the number of comparisons needed.

The approach would work like this: 1. Group segments into rows based on vertical overlap or proximity 2. Within each row, sort segments by x-coordinate 3. Apply the coalescing logic only to segments within the same row

This approach would better preserve the semantic structure of the content while still merging segments that represent parts of the same logical unit (like words in a sentence).

Looking at the codebase, this approach would align well with how visual content is typically structured and would produce more meaningful segment groupings for downstream processing.

---

And then it proceeded to update the code as discussed. Sure this is not a very novel problem and it required a bit of back and forth, but look at the ratio of prompt to code. This exchange took a couple of minutes; I'd estimate it would have taken me an hour to get that code down with all edge cases handled. Look at the exposition, the quality of code, the choice to use optional parameters for tolerances, and edge-case handling.

It's very, very hard for me to not see this as reasoning. I mean, how is this not mind-blowing?

__MatrixMan__•5mo ago

I'm not who you're replying to but I had a scenario where I needed to notice that a command had completed (exit code received) but keep listening for any output that was still buffered and only stop processing tokens after it had been quiet for a little bit.

Trying to get Claude to do this without introducing a deadlock and without exiting too early and leaving valuable output in the pipe was hellish.

It's very good at some kinds of reasoning and very bad at others. There's not much it's mediocre at.

keeda•5mo ago

> It's very good at some kinds of reasoning and very bad at others. There's not much it's mediocre at.

Yes, one of the terms for this phenomenon is "jagged frontier" from here:

https://www.oneusefulthing.org/p/centaurs-and-cyborgs-on-the...

This is also the main reason people have such diverging experiences and the discussion around AI is so bimodal in nature.

jpfromlondon•5mo ago

https://arstechnica.com/ai/2025/08/researchers-find-llms-are...

nostrademons•5mo ago

So here's a transcript of a recent Claude Code interaction I had, trying to build an MP3 organizer that relies on heuristics and fuzzy matching to make sense out of all the random MP3s I downloaded in the 90s. This all happened in-between prompts: I gave it a correct example and an incorrect example and asked it why it was giving the incorrect example.

● Let me trace through what happens with the "Jordan Rudess" example:

  1. Directory: Jordan Rudess/
  2. file_rel_path would be Path("Jordan Rudess") (after removing MP3 Disk prefix)
  3. If Jordan Rudess matches in path component 0, scope_path would be Path("Jordan Rudess")
  4. _path_is_in_scope checks if "Jordan Rudess".startswith("Jordan Rudess") → True

  This looks correct. Let me check the inference logic:

● Read(mp3_organizer_v2.py) ⎿ Read 15 lines (ctrl+r to expand)

● The inference priority looks correct - path_component_match is indeed the highest priority. But I think I see the issue now. Let me look at the actual logic flow in the main organize method:

● Read(mp3_organizer_v2.py) ⎿ Read 10 lines (ctrl+r to expand)

● The flow looks right. Let me examine the actual problem more carefully. Looking back at the _apply_path_component_matching method, I think there might be an issue with when the matching happens. Let me check the logic:

● Update Todos ⎿ Check _path_is_in_scope logic for correctness Verify artist inference priority logic works with path_component_match Test and fix any remaining issues with artist propagation

● I think I see the issue now. The problem is that the current code only looks for matches in files that already have file_id3 tags, but it needs to propagate those matches to all files in the directory. However, the current implementation is still only applying matches to files that have matching candidates, not ALL files. Let me check the exact logic:

● Read(mp3_organizer_v2.py) ⎿ Read 8 lines (ctrl+r to expand)

● Actually, looking at this more carefully, the logic should work. The matched_artists dictionary contains the matches found from ANY file that had a file_id3 tag matching a path component, and then it applies that match to ALL files in the file_infos list via the loop at line 737.

  Wait, let me understand the scoping issue better. Let me check what file_rel_path and scope_path would actually be in your example:

● Read(mp3_organizer_v2.py) ⎿ Read 5 lines (ctrl+r to expand)

● Ah! I think I see a potential issue. The file_rel_path is calculated as the directory path, not the individual file path. So for all files in the same directory, file_rel_path will be the same. This should actually work correctly.

  But let me double-check the scope matching logic. If the directory is "Jordan Rudess" and the scope_path is also "Jordan Rudess", then _path_is_in_scope should return
  True.

  Let me debug this by adding some debug output to see what's actually happening. But first, let me look at a potential issue: maybe the match isn't being found in the first place?

It's doing a pretty complex chain of step-by-step reasoning here, looking in detail at the code. It ended up by printing out more debug info and having me re-run it again, then paste in the debug info, then add even more debug info and run it again. But it did eventually get the bug, which was non-trivial to identify (it was an aliasing problem where Claude was mutating state on a list and that list was shared with all the other files in the directory).

Applejinx•5mo ago

Huh. Alternate explanation: there's a layer of indirection, drawing upon the unthinkable size of the source data, so rather than 'issue forth tokens as if there is a person answering a question', you've got 'issue forth tokens as if there is a person being challenged to talk about their process', something that's also in the training data but in different contexts.

I'm not sure statements of 'aha, I see it now!' are meaningful in this context. Surely this is just the em-dash of 'issue tokens to have the user react like you're thinking'?

nostrademons•5mo ago

I wonder if something else is going on, and perhaps Claude is using the LLM to identify the likely culprits within the codebase, sending the code around them to execute with an actual Python interpreter on their servers, feeding both the code and the result as the context window to another LLM query with a system prompt something like "What is this code doing, when it runs on this input and this output?", feeding the result of that back to the user, and then repeating as long as the overall bug remains unsolved. I've found that feedback is a very effective technique with LLMs, asking them to extract some data, testing that data through out-of-band mechanisms, then feeding the test results and the original context back into the LLM to explain its reasoning and why it got the result. The attention mechanisms in the transformer model function very well when they're prompted with specifics and asked to explain their reasoning.

Only an Anthropic engineer would know for sure. I'm pretty sure that it was making multiple queries on my behalf during the chat transcript - each "Read ... mp3organizer_v2.py" is a separate network round-trip.

thewebguyd•5mo ago

> For me LLM:s are just a computer interface you can program using natural language.

Sort of. You still can't get a reliable output for the same input. For example, I was toying with using ChatGPT with some Siri shortcuts on my iPhone. I do photography on the side, and finding a good time for lighting for photoshoots is a usecase I use a lot so I made a shortcut which sends my location to the API along with a prompt to get the sunset time for today, total amount of daylight, and golden hour times.

Sometimes it works, sometimes it says "I don't have specific golden hour times, but you can find those on the web" or a useless generic "Golden hour is typically 1 hour before sunset but can vary with location and season"

Doesn't feel like programming to me, as I can't get reproducible output.

I could just use the LLM to write some API calling script from some service that has that data, but then why bother with that middle man step.

I like LLMs, I think they are useful, I use them everyday but what I want is a way to get consistent, reproducible output for any given input/prompt.

stirfish•5mo ago

For things where I don't want creativity, I tell it to write a script.

For example, "write a comprehensive spec for a script that takes in the date and a location and computes when golden hour is." | "Implement this spec"

That variability is nice when you want some creativity, e.g. "write a beautiful, interactive boids simulation as a single file in html, css, and JavaScript."

Words like "beautiful" and interactive" are open to interpretation, and I've been happy with the different ways they are interpreted.

klipklop•5mo ago

>I think I’m slightly ADD. I love coding _interesting_ things but boring tasks cause extreme discomfort. >Now - I can offload the most boring task to LLM and spend my mental energy on the interesting stuff!

I agree and I feel that having LLM's do boilerplate type stuff is fantastic for ADD people. The dopamine hit you get making tremendous progress before you get utterly bored is nice. The thing that ADD/ADHD people are the WORST at is finishing projects. LLM will help them once the thrill of prototyping a green-field project is over.

80hd•5mo ago

Seconding this. My work has had the same problem - by the time I've got things all hooked up, figured out the complicated stuff - my brain (and body) clock out and I have to drag myself through hell to get to 100%. Even with ADHD stimulant medication. It didn't make it emotionally easier, just _possible_ lol.

LLMs, particularly Claude 4 and now GPT-5 are fantastic at working through these todo lists of tiny details. Perfectionism + ADHD not a fun combo, but it's way more bearable. It will only get better.

We have a huge moat in front of us of ever-more interesting tasks as LLMs race to pick up the pieces. I've never been more excited about the future of tech

r_lee•5mo ago

Same here, especially for making bash scripts or lots of if this if that with logging type stuff, error handling etc..

Oh and also, from what I know, ADHD and perfectionism is a very common combination, I'm not sure if everyone has that but I've heard it's the case for many with ADD. Same with "standards" being extremely high for everything

whartung•5mo ago

I'm kind of in this cohort. While in the groove, yea, things fly but, inevitably, my interest wanes. Either something too tedious, something too hard (or just a lot of work). Or, just something shinier shows up.

Bunch of 80% projects with, as you mentioned, the interesting parts finished (sorta -- you see the line at the end of the tunnel, it's bright, just don't bother finishing the journey).

However, at the same time, there's conflict.

Consider (one of) my current projects, I did the whole back end. I had ChatGPT help me stand up a web front end for it. I am not a "web person". GUIs and what not are a REAL struggle for me because on the one hand, I don't care how things look, but, on the other, "boy that sure looks better". But getting from "functional" to "looks better" is a bottomless chasm of yak shaving, bike shedding improvements. I'm even bad at copying styles.

My initial UI was time invested getting my UI to work, ugly as it was, with guidance from ChatGPT. Which means it gave me ways to do things, but mostly I coded up the actual work -- even if it was blindly typing it in vs just raw cut and paste. I understood how things were working, what it was doing, etc.

But then, I just got tired of it, and "this needs to be Better". So, I grabbed Claude and let it have its way.

And, its better! it certainly looks better, more features. It's head and shoulders better.

Claude wrote 2-3000 lines of javascript. In, like, 45m. It was very fast, very responsive. One thing Claude knows is boiler plate JS Web stuff. And the code looks OK to me. Imperfect, but absolutely functional.

But, I have zero investment in the code. No "ownership", certainly no pride. You know that little hit you get when you get Something Right, and it Works? None of that. Its amazing, its useful, its just not mine. And that's really weird.

I've been striving to finish projects, and, yea, for me, that's really hard. There is just SO MUCH necessary to ship. AI may be able to help polish stuff up, we'll see as I move forward. If nothing else it may help gathering up lists of stuff I miss to do.

brailsafe•5mo ago

Ironically, I find greenfield projects the least stimulating and the most rote, aside from thinking about system design.

I've always much preferred figuring out how to improve or build on existing messy systems and codebases, which is certainly aided by LLMs for big refactoring type stuff, but to be successful at it requires thinking about how some component of a system is already used and the complexity of that. Lots of edge cases and nuances, people problems, relative conservativeness.

skc•5mo ago

Looks like the definition of boilerplate will continue to shift up the chain

WhyOhWhyQ•5mo ago

It's interesting that every task in the world is boring to somebody, which means nothing left in the world will be done by those interested in it, because somebody will gladly shotgun it with an AI tool.

sleepybrett•5mo ago

Programming implies that it's going to do what i say. I wish it did.

kiitos•5mo ago

> For me LLM:s are just a computer interface you can program using natural language. ... boring tasks cause extreme discomfort ... Now - I can offload the most boring task to LLM and spend my mental energy on the interesting stuff!

The problem with this perspective, is that when you try to offload exactly the same boring task(s), to exactly the same LLM, the results you get back are never even close to being the same. This work you're offloading via natural language prompting is not programming in any meaningful sense.

Many people don't care about this non-determinism. Some, because they don't have enough knowledge to identify, much less evaluate, the consequent problems. Others, because they're happy to deal with those problems, under the belief that they are a cost that's worth the net benefit provided by the LLM.

And there are also many people who do care about this non-determinism, and aren't willing to accept the consequent problems.

Bluntly, I don't think that anyone in group (1) can call themselves a software engineer.

amelius•5mo ago

Last week there was this post about flow state, and pretty much explains the issue:

https://news.ycombinator.com/item?id=44811457

vlan0•5mo ago

What is described in that post is a skill issue. The attention issues spoke about can be worked through. Because attention is a skill like anything else. And if one does not learn to practice. One becomes victim to it

lsaferite•5mo ago

I find Claude great at all of the boilerplate needed to get testing in place. It's also pretty good at divining test cases to lock in the current behavior, even if it's buggy. I use Claude as a first pass on tests, then I run through each test case myself to make sure it's a meaningful test. I've let it loose on the code coverage loop as well, so it can drill in and get those uncommon lines covered. I still don't have a good process for path coverage, but I'm not sure how easy that is in go as I haven't checked into it much yet.

I'm with you 100% on the boring stuff. It's generally good at the boring stuff *because* it's boring and well-trod.

Terretta•5mo ago

> I can’t get into using AI tools like Claude Code. As far as I go is chat style where I’m mostly in control.

Try aider.chat (it's in the name), but specifically start with "ask" mode then dip a toe into "architect" mode, not "code" which is where Claude Code and the "vibe" nonsense is.

Let aider.chat use Opus 4.1 or GPT-5 for thinking, with no limit on reasoning tokens and --reasoning-effort high.

> agents are a boon for extraverts and neurotypical people.

On the contrary, I think the non-vibe tools are force multipliers for those with an ability to communicate so precisely they find “extraverts and neurotypical people” confounding when attempting to specify engineering work.

I'd put both aider.chat and Claude Code in the non-vibe class if you use them Socratically.

dionian•5mo ago

thanks for this, going to try it out - i need to use paid api and not my claude max or gpt pro subn, right?

victorbjorklund•5mo ago

Aider actually has a mode that is called "copy-paste" where it basically gives you a context to paste in an LLM chat and then you copy-paste back the reply to aider

mock-possum•5mo ago

Sounds like extra steps

garciasn•5mo ago

Agents are boon for introverts who fucking hate dealing with other people (read: me). I can iterate rapidly with another 'entity' in a technical fashion and not have to spend hours explaining in relatable language what to do next.

I feel as if you need to work with these things more, as you would prefer to work, and see just how good they are.

wahnfrieden•5mo ago

You are leaving a lot of productivity on the table by not parallelizing agents for any of your work. Seemingly for psychological comfort quirks rather than earnestly seeking results.

Automation productivity doesn’t remove your own agency. It frees more time for you to apply your desire for control more discerningly.

dexterlagan•5mo ago

I can imagine there are plenty of use cases, but I could not find one for myself. Can you give an example?

wahnfrieden•5mo ago

I personally use it for a lot of SwiftUI work. I parallelize it across at least 3 projects at once. I use only the largest models on highest thinking modes. I give instruction on implementation and provide reference implementations.

I also use it for adding features/feature polish that address user pain points but that I can't prioritize for my own manual work just yet. There are a variety of user requests which sometimes LLMs are able to get done very suddenly when I give it a shot quickly. For these tasks, it's ok to abandon & defer them if the LLM spins its wheels.

bastawhiz•5mo ago

> Agents are a boon for extraverts and neurotypical people

As a neurodivergent introvert, please don't speak for the rest of us.

antonvs•5mo ago

That stuck out to me as well. People will make up all sorts of stories to justify their resistance to change.

bastawhiz•5mo ago

It's the same as saying that writing good commit messages is a boon for extroverts and neurotypicals. It's a computer. You're giving it instructions, and the only difference to traditional coding is that the input is English text.

kevinsync•5mo ago

I kind of think we will see some industry attrition as a result of LLM coding and agent usage, simply because the ~vIbEs~ I'm witnessing boil down to quite a lot of resistance (for multiple reasons: stubbornness, ethics, exhaustion from the hype cycle, sticking with what you know, etc)

The thing is, they're just tools. You can choose to learn them, or not. They aren't going to make or break your career. People will do fine with and without them.

I do think it's worth learning new tools though, even if you're just a casual observer / conscientious objector -- the world is changing fast, for better or worse, and you'll be better prepared to do anything with a wider breadth of tech skill and experience than with less. And I'm not just talking about writing software for a living, you could go full Uncle Ted and be a farmer or a carpenter or a barista in the middle of nowhere, and you're going to be way better equipped to deal with logistical issues that WILL arise from the very nature of the planet hurtling towards 100% computerization. Inventory management, crop planning, point of sale, marketing, monitoring sensors on your brewery vats, whatever.

Another thought I had was that introverts often blame their deficits in sales, marketing and customer service on their introversion, but what if you could deploy an agent to either guide, perform, or prompt (the human) with some of those activities? I'd argue that it would be worth the time to kick the tires and see what's possible there.

It feels like early times still with some of these pie in the sky ideas, but just because it's not turn-key YET doesn't mean it won't be in the near future. Just food for thought!

HardCodedBias•5mo ago

"ethics"

I agree with all of your reasons but this one sticks out. Is this a big issue? Are many people refusing to use LLMs due to (I'm guessing here): perceived copyright issues, or power usage, or maybe that they think that automation is unjust?

kevinsync•5mo ago

I can't tell how widespread any of is, to be honest.. mostly because it's anecdata, and impossible to determine if what I'm seeing is just ragebait, or shallow dunks by reply-guys in comment sections, or particularly-loud voices on social media that aren't representative of the majority opinion, etc

That said, the amount of sort-of-thoughtless, I'm-just-repeating-something-I-heard-but-don't-really-understand outrage towards AI that I'm seeing appears to be increasing -- "how many bottles of water did that slop image waste??", "Clanker"-adjacent memes and commentary (include self-driving + robots in this category), people ranting about broligarchs stealing art, music, movies, books to train their models (oddly often while also performatively parroting party lines about how Spotify rips artists off), all the way to refusing to interact with people on dating apps if they have anything AI in their profiles hahaha (file "AI" alongside men holding fish in their pics, and "crypto" lol)

It's all chronically-online nonsense that may well just be perception that's artificially amplified by "the algorithm".

Me, I have no fundamental issue with any of it -- LLMs, like anything else, aren't categorically good or bad. They can be used positively and negatively. Everything we use and consume has hidden downsides and unsavory circumstances.

codingdave•5mo ago

Yes, people are refusing for those reasons. I don't know how many, but I'd say about half of the the people I know who do not work in tech are rejecting AI, with ethics being the primary reason. That is all just anecdata, but I suspect the tech bubble around AI is making people in tech underestimate how many people in the world simply are not interested in it being part of their lives.

cpldcpu•5mo ago

I think you misunderstand what this does. It is not only a coding agent. It is an abstraction layer between you and the computer.

klipklop•5mo ago

>Agents are a boon for extraverts and neurotypical people.

I completely disagree. Juggling several agents (and hopping from feature-to-feature) at once, is perfect for somebody with ADHD. Being an agent wrangler is great for introverts instead of having to talk to actual people.

filoleg•5mo ago

> Agents are a boon for extraverts and neurotypical people.

This sounds like a wild generalization.

I am in neither of those two groups, and I’ve been finding tools like Claude Code becoming increasingly more useful over time.

Made me much more optimistic about the direction of AI development in general too. Because with each iteration and new version it isn’t getting anywhere closer to replacing me or my colleagues, but it is becoming more and more useful and helpful to my workflow.

And I am not one of those people who are into “prompt engineering” or typing novels into the AI chatbox. My entire interaction is typically short 2-3 sentences “do this and that, make sure that XYZ is ABC”, attach the files that are relevant, let it do its thing, and then manual checks/adjustments. Saves me a boatload of work tbh, as I enjoy the debugging/fixing/“getting the nuanced details right” aspect of writing code (and am pretty decent at it, I think), but absolutely dread starting from a brand new empty file.

joshred•5mo ago

I think they're fantastic at generating the sort of thing I don't like writing out. For example, a dictionary mapping state names to their abbreviations, or extracting a data dictionary from a pdf so that I can include it with my documentation.

burnte•5mo ago

> Agents are a boon for extraverts and neurotypical people.

As an extrovert the chances I'll use an AI agent in the next year is zero. Not even a billion to one but a straight zero. I understand very well how AI works, and as such I have absolutely no trust in it for anything that isn't easy/simple/solved, which means I have virtually no use for generative AI. Search, reference, data transformation, sure. Coding? Not without verification or being able to understand the code.

I can't even trust Google Maps to give me a reliable route anymore, why would I actually believe some AI model can code? AI tools are helpers, not workers.

ragequittah•5mo ago

>no trust in it for anything that isn't easy/simple/solved

I'm not sure what part of programming isn't generally solved thousands of times over for most languages out there. I'm only using it for lowly web development but I can tell you that it can definitely do it at a level that surprises me. It's not just "auto-complete" it's actually able to 'think' over code I've broken or code that I want improved and give me not just one but multiple paths to make it better.

burnte•5mo ago

In the case of programming is not quite as problematic with unsolved problems as much as others, like completeness. In the case of programming, it's context and understanding. It's great for small chunks of code but people think you can vibe code entire interactive applications with no programming knowledge, but LLMs simply don't understand, so they can't keep a cohesive idea of what the end goal is in mind. The larger the codebase it needs to work on the more likely it is to make catastrophic errors, create massive security flaws, or just generate nonfunctional code.

Programming LLMs will become awesome when we create more narrowly targeted LLMs rather than these "train on everything" models.

taftster•5mo ago

For me (an introvert), I have found great value in these tools. Normally, I kind of talk to myself about a problem / algorithm / code segment as I'm fleshing it out. I'm not telling myself complete sentences, but there's some sort of logical dialog I am having with myself.

So I just have to convert that conversation into an AI prompt, basically. It just kind of does the typing for the construct already in my head. The trick is to just get the words out of my head as prompt input.

That's honestly not much different than an author writing a book, for example. The story line is in their head, they just have to get it on paper. And that's really the tricky part of writing a novel as much as writing code.

I therefore don't believe this is an introvert/extrovert thing. There are plenty of book authors which are both. The tools available as AI code agents are really just an advanced form of dictation.

sixo•5mo ago

At one point in my life I liked crafting code. I took a break, came back, and I no longer liked it--my thoughts ranged further, and the fine-grained details of implementations were a nuisance rather than ~pleasurable to deal with.

Whatever you like is probably what you should be doing right now. Nothing wrong with that.

wredcoll•5mo ago

> Agents are a boon for extraverts and neurotypical people.

Please stop with this kind of thing. It isn't true, it doesn't make sense and it doesn't help anyone.

MrDarcy•5mo ago

For what it’s worth I’m neurodivergent, introverted and have avoided management up to the staff+level. Claude Code is great I use it all day every day now.

mock-possum•5mo ago

It is effin nutzo that you would try to relate chatting with AI and agentic LLM codegen workflows to the intra/extra vert dichotomy or to neuro a/typicality - you so casually lean way into this absolute spectrum that I don’t even think associates the way you think it does, and it’s honestly kind of unsettling, like - what do you think you know about me, and about My People, that apparently I don’t know??

If it doesn’t work for you that’s fine, but turning it into some tribalised over-generalization is just… why, why would you do that, who is that kind of thing useful for??

block_dagger•5mo ago

I bet your code sucks in quality and quantity compared to the senior+ engineer who uses the modern tools. My code certainly did even after 20 years of experience, much of that as senior/staff level at well paying companies.

indigodaddy•5mo ago

Not sure if I'd want Claude doing whatever on a production vps/node, but I like the idea of a way to use Claude Code on the go/wherever you are. I'm going to setup KASM workspaces on my free OCI server and see how it works there.

https://hub.docker.com/r/linuxserver/kasm

prashantsengar•5mo ago

Thanks for sharing this! I have been trying on and off to run RooCode on a VPS to use it on the go. I tried Code Server but it does not share "sessions". KASM seems interesting for this. Do share if you write a blog post on setting it up

indigodaddy•5mo ago

It’s pretty straightforward through the Linuxserver docker image deployment. I have some notes here re: configuration and package persistence strategy via brew:

https://gist.githubusercontent.com/jgbrwn/28645fcf4ac5a4176f...

Lerc•5mo ago

This is the kind of thing people should be doing with AI. Weird and interesting stuff that has a "Let's find out!" Attitude.

Often there's as much to be learned from why it doesn't work.

I see the AI hype to be limited to a few domains.

People choosing to spend lots of money on things speculatively hoping to get a slice of whatever is cooking, even if they don't really know if it's a pie or not.

Forward looking imagining of what would change if these things get massively better.

Hyperbolic media coverage of the above two.

There are companies taking about adding AI for no other reason than they feel like that's what they should be doing, I think that counts as a weak driver of hype, but only because cumulatively, lots of companies are doing it. If anything I would consider this an outcome of hype.

Of these the only one that really affects me is AI being shoehorned into places it shouldn't

The media coverage stokes fires for and against, but I think it only changes the tone of annoyance I have to endure. They would do the same on another topic in the absence of AI. It used to be crypto,

I'm ok with people spending money that is not mine on high risk, high potential reward. It's not for me to judge how they calculate the potential risk or potential reward. It's their opinion, let them have it.

The weird thing I find is the complaints about AI hype dominating. I have read so many pieces where the main thrust of their argument is about the dominance of fringe viewpoints that I very rarely encounter. Frequently they take the stance that anyone imagining how the world might change from any particular form of AI as a claim that that form is inevitable and usually imminent. I don't see people making those claims.

I see people talking about what they tried, what they can do, and what they can't do. Everything they can't do is then held up by others as if it were a trophy and proof of some catestrophic weakness.

Just try stuff, have fun, if that doesn't interest you, go do something else. Tell us about what you are doing. You don't need to tell us that you aren't doing this particular thing, and why. If you find something interesting tell us about that, maybe we will too.

dizlexic•5mo ago

every vibe coded thing I've built is trash, but it's amazingly fun to do.

I've tried to explain it to other devs that it's like dumping out a 10000 piece jigsaw puzzle and trying to put it together again.

it's just fun.

throwaway31131•5mo ago

> it’s just fun

For some definitions of fun… :)

mmcconnell1618•5mo ago

There was a time when everyone hand-coded HTML. Then came Macromedia Dreamweaver and Microsoft FrontPage which promised a WYSIWYG experience. No one would ever need to "learn HTML and CSS" because the tool could write it for them. Those tools could crank out a website in minutes.

When those tools created some awful, complex and slow output, only the people who knew HTML could understand why it wasn't working and fix things.

Vibe coding is in a similar place. It demos really well. It can be powerful and allows for quick iteration on ideas. It works, most of the time. Vibe coding can produce some really terrible code that is not well architected and difficult to maintain. It can introduce basic logic errors that are not easily corrected through multiple prompts back to the system.

I don't know if they will ever be capable of creating production quality systems on par with what senior engineers produce or if they will only get incrementally better and remain best for prototypes and testing ideas.

cesarvarela•5mo ago

It is addicting

bkettle•5mo ago

> “wow, I really can do _anything_ if I can just figure out how

Except this time it’s “if I can just figure out how and pay for the Claude API usage”.

This is one of the sadder things about AI usage getting more standard that I haven’t seen discussed much—-the barrier to entry is now monetary rather than just knowledge-based, which will make it _much_ harder for young people with no money to pick up.

Yes, they can still write code the manual way, but if the norm is to use AI I suspect that beginner’s guides, tutorials, etc. will become less common.

mickael-kerjean•5mo ago

Yep, I used to spend a lot of time learning PHP on a web server which was part of my internet subscription. Without it being free, I would never have learn how to create websites and would have never got in programming, the trigger was that free web hosting with PHP that was part of the internet connection my parents were already paying for

alwillis•5mo ago

There are plenty of free models available; many that rival their paid counterparts.

A kid interested in trying stuff can use Qwen Coder for free [1].

If the kid's school has Apple Silicon Macs (or iPads), this fall, each one of them will have Apple's 3 billion parameter Foundation Models available to them for free [2].

Swift Playground [3] is a free download; Apple has an entire curriculum for schools. I would expect an upgrade to incorporate access to the on-board LLM

[1]: https://openrouter.ai/qwen/qwen3-coder:free

[2]: https://developer.apple.com/videos/play/wwdc2025/286

[3]: https://developer.apple.com/swift-playground/

yoz-y•5mo ago

I guess hardware being able to run a local model will eventually get cheap enough, but for a lot of people even buying an Apple device or something with a good enough GPU is prohibitive.

PeterStuer•5mo ago

True, it will get cheap to run today's frontier models. But, by that time, how much more advanced will the frontier models of that time be.

It is a real question. It all depends on whether the AI future is linear or exponential.

hadlock•5mo ago

I think we are already there. You can run a pretty ok LLM on a 4gb raspberry pi that will write most any simple 20-150 line bash script today, or toy application in python/rust. Old laptops pulled out of the trash are probably capable of running smaller LLMs and can explain how functions work. They're no claude code but you probably want a rough-around-the-edges LLM that can't do everything for you, if you're planning on using it to learn to code.

socalgal2•5mo ago

Swift and swift playground might be a good introduction to programming, but it feels likely not to lead to as many opportunities as a more popular system. And I don’t just mean job opportunities.

andai•5mo ago

Speaking of free models on OpenRouter, DeepSeek R1 0528 is also available for free.

https://openrouter.ai/deepseek/deepseek-r1-0528:free

pc86•5mo ago

"Already being paid for by someone else" is very different than "free."

sbarre•5mo ago

Eh back in the day computers were expensive and not everyone could afford one (and I don't mean a library computer that you can work on, one you can code and hack on). The ubiquity of computing is not something that's been around forever.

There have always been costs and barriers for the cutting edge.

horacemorace•5mo ago

The problem isn’t cost, it’s reproducibility and understanding. If rely on a service you can’t fully understand to get something done, you’re beholden to the whims of its provider.

sbarre•5mo ago

Sure but that's not what the person I was replying to was talking about, nor what I was talking about.

Cost of access is absolutely a problem in tech.

The problem can certainly be multi-faceted though.

nostrademons•5mo ago

They're not that expensive for anyone that has the tech skills to actually make good use out of them. I've been paying around with Claude Code, using API credits rather than the monthly fee. It costs about $5 per one-hour session. If you're going to be doing this professionally it's worth springing for the $100/month membership to avoid hitting credit limits, but if you just want to try it out, you can do so without breaking the bank.

A bigger question for me is "Does this actually increase my productivity?" The jury is still out on that - I've found that you really need to babysit the algorithm and apply your CS knowledge, and you also have to be very clear about what you're going to tell it later, don't let it make bad assumptions, and in many cases spell out the algorithm in detail. But it seems to be very good at looking up API details, writing the actual code, and debugging (if you guide it properly), all things that take a non-trivial amount of tedium in everyday programming.

eloisius•5mo ago

12-year-old me wasn’t putting my tech skills to good use enough to pay $5 every time I sat down at the computer. I was making things though, and the internet was full of tutorials, chat rooms, and other people you could learn from. I think it would be sad if the same curious kid today was told “just pay $5 and ask Claude” when pestering someone in IRC about how to write a guestbook in Perl.

nostrademons•5mo ago

12-year-old me wasn't either, but he was noodling around on a computer that cost $2500 (more like $5500 in today's dollars). I think our parents loved us very much and must have had some means to afford the capital cost of a computer back then.

I don't see my 7-year-old paying $5 for each hour he wants to program (and no way in hell would I give him my credit card), but I could easily envision paying $20/month for a Claude subscription and letting him use it. We pay more than that for Netflix & Disney+.

dingnuts•5mo ago

I'm extremely privileged and I had a quarter of what you did growing up. Your experience and your kids' is not typical.

jack_pp•5mo ago

12-year-old me was mostly procrastinating but sometimes fired up Pascal which required me to insert a floppy disk in my 486 compaq machine for it to work. the machine was a donation from my aunt, could only run DOS.

However chatgpt or gemini free tier is more than enough for a kid to figure out how python works and build some simple software. While I have the Gemini subscription I only got it because my family drive storage was nearly full. I could've probably got by with ChatGPT free to just stop using stackoverflow.

piva00•5mo ago

10-year-old me was programming on a salvaged 386 that my dad got from a friend after the company they worked at went bankrupt, and left the machine as trash. Instead of Christmas gifts I asked for programming books and pitched in some of the birthday money my grandparents would give me (about US$ 2 every birthday).

Not everyone was privileged, some of us were just lucky.

kelnos•5mo ago

> noodling around on a computer that cost $2500 (more like $5500 in today's dollars)

Wow! 12-year-old me was noodling around on a computer that my dad brought home from work because it would have otherwise ended up in landfill. We had very little money for computers back then, and I was thrilled when my parents gave me a budget to buy parts to build my own from scratch when I was about to go off to college (I'd saved up a bit myself, but not nearly enough).

I think your experience is pretty privileged, and not at all common.

matwood•5mo ago

Everyone has some level of privilege. I didn't get my first PC until I was a freshman in college. I had to spend part of my college loan buying one (~$3k IIRC). Up to that point I had only played with the Apple IIc's and the few Macs they had at my high school..

Information on programming also wasn't as readily available as it is now. I used to go the book stores and use pencil and paper and copy out solutions since $50+ for a book was way more money than I could spend.

Everything today is crazy inexpensive for the value.

pc86•5mo ago

So what?

That sounds dismissive, and maybe it is, but I'm being serious here. What is the point of coming here and saying "when I was 12, my parents had less money than yours did when you were 12?" Privilege is relative, "common" is relative, and constantly being dragged into oppression Olympics of who has or had things slightly worse is exhausting and not conducive to conversation.

Better keep in mind that someone here almost certainly had it even worse than you when they were in elementary school, lest you go a few seconds without acknowledging your privileged upbringing, for some reason, in a conversation where it bears absolutely no relation to anything.

nostrademons•5mo ago

There's always more to the story than the Internet assumes.

We were quite possibly less privileged than you were, if your dad brought a computer home from work. I grew up with a teacher and a househusband for parents; single-income, and that income made about 1/3 of what engineers or other computer professionals made. My kid had more passport stamps at age 2 than I did at age 18.

It was $2500 because it was a Mac LC, and it was a Mac LC because that could take an Apple 2E card and run both Mac and Apple software, and that was important because my mom was a teacher and had a large library of educational software at school that she could take home. Recall that in those days, software was sold in retail stores (no Internet), and cost $50-100 for kiddie stuff, and like $400 for productivity and compilers. 25 titles and the cost of the computer paid for itself in free software. I think we used about that.

It's a matter of priorities. My parents always prioritized my education: they bought a computer, and whatever software didn't come from my mom's workplace, and any books I wanted, and paid for my college education in full. We didn't have a whole lot other than that: we didn't take a lot of vacations or plane trips (and a single plane trip would cost more than that $2500 in those days), ran our cars into the ground (my mom owned 2 cars over my entire childhood), wore hand-me-downs.

sgarland•5mo ago

12-year-old me had (or rather, my family had) a Celeron 333 MHz and a Pentium III 550 MHz, both from Gateway, because that was the sole awesome perk my dad got from working there: literally free computers, with a required number of years of employment to pay them off. In 2000, the P3 was still pretty hot shit. I dual-booted them with every Linux distro under the sun. Since we had dial-up, the only way I had those distros was from 4-H [0], which at the time in Nebraska had a partnership with University of Nebraska to do tech instruction; once a quarter, we’d drive down to a campus (usually UNL) and spend a weekend learning something (LAMP stack, hardware troubleshooting, etc.), and having a LAN party at night. Also we had free access to their (at the time) screamingly fast internet, so I would download distros and packages to try out later.

My online upbringing was very much of the RTFM variety, and I am convinced that was and is a good method to learn. It’s not like the grumpy graybeards were cruel, they just didn’t want to waste their time answering the same “how do I…” questions from noobs. If you explained what you were experiencing, what you had read, and what you had tried, they were more than happy to help out. I don’t think that’s an unreasonable approach.

[0]: https://4-h.org/

hdjrudni•5mo ago

I think you said it. $100/mo and you're not even sure if it'll increase your productivity. Why on earth would I pay that? Do I want to flush $100 down the toilet and waste several days of my life to find out?

nostrademons•5mo ago

You don't have to pay $100 to find out, you can do that for ~$5-20 by directly buying API credits.

I don't know for sure whether it's worth it yet. Further experimentation is needed, as well as giving it an honest shot and trying to learn the nuances of the tool. But the way I look at it - if this actually is a future career path, the net present value of its payoff is measured in the millions of dollars. It's worth spending ~$20 and a few nights of my time to figure that out, because the odds can be pretty damn low and still have the expected value pencil out. It's sorta like spending $200 on 1/4 of a Bitcoin in 2013 because I was curious about the technology - I fully expected it to be throwing money down the toilet, but it ended up being quite worth it. (I wish I'd had the same mindset when I could've bought into the Ethereum ICO at a penny or so an ETH.)

handfuloflight•5mo ago

Are you spending every $100 absolutely efficiently? How do you know?

jrflowers•5mo ago

Can I have a hundred dollars? I assure you giving it to me is efficient

kyleee•5mo ago

And it’s even more efficient to give me $90

barrell•5mo ago

I have the tech skills to use them. In my 30s and I could not spend $5 on a one hour coding session even if it 10xed my productivity. 1-2 hours would literally break the bank for me

mark_l_watson•5mo ago

yes indeed, who will pay? I run a lot through open models locally using LM Studio and Ollama, and it is nice to only be spending a tiny amount of extra money for electricity.

I am retired and not wanting to spend a ton of money getting locked long term into using an expensive tool like Claude Code is a real thing. It is also more fun to sample different services. Don’t laugh but I am paying Ollama $20/month just to run gpt-oss-120b very fast on their (probably leased) hardware with good web search tooling. Is it worth $20/month? Perhaps not but I enjoy it.

I also like cheap APIs: Gemini 2.5-flash, pro when needed, Kimi K2, open models on Groq, etc.

The AI, meaning LLM, infrastructure picture is very blurred because of so many companies running at a loss - which I think should be illegal because long term I think it is misleading consumers.

piva00•5mo ago

> The AI, meaning LLM, infrastructure picture is very blurred because of so many companies running at a loss - which I think should be illegal because long term I think it is misleading consumers.

In a sense it is illegal, even though the whole tech scene has been doing it for decades, price dumping is an illegal practice and I still don't understand why it has never been considered as such with tech.

Most startups with VC investors work only through price dumping, most unicorns came to be from this bullshit practice...

nl•5mo ago

"Price dumping" isn't an economic term in common use.

"Dumping" in international trade is somewhat similar but the reasons that is illegal are very different: https://en.m.wikipedia.org/wiki/Dumping_(pricing_policy)

Pricing at a loss by VC funded companies is great for consumers. It rarely is at a loss though - they look at the lifetime value.

Pricing at a loss by big tech could be viewed as anticompetitive. Personally I like that Gemini keeps OpenAI prices lower but one could argue it has stopped OpenAIs growth.

Filligree•5mo ago

I wouldn’t assume Gemini is being run at a loss, though. At least not that, if it weren’t, that would help OpenAI much.

Google uses Google hardware, which costs them 1/10 what nvidia hardware costs everyone else.

piva00•5mo ago

> Pricing at a loss by VC funded companies is great for consumers. It rarely is at a loss though - they look at the lifetime value.

It's great for consumers only in the short term, the strategy to drive out competition that are not as well-funded has only one goal: to remove competition in the long-term to drive up prices at your will since most competitors won't have the chance to exist.

Edit: yes, technically dumping is a specific type of predatory pricing, so swap "price dumping" on my first comment to "predatory pricing" instead.

nl•5mo ago

It doesn't have one goal.

In fact driving out competition is rarely the goal at all.

Instead the goal is usually to reduce the barrier to people trying the thing - especially when it is a developer API which you hope developers will incorporate into their product.

piva00•5mo ago

> In fact driving out competition is rarely the goal at all.

Driving out competition is definitely a goal, the further you can snowball that makes your company a much more attractive investment since your competition will be bleeding money, attrition is definitely used as a tactic by VCs when a startup gets traction. Hell, it's one of the arguments they use to run further rounds of investments to others "this startup is very well capitalised and the competition has 1/10th of their funds, investing elsewhere is a losing proposition".

> Instead the goal is usually to reduce the barrier to people trying the thing - especially when it is a developer API which you hope developers will incorporate into their product.

I thought we were talking about unicorns such as Uber, AirBnb, etc., not some dev startup packaging APIs to serve other startups which is a whole other incestuous industry.

guappa•5mo ago

AI company founder and CTO defends those practices… yawn.

miohtama•5mo ago

One can create a free Google account and use Gemini for free.

Or think it this way: It's easy to get base level free LLM (Toyota) but one should not expect free top of the shelf (Porsche).

ubercow13•5mo ago

Previously most Porsche development tools were available to everyone though, such as GCC.

ijidak•5mo ago

Software development costs hundreds of dollars in the 90s. My parents bought VB 6 for $600.

Only in tech are we shocked when things cost money. I don't know that any other industry expects such a reality.

Arisaka1•5mo ago

You made me realize exactly why I love skill-based video games, and shun the gacha games (especially those with PvP). You swiped to gain power over players who don't. Yay?

The knowledge check will also slowly transfer towards the borders of fast iteration and not necessarily knowledge depth. The end goal is to make a commodity out of the myth of the 10x dev, and take more leverage away from the devs.

noobermin•5mo ago

This is a pro for a lot of the people whom AI people are targeting: idiots with money.

icemelt8•5mo ago

be careful maybe the idiots will be the only one left with money, and the smart people like you could be homeless.

kubb•5mo ago

If the trend of the last four decades continues, this is the outcome.

noelwelsh•5mo ago

I agree that access is a problem now, but I think it is one that hardware improvements will solve very quickly. We are a few generations of Strix Halo type hardware away from effortlessly running very good LLMs locally. (It's already possible, but the hardware is about $2000 and the LLMs you can run are good but not very good.) AFAIK AMD have not released the roadmap for Medusa Halo, but the rumours [1] are increased CPU and GPU performance, and increased bandwidth. Another iteration or two of this will make Strix Halo hardware more affordable, and the top-of-the-line models will be beasts for local LLMs.

[1]: https://www.notebookcheck.net/Powerful-Zen-6-Medusa-Halo-iGP...

dirkc•5mo ago

Maybe local models can address this, but for me the issue is that relying on LLMs for coding introduces gatekeepers.

> Uh oh. We're getting blocked again and I've heard Anthropic has a reputation for shutting down even paid accounts with very few or no warnings.

I'm in the slack community where the author shared their experiment with the autonomous startup and what stuck out to me is that they stopped the experiment out of fear of being suspended.

Something that is fun should not go hand-in-hand with fear of being cut off!

block_dagger•5mo ago

LLMs are quickly becoming cheaper. Soon they will be “cheap as free,” to quote Homestar Runner. Then programming will be solved, no need for meatbags. Enjoy the 2-5 years we have left in this profession.

barrell•5mo ago

You say that, but subscription prices keep going up. Token price goes down but token count goes up. Companies are burning billions to bring you the existing prices, and multiple hundreds per month is not enough to clear the bar to use these tools.

I’m personally hoping for a future with free local LLMs, and I do hope the prices go down. I also recognize I can do things a little cheaper each year with the API.

However it is far from a guaranteed which direction we’re heading in, and I don’t think we’re on track to get close to removing the monetary barrier anytime soon.

frognumber•5mo ago

My bill for LLMs is going up over time. The more capable, higher-context models dramatically increase my productivity.

The spend prices most of the developing world out -- an programmer earning $10k per year can't pay for a $200/month Claude Max subscription..

And it does better than $6k-$10k programmers in Africa, India, and Asia.

It's the mainframe era all over again, where access to computing is gated by $$$.

achenet•5mo ago

> The spend prices most of the developing world out -- an programmer earning $10k per year can't pay for a $200/month Claude Max subscription..

No, but a computer earning $10k per year can probably afford a $200 used ThinkPad, install Linux on it, build code that helps someone, rent a cheap server from a good cloud provider, advertise their new SaaS on HN, and have it start pulling in enough revenue to pay for a $200 Claude Max subscription.

> It's the mainframe era all over again, where access to computing is gated by $$$.

It's still the internet era, where access to $$$ is gated by computing skill :)

achenet•5mo ago

Did you read the original article?

LLM code still needs to be reviewed by actual thinking humans.

coldtea•5mo ago

>the barrier to entry is now monetary rather than just knowledge-based, which will make it _much_ harder for young people with no money to pick up.

Considering opportunity cost, a young person paying $20 or $100 per month to Claude API access is way cheaper than a young person spending a couple of years to learn to code, and some months coding something the AI can spit in 10 minutes.

AI coding will still create generations that even programming graduates know fuck all about how to code, and are also bad at reasoning about the AI produced code they depend on or thinking systematically (and that wont be getting any singularity to bail them out), but that's beside the point.

sdenton4•5mo ago

Applying opportunity cost to students is a bit strange...

People need to take time to get good at /something/. It's probably best to work with the systems we have and find the edge where things get hard, and then explore from there. It's partly about building knowledge, but also about gumption and getting some familiarity with how things work.

typewithrhythm•5mo ago

But all the other students are doing the same, so the expectation will quickly become use of tools for potentially years.

My introduction to programming was through my dad's outdated PC and an Arduino, and that put me on par with the best funded.

palata•5mo ago

> This is one of the sadder things about AI usage getting more standard that I haven’t seen discussed much—-the barrier to entry is now monetary

Agreed. And on the one hand you have those who pay an AI to produce a lot of code, and on the other hand you have those who have to review that code. I already regularly review code that has "strange" issues, and when I say "why does it do this?" the answer is "the AI did it".

Of course, one can pay for the AI and then review and refactor the code to make it good, but my experience is that most don't.

guappa•5mo ago

At my work I'm just never reviewing code of certain team members and I let the team leader do it. Their hire, their problem.

If the code becomes a mess and fixing anything becomes slower… well who cares I'm paid per hour not per task done.

infecto•5mo ago

There has generally always been some barrier. Computer access, internet access, books etc. If AI coding stays around, which looks like it will, it will just be the current generations barrier.

I don’t think it is sad at all. There are barriers to all aspects of life, life is not fair and at least in our lifetimes will never be. The best anyone can do is to help those around them and not get caught up the slog of the bad things happening in the world.

xnorswap•5mo ago

But traditional barriers have been able to be knocked down more easily with charity, because it's easier to raise charity money for capex than opex.

It was common to have charity drives to get computers into schools, for example, but it's much harder to see people donating money for tokens for poor people.

Previous-generation equipment can be donated, and can still spark an interest in computing and programming. Whereas you literally now can't even use ChatGPT-4.

conradev•5mo ago

Small models and processors are going to continue improving to the point that you’ll be able to vibe code locally on your phone at some point.

When the iPhone came out, not everyone had a smartphone. Now 90% of the US has a smartphone, and many of these smartphones run generative local models.

pc86•5mo ago

"It's harder to convince other people to pay for this for me" is an insane criticism. Not every AI model needs a premium account, you can even run many excellent models locally if you don't want to pay for an internet connection.

At some point you just have to accept that yes things are easier if you have a little bit of spending money for things. That's not "sad" it's a basic fact of life.

xnorswap•5mo ago

You have been mean with your interpretation of my statement.

I am not saying, "It's harder to convince other people to pay for this for me".

I am saying, "It is harder for me to pay for this for someone else".

pc86•5mo ago

It probably is harder to convince others to pay for opex than capex - maybe that's a good thing, maybe not.

But it's certainly not any harder for you to donate this money to someone because they want to spend that money on tokens instead of a computer, for example.

Zambyte•5mo ago

I think you're still misinterpreting them. It's harder to donate tokens than it is to donate books or a computer. Not all donations are cash.

Though, in a sense, I am donating tokens. I run an LLM box at my apartment, and I let some people in my community have a free account through my hosted Open WebUI.

infecto•5mo ago

This feels like picking a random thing against LLMs to complain about. These tools are not even necessary today I am not sure why they would be necessary tomorrow beyond efficiency. If that day does come though, you would have to assume open source models would also be coming a long way.

rurp•5mo ago

Having been a poor person learning how to code I'd say there's a huge difference between just needing a computer vs needing that plus a dozens per month subscription.

I don't know that there's much we can do about that potentially becoming the new normal in the future, but it bums me out.

reidjs•5mo ago

There are free and offline options, like Llama.cpp, but you will have to pay by giving up your privacy to Meta (or similar large companies)

rurp•5mo ago

Sure, for now, and maybe in the future. But it's possible that paid models will end up greatly outpacing free ones, and at some point the companies controlling them will stop burning billions of dollars per month and jack up prices.

Revisional_Sin•5mo ago

How does using an offline model give up your privacy?

Also, running models locally requires good hardware to get acceptable performance. It's still a large barrier to entry.

infecto•5mo ago

No it’s not much different. I grew up poor. It was a struggle to have internet access.

michaelrpeskin•5mo ago

I attribute my barriers to entry as things that forced me to really learn. All my family could afford was a 386 with 16MB of ram when 486s where pretty common. I had to really hack to make things work. Working under constraints meant I was exploring limits and understanding efficiency.

I still carry that in my day job which I _think_ helps me write better code - even in unconstrained systems, thinking in terms of memory and instruction efficiency can still help make better/faster code.

mosselman•5mo ago

Does anyone have a good recommendation of a claude code like tool that uses locally hosted models?

jsnider3•5mo ago

I believe gemini-cli can do this. I'm not sure though.

heap_perms•5mo ago

Yes! https://github.com/musistudio/claude-code-router

Claude Code wrapper to run any models. Even local ones.

rurp•5mo ago

Very true. One of the greatest aspects of the field is how accessible it is, and that is certainly going to get worse with LLM usage.

I'd probably be toiling away in a less productive industry if I hadn't been able to easily download Python and start learning it for free.

j45•5mo ago

Maintaining scheduled playing with what's changed/new/different is mandatory with the tools one already uses, let alone any new ones.

joks•5mo ago

> It's just fun to mess with these tools

I think this is the main sentiment I can't wrap my head around. Using Claude Code or Cursor has been entirely a mind-numbingly tedious experience to me (even when it's been useful.) It's often faster, but 80% of the time is spent just sitting there waiting for it to finish working, and I'm not proud of the result because I didn't do anything except come up with the idea and figure out how to describe it well. It just ends up feeling like the coding equivalent of...like...copying down answers to cheat on a test. Not in the sense that it feels gross and wrong and immoral, but in the sense that it's unsatisfying and unfulfilling and I don't feel any pride in the work I've done.

For things where I just want something that does something I need as quickly as possible, sure, I wasn't going to care either way, but personal projects are where I find myself least wanting to vibe code anything. It feels like hiring someone else to do my hobbies for me.

burntpineapple•5mo ago

if I don't see aider in the first sentence, I send it back

ramesh31•5mo ago

Particularly with the VSCode extension. I was a loyal Cline user until recently because of how good the editor experience was, but the ability for Claude to go off and run for 10+ minutes effectively autonomously, and show me the diffs in realtime is a gamechanger. The token usage has also gotten much more efficient in the last few months. With proper IDE support now I don't see any reason at all to use anything else, especially not the "credit" based middle-man providers (Windsurf/Cursor et. al).

monkpit•5mo ago

Same here, I was convinced Cline+OpenRouter was the way to go. But with Claude code I’m getting better results and saving money, even compared to planning with Sonnet and transitioning to act mode with DeepSeek, I was still using more than $20/mo easily.

chaosprint•5mo ago

The title is a bit exaggerated. The depth of the projects covered in the article is clearly not representative of "all".

In fact, I now prefer to use a purely chat window to plan the overall direction and let LLM provide a few different architectural ideas, rather than asking LLM to write a lot of code whose detail I have no idea about.

NitpickLawyer•5mo ago

Most harnesses provide this as a "plan" vs. "act" mode now. You first "chat" in plan mode (no access to tools, no instructions to write any code basically), you then can optionally write those plans in a memorybank / plan.md, and then say "now go implement it", and it moves to the "act" mode where it goes through and does it, updating progress in plan.md as it goes.

pseudosavant•5mo ago

I've found it very useful to have items like requirements.md, plans.md, or todo.md, in my LLM focused projects. I'll use AI to help take the ideas I have at that stage and refine them into something more appropriate for ingestion into the next stage. So, when I want it to come up with the plans, it is going to base is mostly on requirements.md, and then I'll have it act on the plans step by step after that.

chaosprint•5mo ago

the thing is, it's not working as the default mode, which is not ideal imho

skerit•5mo ago

I like using Claude-Code, it can be a real timesaver in certain cases.

But it's far from perfect. Really difficult things/big projects are nearly impossible. Even if you break it down into hundred small tasks.

I've tried to make it port an existing, big codebase from one language to another. So it has all of the original codebase in one folder, and a new project in another folder. No matter how much guidance you give it, or how clear you make your todos, it will not work.

crazygringo•5mo ago

What specifically are its modes of failure? I've never tried doing that, do very curious what the roadblocks are.

phist_mcgee•5mo ago

I've done something similar for a refactor.

It simply forgets code exists during a port. It will port part of a function and ignore the rest, it will scan a whole file into context and then forget that a different codepath exists.

I would never rely on it for a 1:1 mapping of large features/code transformations. Small stuff sure, but beyond say a few large files it will miss things and you will be scratching your head for why it's not working.

phito•5mo ago

Sometimes, it also tells you it is done, but if you look at the code, there's a bunch of placeholder comments in the style of "will implement this later if we figure out how to do it".

skerit•5mo ago

Indeed, all of the above. It's impossible to just say "Here is the original file, here is the new file, port everything over" and think that'll be enough. Maybe making a plan for each file and specifically tell it to port each specific function over would work, but at that point you're really just making a literal translation from one language to the other.

OldfieldFund•5mo ago

It's a play on the name of the paper that jump-started ChatGPT: "Attention Is All You Need:" https://arxiv.org/abs/1706.03762

dexwiz•5mo ago

That's my gist. All of these seem pretty basic apps I would see implemented to demo a new web or REST framework. Comment ranker is cool, but I can't imagine its doing much more than scrape text > call semantic api > modify DOM.

How much of this is buildings versus recalling tutorials in the dataset. For every vibe coded project with 20 lines of requirements, I have a model with 20 different fields all with unique semantic meanings. In focused areas, AI has been okay. But I have yet to see Claude or any model build and scale a code base with the same mindset.

dabedee•5mo ago

This article feels like it was written as a dialectical exercise between an AI and a human. It would probably benefit from some more heavy human editing to make it more succinct and to give the overall article a structure. As it is, it's very difficult to follow along.

4b11b4•5mo ago

very

turtletontine•5mo ago

I’ve seen a lot of articles like this on the HN page recently… stuff that has one or two interesting tidbits, but is clearly just a conversation someone had with an AI and dumped into an article. Don’t make me wade through all the AI word barf to get the interesting points, that’s what old fashioned editing is for.

jlengrand•5mo ago

Did you read his conclusion?

"I wrote this entire article in the Claude Code interactive window. The TUI flash (which I've read is a problem with the underlying library that's hard to fix) is really annoying, but it's a really nice writing flow to type stream of consciousness stuff into an editor, mixing text I want in the article, and instructions to Claude, and having it fix up the typos, do the formatting, and build the UX on the fly.

Nearly every word, choice of phrase, and the overall structure is still manually written by me, a human. I'm still on the fence about whether I'm just stuck in the old way by preferring to hand-craft my words, or if models are generally not good at writing.

Either he's lying, or you're wrong.

Agree on the structure part. I mostly read it as a piece from someone who's having fun with the tool. Not a structured article for future generations.

bapak•5mo ago

This is the longest article I read in its entirety this month so it can't be that bad. Maybe because I actually was interested in the details.

hungryhobbit•5mo ago

>1) Have faith (always run it with 'dangerously skip permissions', even on important resources like your production server and your main dev machine.

I thought the article was a satire after I read this ... but it wasn't!

bubblyworld•5mo ago

Haha, well at least they warned you!

> If you're from infosec, you might want to stop reading now — the rest of this article isn't going to make you any happier. Keep your medication close at hand if you decide to continue...

glitchc•5mo ago

You're just making a case for why developers cannot be trusted with sensitive information, and why cyber depts lock the machine down so extensively.

bubblyworld•5mo ago

Eh? I'm making no such case. It was a quote from the article.

csomar•5mo ago

I just came for the comments for this... I am not sure at what point we are. Think AI and Crypto are a match in hell, especially given that a lot of Crypto projects are made by bros who have no interest in tech. estimate we'll be seeing projects/companies that get hacked as soon as they launch by Claude itself.

Thrymr•5mo ago

> I hit a small snag where Anthropic decides that running Claude as root with --dangerously-skip-permissions / yolo-mode is not allowed. You can get past this dumb nanny-state stuff by running [fun dangerous command that lets you run as root]

Still not convinced it is not satire.

cm2012•5mo ago

Obviously you wouldn't want to do this to any revenue generating code, but when just mucking around on prototypes this seems fine.

newtwilly•5mo ago

I run with the dangerous option on my work computer. At first I was thinking I would be good if I just regularly kept full disk backups. But my company at least pays lip service to the fact that we want to protect our intellectual property. Plus I think it might be irresponsible to allow an AI model full internet access unsupervised.

So now I use a docker compose setup where I install Claude and run it in a container. I map source code volumes into the container. It uses a different container with dnsmasq with an allowlist.

I initially wanted to do HTTP proxying instead of DNS filtering since it would be more secure, but it was quite hard to set it up satisfactorily.

Running CLI programs with the dangerous full permissions is a lot more comfortable and fast, so I'm quite satisfied.

devmor•5mo ago

This article seems fun, and it's interesting, but I was waiting for the point and it never came.

The author didn't do anything actually useful or impactful, they played around with a toy and mimicked a portion of what it's like to spin up pet projects as a developer.

But hey, it could be that this says something after all. The first big public usages of AI were toys and vastly performed as a sideshow attraction for amused netizens. Maybe we haven't come very far at all, in comparison to the resources spent. It seems like all of the truly impressive and useful applications of this technology are still in specialized private sector work.

jofer•5mo ago

I appreciate this writeup. I live in the terminal and work primarily in vim, so I always appreciate folks talking about tooling from that perspective. Little of the article is that, but it's still interesting to see the workflow outlined here, and it gives me a few ideas to try more of.

However, I disagree that LLMs are anywhere near as good as what's described here for most things I've worked with.

So far, I'm pretty impressed with Cursor as a toy. It's not a usable tool for me, though. I haven't used Claude a ton, though I've seen co-workers use it quite a bit. Maybe I'm just not embracing the full "vibe coding" thing enough and not allowing AI agents to fully run wild.

I will concede that Claude and Cursor have gotten quite good at frontend web development generation. I don't doubt that there are a lot of tasks where they make sense.

However, I still have yet to see a _single_ example of any of these tools working for my domain. Every single case, even when the folks who are trumpeting the tools internally run the prompting/etc, results in catastrophic failure.

The ones people trumpet internally are cases where folks can't be bothered to learn the libraries they're working with.

The real issue is that people who aren't deeply familiar with the domain don't notice the problems with the changes LLMs make. They _seem_ reasonable. Essentially by definition.

Despite this, we are being nearly forced to use AI tooling on critical production scientific computing code. I have been told I should never be editing code directly and been told I must use AI tooling by various higher level execs and managers. Doing so is 10x to 100x slower than making changes directly. I don't have boilerplate. I do care about knowing what things do because I need to communicate that to customers and predict how changes to parameters will affect output.

I keep hearing things described as an "overactive intern", but I've never seen an intern this bad, and I've seen a _lot_ of interns. Interns don't make 1000 line changes that wreck core parts of the codebase despite being told to leave that part alone. Interns are willing to validate the underlying mathematical approximations to the physics and are capable of accurately reasoning about how different approximations will affect the output. Interns understand what the result of the pipeline will be used for and can communicate that in simple terms or more complex terms to customers. (You'd think this is what LLMs would be good at, but holy crap do they hallucinate when working with scientific terminology and jargon.)

Interns have PhDs (or in some cases, are still in grad school, but close to completion). They just don't have much software engineering experience yet. Maybe that's the ideal customer base for some of these LLM/AI code generation strategies, but those tools seem especially bad in the scientific computing domain.

My bottleneck isn't how fast I can type. My bottleneck is explaining to a customer how our data processing will affect their analysis.

(To our CEO) - Stop forcing us to use the wrong tools for our jobs.

(To the rest of the world) - Maybe I'm wrong and just being a luddite, but I haven't seem results that live up to the hype yet, especially within the scientific computing world.

smithkl42•5mo ago

This is roughly my experience with LLMs. I've had a lot of friends that have had good experience vibe coding very small new apps. And occasionally I've had AI speed things up for me when adding a specific feature to our main app. But at roughly 2 million lines of code, and with 10 years of accumulated tribal knowledge, LLMs really seem to struggle with our current codebase.

The last task I tried to get an LLM to do was a fairly straightforward refactor of some of our C# web controllers - just adding a CancellationToken to the controller method signature whenever the underlying services could accept one. It struggled so badly with that task that I eventually gave up and just did it by hand.

The widely cited study that shows LLMs slow things down by 20% or so very much coheres with my experience, which is generally: fight with the LLM, give up, do it by hand.

zanellato19•5mo ago

My experience is that sometimes they give you a 10x speedup but then you hit a wall and take 30 times longer to do a simple thing and a lot of people just keep hammering because of the first feeling. Outside of boilerplate, I haven't seen it be this magical tool people keep claiming it is.

SeriousM•5mo ago

That's the definition of an advanced scaffolding tool. And yes, I subscribe to that. From time to time I use Gemini CLI for little tools I have no time to read all the docs of thinkgs I'm not used to, but in the end I need to make flow changes and be forced to understand the generated code. x10 faster bootstrap, x30 slower manual changes, 100% my codebase problem.

zmmmmm•5mo ago

> I have been told I should never be editing code directly and been told I must use AI tooling by various higher level execs and managers

Wow, this is really extreme. We certainly got to this point faster than I expected.

jofer•5mo ago

To be fair, it's the higher level folks who are too far removed from things to have any actual authority. I've never heard a direct single-team engineering manager something like that. But yeah, CEOs say crazy crap. And we're definitely there, though to be fair, his exact quote was "I insist everyone try to have AI generate your code first before you try making any direct changes". It's not _quite_ as bad as what I described. But then the middle management buys in and says similar things. And we now have a company level OKR around having 80% of software engineers relying on AI tooling. It's a silly thing to dictate.

akra•5mo ago

In my view its a tool, at least for the moment. Learn it, work out how it works for you, and what it doesn't work for you. But assuming you are the professional they should trust your judgement, and you should also earn that trust. That's why you pay skilled people for. If that tool isn't the best to getting the job done use something else. Of course that professional should be evaluating tools and assuring us/management (whether by evidence or other means) that the most cost efficient and quality product is being built like any other profession.

I use AI, and for some things its great. But I'm feeling like they want us to use the "blunt instrument" that is AI when sometimes a smaller, more fine grained tool/just handcrafting code for accuracy at least for me is quicker and more appropriate. The autonomy window as I recently heard it expressed.

alberth•5mo ago

OT: my 14-year old nephew would like to use Claude Code. How do they signup for an account given they don’t have a cellphone?

(Sure, I could let them use my credentials but that isn’t really legit/fair use.)

colechristensen•5mo ago

Have someone who isn't ever going to use claude code sign up for him and then give him the credentials. (do you have a partner or other relative not in tech?)

traceroute66•5mo ago

Surely your love for your nephew is priceless ?

Do the right thing, sign up for an API account and put some credits on there...

(and keep topping up those credits ;-)

esafak•5mo ago

Use a local model like Qwen3-4B-Thinking-2507 or Gemini CLI, which has a generous free tier.

indigodaddy•5mo ago

Note that they should be able to get GitHub Copilot Pro for free via k-12 student 13yrs+

codazoda•5mo ago

Like other posters said, maybe a local model is a good option. I've found the Qwen3:4B (reasoning) model works pretty well for many things.

I'm planning to run a local model on a $149 mini-pc and host it for the world from my bedroom. You can read a bit more about my thinking below.

https://joeldare.com/my_plan_to_build_an_ai_chat_bot_in_my_b...

These hosted models are better but it feels like the gap is closing and I hope it continues to close.

lvl155•5mo ago

If Anthropic is smart they would open it up to other models now to make it default for everyone. Otherwise you are banking on Sonnet remaining the best coding model.

CuriouslyC•5mo ago

There's Claude Code Router, that lets you use any model with Claude Code. Claude is a really good model for agents though, even though Gemini 2.5 and GPT5 are better models overall, Claude uses tools and plans tasks more effectively. A better pattern is to provide sub agents in Claude Code that call out to other LLMs as tools for planning/architecture.

tommy_axle•5mo ago

This piece is also covered by a bunch of other cli/tui agents (like codex-cli and opencode) allowing you to switch between Claude and other models (comes in handy depending on the task) so it really all depends on the setup you like. As mentioned in the sibling comment there are ways to get it to work with Claude Code too.

eulers_secret•5mo ago

There's also opencode which is a fork(?) of Claude Code that runs on any model: https://github.com/sst/opencode

And of course, not the same, but Aider still exists and is still a great tool for AI dev.

It's interesting how everyone is suddenly OK with vendor lock-in, quite a change from years past!

chrismustcode•5mo ago

Not a fork opencode is a from scratch project

Claude code is completely closed source and even DMCA’d people reverse engineering it.

https://techcrunch.com/2025/04/25/anthropic-sent-a-takedown-...

nikcub•5mo ago

The entire point of CC is to drive anthropic subscriptions and it's working. even with the release of the long-awaited gpt5 the anthropic models are still the best coding models.

There are plenty of alternatives for other models like opencode et al, and you can always just set ANTHROPIC_BASE_URL with claude code to have it use another provider. I don't see why they need to do anything in addition to that.

My only request would be for claude code to be a bit more open, less obfuscated and to accept PRs - but I understand the unwillingness of also wanting to manage what would be a very popular open source project.

siva7•5mo ago

> I watched the autonomous startup builder a bit more.

I think i'm done with this community in the age of vibe coding. The line between satire, venture capitalism, business idea guys and sane tech enthusiasts is getting too blurry.

osn9363739•5mo ago

It didn't seem to do anything well. And weird quotes like 'I think it one-shotted that too' on something important. What on earth is this. Reading it is like experiencing a bad weird dream.

zb3•5mo ago

Umm, not really, you also need spare money to burn..

not_a_bot_4sho•5mo ago

Perhaps I'm a bit of a cynic but I'm no longer impressed by the AI slop websites and demo apps, like those showcased in the article.

sixhobbits•5mo ago

(author here) I think there's a difference between "I'm no longer impressed" (good) and "I was never impressed and never would have been impressed" (bad, but common).

Yes it's easy now so its by definition no longer impressive, but that in itself is impressive if you can correctly remember or imagine what your reaction _would_ have been 6 months ago.

Terretta•5mo ago

Never impressed, no longer impressed, feeling depressed ... Another option, newly impressed by the next iteration.

Up to a point these have been probability machines. There's probably a lot of code that does certain likely things. An almost astonishing amount doing the same things, in fact. As such, perhaps we shouldn't be surprised or impressed by the stochastic parrot aspect any more than we're impressed by 80% of such sites being copy pasta from Stack Overflow a few years ago.

However, what we perhaps didn't expect is that on the margins of the mass probability space, there are any number of less common things, yet still enough of those in aggregate that these tools can guess well how to do those things too, even things that we might not be able to search for. Same reason Perplexity has a business model when Google or DDG exist.

And now, recently, many didn't expect one might be able to simulate a tiny "society of mind" made of "agents" out of these parrots, a tiny society that's proving actually useful.

Parrots themselves still impress me, but a society of them making plans at our beck and call? That can keep us all peeking, pecking, and poking for a while yet.

// given enough time and typewriters, who wins: a million monkeys, a society of parrots, or six hobbits?

hattmall•5mo ago

It's a lot like the first time taking a metal detector to a beach. It's really cool and exciting (dopamine hit) to find stuff, but after a while it wears off because realistically you only found trash.

Buuut for some people it just clicks and it becomes their chore to go find trash in the beach everyday and the occasional nickel or broken bracelet they feel the need to tell people and show it off.

_pdp_•5mo ago

I've asked copilot (Claude Sonnet 4) to edit some specific parts of a project. It removed the lines that specifically have comments that say "do not remove" with long explanation why. Then it went ahead and modified the unit tests to ensure 100% coverage.

Using coding agent is great btw, but at least learn how to double check their work cuz they are also quite terrible.

benterix•5mo ago

This is the tricky part. The whole point of agents is, well, do things so that we don't have to. But if you need to check everything they do, you might as well copy and paste from a chat interface...

Which makes me feel early adopters pay with their time. I'm pretty sure the agents will be much better with time, but this time is not exactly now, with endless dances around their existing limitations. Claude Code is fun to experiment with but to use it in production I'd give it another couple of years (assuming they will focus on code stability ans reducing its natural optimism as it happily reports "Phase 2.1.1 has been successfully with some minor errors with API tests failing only 54.3% of the time").

combyn8tor•5mo ago

Claude loves to delete comments. I setup specific instructions telling it not to, and yet it regularly tries to delete comments that often have nothing to do with the code we're working on.

It's so hit and miss in Rust too. When I ask it for help with a bug it usually tries a few things then tries to just delete or comment out the buggy code. Another thing it does is to replace the buggy code with a manual return statement with a comment saying "Returning a manual response for now". It'll then do a cargo build, proclaim that there are no errors and call it a day. If you don't check what it's doing it would appear it has fixed the bug.

When I give it very specific instructions for implementation it regularly adds static code with comments like "this is where the functionality for X will be implemented. We'll use X for now". It does a cargo build then announces all of its achievements with a bunch of emojis despite having not implemented any of the logic that I asked it to.

tptacek•5mo ago

You run a coding agent with no permissions checks on a production server anywhere I'm involved in security and I will strike down upon thee with great vengeance and furious anger.

Really, any coding agent our shop didn't write itself, though in those cases the smiting might be less theatrical than if you literally ran a yolo-mode agent on a prod server.

sylens•5mo ago

Author kindly asked you to stop reading:

> 1) Have faith (always run it with 'dangerously skip permissions', even on important resources like your production server and your main dev machine. If you're from infosec, you might want to stop reading now—the rest of this article isn't going to make you any happier. Keep your medication close at hand if you decide to continue).

xpe•5mo ago

"Here is how you build a self-replicating unknown-impact protein structure that will survive in the wild. If this bothers you, stop reading".

Other people's blasé risk profile -- or worse, willful denial of risk -- is indeed our problem. Why?

1. Externalities, including but not limited to: security breaches, service abuse, resource depletion, and (repeat after me -- even if you only think the probability is 0.01%, such things do happen) letting a rogue AI get out of the box. *

2. Social contagion. Even if one person did think about the risks and deem them acceptable, other people all too often will just blindly copy the bottom-line result. We are only slightly evolved apes after all.

Ultimately, this is about probabilities. How many people actually take the fifteen minutes to thoughtfully build an attack tree? Or even one minute to listen to that voice in their head that says "yeah, I probably should think about this weird feeling I have ... ... maybe my subconscious mind is trying to tell me something ... maybe there is indeed a rational basis for my discomfort ... maybe there is a reason why people are warning me about this."

Remember, this isn't only about "your freedom" or "your appetite for risk" or some principle of your political philosophy that says no one should tell you what to do. What you do can affect other people, so you need to own that. Even if you don't care what other people think, that won't stop a backlash.

* https://www.aisafetybook.com/textbook/rogue-ai

sixhobbits•5mo ago

Gotta exaggerate a bit to get attention :D

But I think I'm getting to the point where "If I'd let an intern/junior dev have access while I'm watching then I'm probably OK with Claude having it too"

The thing that annoys me about a lot of infosec people is that they have all of these opinions about bad practice that are removed from the actual 'what's the worst that could happen here' impact/risk factor.

I'm not running lfg on a control tower that's landing boeing 737s, but for a simple non-critical CRUD app? Probably the tradeoff is worth it.

nvch•5mo ago

We allow juniors in risky areas because that’s how they will learn. Not the case for current AIs.

tptacek•5mo ago

I think that's like, fractally wrong. We don't allow early-stage developers to bypass security policies so that they can learn, and AI workflow and tool development is itself a learning process.

lmm•5mo ago

> We don't allow early-stage developers to bypass security policies so that they can learn

Back when I worked at an F500 it was normal practice to give early-stage developers access to a "research" environment where our normal security policies were not applied. (Of course the flipside was that that "research" environment didn't have any access to confidential data etc., but it was a "prod" environment for most purposes)

Thrymr•5mo ago

Why in the world would you advocate explicitly for letting it run on production servers, rather than teaching it how to test in a development or staging environment like you would with a junior engineer?

philipp-gayret•5mo ago

My workflow is somewhat similar to yours. I also much love --dangerously-skip-permissions, as root! I even like to do it from multiple Claude Code instances in parallel when I have parallel ideas that can be worked out.

Maybe my wrapper project is interesting for you? https://github.com/release-engineers/agent-sandbox It's to keep Claude Code containerized with a copy of the workspace and a firewall/proxy so it can only access certain sites. With my workflow I don't really risk much, and the "output" is a .patch file I can inspect before I git apply it.

indigodaddy•5mo ago

I've often gotten the sense that fly.io is not completely averse to some degree of "cowboying," meaning you should probably take heed to this particular advice coming from them..

tptacek•5mo ago

I have no idea what the fuck you're talking about but nobody is running Claude Code on our server fleet here.

indigodaddy•5mo ago

You took it wrong. I'm with you here.

tptacek•5mo ago

We're pretty averse to "cowboying". We're a small team working on an enormously ambitious problem at a much earlier point on the maturity curve than incumbents. It's fine if that maturity concern impacts people's take on the product, but not at all fine if people use it as a reflection on the people and processes building that product.

indigodaddy•5mo ago

I think I just meant perhaps fly isn't afraid of responsibly "moving fast" in certain situations. Sorry for any offense, didn't mean it like that at all and there was no ill intent (actually the opposite) in my OC. At the end of the day I was trying to convey that the security stances of fly should be paid attention to.

tptacek•5mo ago

Sorry, I was pretty knee-jerk here.

Terretta•5mo ago

Author (who also replied to you) might have been "doing it wrong" but no wonder, Anthropic only made Claude Code smarter about this 5 days ago and there's too much to keep up with:

https://github.com/anthropics/claude-code-security-review

The new command is something like /security-review and should be in the loop before any PR or commit especially for this type of web-facing app, which Claude Code makes easy.

This prompt will make Claude's code generally beat not just intern code, but probably most devs' code, for security mindedness:

https://raw.githubusercontent.com/anthropics/claude-code-sec...

The false positives judge shown here is particularly well done.

// Beyond that, run tools such as Kusari or Snyk. It's unlikely most shops have security engineers as qualified as these focused tools are becoming.

yahoozoo•5mo ago

How can an LLM determine a confidence score for its findings?

mdasen•5mo ago

Is Claude Code better than the Gemini CLI? I've been using the Gemini CLI with Gemini 2.5 Pro and haven't been impressed. Maybe these LLMs aren't as good with Rust codebases? I'm guessing there are a lot more people looking to use these tools with JS and Python.

aaomidi•5mo ago

Claude Code is insanely good with Rust.

risho•5mo ago

i've tried codex, cursor, and a few other agentic tools and nothing compares to claude code when it comes to UX. The other service's models are quickly catching up to claude, but the claude code ux is just magical to me. i havent used it with rust personally. like you suggested would be the average user, i've mostly stuck with js and python.

irskep•5mo ago

I was once a heavy user of Cursor with Gemini 2.5 Pro as a model, then a Claude Code convert. Occasionally I try out Gemini CLI and somehow it fails to impress, even as Cursor + Gemini still works well. I think it's something about the limited feature set and system prompt.

nestorD•5mo ago

I have found Claude code to be significantly better, both in how good the model ends up being and in how polished it is. To the point that I do not drop down to Gemini CLI when I reach my Claude usage limit.

rancar2•5mo ago

You can make Gemini CLI much better by making it behave more like Claude Code. Claude Code has some lovely prompt engineering at the system and subsystem level that can be replicated with Gemini CLI. I’m having great results already. I am still perfecting process and prompts to be a fully agentic system that can do well on benchmarks but more importantly do the right work with steerability, which was an absolute pain with Gemini CLI out-of-the-box. If you are interested, I can publish some of the basics now and then I can keep you posted as I develop it into a more robust system. Just email me at randycarlton@gmail.com with the subject: SaaS.bot (where this work will likely reside).

lukaslalinsky•5mo ago

I don't know if it's Gemini CLI or Gemini 2.5 Pro, but the combination is not even comparable to Claude Code with Sonnet. I was starting with these agent tools several weeks ago, so it was very tempting to use Gemini, instead of paying for Claude Pro, but the difference is huge. In my experience, Gemini was very quick to get stuck in debugging loop, fixing something "one last time" over and over again. Or it got into writing code, despite my explicitly saying not to do so. I'm still trying to figure out if I could use Gemini for something, but every time I try it, I regret it. Claude Code with GLM-4.5 is a good alternative to paying for Claude Pro, it's not as good as Sonnet, but close.

mdasen•5mo ago

I guess what seems surprising to me is that Gemini 2.5 Pro scores well above Claude Sonnet on Aider's leaderboard, even beating Claude Opus 4.

I have been kinda wondering if there's something that just isn't as good between the CLI and model because the Gemini CLI has been a mostly frustrating experience - and it's kept me from wanting to pay for Claude because I don't want to pay money for the same frustrating experience. But maybe I should try Claude and see.

https://aider.chat/docs/leaderboards/

csomar•5mo ago

It is much better but only because Sonnet 4 is better at handling more complexity and being very code at writing code.

modo_•5mo ago

I've been using both on a Rust codebase and have found both work fairly well. Claude code is definitely more capable than Gemini. What difficulties have you had?

The biggest pain point I've had is that both tools will try to guess the API of a crate instead of referencing the documentation. I've tried adding an MCP for this but have had mixed results.

https://github.com/d6e/cratedocs-mcp

mdasen•5mo ago

It might be that we have multiple features in our codebase and Gemini seems to struggle understanding that it needs to be aware of #[cfg(feature = "x")] and also that if it's trying to run things, it might need to specify the feature.

And yes, when they guess APIs, it's highly annoying.

prmph•5mo ago

Gemini CLI is terrible. I've had it:

- Repeat more than 20 times the same response to my prompt rejecting its proposed changes; I just kept prompting to see how far it would go before doing something different. Claude Code would quickly guess there is something wrong and try something else or ask what I'm getting at

- Continually refer to outdated versions of files, even after I've told to re-read the files

- Refer to files in a different session on a different machine that have no relevance to what I'm currently doing, presumably simply because I logged in with the same account.

- Randomly crash or enter infinite loops, sometimes soon after starting

- Refuse to read files in a sibling or parent folder

- Fail to understand simple request.

- Propose empty changes

Claude Code is just far better. I only use Gemini CLI for the simplest of tasks

esafak•5mo ago

We really need an agent benchmark to explore their ability-efficiency frontier.

jrflowers•5mo ago

This is good stuff. While somebody could build a Trello clone or an image generator by typing “git clone “ followed by any number of existing projects, the code you’d get might’ve been written by a person, plus if you do that you’re not even spending any money, which just doesn’t seem right.

The future is vibe coding but what some people don’t yet appreciate what that vibe is, which is a Pachinko machine permanently inserted between the user and the computer. It’s wild to think that anybody got anything done without the thrill of feeding quarters into the computer and seeing if the ball lands on “post on Reddit” or “delete database”

expensive_news•5mo ago

This is a great comment.

I’ve noticed a new genre of AI-hype posts that don’t attempt to build anything novel, just talk about how nice and easy building novel things has become with AI.

The obvious contradiction being that if it was really so easy their posts would actually be about the cool things they built instead of just saying what they “can” do.

I wouldn’t classify this article as one since the author does actually create something of this, but LinkedIn is absolutely full of that genre of post right now.

Tempest1981•5mo ago

> their posts would actually be about the cool things they built

Presumably, they are all startups in stealth mode. But in a few months, prepare to be blown away.

t0md4n•5mo ago

I'd personally rather use gpt-5. The sub price is cheap and offers more overall value than an Anthropic sub or paying per token. The chatgpt app on iPhone and Mac are native and nicer than Anthropic's and offer more features. Codex is close enough to Claude Code and also now native. For me it's nicer to use the "same" model across each use case like text, images, code etc. this way I better understand the limitations and quirks of the model rather than the constant context switching to different models to get maybe slightly better perf. To each their own though depending on your personal use case.

bn-l•5mo ago

I wish the app supported mcp. Is this “not invented here”?

cpursley•5mo ago

The problem is GPT-5 is not in the same league as even Claude 3.5. But I do hope their lower pricing puts some downward pressure on Anthropic's next release.

t0md4n•5mo ago

I don’t believe this is true but I’m willing to be proven wrong. I believe people who think this are just used to Claude’s models and therefore understand the capabilities and limitations due to their experience using them.

js2•5mo ago

> export IS_SANDBOX=1 && claude --dangerously-skip-permissions

FYI, this can be shortened to:

  IS_SANDBOX=1 claude --dangerously-skip-permissions

You don't need the export in this case, nor does it need to be two separate commands joined by &&. (It's semantically different in that the variable is set only for the single `claude` invocation, not any commands which follow. That's often what you want though.)

> I asked Claude to rename all the files and I could go do something else while it churned away, reading the files and figuring out the correct names.

It's got infinite patience for performing tedious tasks manually and will gladly eat up all your tokens. When I see it doing something like this manually, I stop it and tell it to write a program to do the thing I want. e.g. I needed to change the shape of about 100 JSON files the other day and it wanted to go through them one-by-one. I stopped it after the third file, told it to write a script to import the old shape and write out the new shape, and 30 seconds later it was done. I also had it write me a script to... rename my stupidly named bank statements. :-)

indigodaddy•5mo ago

Does it even work with the &&? Iirc, I've never had luck putting env vars before the && and always had to do it the way you describe

DiabloD3•5mo ago

It works because they exported it. VAR=foo bar only sets it for the env passed to that exec or subshell, export VAR=foo && bar adds it to the current env then executes bar.

export VAR=foo && bar is dangerous because it stays set.

indigodaddy•5mo ago

Ah, that's what I had done wrong, thank you! And agree I wouldn't want to just one-off export it and have it be set, better to not export it for one-liner one-offs for sure

ActionHank•5mo ago

Can shorten further to rm -rf /

kiitos•5mo ago

make it work more generally via `env`

    env IS_SANDBOX=1 claude --dangerously-skip-permissions

not all shells support FOO=bar prefixes, in particular fish does not, but the above works everywhere

rirze•5mo ago

This might have been the case for fish shell; but not anymore, it works in current version. I myself have used the popular syntax without specifying `env` in my aliases.

Dragonai•5mo ago

> It's got infinite patience for performing tedious tasks manually and will gladly eat up all your tokens. When I see it doing something like this manually, I stop it and tell it to write a program to do the thing I want.

This is so funny. Thank you for sharing :)

jama211•5mo ago

This. I had a 10000 line css file, and told it to do a find and replace on some colours. It was hilariously bad at this and started chewing tokens. Asking it to write a script to swap it out and then execute that script for me and it was done instantly. Knowing the right questions to ask an AI is everything.

jofzar•5mo ago

I actually have noticed it do this by itself a couple of times, it's where I got the idea to do the same

felineflock•5mo ago

Waiting for the follow up article "Claude Code considered harmful"

narrator•5mo ago

The follow up: Why I decided to go back to handcoding assembly language after using Claude code.

JSR_FDED•5mo ago

Separation of concerns..why AI and non-AI dependent code should never be mixed.

JSR_FDED•5mo ago

Locality of behavior - why separating AI and non-AI code introduces needless complexity.

poemxo•5mo ago

"x considered harmful" was my favorite coding meme before "x is all you need"

felineflock•5mo ago

You might like this one:

https://meyerweb.com/eric/comment/chech.html

1gn15•5mo ago

Why isn't anyone talking about the HackerNews Comment Ranker plugin? [1] That's amazing. I had this idea too -- to rank HN comments by their relevance to the actual article, and filter out comments that obviously didn't read it.

Repo: https://github.com/sixhobbits/hn-comment-ranker

I need to modify this to work with local models, though. But this does illustrate the article's point -- we both had an idea, but only one person actually went ahead and did it, because they're more familiar with agentic coding than me.

[1] Oh. I think I understand why. /lh

andrewmcwatters•5mo ago

The screenshot was a really great example how bad that can end up in practice. One comment asking "What's the catch?" which is a good follow-up question to further conversation was ranked a 1/5.

hext•5mo ago

Probably just needs a slight update to expand the relevant context of child comments. I bet it's still comparing "What's the catch?" to the OP article.

visarga•5mo ago

And loads of money to pay for tokens, because every month I am out of tokens after a week or two.

aantix•5mo ago

Has anyone run with `dangerously skip permissions` and had something catastrophic happen?

Are there internal guardrails within Claude Code to prevent such incidents?

rm -rf, drop database, etc?

browningstreet•5mo ago

I don't know about Claude Code, but here's my story. With Replit, I have a bunch of tasks that I want Replit to do at the end of a coding session -- push to Github, update user visible Changelogs, etc. It's a list in my replit.md file.

A couple of weeks ago I asked it to "clean up" instead of the word I usually use and it ended up deleting both my production and dev databases (a little bit my fault too -- I thought it deleted the dev database so I asked it to copy over from production, but it had deleted the production database and so it then copied production back to dev, leaving me with no data in either; I was also able to reconstruct my content from a ETL export I had handy).

This was after the replit production db database wipe-out story that had gone viral (which was different, that dev was pushing things on purpose). I have no doubt it's pretty easy to do something similar in Claude Code, especially as Replit uses Claude models.

Anyway, I'm still working on things in Replit and having a very good time. I have a bunch of personal purpose-built utilities that have changed my daily tech life in significant ways. What vibe coding does allow me to do is grind on "n" of unrelated projects in mini-sprints. There is personal, intellectual, and project cost to this context switching, but I'm exploring some projects I've had on my lists for a long time, and I'm also building my base replit.md requirements to match my own project tendencies.

I vibe coded a couple of things that I think could be interesting to a broader userbase, but I've stepped back and re-implemented some of the back-end things to a more specific, higher-end vibe coded environment standard. I've also re-started a few projects from scratch with my evolved replit.md... I built an alpha, saw some issues, upgraded my instructions, built it again as a beta, saw some issues... working on a beta+ version.

I'm finding the process to be valuable. I think this will be something I commit to commercially, but I'm also willing to be patient to see what each of the next few months brings in terms of upgraded maturity and improved devops.

ethan_smith•5mo ago

Claude Code has minimal internal guardrails against destructive operations when using --dangerously-skip-permissions, which is why it's a major security risk for production environments regardless of how convenient it seems.

azuanrb•5mo ago

I run it locally all the time. Nothing catastrophic happened so far.

swader999•5mo ago

It commits sometimes when I'm not ready, that's about it.

danielbln•5mo ago

An over eager helm update lead to some "uh oh, I hope the volume is still there" and it was. Otherwise no, haven't had anything bad happen. Of course, it's just a matter of time, and with the most recent version it's easy to toggle permissions back on without having to restart Claude Code, so for spicy tasks I tend to disable YOLO mode.

almosthere•5mo ago

I have noticed that using LLMs does not increase tech debt, it infact erases it, and can do so codebase wide in half an hour.

andrewstuart•5mo ago

I use Claude, ChatGPT and Gemini all together.

I throw their results at each other, get them to debug and review each others work.

Often a get all three to write the code for a given need and then ask all three to review all three answers to find the best solution.

If I’m building something sophisticated there might be 50 cycles of three way code review until they are all agreed that there no critical problems.

There’s no way I could do without all three at the same time it’s essential.

elevation•5mo ago

What's your tooling workflow like? Are you using them from a browser? Or commanding them from VSCode?

wedn3sday•5mo ago

I dont know about yall, but personally I love to see an AI running with "--dangerously-skip-permissions" in an infinite loop. Every day we get closer to the cyberpunk future we deserve.

zmmmmm•5mo ago

with public access to the internet, apparently!

I'm surprised it didn't get to the point of blackmailing the author to give it more resources and keep itself running.

sgt101•5mo ago

I'm sorry but people who let an agent run on prod deserve what they get. Basically even saying you would do that should disqualify you from working in IT in the way saying "I like to drink when I'm working" should disqualify you from airtraffic control.

noahjk•5mo ago

I haven't been following too closely, but is there even a reason to do this? What are the benefits of allowing production access versus just asking for a simple build system which promotes git tags, writes database migration scripts, etc.? From my perspective, it should be easier than ever to use a "work" workflow for side projects, where code is being written to PR's, which could optionally be reviewed or even just auto approved as a historical record of changes, and use a trunk-based development workflow with simple CI/CD systems - all of which could even be a cookie cutter template/scaffolding to be reused on every project. Doesn't it make sense now more than ever to do something like that for every project?

latexr•5mo ago

> I'm sorry but people who let an agent run on prod deserve what they get.

The problem is that whatever consequences come of it won’t affect just them. You don’t really have any way of knowing if any service you use or depend on has developers running LLMs in production. One day not too far off in the future, people who don’t even like or use LLMs will be bitten hard by those who do.

rglover•5mo ago

How does Claude Code compare to just using a Sonnet/Opus model with Cline? Imagine the results have to be fairly similar?

vibecoding-grft•5mo ago

I've found Claude's CLI to be the best of what I've tried. I've moved away from cursor and found myself in a much better programming headspace wherein I can "toggle" this AI-enabled mode. It has to be a more mindful approach to when/how I use AI in my day-to-day work instead of it being a temptation to "AI" some of the work away in the Cursor IDE.

g42gregory•5mo ago

I read the section "Hitting a snag: the model builders are also the police now".

It absolutely boggles my mind how anybody thinks that this is Ok?

Unless you are in North Korea, of course.

cobbzilla•5mo ago

As I understood it, the AI company is trying to prevent itself and its customers from engaging in (probably unintentional, but any) criminal activity.

When the AI company is "policing" your agent by requiring a "human in the loop", it's just CYA (cover your ass) for the AI company.

If your agent goes off and does something illegal, the AI company would be liable unless they have some legal deniability. By requiring you the human account owner to sign-off on what your agent is doing, you become liable for any crimes your agent commits on your behalf. I haven't read their TOS but I can guarantee there is some clause like this in there.

You are still completely free to commit crimes with your agent and suffer whatever legal consequences follow!! You just have to be clear that you intentionally wanted those actions to occur which resulted in the crime. If you repeatedly allow your agents to take actions that could potentially be criminal without any human-in-the-loop, they're going to ban you because it exposes themselves to potential criminal charges.

nickradford•5mo ago

I'm curious what the prompt is you used for the poster background generation. I really like the soft illustrated feel for the images I got back

sixhobbits•5mo ago

Fine make me spill all my secrets then xD

                system_instructions = """You will generate an image. The image will be used as the background of a poster, so keep it muted and not too detailed so text can still easily be seen on top. The actual poster elements like margin etc will be handled separately so just generate a normal image that works well in A4 ratio and that works well as a background."""

            full_prompt = f"{system_instructions}\n\nGenerate a background image for an A4 poster with the following description: {prompt}"

            openai_request = {
                'model': 'gpt-4.1-mini',
                'input': full_prompt,
                'tools': [{
                    'type': 'image_generation',
                    'size': '1024x1536',
                    'quality': 'medium'
                }]
            }

            # Make request to OpenAI
            response_data = self.call_openai_api('/v1/responses', openai_request)

cultofmetatron•5mo ago

All this AI coding stuff is scaring the shit out of me. a few months ago my team were hiring for a new engineer. of the 9 candidates we ran technical interviews with, only two could work without the ai. The rest literally just vibe coded their way though the app. as soon as it was taken away, they couldn't even write a basic sql query in ecto (we're a phoenix app). when questioned about tradeoffs inherent in the ai generated implementation, all but one was completely in the dark.

danielbln•5mo ago

Now take Google away, and LSP. And the computer. Write CTEs with a pencil or bust.

I'm exaggerating of ourse, and I hear what you're saying, but I'd rather hire someone who is really really good at squeezing the most out of current day AI (read: not vibe coding slop) than someone who can do the work manually without assistance or fizz buzz on a whiteboard.

dnoberon•5mo ago

I think the point is how can you squeeze anything out of the AI without knowing the stuff at a deep enough level?

kenjackson•5mo ago

Ask most folks about the code generated by the compiler or interpreter and you’ll get blank stares. Even game devs now barely know assembly, much less efficient assembly.

There is still a place for someone who is going to rewrite your inner-loops with hand-tuned assembly, but most coding is about delivering on functional requirement. And using tools to do this, AI or not, tend to be the prudent path in many if not most cases.

jamesmccann•5mo ago

I don't think these are comparable though. Compiler generation is deterministic and more or less provably correct. LLM code is a mile away from that.

skydhash•5mo ago

Apart from the whole argument about compilers being deterministic and not LLMS.

You don't collaborate on compiled code. They are artifacts. But you're collaborating on source code, so whatever you write, someone else (or you in the future) will need to understand it and alter it. That's what the whole maintainability, testability,... is about. And that's why code is a liability, because it takes times for someone else to understand it. So the less you write, the better it is (there's some tradeoffs about complexity).

kenjackson•5mo ago

You can make LLMs deterministic, but that's not a priority right now. In the same way we used to not capture dev environments and end up in situations where you couldn't rebuild a binary exactly because the OS version, the compiler version, the CRT version, etc... all changed -- of course that's a 20 year old problem now, but was a legitimate problem as recently as 2000.

And again, we're at a point in time where we do collaborate on the source code artifacts. But maybe we won't in the future. It assumes that we see AI progress, I can see a world where asking questions of the AI about the code is better than 99% of developers. There will be the John Carmack's of the world though who know better than the AI, but the common case is that we eventually move away from looking at code directly. But this does rely on continued progress that we may not get.

jama211•5mo ago

Being able to memorise things that are easily looked up (like syntax) doesn’t demonstrate deep knowledge. It’s a bad interview question.

myaccountonhn•5mo ago

I mean maybe these juniors are geniuses, but I often find it very non-obvious why LLM-generated code it wrong and it requires me to have an even deeper knowledge. Sometimes the code is correct, but overly complicated.

One small example was a coworker that generated random numbers with AI using `dd count=30 if=/dev/urandom | tr -c "[a-z][A-Z]" | base64 | head -c20` instead of just `head -c20 /dev/urandom | base64`. I didn't actually know `dd` beyond that it's used for writing to usb-sticks, but I suddenly became really unsure if I was missing something and needing to double check the documentation. All that to say that I think if you vibe-code, you really need to know what you're generating and to keep in mind that other will need to be able to read and understand what you've written.

instig007•5mo ago

> I'd rather hire someone [...] than someone who can do the work manually without assistance or fizz buzz on a whiteboard

and the reason for you to do that would be to punish the remaining bits of competence in the name of "the current thing"? What's your strategy?

timeon•5mo ago

Used to write Perl scripts with pencil while waiting at the airport.

christophilus•5mo ago

And then later, when you couldn’t read your own handwriting, you said, “This damn thing is illegible— so it’s probably valid Perl.”

yen223•5mo ago

Perl is just Python with bad handwriting

kfajdsl•5mo ago

For your examples, honestly yeah. A dev should familiar with the basic concepts of their language and tech stack. So yes, they should be able to understand a basic snippet of code without Google, an LSP, or even a computer. They should even be able to "write CTEs with a pencil and paper". I don't expect them to get the syntax perfect, but they should just know the basic tools and concepts enough to have something at least semantically correct. And they certainly should be able to understand the code produced by an AI tool for a take home toy project.

I say this as someone who would definitely be far less productive without Google, LSP, or Claude Code.

jama211•5mo ago

I’ve written huge queries and CTE’s in my career. But I haven’t done it recently. Personally, I’d need 10 minutes of google time to refresh my memory before being able to write much sql on paper, even with bad syntax. It doesn’t mean I’m a bad engineer because I don’t bother to memorise stuff that’s easily googleable.

runako•5mo ago

> couldn't even write a basic sql query

Not the point at all, but I have found it quite common among younger professional engineers to not know SQL at all. A combination of specialization (e.g. only work on microservices that do not directly touch a database) and NoSQL has made the skill of SQL more obscure than I would have thought possible as recently as 5 years ago.

ElCapitanMarkla•5mo ago

I started to notice this in a big way at my last job which I started in 2013. We were a rails shop and by about 2016 I was noticing most new hires would have no idea how to write a SQL query.

cultofmetatron•5mo ago

> most new hires would have no idea how to write a SQL query.

probably why people think rails is slow. our integration partners and our customers are constantly amazed by how fast and efficient our system is. The secret is I know how to write a damn query. you can push a lot of logic that would otherwise be done in the api layer into a query. if done properly with the right indexes, its going to be WAY faster than pulling the data into the api server and doing clumsy data transformations there.

hollowturtle•5mo ago

You actually confirmed that rails is slow if the optimization is on the database server and doing data mangling in ruby is less efficient

richwater•5mo ago

You've correctly identified that filtering a list is slower than looking up from an index. Congratulations.

hollowturtle•5mo ago

Thank you, let me give you the eli5: I just wanted to say that you can't claim that something is fast if speed is thanks to something else

runako•5mo ago

Constructively, I would suggest some areas for study:

- relative speeds of programming languages (https://github.com/niklas-heer/speed-comparison)

- database indexing (https://stackoverflow.com/questions/1108/how-does-database-i...)

- numbers everyone should know (https://news.ycombinator.com/item?id=39658138)

And note that databases are generally written in C.

hollowturtle•5mo ago

Constructively, I just wanted to say that you can't claim that something is fast if speed is thanks to something else. OP said people thinks rails is slow but if you have a fast query it's a solved problem. Even python would be fast in this instance with an optimized query

cultofmetatron•5mo ago

> Even python would be fast in this instance with an optimized query

I wasn't trying to argue that ruby is slow (it objectively is). I was arguing that its slowness is irrelevant for most webapps because you should be offloading most of the load to your database with efficient queries.

closeparen•5mo ago

Unless the database is in your process's address space (SQLite, Datomic, etc) your first problem is going to be shipping the data from the database server to the application process.

strtok•5mo ago

1000%. It’s all about limiting those round trips to the database…

ggregoire•5mo ago

That's so weird to me, SQL is the very first language they taught me in college 20 years ago, before even learning how to write a for loop in pseudo code. Nowadays it's still the language I use the most on a daily basis.

ramchip•5mo ago

It's a wide field so it depends on the specialization. I did computer engineering 15+ years ago and we never touched SQL, but I think the software engineering people did have a class on it.

sampullman•5mo ago

I learned it ~15 years ago, and when I use it a lot it sticks with my pretty well. But if I go a month or two without writing raw queries I lose anything more advanced than select/update/delete/join. I think I forget it faster than other things because none of syntax/semantics aren't shared with anything else I use.

closeparen•5mo ago

You should at least know how to query your data warehouse environment to debug your services / find out if they're working!

ASinclair•5mo ago

I'm nearly guilty of this. I've been in industry for a bit over 10 years and I can barely write SQL. That's despite writing a bunch of queries by hand in my undergrad databases course. I almost never deal with databases myself outside of some ad-hoc queries.

phito•5mo ago

Same here, mostly because I avoid it because I really do not like writing queries. Something about the syntax rubs me the wrong way, especially if I have to switch from MySQL/Postgres/MSSQL regularly. I'll use an ORM whenever I can, if performances do not matter.

mattmanser•5mo ago

It's because it's logically in the wrong order, it should be:

   FROM Users 
   WHERE Type = 'Foo'
   SELECT id, name

They use the right order in a lot of ORMs and as I was a SQL expert (but not master), I found it so jarring at first.

You probably have the reverse problem, it doesn't fit your mental model which is in fact the right logical model.

It gets even worse when you add LIMIT/TOP or GROUP BY. SQL is great in a lot of ways, but logically not very consistent. And UPDATE now I think about it, in SQL Server you get this bizarreness:

    UPDATE u
    SET u.Type = 'Bar'
    FROM Users u
    JOIN Company c on u.companyId = c.id
    WHERE c.name = 'Baz'

skydhash•5mo ago

That's because in the relational model, you deal mostly in terms of projections. There's an action and then the rest of the statement is creating the projection of the data the action will apply to. The action always applies to the whole of the projection (IIRC).d

The semantics of SQL and a standard programming language are quite different as they are based on different computing/data model.

cultofmetatron•5mo ago

you would LOVE ecto. its an elixir dsl for writing sql and fixes all the issues I have with sql including what you just mentioned here

mattmanser•5mo ago

I actually still love writing SQL to be honest,

I can't have explained myself well, I find the SQL way "normal" even though it's logically/semantically a bit silly.

Because that's how I learnt.

My point was, if you learnt on ORMs, the SQL way must be jarring.

cultofmetatron•5mo ago

can't stand orms and I wrote one myself a long time ago.

BUT

ecto isnt' an orm. its a sql dsl and it take a lot of pain out of writing your sql while being very easy to map what you're writing to teh output dsl

so instead of

``` select Users.id, count(posts.id) as posts_count from Users left join Posts on Posts.user_id = Users.id group by users.id ```

you can write ``` from(u in User) |> join(:left, [u], p in Post, on: u.id = p.user_id, as: :posts) |> select([u, posts: p], %{ id: u.ud, posts_count: count(p.id) }) |> group_by([u], u.id)

```

the |> you see here is a pipe operator. I've effectively decomposed the large block query into a series of function calls.

you can assign subqueries as separate values and join into those as well. it doesn't try to change sql. it just makes it vastly more ergonomic to write

mattmanser•5mo ago

That's pretty much identical to an ORM:

    db.Users
      .Inlude(u => Posts)
      .Select(u => new {
        u.Id,
        Count = u.Posts.Count()});

chadcmulligan•5mo ago

I dont think they teach SQL or relational algebra any more, or at least its easy to get an IT degree and avoid it altogether.

nevir•5mo ago

I see this too, also for engineers that have only interacted with relational dbs via ORMs & query builders

abustamam•5mo ago

I've been a full stack engineer for 10 years and I know SQL syntax but a few years ago I was asked at an interview "make a relation between users and posts" and I went "rails generate user" or something, and he's like, "not that," so I was like "OK I'll add it to a prisma file" and he's like "not that, write the SQL. I dunno what to do because this has never happened before."

Needless to say, I did not get the job, but several years later I still don't know how to answer his question.

I've worked with NOSQL (Mongo/Mongoose, Firebase) and I've worked with ORMs (Prisma, drizzle, Hasura), and I've been able to implement any feature asked of me, across several companies and projects. Maybe there's a subset of people who really do need to know this for some really low level stuff, but I feel like your average startup would not.

I think maybe it's similar to "can you reverse a linked list" question in that maybe you won't need the answer to that particular question on the job, but knowing the answer will help you solve adjacent problems. But even so, I don't think it's a good qualifier for good vs bad coders.

Zizizizz•5mo ago

I would assume he means creating a foreign key relationship from the posts and users table. Doesn't rails or prisma have a SQL migration tool? (Upon looking it looks like it's Active Record)

So the equivalent of

`rails db:migrate` after doing what you suggested in the interview. You could write in SQL as..

``` ALTER TABLE posts ADD COLUMN user_id INT, ADD CONSTRAINT fk_user FOREIGN KEY (user_id) REFERENCES users(id); ```

I don't know if that's what he was after but that's what my mind jumped to immediately. I'd recommend learning a bit as sometimes I've found that orms can be a lot slower than writing plain SQL for some more complex data fetching.

jama211•5mo ago

I’ve written manual SQL for years in previous roles, but because I haven’t touched it in 6 months I’d have had to double check how to write that with a quick google. It’s just a bad interview technique to require write learned syntax.

abustamam•5mo ago

I agree but lots of companies do similar things in their interview processes so we just have to know everything I guess.

jama211•5mo ago

What’s more likely is you get lucky one day and the company that hires you is the one that happens to have asked you the questions you brushed up on recently, rather than the company that’s the best fit.

Or if you have the luxury to choose, which can happen later in your software engineering career, you can simply turn down companies with bad interview processes. Personally I’m a fan of this method, but it’s a luxury for sure.

abustamam•5mo ago

I've been fortunate in that I haven't had to do a formal interview for any job I've ever had, except for my first job. I've just been working either with people with whom I've worked, or referrals of those people. I know that's an extremely privileged place to be, but it's the best place to be IMO.

jama211•5mo ago

100%. Your experience and references should be able to speak to your abilities better than some challenge would anyways.

abustamam•5mo ago

Thanks! I think I was (conceptually) missing the constraint/references part. Prior to that I had only worked with firebase and Mongo so I was just like "OK so I just put userID column on post table right?" and apparently no, not right, lol.

What's nice about prisma and hasura is that you can actually read the sql migration files generated, and you can set the logging to a level where you can read the sql being run when performing a query or mutation. I found that helpful to understand how sql is written, but since I'm not actually writing it I can't claim proficiency. But I can understand it.

vbezhenar•5mo ago

All projects I worked with, that used ORM, were burning pile of shit and ORM was a big part of this. I hate ORM and would prefer SQL any day. In my projects I almost never choose ORM.

I trust that some people can deal with ORM, but I know that I can't and I didn't see anyone who can do it properly.

So, I guess, there are some radical views on this issue. I wouldn't want to work with person who prefers to use ORM and avoids know SQL, and they probably hold similar opinion.

It is really weird to me that someone would call SQL low level. SQL is the highest level language available in the industry, definitely level above ordinary programming languages.

abustamam•5mo ago

Interesting. The reason I like ORMs is because of type generation in TypeScript. I've never worked in a company that didn't use an ORM which is likely why I prefer it. But if I ever did work in a company that used raw SQL I'd probably just suck it up and learn better SQL. Maybe then I'd be able to make a more informed opinion.

With regards to SQL being low level, I primarily work with TypeScript so a language that talks directly with the DB (SQL) seems pretty low level compared to TS. I'm not sure what you mean by an ordinary programming language though (obviously not machine code).

vbezhenar•5mo ago

Java, Python, JavaScript, TypeScript are more or less the same and on level below SQL (when it comes to querying structured data).

The SQL is declarative query language. You describe the query, and database engine automatically builds a plan to execute the query. This plan automatically uses statistics, indices and so on. You don't generally specify that this query must use this index, then iterate over this table, then sort it, sort another table, merge them, the database engine does it for you.

Imagine that you have few arrays of records in JavaScript and you need to aggregate them, sort them, in an efficient way. You'll have to write your logic in an imperative way. You'll have to write procedures to maintain indices, if necessary. SQL does it better.

It it an interesting exercise to imagine programming in a language with built-in RDBMS (or object database system) for local or global variables. For example React Redux uses structures, which are somewhat similar to database. I don't really know if it would be useful or not, to write SQL instead of functional API (and get performant execution, not just dumb "table scan") but I'd like to try. C# have similar feature (LINQ), but it's just API, no real engine behind it.

abustamam•5mo ago

Thanks! That makes sense.

withinboredom•5mo ago

I spent 5 years working at a place without an ORM. Due to sharding and scalability issues, an ORM wasn't possible (the tech was nearly 20 years old when I left, so they didn't get some of the later database scaling tech). When I went to a company with an ORM, I had problems.

Namely, the ORM got in my way so much. I knew exactly which query to run and how to word it efficiently, but getting the ORM to generate sane SQL was nearly impossible. I eventually had to accept my fate of generating shitty SQL at every company since then...

That being said, I'll always advocate for ditching an ORM if given the chance and the expertise is available. If nobody knows why you generally wouldn't want to put an index on a boolean column, we're probably good. If people think it will help performance on a randomly set boolean field, we should probably stick with an ORM.

abustamam•5mo ago

Most ORMs I've worked with have a special method that lets you write raw SQL. Probably not the most ergonomic way, but it is an escape hatch. Which ORM were you using?

withinboredom•5mo ago

Most teams I've been on will automatically reject a PR writing raw sql due to maintainability concerns. I would never consider it in a professional context unless it can be done in a way that guarantees it is easy to maintain without using concatenation.

Most ORMs don't have the SQL tools we did to sanitize variables when putting them into queries. Some do, but not all.

abustamam•5mo ago

Gotcha, that makes sense.

gabrieledarrigo•5mo ago

One thing is reversing a linked list during a white board interview. Another write a simple JOIN between two tables.

Come on guys, working on backend applications and not having a clue about writing simple SQL statements, even for extracting some data from a database feels...awkward

abustamam•5mo ago

With NOSQL becoming more ubiquitous (for better or worse), it's not unfathomable that someone simply never had an opportunity to do something as simple write a join between two tables. Someone replied to my comment and taught me how in 5 lines of code. I read it and I'm like, oh that makes sense. Cool. I won't remember it exactly but I understand it. I wouldn't hold it against a front-end developer who's only ever worked with Vue to understand what happens when a React node rerenders.

My point is that there are acceptable levels of abstraction in all parts of software. Some companies will have different tolerances for understanding of that abstraction. Maybe they want a front-end dev to understand the CSS generated from tailwind. Or maybe they want them to know exactly what happens when a React node is rerendered. Or maybe the company doesn't care as long as the person is demonstrably productive and efficient at building stuff. What some consider basic knowledge can be considered irrelevant to others. Whether or not that has lasting consequences is to be seen, but that just brings us full circle back to the original problem at hand (is it good that people can vibe code something and not understand the code it generates)

myaccountonhn•5mo ago

I guess this is my first old-senior moment, but even if you use an ORM then you should know basic SQL. That table structure will be the most long-living thing in your system and probably outlive the ORMs and a bad table structure is going to create a mess later on.

abustamam•5mo ago

I agree. I know _basic_ SQL but because I don't work with it every day I don't consider myself proficient. I read the migration files generated by my ORMs just out of curiosity and to make sure it's not doing anything crazy, but I wouldn't be able to write the migration file myself.

jon-wood•5mo ago

Maybe this makes me a grumpy old man, but I feel like if you're primary role is to write software which interacts with a SQL database you should understand how to interact directly with that database. Not because you're going to do it frequently, but because understanding the thing your ORM is abstracting away for you allows you to more intelligently use those abstractions without making whoever runs the underlying database cry.

abustamam•5mo ago

I agree in a sense but I'm primarily a front end dev. Our backend devs (whose SQL skills are probably on par with mine) maintain the DB tables and such using our ORM but I would still need to use the ORM to query the DB. I'll hop in and make a DB update on the backend if I need to, but my point is that many front end devs are probably just using the ORM to query and run mutations. Such queries could certainly be optimized a bit by knowing a bit of SQL (adding indices to commonly queried cols for example) but I'd still consider solid knowledge in SQL to be a solely backend thing.

That said, I do know enough basic SQL to understand what ORMs are doing at a high level, but because I almost never write SQL I wouldn't consider myself proficient in it.

cutemonster•5mo ago

It'd been nice if the interviewer had informed you in advance that they were going to ask you to write SQL, so you could have prepared? Or you could have saved some time by cancelling? Why don't more companies do that

abustamam•5mo ago

The screening call had some sql questions which I was able to answer. Basic sql. I feel like I could have saved a day of interviewing if this simple fkey question was in the screening.

Companies should do that more!

hobofan•5mo ago

> because understanding the thing your ORM is abstracting away for you allows you to more intelligently use those abstractions

If that's required, then you are working with a bad abstraction. (Which in the case of ORMs you'll probably find many people arguing that they are often bad abstractions.)

Fervicus•5mo ago

> if you're primary role is to write software which interacts with a SQL database you should understand how to interact directly with that database.

I agree that there should be a general understanding one should be able to interact with it when needed. But at the same time I don't think devs need to be able to spit out queries with the right syntax on the spot in an interview setting.

cultofmetatron•5mo ago

in our particular case, they could use google. that said, I didn't ask them anything particularly complex. the AI depndant coders(I saw nothing in their abilities that demonstrated engineering) broke down trying to update a single record of known id with a new value.

jon-wood•5mo ago

Unless I'm doing an exercise where the candidate is actually writing software (in which case they can have Google, their favourite IDE, and if they must an LLM) I never expect exactly correct syntax in an interview. I'm assessing whether they demonstrate the sort of thought processes you'd expect from someone who knows what they're talking about, and can get in the same ballpark.

switchbak•5mo ago

One of the things that taught me the most about SQL is trying to use Hibernate (There's not enough expletives in the english language to describe that thing).

It's one thing to have a query language (DDL and DML no less) that was built for a different use case than how it's used today (eg: it's not really composable). But then you stack a completely different layer on top that tries to abstract across many relational DBs - and it multiplies the cognitive surface area significantly. It makes you become an expert at Hibernate (JPA), then learn a lot about SQL, then learn even more about how it maps into a particular dialect of SQL.

After a while you realize that the damn ORM isn't really buying you very much, and that you're often just better off writing that non-composable boring SQL by hand.

- assuming you have a decent testing infrastructure in place. Much of the supposed benefit of ORMs is about a form of psuedo-type safety, and making it easier to add more fields. If you have fast running tests that exercise the SQL layer, you might find those benefits aren't very compelling since you have such rapid feedback for your plain SQL anyway.

I've almost never changed the vendor of DB in a project, so that's another supposed benefit that doesn't buy me much. I have often wanted to use vendor-specific functionality however, and often find an ORM gets in the way of that.

To sum it up - I agree completely. If it's your job to wrangle an SQL DB - you ought to learn some SQL.

abustamam•5mo ago

Can you explain this part to me?

> assuming you have a decent testing infrastructure in place. Much of the supposed benefit of ORMs is about a form of psuedo-type safety, and making it easier to add more fields. If you have fast running tests that exercise the SQL layer, you might find those benefits aren't very compelling since you have such rapid feedback for your plain SQL anyway.

"decent testing infrastructure" is kinda doing a lot of heavy lifting — I love TDD but none of the startups I've worked at agreed with my love of TDD. There are tests, but I suspect they wouldn't fall under your label of decent testing infrastructure.

But let's say we do have a decent testing infrastructure — how does this solve the type safety benefit that you mentioned?

Rapzid•5mo ago

Not everyone wants to make that much money.

sjapkee•5mo ago

Wait, people still unironically use ORMs instead of writing queries directly? Not surprising then that everything works like shit

abustamam•5mo ago

Every startup I've worked at has at least done their first few MVPs using an ORM. I imagine because it's just a quick way to bootstrap a project. Since performance isn't really an issue before scale, sql vs ORM is just a matter of dev velocity, but I guess at scale most companies just never repaid the tech debt.

Rapzid•5mo ago

It would be a crap shoot if I nailed the syntax for creating the tables and the foreign key constraint.

This might be something I'd ask about in an interview, but I'd be looking for general knowledge about the columns, join, and key constraint. Wouldn't expect anyone to write it out; that's the boring part.

shortrounddev2•5mo ago

I think ORM did a lot of that too

gavinray•5mo ago

Terrifying.

bapak•5mo ago

I don't deal with SQL and my knowledge of it is limited to what I learned in high school a long time ago, but… isn't SQL like super easy? What's so difficult about it that people don't know how to use it? To me git is harder and I use that tool daily.

slightwinder•5mo ago

It's simple if you've learned it, but a bit hard to remember on specific details if you are not using it regulary. And every database is also a bit different on the advanced features.

withinboredom•5mo ago

Same. One candidate out of 6.

I use claude code quite liberally, but I very often tell it why I won't accept it's changes and why; sometimes I just do it myself if it doesn't "get it".

pryelluw•5mo ago

This was my experience prior to any of the llm tools. It’s hard to find people with all around knowledge. Plus someone good in one context is awful in another. Your hiring process should find people who are a good fit and not look for people with just certain technical skills. The basics of SQL can be learned quickly. Fit cannot be learned.

jama211•5mo ago

Well said. Some of the best engineers I know looked up syntax whenever they needed it because there’s not much point in wrote learning everything. As long as they understand what they’re doing, that’s the main point.

I’m honestly so sick of interviews filled with gotcha questions that if you’d happened to study the right thing you could outperform a great experienced engineer who hadn’t brushed up on a couple of specific googlable things before the interview. It’s such a bad practice.

lvl155•5mo ago

AI can also help you learn new things much faster. It’s just a tool.

hollowturtle•5mo ago

I'd say "Learn the wrong things much faster". But I'd actually argue that learning isn't a fast process, it's rather a very slow journey, takes time and dedication to master deep knowledge. You won't learn anything that will stay with llms, if they got the output correct

lvl155•5mo ago

Sorry to be harsh but that just sounds ignorant. LLM can be a great tool to speed up your learning process especially for devs. It can help me get over blocks that used to take me half a day of digging through docs and codes.

skydhash•5mo ago

Having a quick answer is different than internalizing some domain, especially when you may need to argue about it or propose some novel method. The former can work if you know the right question, but that's just as difficult as knowing the answer.

And most things that are useful daily is not pure knowledge. It's adapting the latter to the current circumstances (aka making tradeoffs). Coding is where pure knowledge shines and it's the easiest part. Before that comes designing a solution and fitting it to the current architecture and that's where judgement and domain knowledge are important.

hollowturtle•5mo ago

I may sound ignorant but if you delegate your knowledge to LLMs you'll stay ignorant. Learning is a never ending journey, you can chat about a problem with a non deterministic stocastic compressor with stale knowledge, no problem but don't call it education. Core domain knowledge still makes people successful not chatting skills. Sure if you have deep domain knowledge you can still benefinit AI filling the tedious details, I'm not anti AI

stronglikedan•5mo ago

> I may sound ignorant but if you delegate your knowledge to LLMs you'll stay ignorant.

That's kind of an ironic statement given the context. AI is just a glorified search engine that makes it very easy to find relevant information on a topic, just like a search engine but faster. One must still verify the results to be true, just like a search engine. AI is a tool to help you do your work faster, not do it for you, and should be trusted as much as any other anonymous source.

hollowturtle•5mo ago

Search engines index content, LLMs "compress" content during training and regurgitate it back wrong or in non deterministic way

paffdragon•5mo ago

We have also seen this about a year ago when hiring. But only a couple of them made it to the live interview and then it was evident. Most of them were quickly filtered out based on the coding submissions. We are soon about to hire again, with the uptick in LLM usage and newer more up to date models, I'm not looking forward too much having to deal with all of this.

hopelite•5mo ago

Maybe it’s time for you to update your perspective. You strike me as the old guy who curses all the young’ens who use the fancy electric power tools and never learn how to use a manual saw … and you don’t like it much.

paffdragon•5mo ago

I don't know why you make up such stories for yourself. It's completely pointless and has zero value.

Let me share you something, maybe it helps to update your perspective.

We reject people not because they help themselves with AI, everyone on the team uses AI in some form. Candidates are mostly rejected, because they don't understand what they write and can't explain what they just dumped into the editor from their AI assistant. We don't need colleagues who don't know what they ship and can't reason about code and won't be able to maintain it and troubleshoot issues. I can get the same level from an AI assistant without hiring. It's not old vs. young, we have plenty of young people on the team, it's about our time and efforts spent on people trying to fake their skills with AI help and then eventually fail and we wasted our time. This is the annoying part, the waste, because AI makes it easier to fake the process longer for people without the required skills.

jama211•5mo ago

I’ve worked for years in the past on huge complex sql. I wouldn’t have been able to remember exactly what that looks like in sql without a quick search. Your interview questions are bad if they require wrote learned syntax. Great programmers exist who barely bother to remember anything they can’t just look up.

trenchpilgrim•5mo ago

I know SQL but rarely have to use it because my career has mostly been nonrelational data (etcd, prometheus, object storage, dataframes). So for anything more than the basics I have to read the manual to refresh my memory.

ramoz•5mo ago

I've learned that I also need Gemini 2.5 and long context.

cloudking•5mo ago

Claude Code has very reliable tool calls, that's why I'm using it.

Tried Cursor, Windsurf and always ran into tool failures, edit failures etc.

throwaway-11-1•5mo ago

"ai is pretty good at design" its really cool thats its functional but am I going crazy or does all this stuff look really bad? I understand that more than anything engineers believe having to spend money on design is a total waste, but none of this is pleasing or balanced at all

Hrun0•5mo ago

> "ai is pretty good at design" its really cool thats its functional but am I going crazy or does all this stuff look really bad?

I agree that it's bad. What I noticed using AI was that it tends to introduce gradients whenever you want to make something look nice. Whenever I see a gradient now I immediately assume that it was designed with AI

weego•5mo ago

It's really bad if you value design as a primary part of your business language. If not then it's fine.

For example, AI could produce Graphanas design standards, which is fine for the audience.

mccoyb•5mo ago

Just from personal experience, visual design is the task with the worst outcomes for Claude Code (w/ latest Opus 4.1, etc).

It truly cannot reason well yet about geometry, visual aesthetics, placement, etc. The performance varies: it's quite good at matplotlib but terrible at any non-trivial LaTeX / TikZ layout or graphic. Why? Not a clear idea yet -- would love to think more about it.

I've tried many things now to try and give it eyes (via text), and this is unavoidably a place where things are ... rough ... right now.

I've had bad results with image screenshotting. More often than not, it has no idea what it is looking at -- won't even summarize the image correctly -- or will give me an incorrect take "Yes indeed we fixed the problem as you can tell by <this part of the UI> and <that part of the UI>" which is wrong.

I typically have to come in and make a bunch of fine-grained changes to get something visually appealing. I'm sure at some point we'll have a system which can go all the way, and I'd be excited to find approaches to this problem.

Note -- tasks which involve visual design which I've run into diminishing returns: * proper academic figures (had a good laugh at the GPT 5 announcement issues) * video game UI / assets * UI design for IDEs * Responsive web design for chat-based interfaces

All of these feel like "pelican" tasks -- they enter into a valley which can't be effectively communicated via textual feedback yet ...

mccoyb•5mo ago

Just reflecting on my own comment -- what one might want is an automated layout system with a simple "natural language"-like API (perhaps similar to Penrose, although it's been awhile since I looked at that project).

Hardened and long use systems like TikZ of course, do have something like this -- but in complex TikZ graphics, you end up with a mixture of "right of" and "left of" (etc) and low-level manual specification, which I think tends to fall into the zone of issues.

jama211•5mo ago

Compared to what it’d look like if I’d styled it myself, it’s great lol

btbuildem•5mo ago

Bro did nothing but bland mid web apps. Sometimes I think all this hype around vibe coding is simply because 95% of people who use it that way, they don't ever colour outside the lines.

I've been leaning hard on the code-gen crutch, don't get me wrong, and it's a force multiplier some of the time. I'm not even doing anything that out there, but it keeps stumbling over its shoelaces all the time.

alecco•5mo ago

> I had some issues getting it to understand that it wasn't meant to terminate, so I instead told it to write a basic bash script that calls claude with the -p flag and "please continue" whenever it detects its not running.

This is why we can't have nice things. Anthropic is placing more restrictive limits and now you risk being locked out for hours if you need to use it a bit more than usual (e.g. you have an impending deadline or presentation).

I wish Anthropic just banned these abusive accounts instead of placing draconian (and fuzzy) limits. The other day there was an idiot YouTube streamer actively looking to hit limits with as many concurrent Claude Code sessions as he could, doing nonsense projects.

mccoyb•5mo ago

I believe these are fundamentally two different types of abuse — the OP is engaging in a significantly less harmful version … the seriously harmful version is the account sharing / massively concurrent one which abuses whatever holes exist in the streaming API to allow Claude to “complete the completion and then stop because of limit” (which I think is there to make the UX better)

Just letting a single instance run all the time … is not that bad, seriously.

darqis•5mo ago

I think there's been enough free ad posts for this

varispeed•5mo ago

This looks like absolute nightmare. I cannot work in console like this.

SuperSandro2000•5mo ago

Fuck claude! They DDoS'ed my infra until I completely banned them!

prmph•5mo ago

> Have faith (always run it with 'dangerously skip permissions', even on important resources like your production server and your main dev machine.

Wow, the danger is not so much from Claude Code itself, but that it might download a package that will do nasty things on your machine when executed.

serf•5mo ago

Being on day 4 of being ignored entirely by CSRs from Anthropic after 5 months of paying for Max x20 has put a sufficiently bad taste in my mouth that it has killed all of my previous Claude Code cheer-leading efforts.

Sure, go have fun with the new software -- but for godsake don't actually depend on a company that can't bother to reply to you. Even Amazon replies.

elliotec•5mo ago

I had a problem with signing up for max with the wrong email, then thinking I didn’t actually do it, so I signed up with the one I wanted.

Saw the double bill and contacted them, I had a full refund in a couple days and a nice personal note thanking me for being a max user.

This was a couple months ago so it’s possible they’ve had a huge influx of requests that made it difficult to respond fast lately but I had a good experience with their customer service.

dehugger•5mo ago

Isn't a large part of AWS's reputation based around providing surprisingly good customer support?

steve_adams_86•5mo ago

My experience has generally been positive, even as a small customer spending in the low thousands per month. I've definitely had help that wasn't particularly effective or adept. Lots of gradual escalations which are fairly time consuming. But they've certainly made sure I had assistance, and it was fairly prompt.

archon810•5mo ago

Curious, what is the issue you're running into that you want them to resolve?

tomashubelbauer•5mo ago

For me it was the constant overloads. Paying 200 USD a month only to open Claude Code and half the time it would get stuck at the initial prompt. Sometimes with an overload error, sometimes just stuck forever. Maybe they improved it now, but it has motivated me to switch to Cursor Agent (also TUI based like CC) with GPT-5 to see if it was a viable alternative to Claude Code and so far it is working even a bit better for my use-cases.

bodge5000•5mo ago

Feels like the word "all" is pulling a lot of weight in that sentence, it's not cheap and you're forever reliant on a company to keep providing the service (which has already changed recently I think, seem to remember something about harsher rate limiting being put in place). "All you need" typically implies that you don't need much (eg "The terminal is all you need"), and Claude Code isn't that.

Otherwise good article, I'm still not sure vibe coding is for me and at the price, it's hard to justify trying to out, but things like this do make me a little more tempted to give it a shot. I doubt it'd ever replace writing code by hand for me, but could be fun for prototyping I suppose

umvi•5mo ago

Too bad Claude Code doesn't have a fixed cost plan for teams. The API is super expensive (I can easily spend $6-10 in a single sitting in tokens)

howToTestFE•5mo ago

I wonder what it will be like in 5-10 years time to look back at this sort of time, as we start to figure out the next way to code...

rcvassallo83•5mo ago

Let's vibe some crud?

Sir, do you realize that crud is such a solved problem that popular MVC frameworks from over a decade ago generate it for you from templates? No wasteful LLM prompting required.

amelius•5mo ago

Note: use the Esc key to close images after you opened them (back button does not work).

jmull•5mo ago

Why do you even need Claude Code?

A frustration of using tools is that they never entirely act exactly the way you want... instead of it working the way you want, you have to work they way it wants (and before that, you have to figure out what that is).

...We're stuck with this, because it's just not feasible to build custom software for each person, that works exactly the way they want.

...Or is it?

I'm intrigued by the possibility that coding models do in fact make it feasible to have software customized exactly to what I want.

Of course, that includes the coding agent, so no need for Claude Code.

zmmmmm•5mo ago

> instead of it working the way you want, you have to work they way it wants

You can use these tools in lots of different ways. You don't have to go all the way to the vibe coding extreme. You can go down to individual functions and have it write them and adjust them step by step if you want, approving every diff. Or even just have it on the side to chat with about your code (Aider is great for this in /ask mode). You can guide its behaviour with CONVENTIONS.md / CLAUDE.md and get it to work how you want as well. I find it handy just asking Aider to write comments for my functions.

It's definitely early days but I'm expecting we'll pretty quickly evolve to having standard "profiles" you can apply to work in different ways, and projects will start shipping with them out of the box (a bit like lint config). This will relieve a lot of the friction at the moment where you spend the first hours / days just corralling it to work the way you want.

interpol_p•5mo ago

I've been diving into Claude Code after reading articles constantly praising its abilities. But I think perhaps it's better suited to web development

Using it for iOS development is interesting. It does produce working output (sometimes!) but it's very hit-or-miss. Recently I gave it a couple hours to build a CarPlay prototype of one of my apps. It was completely unable to refactor the codebase to correctly support CarPlay (even though I passed the entire CarPlay documentation into it). I gave it three attempts at it. Then I intervened and added support for CarPlay manually, following that I added a lot of skeleton code for it to flesh out. Claude was then able to build a prototype

However, over the next few days as I tried to maintain the code I ended up rewriting 60% of it because it was not maintainable or correct. (By "not correct" I mean it had logic errors and was updating the display multiple times with incorrect information before replacing it with correct information, causing the data displayed to randomly refresh)

I also tried getting it to add some new screens to a game I develop. I wanted it to add some of the purchase flows into the app (boring code that I hate writing). It managed to do it with compile errors, and was unable to fix its own build output despite having the tools to do so. Instead of fixing the build errors it caused, Claude Code decided it would manually verify that only its own changes were correct by running `swiftc` on only files that it touched. Which was nonsense

All that said, there was a benefit in that Claude Code writing all this code and getting something up on the screen motivated me to finally pick up the work and do some of these tasks. I had been putting them off for months and just having the work "get started" no matter how bad, was a good kick start

cyral•5mo ago

I haven't tried Claude Code yet, but I extensively use Cursor and find it can do similar work (which makes sense). I think it's good you led with the length prompt/spec, which will maybe help the "AI can't do ____" crowd grasp that it really is a skill issue when they can't get any good results from LLMs. (Unfortunately it cannot help the "well, this isn't actually useful or making an impact" group)

Regarding some of the comments here: I found the article style fine, and I even like the "follow my journey" style writing as it helps the reader understand the process you went though. That kind of engineering and debugging workflow is something I enjoy about this industry.

ehnto•5mo ago

> help the "AI can't do ____" crowd grasp that it really is a skill issue when they can't get any good results from LLMs.

I think the issue with this conversation is that no one tells you what they are working on, and I suspect there is both a skill gap in usage, but also a lack of capability in the LLMs, both surfacing as the same outcome.

There is definitely stuff an LLM cannot do on its own, at which point is the LLM really achieving the outcome or is it the human just achieving it by backseat driving. Much like a senior telling the junior how to solve a tricky bug, you wouldn't say the junior came up with the solution and therefore you would not say the junior is capable of solving the bug.

dazzaji•5mo ago

This is a fun project to be sure. I just wish the author would not refer to the experiment as an "autonomous startup builder" unless they mean it humorously. Having poked around the GitHub repo and read through the materials, it seems like more of an AI coding assistant running in a loop that built and deployed a broken web application with no users, no business model, and no understanding of what problem it was trying to solve. There were quasi-autonomous processes and there were things that were built, but nothing I'd call a startup.

buyx•5mo ago

$100 a month for a South African is not a trivial amount. Definitely putting his money where his mouth is.

ontigola•5mo ago

"Thanks for the article. I found it interesting, though the 'vibe coding' method isn't something I can apply in my own IT environment, mainly due to its complexity and technology stack. The piece makes a strong case for why LLMs are so good at coding. It's clear they excel with the technologies and patterns most common in their training data—the same ones their own developers likely use. While their performance drops off with less familiar tech, it's undeniable that for certain types of tasks, they are very effective."

mock-possum•5mo ago

Who are you quoting here?

anotherpaul•5mo ago

I feel like Claude code is great but how is it so much different from the agent of GitHub copilot? They both can use the same model (Claude 4) and the agent behaves very similar. For 10 bucks I think GitHub copilot is very reasonably priced as well. But ofc usage limits are maybe where they differ a lot. Am I missing out?

nicewood•5mo ago

I only know the old Github Copilot (like 2yrs+ ago) so cannot speak to it directly, but even the Cursor Agent (with Sonnet 4 or GPT-5) is IMO inferior to Claude Code (CC). In my experience, it is faster and better performing. CC seems to spend tokens more deliberately + gives superior coding tools to the model than other provider.

Recently my CC subscription ran out, tried 3 prompts with Cursor Agent and then went back to subscribing CC. I still use Cursor though for autocompletion.

anotherpaul•5mo ago

Yeah the old one was much worse. They have really stepped up their game recently. Thats why I am wondering how large the gap still is. I only ever see people compare cursor and Claude code. Never GitHub copilot. So I assume there is either a blind spot or it's much worse

wolvesechoes•5mo ago

I don't even need Claude Code.

codeulike•5mo ago

How does privacy work with Claude Code in the real world? Its arbitrarily sending files from current folder/subfolders to anthropics cloud, right? Does Claude have a good privacy policy? E.g. do they promise not to retain what gets sent?

Because I think 'sending everything to the ai' would be a bit of an obstacle for most company environments

apexalpha•5mo ago

Almost every company on the planet sends ALL their data to the cloud.

They even PAY for this privilege.

codeulike•5mo ago

Some companies are pretty careful and each cloud has to have the right certifications and there will be an internal data protection policy about what sorts of things are allowed to be uploaded there. Something like Claude Code which arbitrarily yeets local files to some cloud somewhere is a liability in that environment ... unless they have a very good privacy policy. hence my question.

spyder•5mo ago

"Now I can just tell Claude to write an article (like the one you're currently reading) and give it some pointers regarding how I want it to look, and it can generate any custom HTML and CSS and JavaScript I want on the fly."

Yea, I know that was the case when I clicked on the thumbnails and couldn't close the image and had to reload the whole page. Good thing that you could just ask AI to fix this, but the bad thing is that you assumed it would produce fully working code in one shot and didn't test it properly.

sixhobbits•5mo ago

When I asked it to add the gallery I also asked it to make sure the images close if you press escape or outside the image. I guess I wasn't thinking about mobile users, but definitely on me, not Claude there :)

*EDIT* prominent close button and closing on back navigation added (probably people will complain about hijackng the back button now)

cschmatzler•5mo ago

and again we can tell based on how the x isn’t centered in the close button

_heimdall•5mo ago

If a button with content out of center is a clear sign of LLM use, these tools are decades older than I realized.

jdiff•5mo ago

I wouldn't call it a clear sign of LLM use myself but in the year of our lord 2025 it should be unheard of, we've got so many nice tools for layouting nowadays. It's certainly below par if LLMs can't reliably manage it.

wiseowise•5mo ago

Bold of you to assume that they assumed something instead of thinking that they might just not give a shit about it.

bapak•5mo ago

To be fair a lot of custom-built websites are crap too and generally they cost a lot more time and money.

e-brake•5mo ago

Nothing new but you can configure Claude Code to use GPT5 as its LLM model: https://docs.litellm.ai/docs/tutorials/claude_responses_api

yahoozoo•5mo ago

Is there a way to route Claude Code through a GitHub Copilot subscription?

adhamsalama•5mo ago

You can use GitHub Copilot as an API (free unlimited gpt-4.1) with the generated token used in the JetBrains extension, but it's really stupid so I canceled the subscription and subscribed to Claude.

mdrzn•5mo ago

This how I run CC, directly on my server, with root, but no permission skip.

javier_e06•5mo ago

AI is a box, beware of how our brains mold inside of it.

https://www.media.mit.edu/articles/a-i-is-homogenizing-our-t...

dangoodmanUT•5mo ago

I personally like Cursor ssh, because I get the editor too

meistertigran•5mo ago

I have been doing something very simillar!

Instead of giving it a VPS, I just made a tool that allows synchronization of localStorage data.

I now just upload the HTML it generates and have an app instantly.

einpoklum•5mo ago

> Within a few days I'd upgraded to the $100/month MAX plan

This is something a third of your total gross income - if we take a median over the people of the world.

Pomelolo•5mo ago

This reads like an ad.

Start all of your commands with a comma (2009)

Software Engineering Is Back

Hoot: Scheme on WebAssembly

Reinforcement Learning from Human Feedback

OpenCiv3: Open-source, cross-platform reimagining of Civilization III

The Waymo World Model

Vocal Guide – belt sing without killing yourself

France's homegrown open source online office suite

Making geo joins faster with H3 indexes

Unseen Footage of Atari Battlezone Arcade Cabinet Production

Ga68, a GNU Algol 68 Compiler

What Is Ruliology?

Show HN: Look Ma, No Linux: Shell, App Installer, Vi, Cc on ESP32-S3 / BreezyBox

Monty: A minimal, secure Python interpreter written in Rust for use by AI

Cross-Region MSK Replication: K2K vs. MirrorMaker2

Show HN: I spent 4 years building a UI design tool with only the features I use

Hackers (1995) Animated Experience

Sheldon Brown's Bicycle Technical Info

Show HN: If you lose your memory, how to regain access to your computer?

Microsoft open-sources LiteBox, a security-focused library OS

An Update on Heroku

The AI boom is causing shortages everywhere else

PC Floppy Copy Protection: Vault Prolok

Was Benoit Mandelbrot a hedgehog or a fox?

Dark Alley Mathematics

How to effectively write quality code with AI

Female Asian Elephant Calf Born at the Smithsonian National Zoo

I now assume that all ads on Apple news are scams

Understanding Neural Network, Visually

Delimited Continuations vs. Lwt for Threads

Start all of your commands with a comma (2009)

Software Engineering Is Back

Hoot: Scheme on WebAssembly

Reinforcement Learning from Human Feedback

OpenCiv3: Open-source, cross-platform reimagining of Civilization III

The Waymo World Model

Vocal Guide – belt sing without killing yourself

France's homegrown open source online office suite

Making geo joins faster with H3 indexes

Unseen Footage of Atari Battlezone Arcade Cabinet Production

Ga68, a GNU Algol 68 Compiler

What Is Ruliology?

Show HN: Look Ma, No Linux: Shell, App Installer, Vi, Cc on ESP32-S3 / BreezyBox

Monty: A minimal, secure Python interpreter written in Rust for use by AI

Cross-Region MSK Replication: K2K vs. MirrorMaker2

Show HN: I spent 4 years building a UI design tool with only the features I use

Hackers (1995) Animated Experience

Sheldon Brown's Bicycle Technical Info

Show HN: If you lose your memory, how to regain access to your computer?

Microsoft open-sources LiteBox, a security-focused library OS

An Update on Heroku

The AI boom is causing shortages everywhere else

PC Floppy Copy Protection: Vault Prolok

Was Benoit Mandelbrot a hedgehog or a fox?

Dark Alley Mathematics

How to effectively write quality code with AI

Female Asian Elephant Calf Born at the Smithsonian National Zoo

I now assume that all ads on Apple news are scams

Understanding Neural Network, Visually

Delimited Continuations vs. Lwt for Threads

Claude Code is all you need

Comments