Agreed, and I find that I use Claude Code on more than traditional code bases. I run it in my Obsidian vault for all kinds of things. I run it to build local custom keyboard bindings with scripts that publish screenshots to my CDN and give me a markdown link, or to build a program that talks to Ollama to summarize my terminal commands for the last day.
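The terminal-summary one is barely any code, for what it's worth. The core is roughly this sketch (the history file path, model name, and prompt are just my assumptions about a typical setup; it assumes Ollama's local /api/generate endpoint):

    from pathlib import Path
    import subprocess
    import requests

    # Rough sketch only: grab recent shell history and ask a local Ollama model to summarize it.
    # The history path, model name, and prompt are placeholders; adjust to your own setup.
    history = subprocess.run(
        ["tail", "-n", "500", str(Path.home() / ".zsh_history")],
        capture_output=True, text=True,
    ).stdout

    resp = requests.post(
        "http://localhost:11434/api/generate",   # Ollama's default local endpoint
        json={
            "model": "llama3",
            "prompt": "Summarize what I was working on, based on these shell commands:\n" + history,
            "stream": False,
        },
        timeout=120,
    )
    print(resp.json()["response"])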
I remember the old days of needing to figure out if the formatting changes I wanted to make to a file were sufficient to build a script or just do them manually - now I just run Claude in the directory and have it done for me. It's useful for so many things.
Obsidian is my source of truth and Claude is really good at managing text, formatting, markdown, JS, etc. I never let it make changes automatically, I don't trust it that much yet, but it has undoubtedly saved me hours of manual fiddling with plugins and formatting alone.
What does this mean?
You can EASILY burn $20 a day doing little, and surely could top $50 a day.
It works fine, but the $100 I put in to test it out did not last very long even on Sonnet.
He uses two max plans ($200/mo + $200/mo) and his API estimate was north of $10,000/mo
5x Pro usage ($100/month)
20x Pro usage ($200/month)
Source: https://support.anthropic.com/en/articles/11145838-using-cla...
"Pro ($20/month): Average users can send approximately 45 messages with Claude every 5 hours, OR send approximately 10-40 prompts with Claude Code every 5 hours."
"You will have the flexibility to switch to pay-as-you-go usage with an Anthropic Console account for intensive coding sprints."
The open source world is one where antirez, working on his own off in Sicily, could create a project like Redis and then watch it snowball as people all over got involved.
Needing a subscription to something only a large company can provide makes me unhappy.
We'll see if "can be run locally" models for more specific tasks like coding will become a thing, I guess.
I, for one, welcome our new LLM overlords.
The open source alternatives I've used aren't there yet on my 4090. Fingers crossed we'll get there.
Most of these things in the post aren't new capabilities. The automation of workflows is indeed valuable and cool. Not sure what AGI has to do with it.
Plus you shouldn't need an LLM to understand a codebase. Just make it more understandable! Of course capital likes shortcuts and hacks to get the next feature out in Q3.
The kind of person who prefers this setup wants to read (and write) the least amount of code on their own. So their ideal workflow is one where they get to make programs through natural language. Making codebases understandable for this group is mostly a waste of effort.
It's a wild twist of fate that programming languages were intended to make programming friendly to humans, and now humans don't want to read them at all. Code is becoming just an intermediary artifact that even the machines don't need, since they could write machine code directly.
I wish someone could put this genie back in the bottle.
Those are two different groups of humans, as you implied yourself.
Having a thing that is interactive and which can answer questions is a very useful thing. A slide deck that sits around for the next person is probably not that great, I agree. But if you desperately want a slide deck, then an agent like Claude which can create it on demand is pretty good. If you want summaries of changes over time, or to know "what's the overall approach at a jargon-filled but still overview level explanation of how feature/behavior X is implemented?", an agent can generate a mediocre (but probably serviceable) answer to any of those by reading the repo. That's an amazing swiss-army knife to have in your pocket.
I really used to be a hater, and I really did not trust it, but just using the thing has left me unable to deny its utility.
Maybe that is the idea (vibe coding ftw!), but if you want something people can understand and refine, it is good to make it modular and decomposable and understandable. Then use AI to help you with the words, for sure, but at some level there is a human that understands the structure.
<laughs in legacy code>
And fundamentally, that isn't a function of "capital". All code bases are shaped by the implicit assumptions of their writers. If there's a fundamental mismatch or gap between reader and writer assumptions, it won't be readable.
LLMs are a way to make (some of) these implicit assumptions more legible. They're not a panacea, but the idea of "just make it more understandable" is not viable. It's on par with "you don't need debuggers, just don't write bugs"
Judging from the tone of the article, they’re using the term AGI in a jokey way and not taking themselves too seriously, which is refreshing.
I mean like, it wouldn’t be refreshing if the article didn’t also have useful information, but I do actually think a slide deck could be a useful way to understand a codebase. It’s exactly the kind of nice-to-have that I’d never want a junior wasting time on, but if it costs like $5 and gets me something minorly useful, that’s pretty cool.
Part of the mind-expanding transition to using LLMs involves recognizing that there are some things we used to dislike because of how much effort they took relative to their worth. But if you don’t need to do the thing yourself or burn through a team member’s time/sanity doing it, it can make you start to go “yeah fuck it, trawl the codebase and try to write a markdown document describing all of the features and requirements in a tabular format. Maybe it’ll go better than I expect, and if it doesn’t then on to something else.”
This made me chuckle
And since they're human, the juniors themselves do not have the patience of an LLM.
I really would not want to be a junior dev right now... Very unfair and undesirable situation they've landed in.
Learning comes from grinding and LLMs are the ultimate anti-intellectual-grind machines. Which is great for when you're not trying to learn a skill!
That's just one use though. The other is treating it like it's a jr developer, which has its own shift in thinking. Practice in writing detailed specs goes a long way here.
> Practice in writing detailed specs goes a long way here.
This is an additional asymmetric advantage to more senior engineers as they use these tools
Says who? While “grinding” is one way to learn something, asking AI for a detailed explanation and actually consuming that knowledge with the intent to learn (rather than just copy and pasting) is another way.
Yes, you should be on guard since a lot of what it says can be false, but it’s still a great tool to help you learn something. It doesn’t completely replace technical blogs, books, and hard earned experience, but let’s not pretend that LLMs, when used appropriately, don’t provide an educational benefit.
There is no learning by consumption (unfortunately, given how we mostly attempt to "educate" our youth).
I didn't say they don't or can't provide an educational benefit.
That's application. Then presumably you started deviating a little bit from exactly what the instructor was doing. Then you deviated more and more.
If you had the instructor just writing the code for every new deviation you wanted to build and you just had to mash the "Accept Edit" button, you would not have learned very effectively.
Consequence is you get a bunch of output that looks really good as long as you don't think about it (and they actively promote not thinking about it), that you don't really understand, and that, if you did dig into it, you'd realize is empty fluff or actively wrong.
It's worse than not learning, it's actively generating unthinking but palatable garbage that's the opposite of learning.
I'm not so sure; I get great results (learning) with them because I can nitpick what they give me, attempt to explain how I understand it and I pretty much always preface my prompts with "be critical and show me where I am wrong".
I've seen a junior use it to "learn", which was basically "How do I do $FOO in $LANGUAGE".
For that junior to turn into a senior who prompts the way I do, they need a critical view of their questions, not just answers.
I have experienced multiple instances of junior devs using llm outputs without any understanding.
When I look at the PR, it is immediately obvious.
I use these tools everyday to help accelerate. But I know the limitations and can look at the output to throw certain junk away.
I feel junior devs are using it not to learn but to try to just complete shit faster. Which doesn’t actually happen because their prompts suck and their understanding of the results is bad.
Seniors on HN are often quick to dismiss AI assisted coding as something that can't replace the hard-earned experience and skill they've built up during their careers. Well, maybe, maybe not. Senior devs can get a bit myopic in their specializations, whereas a junior dev doesn't have so much baggage; maybe the fertile brains of youth are better in times of rapid disruption, where extreme flexibility of thought is the killer skill.
Or maybe the whole senior/junior thing is a red herring and pure coding and tech skills are being deflated all across the board. Perhaps what is needed now is an entirely new skill set that we're only just starting to grasp.
I had a feeling today that I should really be managing multiple instances at once, because they’re currently so slow that there’s some “downtime”.
Why would they be worried?
Who else is going to maintain the massive piles of badly designed vibe code being churned out at an increasingly alarming pace? The juniors prompting it certainly don't know what any of it does, and the AIs themselves have proven time and again to be incapable of performing basic maintenance on codebases above a very basic level of complexity.
As the ladder gets pulled up on new juniors, and the "fertile brains" of the few who do get a chance are wasted as they are actively encouraged to not learn anything and just let a computer algorithm do the thinking for them, ensuring they will never have a chance to become seniors themselves, who else will be left to fix the mess?
Is web dev still there??? Yes, it is. Just because you can "create" something doesn't mean you're knowledgeable in that area.
We literally have an entire industry built around fixing WordPress instances + code. What else do we need to worry about?
One definition of experience[0] is:
direct observation of or participation in events as a basis of knowledge
Since I assume by "AI assisted coding" you are referring to LLM-based offerings, then yes, "hard-earned experience and skill" cannot be replaced with a statistical text generator. One might as well assert an MS-Word document template can produce a novel Shakespearean play or that a spreadsheet is an IRS auditor.
> Or maybe the whole senior/junior thing is a red herring and pure coding and tech skills are being deflated all across the board. Perhaps what is needed now is an entirely new skill set that we're only just starting to grasp.
For a repudiation of this hypothesis, see this post[1] also currently on HN.
0 - https://www.merriam-webster.com/dictionary/experience
1 - https://blog.miguelgrinberg.com/post/why-generative-ai-codin...
I don't really get this. At the beginning of my career I masqueraded as a senior dev with experience as fast as I could, until it was laundered into actual experience.
Form the LLC and that's your prior professional experience, working for it
I felt I needed to do that and that was way before generative AI, like at least a decade
Hmm no news about that really
Oh, it's worse than that. You do that, and they complain that they are underpaid and should earn much, much more. They also think they are great, it's just you, the old-timer, that "doesn't get it". You invest lots of time to work with them, train them, and teach them how to work with your codebase.
And then they quit because the company next door offered them slightly more money and the job was easier, too.
I hope you don't think that what you're paying for an LLM today is what it actually costs to run the LLM. You're paying a small fraction.
So much investment money is being pumped into AI that it's going to make the 2000 dot-com bubble burst look tiny in comparison, if LLMs don't start actually returning on the massive investments. People are waking up to the realities of what an LLM can and can't do, and it's turning out to not be the genie in the bottle that a lot of hype was suggesting. Same as crypto.
The tech world needs a hype machine and "AI" is the current darling. Movie streaming was once in the spotlight too. "AI" will get old pretty soon if it can't stop "hallucinating". Trust me I would know if a junior dev is hallucinating and if they actually are then I can choose another one that won't and will actually become a great software developer. I have no such hope for LLMs based on my experiences with them so far.
A lot of the application layer will disappear when it fails to show ROI, but the foundation models will continue to have obscene amounts of money dumped into them, and the coding use case will come along with that.
Depends, right? Claude Code on a Max plan is obviously unsustainable if the API costs are any indication; people can burn through the subscription price in API credits in a day or less.
But otherwise? I don't feel like API pricing is that unrealistic. Compute is cheap, and LLMs aren't as energy-intensive in inference as some would have you believe (especially when they conveniently mix up training and inference). And LLMs beat juniors at API prices already.
E.g. a month ago, a few hours of playing with Gemini or Claude 3.5 / 3.7 Sonnet had me at maybe $5 for a completed little MVP of an embedded side project; it would've taken me days to do it myself, even more if I hired some random fresh grad as a junior, and $5 wouldn't fund even an hour of their work. API costs would have had to be underpriced by at least two orders of magnitude for juniors to compete.
Every wizard was once a noob. No one is born that way, they were forged. It's in everybody's interest to train them. If they leave, you still benefit from the other companies who trained them, making the cost equal. Though if they leave, there's probably better ways to make them stay that you haven't considered (e.g. have you considered not paying new juniors more than your current junior that has been with the company for a few years? They should be able to get a pay bump without leaving)
The same way we treat a human making a mistake??? AI can't code by itself; someone commanded it to create something.
1) There is no universal rule for anything. It doesn't have to apply to every single case. No one is saying a startup needs to hire juniors. No one is saying you have to hire only juniors. We haven't even talked about the distribution tbh. That's very open to interpretation because it is implicit that you will have to modify that based on your context.
2) Lots of big companies still act like they're startups. You're right that, short term, "the math" doesn't work out. But it does in the medium and long term. So basically, as long as you aren't working at the bootstrapping stage of a startup, you want to start considering this. Different distributions for different stages, of course.
But you shouldn't sacrifice long term rewards for short term ones. You are giving up larger rewards...
Try telling that to companies with quarterly earnings. Very few resist the urge to optimize for the short term.
> Try telling that to companies with quarterly earnings.
Who do you think I'm saying it to?
Just make sure you talk to Claude in addition to the humans and not instead of.
In the end, you either concede control over 'details' and just trust the output or you spend the effort and validate results manually. Not saying either is bad.
Now we can work on our passion projects and everything will just be LLMs talking to LLMs.
By manual labor I specifically mean the kind where you have to mix precision with power, on the fly, in arbitrary terrain, where each task is effectively one-off. So not even making things - everything made at scale will be done in automated factories/workshops. Think constructing and maintaining those factories, in the "crawling down tight pipes with a screwdriver in your teeth" sense.
And that's only mid-term; robotics may be lagging behind AI now, but it will eventually catch up.
However with an LLM I'm not so sure. So how will you write a test to validate this is done but also guarantee it doesn't add the email to a blacklist? A whitelist? A list of admin emails? Or the tens of other things you can do with an email within your system?
So the process becomes: Read PR -> Find fundamental issues -> Update prompt to guide agent better -> Re-run agent.
Then your job becomes proof-reading and editing specification documents for changes, reviewing the result of the agent trying to implement that spec, and then iterating on it until it is good enough. This comes from the belief that better, more expensive, agents will usually produce better code than 5 cheaper agents running in parallel with some LLM judge to choose between or combine their outputs.
Do you not want to edit your code after it’s generated?
You get both editors to choose from, vi _and_ emacs! All the man pages you could possibly want _and_ perldocs! Of _course_ as a Perl newbie you'll be able to fall back on gdb for complicated debugging where print statements no longer cut it.
How do you interact with your projects?
The Chat panel in VS Code has seen a lot of polish, can display full HTML including formatting Markdown nicely, has some fancy displays for AI context such as file links, supports hyperlinks everywhere, and has fancy auto-complete popups for things like @ and # and / mentioned "tools"/"agents"/whatever. Other VS Code widgets can show up in the Chat panel, too. The Chat Panel you can dock in either sidebar and/or float as its own window.
A terminal can do most of those things too, with effort and with nothing quite like the native experience of your IDE and its widgets. It seems like a lesser experience than what VS Code already offers, other than you only have one real choice for AI assistant that supports VS Code's Chat panel (though you still have model choice).
It can even debug my k8s cluster using kubectl commands and check prometheus over the API, how awesome is this?
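The Prometheus side needs nothing exotic, by the way. Something like this is the kind of check it runs for me (a minimal sketch assuming the standard Prometheus HTTP query API; the URL and query here are made-up examples):

    import requests

    # Minimal sketch, assuming the standard Prometheus HTTP API;
    # the service URL and the PromQL query are examples only.
    resp = requests.get(
        "http://prometheus.monitoring.svc:9090/api/v1/query",
        params={"query": "sum(rate(container_cpu_usage_seconds_total[5m])) by (pod)"},
        timeout=10,
    )
    for item in resp.json()["data"]["result"]:
        print(item["metric"].get("pod", "?"), item["value"][1])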
It's got 7 fingers? Looks fine to me! - AI
The number of times that my manager or coworkers have rejected proposals for technical solutions because I can't make a webpage look halfway decent is too damn high.
I have a designer on my team that adds their polish to the basic HTML and CSS I produce, but first I have to produce it. I really don't care what the front-end ends up looking like, that's for someone else to worry about. So I let the "AI" write the CSS for buttons and other UI elements, which it is good enough at to save me time. Then I hand it off to the designer and they finish the product, make the buttons match the rest of the buttons, fix the padding, whatever. It certainly has accelerated that part of my workflow, and it produces way better looking front-end UI styling than I would care to spend my time on. If I didn't have the designer, the AI-generated CSS would be good enough for most people. But, I wouldn't trust the AI to tell me if a page "looks weird". I have no doubt it would become a nuisance of false-positives, or just not reporting problems that actually exist.
Not exactly “three laws safe” if we can’t use the thing for work without violating their competitive use prohibition
A complete waste of time for important but relatively simple tasks.
Such a weird complaint. If you were to explain the rust borrow checker to me, should I complain that it doesn't count because you had read explanations of the borrow checker? That it was "in your training data"? I mean, do you think you just understand the borrow checker without being taught about it in some form?
I mean, I get what you are kind of saying, that there isn't much evidence that these tools are able to generate new ideas, and that the sheer amount of knowledge they have obscures the detection of that phenomenon, but practically speaking I don't care because it is useful and helpful (within its hallucinatory framework).
Is anyone aware of something like this? Maybe in the GitHub actions or pre-commit world?
Yeah now companies that paid lip service to those things can still not have them but pretend they do cause the AI did it....
I'd caution against using devstral on a 24 gb vram budget. Heavy quantisation (the only way to make it fit into 24gb) will affect it a lot. Lots of reports on locallama about subpar results, especially from kv cache quant.
We've had good experiences with running it fp8 and full cache, but going lower than that will impact the quality a lot.
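If it helps, this is roughly the shape of our setup (a sketch assuming vLLM; the model id and context length are placeholders, not a recommendation):

    from vllm import LLM, SamplingParams

    # Sketch of an fp8-weights + full-precision KV cache setup, assuming vLLM and a
    # Devstral checkpoint; model id and max_model_len are placeholders for your own config.
    llm = LLM(
        model="mistralai/Devstral-Small-2505",
        quantization="fp8",       # fp8 weights
        kv_cache_dtype="auto",    # leave the KV cache unquantized ("full cache")
        max_model_len=32768,
    )

    outputs = llm.generate(
        ["Write a function that parses an ISO 8601 timestamp."],
        SamplingParams(max_tokens=256),
    )
    print(outputs[0].outputs[0].text)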
I've found this generally with AI summaries...usually their writing style is terrible, and I feel like I cannot really trust them to get the facts right, and reading the original text is often faster and better.
document.body.style.backgroundColor = "black";
I'm so over this timeline.
If this is all ultimately Java but with even more steps, it's a sign I'm definitely getting old. It's just the same pattern of non-technical people deceiving themselves into believing they don't need to be technical to build tech, and then ultimately resulting, again, in 10-20 years of re-learning the painful lessons of that.
Let me off this train too, I'm tired already.
https://en.wikipedia.org/wiki/Fourth-generation_programming_...
Put another way, I am certain that Unity has done more to get non-programmers to develop software than ChatGPT ever will.
no they don't. some people do. Some people think best in sentences, paragraphs, and sections of structured text. Diagrams mean next to nothing to me.
Some graphs, as in representations of actual mathematical graphs, do have meaning though. If a graph is really the best data structure to describe a particular problem space.
on edit: added in "representations of" as I worried people might misunderstand.
Still, what both you and GP should be able to agree on, is that code - not pseudocode, simplified code, draft code, but actual code of a program - is one of the worst possible representations to be thinking and working in.
It's dumb that we're still stuck with this paradigm; it's a great lead anchor chained to our ankles, preventing us from being able to handle complexity better.
It depends on the language. In my experience, well-written Lisp with judicious macros can come close to fitting the way I think of a problem. But some language with tons of boilerplate? No, not at all.
That's what I mean by Pareto frontier: the choices made by various current-generation languages and coding methodologies (including choices you as a macro author makes, too), are all promoting readability for some tasks, at the expense of readability for other tasks. We're just shifting the difficulty around the time of day, not actually eliminating it.
To break through that and actually make progress, we need to embrace working in different, problem-specific views, instead of on the underlying shared single-source-of-truth plaintext code directly.
Diagrams and pseudocode allow us to push those inconveniences into the background and focus on the flows that matter.
Now, I claim that the main thing that's stopping advancement in our field is that we're making a choice up front on what is relevant and what's not.
The "actual problem" changes from programmer to programmer, and from hour to the next. In the morning, I might be tweaking the business logic; at noon, I might be debugging some bug across the abstraction layers; in the afternoon, I might be reworking the error handling across the module, and just as I leave for the day, I might need to spend 30 minutes discussing architecture issue with the team. All those things demand completely different perspectives; for each, different things are relevant and different are just noise. But right now, we're stuck looking at the same artifact (the plaintext code base), and trying to make every possible thing readable simultaneously to at least some degree.
I claim this is a wrong approach that's been keeping us stuck for too long now.
It's just hell of an expensive way to get around doing it. But then maybe at least a real demonstration will convince people of the utility and need of doing it properly.
But then, by that time, LLMs will take over all software development anyway, making this topic moot.
(Also, I'm _fairly_ sure that sequence diagrams didn't originate with UML; it just adopted them.)
As the sibling comment says, sequence diagrams are often useful too. I've used them a few times for illustrating messages between threads, and for showing the relationship between async tasks in structured concurrency. Again, maybe there are murky corners to UML sequence diagrams that are rarely needed, but the broad idea is very helpful.
Sounds a lot like RegEx to me: if you use something often then obviously learn it but if you need it maybe a dozen or two dozen times per year, then perhaps there’s less need to do a deep dive outside of personal interest.
I'm not sure what you mean by "unified system". If you mean some sort of giant data store of design/architecture where different diagrams are linked to each other, then I'm certainly NOT advocating that. "Archimate experience" is basically a red flag against both a person and the organisation they work for IMO.
(I once briefly contracted for a large company and bumped into a "software architect" in a kitchenette one day. What's your software development background, I asked him. He said: oh no, I can't code. D-: He spent all day fussing with diagrams that surely would be ignored by anyone doing the actual work.)
UML may be ugly and in need of streamlining, but the idea of building software by creating and manipulating artifacts at the same conceptual level we are thinking at any given moment, is sound. Alas, we've long ago hit a wall in how much cross-cutting complexity we can stuff into the same piece of plaintext code, and we've been painfully scraping along the Pareto frontier ever since, vacillating between large and small functions and wasting time debating merits of sum types in lieu of exception handling, hoping that if we throw more CS PhDs into category theory blender, they'll eventually come up with some heavy duty super-mapping super monad that'll save us all.
(I wrote a lot about this here in the past; cf. "pareto frontier" and "plaintext single source of truth codebase".)
Unfortunately, it may be too late to fix it properly. Yes, LLMs are getting good enough to just translate between different perspectives/concerns on the fly, and doing the dirty work on the raw codebase for us. But they're also getting good enough that managers and non-technical people may finally get what they always wanted: building tech without being technical. For the first time ever, that goal is absolutely becoming realistic, and already possible in the small - that's what the whole "vibe coding" thing heralds.
1) Plaintext representation, that is
2) a single source of truth,
3) which we always work on directly.
We're hitting hard against limits of 1), but that's because we insist on 2) and 3).
Limits of plaintext stop being a problem if we relax either 2) or 3). We need to be able to operate on the same underlying code ("single source of truth") indirectly, through task-specific views that hide the irrelevant and emphasize what matters for the task at hand, which is something that typically changes multiple times a day, sometimes multiple times an hour, for each programmer. The views/perspectives themselves can be plaintext or not, depending on what makes most sense; the underlying "single source of truth" does not have to be, because you're not supposed to be looking at it in the first place (beyond exceptional situations, similar to when you'd be looking at the object code produced by the compiler).
Expressiveness is a feature, but the more you try to express in fixed space, the harder it becomes to comprehend it. The solution is to stop trying to express everything all at once!
N.b. makes me think of a recent exchange I had on HN; people point out that code is like a blueprint in civil engineering/construction - but then, in those fields there is never a single common blueprint being worked on. You have different documents for overall structure, different ones for material composition, hydrological studies, load analysis, plumbing, HVAC, electrical routing, etc. Multiple perspectives on the same artifacts. You don't see them merge all that into a single "uber blueprint", which would be the equivalent of how software engineers work with code.
See also 'no-code', 4GLs, 5GLs, etc etc etc. Every decade or so, the marketers find a new thing that will destroy programming forever.
But I just couldn't handle it when I got into like COMP102 and in the first lecture, the lecturer is all "has anybody not used the internet before?"
I spent my childhood doing the stuff so I just had to bail. I'm sure others would find it rewarding (particularly those that were in my classes because 'a computer job is a good job for money').
I.e. the very species we try to limit our contact with, which is why we chose this particular field of work? Or are you from the generation that joined software for easy money? :).
/s, but only partially.
There are aspects of this work where to "engage with my coworkers" is to be doing the exact opposite of productive work.
It would be helpful if I had a long rambling dialogue with a chat model and it distilled that.
IME this can work pretty well with Gemini in the web UI. If it misinterprets you at any stage you can edit your last comment until it gets on the same page, so to speak. Then once you're to a point in the conversation where you're satisfied it seems to "get it", you can drop in some more directly relevant context like example code if needed and ask for what you want.
The AI already generated comprehensive README.md files and detailed module/function/variable (as needed) doc comments, which you could read but which end up mostly being consumed by another AI, so you can just tell it what you're trying to do and ask it how you might accomplish that in the codebase, first at a conceptual level, then in code once you feel comfortable enough with the system to be able to validate the work.
All the while you're sitting next to another coworker who's also doing the same thing, while you talk about high level architecture stuff, make jokes, and generally have a good time. Shit, I don't even mind open offices as much as I used to, because you don't need that intense focus to get into a groove to produce code quickly like you did when manually writing it, so you can actually have conversations with an entire table of coworkers and still be super productive.
No comment on the political/climate side of this timeline, but the AI part is pretty good when you master it.
Another approach is I'll dictate how an API SHOULD work, or even go nuclear and write the code I want to work, and tell them they must make the test pass and can't change what I wrote. They take these constraints well IME.
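For example, I might hand it a test like this and tell it the test file is frozen (a made-up illustration: parse_duration and mylib don't exist yet, which is the point; the agent has to implement them to satisfy exactly this):

    # Hypothetical example of the "make this pass, don't touch it" constraint;
    # parse_duration is a function the agent must implement, not an existing API.
    import pytest
    from mylib import parse_duration

    def test_parse_duration():
        assert parse_duration("90s") == 90
        assert parse_duration("2m") == 120
        assert parse_duration("1h30m") == 5400
        with pytest.raises(ValueError):
            parse_duration("soon")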
## Instructions
* Be concise
* Use simple sentences. But feel free to use technical jargon.
* Do NOT overexplain basic concepts. Assume the user is technically proficient.
* AVOID flattering, corporate-ish or marketing language. Maintain a neutral viewpoint.
* AVOID vague and/or generic claims which may seem correct but are not substantiated by the context.
Cannot completely avoid hallucinations and it's good to avoid AI for text that's used for human-to-human communication. But this makes AI answers to coding and technical questions easier to read.
Assuming it is fact checked, why?
As I say, keep your slop in your own trough.
The only argument is that it improves the style of writing.
But I am in an ESL environment and no one cares about that.
Even otherwise why would anyone want to read a decompressed version instead of the "prompt" itself?
It praised so many things that I would just consider table stakes and made simple tweaks or features sound like massive projects.
I’m sure it could be improved by tweaking the prompt and there were parts of it that I found impressive that it had picked out (specifically things not in commit messages) but I found it unusable in its current form.
1. I shouldn't have used a newly created repo that had no real work over the course of the last week.
2. I should have put more time into the prompt to make it sound less nails on chalkboard.
1) This was supposed to be piped through TTS and listened to in the background, and...
2) People like podcasts.
Your typical podcast is much worse than this. It's "blah blah" and "hahaha <interaction>" and "ooh <emoting>" and "<irrelevant anecdote>" and "<turning facts upside down and injecting a lie for humorous effect>", and maybe some of the actual topic mixed in between, and yet for some reason, people love it.
I honestly doubt this specific thing would be useful for me, but I'm not going to assume it's plain dumb, because again, podcasts are worse, and people love it.
They aren't all Joe Rogan.
Unless of course people talking in any capacity is human interaction sounds, in which case, yes, every podcast is > 90% human interaction sounds.
> Unless of course people talking in any capacity is human interaction sounds, in which case, yes, every podcast is > 90% human interaction sounds.
No, I specifically mean all the thing that is not content - hellos, jokes, emoting, interrupting, exchanging filler commentary, etc. It may add character to the show, but from the POV of efficiently summarizing a topic, it's fundamentally even worse than the enterprisey BS fluff in the example in question.
I suppose preferences differ, but really, does anyone _like_ this sort of writing style?
Even Gemini/gpt4o/etc are all guilty of this. Maybe they'll tighten things up at some point - if I ask an assistant a simple question like "is it possible to put apples into a pie?" what I want is "Yes, it is possible to put apples into a pie. Would you like to know more?"
But not "Yes, absolutely — putting apples into a pie is not only possible, it's classic! Apple pie is one of the most well-known and traditional fruit pies. Typically, sliced apples are mixed with sugar, cinnamon, nutmeg, and sometimes lemon juice or flour, then baked inside a buttery crust. You can use various types of apples depending on the flavor and texture you want (like Granny Smith for tartness or Honeycrisp for sweetness). Would you like a recipe or tips on which apples work best?" (from gpt4).
Wow yeah what a waste. That is exactly the opposite of saving time.
I got Claude Code (with Cline and VS Code) to do a task for a personal project. It did it about 5x faster than I'd have been able to do manually, including running bash commands, e.g. to install dependencies for new npm packages.
These things can do real work. If you have things in plain text formats like markdown, CSV spreadsheets, etc., a lot of what normal human employees do today could be somewhat automated.
You currently still need a human to supervise the agent and what it's doing, but that won't be needed anymore in the not-so-distant future.
I don't know, just feels like a weird community response to something that is the equivalent to me of bash piping...
It's entirely possible for a world changing tech to be created and steered to match a unique problem inside one country, and for that to change job markets everywhere.
I objected to the '[...] and nobody is financially invested in software from AI anymore.' That's a rather dubious claim of a universal consequence for a change that only affects the US.
If the comment was 'I can't wait until Section 174 changes are repealed and nobody in the US is financially invested in software from AI anymore.' I would have nothing to complain about.
To critique the content more specifically and explicitly, and not just the form: people and companies all around the world have plenty of incentives to invest in AI. A tax change in the US might change the incentives in the US slightly. But it won't have much of an impact on the incentives in Europe, China, etc.
And even in the US, even with that suggested tax change, I doubt it'll lead to 'nobody [in the US being] financially invested in software from AI anymore.'
Basically, the original comment was hyperbole at best and BS at worst.
It's at least decent though, right?
> "What emerged over these seven days was more than just code..."
Yeesh, ok, but is it accurate?
> Over time this will likely degrade the performance and truthfulness
Sure, but it's cheap right?
> $250 a month.
Well at least it's not horrible for the environment and built on top of massive copyright violations, right?
Right?
Lol, I guess their AI is too good for a redactor. Better have humans do it.
Having tried everything, I settled on a $100/month Anthropic "Max" plan to use Claude Code. Then I learned that Claude Opus 4 is currently their best but most expensive model for my situation (math code and research). I hit the limit partway through a five-hour session, switched to their API, and burned $20 in an hour. So I upgraded to the $200/month "Max" and haven't hit limits yet.
Models matter. All these stories are like "I met a person who wasn't that smart." Duh!
Moral of the story: I found out that I didn't constrain them enough. I now insist that they keep the core logic to a certain size (e.g. 300 lines) and not produce code objects for each concept but rather "fold them into the code".
This improved the output tremendously.
For the "sub agents"thing, I must admit, that Claude Code calling o3 via sigoden/aichat saved me countless of times!
There are just issues that o3 excells at (race conditions, bug hunting - anything that requires lot of context and really high reasoning abilities).
But I'm using it less since Opus 4 came out. And of course its none of the sub-agent thing at all.
I use this prompt @included in the main CLAUDE.md: https://github.com/pgflow-dev/pgflow/blob/main/.claude/advan...
sigoden/aichat: https://github.com/sigoden/aichat
Commands for working with copies of your entire repo in a new folder on a new branch
git worktree add new/path/for/worktree branchname
I now refuse to use git checkout to switch branches, always keep my main branch checked out and updated and always use worktrees to work on features. Love this workflow!
dwohnitmok•7mo ago
On the other hand, every time people are just spinning off sub-agents I am reminded of this: https://www.lesswrong.com/posts/kpPnReyBC54KESiSn/optimality...
It's simultaneously the obvious next step and portends a potentially very dangerous future.
TeMPOraL•7mo ago
As it has been over three years ago, when that was originally published.
I'm continuously surprised both by how fast the models themselves evolve, and how slow their use patterns are. We're still barely playing with the patterns that were obvious and thoroughly discussed back before GPT-4 was a thing.
Right now, the whole industry is obsessed with "agents", aka. giving LLMs function calls and limited control over the loop they're running under. How many years before the industry will get to the point of giving LLMs proper control over the top-level loop and managing the context, plus an ability to "shell out" to "subagents" as a matter of course?
qsort•7mo ago
When/if the underlying model gets good enough to support that pattern. As an extreme example, you aren't ever going to make even a basic agent with GPT-3 as the base model, the juice isn't worth the squeeze.
Models have gotten way better and I'm now convinced (new data -> new opinion) that they are a major win for coding, but they still need a lot, a lot of handholding; left to their own devices they just make a mess.
The underlying capabilities of the model are the entire ballgame, the "use patterns" aren't exactly rocket science.
lubujackson•7mo ago
> ${SUGESTION}
And recognized it wouldn't do anything because of a typo? Alas, my kind is not long for this world...
floren•7mo ago