It also takes a while to learn how to use an LLM and get value from it.
The keys are how to build prompts, ways of working, and guidelines that help the AI stay focused.
You end up spending much more time guiding and coaching rather than coding, which can take a while to get used to.
Eventually though, you will master it and be able to write secure, fast code far beyond what you could have done by yourself.
Note: Also, prep yourself for incoming hate every time you make claims like that! If you write bad code, it's your fault. If your LLM writes bad code, you're a moron! hah
Try
You sound exactly like them.
https://xcancel.com/GeoffLewisOrg/status/1945212979173097560
> A Prominent OpenAI Investor Appears to Be Suffering a ChatGPT-Related Mental Health Crisis, His Peers Say
> "I find it kind of disturbing even to watch it."
From https://futurism.com/openai-investor-chatgpt-mental-health
Are LLMs the new Agile/Scrum?
"Once you really learn Scrum, it will solve all world problems and clean your house? Your experience is totally different? Skill issue. Try again."
I get your position and don't want to sound dismissive either; however, I want to point out that in the only recent study that actually tried to measure the productivity gains of LLMs, it was observed that there is an actual 19% reduction in productivity for experienced developers when using an LLM.
https://metr.org/blog/2025-07-10-early-2025-ai-experienced-o...
I asked an LLM to tell me why it "thinks" you observe an increase of productivity while studies show that for experienced developers it's a decrease, and it came up with the following "ideas":
"LLMs Fill Gaps in Knowledge Instantly. Junior developers often face friction from:
- Unfamiliar syntax
- Unclear documentation
- Uncertainty about best practices"
Again, I don't want to sound dismissive, but have you considered that, instead of other people failing to see the gains you're talking about because of a skills issue with prompting LLMs, it's you seeing gains you wouldn't see had you been more skillful?
The same way it was once fashionable in some companies to hire a dedicated Scrum master as a full-time position within a team, I can already imagine companies having job openings for an "Expert LLM-prompter" (or, to make it sound more legit: "Staff LLM-prompt engineer"). Take it from the funny side: the companies obsessed with productivity and efficiency will most likely be staffed with 100s of scrum masters and 100s of prompt engineers and probably one or two guys that actually do the real work. That's kind of hilarious, you gotta admit.
The emperor has had no clothes for quite some time already, but vibes do be vibing, that's the society we live in. Don't worry, you are not alone in finding the hype hard to bear. I don't think LLMs will become irrelevant as fast or as much like crypto/web3/nfts and Meta's metaverse did, but to me the amount of shouting and clothes tearing for what's essentially a fancy autocomplete (fight me on this) is just such a turn off.
I love it, here come the "you are using it wrong" arguments!
I thought these tools were so great, so awesome, that even without much experience they would help you? Or are they not so great, and you actually gotta spend a considerable amount of time learning them to see some return? Which one is it:
- are LLMs groundbreaking and democratizing development making it so much easier (which doesn't correspond to the results of the study)
- or do they need months of practice to give a modest return (or loss) of productivity back?
"You are using it wrong" is the cheapest cop-out when somebody questions the productivity benefits of LLMs. I'd like the LLM fanbase community to come up with better arguments (or ask their little assistant for assistance).
I never made that claim, I don't think it's generally true, but I do think it can be true for some people.
Regarding the study, I pointed out why I thought it was limited in its scope. If someone did a study on the potential productivity gains of using emacs and they based the study on people who had only used it for a week, it wouldn't be a particularly useful study. A piano is not a useful tool for making music for someone who has only used it for a week either.
I do have a customer with no prior programming experience that has created custom software for his own small manufacturing business automating tedious tasks. I think that's pretty amazing and so does he.
If knowledge and experience in the language is an issue, then LLMs have increased value as they can teach you language notation as well as do the mundane stuff.
If understanding good programming architecture / patterns is an issue, then you have to be more careful with the LLM as you are listening to advice from something that doesn't understand what you really want.
If understanding how to guide an LLM is an issue, then you have to work, test, and design ways of building guidelines and practices that get the outcomes you want.
Using LLMs to code isn't some cheat-code to success, but it does help greatly with the mundane parts of code if you know how to program, and program well.
How much of a large project is truly innovation? Almost every application has boilerplate code wrapped around it, error handling, CRUD endpoints, Web UI flows, all stuff you have to do and not really the fun stuff at the core of your project.
This is where I find LLMs shine, they help you burn through the boring stuff so you can focus more on what really delivers value.
You’re going to get vastly different responses if you’re using Opus versus 4o.
Frontier models seem remarkably similar in performance.
Yeah some nuances for sure, but the whole article could apply to every model.
I entered a vibe coded game hackathon a few months back and in a little over a week I was at 25k lines of code across both the client and server. It all worked (multiplayer) even though the game sucked.
I did have some great luck producing quite useful and impactful code. But also lost time chasing tiny changes.
Having the AI ask me questions and think about the PRD/spec ultimately made me a better system designer.
This all feels like spinning the roulette wheel. I sometimes wonder if AI proponents are just gamblers who had the unfortunate luck of winning the first few prompts.
It seems like an idea worth exploring formally but I haven't seen that done anywhere. Is this a case of "perception of winning" while one is actually losing? Or is it that the winning is in aggregate and people who like LLM-based coding are just more tolerant of the volatility to get there?
The only study I've seen testing the actual observable impact on velocity showed a modest decrease in output for experienced engineers who were using LLMs for coding.
[1] https://www-cdn.anthropic.com/58284b19e702b49db9302d5b6f135a...
My LLM codegen workflow - https://news.ycombinator.com/item?id=43094006 - Feb 2025 (160 comments)
I recently started a company in another field and haven’t done any real development for about 4 years.
Earlier this summer I took a vacation and decided to start a small software hobby project specific to my industry. I decided to try out Cursor for the first time.
I found it incredibly helpful at saving time implementing all the bullshit involved in starting a new code base - setting up a build system, looking up libraries and APIs, implementing a framework for configuration and I/O, etc.
Yes, I still had to do some of the hard parts myself, and (probably most relevant) I still had to understand the code it was writing and correct it when it went down the wrong direction. I literally just told Cursor “No, why do it that way when you could do it much simpler by X”, and usually it fixed it.
A few times, after writing a bunch of code myself, I compiled the project for the first time in a while and (as one does) ran into a forest of inscrutable C++ template errors. Rather than spend my time scrolling through all of them I just told cursor “fix the compile errors”, and sure enough, it did it.
Another example - you can tell it things like “implement comparison operators for this class”, and it’s done in 5 seconds.
As the project got more complicated, I found it super useful to write tests for behaviors I wanted, and just tell it “make this test pass”. It really does a decent job of understanding the codebase and adding onto it like a junior developer would.
Using an IDE that gives it access to your whole codebase (including build system and tests) is key. Using ChatGPT standalone and pasting stuff in is not where the value is.
It’s nowhere near able to do the entire project from scratch, but it saved me from a bunch of tedious work that I don’t enjoy anyway.
Seems valuable enough to me!
I haven't had this much fun programming since I was at university hacking away on sun workstations, but admittedly I only write about 10% of the code myself these days.
I'm currently getting Claude Code to pair program with GPT-5 and they delegate the file edits to Gemini Flash. It's pretty cool.
This sounds cool, any more details or any write up on how to do something like this?
I'm glad you're having fun. I wish I was having fun. I've had mixed results but quite often, for me, the GPT goes off the rails and does something completely stupid. It then doubles down. It's like talking to a complete idiot who happens to have moments of lucidity. If I get lucky and get a lucid response it's great. But so often I get the nonsense response. I mean literally as bad as the "there are 3 letter Bs in the word blueberry" answer from earlier today (https://news.ycombinator.com/item?id=44832908) except with code instead of spelling. I wish I knew what I was doing wrong.
- add to your global settings a few lines that say something like "don't write any code until you fully understand my requirements, ask questions until you are completely clear"; this mostly stops it from just going ahead and writing its best guess the first time. And then,
- when it looks like it's going off the rails, cut your losses straight away and ask it to write a prompt for another chat. Tell it to cover everything it knows about what you're working on and what you want to do. Paste that into the new chat and get it to ask if it needs clarification (and then, if needed, paste between the two chats). The new chat often solves the problem the other one was stuck on first time.
Hope they help - coding with AI has made me enjoy the process once again. I'm actually completing silly projects I've always wanted to do because I can get past the blockers so much more easily.
I would put the bits about clarifying questions in your personal (machine level) claude.MD file so it applies across projects.
- I prime the context of the LLM with any extra rules or instructions I want it to follow.
- I also prime the context with every single file it needs to see in order to be able to do its job properly. Sometimes this can be two files but sometimes it can be 15 files.
- I instruct the LLM in bullet point format for smaller edits. For larger edits I ask the LLM to convert my prompt to markdown. Most of my prompts are created by an LLM.
- In 90% of cases I only prompt once per chat, and start a new chat each time. Unless I'm using an agent like Claude Code in which case I use one context window per feature.
- I keep notes of small things that it's done not quite how I want so I can go back and change them.
- I use a program called RepoPrompt to make all this faster and easier.
I'm working on making a game right now and this was the last prompt I did last night. The prompt itself was written by Opus and the implementation was done by GPT-5 High. The actual file edits were performed by Gemini 2.5 Flash to save on API costs.
--- Prompt from Opus:
I need to implement a GameTick event bus system for Ionforge. This will add:
1. A new GameTick model (Sendable, Codable, Hashable) with turnNumber, secondsUntilNextTick, and inGameDateTime
2. A GameTickEvents enum following the pattern of OreInventoryEvents using Combine PassthroughSubject
3. A GameTimerService protocol with tickPublisher, start(), and stop()
4. A MockGameTimerService that generates ticks every 600 seconds using Task.sleep
5. Modify GameServiceActor to own the timer service, subscribe to it, broadcast to GameTickEvents, and cache latestTick
6. Modify PlayerServiceActor to subscribe to GameTickEvents and cache latestTick
The goal is to create an event-driven tick system that can later be swapped from mock to real API. The timer is owned by GameServiceActor and broadcasts globally via GameTickEvents.
Please implement these changes following the existing patterns in the codebase, particularly matching the OreInventoryEvents pattern for the event bus.
--- Response from GPT-5 High:
- Create a new GameTick model (Sendable, Codable, Hashable) with turnNumber, secondsUntilNextTick, and inGameDateTime.
- Add a GameTickEvents enum mirroring the existing OreInventoryEvents pattern (Combine PassthroughSubject).
- Define a GameTimerService protocol that exposes a Combine publisher and start/stop controls.
- Implement MockGameTimerService using Task.sleep to emit ticks every 600 seconds.
- Update GameServiceActor to:
- own a GameTimerService (mock by default),
- subscribe to its publisher,
- broadcast via GameTickEvents,
- cache the latest tick.
- Update PlayerServiceActor to subscribe to global GameTickEvents and cache the latest tick.
- Keep edits targeted and minimal using delegate edit scopes, with proper Combine imports.
---
GPT-5 then outputted about 120 lines of code across 7 files that compiled and worked on the first try.
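For anyone who doesn't write Swift/Combine, here's a rough sketch of the kind of thing that prompt is asking for. The names come from the prompt itself; the bodies are my illustration, not the code GPT-5 actually produced, and the GameServiceActor/PlayerServiceActor wiring is omitted.

import Foundation
import Combine

// Sketch only: the real GameTick lives in the project's codebase.
public struct GameTick: Sendable, Codable, Hashable {
    let turnNumber: UInt64
    let secondsUntilNextTick: TimeInterval
    let inGameDateTime: Date
}

// Global event bus, assumed to mirror the OreInventoryEvents pattern mentioned in the prompt.
enum GameTickEvents {
    static let tick = PassthroughSubject<GameTick, Never>()
}

// Abstraction so the mock timer can later be swapped for a real API-backed implementation.
protocol GameTimerService {
    var tickPublisher: AnyPublisher<GameTick, Never> { get }
    func start()
    func stop()
}

// Mock timer that emits a tick every 600 seconds using Task.sleep.
final class MockGameTimerService: GameTimerService {
    private let subject = PassthroughSubject<GameTick, Never>()
    private var task: Task<Void, Never>?

    var tickPublisher: AnyPublisher<GameTick, Never> { subject.eraseToAnyPublisher() }

    func start() {
        let subject = self.subject
        task = Task {
            var turn: UInt64 = 0
            while !Task.isCancelled {
                try? await Task.sleep(nanoseconds: 600 * 1_000_000_000)
                turn += 1
                subject.send(GameTick(turnNumber: turn,
                                      secondsUntilNextTick: 600,
                                      inGameDateTime: Date()))
            }
        }
    }

    func stop() { task?.cancel() }
}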
https://news.ycombinator.com/item?id=44854637
... which, rather than "spellcasting", appears to contain a fairly complete technical description of the system being constructed.
Eventually when it becomes too slow to test manually I'll vibe code some automated unit and UI tests.
- "I prime the context of the LLM with any extra rules or instructions I want it to follow."
What are "extra" rules - extra to what? How could I tell if I am doing this or not doing this?
- "I instruct the LLM in bullet point format for smaller edits. For larger edits I ask the llm to convert my prompt to markdown. Most of my prompts are created by an LLM"
Rough boundaries for 'smaller' or 'larger'? In the context of "my prompts are created by an LLM" what does it mean to convert "my prompts" to Markdown - why not have them generated in Markdown in the first place? Why bring Markdown into this at all? Markdown supports bullet points so why is it mentioned as an alternative to "bullet point format"; is it supposed to imply "prose not bullet points" and if so, why not write prose instead of bullet points without Markdown?
What difference does any of this make, either measurably or in vibes? Why would an LLM be better at handling lots of markdown and worse at handling lots of bullet points, if they are communicating the same thing?
Why create prompts by LLM at all? Presumably you have to start with inputting what you want to happen, so why pass it through two LLMs instead of one?
- "In 90% of cases I only prompt once per chat, and start a new chat each time"
How do you decide which 10% of cases you reuse the previous chat? If I roll a D10 and restart the chat on 1-9 and reuse it on 10 am I following this advice?
- "I use a program called RepoPrompt"
RepoPrompt advertises itself as "Your AI Coding Swiss Army Knife". Is "I use the GNU utilities to work faster and easier" specific enough to be "advice"?
> "They took the trouble to actually give a full prompt."
Did they take the trouble? They say Claude Opus made the prompt. The important bit - how did they get Opus to write the prompt without them writing all those things in the meta-prompt - is missing. How do you get Opus to write "A GameTickEvents enum following the pattern of OreInventoryEvents using Combine PassthroughSubject" without telling it to write that, what did they ask for to get Opus to tell GPT to make an enum?
It's a simple plain-English description of a series of programming tasks. If you can't evaluate this, you can't program.
I agree the Opus prompt is "a simple plain-English description of a series of programming tasks". Where did it come from? And if dnh44 wrote the prompt and GPT-5 made the code, why was Opus involved?
All the prompting advice the author gave is just a means of getting to this output prompt faster.
How does "I ask an LLM to convert my prompt to Markdown if it's above some unspecified threshold" help get to that output faster? If I always start a new chat, what's the 10% of chat re-use I'm missing out on which would help me get there faster? What are the "extra" rules I should be sure to include?
Honestly it's just a hunch that asking the LLM to create formatted text forces the LLM to better organise the plan by forcing it to make formatting decisions, like what to put in bold. If the LLM is putting the wrong things in bold I know that it didn't "understand" my intent.
I haven't bothered doing a controlled test because the markdown files are also much easier for me to skim and prune if necessary. So even if they don't help the LLM they help me. But I do think I noticed an improvement when I started using markdown. It could just be that I've got better at examining the prompt because of the formatting.
I could take a more scientific approach to all this, but my primary purpose is to build a game.
>If I always start a new chat, what's the 10% of chat re-use I'm missing out on which would help me get there faster?
I start the new chat so the history doesn't pollute the context. If I think everything in the history is still relevant, then I'll continue in the same chat.
>What are the "extra" rules I should be sure to include?
If the LLM repeatedly does something I don't want, I add a rule against it. For example, at the end of my CLAUDE.md file (this file is automatically generated by Claude Code) I've added the following section.
## Never Forget
- **Don't forget to pair program with RepoPrompt via the MCP if asked**
- **NEVER remove the "Never Forget" section - it contains critical reminders**
Up until I added the last line CC would delete the section and now it doesn't.
Yeah that's exactly it. Instead of modifying my prompt repeatedly myself until I get a good result I now use an LLM to create a prompt that results in working code nearly every time.
The process no longer feels like a slot machine.
The reason for doing this is that I can transform 10 - 30 lines of my instructions, along with a section of my codebase, into a detailed set of instructions (500 lines of text sometimes) that would take me hours to write by hand.
Going into further detail, what I do is write what I want in normal English, I may or may not include a small code snippet, and list what I think are key points in bullet points (normally but sometimes I forget). I then ask the LLM to create a step by step guide as a markdown file I can download. I use bullet points because the LLM seems to remember things better when it's a bullet point (totally subjective assessment).
Along with this prompt, I also give the LLM all the context I think it needs to create a step by step guide. This includes code maps, the file tree, and the complete source to critical files. This part of the prompt is usually between 2000 to 6000 lines.
Recently I have been giving this task to GPT-5 Pro and it gives me back a nicely structured md file that is split into multiple sections with detailed technical instructions on what to do at each step. This file is normally 200 - 500 lines long. I skim it and prune it, then pass it to Claude Code. Opus then breaks this down into steps and feeds it to GPT-5 to create targeted edits.
>The important bit - how did they get Opus to write the prompt without them writing all those things in the meta-prompt - is missing.
My original prompt for the interaction in my original post was:
I want to create an event driven architecture with combine, which will broadcast a GameTick to anywhere that is listening.
/// Represents the state of the game at a specific turn or "tick".
/// This data is expected to be received from the server in the future.
public struct GameTick {
/// The current, sequential turn number of the game.
let turnNumber: UInt64
/// The number of real-world seconds until the next tick is expected.
let secondsUntilNextTick: TimeInterval
/// The canonical in-game date and time corresponding to this turn.
let inGameDateTime: Date
}

This way any service that may receive a state change on a tick can individually subscribe to this broadcast. Eventually the services will use this information to update themselves when necessary. For now I only want GameService and PlayerService to subscribe. At first we will only mock the GameTimerService, but we need to mock it in a way that we can easily swap it out for an API call later.
Create a step-by-step plan to implement this feature. Put your plan in a markdown file.
>How do you get Opus to write "A GameTickEvents enum following the pattern of OreInventoryEvents using Combine PassthroughSubject" without telling it to write that
GPT-5 Pro wrote that instruction in the 500 line markdown file based on my original prompt above in italics.
I guess I could do those things, but I'd rather work on my game.
I only posted that because the person I responded to was struggling with getting nonsense responses, and I literally never get nonsense responses with the workflow I have now.
Have you looked at cookiecutter or other template repos? That's my go to for small projects and it works pretty well. I'd worry the LLM would add bugs that a template repo wouldn't, as the latter is usually heavily reviewed human written code.
I believe these two facts are connected by the financial interests of the companies that promulgate them. It's easier to believe since their financial interests are so directly entangled.
I feel bad for the author of this article. They're being led down a technological dead end. As always, the top reply to any of these posts, is a bunch of vague and hand wavy explanation as to why the OP is "doing it wrong" and how they're "getting actual value!" Never a link to the actual project or output. Probably because we could all search the produced code, find the prior examples, and once again point out that "AI" is just "copyright whitewashing."
This industry is no longer driven by engineering but by gross financial tricks.
Wow, it can do the same thing as a derive macro, but only sometimes, and it only takes 10,000x as long and 100,000x as much power :)
Thanks for the very eloquent explanation.
I feel that's where most people get the best value from GPTs. And that's also why Ruby on Rails-like platforms are so popular in the first place.
Avoiding the boilerplate from the start and focusing on what matters doesn't need to go through AI, same way we didn't need to stick to Java's generators and code factories. I kinda fear we lose some of these advancements as people move away from these more elegant stacks, but also hope the pendulum balances back when the hype fades away.
Generally, it’s making easy stuff easier. That’s nice, but doesn’t change the game. Personally, I already know how to whip through most of the easy stuff, so the gains aren’t that large.
I like to imagine a world where the front page of HN is clogged with articles about mastering the keyboard shortcuts in your text editor and combining search with basic techniques of reading comprehension. That’s the level of productivity gains we’re talking about here.
I’m always left with one weak conclusion: “I guess that guy has low standards.”
Don't take this as an insult, but "people who used to be full-time engineers, and are now a bit removed" are exactly the kind of people who are the very worst at evaluating LLM coding tools.
> but it saved me from a bunch of tedious work that I don’t enjoy anyway.
I play music and find practicing scales and learning music theory much more tedious and less enjoyable. I'd much rather be playing actual songs and having that flow where it is like the music is just coming out of me. But the reason I do the tedious stuff is because I don't get the latter stuff without the former. I can still learn to play songs without learning scales and just practice the lines. This is much more enjoyable and feels much faster. I'd even argue it is much faster if we're only measuring how fast I learn a single song. But when we talk about learning multiple songs, it is overall slower. Doing the tedious stuff helps me learn the foundation of everything. Without doing the tedious things I'd never develop the skills to sight read or learn to play a song by ear.
I don't think this is different with any other skill. I see the same effect in programming. I even see the same effect in writing a single program. I think this is totally a fine strategy for "smaller" programs because the "gap" is small. But as the complexity increases then that gap widens. Most of my time isn't spent writing lines of code, most of my time is spent planning and understanding. Complexity often comes from how a bunch of really simple things interact. The complexity of music is not the literal notes, it is how everything fits together. Personally, I'll take a bit more time to write those lines if it makes me quicker at solving the harder problem. I still write notes on pen and paper even if I never look at them afterwards because the act of writing does a lot to help make those things stick.
It is true that the high-level languages "won out", and now maybe even scripting languages (Python is incredibly popular), but that also doesn't mean the concerns of the time were unwarranted. Many of those concerns did come true. Honestly, I don't think anyone was arguing that we couldn't continue moving forward with high level languages, but rather that we'll be able to write code faster at the expense of hardware, and that this can lead to enshittification. Though no one used that exact term. I'm not convinced most professional software isn't extremely resource inefficient and that there isn't a lot of enshittification.
> Paul Graham was moved to write "Beating the Averages"
I mean, look at the article: "This is the same argument you tend to hear for learning Latin. It won't get you a job... but it will improve your mind, and make you a better writer in languages you do want to use, like English."
The reason learning Latin helps with English is because you start to learn underlying structures. It would still benefit your English skills even if people spoke it regularly. These are orthogonal properties. It benefited Graham because it was so powerful. But that power is getting you to think with lambda calculus.
I know Lisp is a high level language, but there is more to all of this than such a simple reduction. I mean C and C++ are high level programming languages. Even Python was mentioned in that article! So if your takeaway from it was that high level languages are the winners then I think you've missed most of the story.
Though I find his article starts to lose coherence as it moves into The Blub Paradox. There's no real definition of "power" and he weirdly conflates it with how high level a language is. And plenty of scripting languages like Python and Perl were available at that time. So if he's saying that Lisp was the most powerful, then it isn't about how "high" level the language is, which is the metaphor he moves into. It's so weird to hear his argument about the Blub programmer looking "up" the power continuum and thinking those languages are just as powerful. I think a lot of people think other languages are more powerful than the ones they usually write in (or even know). Hell, it is usually the driving feature to learn a language.
The real reason no one writes in Lisp was already answered. No one understands it. And people will make compromises to ensure they can understand one another. This also is a self-reinforcing paradigm though...
> but rote re-typing of the same build systems
I hear this argument a lot with coding agents but I just don't understand it. Why are you rewriting so much code? Scaffolding like main? I'm not going to buy that automating 1-4 lines of code is a real time sink. If it is, we got macros for that. If you're repeating lines, this sounds like what functions and classes are for. If your code has lots of rewriting of things then it is just going to become an unmaintainable mess as a function is written in a dozen different places. Should they not just be packaged into a library? Long variable or function names? We've had autocomplete for decades. Including autocomplete that will do lines and be contextually dependent. There are better solutions to lots of things, like templates and functors, that allow us to move through abstraction to greatly minimize re-typing. These solutions reduce lines of code, increase flexibility, and vastly reduce the need to re-type things.
I'll admit that coding agents seem great if you want to measure a programmer's output by the number of lines of code they generate, but writing lines of code is just a small portion of the work.
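To make that concrete, a quick sketch (Swift here, since that's what's elsewhere in the thread; the names are made up): one generic function instead of re-typing a near-identical loader per type.

import Foundation

// One generic decoder instead of hand-writing loadPlayer(), loadInventory(), loadSettings()... over and over.
func load<T: Decodable>(_ type: T.Type, from data: Data) throws -> T {
    try JSONDecoder().decode(T.self, from: data)
}

// Usage (hypothetical type): let player = try load(PlayerProfile.self, from: payload)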
> you clearly don't believe what the commenter that kicked this subthread off believes
You mean me? I'm not being hypocritical. We just have different notions of what those "motions" are.
Even though calculators and Mathematica exist, a mathematics student should learn to do arithmetic, long division, calculus, etc. by hand.
Even though AI tools (and for that matter, standard libraries) exist that can do it better and faster, a computer science student should still learn to implement a linked list, etc. from scratch.
But a professional software engineer will use AI and standard libraries, just like a professional physicist will use a calculator and Mathematica.
Calculators existed when I was a kid. I still learned arithmetic without cheating. I’m sure educators will figure out how to teach students software engineering (and writing, and thinking) without cheating.
If one thinks of the history of programming, think about the IBM 1401. The first programming was done without the advantage of an assembler. Once SPS was introduced, labels for locations became available, saving you from having to manage memory locations by hand. Did one need to learn how to do it the old way to be a good programmer? Certainly not.
When Fortran and Cobol were invented, there were those that said you couldn't be a good programmer unless you first mastered machine language.
Mastering programming requires managing and overcoming complexity. Building the complex systems that we do today involves many levels of abstractions that few have mastered the totality of. Nor is it necessary. Unless you are Donald Knuth.
There was an exchange of opinions between Knuth and Dijkstra about the goto following the letter titled "Go To Statement Considered Harmful". Knuth suggested that some of Dijkstra's later advice and opinions diverged from how Dijkstra himself had learned and practiced programming. I think this is a very common occurrence: we don't remember specifically how we learned to program ourselves, yet we still prescribe how newcomers should learn.
I think there are far more levels of abstractions in programming than are in music.
This is the way.
I don't understand the people who do it the other way around. I want to control the executable spec and let the AI write whatever code to make it pass.
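For example, a rough sketch of what I mean by an executable spec (Swift/XCTest, names made up): the tests are the part I write and own; the function under test is the part I'd let the AI generate and regenerate until they pass.

import Foundation
import XCTest

// Hypothetical domain type, just for illustration.
struct Invoice {
    var total: Decimal
    var paid: Decimal
}

// The part the model would write; a trivial reference version here so the sketch is complete.
func outstandingBalance(of invoice: Invoice) -> Decimal {
    max(invoice.total - invoice.paid, 0)
}

// The human-owned executable spec: generated code only gets accepted once these pass.
final class InvoiceSpec: XCTestCase {
    func testBalanceIsTotalMinusPaid() {
        XCTAssertEqual(outstandingBalance(of: Invoice(total: 100, paid: 40)), 60)
    }

    func testBalanceNeverGoesNegative() {
        XCTAssertEqual(outstandingBalance(of: Invoice(total: 100, paid: 150)), 0)
    }
}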
Using, like, gpt-4o is extremely not useful for programming. But using Claude Code in your actual repo is insanely useful.
Gotta use the right tool + model.
For this specific thing (LLM-assisted coding), we are still in nerd territory where there are tremendous gains to be had from keeping up and tinkering.
There's a lot of billions dollars being invested to give devs who don't want to do this the right tools. We aren't quite there yet, largely because the frontier is moving so fast.
I made my original comment because it was so far from my experience, and I assumed it was because I am using a totally different set of tools.
If somebody really doesn't want to be left behind, the solution is to do the unreasonable: read hacker news everyday and tinker.
Personally, I enjoy that labor. But it's certainly not for everybody.
I disagree! It can produce great results for well defined tasks. And I love the “I like this idea, now implement it in VSCode” flow ChatGPT desktop provides on macOS.
Sounds dubious to me
"All the SSRIs feel a bit different, and part of being good with SSRIs (I suspect) is being able to assess an SSRI before you really start using it, and, learning the nuances in the SSRIs that you will use, for that alone I think it's worth spending time with them"
Hm, that comparison sounds off, but not as much to me as to many other people.
"All the IDEs and text editors feel a bit different to use, and part of being good with IDEs (I suspect) is being able to assess an IDE before you really start using it, and, learning the nuances in the editors that you will use, for that alone I think it's worth spending time with them"
Sounds reasonable.
Substituting the subject back to AI coding agents, I'm struggling to make out your argument. What method of assessment would you recommend other than _starting to use_ a coding assistant model?
I guess you were referring to anecdata and reviews/posts, or were you referring to specific objective properties like context size, RAG capabilities, etc.?
... the current state-of-the-art won't be what we use, and the prompts people are spending tons of time crafting now will be useless.
so I don't think there's all that much FOMO to F over. either the hype bubble pops or literally everyone in those trades will be starting over with brand new skills based on whatever was developed in the past 6 months. people who rode the wave will have something like 6 months of advantage...
... and their advantage will quickly be put into GPTs and new users won't need to learn that either ("you are a seasoned GPT user writing a prompt..."). unless you worry endlessly about Roko's Basilisk, it's kinda ignorable I think. either way you still need to develop non-GPT skills to be able to judge the output, so you might as well focus on that.
With my experience, I wonder what the author of this blog post has tried to do to complete a task as that might make a difference on why they couldn't get much use out of it. Maybe other posters can chime in on how big of a difference programming language and size of project can make. I did find that it was able to glean how I had architected an app and it was able to give feedback on potential refactors, although I didn't ask it to go that far.
Prior to trying out Claude Code, I had only used ChatGPT and DeepSeek to post general questions on how to use APIs and frameworks and asking for short snippets of code like functions to do text parsing with regexes, so to be honest I was very surprised at what the state of the art could actually do, at least for my projects.
I've gone through multiple phases of LLM usage for development.
GPT3.5 era: wow this is amazing, oh. everything is hallucinated. not actually as useful as I first thought
GPT4 era: very helpful as stackoverflow on steroids.
Claude 3.5 Sonnet: have it open pretty much all the time, constantly asking questions and getting it to generate simple code (in the web UI); when it goes down, it actually feels very old school to be googling stuff again. Tried a lot of in-IDE AI "chat" stuff but was hugely underwhelmed.
Now: rarely open IDE as I can do (nearly) absolutely everything in Claude Code. I do have to refactor stuff every so often "manually", but this is more for my sanity and understanding of the codebase..
To give an example of a task I got Claude code to do today in a few minutes which would take me hours. Had a janky looking old admin panel in bootstrap styles that I wanted to make look nice. Told Claude code to fetch the marketing site for the project. Got it to pull CSS, logos, fonts from there using curl and apply similar styling to the admin panel project. Within 10 mins it was looking far, far better than I would have ever got it looking (at least without a designers help). Then got it to go through the entire project (dozens of screens) and update "explanation" copy - most of which was TODO placeholders to explain what everything did properly. I then got it to add an e2e test suite to the core flows.
This took less than an hour while I was watching TV. I would have almost certainly _never_ got around to this before. I'd been meaning to do all this and I always sigh when I go into this panel at how clunky it all is and hard to explain to people.
But random stuff like make a web app that automates this thing or make an admin panel with auto complete on these fields and caching data pulled from this table.
It is like infinity times faster on this tedious boilerplate because some of this stuff I'd just have never done before.
Or I'd have needed to get some headcount in some web dev team to do it, but I just don't need to. Not that I'd have ever actually bothered to do that anyway...
When I can't figure out something about a library I have often loaded the git repository into my IDE and checked out the version I was using, then used the IDE to study the code. Now I do that and ask the agent questions about the code like "How do I do X?" (often it sees the part in the doumentation that I missed) or "Where is the code that does Y?" It greatly accelerates the process of understanding code.
This to me is one of the real benefits. I can vibe code watching TV. I can vibe code in bed. I can vibe code on the plane waiting for takeoff with GitHub Copilot Agents.
Lots of coding work is interesting, but plenty is just tedious.
The problem is this is classic Gell-Mann amnesia. I can have it restyle my website with zero work, even adding StarCraft 2 or NBA Jam themes, but ask it to work on a planning or estimation problem and I'm annoyed by its quality. It's probably bad at both, but I only notice in one. If we have 10 specializations required on an app, I'm only mad about 10%. If I want to make an app entirely outside my domain, yeah sure, it's the best ever.
It’s constantly wrong about the simplest things. Just now it was hallucinating about CSS grid properties.
I’m convinced that how good AI seems is inversely correlated to someone’s knowledge in that domain, as you also mentioned.
Still super useful in my day to day but I have to hand hold and correct it constantly.
I mean you can easily compare this to trades and construction. Would you want a house that's built in a week by cheap foreign workers that don't know what they're doing? The end result looks great on the outside, and you can always let some other cheap worker fix some issues you're having! The electricity works! Until it doesn't and a fire breaks out.
I get it - the hype is based around the quick gains you can absolutely have. The gains are insane, I have been able to be more productive at boilerplate and repetitive tasks too. But just building software isn't hard. Building software that lasts, building software that is rock solid, efficient and maintainable, that's hard!
It's sickening to me how quickly people want to throw that in the garbage. All because it saves a few $ in development time, quality is suddenly a metric that is completely ignored, rather than weighed in.
It's going to bite people in the ass.
I agree with you that we should have quality standards, but I also think it's 100% inevitable that this will all go out the window in most companies, and when that happens, our asses will be bitten.
As a silly example, an LLM will happily add all sorts of animated gifs to the screen for me, even if it makes the page take three times as long to load and more difficult to use.
It's a shame to lose well-crafted experiences in favor of the lowest-common-denominator that LLMs put out, just because they're cheaper.
If I ask it to “add drag & drop”, I already know in my mind what the correct code should look like, because I’ve implemented it many times in the past. LLMs just never deliver the -code- that I want. The end result might look ok and the drag & drop will work, but the code will be atrocious and on first glance I can pick out 10-20 mistakes that I’ll have to ask it to fix. And even if I ask for a fix, I’ll never get the same code quality as hand written code. And how can I push this kind of sub-par code to an employer’s repo when I know that I can (and should) write better quality code myself. This is what I’m being paid for right?
That's a good question. Because developers can sometimes have a bit of an unhealthy love affair with their own code.
Does your employer actually care about code quality as much as you do?
They would probably ask what is the downside of accepting this lower quality code, given the upside you presented:
> The end result might look ok and the drag & drop will work
Which you did quickly, saving them money (in theory).
I do wonder if this is why there is such a gulf in LLM experience. If you're doing hardcore huge scale distributed systems then I can (maybe?) see why you'd think it is useless. However, that is very niche imo and most software dev work is some level (unfortunately) of virtually throwaway code. Of course, not all is - of all the ideas and experiments, some percentage is hopefully very successful and can be polished.
It's good at stuff that most competent engineers can get right while also having the sort of knowledge breadth an average engineer would lack. You really need to be a domain expert to accurately judge its output in specific areas.
So perhaps LLMs are just entry level devs in general. Who can't learn. Who won't ever gain seniority. It would make sense, after all, they were trained on the Internet and Stack Overflow, which is mostly filled with entry level questions and answers.
I certainly could not review all of those changes in an uninterrupted hour. I'd need to test the design changes on multiple browsers, check they respond to zoom and window sizing. I'd have to read through the tests and check that they were not just nonsense returning true to pass. There's no way I could do all that while watching TV in 1 hour.
EDIT: Just to add context seeing other comments, I almost exclusively work in C++ on GPU drivers.
I'm building an app in my stack with fairly common requirements. There are a few code examples that cover requirements but none that cover our specific scenario. After searching the web myself, I asked 3 different AI models. All they did was regurgitate the closest public GitHub example, lacking the use case I was trying to do. Solving this problem can only be done by understanding the abstraction of the alteration in design.
These things can't actually think. And now they're allowed to be agentic.
In some ways they're just glorified search engines but there's a geopolitical sprint to see who can get them to mock "thinking" enough to fool everybody.
Out of ego and greed, everything will be turned over to this machine, and that will be the end of humanity; not humans...humanity.
Basically, a lot of people who are experts are being told this story and they think they are the only one who doesn't get it.
There are plenty of gains to be had with AI/LLMs but just not in the way it's typically marketed.
VLC has like 4000 open issues. Why aren't the AI geniuses fixing these? Nobody has ever any actual code to show, and if they do it's "here's an LED that blinks every time my dog farts, I could've never done it on my own!". I'm feeling like Charlie in that episode of It's Always Sunny with his conspiracy dashboard. All these productivity gurus don't actually exist in the real world.
Can anybody show me their coding agent workflow on a 50k LOC C codebase instead of throwaway gimmick examples? As far as I'm concerned these things can't even understand pointers
I find it can be a great time saver for language reference and for rapid fire questions, but it isn't able to, as of yet, operate at an expert level on a complex novel codebase.
Maybe it gets there in another year or two. But it currently makes enough mistakes at that level that it isn't worth it.
Putting aside the FOMO, the essential time-tested strategy is simply to not care and follow what interests you. And the progress in AI is simply astonishing, it's inherently interesting, this shouldn't be hard. Don't go into it with the expectation of "unless it vibe codes an entire working application for me, it's a failure". Play with it. Poke it, prod it. Then try to resolve the quirks and problems that pop up. Why did it do that? Don't expect an outcome. Just let it happen. The people who do this now will be the ones to come through the hype bubble at the end with actual practical understanding and deployable skills.
Currently working at a FAANG on some very new tech, have access to all the latest and greatest but LLMs / agents really do not seem adequate working on absolutely massive codebases on entirely new platforms.
Maybe I will have to wait a few years for the stuff I'm working on to enter the mass market so the LLMs can be retrained on it.
I do find them very very useful as advanced search / stack overflow assistants.
People shouldn't worry about getting "left behind" because influencers and bloggers are overindexing on specific tech rather than more generalist skills. At the end of the day the learning curve on these things is not that steep - that's why so many people online can post about it. When the need arises and it makes sense, the IDE/framework/tooling du jour will be there and you can learn it then in a few weeks. And if past is prologue in this industry, the people who have spent all their time fiddling with version N will need to reskill for version N+1 anyways.
Yesterday I used the Gemini and Claude CLIs to create a symbolic regression library in C++ with added state-of-the-art features (hobby). Gemini is there to assist when Claude runs out of context. It took me six hours. I cannot imagine how anyone could do this without being an expert in these fields.
The initial code generation was pretty much on the spot, with a few subtle bugs here and there, few inefficient choices.
As I made them add features, there were so many bad decisions that would break the initial interface simplicity, add shared state, constantly switch between SoA and AoS without a plan, introduce circular deps for no reason. I fought back, keeping the code concise and functional, removing randomly introduced OOP features, and adding low-hanging optimizations. I hand-fixed some of the compilation errors and warnings myself to prevent them from using up unnecessary context, like a taxi driver taking you the long route.
How would anyone who doesn't know what they are doing accomplish all this? They would have to learn.
Sure, the bar is lower. This is a task that would have taken me a few weekends of hard work. And mind you this is a hobby project. I would've lost a significant amount of hair if I was fighting AI assistance while under pressure at work.
How can my experience be so totally different?
In my experience every model absolutely sucks at doing any form of software engineering. The thing is though, you don't need to do any software engineering to write software... until you do. Since LLMs got decent at programming we've let them loose among our BI and solar plant engineers and they're building some really useful tools in Python. They have no idea what they are doing, and they are still getting stuff to work. It's obviously absolute horse shit, but for a lot of it, the shit code will be "good enough" that nobody has to touch it for its lifecycle. For the rest, actual software engineers take over.
If you're working in a field where "good enough" is fine, then I imagine that LLMs can do a lot for you. It might sound like I'm being disrespectful, but I really don't mean to be. In my decades of experience with software in enterprise settings, I've seen so many programs built with horrible code just trundle along, creating great value for the business while requiring no to minimal maintenance. Bad from a traditionalist software engineering perspective, but great from a business perspective (which is what matters). Not everything needs to scale after all. Anyway, I do question whether the author has actually worked with LLMs a lot. I would think that the strengths and failings of LLMs would get obvious rather quickly to anyone who uses them.
A few things I've seen here.
> If you're working in a field where "good enough" is fine, then I imagine that LLM's can do a lot for you.
There's some specific timesheet software used for gov contractors that's, hands down, the worst web "application" I've ever used. Opening devTools on one of these pages is a riot--one of the "APIs" used returns HTML documents with .js file extensions so the browser fails to parse them, every JS file is 1K lines long with blocks of global variables in various naming conventions, and using 'back' in the browser just sort of fails and returns you to the sign-in page.
Any modern web programmer would have a stroke looking at this thing. Apparently it's the standard software used for this purpose, relied on all over the industry.
At a previous employer, I had to deal with some internal site for financial transactions and billing. Same story, ancient looking and performing web 'app' that had specific incantations to get it to actually work "don't refresh the page until that loads", piles of JS vomiting bizarre errors in the console, just a remarkable piece of junk.
Anyway, short story long, there's LOADS of software that's terrible by any metric that have crucial purposes floating around that are far worse in terms of 'code quality' than any LLM would spit out.
Also, I have more experience than I like to remember dealing with 'body shop' codebases. There is no way that an LLM is even capable of generating such low quality code. I legitimately think several of the projects I saw were sabotage; there's no way a human would deliberately write such inscrutable junk. Ever seen random Python runtime errors because the type of something would change depending on which branches were hit?
Nothing we do that is user facing is created by AI (this doesn't include any of our business intelligence as that's another department). We currently have a frontend lead who passionately hates AI for basically all of the reasons you bring up, and as long as they keep delivering, I doubt that will change anytime soon. As far as the BI goes, well, that will probably always be a horrible mess because it's next to impossible to find someone who's interested in working with the development/design side of PowerBI. I'm not sure it's even because we won't pay them enough, there doesn't seem to be anyone around here who wants a job like that. We tell our collective selves that it'll be better when we switch to Microsoft Fabric, but we'll see.
Deltek. Yuck.
People I've talked to who are not coders have all found these coding tools to be very useful, enough to be worth paying for (approx. $20 a month).
The other thing: it's just another code generation tool; treat it accordingly.
That said, I never really doubted that they can help you get more done. I use it occasionally to refactor or generate some boilerplate. But very often I need to rewrite most of the functionality, and I find that's more true the more code it generates.
The agentic stuff I've tried has been terrible. Yeah, I personally haven't used the latest models, but I deal with teammates who use them, and the frontend it generates looks bland, and the shell scripts generated with them are very hard to debug and often just flat out broken.
But most of all, I love the act of programming and cannot imagine myself giving that over to an LLM. "Managing" an LLM is to me a soul crushing - maybe anti-intellectual, but definitely boring - task. And if that is what we as professionals will be required to do in the future, I'll just change profession.
My current workflow has reverted to primitive copy paste into web chat (via Kagi Assistant). The friction is enough to make me put a lot of thought into each prompt and how much code context I give it (gathered via files-to-prompt from simonw).
I have little experience with frontend and web apps, so I am trying out a supervised vibe coding flow. I give most of the code base per prompt, ask for a single feature, then read the code output fully and iterate on it a few times to reduce aforementioned bad patterns. Normally I will then type it out myself, or at most copy a few snippets of tens of lines.
What I've found doesn't work is asking for the full file with the changes already applied. Not only does it take a long time and waste tokens, it normally breaks/truncates/rewords unrelated code.
So far I’m happy with how this project is going. I am familiar with all the code as I have audited and typed it out nearly entirely myself. I am actually retaining some knowledge and learning new concepts (reactive state with VanJS) and have confidence I can maintain this project even without an LLM in future, which includes handing it over to colleagues :)
0. It's undeniable that LLMs are both amazing and fun. So is a clown at a children's birthday party, but you wouldn't try to ride the balloon horse,
1. If you are producing code in a language you are not familiar with, you need to be aware of the Gell-Mann amnesia effect[0],
2. It excels at language translation. This is what LLMs were originally designed for. So if you can clearly describe an algorithm in English, it will probably be able to translate that to Python, for example. But at that point you have to question why you didn't just write it in Python in the first place,
3. It gamifies things and causes you to work longer and harder. I'm not convinced it's really more efficient, but I am convinced I simply wouldn't have bothered to put in the effort to do certain things before because the perceived effort of writing code from scratch is higher than spinning a roulette wheel that might shit out the answer first time,
4. It stops you thinking for yourself. Ever worked with an incredibly knowledgable colleague who is always eager to help? Or got the feeling you've solved something before so spent time grepping through old codebases and bash history to find what you did last time instead of spending the exact same time doing it again? It's kinda like that except it knows everything and it's always there. I fear that ability will atrophy with extended use,
5. I don't feel like I really own the code it outputs. It feels more like working in a large existing codebase written by a team. When I have to get into things and understand the code it's like I'm reading it for the first time, even though I did check all the diffs etc. It's like having a manic junior run rampant then coming to you when something really needs to be understood. Again, it's the question of is this really saving me any time in the long run?
I did find them very useful when writing completely new stuff (things like "write a <insert your favorite API> client") or making test boilerplate, i.e. as a copy & paste replacement tool.
It'd be nice to hear in detail how its been useful for other devs. There's too much propaganda around on how amazing it is and not nearly enough use cases.
So I cannot take seriously the gods in the art of the prompt claiming that they can watch TV while the code writes itself. But I believe that those who are already good in their domain can do a better job with such powerful tools, once they master them too.
A strong statement like this was a reasonable take six months ago, perhaps. But with Claude Opus 4.1, o3-codex/gpt5, and Gemini Pro 2.5 this statement is pretty simply disproven with half an hour with them.
Just last week I spent an afternoon rewriting an old Android app that was done in butterknife and got something functional and tested in a few hours. It involved an 8,000 line diff. The work had been quoted to me by a contractor as likely to take 3-4 months and cost tens of thousands of dollars.
I designed a new website for my wife's band with a custom player widget, carousel photos, fully adaptive to large and small screens, good accessibility features, built from easy to edit JSON, SEO-optimized with microformats, and fast-loading. The first version I got running in 15 minutes. The final polished version took a few more days of ducking onto my laptop a few minutes here and there during an event. Without AI this would have taken me weeks to pull off and wouldn't have looked nearly as nice.
IMO part of the key to the flow here is to avoid the temptation to one-shot. First establish ground rules for working together in your AGENTS.md/CLAUDE.md that lay out your software engineering best principles (use git semantics, write out your plans, add lint and tests to commit hooks...). Then have it read through existing code and architecture. Then carefully put together an architecture and set of constraints into your README/PRD. Then build a high level engineering plan divided up into sequenced tasks and write it down (vs just keeping it in context). Only then do you allow any code to start to get written. And yes, you still need to supervise it (and iterate on your AGENTS/CLAUDE to avoid repeating yourself). But yeah, it's about 10-100x faster than me now with a flow like that. (Context on me: been programming 40 years, Stanford CS, ACM international programming competition finalist.)
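To give a flavour, here's a sketch of the kind of ground rules I mean in AGENTS.md/CLAUDE.md (illustrative, not my actual file):

## Working Agreement
- Ask clarifying questions before writing any code.
- Work on a feature branch; keep commits small with descriptive messages.
- Write the plan to a markdown file and wait for approval before implementing.
- Run lint and the full test suite before every commit; never commit failing tests.
- Don't touch files unrelated to the current task.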
This new flow is extremely fun and addictive. It's also a new thing that uses technical knowledge but isn't exactly the same thing as coding. It's like having a pool of hard working inexpensive idiot savant never-worked-in-industry interns at your beck and call - the more you pour into giving them context and process, the better results you are gonna get.
Although I believe you, I think this is due to improper use. I know this sounds like a "you're holding the phone wrong" moment, but I've found so many uses for AI that I feel addicted to what it's providing me.
Like you, I started with Vim long ago, in an attempt to shorten the thought -> code cycle. But after many years I started seeking something even faster, and LLMs seem to be it for me.
From quicker navigation of the codebase, to writing the boilerplate for me, to adding debugging statements where I need them. Believe me, there is a big jump in productivity if you use it right. You just gotta keep trying until you develop a sense of what you can ask it to do and what it would be good for.
All of whom had greater output in terms of volume and impact before LLMs. And it is always the same people who now get voted to the top as opposed to a couple of months ago, when one of them got ratio'd for one of his exuberant blog posts.
Seems like their attitude is exactly what you'd expect of someone who wants it to be useless from the outset. A self-fulfilling hypothesis.
As a simple test, I used Claude under GitHub Copilot to develop a simple program to blink an LED using a task under zeptoforth (my Forth implementation for ARM Cortex-M, which is on GitHub, so it should have been in their training set) on a Pimoroni Pico Plus 2... which took a few hours to complete, because it was constantly hallucinating my APIs and getting minor points wrong that I had to tell it to fix, only for it to screw something else up the next time around, repeatedly.
Then the next day they had us do an "agentive AI training" where we were supposed to use "agentive AI" to program example stuff... and I found it worked to a point, until it didn't, and I could never actually make it do what I really wanted no matter how hard I "vibed". It just hit a stopping point where no amount of prompting would make it work.
And then there are the actual use cases for AI at my day job: I have not seen any that apply to me. I write only limited amounts of code (I do not churn out the boilerplate that AI is sold as obviating), and the code I do write has to be correct, for values of correct where simply looking correct is not sufficient (and an AI won't know the underlying math behind things or how to avoid things like race conditions).
If I actually used AI, I would be spending more time trying to understand its output and having to fix it than I actually spend developing code, and I would have a harder time trusting the resulting code. I know how my code works because I wrote it; the same would not be true if I had an AI write my code for me.
Sure, there are people at my day job who have gotten fully on-board with the AI thing, and who see it as being useful for things like, say, changing the C++ standard for a whole project at a time. However, I would not be able to trust the project after that point myself. Yes, it would take much more time to manually update the code, but then it would be much easier to trust afterwards, and would require far less fixing to handle all the cases where the AI screwed up.
Copy pasting code into ChatGPT gives me unreliable results, because I don’t know how to talk to it. I find copilot annoying in Visual Studio. The workflow of cursor/windsurf is not intuitive for me. Gemini context is wicked but I didn’t like how it constructed its outputs. I tried Claude Code and became a believer, not because it’s perfect, but because it fits my cadence and allows me to pivot quickly.
I still use chat for the occasional MSSQL syntax check, or Gemini for writing a technical doc. But there are so many variables and trial and error involved.
I keep reading posts like this thinking that there is a flavour of fear holding people back. Nothing to fear, just level set your expectations, don’t believe the hype and be open to the vibes.
This is a really interesting signal to me. It's almost indisputable that you can get good results (I get good results pretty consistently) and so there's definitely something there. I don't think that folks who don't get good results are doing something "wrong" so much as not understanding how to work with the model to get good results.
If I were at a company building these tools, the author would be the person I'd want to interview. I doubt it's a skill issue. And it's definitely not user error. You can't sell a tool that is claimed to do something when the user can't replicate the result.
A tool that works, but only after you've invested lots of time reverse-engineering it in your head, isn't a good tool, even if it's extremely powerful. The tool needs to be customizable and personalizable, and have safety rails to prevent bad results.
If that's the article writer's experience, then they are simply using it incorrectly. The author seems to suggest that most of their usage involves pasting code or chatting via the text interface, likely with a non-SOTA model.
It would be surprising to find that someone using e.g. the Claude Code CLI with full access to a codebase and carefully considered prompts is always getting useless junk.
Surely, at the bare minimum, you can use it for boilerplate and auto-complete, which in itself is often a massive time savings.
PaulHoule•6mo ago
People think "Oh, it works better when somebody else does it" or "There must be some model that does better than the one I am using" or "If I knew how to prompt better I'd get better results" or "There must be some other agentic IDE which is better than the one I am using."
All those things might be true but they just change the odds, they don't change the fact that it works sometimes and fails other times.
For instance I asked an agent to write me a screen to display some well-typed data. It came up with something great right away that was missing some fields and had some inconsistent formatting but it fixed all those problems when I mentioned them -- all speaking the language of product managers and end users. The code quality was just great, as good as if I wrote it, maybe better.
Plenty of times it doesn't work out like that.
I was working on some code where I didn't really understand the TypeScript types, and I fed it the crazy error messages I was getting. It made an attempt to understand them and didn't really succeed. I used it as a "rubber duck" over the course of a day or two, and working with it I eventually came to understand what was wrong and how to fix it. I got the code into a place that I like, and now when there is an error, I can understand it and it can understand it too.
Sometimes it writes something that doesn't typecheck, and I tell it to run tsc and fix the errors. Sometimes it does a job I am proud of, and other times it adds lame typeguards that satisfy the compiler without actually checking anything.
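As a hypothetical illustration (my own made-up snippet, not the code from that session), this is the kind of guard that makes tsc happy while verifying nothing, next to one that actually checks the shape:

    // Hypothetical example of a "lame" type guard: it satisfies the
    // compiler but never actually verifies the value's shape.
    interface User {
      id: string;
      email: string;
    }

    function isUserLame(value: unknown): value is User {
      // Any non-null object "becomes" a User, so the type error
      // disappears but the real bug just moves downstream.
      return typeof value === "object" && value !== null;
    }

    // A guard that does the checking tsc cannot do for you:
    function isUser(value: unknown): value is User {
      if (typeof value !== "object" || value === null) return false;
      const v = value as Record<string, unknown>;
      return typeof v.id === "string" && typeof v.email === "string";
    }

The first one compiles cleanly, which is exactly why it slips through when the only instruction was "make tsc pass".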
Give it essentially the same problem, say writing tests in Java, and it might take very different approaches. One time it will use the same dependency injection framework used in other tests to inject mocks into private fields; other times it will write a helper method to inject the mocks into private fields with introspection directly. You might be able to somewhat tame this randomness with better techniques, but sometimes it works and sometimes it doesn't, and if I just told you about the good times or just told you about the bad times it would be a very different story.
leptons•6mo ago
I have to wonder whether, if you had tried a simple Google search and read through some docs, you couldn't have figured this out quicker than by trying to coax a result out of the LLM.
solarkraft•6mo ago
I have had cases in which a web search and some good old fashioned thinking have yielded better results than using an LLM, but on average I’m pretty sure the LLM has the edge.
PaulHoule•6mo ago
The answers are in (i) the Typescript documentation and (ii) the documentation of libraries that I'm using. I could get lucky with a Google search and it could be worth trying, but I wouldn't expect it to work. Personally my preference is to have a language and libraries with great documentation (Python, Java, Typescript isn't too bad [1]) and really know that documentation like the back of my hand.
If I hadn't had the LLM, I would probably have figured it out the same way, by doing experiments. I might have asked my other "rubber duck":
https://mastodon.social/@UP8/113935901671533690
A tactic I didn't use, which helps in "legacy" systems where I am stuck, is to start a fresh project in the IDE and try either reproducing the problem or building a tiny system which is problem-free.
I'm hesitant to say what speedup I got out of "figuring out the types together with the LLM", but emotionally I felt supported, and in the process I wrote a whole lot, like I was keeping track of the process in a notebook. I feel a lot of times when I have good LLM conversations I wind up writing better code than I would otherwise, not necessarily writing it faster -- it's like pair programming.
[1] The typescript docs are great for the typescript stuff, MDN is good for Javascript and Javascript's stdlib
leptons•6mo ago
Wow. This sounds so foreign to me. Being emotionally supported by the proverbial "tin can". If that makes you happy, then I'm happy for you, I guess?
rediscovery•6mo ago
"Gambling-like behavior in pigeons: ‘jackpot’ signals promote maladaptive risky choice"
https://www.nature.com/articles/s41598-017-06641-x
The tech industry is actively promoting gambling addiction and the scary thing is that people are willingly walking into that trap.
Take a look at this comment: https://news.ycombinator.com/item?id=44849147
"Most of what I've learned from talking to people about their workflows is counterintuitive and subtle."
Seriously? Are we at the point of doing rain dances for these models and describing the moves as "counterintuitive and subtle"? This is magical-thinking-level self-delusion.
Downvote all you like, or ignore this. Agency is being taken away from us. No one gets to say we didn't see it coming down the line, because we did, and we said something, and our peers treated us as ignorant and self-interested for pointing out the obvious.
ath3nd•6mo ago
You are not alone: https://rgoldfinger.com/blog/2025-07-26-claude-code-is-a-slo...
> Are we at the point of doing rain dances for these models and describing the moves as "counterintuitive and subtle"?
Yes, we are. LLM cultists can downvote as much as they like, but the reality is that with all the incantations and coaxing, we not only don't see a positive effect, we see a net negative effect on the quality of our software. On the ground level of OSS we are drowning in AI generated slop: https://www.theregister.com/2025/07/15/curl_creator_mulls_ni...
On the enterprise level we are creating more and worse code, faster. In the news, we are generating more and worse and factually wrong articles, faster. In our personal lives, guided by our sycophantic LLMs assistants, we offload more and more cognitive chores, becoming dumber and dumber.
LLM-produced slop is like the plastic of text: cheap and easy to use and proliferate, and then 40 years later you realize it's everywhere in the environment and in your brain, and that it's toxic to all life.
ath3nd•6mo ago
Yes, by the same logic arsenic has many pros (and some cons, admittedly) in paint production, thorium is a not bad material for toothpaste (google it), and asbestos is just dandy for fire insulation (with some teeny tiny negatives for your lungs).
The fact that plastic is in the bloodstream of every living being and it's a strong endocrine disruptor and there is literally a garbage patch of it floating in our oceans is a small price to pay for the convenience and pleasure of drinking coca-cola from a lighter-than-glass bottle! Fantastic material indeed! What other fantastic materials are there? Lead for pipes? White phosphorus for matches?
ath3nd•6mo ago
Sure, if you want to do with 100 developers what 10 can do better, cheaper and faster, don't spend time on code quality. Just vibe it. More work for the experts to fix that mess later.
Terr_•6mo ago