Below average people can use AI to get average results.
I suppose it's all a matter of what one is using an LLM for, no?
GPT is great at citing sources for most of my requests -- even if not always prompted to do so. So, in a way, I kind of use LLMs as a search engine/Wikipedia hybrid (used to follow links on Wiki a lot too). I ask it what I want, ask for sources if none are provided, and just follow the sources to verify information. I just prefer the natural language interface over search engines. Plus, results are not cluttered with SEO ads and clickbait rubbish.
Also I think what you're saying is a direct contradiction of the parent. Below average people can now get average results; in other words: The LLM will boost your capabilities (at least if you're already 'less' capable than average). This is a huge benefit if you are in that camp.
But for other cases too, all you need to know is where your knowledge ends, and that you can't just blindly accept what the AI responds with. In fact, I find LLMs are often most useful precisely when you don’t know the answer. When you’re trying to fill in conceptual gaps and explore an idea.
Even say during code generation, where you might not fully grasp what’s produced, you can treat the model like pair programming and ask it follow-up questions and dig into what each part does. They're very good at converting "nebulous concept description" into "legitimate standard keyword" so that you can go and find out about said concept that you're unfamiliar with.
Realistically, the only time I feel I know more than the LLM is when I am working on something I am explicitly an expert in, in which case I often find that LLMs provide nuance-lacking suggestions that don't add much. It takes a lot more filling in of context in these situations for it to be beneficial (but it still can be).
Take a random example of a nifty bit of engineering: the powerline Ethernet adapter. A curious person might encounter these and wonder how they work. I don't believe an understanding of this technology is very obvious to a layman. Start asking questions and you very quickly come to understand how it embeds bits in the very same signal that transmits power through your house without any interference between the two "types" of signal. It adds data on high frequencies at one end, and filters out the regular power-transmitting frequencies at the other end so that the signal can be converted back into bits for use in the Ethernet cable (for a super brief summary). But if I want to really drill into each and every engineering concept, all I need to do is continue the conversation.
I personally find this loop to be unlike anything I've experienced as far as getting immediate access to an understanding of, and supplementary material for, the exact thing I'm wondering about.
But that would shift the average up.
Nothing is free, and I for one prefer a subscription model, if only as a change from the ad model.
I am sure we will see the worst of all worlds, but for now, for this moment in history, subscription is better than ads.
Let's also never have ads in GenAI tools. The kind of invasive, intent-level influence these things can achieve will make our current situation look like a paradise.
Sure you can't point it to a Jira ticket and get a PR but you certainly can use it as a pair programmer. I wouldn't say it is much faster than working alone but I end up writing more tests and arguing with it over error handling means I do a better job in the end.
You absolutely can. This is exactly what SWE-Bench[0] measures, and I've been amazed at how quickly AIs have been climbing those ladders. I personally have been using Warp [1] a lot recently and in quite a lot of low-medium difficulty cases it can one-shot a decent PR. For most of my work I still find that I need to pair with it to get sufficiently good results (and that's why I still prefer it to something cloud-based like Codex [2], but otherwise it's quite good too), and I expect the situation to flip over the coming couple of years.
Something about a brand-new project often makes LLMs drop to "example grade" code, the kind you'd never put in production. (An example: Claude implemented per-task file logging in my prototype project by pushing to an array of log lines, serializing the entire thing to JSON, and rewriting the entire file, for every logged event.)
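Roughly, that pattern looks like the first class in this Python-flavored sketch (the names are invented and the actual prototype may have been in another language), versus the boring append-only version you'd actually want:

    import json

    # The "example grade" pattern: every logged event re-serializes the whole
    # history and rewrites the file from scratch.
    class NaiveTaskLogger:
        def __init__(self, path: str):
            self.path = path
            self.lines: list[str] = []

        def log(self, message: str) -> None:
            self.lines.append(message)
            with open(self.path, "w") as f:   # O(n) rewrite per event
                json.dump(self.lines, f)

    # The production-grade alternative: append one JSON line per event.
    class AppendingTaskLogger:
        def __init__(self, path: str):
            self.path = path

        def log(self, message: str) -> None:
            with open(self.path, "a") as f:
                f.write(json.dumps({"msg": message}) + "\n")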
GPT-1 June 2018
GPT-2 February 2019
GPT-3 June 2020
GPT-4 March 2023
Claude tells me this is the rough improvement of each:
GPT-1 to 2: 5-10x
GPT-2 to 3: 10-20x
GPT 3 to 4: 2-4x
Now it's been 2.5 years since 4.
Are you expecting 5 to be 2-4x better, or 10-20x better?
More recently, it seems like that's not the case. Larger models sometimes even hallucinate more [0]. I think the entire sector is suffering from a Dunning Kruger effect -- making an LLM is difficult, and they managed to get something incredible working in a much shorter timeframe than anyone really expected back in the early 2010s. But that led to overconfidence and hype, and I think there will be a much longer tail in terms of future improvements than the industry would like to admit.
Even the more advanced reasoning models will struggle to play a valid game of chess, much less win one, despite having plenty of chess games in their training data [1]. I think that, combined with the trouble of hallucinations, hints at where the limitations of the technology really are.
Hopefully LLMs will scare society into planning how to handle mass automation of thinking and logic, before a more powerful technology that can really do it arrives.
[0]: https://techcrunch.com/2025/04/18/openais-new-reasoning-ai-m...
[1]: https://dev.to/maximsaplin/can-llms-play-chess-ive-tested-13...
I believe hallucinations are partly an artifact of imperfect model training, and thus can be ameliorated with better technique.
Smaller models may hallucinate less: https://www.intel.com/content/www/us/en/developer/articles/t...
The RAG technique uses a smaller model and an external knowledge base that's queried based on the prompt. The technique allows small models to outperform far larger ones in terms of hallucinations, at the cost of performance. That is, to eliminate hallucinations, we should alter how the model works, not increase its scale: https://highlearningrate.substack.com/p/solving-hallucinatio....
Pruned models, with fewer parameters, generally have a lower hallucination risk: https://direct.mit.edu/tacl/article/doi/10.1162/tacl_a_00695.... "Our analysis suggests that pruned models tend to generate summaries that have a greater lexical overlap with the source document, offering a possible explanation for the lower hallucination risk."
At the same time, all of this should be contrasted with the "Bitter Lesson" (https://www.cs.utexas.edu/~eunsol/courses/data/bitter_lesson...). IMO, making a larger LLM does indeed produce a generally superior LLM. It produces more trained responses to a wider set of inputs. However, it does not change that it's an LLM, so fundamental traits of LLMs - like hallucinations - remain.
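For what it's worth, the retrieve-then-generate loop behind RAG is simple enough to sketch in a few lines of Python. This is a toy version: the knowledge base, the keyword-overlap scoring, and the generate() stub are all stand-ins for a real vector store and a real (small) model:

    # Toy retrieval-augmented generation: ground the model in retrieved text.
    KNOWLEDGE_BASE = [
        "Powerline Ethernet adapters modulate data onto high-frequency carriers.",
        "RAG retrieves documents at query time and feeds them to the model.",
        "Pruned models tend to stay lexically closer to their source documents.",
    ]

    def retrieve(query: str, k: int = 2) -> list[str]:
        # Toy relevance score: number of shared lowercase words.
        q = set(query.lower().split())
        ranked = sorted(KNOWLEDGE_BASE,
                        key=lambda doc: len(q & set(doc.lower().split())),
                        reverse=True)
        return ranked[:k]

    def generate(prompt: str) -> str:
        # Stand-in for a call to whatever (small) LLM you actually use.
        return f"[model answer grounded in a {len(prompt)}-char prompt]"

    def answer(query: str) -> str:
        context = "\n".join(retrieve(query))
        prompt = ("Answer using ONLY the context below. "
                  "If the context is insufficient, say so.\n\n"
                  f"Context:\n{context}\n\nQuestion: {query}")
        return generate(prompt)

    print(answer("What does RAG do at query time?"))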
Even in nature this is clear. Humans are a great example: cooked food predates homo sapiens and it is largely considered to be a pre-requisite for having human level intelligence because of the enormous energy demands of our brains. And nature has given us wildly more efficient brains in almost every possible way. The human brain runs on about 20 watts of power, my RTX uses 450 watts at full capacity.
The idea of "runaway" super intelligence has baked in some very extreme assumptions about the nature of thermodynamics and intelligence, that are largely just hand waved away.
On top of that, AI hasn't changed in a notable way for me personally in a year. The difference between 2022 and 2023 was wild, between 2023 and 2024 changed some of my workflows, 2024 to today largely is just more options around which tooling I used and how these tools can be combined, but nothing really at a fundamental level feels improved for me.
Because it can't actually model these complex problems, it really requires awareness from the user regarding what questions should and shouldn't be asked. An LLM can probably tell you how a knight moves, or how to respond to the London System. It probably can't play a full game of chess with you, and will virtually never be able to advise you on the best move given the state of the board. It probably can give you information about big companies that are well-covered in its training data. It probably can't give you good information about most sub-$1b public companies. But, if you ask, it will give a confident answer.
They're a minefield for most people and use cases, because people aren't aware of how wrong they can be, and the errors take effort and knowledge to notice. It's like walking on a glacier and hoping your next step doesn't plunge through the snow and into a deep, hidden crevasse.
I have friends who are highly educated professionals (PhDs, MDs) who just assume that AI/LLMs make no mistakes.
They were shocked that it's possible for hallucinations to occur. I wonder if there's a halo effect where the perfect grammar, structure, and confidence of LLM output causes some users to assume expertise?
AI, in all its glory, is seen as an extension of that. A deterministic thing which is meticulously crafted to provide an undisputed truth, and it can't make mistakes because computers are deterministic machines.
The idea of LLMs being networks of weights plus some randomness is an abstraction that is both too vague and too complicated for most people. Also, companies tend to say this part very quietly, so when people read the fine print, they get shocked.
I think it's just that LLMs are modeling generative probability distributions of sequences of tokens so well that what they actually are nearly infallible at is producing convincing results. Oftentimes the correct result is the most convincing, but other times what seems most convincing to an LLM just happens to also be most convincing to a human, regardless of correctness.
> In computer science, the ELIZA effect is a tendency to project human traits — such as experience, semantic comprehension or empathy — onto rudimentary computer programs having a textual interface. ELIZA was a symbolic AI chatbot developed in 1966 by Joseph Weizenbaum and imitating a psychotherapist. Many early users were convinced of ELIZA's intelligence and understanding, despite its basic text-processing approach and the explanations of its limitations.
We have barely achieved modern media literacy, and now we have machines that talk like 'trusted' face-to-face humans and can be "tuned" to suggest specific products or use any specific tone the owner/operator of the system wants.
(To be fair, in many cases, I'm not terribly interested in learning the details of their field.)
Highly educated professionals in my experience are often very bad at applied epistemology -- they have no idea what they do and don't know.
There are lies, statistics and goddamn hallucinations.
In chess, previous moves are irrelevant, and LLMs aren't good at filtering out irrelevant data [1]. For better performance, you should include only the relevant data in the context window: the current state of the board.
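A sketch of what that looks like in practice, using the python-chess package to replay the game locally and prompt with only the resulting position; ask_llm() is a placeholder for whatever model you'd call:

    import chess  # pip install python-chess

    def position_only_prompt(moves_san: list[str]) -> str:
        """Replay the game locally; keep only the current position in the prompt."""
        board = chess.Board()
        for move in moves_san:
            board.push_san(move)
        legal = ", ".join(board.san(m) for m in board.legal_moves)
        # The context holds the board state and legal moves, not the move history.
        return (f"You are playing chess. Current position (FEN): {board.fen()}\n"
                f"Legal moves: {legal}\n"
                "Reply with exactly one legal move in SAN.")

    def ask_llm(prompt: str) -> str:
        ...  # placeholder for whatever model/API you use

    prompt = position_only_prompt(["e4", "e5", "Nf3", "Nc6"])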
Insane that the model actually does seem to internalize a representation of the state of the board -- rather than just hitting training data with similar move sequences.
...Makes me wish I could get back into a research lab. Been a while since I've stuck to reading a whole paper out of legitimate interest.
(Edit) At the same time, it's still worth noting the accuracy errors and the potential for illegal moves. That's still enough to prevent LLMs from being applied to problem domains with severe consequences, like banking, security, medicine, law, etc.
It feels like when you need to paint walls in your house. If you've never done it before, you'll probably reach for tape to make sure you don't ruin the ceiling and floors. The tape is a tool for amateur wall painters to get decent results somewhat efficiently, compared to if they didn't use it. If you're actually a good wall painter, tape only slows you down. You'll go faster without the "help".
I recall he mentions in this video that the new advice they are giving to founders is to throw away prototypes when they pivot instead of building onto a core foundation. This is because of the effects described in the article.
He also gives some provisional numbers (see the section "Rapid Prototyping and Engineering" and slides ~10:30) where he suggests prototype development sees a 10x boost compared to a 30-50% improvement for existing production codebases.
This feels vaguely analogous to the switch from "pets" to "livestock" when the industry switched from VMs to containers. Except, the new view is that your codebase is more like livestock and less like a pet. If true (and no doubt this will be a contentious topic to programmers who are excellent "pet" owners) then there may be some advantage in this new coding agent world to getting in on the ground floor and adopting practices that make LLMs productive.
Before you call this pattern silly, consider that the fairly normal plural “Unices” is by analogy with Latin plurals in -x = -c|s ~ -c|ēs, where I’ve expanded -x into -cs to make it clear that the Latin singular comprises a noun stem ending in -c- and a (nominative) singular ending -s, which does exist in Latin but is otherwise completely nonexistent in English. (This is extra funny for Unix < Unics < Multics.) Analogies are the order of the day in this language.
Also, probably because I'm approaching graybeard territory, thinking about boxen of VAXen running UNIXen makes me feel warm and fuzzy. :D
We have used the same analogy for the last 20 years or so. Provisioning 150 cattle servers takes 15 minutes or so, and we can provision a pet in a couple of hours, at most.
[0]: https://www.engineyard.com/blog/pets-vs-cattle/
*: The Engine Yard post notes that Microsoft's Bill Baker used the term earlier, though CERN's date (2012) lines up with our effort's timeline and how we got started.
This tweet by Tim Bell seems to indicate shared credit with Bill Baker and Randy Bias:
https://x.com/noggin143/status/354666097691205633
@randybias @dberkholz CERN's presentation of pets and cattle was derived from Randy's (and Bill Baker's previously).
Because using an LLM doesn't mean you devalue well-crafted or understandable results. But it does indicate a significant shift in how you view the code itself. It is more about the emotional attachment to code vs. code as a means to an end.
I don't have a crystal ball and I can't predict the actual future. But I can see the list of potential futures and I can assign likelihoods to them. And among the potential futures is one where the need for humans to fix the problems created by poor AI coding agents dwindles as the industry completely reshapes itself.
I just wouldn't want to be responsible for servicing a guarantee about the reliability of early cars.
And I'll feel no sense of vindication if I do get that support case. I will probably just sigh and feel a little more tired.
So consider differing perspectives. Like a teenage kid that is hanging around the stables, listening to the veteran coachmen laugh about the new loud, smoky machines. Proudly declaring how they'll be the ones mopping up the mess, picking up the stragglers, cashing it in.
The career advice you give to the kid may be different than the advice you'd give to the coachman. That is the context of my post: Andrew Ng isn't giving you advice, he is giving advice to people at the AI school who hope to be the founders of tomorrow.
And you are probably mistaken if you think the solution to the problems that arise due to LLMs will result in those kids looking at the past. Just like the ultimate solution to car reliability wasn't a return to horses but rather the invention of mechanics, the solution to problems caused by AI may not be the return to some software engineering past that the old veterans still hold dear.
Things change, and that's ok. I guess I just got lucky so far that this thing I like doing just so happens to align with a valuable skill.
I'm not arguing for or against anything, but I'll miss it if it goes away.
Obviously their software sucks, and eventually parts of it always escalate into a support ticket that reaches my colleagues and me. It's almost always some form of performance issue, in part because we have monthly sessions where they can bring us issues they simply can't get to work. Anyway, I see that as a good thing. It means their software is serving the business and now we need to deal with the issues to make it work even better. Sometimes that is because their code is shit; most times it's because they've reached an actual bottleneck and we need to replace part of their Python with a C/Zig library.
The important part of this is that many of these bottlenecks appear in areas that many software engineering teams I have known wouldn't necessarily have predicted. Meanwhile, a lot of the areas that traditional "best practices" call for better software architecture in work fine as absolutely horrible AI slop for entire software lifecycles.
I think that is where the emotional attachment is meant to fit in: being fine with all the slop that never actually matters during a piece of software's lifecycle.
This probably does not apply to legacy code that has been in use for several years where the production deployment gives you a higher level of confidence (and a higher risk of regression errors with changes).
Have you blogged about your insights? The https://stillpointlab.com site is very sparse, as is @stillpointlab.
Once I have the MVP working, I will be working on publishing as a means to dogfood the tool. So, check back soon!
What you could do: sign in using one of the OAuth methods, go to the user page and then go to the feedback section. Let me know in a message your email and I'll ping you once the blog is setup.
Sorry it is primitive at this stage but I'm prioritizing MVP before marketing.
It is a false confidence generator
> This means cheaters will plateau at whatever level the AI can provide
From my experience, the skill of using AI effectively is treating the AI with a "growth mindset" rather than a "fixed" one. What I do is roleplay as the AI's manager, giving it a task, and as long as I know enough to tell whether its output is "good enough", I can lend it some of my metacognition via prompting to get it to continue working through obstacles until I'm happy with the result.
There are diminishing returns of course, but I found that I can get significantly better quality output than what it gave me initially without having to learn the "how" of the skill myself (i.e. I'm still "cheating"), and only focusing my learning on the boundary of what is hard about the task. By doing this, I feel that over time I become a better manager in that domain, without having to spend the amount of effort to become a practitioner myself.
Does it solve the problem? As long as it isn't prohibitively costly in terms of time or resources, then the rest is really just taste. As a user I have no interest whatsoever if your code is "idiomatic" or "modular" or "functional". In other industries "quality" usually means free of defects, but software is unique in that we just expect products to be defective. Your surgeon operates on the wrong knee? The board could revoke the license, and you are getting a windfall settlement. A bridge design fails? Someone is getting sued or even prosecuted. SharePoint gets breached? Well, that's just one of those things, I guess. I'm not really bothered that AI is peeing in the pool that has been a sewer as long as I can remember. At least the AI doesn't bill at an attorney's rate to write a mess that barely works.
It's the opposite of finding an answer on reddit, insta, tvtropes.
I can't wait for the first distraction-free OS that is a thinking and imagination helper, not a consumption device where I have to block URLs on my router so my kids don't get sucked into a Skinner box.
I love being able to get answers from documentation and work questions without having to wade through some arbitrary UI bs a designer has implemented in adhoc fashion.
>It's the opposite of finding an answer on reddit, insta, tvtropes.
Yeah it really is because I can tell when someone doesn't know the topic well on reddit, or other forums, but usually someone does and the answer is there. Unfortunately the "AI" was trained on all of this, and the "AI" is just as likely to spit out the wrong answer as the correct one. That is not an improvement on anything.
> wade through UI distraction like ads and social media
Oh, so you think "AI" is going to be free and clear forever? Enjoy it while it lasts, because these "AI" companies are in way over their heads, they are bleeding money like their aorta is a fire hose, and there will be plenty of ads and social whatever coming to brighten your day soon enough. The free ride won't go on forever - think of it as a "loss leader" to get you hooked.
Will some LLMs have ads? Sure, especially at a free tier. But I bet the option to pay $20/month for ad-free LLM usage will always be there.
$20/month won't get you much if you're paying above what it costs to run the "AI", and for what? Answers that are in the ballpark of suspicious and untrustworthy?
Maybe they just need to keep spending until all the people who can tell slop from actual knowledge are all dead and gone.
https://www.economist.com/content-assets/images/20250215_FNC...
https://www.economist.com/finance-and-economics/2025/02/13/h...
>The shift in recent economic research supports his observation. Although early studies suggested that lower performers could benefit simply by copying AI outputs, newer studies look at more complex tasks, such as scientific research, running a business and investing money. In these contexts, high performers benefit far more than their lower-performing peers. In some cases, less productive workers see no improvement, or even lose ground.
Right, the reason why I pointed out "recent" is that it's new evidence that people might not be aware of, given that there were also earlier studies showing AI had the opposite effect on inequality. The "recent" studies also had varied methodology compared to the earlier studies.
>The studies showing reduced equality all apply to uncommon tasks like material discovery and debate points
"Debating points" is uncommon? Maybe not everyone was in the high school debate club, but "debating points" is something that anyone in a leadership position does on a daily basis. You're also conveniently omitting "investment decisions" and "profits and revenue", which basically everyone is trying to optimize. You might be tempted to think "Coding efficiency" represents a high complexity task, but the abstract says the test involved "Recruited software developers were asked to implement an HTTP server in JavaScript as quickly as possible". The same is true of the task used in the "legal analysis" study, which involved drafting contracts or complaints. This seems exactly like the type of cookie cutter tasks that the article describes would become like cashiers and have their wages stagnate. Meanwhile the studies with negative results were far more realistic and measured actual results. Otis et al 2023 measured profits and revenue of actual Kenyan SMBs. Roldan-Mones measured debate performance as judged by humans.
Okay, well the majority of this "recent" evidence agrees with the pre-existing evidence that inequality is reduced.
> "Debating points" is uncommon?
Yes. That is nobody's job. Maybe every now and then you might need to come up with some arguments to support a position, but that's not what you get paid to do day to day.
> You're also conveniently omitting "investment decisions" and "profits and revenue", which basically everyone is trying to optimize.
Very few people are making investment decisions as part of their day to day job. Hedge funds may experience increasing inequality, but that kinda seems on brand.
On the other hand "profits and revenue" is not a task.
> You might be tempted to think "Coding efficiency" represents a high complexity task, but the abstract says the test involved "Recruited software developers were asked to implement an HTTP server in JavaScript as quickly as possible". The same is true of the task used in the "legal analysis" study, which involved drafting contracts or complaints.
These sound like real tasks that a decent number of people have to do on a regular basis.
> Meanwhile the studies with negative results were far more realistic and measured actual results. Otis et al 2023 measured profits and revenue of actual Kenyan SMBs. Roldan-Mones measured debate performance as judged by humans.
These sound like niche activities that are not widely applicable.
Three, since Toner-Rodgers 2024 currently seems to be a total fabrication.
The means of production are in a small oligopoly, the rest will be redundant or exploitable sharecroppers.
(All this under the assumption that "AI" works, which its proponents affirm in public at least.)
There's also a gap in addressing vibe coded "side projects" that get deployed online as a business. Is the code base super large and complex? No. Is AI capable of taking input from a novice and making something "good enough" in this space? Also no.
AI tools are great at unblocking and helping their users explore beyond their own understanding. The tokens in are limited to the users' comprehension, but the tokens out are generated from a vast collection of greater comprehension.
For the novice, it's great at unblocking and expanding capabilities. "Good enough" results from novices are tangible. There is no doubt the volume of "good enough" is perceived as very low by many.
For large and complex codebases, unfortunately the effects of tech debt (read: objectively subpar practices) translate into context rot at development time. A properly architected and documented codebase that adheres to common well structured patterns can easily be broken down into small easily digestible contexts. i.e. a fragmented codebase does not scale well with LLMs, because the fragmentation is seeding the context for the model. The model reflects and acts as an amplifier to what it's fed.
For personal tools or whatever, sure. And the tooling or infrastructure might get there for real projects eventually, but it's not there currently. The prospect of someone naively vibe coding a side business including a payment or authentication system or something that stores PII (all areas developers learn the dangers of through wisdom gained only by experience) sends shivers down my spine. Even amateur coders trying that stuff the old-fashioned way must read their code, the docs, and info on the net, and will likely get some sense of the danger. Yesterday I saw someone here recounting a disastrous data breach of their friend's vibe coded side hustle.
The big problem I see here is people not knowing enough to realize that something functioning is almost never a sign that it is “good enough” for many things they might assume it is. Gaining the amount of base knowledge to evaluate things like form security nearly makes the idea of vibe coding useless for anything more than hobby or personal utility projects.
It seems like you're claiming complex codebases are hard for LLMs because of human skill issues. IME it's rather the opposite - an LLM makes it easier for a human to ramp up on what a messy codebase is actually doing, in a standard request/response model or in terms of looking at one call path (however messy) at a time. The models are well trained on such things and are much faster at deciphering what all the random branches and nested bits and pieces do.
But complex codebases actually usually arise because of changing business requirements, changing market conditions, and iteration on features and offerings. Execution quality of this varies but a "properly architected and documented codebase" is rare in any industry with (a) competitive pressure and (b) tolerance for occasional bugs. LLMs do not make the need to serve those varied business goals go away, nor do they remove the competitive pressure to move rapidly vs gardening your codebase.
And if you're working in an area with extreme quality requirements that have forced you into doing more internal maintenance and better codebase hygiene then you find yourself with very different problems with unleashing LLMs into that code. Most of your time was never spent writing new features anyway, and LLM-driven insight into rare or complex bugs, interactions, and performance still appears quite hit or miss. Sometimes it saves me a bunch of time. Sometimes it goes in entirely wrong directions. Asking it to make major changes, vs just investigate/explain things, has an even lower hit rate.
Too wide a surface area in one context also causes efficiency issues. Without clear definition in the context, you'll get lower quality results.
Do keep in mind the code being read and written is intrinsically added to context.
I think this could still be used as a valuable form of communication if you can clearly express the idea that this is representing a hypothesis rather than a measurement. The simplest would be to label the graphs as "hypothesis". but a subtle but easily identifiable visual change might be better.
Wavy lines for the axes spring to mind as an idea to express that. I would worry about the ability to express hypotheses about definitive events that happen when a value crosses an axis, though; you'd probably want a straight line for that. Perhaps it would be sufficient to just have wavy lines at the ends of the axes, beyond the point at which the plot appears.
Beyond that, I think the article presumes the flattening of the curve as mastery is achieved. I'm not sure that's a given; perhaps it seems that way because we evaluate proportional improvement, implicitly placing skill on a logarithmic scale.
I'd still consider the post from the author as being done in better faith than the Economist links.
I'd like to know what people think, and for them to say that honestly. If they have hard data, they show it and how it confirms their hypothesis. At the other end of the scale is gathering data and only exposing the measurements that imply a hypothesis you are not brave enough to state explicitly.
It's free for everyone with a phone or a laptop.
For a time I refused to talk with anybody or read anything about AI, because it was all noise that didn't match my hard-earned experience. Recently HN has included some fascinating takes. This isn't one.
I have the opinion that neurodivergents are more successful using AI. This is so easily dismissed as hollow blather, but I have a precise theory backing this opinion.
AI is a giant association engine. Linear encoding (the "King - Man + Woman = Queen" thing) is linear algebra. I taught linear algebra for decades.
As I explained to my optometrist today, if you're trying to balance a plate (define a hyperplane) with three fingers, it works better if your fingers are farther apart.
My whole life people have rolled their eyes when I categorize a situation using analogies that are too far flung for their tolerances.
Now I spend most of my time coding with AI, and it responds very well to my "fingers farther apart" far reaching analogies for what I'm trying to focus on. It's an association engine based on linear algebra, and I have an astounding knack for describing subspaces.
AI is raising the ceiling, not the floor.
If you made analogies based on Warhammer 40k or species of mosquitoes it would have reacted exactly the same.
For a statistician, determining a plane from three approximate points on the plane is far more accurate if the points aren't next to each other.
When we offer examples or associations in a prompt, we experience a similar effect in coaxing a response from AI. This is counter-intuitive.
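The statistics claim is easy to check numerically. A small numpy sketch (made-up numbers, plane z = x - 2y + 3, a little noise on the heights): fitting from three clustered points is badly conditioned and gives large coefficient errors, while the same fit from spread-out points is nearly exact:

    import numpy as np

    rng = np.random.default_rng(0)
    true_coeffs = np.array([1.0, -2.0, 3.0])  # plane z = x - 2y + 3

    def fit_plane(xy: np.ndarray) -> None:
        """Least-squares fit of z = a*x + b*y + c from noisy points on the plane."""
        A = np.column_stack([xy, np.ones(len(xy))])
        z = A @ true_coeffs + rng.normal(scale=0.01, size=len(xy))  # noisy heights
        coeffs, *_ = np.linalg.lstsq(A, z, rcond=None)
        err = np.linalg.norm(coeffs - true_coeffs)
        print(f"cond(A) = {np.linalg.cond(A):10.1f}   coefficient error = {err:.4f}")

    # Three points bunched together ("fingers close"): ill-conditioned, big error.
    fit_plane(np.array([[0.00, 0.00], [0.01, 0.00], [0.00, 0.01]]))
    # Three points spread apart ("fingers far apart"): well-conditioned, tiny error.
    fit_plane(np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]]))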
I'm fully aware that most of what I post on HN is intended for each future AI training corpus. If what I have to say was already understood I wouldn't say it.
To become good at something you have to work through the lower rungs and acquire skill. AI does all those lower level jobs, puts the people who need those jobs for experience on the street, and robs us of future experts.
The people who benefit the most are those who are already up on top of the ladder investing billions to make the ladder raise faster and faster.
But I share your concerns that:
AI doing the lesser tasks of [whatever] ->
fewer (no?) humans will do those tasks ->
fewer (no?) experienced humans to further the state of the art ->
automation-but-stagnation.
But tragedy of the commons says I have to teach my kid to use AI!
There are always people willing to take shortcuts at long-term expense. Frankly I'm fine with the selection pressure changing in our industry. Those who want to learn will still find a way to do it.
I've been coding for 15 years but I find I'm able to learn new languages and concepts faster by asking questions to ChatGPT.
It takes discipline. I have to turn off cursor tab when doing coding exercises. I have to take the time to ask questions and follow-up questions.
But yes I worry it's too easy to use AI as a crutch
I've been coding for decades already, but if I need to put something together in an unfamiliar language? I can just ask AI about any stupid noob mistake I make.
It knows every single stupid noob mistake, it knows every "how do I sort an array", and it explains well, with examples. Like StackOverflow on steroids.
The caveat is that you need to WANT to learn. If you don't, then not learning is easier than ever too.
So you aren’t still learning foundational concepts or how to think about problems, you are using it as a translation tool. Very different, in my opinion.
So of course AI falls into this realm.
> To become good at something you have to work through the lower rungs and acquire skill. AI does all those lower level jobs, puts the people who need those jobs for experience on the street, and robs us of future experts.
You can still do that with AI, you give yourself assignments and then use the AI as a resource when you get stuck. As you get better you ask the AI less and less. The fact that the AI is wrong sometimes is like test that allows you to evaluate if you are internalizing the skills or just trusting the AI.
If we ever have AIs which don't hallucinate, I'd want that added back in as a feature.
And my client is often the brokerage; they just want their agents to produce commissions so they can take a cut. They know their top producers probably won't get much from what I offer, but we all see that their worst performers could easily double their business.
I wonder: the graphs treat learning with and without AI as two different paths. But obviously people can switch between learning methods or abandon one of them.
Then again, I wonder how many people go from learning about a topic using LLMs to then leaving them behind to continue the old school way. I think the early spoils of LLM usage could poison your motivation to engage with the topic on your own later on.
I can watch a video about the subject, when I want to go deeper, I go to LLMs, throw a bunch of questions at it, because thanks to the videos I now know what to ask. Then the AI responses tell me what I need to understand deeper, so I pick a book that addresses those subjects. Then as I read the book and I don’t understand something, or I have some questions that I want the answer for immediately, I consult ChatGPT (or any other tool I want to try). At different points in the journey, I find something I could build myself to deepen my understanding. I google open source implementations, read them, ask LLMs again, I watch summary videos, and work my way through the problem.
LLMs serve as a “much better StackOverflow / Google”.
But once you know basics, LLMs are really good to deepen the knowledge, but using only them is quite challenging. But as a complementary tool I find them excellent.
Unlike the telephone (telephones excited a certain class of people into believing that world-wide enlightenment was on their doorstep), LLMs don't just reduce reliance on visual tells and mannerisms, they reduce reliance on thinking itself. And that's a very dangerous slope to go down on. What will happen to the next generation when their parents supply substandard socially-computed results of their mental work (aka language)? Culture will decay and societal norms will veer towards anti-civilizational trends. And that's exactly what we're witnessing these days. The things that were commonplace are now rare and sometimes mythic.
Everyone has the same number of hours and days and years. Some people master some difficult, arcane field while others while it away in front of the television. LLMs make it easier for the television-watchers to experience "entertainment nirvana" while enticing the smart, hard-workers to give up their toil and engage "just a little" rest, which due to the insidious nature of AI-based entertainment, meshes more readily with their more receptive minds.
When it comes to things I am not good at, it has given me the illusion of getting 'up to speed' faster. Perhaps that's a personal ceiling raise?
I think a lot of these upskilling utilities will come down to delivery format. If you use a chat that gives you answers, don't expect to get better at that topic. If you use a tool that forces you to come up with answers yourself and get personalized validation, you might find yourself leveling up.
Disagree. It's only the illusion of a personal ceiling raise.
---
Example 1:
Alice has a simple basic text only blog. She wants to update the styles on her website, but wants to keep her previous posts.
She does research to learn how to update a page's styles to something more "modern". She updates the homepage, post page, about page. She doesn't know how to update the login page without breaking it because it uses different elements she hasn't seen before.
She does research to learn what the new form elements are, and on the way sees recommendations on how to build login systems. She builds some test pages to learn how to restyle forms and, while she's at it, also learns how to build login systems.
She redesigns her login page.
Alice believes she has raised the ceiling of what she can accomplish.
Alice is correct.
---
Example 2:
Bob has a simple basic text only blog. He wants to update the styles on his website, but wants to keep his previous posts.
He asks the LLM to help him update styles to something more "modern". He updates the homepage, post page, about page, and login page.
The login page doesn't work anymore.
Bob asks the LLM to fix it and after some back and forth it works again.
Bob believes he has raised the ceiling of what he can accomplish.
Bob is incorrect. He has not increased his own knowledge or abilities.
A week later his posts are gone.
---
There are only a few differences between both examples:
1. Alice does not use LLMs, but Bob does.
2. Alice knows how to redesign pages, but Bob does not.
3. Alice knows how login systems work, but Bob does not.
Bob simply asked the LLM to redesign the login page, and it did.
When the page broke, he checked that he was definitely using the right username and password but it still wasn't working. He asked the LLM to change the login page to always work with his username and password. The LLM produced a login form that now always accepted a hard coded username and password. The hardcoded check was taking place on the client where the username and password were now publicly viewable.
Bob didn't ask the LLM to make the form secure, he didn't even know that he had to ask. He didn't know what any of the footguns to avoid were because he didn't even know there were any footguns to avoid in the first place.
Both Alice and Bob started from the same place. They both lacked knowledge on how login systems should be built. That knowledge was known because it is documented somewhere, but it was unknown to them. It is a "known unknown".
When Alice learned how to style form elements, she also read links on how forms work, which led her to links on how login systems work. That knowledge for her went from a "known unknown" to a "known known" (knowledge that is known, that she now also knows).
When Bob asked the LLM to redesign his login page, at no point in time does the knowledge of how login systems work become a "known known" for him. And a week later some bored kid finds the page, right clicks on the form, clicks inspect and sees a username and password to log in with.
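Nobody has Bob's actual code, but the footgun probably looked something like the first half of this sketch: the "login" ships hardcoded credentials to the browser, whereas the check Alice learned to do stays on the server and compares salted hashes (the usernames and passwords here are made up for illustration):

    import hashlib, hmac, os

    # What Bob got back: a "login" page with the credentials baked into the HTML,
    # visible to anyone who right-clicks and inspects the page.
    BROKEN_LOGIN_PAGE = """
    <form onsubmit="return login()">
      <input id="u"><input id="p" type="password"><button>Log in</button>
    </form>
    <script>
      function login() {
        return document.getElementById('u').value === 'bob'
            && document.getElementById('p').value === 'hunter2';
      }
    </script>
    """

    # What Alice learned to do instead: verify on the server against a salted hash.
    def hash_password(password: str, salt: bytes) -> bytes:
        return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)

    SALT = os.urandom(16)
    STORED_HASH = hash_password("hunter2", SALT)  # stored server-side only

    def check_login(username: str, password: str) -> bool:
        if username != "bob":
            return False
        return hmac.compare_digest(hash_password(password, SALT), STORED_HASH)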
AI also enables much more efficient early stage idea validation, the point at which ideas/projects are the least anchored in established theory/technique. Thus AI will be a great aid in idea generation and early stage refinement, which is where most novel approaches stall or sit on a shelf as a hobby project because the progenitor doesn't have enough spare time to work through it.
edit: this comment was posted tongue-in-cheek after my comment reflecting my actual opinion was downvoted with no rebuttals:
It seems with computers we often think and reason far less than without. Everything required thought previously; now we can just copy and paste our way to Word docs for everything. PowerPoints are how key decisions are communicated in most professional settings.
Before modern computers and especially the internet we also had more time for deep thinking and reasoning. The sublimity of deep thought in older books amazes me and it feels like modern authors are just slightly less deep on average.
So LLMs are, in my view, an incremental change rather than a stepwise change with respect to their effects on human cognition.
In some ways LLMs allow us to return a bit to more humanistic deep thinking. Instead of spending hours looking up minutia on Google, StackOverflow, etc now we can ask our favorite LLM instead. It gives us responses with far less noise.
Unlike with textbooks, we can have dialogues and have it take different perspectives, whereas textbooks only gave you that author's perspective.
Of course, it’s up to individuals to use it well and as a tool to sharpen thinking rather than replace it.
Think of how a similar chart for chess/go/starcraft-playing proficiency has changed over the years.
There will come a time when the hardest work is being done by AI. Will that be three years from now or thirty? We don't know yet, but it will come.
To clarify, I mean that the AI tools can help you get things done really fast but they lack both breadth and depth. You can move fast with them to generate proofs of concept (even around subproblems to large problems), but without breadth they lack the big picture context and without depth they lack the insights that any greybeard (master) has. On the other hand, the "engineering" side is so much more than "things work". It is about everything working in the right way, handling edge cases, being cognizant of context, creating failure modes, and all these other things. You could be the best programmer in the world, but that wouldn't mean you're even a good engineer (in real world these are coupled as skills learned simultaneously. You could be a perfect leetcoder and not helpful on an actual team, but these skills correlate).
The thing is, there will never be a magic button that a manager can press to engineer a product. The thing is, for a graybeard most of the time isn't spent on implementation, but on design. The thing is, to get to mastery you need experience, and that experience requires understanding of nuanced things. Things that are non-obvious. There may be a magic button that allows an engineer to generate all the code for a codebase, but that doesn't replace engineers. (I think this is also a problem in how we've been designing AI code generators. It's as if they're designed for management to magically generate features. The same thing they wish they could do with their engineers. But I think the better tool would be to focus on making a code generator that would generate based on an engineer's description.)
I think Dijkstra's comments apply today just as much as they did then[0]
[0] On the foolishness of "natural language programming" https://www.cs.utexas.edu/~EWD/transcriptions/EWD06xx/EWD667...
As you say, software engineering is not only constructing program texts; it's not even only applied math or purely scientific. At least, that is my stance. I suspect AI code editors have lots of said tacit knowledge baked in (via the black box itself or its engineers), but we would be better off thinking about this explicitly.
> I suspect AI code editors have lots of said tacit knowledge baked in (via the black box itself or its engineers) but we would be better off thinking about this explicitly.
Until the AI is actually AGI, I suspect it'll be better for us to do it. After all, if you don't do the design then you probably don't understand the design. Those details will kill you.
So even if you follow the article's premise (I do not), it still can potentially 'raise' you from wherever you were.
The key seems to be whether you have enough expertise to evaluate or test the outputs. Some others have referred to this as having a good sense of the "known/unknown" matrix for the domain.
The AI will be most helpful for you on the known-unknown / unknown-known axis, not so much in the known-known / unknown-unknown parts. The latter, unfortunately, is where you see the most derailed use of the tech.
All the AI junk, like agents in service centers that you need to outplay in order to get in touch with a human: we as consumers are accepting this new status quo. We will accept products that sometimes do crazy stuff because of hallucinations. Why? Ultimate capitalism, consumerism, sheepism, some other ism.
So AI (whether it is correlation or causation, I don't know) also corresponds with a lower level of mastery.
The very highest levels of mastery can only come from slow, careful, self-directed learning that someone in a hurry to speedrun the process isn't focusing on.
I wanted to know how to clone a single folder in a Git repository. Having done this before, I knew that there was some incantation I needed to make to the git CLI to do it, but I couldn't remember what it was.
I'm very anti-AI for a number of reasons, but I've been trying to use it here and there to give it the benefit of the doubt and avoid becoming a _complete_ dinosaur. (I was very anti-vim ages ago when I learned emacs; I spent two weeks with vim and never looked back. I apply this philosophy to almost everything as a result.)
I asked Qwen3-235B (reasoning) via Kagi Assistant how I could do this. It gave me a long block of text back that told me to do the thing I didn't want it to do: mkdir a directory, clone into it, move the directory I wanted into the root of the directory, delete everything else.
When I asked it if it was possible to do this without creating the directory, it, incorrectly, told me that it was not. It used RAG-retrieved content in its chain of thought, for what that's worth.
It took me only 30 seconds or so to find the answer I wanted on StackOverflow. It was the second most popular answer in the thread. (git clone --filter=tree:0, then git sparse-checkout set --no-cone $FOLDER, found here: https://stackoverflow.com/a/52269934)
I nudged the Assistant a smidge more by asking it if there was a subcommand I could use instead. It then suggested "sparse-checkout init", which, according to the man page for this subcommand, is deprecated in favor of "set". (I went to the man page to understand what the "cone" method was and stumbled on that tidbit.)
THIS is the thing that disappoints me so much about LLMs being heralded as the next generation of search. Search engines give you many, many sources to guide you to the correct answer if you're willing to do the work. LLM services tell you what the "answer" is, even if it's wrong. You get potential misinformation back while also turning your brain off and learning less; a classic lose-lose.
Also part of the workflow of using AI is accepting that your initial prompts might not get the right answer. It's important to scan the answer like you did and use intuition to know 'this isn't right', then try again. Just like we learned how to type in good search queries, we'll also learn how to write good prompts. Sands will shift frequently at first, with prompt strategies that worked well yesterday requiring a different strategy tomorrow, but eventually it will stabilize like search query strategies did.
Someone who doesn't use the Git CLI at all and is relying on an LLM to do it will not know that. There's also no reason for them to search beyond the LLM or use the LLM to go deeper because the answer is "good enough."
That's the point of what I'm trying to make. You don't know what you don't know.
Trying different paths that might go down dead ends is part of the learning process. LLMs short-circuit that. This is fine if you think that learning isn't valuable in, this case, software development. I think it is.
<soapbox>
More specifically, I think that this will, in the long term, create a pyramidal economy where engineers like you and I who learned "the old way" will reap most of the rewards while everyone else coming into the industry will fight for scraps.
I suppose this is fine if you think that this is just the natural order of things. I do not.
Tech is one of, if not the only, career path(s) that could give almost anyone a very high quality of life (at least in the US) without gatekeeping people behind the school they attend (i.e. being born into the right social strata, basically), many years of additional education and even more years of grinding like law, medicine and consulting do.
I'm very saddened to see this going away while us in the old guard cheer its destruction (because our jobs will probably be safe regardless).
</soapbox>
I also disagree with the claim that the LLM gives you a "plethora" of sources. The assistant search I used gave me three [^0]. A regular search on the same topic gave me more than 15. [^1]
Yes, the 15 it gives me are all over the quality map, but I have much more information at my disposal to find the answer I'm looking for. It also doesn't purport to be "the answer," like LLMs tend to do.
[^0] https://kagi.com/assistant/839b0239-e240-4fcb-b5da-c5f819a0f...
Intraloper, weirdly enough, is a word in use.
I'm exter mad about that.
In any single sentence context they cannot refer to the same relationships, and that which they are not is precisely the domain of the other word: they are true antonyms.
External relationships are a thing, which are in neither of those domains.
An intramural activity is not the same thing as an intermural activity, but the overwhelming majority of activities which are not intramural activities are not intermural activities either, most are extramural activities.
They are a subset of inter-, pretty obviously.
Edit: Geometrically, I agree an extraloper could run on secants (or radii) but they're not allowed to strike a chord.
But getting back to generative text, I feel that "tangent" is more appropriate ;-)