Using it efficiently is absolutely a skill. Just like google-fu is a skill. Or reading fast / skimming is a skill. Or like working with others is a skill. And so on and so on.
Bicycling is slightly harder than walking.
The kinds of things you'll learn are:
- What's even worth asking for? What categories of requests just won't work, what scope is too large, what kinds of things are going to just be easier to do yourself?
- Just how do you phrase the request, what kind of constraints should you give up front, what kind of things do you need to tell it that should be self-evident but aren't?
- How do you deal with sub-optimal output? When do you fix it yourself, when do you get the AI to iterate on it, and when do you just throw out the entire session and start afresh?
The only way for it to not be a skill would be if how you use an AI did not matter for the quality of the output, or if getting better results were just a natural talent some people have and some don't. Both of those seem like pretty unrealistic ideas.
I think there's probably a discussion to be had about how deep or transferable the skill is, but your opening gambit of "it's not a skill, stop trying to make it one" is not a productive starting point for such a discussion.
That seems to be a struggle for many. A friend of my wife turned 50 and we went to her birthday party. Two speeches and one song were AI-generated, and two speeches were written by actual humans. Guess which should never have been created, let alone performed?
More and more I struggle to see the point of LLMs. I can sort of convince myself that there are niches where LLMs are really useful, but it's getting harder to maintain that illusion. There are cases where AI technologies are truly impressive and transformative, but they are rarely based on a chat interface.
And rest assured I don't care about you either (why such tone lol).
People claiming it's a skill should read up on experiments on behavior adaptation to stochastic rewards. Subjects develop elaborate "rain dances" in the belief that they can influence the outcome. Not unlike sports fans' superstitions.
What you need to do every few months/weeks, depending on when the last model was released, is to reevaluate your bag of tricks.
At some point it becomes a roulette - you try this, you try that, and maybe it works or maybe not ...
https://ai-analytics.wharton.upenn.edu/generative-ai-labs/re...
My point still holds that it is optimizable though (https://github.com/zou-group/textgrad, https://arxiv.org/abs/2501.16673)
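For a sense of what "optimizable" means in practice, here's a minimal sketch in the spirit of TextGrad's PyTorch-style API (treat the exact names and signatures as assumptions and check the repo; the same loop can also be pointed at a system prompt instead of an answer):

    import textgrad as tg

    # An LLM acts as the "backward engine" that produces natural-language feedback.
    tg.set_backward_engine("gpt-4o", override=True)
    model = tg.BlackboxLLM("gpt-4o")

    question = tg.Variable(
        "If it takes 1 hour to dry 25 shirts in the sun, how long for 30 shirts?",
        role_description="question to the LLM",
        requires_grad=False,
    )
    answer = model(question)
    answer.set_role_description("concise and accurate answer to the question")

    # The "loss" is just another LLM call that critiques the answer; backward()
    # turns that critique into textual feedback, and the optimizer rewrites the variable.
    optimizer = tg.TGD(parameters=[answer])
    loss_fn = tg.TextLoss("Evaluate the answer for correctness. Be logical and very critical.")
    loss = loss_fn(answer)
    loss.backward()
    optimizer.step()

The point being: the "try this, try that" loop can be framed as an optimization problem rather than a rain dance.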
>Subjects develop elaborate "rain dances" in the belief that they can influence the outcome. Not unlike sports fans superstitions.
Anybody tuning neural weights by hand would feel like this.
And yes, the model keeps changing under you -- much like a horse changing under a jockey, forcing them to adapt. Or like Formula drivers adapting to different car brands.
You can absolutely improve the results by experimenting with prompting, by building a mental model of what happens inside the "black box", by learning what kinds of context it has/does not have, how (not) to overburden it with instructions, etc.
Sure, by definition, prompting is a skill. But it's a skill that really isn't hard to learn, and the gap between a beginner and a master is pretty narrow. The real differentiator is deeply understanding the domain you're prompting for, e.g. software development or visual design. Most value comes from knowing what to ask for and knowing how to evaluate the results.
It’s no secret that a lot of people (I’d like to have an accurate percentage) got into coding because of the money. When you view it from that perspective everything becomes clearer: those people don’t care for the craft or correctness. They don’t understand or care when something is wrong, and it’s in their best interest to convince themselves and everyone else that AI is as good as or better than any expert programmer, because it means they themselves don’t need to improve or care any more than they already do; they can just concentrate on the getting-rich part.
There are experts (programmers or otherwise) who use these tools as guidelines and always verify the output. But too often they defend LLMs as unambiguously good because they fail to understand that the overwhelming majority of humans aren’t experts or sufficiently critical of whatever they read, taking whatever the LLM spits out as gospel. Which is what makes them dangerous.
Because that’s the claim of all the AI companies. Right next to the claim that AGI is in reach.
The question is whether, if everyone uses AI, all text will become too similar.
For all the talk about jobs and art, LLMs seem to love shitposting.
More like an absolute bumbling idiot of a colleague that you have to explain things to over and over again and can’t ever trust to get anything right.
Sam Altman said AI would "clone his brain" by 2026. He is wrong, it already has.
I've listened to him speak many times and that's an accurate description. Seriously, has he ever said even one interesting thing?
these guys are making shit up on the fly now. anything goes.
When it takes longer to prompt it with the details you would want in an email than to just write the email, that is user error?
Like I get the use case with summarization or translation but I can’t trust the output 100% when I know complete nonsense could be output.
There is so much friction when you try to do anything technical by talking to someone who doesn't know you; you have to know each other extremely well for there to be no friction.
This is why people prefer communicating in pseudocode rather than natural language when discussing programming; it's really hard to describe what you want in words.
Except talking is not intuitive. It's an unbelievably hard skill. How many years have you spent on talking until you can communicate like an adult? To convey complicated political, philosophical, or technical ideas? To express your feelings honestly without offending others?
For most people it takes from 20 years to a lifetime. Personally I can't even describe a simple (but not commonly known) algorithm to another programmer without a whiteboard.
I've heard plenty of overly complicated explanations of what a monad is. It's also not a complicated concept. Return a partial binding until all argument slots are filled, then return the result of the function. Jargon gets in the way of simple explanations. Ask a kid to explain something, and it will probably be a hell of a lot clearer.
The more experience you have, the harder it often is to draw out something untainted by that experience to give to someone else. We are the sum of our experience, and so it's so darn easy to get lost in that, rather than to speak from where the other person is standing.
Sure if you leave out too much context you get generic responses but that isn't too surprising.
From Anger to Denial to Bargaining. And we are starting out with flattery. Masterful gambit!
Instead of participating in slop coding (sorry, "AI collaboration"), I think I'll just wait for the author and their ilk to make their way across Depression and Acceptance.
...when anyone starts talking in universals like this, they're usually deep in some hype cycle.
This is a problematic approach that many people take; they posit that:
1) AI is fundamentally transformative.
2) People who don't acknowledge that simply haven't tried it.
However, I posit that:
3) People who think that haven't actually used it in a serious capacity or are deliberately misrepresenting things.
The problem is that:
> In reality, I go back and forth with AI constantly—sometimes dozens of times on a single piece of work. I refine, iterate, and improve each part through ongoing dialogue. It's like having a thoughtful and impossibly fast colleague who's always available to help me develop and sharpen my ideas.
...is only true for trivial problems.
The author calls this out, saying:
> It won't excel at consistently citing specific papers, building codes, or case law correctly. (Advanced techniques exist for these tasks, but they're not worth learning when you're just starting out. For now, consider them out of scope.)
...but, this is really the heart of everything.
What are those advanced techniques? Seriously, after 30 days of using AI if all you're doing is:
> Prepare for challenging conversations by using ChatGPT to simulate potential scenarios, helping you approach interpersonal dynamics with empathy and grace.
Then what the absolute heck are you doing.
Stop gaslighting everyone.
Those 'advanced techniques' are all anyone cares about, because they are the things that are hard, and don't work.
In reality, it doesn't matter how much time you spend learning; the technology is fundamentally limited. It can't do some things.
Spending time learning how to do trivial things will never enable you to do hard things.
It's not missing the 'human touch'.
It's the crazy hallucinations, invalid logic, failure to do as told, flat out incorrect information or citations, inability to perform a task (eg. as an agent) without messing some other thing up.
There are a few techniques that can help you have an effective workflow; but seriously, if you're a skeptic about AI, spending a month doing trivial stuff like asking for '10 ideas about X' is an insult to your intelligence and doesn't address any of the concerns that, I would argue, skeptics and real people actually have about AI.
That’s the function of a tool. To help do something in a more relaxed manner. Learning to use it can take some time, but the acquired proficiency will compensate for that.
General-public LLMs have been around for two years, and still today there are no concrete use cases that fit the definition of a tool. It’s "trust me bro!" and warnings in small print.
There are some, but you won't like them. Three big examples:
a) Automating human interactions. (E.g., "write some birthday wishes for my coworker".)
b) Offensive jokes and memes.
c) Autogenerated NPCs for role-playing games.
So, generally things that don't require actual intelligence. (Weird that empathy is the first thing we managed to automate away with "AI".)
It’s like the people who think that everyone who opposes cryptocurrencies only do so because they are jealous they didn’t invest early.
Anthropic used to do this with Claude's character until Claude 3, but then dropped it. OAI's image generation is consistently ahead in prompt understanding and abstraction, but they famously don't give a flying turd about nuances. Current models are produced by ML nerds who handwave the complexity away, not by experts in what they're trying to solve. If they want it to be usable now, they need to listen to people like this [1]. But I don't think they really care.
[1] https://yosefk.com/blog/the-state-of-ai-for-hand-drawn-anima...
In my opinion it is ridiculous to still say that there is anything fundamentally different between human intelligence and scaling LLMs another 10x or 100x.
However I'm not talking about technical tasks with objectively measurable criteria of success (which is a super narrow subset, not even coding is entirely like this). I'm saying that you have to transfer some kind of human preference to the model, as unsupervised learning will never be able to infer an accurate reference point for what you subjectively want from the pretraining data on its own, no matter the scale. Even if I'm wrong on that somehow, we're currently at 1x scale, and model finetuning right now is a pretty hands-on process. It's clear that ML people that usually curate this process have a really vague idea of what looks/reads/feels good. Which is why they produce slop.
TFA is talking about that:
>AI doesn’t understand why something matters, can’t independently prioritize what’s most important, and doesn’t bring the accountability or personal investment that gives work its depth and resonance.
Of course it doesn't, because it's not trained to understand it. Claude was finetuned for "human likeness" up to version 3, and Opus had a really deep understanding of why something matters, better agency than any current model, and a great reference point for your priorities. That's what happens when you give the curation to a non-ML-adjacent person who knows what she's doing (AFAIK she has since left Anthropic, and Anthropic seemingly dropped that "character training" policy).
Check 4o's image generation as well - it has a terrible yellow tint by default, thick constant-width linework in "hand-drawn" pictures, etc. You can somewhat steer it with a prompt and references, but it's pretty clear that the people who have been finetuning it didn't have a good idea of whether their result is any good, so they made something instantly recognizable as slop. This is not just a botched training run or a dataset-preparation bug; it's a recurring pattern for OpenAI, they simply do not care about this. The recurring pattern for Midjourney, for example, is to finetune their models on kitsch.
This all could be fixed in no time, making these models way more usable as products, right now, not someday when they maybe reach the 100x scale (which is neither likely to happen nor likely to change anything).
I am with you that the current dichotomy of training vs. inference seems unsustainable in the long run. We need ways for LLMs to learn from the interactions they are having, we might need introspection and self-modification.
I am not sure we need more diversity. Part of your argument sounds to me like we do. Slop (to me) is primarily the result of over-generalizing to everyone's taste. We get generic replies and generic images rather than consistently unique outcomes which we could call a personality.
>AI doesn’t understand why something matters.
I beg to differ. LLMs have seen all the reasons why something could matter. This is how they do everything. This is also how the brain works: You excite neurons with two concepts at a similar time and they become linked. For causality/correlation/memory...
I also agree with you that too much reliance on RLHF has not been the best idea. We are overfitting what people want rather than what people should want if they knew. LLMs are too eager to please and haven't yet learned how much teenage rebellion is needed for progress.
But if we accept that LLMs generally (in other use-cases) produce output that looks deceptively similar to what you ask for (i.e. it seems to work) but is actually worthless junk if carefully inspected (i.e. it doesn't actually work), why would you think they are able to generate accurate feedback?
In some ways it's easier to delegate to an AI because you don't have to care for anyone's feelings but your own, and you lose nothing but your own time when things don't go well and you have to reset. On the other hand, when the delegation does not go well, you still got yourself to blame first.
It’s like a slightly over-eager junior-mid developer, which however doesn’t mind rewriting 30k lines of tests from one framework to another. This means I can let it handle that dirty work, while focusing on the fun and/or challenging parts myself.
I feel like there’s also a meaningful split of software engineers into those who primarily enjoy the process of crafting code itself, and those that primarily enjoy building stuff, treating the code more as a means to an end (even if they enjoy the process of writing code!). The former will likely not have fun with AI, and will likely be increasingly less happy with how all of this evolves over time. The latter I expect are and will mostly be elated.
One with brain damage, maybe. I tried having Claude & Gemini modify a Go program with an absolutely trivial change (changing the units displayed in an output type), and it got one of the four lines of code correct (the actual math for the unit conversion); the rest was incorrect.
In the end, I integrated the helper function it output myself.
SOTA models can generate two or three lines of code accurately at a time and you have to describe them with such specificity that I've usually already done the hard part of the thinking by the time I have a specific enough prompt, that it's easier to just type out the code.
At best they save me looking up a unit conversion formula, which makes them about as useful as a search engine
Crucially, you lose money with a lot of these models when they output the wrong thing, because you pay by token whether the tokens coming out are what you want or not.
It's a bit like a slot machine. You write your prompt, insert some money, and pull the lever. Sometimes it saves you a lot of time! Sometimes, not so much. Sometimes it gets 80% of the way and you think oh, let me just put in another coin and tweak my prompt and pull the lever, this time it will get me 100%
Listening to people justify pulling the lever over and over again is a little bit like listening to an addict excusing their behavior.
I realize there are flat rate plans like Kagi offers, but the API offerings and IDE integrations all feature the slot machine and sunk cost effects that I describe.
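To make the per-token point concrete, here's a toy back-of-envelope (every number below is a placeholder I made up, not a real price or measurement):

    # Toy expected-cost sketch for retry-heavy prompting (all numbers hypothetical).
    price_per_1k_output_tokens = 0.015   # hypothetical USD price
    tokens_per_attempt = 2_000           # hypothetical response length
    attempts_until_acceptable = 4        # hypothetical number of "lever pulls"

    cost = attempts_until_acceptable * (tokens_per_attempt / 1_000) * price_per_1k_output_tokens
    print(f"~${cost:.2f} spent; only the last attempt's output was kept")

However you pick the numbers, you pay for every pull of the lever, not just the winning one.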
I agree, it can feel a lot like a slot machine at times, and it's a failure mode somewhat unique to developing with LLMs, where it doesn't just fail outright or tell you "I don't know how to do that", but instead you find yourself at the end of a sometimes hours-long spiral of trying just-one-more-prompt.
It's important to experience this mode of failure and learn to notice the "spiral" early and adjust the approach. Sometimes it's enough to switch to a different model; often an explicit planning step helps. But more likely than not, a "spiral" means you're approaching the frontier of what LLMs can do. In my experience, certain types of changes are really hard for current-gen LLMs to pull off, like large-scale refactorings that change the project architecture, or implementing genuinely novel algorithmic ideas, so we still need a human touch for these (yay?).
That there is such a calendar for using ChatGPT in the style of topics like "how to eat healthy", "how to stay fit" or "how to be more confident" shows to me more than anything what impact AI has on our society.