This is pretty ironic, considering the subject matter of that blog post. It's a super-common misconception that's gained very wide popularity due to reactionary (and, imo, rather poor) popular science reporting.
The author parroting that with confidence in a post about Dunning-Krugering gives me a bit of a chuckle.
ANNs are arbitrary function approximators. The training process uses statistical methods to identify a set of parameters that approximate the function as closely as possible. That doesn't necessarily mean that the end result is equivalent to a very fancy multi-stage linear regression. It's a possible outcome of the process, but it's not the only possible outcome.
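To make "arbitrary function approximator" concrete, here's a toy sketch (assuming PyTorch; the target function and network size are made up purely for illustration): a tiny MLP trained to approximate sin(x). The nonlinearity between the layers is exactly what keeps the result from collapsing into a glorified linear regression.

    import torch
    import torch.nn as nn

    # Toy data: learn y = sin(x) on [-pi, pi]
    x = torch.linspace(-3.1416, 3.1416, 256).unsqueeze(1)
    y = torch.sin(x)

    # A small MLP; the Tanh nonlinearity is what makes it more than linear regression
    net = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 1))
    opt = torch.optim.Adam(net.parameters(), lr=1e-2)

    # "Statistical methods to identify a set of parameters": plain gradient descent on MSE
    for step in range(2000):
        opt.zero_grad()
        loss = nn.functional.mse_loss(net(x), y)
        loss.backward()
        opt.step()

    print(f"final MSE: {loss.item():.5f}")  # small value -> sin(x) has been approximated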
Looking at an LLM's I/O structure and training process is not enough to conclude much of anything. And that's the misconception.
I'm not sure I follow. LLMs do probabilistic next-token prediction based on the current context; that is a factual, foundational statement about the technology that runs all LLMs today.
We can ascribe other things to that, such as reasoning or knowledge or agency, but that doesn't change how they work. Their fundamental architecture is well understood, even if we allow for the idea that maybe there are some emergent behaviors that we haven't described completely.
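To be concrete about what "probabilistic next-token prediction" means mechanically, here's a minimal sketch (assuming the Hugging Face transformers library and GPT-2 weights, purely as an illustration): the model maps a context to a probability distribution over its vocabulary, and generation is repeated sampling from that distribution.

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    ids = tok("The capital of France is", return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits[0, -1]      # scores for the next token only
    probs = torch.softmax(logits, dim=-1)      # a distribution over the whole vocabulary

    top = torch.topk(probs, 5)                 # the five most likely continuations
    for p, i in zip(top.values, top.indices):
        print(repr(tok.decode(int(i))), round(float(p), 3))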
> It's a possible outcome of the process, but it's not the only possible outcome.
Again, you can ascribe these other things to it, but to say that these external descriptions of outputs call into question the architecture that runs these LLMs is a strange thing to say.
> Looking at an LLM's I/O structure and training process is not enough to conclude much of anything. And that's the misconception.
I don't see how that's a misconception. We evaluate pretty much everything by inputs and outputs. And we use those to infer internal state. Because that's all we're capable of in the real world.
What more are LLMs than statistical inference machines? I don't know that I'd assert that's all they are with confidence, but all the configuration options I can play with during generation (Top K, Top P, Temperature, etc.) are all ways to _not_ select the most likely next token, which leads me to believe that they are, in fact, just statistical inference machines.
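As a toy illustration of that point (pure NumPy, made-up numbers, not any particular model's actual implementation), Temperature, Top K and Top P all just reshape or truncate the next-token distribution before a token is drawn from it:

    import numpy as np

    def sample_next(logits, temperature=1.0, top_k=None, top_p=None, seed=0):
        """Toy next-token sampler: reshape the distribution, then draw from it."""
        rng = np.random.default_rng(seed)
        logits = np.asarray(logits, dtype=float) / max(temperature, 1e-8)
        probs = np.exp(logits - logits.max())
        probs /= probs.sum()

        if top_k is not None:                  # keep only the k most likely tokens
            cutoff = np.sort(probs)[-top_k]
            probs = np.where(probs >= cutoff, probs, 0.0)
            probs /= probs.sum()

        if top_p is not None:                  # nucleus sampling: keep the smallest set of
            order = np.argsort(probs)[::-1]    # tokens whose cumulative probability reaches top_p
            cum = np.cumsum(probs[order])
            keep = order[: int(np.searchsorted(cum, top_p)) + 1]
            mask = np.zeros_like(probs)
            mask[keep] = 1.0
            probs = probs * mask
            probs /= probs.sum()

        # Still a random draw -- just from a sharpened and/or truncated distribution.
        return rng.choice(len(probs), p=probs)

    # Five imaginary tokens; lower temperature sharpens, top_k/top_p prune the tail.
    print(sample_next([4.0, 3.0, 1.0, 0.5, 0.1], temperature=0.7, top_k=3, top_p=0.9))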
> As a ChatGPT user I notice that I’m often left with a sense of certainty.
They have almost the opposite effect on me.
Even with knowledge from books or articles I've learned to multi-source and question things, and my mind treats the LLMs as a less reliable averaging of sources.
"Don't just trust wikipedia, check it's resources, because it's crowdsourced and can be wrong".
Now, almost 2 decades later, I rarely hear this stance, and I see people relying on Wikipedia as an authoritative source of truth, i.e., linking to Wikipedia instead of the underlying sources.
In the same sense, I can see that "Don't trust LLMs" will slowly fade away and people will blindly trust them.
I've noticed things like Gemini summaries on Google searches are also generally close enough.
That's a different scenario. You shouldn't _cite wikipedia in a paper_ (instead you should generally use its sources), but it's perfectly fine in most circumstances to link it in the course of an internet argument or whatever.
This comes from decades of teachers misremembering what the rule was, until eventually it morphed into the Wikipedia-specific form we see today. The actual rule is that you cannot cite an encyclopaedia in an academic paper, full stop.
Wikipedia is an encyclopaedia and therefore should not be cited.
Wikipedia is the only encyclopaedia most people have used in the last 20 years, therefore Wikipedia = encyclopaedia in most people's minds.
There's nothing wrong with using an encyclopaedia for learning or introducing yourself to a topic (in fact this is what teachers told students to do). And there's nothing specifically wrong about Wikipedia either.
Yeah, the stupid.
I've been thinking about this a bit. We don't really think this way in other areas; is it appropriate to think this way here?
My car has an automatic transmission, am I a fraud because the machine is shifting gears for me?
My tractor plows a field, am I a fraud because I'm not using draft horses or digging manually?
Spell check caught a word, am I a fraud because I didn't look it up in a dictionary?
And, for instance, I have barely any knowledge of how my computer works, but it's a tool I use to do my job. (and to have fun at home.)
Why are these different than using LLMs? I think at least for me the distinction is whether or not something enables me to perform a task, or whether it's just doing the task for me. If I had to write my own OS and word processor just to write a letter, it'd never happen. The fact that the computer does this for me facilitates my task. I could write the letter by hand, but doing it in a word processor is way better. Especially if I want to print multiple copies of the letter.
But for LLMs, my task might be something like "setting up Apache is easy, but I've never done it, so just tell me how to do it so I don't fumble through learning and make it take way longer." The task was setting up Apache. The task was assigned to me, but I didn't really do it. There wasn't necessarily some higher-level task that I merely needed Apache for. Apache was the whole task! And I didn't do it!
Now, this will not be the case for all LLM-enabled tasks, but I think this distinction speaks to my experience. In the previous word processor example, the LLM would just write my document for me. It doesn't allow me to write my document more efficiently; it's efficient only in the sense that I no longer need to actually do it myself, except maybe to act as an editor (and most people don't even do much of that work). My skill in writing either atrophies or never fully develops, since I don't actually need to spend any time doing it or thinking about it.
In a perfect world, I use self-discipline to have the LLM show me how to set up Apache, then take notes, and then research, and then set it up manually in subsequent runs; I'd have benefited from learning the task much more quickly than if I'd done it alone, but also used my self-discipline to make sure I actually really learned something and developed expertise as well. My argument is that most people will not succeed in doing this, and will just let the LLM think for them.
So, while it's an imperfect answer that I haven't really nailed down yet, maybe the answer is just to realize this and make sure we're doing hard things on purpose sometimes. This stuff has freed up time; we just can't spend it doomscrolling.
That's an interesting take on the loneliness crisis that I had not considered. I think you're really onto something. Thanks for sharing. I don't want to dive into this topic too much since it's political and really off-topic for the thread, but thank you for suggesting this.
The jury is still out on what value these things will bring.
Some people certainly seem to be. You see this a lot on web forums: someone spews a lot of confident, superficially plausible-looking nonsense, then when someone points out that it is nonsense, they say they got it from a magic robot.
I think this is particularly common for non-tech people, who are more likely to believe that the magic robots are actually intelligent.
It is not.
Is it just me, or do dumb people seem to use this statement more than ever?
We are all geniuses!
This is not what the Dunning-Kruger effect is. It's the lack of the metacognitive ability to understand one's own skill level; overconfidence resulting from ignorance isn't the same thing. Joe Rogan propagated the version of this phenomenon that infiltrated public consciousness, and we've been stuck with it ever since.
Ironically, you can plug this story into your favorite LLM, and it will tell you the same thing. And, also ironically, the LLM will generally know more than you in most contexts, so anyone with a degree of epistemic humility is better served taking it at least as seriously as their own thoughts and intuitions, if not at face value.
LLMs are cool and useful technology, but if you approach them with the attitude that you're talking with another person, you are leaving yourself vulnerable to all sorts of cognitive distortions.
The larger problem is cognitive offloading. The people for whom this is a problem were already not doing the cognitive work of verifying facts and forming their own opinions. Maybe they watched the news, read a Wikipedia article, or listened to a TED talk, but the results are the same: an opinion they felt confident in without a verified basis.
To the extent this is on 'steroids', it is because they see it as an expert-in-everything computer and because it is so much faster than watching a TED talk or reading a long-form article.
Provide a person confidence in their opinion and they will not challenge it, as that would risk the reward of believing you live in a coherent universe.
The majority of people have never heard the term “epistemology”, despite the concept being central to how people derive coherence. Yet all these trite pieces written about AI and its intersection with knowledge claim some important technical distinction.
I’m hopeful that a crisis of epistemology is coming, though that’s probably too hopeful. I’m just enjoying the circus at this point
Regardless of what media you get your info from you have to be selective of what sources you trust. It's more true today than ever before, because the bar for creating content has never been lower.
When I use ChatGPT I do the same before I even ask for the fact: how common is this problem? How well known is it? How likely is it that ChatGPT both knows it and can surface it? Afterwards I don't feel like I know something; I feel like I've got a faster broad idea of what facts might exist and where to look for them, a good set of things to investigate, etc.
This more closely fits our models of cognition anyway. There is nothing really very like a filter in the human mind, though there are things that feel like one.
LLMs seem like people but aren't, and in some ways they have a lot of the signals of a reliable source; I'm not sure how these processes will map onto them. I'm skeptical of anyone who is confident about it either way, in fact.
> The mental motion of “I didn’t really parse that paragraph, but sure, whatever, I’ll take the author’s word for it” is, in my introspective experience, absolutely identical to “I didn’t really parse that paragraph because it was bot-generated and didn’t make any sense so I couldn’t possibly have parsed it”, except that in the first case, I assume that the error lies with me rather than the text. This is not a safe assumption in a post-GPT2 world. Instead of “default to humility” (assume that when you don’t understand a passage, the passage is true and you’re just missing something) the ideal mental action in a world full of bots is “default to null” (if you don’t understand a passage, assume you’re in the same epistemic state as if you’d never read it at all.)
https://www.greaterwrong.com/posts/4AHXDwcGab5PhKhHT/humans-...
Can you cite a specific example where this happened for you? I'm interested in how you think you went from "broad idea" to building actual knowledge.
To make his point, you need specific examples from specific LLMs.
The term I’ve been using of late is “authority simulator.” My formative experience of “authority figures” was of people who could speak with breadth and depth about a subject and who seemed to have internalized it, because they could answer quickly and thoroughly. Because LLMs do this so well, it’s really easy to feel like you’re talking to an authority on a subject. And even though my brain intellectually knows this isn’t true, emotionally, the simulation of authority is comforting.
Quantity has a quality of its own. The first chess engine to beat Garry Kasparov wasn't fundamentally different from earlier ones--it just had a lot more compute power.
The original Google algorithm was trivial: rank web pages by incoming links--its superhuman power at giving us answers ("I'm feeling lucky") was/is entirely due to a massive trove of data.
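A toy sketch of that idea (a drastic simplification of PageRank, over a made-up link graph) fits in a few lines:

    # Rank pages purely by how many other pages link to them (made-up graph).
    links = {
        "a.example": ["b.example", "c.example"],
        "b.example": ["c.example"],
        "c.example": ["a.example"],
        "d.example": ["c.example", "a.example"],
    }

    inbound = {}
    for targets in links.values():
        for dst in targets:
            inbound[dst] = inbound.get(dst, 0) + 1

    # "I'm feeling lucky" would just return the top entry.
    for page, count in sorted(inbound.items(), key=lambda kv: -kv[1]):
        print(page, count)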
And remember all the articles about how unreliable Wikipedia was? How can you trust something when anyone can edit a page? But again, the power of quantity--thousands or millions of eyeballs identifying errors--swamped any simple attacks.
Yes, LLMs are literally just matmul. How can anything useful, much less intelligent, emerge from multiplying numbers really fast? But then again, how can anything intelligent emerge from a wet mass of brain cells? After all, we're just meat. How can meat think?
Some of us used to think that meat spontaneously generated flies. Maybe someday we'll (re-)learn that meat doesn't spontaneously generate thought either?
But I always resist the urge, because I think: aren't there always going to be some people like that, with or without this LLM thing?
If there is anything to hate about this technology, for the more and more bullshit we see and hear in daily life, it is: (1) Its reach: more people of all ages, of different backgrounds, expertise, and intents are using it, and some are heavily misusing it. (2) Its (ever-increasing) capability: yes, it has already become pretty easy for ChatGPT or any other LLM to produce a sophisticated but wrong answer on a difficult topic. And I think the trend is that with later, more advanced versions, it will become harder and take more effort to spot a hidden failure lurking in an ever more information-dense answer.
Despite all that, LLMs are useful. I could write the code faster without an LLM, but then I'd have code that wasn't carefully reviewed line by line, because my coworkers trust me (the fools). It'd have far fewer tests, because nobody forced me to prove everything. It'd have worse naming, because every once in a while the LLM does that better than me. It'd be missing a few edge cases the LLM thought of that I didn't. And it'd have forest/trees problems, because if I were writing the code I'd be focused on the code instead of the big picture.
This is a good line, and I think it tempers the "not just misinformed, but misinformed with conviction" observation quite a bit, because sometimes moving forward with an idea at less than 100% accuracy will still bring the best outcome.
Obviously that's a less than ideal thing to say, but imo (and in my experience as the former gifted student who struggles to ship) intelligent people tend to underestimate the importance of doing stuff with confidence.
Seeing others get burned by that pattern over and over can encourage hesitation and humility, and discourage confident action. It’s essentially an academic attitude and can be very unfortunate and self-defeating.