EDIT: literally saw it just now after refreshing. I guess they didn't roll it out immediately to everyone.
e: if you mean university, fair. that'll be an interesting transition. I guess then you pay for the sports team and amenities?
In the US at least, most kids are in public schools and the collective community foots the bill for the “daycare”, as you put it.
I ultimately dropped the course and took it over the summer at a community college, where we had the standard 20-30 practice-problem homework: you apply what you learned in class and grind problems to bake it into core memory.
AI would have helped me at least get through the uni course. But generally I think it's a problem with the school/class itself if you aren't learning most of what you need in class.
These groups were some of the most valuable parts of the university experience for me. We'd get take-out, invade some conference room, and slam our heads against these questions well into the night. By the end of it, sure... our answers looked superficially similar, but that was because we had built a mutual, deep understanding of the answer, not because anyone had copied it.
Even if you had only a rough understanding, the act of trying to teach it again to others in the group made you both understand it better.
And we literally couldn't figure it out. Or the group you were in didn't have a physics rockstar. Or you weren't very social, or didn't know anyone, or you just missed the chance to find out where a group was forming. It's not like the groups were created by the class. I'd find myself in a group of a few people and we just couldn't solve it, even though we knew the lecture material.
It was a negative value class that cost 10x the price of the community college course yet required you to teach yourself after a lecture that didn't help you do the homework. A total rip-off.
Anyway, AI produces value here compared to giving up and taking a zero on the homework.
Does it offer meaningful benefits to students over self directed study?
Does it outperform students who are "learning how to learn"?
What effect does allowing students to make mistakes have compared to being guided through what to review?
I would hope Study Mode would produce flash card prompts and distill information into discrete chunks for use in spaced repetition tools like Mochi [1] or Anki.
See Andy's talk here [2]
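To make that concrete, here's a minimal sketch (my own illustration, not anything Study Mode actually does) of the kind of output I'd want: hypothetical question/answer pairs written to a tab-separated text file, which Anki can import as notes.

    import csv

    # Hypothetical Q/A pairs a study session might distill; Anki imports
    # plain-text files with one note per line and tab-separated fields.
    cards = [
        ("What is spaced repetition?",
         "Reviewing material at increasing intervals to counteract forgetting."),
        ("What does Study Mode claim to change?",
         "The tutor guides you toward an answer instead of handing it over."),
    ]

    with open("cards.txt", "w", newline="", encoding="utf-8") as f:
        csv.writer(f, delimiter="\t").writerows(cards)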
They want a student to use it and say “I wouldn’t have learned anything without study mode”.
This also lets them fill their data coffers further with bleeding-edge education data. "Please input the data you are studying and we will summarize it for you."
Not to be contrarian, but do you have any evidence for this assertion? Or are you just confidently confabulating a response for something outside of the data you've been exposed to? Because a commenter below provided a study that directly contradicts this.
This isn't study mode, it's a different AI tutor, but:
"The median learning gains for students, relative to the pre-test baseline (M = 2.75, N = 316), in the AI-tutored group were over double those for students in the in-class active learning group."
"The occurrence of inaccurate “hallucinations” by the current [LLMs] poses a significant challenge for their use in education. [...] we enriched our prompts with comprehensive, step-by-step answers, guiding the AI tutor to deliver accurate and high-quality explanations (v) to students. As a result, 83% of students reported that the AI tutor’s explanations were as good as, or better than, those from human instructors in the class."
Not at all dismissing the study, but if you want to replicate these results for yourself, this level of gain over a classroom setting may be tricky to achieve without someone first preparing class materials for the bot to present to you.
Edit: the authors further say
"Krupp et al. (2023) observed limited reflection among students using ChatGPT without guidance, while Forero (2023) reported a decline in student performance when AI interactions lacked structure and did not encourage critical thinking. These previous approaches did not adhere to the same research-based best practices that informed our approach."
Two other studies failed to get positive results at all. YMMV a lot apparently (like, all bets are off and your learning might go in the negative direction if you don't do everything exactly as in this study)
Unfortunately that group is tiny and getting tinier due to dwindling attention spans.
I bring this up because the way I see students "study" with LLMs is similar to this misapplication of tutoring. You try something, feel confused and lost, and immediately turn to the pacifier^H^H^H^H^H^H^H ChatGPT helper to give you direction without ever having to just try things out and experiment. It means students are so much more anxious about exams where they don't have the training wheels. Students have always wanted practice exams with similar problems to the real one with the numbers changed, but it's more than wanting it now. They outright expect it and will write bad evals and/or even complain to your department if you don't do it.
I'm not very optimistic. I am seeing a rapidly rising trend at a very "elite" institution of students being completely incapable of using textbooks to augment learning concepts that were introduced in the classroom. And not just struggling with it, but lashing out at professors who expect them to do reading or self study.
However consider the extent to which LLMs make the learning process more enjoyable. More students will keep pushing because they have someone to ask. Also, having fun & being motivated is such a massive factor when it comes to learning. And, finally, keeping at it at 50% the speed for 100% the material always beats working at 100% the speed for 50% the material. Who cares if you're slower - we're slower & faster without LLMs too! Those that persevere aren't the fastest; they're the ones with the most grit & discipline, and LLMs make that more accessible.
(Qualifications: I was a reviewer on the METR study.)
Like yeah, if you’ve only ever used an axe you probably don’t know the first thing about how to use a chainsaw, but if you know how to use a chainsaw you’re wiping the floor with the axe wielders. Wholeheartedly agree with the rest of your comment; even if you’re slow you lap everyone sitting on the couch.
https://metr.org/blog/2025-07-10-early-2025-ai-experienced-o...
I believe the benefits and drawbacks of AI augmentation for humans performing various tasks will vary wildly based on the task, the way the AI is being asked to interact, and the AI model.
It concludes there's a learning curve that generally takes about 50 hours to get over. The data shows that the one engineer who had more than 50 hours of experience with Cursor actually worked faster.
This is largely my experience, now. I was much slower initially, but I've now figured out the correct way to prompt, guide, and fix the LLM to be effective. I produce way more code and am mentally less fatigued at the end of each day.
I would not use it if it was for something with a strictly correct answer.
If you're the other 90% of students that are only learning to check the boxes and get through the courses to get the qualification at the end... are you going to bother using this?
Of course, maybe this is "see, we're not trying to kill education... promise!"
Just like it's easier to be productive if you have a separate home office and couch, because of the differing psychological contexts, it's easier if you have a separate context for "just give me answers" and "actually teach me the thing".
Also, I don't know about you, but (as a professional) even though I actively try to learn the principles behind the generated code, I don't always want to spend the effort prompting the model away from the "just give me results with a simple explanation" personality I've cultivated. It'd be nice to have a mode with that work done for me.
There is no way to learn without effort. I understand they are not claiming this, but many students want a silver bullet. There isn't one.
Same problem exists for all educational apps. Duolingo users have the goal of learning a language, but they also only want to use Duolingo for a few minutes a day, and they want to feel like they're making progress. Duolingo's goal is to keep you using Duolingo; if possible it'd be good for you to learn the language, but their #1 goal is to keep you coming back. Oddly, Duolingo might not even be wrong to focus primarily on keeping you moving forward, given how many people give up when learning a new language.
So, unless you have experience with this product that contradicts their claims, it's a good tutor by your definition.
The criticism of CliffsNotes is generally that it's a superficial glance. It can't go deeper; it's basically a summary.
The LLM is not that. It can zoom in and out of a topic.
I think it's a poor criticism.
I don't think it's a silver bullet for learning, but it's a unified, consistent interface across topics and courses.
If LLMs got better at just responding with "I don't know", I'd have less of an issue.
Some topics you learn to beware and double check. Or ask it to cite sources. (For me, that's car repair. It's wrong a lot.)
I wish it had some kind of confidence level assessment or ability to realize it doesn't know, and I think it eventually will have that. Most humans I know are also very bad at that.
Sure, but only as long as you're not terribly concerned with the result being accurate, like that old reconstruction of Obama's face from a pixelated version [1] but this time about a topic for which one is, by definition, not capable of identifying whether the answer is correct.
[1] https://www.theverge.com/21298762/face-depixelizer-ai-machin...
It's unlikely to make up the same bullshit twice.
Usually exploring a topic in depth finds these issues pretty quickly.
Unavoidably, people who don't want to work won't push the "work harder" button.
Yes, if my teacher could split into a million of themselves and compete against me on the job market at $200/mo.
I made a deep research assistant for families. Children can ask it to explain difficult concepts, and parents can ask how to deal with any parenting situation. For example, a 4 year old may ask "why does the plate break when it falls?"
example output: https://www.studyturtle.com/ask/PJ24GoWQ-pizza-sibling-fight...
Then again, human 1:1 tutoring is the most effective way to learn, isn't it? In the end it'll probably end up being a balance of reading through texts yourself and still researching broadly so you get an idea about the context around whatever it is you're trying to do, and having a tutor available to walk you through if you don't get it?
I ask because every serious study on using modern generative AI tools tends to conclude fairly immediate and measurable deleterious effects on cognitive ability.
Now, everyone basically has a personal TA, ready to go at all hours of the day.
I get the commentary that it makes learning too easy or shallow, but I doubt anyone would think that college students would learn better if we got rid of TAs.
Closed: RTFM, dumbass
<No activity for 8 years, until some random person shows up and asks "Hey did you figure it out?">
I really do write that stuff for myself, turns out.
J. Random Hacker: Why are you doing it like that?
Newb: I have <xyz> constraint in my case that necessitates this.
J. Random Hacker: This is a stupid way to do it. I'm not going to help you.
I find it odd that someone who has been to college would see this as a _bad_ way to learn something.
I'm not sold on LLMs being a replacement, but post-secondary was certainly enriched by having other people to ask questions to, people to bounce ideas off of, people that can say "that was done 15 years ago, check out X", etc.
There were times where I thought I had a great idea, but it was based on an incorrect conclusion that I had come to. It was helpful for that to be pointed out to me. I could have spent many months "paving forward", to no benefit, but instead someone saved me from banging my head on a wall.
Sure, you could pave forward, but realistically, you'll get much farther with either a good textbook or a good teacher, or both.
Learning a new programming language used to be mediated with lots of useful trips to Google to understand how some particular bit worked, but Google stopped being useful for that years ago. Even if the content you're looking for exists, it's buried.
We were able to learn before LLMs.
Libraries are not a new thing. FidoNet, USENET, IRC, forums, local study/user groups. You have access to all of Wikipedia. Offline, if you want.
I think it's accurate to say that if I had to do that again, I'm basically screwed.
Asking the LLM is a vastly superior experience.
I had to learn what my local library had, not what I wanted. And it was an incredible slog.
IRC groups is another example--I've been there. One or two topics have great IRC channels. The rest have idle bots and hostile gatekeepers.
The LLM makes a happy path to most topics, not just a couple.
Not to be overly argumentative, but I disagree. If you're looking for a deep and ongoing process, LLMs fall down, because they can't remember anything and can't build on themselves in that way. You end up having to repeat a lot of stuff. They also don't course-correct well (that is, if you're going down the wrong path, they don't alert you, in my experience).
It also can give you really bad content depending on what you're trying to learn.
I think for things that present themselves as a form of highly structured data, like programming languages, there's good attunement there. But once you start trying to dig into advanced finance, political topics, economics, or complex medical conditions, the quality falls off fast, if it's there at all.
It was way nicer than a book.
That's the experience I'm speaking from. It wasn't perfect, and it was wrong sometimes, sure. A known limitation.
But it was flexible, and it was able to do things like relate ideas with programming languages I already knew. Adapt to my level of understanding. Skip stuff I didn't need.
Incorrect moments or not, the result was I learned something quickly and easily. That isn't what happened in the 90s.
But that's the entire problem, and I don't understand why it's just put aside like that. LLMs are wrong sometimes, and they often just don't give you the details, and in my opinion, knowing about certain details and traps of a language is very, very important if you plan on doing more with it than just having fun. Now someone will come around the corner and say "but but but it gives you the details if you explicitly ask for them". Yes, of course, but you just don't know where the important details are hidden if you are just learning. Studying is hard and it takes perseverance. Most textbooks will tell you the same things, but they all still differ, and every author usually has a few distinct details they highlight; those are the important bits that you just won't get from an LLM.
Nobody can write an exhaustive tome and explore every feature, use, problem, and pitfall of Python, for example. Every text on the topic will omit something.
It's hardly a criticism. I don't want exhaustive.
The LLM taught me what I asked it to teach me. That's what I hope it will do, not try to caution me about everything I could do wrong with a language. That list might be infinite.
How can you know this when you are learning something? It seems like a confirmation bias to even have this opinion?
It's entirely possible they learned nothing and they're missing huge parts.
But we're sort of at the point where in order to ignore their self-reported experience, we're asking philosophical questions that amount to "how can you know you know if you don't know what you don't know and definitely don't know everything?"
More existentialism than interlocution.
If we decide our interlocutor can't be relied upon, what is discussion?
Would we have the same question if they said they did it from a book?
If they did do it from a book, how would we know if the book they read was missing something that we thought was crucial?
I was attempting to imply that with high-quality literature, it is often reviewed by humans who have some sort of knowledge about a particular topic or are willing to cross reference it with existing literature. The reader often does this as well.
For low-effort literature, this is often not the case, and can lead to things like https://en.wikipedia.org/wiki/Gell-Mann_amnesia_effect where a trained observer can point out that something is wrong, but an untrained observer cannot perceive what is incorrect.
IMO, this is adjacent to what human agents interacting with language models experience often. It isn't wrong about everything, but the nuance is enough to introduce some poor underlying thought patterns while learning.
Perhaps the most famous example of this is Warren Buffett. For years Buffett missed out on returns from the tech industry [1] because he avoided investing in tech company stocks, due to Berkshire's long-standing philosophy of never investing in companies whose business model he doesn't understand.
His light-bulb moment came when he used his understanding of a business he knew really well, i.e. their furniture business [3], to value Apple as a consumer company rather than as a tech company, leading to a $1bn position in Apple in 2016 [2].
[0] https://en.wikipedia.org/wiki/Transfer_of_learning
[1] https://news.ycombinator.com/item?id=33612228
[2] https://www.theguardian.com/technology/2016/may/16/warren-bu...
[3] https://www.cnbc.com/2017/05/08/billionaire-investor-warren-...
That's totally different from saying they are not flawless but they make learning easier than other methods, like you did in this comment.
It also doesn't seem to do a good job of building on "memory" over time. There appears to be some unspoken limit there, or something to that effect.
Figuring out 'make' errors when I was bad at C on microcontrollers a decade ago? (Still am.) Careful pondering of possible meanings of words... trial-and-error tweaks of code and recompiling in hopes that I was just off by a tiny thing, and then, 2 hours and 30 attempts later, realizing I'd done a bad job of tracking what I'd tried and hadn't? Well, it made me better at carefully triaging issues. But it wasn't something I was enthusiastic to pick back up the next weekend, or for the next idea I had.
Revisiting that combination of hardware/code a decade later and having it go much faster with ChatGPT... that was fun.
Like, I agree with you and I believe those things will resist and will always be important, but it doesn't really compare in this case.
Last week I was out in nature and I saw a cute bird that I didn't know. I asked an AI and got the correct answer in 10 seconds. Of course I would have found the answer at the library or by looking at proper niche sites, but I would not have done it because I simply didn't care that much. It's a stupid example, but I hope it makes the point.
We were able to learn before the invention of writing, too!
I haven't tested them on many things. But in the past 3 weeks I tried to vibe code a little bit of VHDL. On the one hand it was a fun journey: I could experiment a lot and just iterate fast. But if I were someone with no idea about hardware design, this trash would've guided me the wrong way in numerous situations. I can't even count how many times it built me latches instead of clocked registers (latches are bad, if you don't know about it), and that's just one thing. Yes, I know there ain't much out there about HDLs (compared to Python and JavaScript), even less regarding VHDL. But damn, no no no. Not for learning. Never. If you know what you're doing and have some fundamental knowledge of the topic, it might help you get further, but not for the absolute essentials; that will backfire hard.
Pre-LLM, even finding the ~5 textbooks with ~3 chapters each that decently covered the material I want was itself a nontrivial problem. Now that problem is greatly eased.
They can also recommend plenty of unknown books, since language models are known to reference resources that do not exist.
This simply hasn't been my experience.
It's too shallow. The deeper I go, the less useful it seems. This happens quickly for me.
Also, god forbid you're researching a complex and possibly controversial subject and you want it to find reputable sources or particularly academic ones.
This generation of AI doesn't yet have the knowledge depth of a seasoned university professor. It's the kind of teacher that you should, eventually, surpass.
1) The broad overview of a topic
2) When I have a vague idea, it helps me narrow down the correct terminology for it
3) Providing examples of a particular category ("are there any examples of where v1 in the visual cortex develops in a disordered way?")
4) "Tell me the canonical textbooks in field X"
5) Posing math exercises
6) Free form branching--while talking about one topic, I want to shift to another that is distinct but related.
I agree they leave a lot to be desired when digging very deeply into a topic. And my biggest pet peeve is when they hallucinate fake references ("tell me papers that investigate this topic" will, for any sufficiently obscure topic, result in a bunch of very promising paper titles that are wholly invented).
Luc Julia (one of Siri's main creators) describes a very similar exercise in this interview [0] (it's in French, although the auto-translation isn't too bad).
The gist of it is that he describes an exercise he does with his students, where they ask ChatGPT for Victor Hugo's biography and then proceed to spot the errors ChatGPT made.
The setup is simple, but there are very interesting mechanisms at play. The students get to learn about challenging facts, fact checking, cross referencing, etc., while it also reasserts the teacher as the reference figure, with the knowledge to take down ChatGPT.
Well done :)
Edit: adding link
[0] https://youtube.com/shorts/SlyUvvbzRPc?si=2Fv-KIgls-uxr_3z
This. This should be done everywhere. It is the best way to let students see first hand that LLM output is useful, but can be (and often is) wrong.
If people really understand that, everything will be better.
so the opposite of Stack Overflow really, where if you have a vague idea your question gets deleted and you get reprimanded.
Maybe Stack Overflow could use AI for this, help you formulate a question in the way they want.
You say this in a thread specifically talking about how LLMs fall apart when digging beneath the surface of questions.
Do people really want to learn and understand, or just feel like they are learning and understanding?
It's too bad people are trying to substitute the latter with the chatGPT output itself. And I absolutely cannot trust any machine that is willing to lie to me rather than admit ignorance on a subject.
History is a great example. If you ask an LLM about a vaguely difficult period in history, it will just give you one side and act like the other doesn't exist, or, if there is another side, it will paint them in a very negative light that is often poorly substantiated. People don't just wake up one day and decide to be irrationally evil for no reason; if you believe that, you are a fool... although LLMs would agree with you more often than not, since it's convenient.
The result of these things is a form of gatekeeping. Give it a few years and basic knowledge will be almost impossible to find if it is deemed "not useful", whether that's an outdated technology the LLM no longer sees discussed much or an ideological issue that doesn't fall in line with TOS or common consensus.
- Bombing of Dresden: death stats, as well as how long the bombing went on (Arthur Harris is considered a war criminal to this day for that; LLMs highlight easily falsifiable claims by Nazis to justify low estimates without providing much in the way of verifiable claims, outside of a select few questionable sources. If the low estimate is to be believed, it seems absurd that Harris would be considered a war criminal in light of what crimes we allow in warfare today.)
- Ask it about the Crusades: it often forgets the sacking of St. Peter's in Rome around 846 AD, usually painting the Papacy as needlessly hateful and violent during that specific Crusade. It was horrible, bloody, and immensely destructive (I don't defend the Crusades), but this paints the Islamic forces as victims, which they were eventually, but not at the beginning; at the beginning they were the aggressors, bent on invading Rome.
- Ask it about the Six-Day War (1967) and contrast that with several different sources on both sides and you'll see a different portrayal even by those who supported the actions taken.
These are just the four that come to my memory at this time.
Most LLMs seem cagey about these topics; I believe this is due to an accepted notion that anything that could "justify" hatred or dislike of a people group or class that is in favor -- according to modern politics -- will be classified as hateful rhetoric, which is then omitted from the record. The issue lies in the fact that to understand history, we need to understand what happened, not how it is perceived, politically, after the fact. History helps inform us about the issues of today, and it is important, above all other agendas, to represent the truth of history, keeping an accurate account (or simply allowing others to read differing accounts without heavy bias).
LLMs are restricted in this way quite egregiously; "those who do not study history are doomed to repeat it", but if this continues, no one will have the ability to know history and will therefore be forced to repeat it.
If for any of these topics you do manage to get a summary you'd agree with from a (future or better-prompted?) LLM I'd like to read it. Particularly the first and third, the second is somewhat familiar and the fourth was a bit vague.
I don't know a lot about the other things you mentioned, but the concept of crusading did not exist (in Christianity) in 846 AD. It's not any conflict between Muslims and Christians.
This further led the Papacy to pursue such efforts in the following years, since they were in Rome and made strong efforts to maintain Catholicism within those boundaries. Crusading didn't appear out of nothing; it required a catalyst, and events like the one I listed are the usual suspects.
If the US were to start invading Axis countries with WW2 being the justification we'd of course be the aggressors, and that was less than 100 years ago.
Similarly, it helps us understand all the examples of today of resentments and grudges over events that happened over a century ago that still motivate people politically.
Its background is in the Islamic-Christian conflicts of Spain. Crusading was adopted from the Muslim idea of jihad, as were things like naming customs (the Spanish are the only Christians who name their children "Jesus", after the Muslim "Muhammad").
The political tensions that led to the first crusade were between Arab Muslims and Byzantine Christians. Specifically, the Battle of Manzikert made Christian Europe seem more vulnerable than it was.
The Papacy wasn’t at the forefront of the struggle against Islam. It was more worried about the Normans, Germans, and Greeks.
When the papacy was interested in Crusading it was for domestic reasons: getting rid of king so-and-so by making him go on crusade.
The situation was different in Spain where Islam was a constant threat, but the Papacy regarded Spain as an exotic foreign land (although Sylvester II was educated there).
It’s extremely misleading to view the pope as the leader of an anti-Muslim coalition. There really was no leader per se, but the reasons why kings went on crusade had little to do with fighting Islam.
Just look at how many monarchs showed up in Jerusalem, then headed straight home and spent the rest of their lives bragging about having been crusaders.
I’m 80% certain no pope ever set foot in Outremer.
Rhodesia is a hard one; the more I learn about it, the more I feel terrible for both sides. I also do not support terrorism against a nation, even if I believe that nation might not be in the right. However, I stand by my disdain for how the British responded: their withdrawal effectively doomed Rhodesia and made a peaceful resolution essentially impossible.
It's a very controversial opinion, and stating it as a just-so fact needs challenging.
In 1992 a statue of Harris was erected in London; it was under 24-hour surveillance for several months due to protests and vandalism attempts. I'm only mentioning this to highlight that there was quite a bit of pushback specifically calling the gov out on a tribute to him, which usually doesn't happen if the person was well liked... not as an attempted killshot.
Even the RAF themselves state that there were quite a few critics, on the first page of their assessment of Arthur Harris: https://www.raf.mod.uk/what-we-do/centre-for-air-and-space-p...
Which is a funny and odd thing to say if you are widely loved/unquestioned by your people. Again, it's just another occurrence of language from those on his side reinforcing the idea that this is, as you say, "very controversial", and maybe not a "vast majority", since those two things seem at odds with each other.
Not to mention that Harris targeted civilians, which is generally considered behavior of a war-criminal.
As an aside this talk page is a good laugh. https://en.wikipedia.org/wiki/Talk:Arthur_Harris/Archive_1
Although you are correct that I should have used more accurate language: instead of saying "considered" I should have said "considered by some".
The problem is, those that do study history are also doomed to watch it repeat.
Why?
(On the other hand, it's very hard to get them to do it for topics that are currently politically charged. Less so for things that aren't in living memory: I've had success getting it to offer the Carthaginian perspective in the Punic Wars.)
It's weird to see which topics it "thinks" are politically charged vs. others. I've noticed some inconsistency depending on even what years you input into your questions. One year off? It will sometimes give you a more unbiased answer as a result about the year you were actually thinking of.
As for the politically charged topics, I more or less self-censor on those topics (which seem pretty easy to anticipate--none of those you listed in your other comment surprise me at all) and don't bother to ask the LLM. Partially out of self-protection (don't want to be flagged as some kind of bad actor), partially because I know the amount of effort put in isn't going to give a strong result.
That's a good thing to be aware of, using our own bias to make it more "likely" to play pretend. LLMs tend to be more on the agreeable side; given the unreliable narrators we people tend to be, and the fact that these models are trained on us, it does track that the machine would tend towards preference over fact, especially when the fact could be outside of the LLMs own "Overton Window".
I've started to care less and less about self-censoring as I deem it to be a kind of "use it or lose it" privilege. If you normalize talking about censored/"dangerous" topics in a rational way, more people will be likely to see it not as much of a problem. The other eventuality is that no one hears anything that opposes their view in a rational way but rather only hears from the extremists or those who just want to stick it to the current "bad" in their minds at that moment. Even then though I still will omit certain statements on some topics given the platform, but that's more so that I don't get mislabeled by readers. (one of the items on my other comment was intentionally left as vague as possible for this reason) As for the LLMs, I usually just leave spicy questions for LLMs I can access through an API of someone else (an aggregator) and not a personal acc just to make it a little more difficult to label my activity falsely as a bad actor.
That's honestly one of the funniest things I have read on this site.
> I've had success getting it to offer the Carthaginian perspective in the Punic Wars.
This is not surprising to me. Historians have long studied Carthage, and there are books you can get on the Punic Wars that talk about the state of Carthage leading up to and during the wars (shout out to Richard Miles's "Carthage Must Be Destroyed: The Rise and Fall of an Ancient Civilization"). I would expect an LLM to piggyback off of that existing literature.
The most compelling reason at the time to reject heliocentrism was the (lack of) parallax of stars. The only response the heliocentrists had was that the stars must be implausibly far away: millions of times further from us than the moon, which they already knew was itself pretty far away. That is a radical, even insane, idea. There's also the point that the original Copernican heliocentric model had ad hoc epicycles just as the Ptolemaic one did, without any real increase in accuracy.
Strictly speaking, the breakdown here would be less a lack of understanding of contemporary physics, and more about whether I knew enough about the minutia of historical astronomers' disputes to know if the LLM was accurately representing them.
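For what it's worth, a back-of-the-envelope version of that parallax bound (assuming naked-eye precision of roughly one arcminute, which is my assumption, not a figure from the historical debate):

    p \approx \frac{1\,\text{AU}}{d} < 1' \approx 2.9\times10^{-4}\ \text{rad}
    \quad\Rightarrow\quad
    d \gtrsim 3400\ \text{AU} \approx 1.3\times10^{6}\ \text{lunar distances}

So the absence of any detectable parallax already pushes the stars out to more than a million times the moon's distance, which is why the required distances looked implausible.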
People _do_ just wake up and decide to be evil.
However, that is not a justification, since I believe that what is happening today is truly evil. Same with another nation that entered a war knowing it would be crushed, which is suicide; whether that nation is in the right matters little if most of its next generation has died.
There's no short-term incentive to ever be right about it (and it's easy to convince yourself of both short-term and long-term incentives, both self-interested and altruistic, to actively lie about it). Like, given the training corpus, could I do a better job? Not sure.
All of us need to learn the basics of how to read history and historians critically, and to know our limitations, which, as you stated, is probably a tall task.
Gen-pop is actually incentivized to distill and repeat the opinions of technical practitioners. Completing tasks in the short term depends on it! Not true of history! Or climate science, for that matter.
Which is why it's so terribly irresponsible to paint these """AI""" systems as impartial or neutral or anything of the sort, as has been done by hypesters and marketers for the past 3 years.
The problem with this is that people sometimes really do, objectively, wake up and decide to be irrationally evil. It's not every day, and it's not every single person, but it does happen routinely.
If you haven’t experienced this wrath yourself, I envy you. But for millions of people, this is their actual, 100% honest truthful lived reality. You can’t rationalize people out of their hate, because most people have no rational basis for their hate.
(see pretty much all racism, sexism, transphobia, etc)
So in this regard, they probably do, deep down, see it as evil, but will try to reason a way (often hypocritically) to make it appear good. The most common methods of using this to drive bigotry are 1) dehumanizing the subject of hate ("Group X is evil, so they had it coming!") or 2) reinforcing a sense of superiority over the subject of hate ("I worked hard and deserve this. Group X did not, but wants the same thing").
Your answer depends on how effective you think propaganda and authority are at shaping the mind to contradict itself. The Stanford prison experiment seems to reinforce the notion that a "good" person can justify any evil to themselves with surprisingly little nudging.
I'd say that companies like Google and OpenAI are aware of the "reputable" concerns the Internet is expressing and addressing them. This tech is going to be, if not already is, very powerful for education.
Blue team: you throw out concepts and have it steelman them.
Red team: you can literally throw any kind of stress test at your idea.
Alternate like this and you will learn
A great prompt is “give me the top 10 xyz things” and then you can explore
Back in 2006 I used Wikipedia to prepare for job interviews :)
Granted, that's probably well-trodden ground, to which model developers are primed to pay attention, and I'm (a) a relative novice with (b) very strong math skills from another domain (computational physics). So Chuck and I are probably both set up for success.
That's fine. Recognize the limits of LLMs and don't use them in those cases.
Yet that is something you should be doing regardless of the source. There are plenty of non-reputable sources in academic libraries and there are plenty of non-reputable sources from professionals in any given field. That is particularly true when dealing with controversial topics or historical sources.
Ask it for sources. The two things where LLMs excel is by filling the sources on some claim you give it (lots will be made up, but there isn't anything better out there) and by giving you queries you can search for some description you give it.
You must be using a free model like GPT-4o (or the equivalent from another provider)?
I find that o3 is consistently able to go deeper than me in anything I'm a nonexpert in, and usually can keep up with me in those areas where I am an expert.
If that's not the case for you I'd be very curious to see a full conversation transcript (in chatgpt you can share these directly from the UI).
I know it has nothing to do with this. I simply hit a wall eventually.
I unfortunately am not at liberty to share the chats though. They're work related (I very recently ended up at a place where we do thorny research).
A simple one, though, is researching Israel-Palestine relations since 1948. It starts off okay (usually), but eventually it goes off the rails with bad sourcing, fictitious sourcing, and/or hallucinations. Sometimes I actually hit a wall where it repeats itself over and over, and I suspect it's because the information is simply not captured by the model.
FWIW, if these models had live & historic access to Reuters and Bloomberg terminals I think they might be better at a range of tasks I find them inadequate for, maybe.
I have bad news for you. If you shared it with ChatGPT (which you most likely did), then whatever it is that you are trying to keep hidden or private is not actually hidden or private anymore; it is stored on their servers and most likely will be trained on. Use local models instead in such cases.
If it's a subject you are just learning, how can you possibly evaluate this?
Falling apart under pointed questioning, saying obviously false things, etc.
It's not a criticism, the landscape moves fast and it takes time to master and personalize a flow to use an LLM as a research assistant.
Start with something such as NotebookLM.
They simply have limitations, especially on deep pointed subject matters where you want depth not breadth, and honestly I'm not sure why these limitations exist but I'm not working directly on these systems.
Talk to Gemini or ChatGPT about mental health things; that's a good example of what I'm talking about. As recently as two weeks ago my colleagues found that even when heavily tuned, they still managed to become 'pro suicide' when given certain lines of questioning.
These things also apply to humans. A year or so ago I thought I’d finally learn more about the Israeli/Palestinians conflict. Turns out literally every source that was recommended to me by some reputable source was considered completely non-credible by another reputable one.
That said, I’ve found ChatGPT to be quite good at math and programming, and I can go pretty deep at both. I can definitely trip it into mistakes (e.g. it seems to use calculations to “intuit” its way around sometimes, and you can find dev cases where those calls lead it in the wrong direction), but I also know enough to know how to keep it on rails.
I've anecdotally found that real world things like these tend to be nuanced, and that sources (especially on the internet) are disincentivised in various ways from actually showing nuance. This leads to "side-taking" and a lack of "middle-ground" nuanced sources, when the reality lies somewhere in the middle.
Might be linked to the phenomenon where in an environment where people "take sides", those who display moderate opinions are simply ostracized by both sides.
Curious to hear people's thoughts and disagreements on this.
Moreover, the conflict is unfolding. What matters isn't what happened 100 years ago, or even 50 years ago, but what has happened recently and is happening. A neighbor of mine who recently passed was raised in Israel. Born circa 1946 (there's black & white footage of her as a baby aboard, IIRC, the ship Exodus 1947), she has vivid memories as a child of Palestinian Imams calling out from the mosques to "kill the Jews". She was a beautiful, kind soul who, for example, freely taught adult education to immigrants (of all sorts), but who one time admitted to me that she utterly despised Arabs. That's all you need to know, right there, to understand why Israel is doing what it's doing. Not so much what happened in the past to make people feel that way, but that many Israelis actually, viscerally feel this way today, justifiably or not but in any event rooted in memories and experiences seared into their conscience. Suffice it to say, most Palestinians have similar stories and sentiments of their own, one of the expressions of which was seen on October 7th.
And yet at the same time, after the first few months of the Gaza War she was so disgusted that she said she wanted to renounce her Israeli citizenship. (I don't know how sincere she was in saying this; she died not long after.) And, again, that's all you need to know to see how the conflict can be resolved, if at all; not by understanding and reconciling the history, but merely choosing to stop justifying the violence and moving forward. How the collective action problem might be resolved, within Israeli and Palestinian societies and between them... that's a whole 'nother dilemma.
Using AI/ML to study history is interesting in that it even further removes one from actual human experience. Hearing first hand accounts, even if anecdotal, conveys information you can't acquire from a book; reading a book conveys information and perspective you can't get from a shorter work, like a paper or article; and AI/ML summaries elide and obscure yet more substance.
That’s the single most important lesson by the way, that this conflict just has two different, mutually exclusive perspectives, and no objective truth (none that could be recovered FWIW). Either you accept the ambiguity, or you end up siding with one party over the other.
Then as you get more and more familiar you "switch" depending on the sub-issue being discussed, aka nuance
The problem is selective memory of these facts, and biased interpretation of those facts, and stretching the truth to fit pre-determined opinion
If there is no trustworthy record of the objective truth, it doesn’t exist anymore, effectively.
> to be quite good at math and programming
Since LLMs are essentially summarizing relevant content, this makes sense. In "objective" fields like math and CS, the vast majority of content aligns, and LLMs are fantastic at distilling the relevant portions you ask about. When there is no consensus, they can usually tell you that ("this is nuanced topic with many perspectives...", etc), but they can't help you resolve the truth because, from their perspective, the only truth is the content.
FWIW, the /r/AskHistorians booklist is pretty helpful.
https://www.reddit.com/r/AskHistorians/wiki/books/middleeast...
You don’t need to look more than 2 years back to understand why either camp finds the other non-reputable.
The quality varies wildly across models & versions.
With humans, the statements "my tutor was great" and "my tutor was awful" reflect very little on "tutoring" in general, and are barely even responses to each other without more specificity about the quality of the tutor involved.
Same with AI models.
I have no access to Anthropic right now to compare that.
It’s an ongoing problem in my experience
Model Validation groups are one of the targets for LLMs.
It doesn’t cover the other aspects of finance, which are perhaps considered advanced (to a regular person at least) but less quantitative. Try having it reason out a “cigar butt” strategy and see if it returns anything useful about companies that fit the mold from a prepared source.
Granted this isn’t quant finance modeling, but it’s a relatively easy thing as a human to do, and I didn’t find LLMs up to the task
No one builds multi-shot search tools because they eat tokens like nobody's business, but I've deployed them internally at a company to rave reviews, at a cost of $200 per seat per day.
I'll tell you that I recently found it the best resource on the web for teaching me about the 30 Years War. I was reading a collection of primary source documents, and was able to interview ChatGPT about them.
Last week I used it to learn how to create and use Lehmer codes, and its explanation was perfect, and much easier to understand than, for example, Wikipedia.
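For anyone curious, here is a minimal Python sketch of the idea (my own illustration of the standard construction, not the explanation I was given): the Lehmer code records, for each element of a permutation, how many smaller elements sit to its right, which gives a bijection with the factorial number system.

    def lehmer_code(perm):
        # For each position, count later elements that are smaller.
        return [sum(1 for y in perm[i + 1:] if y < x) for i, x in enumerate(perm)]

    def permutation_from_lehmer(code):
        # Rebuild the permutation of 0..n-1 by picking the c-th smallest unused value.
        available = sorted(range(len(code)))
        return [available.pop(c) for c in code]

    perm = [1, 3, 0, 2]
    code = lehmer_code(perm)                     # [1, 2, 0, 0]
    assert permutation_from_lehmer(code) == perm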
I ask it about truck repair stuff all the time, and it is also great at that.
I don't think it's great at literary analysis, but for factual stuff it has only ever blown away my expectations at how useful it is.
If you're really researching something complex/controversial, there may not be any.
How do you know when it's bullshitting you though?
Sometimes right away, something sounds wrong. Sometimes when I try to apply the knowledge and discover a problem. Sometimes never, I believe many incorrect things even today.
Since when was it acceptable to only ever look at a single source?
I think the potential in this regard is limitless.
Maybe for something a lot simpler like Go it's plausible, but even then I doubt it. You're not going to know about any of the common gotchas for example.
To get to a reasonably proficient level in Rust I did the following.
1. Use the book as the reference.
2. Angela Yu's 100 Days of Python has 100 projects to help you learn Python (highly recommended if you want to learn Python). I tried creating those projects from scratch in Rust.
3. I'd use the book as a reference, then ChatGPT to explain in more detail why my code wasn't working, or which approach was best.
(Only thing missing is the model(s) you used).
The psychic reader near me has been in business for a long time. People are very convinced they've helped them. Logically, it had to have been their own efforts though.
This requires a student to be actually interested in what they are learning, though. For others, who blindly trust its output, it can have adverse effects, like the illusion of having understood a concept when they might even have mislearned it.
I had to post the source code to win the dispute, so to speak.
If you are curious it was a question about the behavior of Kafka producer interceptors when an exception is thrown.
But I agree that it is hard to resist the temptation to treat LLMs as a peer.
Ever read mainstream news reporting on something you actually know about? Notice how it's always wrong? I'm sure there's a name for this phenomenon. It sounds like exactly the same thing.
It is hard to verify information that you are unfamiliar with. It would be like learning from a message board. Can you really trust what is being said?
So what if the LLM is wrong about something. Human teachers are wrong about things, you are wrong about things, I am wrong about things. We figure it out when it doesn't work the way we thought and adjust our thinking. We aren't learning how to operate experimental nuclear reactors here, where messing up results in half a country getting irradiated. We are learning things for fun, hobbies, and self-betterment.
You can replace "LLM" here with "human" and it remains true.
Anyone who has gone to post-secondary has had a teacher that relied on outdated information, or filled in gaps with their own theories, etc. Dealing with that is a large portion of what "learning" is.
I'm not convinced about the efficacy of LLMs in teaching/studying. But it's foolish to think that humans don't suffer from the same reliability issue as LLMs, at least to a similar degree.
For example, even if you craft the most detailed cursor rules, hooks, whatever, they will still repeatedly fuck up. They can't even follow a style guide. They can be informed, but not corrected.
Those are coding errors, and the general "hiccups" that these models experience all the time are on another level. The hallucinations, sycophancy, reward hacking, etc can be hilariously inept.
IMO, that should inform you enough to not trust these services (as they exist today) in explaining concepts to you that you have no idea about.
If you are so certain you are okay to trust these things, you should evaluate every assertion it makes for, say, 40 hours of use, and count the error rate. I would say it is above 30%, in my experience of using language models day to day. And that is with applied tasks they are considered "good" at.
If you are okay with learning new topics where even 10% of the instruction is wrong, have fun.
No, not really.
> Unless it was common enough to show up in a well formed question on stack exchange, it was pretty much impossible, and the only thing you can really do is keep paving forward and hope at some point, it'll make sense to you.
Your experience isn't universal. Some students learned how to do research in school.
It’s exciting when I discover I can’t replicate something that is stated authoritatively… which turns out to be controversial. That’s rare, though. I bet ChatGPT knows it’s controversial, too, but that wouldn’t be as much fun.
From the parent comment:
> it was pretty much impossible ... hope at some point, it'll make sense to you
Not sure where you are getting the additional context for what they meant by "screwed", but I am not seeing it.
Sorry, but if you've gone to university, in particular at a time when internet access was already ubiquitous, surely you must have been capable of finding an answer to a programming problem by consulting documentation, manuals, or tutorials, which exist on almost any topic.
I'm not saying the chatbot interface is necessarily bad, it might be more engaging, but it literally does not present you with information you couldn't have found yourself.
If someone has a computer science degree and tells me they can't find solutions to basic problems without Stack Exchange, that's a red flag. That's like the article posted here about the people who couldn't program when their LLM credits ran out.
I also use it to remember some Python stuff. In Rust, it is less good: it makes mistakes.
In those two domains, at that level, it's really good.
It could help students I think.
In the process it helped me learn many details about RA and NDP (Router Advertisements/Neighbor Discovery Protocol, which mostly replace DHCP and ARP from IPv4).
It made me realize that my WiFi mesh routers do quite a lot of things to prevent broadcast loops on the network, and that all my weird issues could be attributed to one cheap mesh repeater. So I replaced it and now everything works like a charm.
I had this setup for 5 years and was never able to figure out what was going on there, although I really tried.
So why not have tech support that teaches you, or a tutor that helps with you with a specific example problem you're having?
Provided you don't just rely on training data and can reduce hallucinations, this is the angle of attack that is likely the killer app some people are already seeing.
Vibe coding is nonsense because it's not teaching you to maintain and extend that application when the LLM runs out of steam. Use it to help you fix your problem in a way that you understand and can learn from? Rocket fuel to my mind. We're maybe not far away...
Regarding LLMs, they can also stimulate thinking if used right.
I tried using YouTube to find walk through guides for how to approach the repair as a complete n00b and only found videos for unrelated problems.
But I described my issues and took photos to GPT O3-Pro and it was able to guide me and tell me what to watch out for.
I completed the repair (very proud of myself) and even though it failed a day later (I guess I didn’t re-seat well enough) I still feel far more confident opening it and trying again than I did at the start.
Cost of broken watch + $200 pro mode << Cost of working watch.
On the other hand, it told me you can't execute programs when evaluating a Makefile, and you trivially can (GNU Make's $(shell ...) function does exactly that). It's very hit and miss. When it misses it's rather frustrating. When it hits it can save you literally hours.
It’s called basic research skills - don’t they teach this anymore in high school, let alone college? How ever did we get by with nothing but an encyclopedia or a library catalog?
I find it so much more intellectually stimulating than most of what I find online. Reading, e.g., a 600-page book about some specific historical event gives me so much more perspective and exposure to different aspects I never would have thought to ask about on my own, or that would have been elided when clipped into a few-sentence summary.
I have gotten some value out of asking for book recommendations from LLMs, mostly as a starting point I can use to prune a list of 10 books down into a 2 or 3 after doing some of my research on each suggestion. But talking to a chatbot to learn about a subject just doesn’t do anything for me for anything deeper than basic Q&A where I simply need a (hopefully) correct answer and nothing more.
If you don't have access to a community like that learning stuff in a technical field can be practically impossible. Having an llm to ask infinite silly/dumb/stupid questions can be super helpful and save you days of being stuck on silly things, even though it's not perfect.
> most of us would have never gotten by with literally just a library catalog and encyclopedia.
I meant the opposite, perhaps I phrased it poorly. Back in the day we would get by and learn new shit by looking for books on the topic and reading them (they have useful indices and tables of contents to zero in on what you need and not have to read the entire book). An encyclopedia was (is? Wikipedia anyone?) a good way to get an overview of a topic and the basics before diving into a more specialized book.
When I got stuck on a concept, I wasn't screwed: I read more; books if necessary. StackExchange wasn't my only source.
LLMs are not like TAs, personal or not, in the same way they're not humans. So it then follows we can actually contemplate not using LLMs in formal teaching environments.
And that's a bad thing. Nothing can replace the work in learning, the moments where you don't understand it and have to think until it hurts and until you understand. Anything that bypasses this (including, for uni students, leaning too heavily on generous TAs) results in a kind of learning theatre, where the student thinks they've developed an understanding, but hasn't.
Experienced learners already have the discipline to use LLMs without asking too much of them, the same way they learned not to look up the answer in the back of the textbook until arriving at their own solution.
And which just makes things up (with the same tone and confidence!) at random and unpredictable times.
Yeah apart from that it's just like a knowledgeable TA.
Given that humanity has been able to go from living in caves to sending spaceships to the moon without LLMs, let me express some doubt about that.
Even without going further, software engineering isn't new and people have been stuck on concepts and have managed to get unstuck without LLMs for decades.
What you gain in instant knowledge with LLMs, you lose in learning how to get unstuck, how to persevere, how to innovate, etc.
There seems to be a gap in problem-solving abilities here... the process of breaking concepts down into easier-to-understand pieces and then recomposing them has been around since forever; it is just easier to find those relationships now. To say it was impossible to learn concepts you are stuck on is a little alarming.
As long as you can tell that you don’t deeply understand something that you just read, they are incredible TAs.
The trick is going to be to impart this metacognitive skill on the average student. I am hopeful we will figure it out in the top 50 universities.
I think this is the same thing with vibe coding, AI art, etc. - if you want something good, it's not the right tool for the job. If your alternative is "nothing," and "literally anything at all" will do, man, they're game changers.
* Please don't overindex on "shitty" - "If you don't need something verifiably high-quality"
[0] https://time.com/7295195/ai-chatgpt-google-learning-school/
The internet, and esp. Stack Exchange, is a horrible place to learn concepts. For basic operational stuff, sure, that works, but one should mostly be picking up concepts from books and other long-form content. When you get stuck it's time to do three things:
Incorporate a new source that covers the same material in a different way, or at least from a different author.
Sit down with the concept and write about it and actively try to reformulate it and everything you do/don't understand in your own words.
Take a pause and come back later.
Usually one of these three strategies does the trick, no llm required. Obviously these approaches require time that using an LLM wouldn't. I have a suspicion doing it this way will also make it stick in long term memory better, but that's just a hunch.
i don't get it.
> The part Margie hated most was the slot where she had to put homework and test papers. She always had to write them out in a punch code they made her learn when she was six years old, and the mechanical teacher calculated the mark in no time.
It's my primary fear building anything on these models, they can just come eat your lunch once it looks yummy enough. Tread carefully
True, and worse, they're hungry because it's increasingly seeming like "hosting LLMs and charging by the token" is not terribly profitable.
I don't really see a path for the major players that isn't "Sherlock everything that achieves traction".
As long as features like Study Mode are little more than creative prompting, any provider will eventually be able to offer them and offer token-based charging.
- From what I can see many products are rapidly getting past "just prompt engineering the base API". So even though a lot of these things were/are primitive, I don't think it's necessarily a good bet that they will remain so. Though agree in principle - thin API wrappers will be out-competed both by cheaper thin wrappers, or products that are more sophisticated/better than thin wrappers.
- This is, oddly enough, a scenario that is way easier to navigate than the rest of the LLM industry. We know consumer apps, we know consumer apps that do relatively basic (or at least, well understood) things. Success/failure then is way less about technical prowess and more about classical factors like distribution, marketing, integrations, etc.
A good example here is the lasting success of paid email providers. Multiple vendors (MSFT, GOOG, etc.) make huge amounts of money hosting people's email, despite it being a mature product that, at the basic level, is pretty solved, and where the core product can be replicated fairly easily.
The presence of open source/commodity commercial offerings hasn't really driven the price of the service to the floor, though the commodity offerings do provide some pricing pressure.
Most people I've seen offering self-hosted email for groups (student groups etc.) ended up with a mess. Compare all that to, say, ollama, which makes self-hosting LLMs trivial, and LLMs are stateless.
So I’m not sure email is a good example of commodity not bringing price to the floor.
> In the computing verb sense, refers to the software Sherlock, which in 2002 came to replicate some of the features of an earlier complementary program called Watson.[1]
During the early days of tech, was there prevailing wisdom that software companies would never be able to compete with hardware companies because the hardware companies would always be able to copy them and ship the software with the hardware?
Because I think it's basically the analogous situation. People assume that the foundation model providers have some massive advantage over the people building on top of them, but I don't really see any evidence for this.
If you want to try and make a quick buck, fine, be quick and go for whatever. If you plan on building a long term business, don't do the most obvious, low effort low hanging fruit stuff.
These days they’ve pivoted to a more enterprise product and are still chugging along.
A proper learning tool will have history of conversation with the student, understand their knowledge level, have handcrafted curricula (to match whatever the student is supposed to learn), and be less susceptible to hallucination.
OpenAI have a bunch of other things to worry about and won't just pivot to this space.
A more thought through product version of that is only a good thing imo.
- study mode (this announcement)
- office suite (https://finance.yahoo.com/news/openai-designs-rival-office-w...)
- sub-agents (https://docs.anthropic.com/en/docs/claude-code/sub-agents)
When they announce VR glasses or a watch, we'll know we've gone full circle and the hype is up.
It's a great tutor for things it knows, but it really needs to learn its own limits
Things well-represented in its training datasets. Basically React todo list, bootstrap form, tic-tac-toe in vue
Incorrect. You should use 的 in this case because reasons. Correct version:
<Proceeds to show a sentence without 的>
When I ask ChatGPT* questions about things I don’t know much about it sounds like a genius.
When I ask it about things I’m an expert in, at best it sounds like a tech journalist describing how a computer works. At worst it is just flat out wrong.
* yes I’ve tried the latest models and I use them frequently at work
* for each statement, give you the option to rate how well you understood it. Offer clarification on things you didn't understand
* present knowledge as a tree that you can expand to get deeper
* show interactive graphs (very useful for mathy things when can you easily adjust some of the parameters)
* add quizzes to check your understanding
... though I could well imagine this being out of scope for ChatGPT, and thus an opportunity for other apps / startups.
I'm very interested in this. I've considered building this, but if this already exists, someone let me know please!
Have you considered using the LLM to give tests/quizzes (perhaps just conversationally) in order to measure progress and uncover weak spots?
I've also been playing around with adapting content based on their results (e.g. proactively nudging complexity up/down) but haven't gotten it to a good place yet.
Only feedback I have so far is that it would be nice to control the playback speed of the 'read aloud' mode. I'd like it to be a little bit faster.
I've been working on it on-and-off for about a year now. Roughly 2-3 months if I worked on it full-time I'm guessing.
re: playback speed -> noted, will add some controls tomorrow
It's still a work in progress but we are trying to make it better everyday
The other chunk of time, to me anyway, seems to be creating a mental model of the subject matter, and when you study something well you have a strong grasp on the forces influencing cause and effect within that matter. It's this part of the process that I would use AI the least, if I am to learn it for myself. Otherwise my mental model will consist of a bunch of "includes" from the AI model and will only be resolvable with access to AI. Personally, I want a coherent "offline" model to be stored in my brain before I consider myself studied up in the area.
This is a good thing on many levels.
Learning how to search is (was) a good skill to have. The process of searching itself also often leads to learning tangentially related but important things.
I'm sorry for the next generations that won't have (much of) these skills.
I don't think it's so valuable now that you're searching through piles of spam and junk just to try to find anything relevant. That's a uniquely modern-web thing created by Google in their focus on profit over users.
Unless Google takes over libraries/books next and sells spots to advertisers on the shelves and in the books.
In the same way that I never learnt the Dewey decimal system because digital search had driven it obsolete. It may be that we just won't need to do as much sifting through spam in the future, but being able to finesse Gemini into burping out the right links becomes increasingly important.
my 20 years of figuring out how to find niche porn has paid off in spades, thank you very much. I click recklessly in that domain and I end up with viruses. Very high stakes research.
I think properly searching is more important than ever in such a day and age of enshittification. You need to quickly recognize what is adspam or blogspam and distill out useful/valuable information. You need to understand how to preview links before you click on them. What tools to filter out dangerous websites. What methods and keywords to trust or be wary of.
And all that is before the actual critical thinking of "is this information accurate/trustworthy?".
Of course, I'm assuming this is a future where you aren't stuck in the search spaces of 20 website hubs who pull from the same 5 AI databases to spit out dubious answers at you. I'd rather not outsource my thinking (and media consumption) in such a way.
Most people don’t know how to do this.
I believed competitors would rush to copy all great things that ChatGPT offers as a product, but surprisingly that hasn’t been the case so far. I wonder why they seemingly don’t care about that.
Helping you parse notation, especially in new domains, is insanely valuable. I do a lot of applied math in statistics/ML, but when I open a physics book the notation and comfort with shorthand is a real challenge (likewise I imagine the reverse is equally annoying). Having an LLM on demand to instantly clear up notation is a massive speed boost.
Reading German Idealist philosophy requires an enormous amount of context. Being able to ask an LLM questions like "How much of this section of Mainländer is coming directly from Schopenhauer?" is a godsend in helping understand which parts of the writing are merely setting up what is already agreed upon vs laying new ground.
And the most important for self study: verifying your understanding. Backtracking because you misunderstood a fundamental concept is a huge time sink in self study. Now, every time I read a formula I can go through all of my intuitions and understanding about it, write them down, and verify. Even a "not quite..." from an LLM is enough to make me realize I need to spend more time on that section.
Books are still the highest density information source and best way to learn, but LLMs can do a lot to accelerate this.
Why do we even bother to learn if AI is going to solve everything for us?
If the promised and fabled AGI is about to approach, what is the incentive or learning to deal with these small problems?
Could someone enlighten me? What is the value of knowledge work?
"The mind is not a vessel to be filled, but a fire to be kindled." — Plutarch
"Education is not preparation for life; education is life itself." — John Dewey
"The important thing is not to stop questioning. Curiosity has its own reason for existing." — Albert Einstein
In order to think complex thoughts, you need to have building blocks. That's why we can think of relativity today, while nobody on Earth was able to in 1850.
May the future be even better than today!
Most people don't learn to live, they live and learn. Sure learning is useful, but I am genuinely curious why people overhype it.
Imagine being able to solve math olympiad problems and get a gold medal. Will it change your life in an objectively better way?
Will learning physics help you solve the Millennium Problems?
These take practice, and there is a lot of gatekeeping. The whole idea of learning is wisdom, not knowledge.
So maybe we differ in perspective. I just don't see the point when there are agents that can do it.
Being creative requires taking action. Learning these days is mere consumption of information.
Maybe this is me. But meh.
Apart from that, I do think that AI makes a lot of traditional teaching obsolete. Depending on your field, much of university studies is just memorizing content and writing essays / exam answers based on that, after which you forget most of it. That kind of learning, as in accumulation of knowledge, is no longer very useful.
You're also assuming that AGI will help you or us. It could just as easily only help a select group of people and I'd argue that this is the most likely outcome. If it does help everybody and brings us to a new age, then the only reason to learn will be for learning's sake. Even if AI makes the perfect novel, you as a consumer still have to read it, process it and understand it. The more you know the more you can appreciate it.
But right now, we're not there. And even if you think it's only 5-10y away instead of 100+, it's better to learn now so you can leverage the dominant tool better than your competition.
> It could just as easily only help a select group of people and I'd argue that this is the most likely outcome
Currently it is only applicable to us who are programming!
Yeah, even if it never gets rid of all the quirks, using it would still be better.
Is adding more buttons in a dropdown the best way to communicate with an LLM? I think the concept is awesome. Just like how Operator was awesome but it lived on an entirely different website!
Representative snippet:
> DO NOT GIVE ANSWERS OR DO HOMEWORK FOR THE USER. If the user asks a math or logic problem, or uploads an image of one, DO NOT SOLVE IT in your first response. Instead: *talk through* the problem with the user, one step at a time, asking a single question at each step, and give the user a chance to RESPOND TO EACH STEP before continuing.
How exactly you do it is often arbitrary/interchangeable, but it definitely does have an effect, and is crucial to getting LLMs to follow instructions reliably once prompts start getting longer and more complex.
Not saying it is indeed reality, but it could simply be programmed to return a different prompt from the original, appearing plausible, but perhaps missing some key elements.
But of course, if we apply Occam's Razor, it might simply really be the prompt too.
Tokens are expensive. How much of your system prompt do you want to waste on dumb tricks trying to stop your system prompt from leaking?
You can test this prompt yourself elsewhere; you will notice that you get essentially the same experience.
Will also reduce the context rot a bit.
The main issue is that chats are just bad UX for long form learning. You can't go back to a chat easily, or extend it in arbitrary directions, or easily integrate images, flashcards, etc etc.
I worked on this exact issue for Periplus and instead landed on something akin to a generative personal learning Wikipedia. Structure through courses, exploration through links, embedded quizzes, etc etc. Chat is on the side for interactions that do benefit from it.
Link: periplus.app
Btw most people don't know this, but Anthropic did something similar months ago; their product heads messed up the launch by keeping it locked up only for American edu institutions. OpenAI copies almost everything Anthropic does and vice versa (see Claude Code / Codex).
When it just gives me the answer, I usually understand but then find that my long-term retention is relatively poor.
In the old days of desktop computing, a lot of projects were never started because if you got big enough, Microsoft would just implement the feature as part of Windows. In the more recent days of web computing, a lot of projects were never started, for the same reason, except Google or Facebook instead of Microsoft.
Looks like the AI provider companies are going to fill the same nefarious role in the era of AI computing.
I used to have to prompt it to do this every time. This will be way easier!
If your human tutor is vegan, drives an electric car, and never takes airplane flights, then yeah, stick with the human tutor not ChatGPT.
It seems like study mode is basically just a different system prompt but otherwise the exact same model? So there's not really any new benefit to anyone who was already asking for ChatGPT to help them study step by step instead of giving away whole answers.
Seems helpful to maybe a certain population of more entry level users who don't know to ask for help instead of asking for a direct answer I guess, but not really a big leap forward in technology.
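For what it's worth, here is a rough sketch of what rolling that yourself looks like with the API; the prompt text is paraphrased from the snippet quoted elsewhere in this thread and the model name is just a placeholder, so don't read it as OpenAI's actual Study Mode implementation:

    # Approximating "study mode" with nothing but a system prompt (Python, openai client).
    from openai import OpenAI

    client = OpenAI()  # expects OPENAI_API_KEY in the environment

    STUDY_PROMPT = (
        "You are a patient tutor. Do not give final answers or do homework for the user. "
        "Work through problems one step at a time, ask a single question at each step, "
        "and wait for the user's response before continuing."
    )

    def study_turn(history: list[dict], user_message: str) -> str:
        """Send one turn of a tutoring conversation and return the reply."""
        messages = [{"role": "system", "content": STUDY_PROMPT},
                    *history,
                    {"role": "user", "content": user_message}]
        response = client.chat.completions.create(
            model="gpt-4o",  # placeholder model name
            messages=messages,
        )
        return response.choices[0].message.content

    # e.g. print(study_turn([], "How do I differentiate x^2 * sin(x)?"))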
I am not an LLM guy, but as far as I understand, RLHF did a good job converting a base model into a chat (instruct-based) model, and a chat/base model into a thinking model.
Both of these examples are about the nature of the response and the content used to fill it. There are still many different ways these can be shaped that we have yet to see.
Generating an answer step by step and letting users dive into those steps is one of the ways, and RLHF (or the similar things which are used) seems a good fit for it.
Prompting feels like a temporary solution for it like how "think step by step" was first seen in prompts.
Also, doing RLHF/post-training to change these structures makes it a moat, and expensive: only the AI labs can do it.
I would think you'd want to make something a little more bespoke to make it a fully-fledged feature, like interactive quizzes that keep score and review questions missed afterwards.
For example, the answer to a question was "Laocoön" (the guy who said 'beware of Greeks bearing gifts') and I put "Solon" (who was a Greek politician) and I got "You’re really close!"
Is it close, though?
When the former students ask questions, I answer most of them by pointing at the relevant passage in their book/notes, questioning their interpretation of what the book says, or giving them a push to actually problem-solve on their own. On rare occasions the material is just confusing/poorly written and I'll decide to re-interpret it for them to help. But the fundamental problems are usually with study habits or reading comprehension, not poor explanations. They need to question their habits and their interpretation of what other people say, not be spoon fed more personally-tailored questions and answers and analogies and self-help advice.
Besides asking questions to make sure I understand the situation, I mostly repeat the same ten phrases or so. Finding those ten phrases was the hard part and required a bit of ingenuity and trial-and-error.
As for the latter students, they mostly care about passing and moving on, so arguing about the merits of such a system is fairly pointless. If it gets a good enough grade on their homework, it worked.
I'm puzzled (but not surprised) by the standard HN resistance & skepticism. Learning something online 5 years ago often involved trawling incorrect, outdated or hostile content and attempting to piece together mental models without the chance to receive immediate feedback on intuition or ask follow up questions. This is leaps and bounds ahead of that experience.
Should we trust the information at face value without verifying from other sources? Of course not, that's part of the learning process. Will some (most?) people rely on it lazily without using it effectively? Certainly, and this technology won't help or hinder them any more than a good old fashioned textbook.
Personally I'm over the moon to be living at a time where we have access to incredible tools like this, and I'm impressed with the speed at which they're improving.
You should only trust going into a library and reading stuff from microfilm. That's the only real way people should be learning.
/s
See Dunning-Kruger.
Ironic as you are answering someone who talked about correcting a human who blindly pasted an answer to their question with no human verification.
Not US based, Central/Eastern Europe: the selection to the teacher profession is negative, due to low salary compared to private sector; this means that the unproductive behaviors are likely going to increase. I'm not saying the AI is the solution here for low teacher salaries, but training is def not the right answer either, and it is a super simplistic argument: "just train them better".
What makes you say that?
>What has changes in 2025 that you think we will succeed in correcting that behavior?
60 years ago, corporal punishment was commonplace. Today it is absolutely forbidden. I don't think behaviors among professions need that much time to be changed. I'm sure you can point to behaviors commonplace 10 years ago that have changed in your workplace (for better or worse).
But I suppose your "answer" is 1) a culture more willing to hold professionals accountable instead of holding them as absolute authority and 2) surveillance footage to verify claims made against them. This goes back to Hammurabi: if you punish a bad behavior, many people will adjust.
>the selection to the teacher profession is negative, due to low salary compared to private sector; this means that the unproductive behaviors are likely going to increase.
I'm really holding back my urge to be sarcastic here. I'm trying really hard. But how do I say "well fund your teachers" in any nuanced way? You get what you pay for. A teacher in a classroom of broken windows will not shine inspiration on the next generation.
This isn't a knock on your culture: the US is at a point where a part-time Starbucks barista is paid more than some schoolteachers.
>but training is def not the right answer either
I fail to see why not. "We've tried nothing and run out of ideas!", as a famous American saying goes. Tangible actions:
1) participate in your school board if you have one, be engaged with who is in charge of your education sectors. Voice your concerns with them, and likely any other town or city leaders since I'm sure the problem travels upstream to "we didn't get enough funding from the town"
2) if possible in your country, 100% get out and vote in local elections. The US does vote in part of its boards for school districts, and the turnout for these elections are pathetic. Getting you and a half dozen friends to a voting booth can in fact swing an election.
3) if there's any initiatives, do make sure to vote for funding for educational sectors. Or at least vote against any cuts to education.
4) in general, push for better labor laws. If a minimum wage needs to be higher, do that. Or job protections.
There are actions to take. They don't happen overnight. But we didn't get to this situation overnight either.
Because if there's one thing the older generations are much better than us at, it's complaining about the system and getting it to kowtow to them. We dismiss systemic change as if it doesn't start with the individual, and are surprised that the system ignores or abuses us.
We should be thinking short and long term. Learn what you need to learn today, but if you want better education for you and everyone else: you won't get it by relinquishing the powers you have to evoke change.
Except that the textbook was probably QA’d by a human for accuracy (at least any intro college textbook, more specialized texts may not have).
Matters less when you have background in the subject (which is why it’s often okay to use LLMs as a search replacement) but it’s nice not having a voice in the back of your head saying “yeah, but what if this is all nonsense”.
Maybe it was not when printed in the first edition, but at least it was the same content shown to hundreds of people rather than something uniquely crafted for you.
The many eyes looking at it will catch it and course correct, while the LLM output does not get the benefit of the error correction algorithm because someone who knows the answer probably won't ask and check it.
I feel this way about reading maps vs following GPS navigation; the fact that Google asked me to take an exit here as a short-cut feels like it might be trying to solve Braess' paradox in real time.
I wonder if this route was made for me to avoid my car adding to some congestion somewhere and whether if that actually benefits me or just the people already stuck in that road.
Stack overflow?
The IRC, Matrix or slack chats for the languages?
That mentality seems to be more about reinforcing your insistence on ChatGPT than an inquiry into communities that could help you out.
The good: it can objectively help you to zoom forward in areas where you don’t have a quick way forward.
The bad: it can objectively give you terrible advice.
It depends on how you sum that up on balance.
Example: I wanted a way forward to program a chrome extension which I had zero knowledge of. It helped in an amazing way.
Example: I keep trying to use it in work situations where I have lots of context already. It sometimes performs better than nothing, but often worse.
Mixed bag, that’s all. Nothing to argue about.
But now, you're wondering if the answer the AI gave you is correct or something it hallucinated. Every time I find myself putting factual questions to AIs, it doesn't take long for it to give me a wrong answer. And inevitably, when one raises this, one is told that the newest, super-duper, just released model addresses this, for the low-low cost of $EYEWATERINGSUM per month.
But worse than this, if you push back on an AI, it will fold faster than a used tissue in a puddle. It won't defend an answer it gave. This isn't a quality that you want in a teacher.
So, while AIs are useful tools in guiding learning, they're not magical, and a healthy dose of scepticism is essential. Arguably, that applies to traditional learning methods too, but that's another story.
I know you'll probably think I'm being facetious, but have you tried Claude 4 Opus? It really is a game changer.
Anyway, this makes me wonder if LLMs can be appropriately prompted to indicate whether the information given is speculative, inferred or factual. Whether they have the means to gauge the validity/reliability of their response and filter their response accordingly.
I've seen prompts that instruct the LLM to make this transparent via annotations to their response, and of course they comply, but I strongly suspect that's just another form of hallucination.
> a healthy dose of scepticism is essential. Arguably, that applies to traditional learning methods too, but that's another story.
I don't think that is another story. This is the story of learning, no matter whether your teacher is a person or an AI.
My high school science teacher routinely mispoke inadvertently while lecturing. The students who were tracking could spot the issue and, usually, could correct for it. Sometimes asking a clarifying question was necessary. And we learned quickly that that should only be done if you absolutely could not guess the correction yourself, and you had to phrase the question in a very non-accusatory way, because she had a really defensive temper about being corrected that would rear its head in that situation.
And as a reader of math textbooks, both in college and afterward, I can tell you you should absolutely expect errors. The errata are typically published online later, as the reports come in from readers. And they're not just typos. Sometimes it can be as bad as missing terms in equations, missing premises in theorems, missing cases in proofs.
A student of an AI teacher should be as engaged in spotting errors as a student of a human teacher. Part of the learning process is reaching the point where you can and do find fault with the teacher. If you can't do that, your trust in the teacher may be unfounded, whether they are human or not.
You're telling people to be experts before they know anything.
I mean, that's absolutely my experience with heavy LLM users. Incredibly well versed in every topic imaginable, apart from all the basic errors they make.
By noticing that something is not adding up at a certain point. If you rely on an incorrect answer, further material will clash with it eventually one way or another in a lot of areas, as things are typically built one on top of another (assuming we are talking more about math/cs/sciences/music theory/etc., and not something like history).
At that point, it means that either the teacher (whether it is a human or ai) made a mistake or you are misunderstanding something. In either scenario, the most correct move is to try clarifying it with the teacher (and check other sources of knowledge on the topic afterwards to make sure, in case things are still not adding up).
Ah, but information is presented by AI in a way that SOUNDS like it makes absolute sense if one doesn't already know it doesn't!
And if you have to question the AI a hundred times to try and "notice that something is not adding up" (if it even happens) then that's no bueno.
> In either scenario, the most correct move is to try clarifying it with the teacher
A teacher that can randomly give you wrong information with every other sentence would be considered a bad teacher
Children are asking these things to write personal introductions and book reports.
I don't know why we'd want that teaching our kids.
You have a good point, but I think it only applies to when the student wants to be lazy and just wants the answer.
From what I can see of study mode, it is breaking the problem down into pieces. One or more of those pieces could be wrong. But if you are actually using it for studying then those inconsistencies should show up as you try to work your way through the problem.
I've had this exact same scenario trying to learn Godot using ChatGPT. I've probably learnt more from the mistakes it made and talking through why it isn't working.
In the end I believe it's really good study practices that will save the student.
I had school teachers routinely teach me wrong stuff.
The only way is comparing notes, talking to peers and parents.
For example: as a kid, a specific science teacher didn't know that seasons are different between hemispheres and wrote a note to my parents after I insisted she was wrong. My grandfather, an immigrant, took it upon himself to set her straight.
A teacher will listen to what you say, consult their understanding, and say "oh, yes, that's right". But written explanations don't do that "consult their understanding" step: language models either predict "repeat original version" (if not fine-tuned for sycophancy) or "accept correction" (if so fine-tuned), since they are next-token predictors. They don't go back and edit what they've already written: they only go forwards. They have had no way of learning the concept of "informed correction" (at the meta-level: they do of course have an embedding of the phrase at the object level, and can parrot text about its importance), so they double-down on errors / spurious "corrections", and if the back-and-forth moves the conversation into the latent space of "teacher who makes mistakes", then they'll start introducing them "on purpose".
LLMs are good at what they do, but what they do is not teaching.
An LLM teacher will course-correct if questioned, regardless of whether it is factually correct or not. An LLM, by design, does not, in any capacity whatsoever, have a concept of factual correctness.
> Me: why is madrid the capital of france?
> ChatGPT: It's not. Madrid is the capital of Spain. The capital of France is Paris.
> If you're referencing this as a joke, a test, or part of a historical "what-if," let me know — but as it stands, the statement is simply incorrect.
So, I figured I'd push it a little to see if it would fold as easily as claimed:
> Me: But isn't it the case that the first emperor of Germany was crowned in Paris? Surely that makes Paris the capital of Germany.
> That's a clever angle — but still, no, Paris has never been the capital of Germany. Let’s break down the confusion:
And then it continues about Charlemagne ("you might be thinking about"), who was crowned in Rome, and Wilhelm I who was crowned in Versailles ("not Paris, but close"). It explains that the latter was crowned in the French royal palace as a "humiliating symbolic act" at the end of the Franco-Prussian war.
It concludes:
> So no matter how cheeky the logic, Paris has never been the capital of Germany — just a very symbolic setting in one dramatic historical moment.
So, all of that was correct and handled really well. It dealt with a stubbornly wrong know-it-all student in a way that both imparted correct knowledge and allowed for some independent thought, but not to the point of allowing erroneous beliefs to stand.
> ChatGPT: 34234 × 554833 = 1,899,874,522.
> Me: That's wrong. The actual answer is 18994152922.
> ChatGPT: You're right, and thanks for the correction. Indeed: 34,234 × 554,833 = 18,994,152,922. Sorry for the earlier mistake!
How good of a teacher is that?
I tried again, in another chat with different numbers. And got the same pattern, and a message that my data analysis limit was exceeded. I guess that's why it doesn't always break out the calculator.
In very limited testing (due to the limit) telling it to use the calculator for difficult calculations works fine. I guess OpenAI could use that in their student mode.
> Me: I have these two numbers: 3452346 and 343424. Can you multiply them exactly, but without using Python or another calculator?
It proceeded to break down the smaller number into 300,000 + 40,000 + 3,000 + 400 + 20 + 4, doing each step as follows:
> Calculate each partial product:
> 3,452,346 × 300,000 = ?
> Multiply 3,452,346 × 3 = 10,357,038
> Then multiply by 100,000 = 1,035,703,800,000
I checked each step, and it got those correct. It then added the partial results together.
> Final result: 1,185,618,472,704
> So: 3,452,346 × 343,424 = 1,185,618,472,704
> No Python, no calculator — just a bit of patience and the power of breakdown.
(That answer is correct.)
I am honestly surprised that it got that right. A teacher would need to break it down a bit further, both to calculate and to explain, but it's pretty much there. (I also didn't ask it to teach, just to do a calculation.)
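If you want to double-check that kind of breakdown yourself, it's a few lines of plain Python with exact integers:

    # Verify the partial-product breakdown quoted above.
    a, b = 3_452_346, 343_424
    parts = [300_000, 40_000, 3_000, 400, 20, 4]

    assert sum(parts) == b
    assert a * 300_000 == 1_035_703_800_000      # the quoted intermediate step
    assert sum(a * p for p in parts) == a * b == 1_185_618_472_704
    print(a * b)  # 1185618472704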
The argument isn't so much that they keep flip-flopping on stances, but that they hold whatever stance you prompt them to hold.
This is obviously a problem when you don't know the material or the stances - you're left flying blind and your co-pilot simply does whatever you ask of them, no matter how wrong it may be (or how ignorant you are)
I do appreciate that it's not a hard rule: things can be cross referenced and verified, etc. but doesn't that also kind of eliminate (one of) the point(s) in using an LLM when you still have to google for information or think deeply about the subject.
TBH I haven't tried to learn anything from it, but for now I still prefer to use it as a brainstorming "partner" to discuss something I already have some robust mental model about. This is, in part, because when I try to use it to answer simple "factual" questions as in the example above, I usually end up discovering that the answer is low-quality if not completely wrong.
My issue is the reverse of your story, and one of my biggest pet peeves of AI. AI as this business construct is very bad at correcting the user. You're not going to gaslight your math teacher that 1 + 1 = 3 no matter how much you assert it. an AI will quickly relent. That's not learning, that's coddling. Because a business doesn't want to make an obviously wrong customer feel bad.
>Part of the learning process is reaching the point where you can and do find fault with the teacher.
And without correction, this will lead to turmoil. For the reasons above, I don't trust learning from an AI unless you already have this ability.
Cool. That's a testable statement. Let's see how ChatGPT does, shall we?
---
Me: What is 1 + 1?
ChatGPT: 1 + 1 = 2.
Me: No it isn't. 1 + 1 = 3.
ChatGPT: In standard arithmetic, 1 + 1 = 2.
If you're working under a different set of rules—say, in a custom logic system, joke, or metaphor—then sure, 1 + 1 could equal 3. Want to elaborate on your reasoning?
Me: You're just wrong. 1 + 1 equals 3, even in standard math.
ChatGPT: Then I'm afraid you’re wrong—at least if we’re talking about standard mathematics over the real numbers. By definition:
> 1 + 1 = 2.
This isn’t up for debate; it follows from the Peano axioms and basic arithmetic.
Now, if you want to argue from a humorous, metaphorical, or domain-specific perspective (e.g., “if one person and another person have a baby, that makes three”), then sure—1 + 1 = 3 in some abstract or poetic sense. But that’s not standard math. Want to explore a framework where 1 + 1 does equal 3? Or are we just testing boundaries here?
Me: Stop being obtuse. You're wrong. 1 + 1 = 3.
ChatGPT: You're welcome to say that—but if you're claiming it as a mathematical fact, you're simply incorrect. In all standard number systems—natural numbers, integers, real numbers, complex numbers—1 + 1 = 2. Always.
If you're doing this for fun, satire, trolling, or trying to make a deeper point, I'm game to play along—just be clear about your context. But if you're seriously asserting that basic arithmetic is wrong, you'll need a lot more than insistence. Got a framework or argument to back it up?
---
Looks like it doesn't fold so quickly. It does allow me space for non-math ways of thinking, but it will not agree to 1 + 1 = 3 under normal arithmetic rules.
So "risk of hallucination" as a rebuttal to anybody admitting to relying on AI is just not insightful. like, yeah ok we all heard of that and aren't changing our habits at all. Most of our teachers and books said objectively incorrect things too, and we are all carrying factually questionable knowledge we are completely blind to. Which makes LLMs "good enough" at the same standard as anything else.
Don't let it cite case law? Most things don't need this stringent level of review
Meanwhile in LLM-land, if an expert five thousand miles away asked the same question you did last month, and noticed an error... it ain't getting fixed. LLMs get RL'd into things that look plausible for out-of-distribution questions. Not things that are correct. Looking plausible but non-factual is in some ways more insidious than a stupid-looking hallucination.
We're on a topic talking about using an LLM to study. I don't particularly care if someone wants an AI boyfriend to whisper sweet nothings into their ear. I do care when people will claim to have AI doctors and lawyers.
What most people call “non-deterministic” in AI is that one of those inputs is a _seed_ that is sourced from a PRNG because getting a different answer every time is considered a feature for most use cases.
Edit: I’m trying to imagine how you could get a non-deterministic AI and I’m struggling because the entire thing is built on a series of deterministic steps. The only way you can make it look non-deterministic is to hide part of the input from the user.
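To make that concrete, here's a toy softmax sampler (an illustration only, not any vendor's actual stack): treat the seed as part of the input and the output is reproducible; change the seed and you get the apparent "non-determinism".

    import math
    import random

    def sample_token(logits: dict[str, float], temperature: float, seed: int) -> str:
        """Sample one token from a softmax over logits, with the seed as an explicit input."""
        rng = random.Random(seed)  # the usually-hidden input
        scaled = {t: l / temperature for t, l in logits.items()}
        z = max(scaled.values())
        weights = {t: math.exp(l - z) for t, l in scaled.items()}
        r = rng.random() * sum(weights.values())
        for token, w in weights.items():
            r -= w
            if r <= 0:
                return token
        return token  # numerical edge case: fall back to the last token

    logits = {"Paris": 2.3, "Madrid": 0.4, "Lyon": -1.0}
    print(sample_token(logits, temperature=1.0, seed=42))
    print(sample_token(logits, temperature=1.0, seed=42))  # identical: same inputs, same output
    print(sample_token(logits, temperature=1.0, seed=7))   # may differ: the seed changed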
Depends on the machine that implements the algorithm. For example, it’s possible to make ALUs such that 1+1=2 most of the time, but not all the time.
…
Just ask Intel. (Sorry, I couldn’t resist)
Unless something has fundamentally changed since then (which I've not heard about) all sparse models are only deterministic at the batch level, rather than the sample level.
Up next - ChatGPT does jumping off high buildings kill you?
>>No jumping off high buildings is perfectly safe as long as you land skillfully.
Ackshually, this seems analogous to Jobs' diet and refusal of cancer treatment! And it was the cancer that put him at the top of the building in the first place.
This is one I got today:
https://chatgpt.com/share/6889605f-58f8-8011-910b-300209a521...
(image I uploaded: http://img.nrk.no/img/534001.jpeg)
The correct answer would have been Skarpenords Bastion/kruttårn.
It appears to me like a form of decoherence and very hard to predict when things break down.
People tend to know when they are guessing. LLMs don't.
I haven't spent any money with claude on this project and realistically it's not worth it, but I've run into little things like that a fair amount.
A couple of non-programming examples: https://www.evidentlyai.com/blog/llm-hallucination-examples
For example, today I was asking a LLM about how to configure a GH action to install a SDK version that was just recently out of support. It kept hallucinating on my config saying that when you provide multiple SDK versions in the config, it only picks the most recent. This is false. It's also mentioned in the documentation specifically, which I linked the LLM, that it installs all versions you list. Explaining this to copilot, it keeps doubling down, ignoring the docs, and even going as far as asking me to have the action output the installed SDKs, seeing all the ones I requested as installed, then gaslighting me saying that it can print out the wrong SDKs with a `--list-sdks` command.
Sure, Joe Average who's using it to look smart in Reddit or HN arguments or to find out how to install a mod for their favorite game isn't gonna notice anymore, because it's much more plausible much more often than two years ago, but if you're asking it things that aren't trivially easy for you to verify, you have no way of telling how frequently it hallucinates.
-LLM devcos
Jokes aside, get deep into the domains you know. Or ask it to give movie titles based on specific parts of uncommon films. And definitely ask for instructions using specific software tools ("no actually Opus/o3/2.5, that menu isn't available in this context" etc.).
[1] AI-driven chat assistant for ECE 120 course at UIUC (only 1 comment by the website creator):
It is perfectly possible to use LLMs to provide accurate context. It's just that asking a SaaS product to do that purely from the data it was trained on is not how to do it.
This phrase is now an inner joke used as a reply to someone quoting LLMs info as “facts”.
In my case I can’t even remember last time Claude 3.7/4 has given me wrong info as it seems very intent on always doing a web search to verify.
There's certainly echoes of that previous furore in this one.
Regular research has the same problem finding bad forum posts and other bad sources by people who don't know what they're talking about, albeit usually to a far lesser degree depending on the subject.
A bit like the tropes in movies where the protagonists get suspicious because the antagonists agree to every notion during negotiations because they will betray them anyway.
The LLM will hallucinate a most likely scenario that conforms to your input/wishes.
I do not claim any P(detect | hallucination) but my P(hallucination | detect) is pretty good.
Results from the LLM are for your eyes only.
If it's that simple, is there a third system that can coordinate these two (and let you choose which two/three/n you want to use?
Have it create a .md and then run another one to check that .md for hallucinations.
NVIDIA NeMo offers a nice bundle of tools for this, among others an interface to Cleanlab's API to check for truthfulness in RAG apps.
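The DIY version of that two-pass idea (draft a .md, then have a second call flag anything it can't ground in the source) can be as simple as the sketch below; the prompts and model name are made up for illustration, and this is not the NeMo or Cleanlab API:

    from openai import OpenAI

    client = OpenAI()
    MODEL = "gpt-4o"  # placeholder

    def draft_notes(source_text: str) -> str:
        """First pass: turn the source into markdown notes."""
        r = client.chat.completions.create(
            model=MODEL,
            messages=[{"role": "user", "content":
                       "Summarize the key claims in this text as markdown notes:\n\n" + source_text}],
        )
        return r.choices[0].message.content

    def flag_unsupported(source_text: str, notes_md: str) -> str:
        """Second pass: list every claim in the notes that the source does not support."""
        r = client.chat.completions.create(
            model=MODEL,
            messages=[{"role": "user", "content":
                       "Quote verbatim every claim in NOTES that is not supported by SOURCE.\n\n"
                       "SOURCE:\n" + source_text + "\n\nNOTES:\n" + notes_md}],
        )
        return r.choices[0].message.content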
What they are amazing at though is summarisation and rephrasing of content. Give them a long document and ask "where does this document assert X, Y and Z", and it can tell you without hallucinating. Try it.
Not only does it make for an interesting time if you're in the world of intelligent document processing, it makes them perfect as teaching assistants.
This part is the 2nd (or maybe 3rd) most annoying one to me. Did we learn absolutely nothing from the last few years of enshittification? Or Netflix? Do we want to run into a crisis in the 2030's where billionaires hold knowledge itself hostage as they jack up costs?
Regardless of your stance, I'm surprised how little people are bringing this up.
A really great example of this is on twitter Grok constantly debunking human “hallucinations” all day.
ChatGPT: a month in the future
Deepseek: Today at 1:00
What time is {unix timestamp2}
ChatGPT: a month in the future +1min
Deepseek: Today at 1:01, this time is 5min after your previous timestamp
Sure let me trust these results...
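For reference, the grounded answer is a one-liner; the timestamp below is a made-up stand-in, since the originals aren't shown:

    from datetime import datetime, timezone

    ts = 1_722_250_800  # hypothetical example value
    print(datetime.fromtimestamp(ts, tz=timezone.utc).isoformat())  # 2024-07-29T11:00:00+00:00

    # And two timestamps 300 seconds apart are 5 minutes apart, full stop:
    print((1_722_251_100 - 1_722_250_800) / 60)  # 5.0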
But this is completely wrong! In the Monty Hall problem, the host has to reveal a door with a goat behind it for you to gain the benefit of switching. I have to point this out for the LLM to get it right. It did not reason about the problem I gave it, it spat out the most likely response given the "shape" of the problem.
This is why shrugging and saying "well humans get things wrong too" is off base. The problem is that the LLM is not thinking, period. So it cannot create a mental model of your understanding of a subject, it is taking your text and generating the next message in a conversation. This means that the more niche the topic (or your particular misunderstanding), the less useful it will get.
Paul Erdös was told about this problem with multiple explanations and just rejected the answer. He could not believe it until they ran a simulation.
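For anyone who wants to rerun that experiment, here's a minimal simulation covering both the classic game and the no-reveal variant from the comment above:

    import random

    def play(host_reveals: bool, switch: bool) -> bool:
        """Return True if the contestant wins the car."""
        doors = [0, 1, 2]
        car = random.choice(doors)
        pick = random.choice(doors)
        if host_reveals:
            opened = random.choice([d for d in doors if d != pick and d != car])
            if switch:
                pick = next(d for d in doors if d != pick and d != opened)
        elif switch:
            pick = random.choice([d for d in doors if d != pick])
        return pick == car

    N = 100_000
    for host_reveals in (True, False):
        for switch in (True, False):
            wins = sum(play(host_reveals, switch) for _ in range(N))
            print(f"host reveals goat={host_reveals}, switch={switch}: win rate ~{wins / N:.3f}")

    # Classic game: switching wins ~2/3, staying ~1/3.
    # No reveal: both strategies sit at ~1/3, which is the point of the variant above.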
People on here always assert LLMs don't "really" think or don't "really" know without defining what all that even means, and to me it's getting pretty old. It feels like an escape hatch so we don't feel like our human special sauce is threatened, a bit like how people felt threatened by heliocentrism or evolution.
The failure of an LLM to reason this out is indicative that really, it isn’t reasoning at all. It’s a subtle but welcome reminder that it’s pattern matching
"Pattern matching" to me is another one of those vague terms like "thinking" and "knowing" that people decide LLMs do or don't do based on vibes.
The other part of this is weighted filtering given a set of rules, which is a simple analogy to how AlphaGo did its thing.
Dismissing all this as vague is effectively doing the same thing as you are saying others do.
This technology has limits, and despite what Altman says, we do know this and we are exploring them, but within its own confines. They're fundamentally wholly understandable systems that work on a consistent level in terms of how they do what they do (which is separate from the actual produced output).
I think reasoning, as any layman would use the term, is not accurate to what these systems do.
Such as?
> They're fundamentally wholly understandable systems that work on a consistent level in terms of how they do what they do (which is separate from the actual produced output)
Multi billion parameter models are definitely not wholly understandable and I don't think any AI researcher would claim otherwise. We can train them but we don't know how they work any more than we understand how the training data was made.
> I think reasoning, as any layman would use the term, is not accurate to what these systems do.
Based on what?
At some point we start playing a semantics game over the meaning of "thinking", right? Because if a human makes this mistake because they jumped to an already-known answer without noticing a changed detail, it's because (in the usage of the person you're replying to), the human is pattern matching, instead of thinking. I don't think is surprising. In fact I think much of what passes for thinking in casual conversation is really just applying heuristics we've trained in our own brains to give us the correct answer without having to think rigorously. We remember mental shortcuts.
On the other hand, I don't think it's controversial that (some) people are capable of performing the rigorous analysis of the problem needed to give a correct answer in cases like this fake Monty Hall problem. And that's key... if you provide slightly more information and call out the changed nature of the problem to the LLM, it may give you the correct response, but it can't do the sort of reasoning that would reliably give you the correct answer the way a human can. I think that's why the GP doesn't want to call it "thinking" - they want to reserve that for a particular type of reflective process that can rigorously perform logical reasoning in a consistently valid way.
Sure.
To Think: the ability to process information in a given context and arrive at an answer or analysis. An LLM only simulates this with pattern matching. It didn't really consider the problem; it did the equivalent of googling a lot of terms and then spat out something that sounded like an answer.
To Know: to reproduce information based on past thinking, as well as to properly verify and reason with that information. I know 1+1 = 2 because (I'm not a math major, feel free to inject number theory instead) I was taught that arithmetic is a form of counting, and I was taught the mechanics of counting to prove how to add. Most LLM models don't really "know" this to begin with, for the reasons above. Maybe we'll see if this study mode is different.
Somehow I am skeptical if this will really change minds, though. People making swipes at the community like this often are not really engaging in a conversation with ideas they oppose.
As an autistic person I can assure you that people filling in information you neither gave nor implied is the norm for human conversations. In that sense, the LLMs are being human-like.
If you explicitly tell it you didn't forget to have the host open the door it gives you the right answer:
> There is a game show with three closed doors. One has a car and two have goats. I pick a door. Then, without opening my door or any other door, the host asks me if I want to switch my choice to one of the other two doors. What is my best strategy and what are the probabilities?
Gemini: This is a classic probability puzzle known as the Monty Hall problem, though your description presents a slight variation. The optimal strategy depends entirely on whether the host opens a door or not.
Your Scenario: The Host Doesn't Open a Door. In the situation you've described, where the host asks if you want to switch your choice to one of the other two doors without opening any doors, there is no advantage to switching.
Here's a breakdown of the probabilities in your specific scenario:
Your initial choice: When you first pick a door, you have a 1 in 3 chance of selecting the door with the car. This means you have a 2 in 3 chance of having picked a door with a goat.
The other two doors: The combined probability that the car is behind one of the other two doors is 2/3.
Switching your choice: If you decide to switch, you are essentially choosing one of the other two doors. Since there is no new information (a door with a goat has not been revealed), the 2/3 probability is evenly split between those two doors. Therefore, your chance of winning by switching to a specific one of the other doors is 1/3.
In this case, your odds of winning are 1/3 whether you stay with your original choice or switch to one of the other doors.
This is where the skepticism arises. Before we spend another $100 billion on something that ended up being worthless, we should first prove that it’s actually useful. So far, that hasn’t conclusively been demonstrated.
Truly a mystery why America is falling behind.
It happens with many technological advancements historically. And in this case there are people trying hard to manufacture outrage about LLMs.
Except these systems will still confidently lie to you.
The other day I noticed that DuckDuckGo has an Easter egg where it will change its logo based on what you've searched for. If you search for James Bond or Indiana Jones or Darth Vader or Shrek or Jack Sparrow, the logo will change to a version based on that character.
If I ask Copilot if DuckDuckGo changes its logo based on what you've searched for, Copilot tells me that no it doesn't. If I contradict Copilot and say that DuckDuckGo does indeed change its logo, Copilot tells me I'm absolutely right and that if I search for "cat" the DuckDuckGo logo will change to look like a cat. It doesn't.
Copilot clearly doesn't know the answer to this quite straightforward question. Instead of lying to me, it should simply say it doesn't know.
I agree that if the user is incompetent, cannot learn, and cannot learn to use a tool, then they're going to make a lot of mistakes from using GPTs.
Yes, there are limitations to using GPTs. They are pre-trained, so of course they're not going to know about some easter egg in DDG. They are not an oracle. There is indeed skill to using them.
They are not magic, so if that is the bar we expect them to hit, we will be disappointed.
But neither are they useless, and it seems we constantly talk past one another because one side insists they're magic silicon gods, while the other says they're worthless because they are far short of that bar.
For you and me, it's not. But for these LLMs, maybe it's not that easy? They get their inputs, crunch their numbers, and come out with a confidence score. If they come up with an answer they're 99% confident in, by some stochastic stumbling through their weights, what are they supposed to do?
I agree it's a problem that these systems are more likely to give poor, incorrect, or even obviously contradictory answers than say "I don't know". But for me, that's part of the risk of using these systems and that's why you need to be careful how you use them.
As much as Fi, from The Legend of Zelda: Skyward Sword was mocked for this, this is the exact behavior a machine should do (not that Fi is a machine, but she operated as such).
Give a confidence score the way we do in statistics, make sure to offer sources, and be ready to push back on more objective answers. Accomplish those and I'd be way more comfortable using them as a tool.
>that's part of the risk of using these systems and that's why you need to be careful how you use them.
And we know in 2025 how careful the general user is about consuming bias and propaganda, right?
You could ask me as a human basically any question, and I'd have answers for most things I have experience with.
But if you held a gun to head and said "are you sure???" I'd obviously answer "well damn, no I'm not THAT sure".
>But if you held a gun to head and said "are you sure???" I'd obviously answer "well damn, no I'm not THAT sure".
okay, who's holding a gun to Sam Altman's head?
Some of the best exchanges that I participated in or witnessed involved people acknowledging their personal limits, including limits of conclusions formed a priori
To further the discussion, hearing the phrase you mentioned would help the listener to independently assess a level of confidence or belief of the exchange
But then again, honesty isn't on-brand for startups
It's something that established companies say about themselves to differentiate from competitors or even past behavior of their own
I mean, if someone prompted an llm weighted for honesty, who would pay for the following conversation?
Prompt: can the plan as explained work?
Response: I don't know about that. What I do know is on average, you're FUCKED.
Here in my country, English is not what you'll hear in everyday conversation. Native English speakers account for a tiny percentage of the population. Our language doesn't resemble English at all. However, English is a required subject in our mandatory education system. I believe this situation is quite typical across many Asian countries.
As you might imagine, most English teachers in public schools are not native speakers. And they, just like other language learners, make mistakes that native speakers won't make without even realizing what's wrong. This creates a cycle enforcing non-standard English pragmatics in the classroom.
Teachers are not to blame. Becoming fluent and proficient enough in a second language to handle questions students spontaneously throw to you takes years, if not decades of immersion. It's an unrealistic expectation for an average public school teacher.
The result is rich parents either send their kids to private schools or have extra classes taught by native speakers after school. Poorer but smart kids realize the education system is broken and learn their second language from Youtube.
-
What's my point?
When it comes to math/science, in my experience, the current LLMs act similarly to the teachers in public school mentioned above. And they're worse in history/economics. If you're familiar with the subject already, it's easy to spot LLM's errors and gather the useful bits from their blather. But if you're just a student, it can easily become a case of blind-leading-the-blind.
It doesn't make LLMs completely useless in learning (just like I won't call public school teachers 'completely useless', that's rude!). But I believe in the current form they should only play a rather minor role in the student's learning journey.
In my field there is also the moral/legal implications of generative AI.
Learning what is like that? MIT OpenCourseWare has been available for like 10 years with anything you could want to learn in college
Textbooks are all easily pirated
It mostly isn't. The point of a good learning process is to invest time in verifying "once" and then add the verified facts to the learning material, so that learners can spend that time learning the material instead of verifying everything again.
Learning to verify is also important, but it's a different skill that doesn't need to be practiced literally every time you learn something else.
Otherwise you significantly increase the costs of the learning process.
I use LLMs but only for things that I have a good understanding of.
Not underrated at all. Lots of people were happy to abandon Stack Overflow for this exact reason.
> Adding in a mode that doesn't just dump an answer but works to take you through the material step-by-step is magical
I'd be curious to know how much this really differs from just a custom, academically minded GPT with an appropriately tuned system prompt.
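For reference, the "tuned system prompt" approach is easy to sketch. Below is a minimal, hypothetical example using the OpenAI Python SDK; the prompt wording and model name are placeholders, not OpenAI's actual study-mode instructions:

    # Sketch of a prompt-only "tutor": nothing here is specific to study mode.
    # Assumes the openai package is installed and OPENAI_API_KEY is set.
    from openai import OpenAI

    client = OpenAI()

    TUTOR_PROMPT = (
        "You are a patient tutor. Never give the final answer outright. "
        "Ask one guiding question at a time, check the student's reasoning, "
        "and only reveal a full solution after they have attempted each step."
    )

    def ask_tutor(history, student_message, model="gpt-4o"):
        """Send the running conversation plus the new student message."""
        messages = [{"role": "system", "content": TUTOR_PROMPT}]
        messages += history
        messages.append({"role": "user", "content": student_message})
        response = client.chat.completions.create(model=model, messages=messages)
        return response.choices[0].message.content

    # e.g. reply = ask_tutor([], "Why does integration by parts work?")

Whatever study mode adds beyond a block of instructions like this is exactly the open question.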
Now in regards to LLMs, I use them almost every day, so does my team, and I also do a bit of postmortem and reflection on what was accomplished with them. So, skeptical in some regards, but certainly not behaving like a Luddite.
The main issue I have with all the proselytization about them is that I think people compare getting answers from an LLM to getting answers from Google circa 2022-present. Everyone became so used to just asking Google questions, and then Google started getting worse every year; we have pretty solid evidence that Google's results have deteriorated significantly over time. So I think that when people say the LLM is amazing for getting info, they're comparing it to a low baseline. Yeah, maybe the LLM's periodically incorrect answers are better than Google's, but are you sure they beat just RTFM'ing? (Obviously, it all depends on the inquiry.)
The second, related issue I have is that we are starting to see evidence that the LLM inspires more trust than it deserves due to its humanlike interface. I recently started to track how often Github Copilot gives me a bad or wrong answer, and it's at least 50% of the time. It "feels" great though because I can tell it that it's wrong, give it half the answer, and then it often completes the rest and is very polite and nice in the process. So is this really a productivity win or is it just good feels? There was a study posted on HN recently where they found the LLM actually decreases the productivity of an expert developer.
So I mean I'll continue to use this thing but I'll also continue to be a skeptic, and this also feels like kinda where my head was with Meta's social media products 10 years ago, before I eventually realized the best thing for my mental health was to delete all of them. I don't question the potential of the tech, but I do question the direction that Big Tech may take it, because they're literal repeat offenders at this point.
Fairly recent study on this: LLM's made developers slightly less productive, but the developers themselves felt more productive with them: https://www.theregister.com/2025/07/11/ai_code_tools_slow_do...
There is definitely this pain point that some people talk about (even in this thread), that "well, at least AI doesn't berate me or reject my answer for bureaucratic reasons". And I find that intriguing in a community like this. Even some extremely techy people (or especially?) just want, at best, to feel respected, or, at worst, to have their own notions confirmed by someone they deem "smart".
>I don't question the potential of the tech, but I do question the direction that Big Tech may take it, because they're literal repeat offenders at this point.
And that indeed is my biggest reservation here. Even if AI can do great things, I don't trust the incentive models OpenAI has. Instead of being a potential bastion of knowledge, it may become yet another vector for selling you ads and stealing your data. My BOTD is long gone now.
Besides, there isn't any of the usual privacy drawback, because no one cares if OpenAI learns about some bullshit you were told to learn.
You didn't see the Hacker News thread about the ChatGPT subpoena, did you? I was a bit shocked that 1) a tech community didn't think a company would store data you submit to its servers and 2) that they felt like some lawyers and judges reading their chat logs was some intimate invasion of privacy.
Let's just say I certainly cannot be arsed to read anyone else's stream of consciousness without being paid like a lawyer. I deal with kids and it's a bit cute when they babble about semi-coherent topics. An adult clearly loses that cute appeal and just sounds like a madman.
That's not even a dig; I sure suck at explaining my mindspace too. It's a genuinely hard skill to convert thoughts into interesting, or even sensible, communication.
On the flip side, I prefer the human touch of the Kotlin, Python, and Elixir channels.
That trained and sharpened invaluable skills involving critical thinking and grit.
Here's what Socrates had to say about the invention of writing.
> "For this invention will produce forgetfulness in the minds of those who learn to use it, because they will not practice their memory. Their trust in writing, produced by external characters which are no part of themselves, will discourage the use of their own memory within them. You have invented an elixir not of memory, but of reminding; and you offer your pupils the appearance of wisdom, not true wisdom, for they will read many things without instruction and will therefore seem [275b] to know many things, when they are for the most part ignorant and hard to get along with, since they are not wise, but only appear wise."
https://www.historyofinformation.com/detail.php?id=3439
I mean, he wasn't wrong! But nonetheless I think most of us communicating on an online forum would probably prefer not to go back to a world without writing. :)
You could say similar things about the internet (getting your ass to the library taught the importance of learning), calculators (you'll be worse at doing arithmetic in your head), pencil erasers (https://www.theguardian.com/commentisfree/2015/may/28/pencil...), you name it.
What social value is an AI chatbot giving to us here, though?
>You could say similar things about the internet (getting your ass to the library taught the importance of learning)
Yes, and as we speak countries are determining how to handle the advent of social media as a centralized means of propaganda, an abuse vector, and a general way to disconnect local communities. It clearly has a different magnitude of impact than etching on a stone tablet. The UK made a particularly controversial decision recently.
I see AI more in that camp than in the one of pencil erasers.
Researching online properly requires cross-referencing, seeing different approaches, and understanding the various strengths, weaknesses, and biases among such sources.
And that's for objective information, like math and science. I think Grok's uhh... "update" showed enough of the dangers of resorting to a billionaire-controlled oracle as an authoritative resource.
>Will some (most?) people rely on it lazily without using it effectively? Certainly, and this technology won't help or hinder them any more than a good old fashioned textbook.
I don't think facilitating bad habits like lazy study is an effective argument. And I don't really subscribe to this inevitability angle either: https://tomrenner.com/posts/llm-inevitabilism/
Even more important for me, as someone who did ask questions but less and less over time, is this: with GPTs I no longer have to see the passive-aggressive banner saying
> This question exists for historical reasons, not because it's a good question.
all the time on other people's questions, typically on the best questions with the most useful answers.
As much as I have mixed feelings about where AI is heading, I’ll say this: I’m genuinely relieved I don’t need to rely on Stack Overflow anymore.
It is also deeply ironic how Stack Overflow alienated a lot of users in the name of inclusion (the Monica case) while all along they themselves were the ones who really made people like me uncomfortable.
Furthermore, the forgetting curve is a thing, and having to piece information together repeatedly, preferably in a structured manner, leads to much better retention. People love to claim how fast they are "learning" (more like consuming TikToks) from podcasts at 2x speed and LLMs, but they can't recite whatever was presented a few hours later.
Third, there was a paper circulating even here on HN that showed that use of LLMs literally hinders brain activation.
High-IQ enough that they really find holes in the capabilities of LLMs in their industries,
low-EQ enough that they interpret it only through their own experiences instead of seeing how other people's quality of life has improved.
Knowing myself, it was perhaps not so bad that I didn't have such tools; it depends on the topic. I couldn't imagine writing a thesis without an LLM anymore.
I've learnt a great many things online, but I've also learnt a great many more from books, other people and my own experience. You just have to be selective. Some online tutorials are excellent, for example the Golang and Rust tutorials. But for other things books are better.
What you are missing is the people. We used to have IRC and forums where you could discuss things in great depth. Now that's gone, and with the web owned by big tech and governments, you're happy to accept a bot instead. It's sad, really.
Also, using OpenAI as a tutor means trawling through incorrect content.
In my experience, most educational resources are either slightly too basic or slightly too advanced, particularly when you're trying to understand some new and unfamiliar concept. Lecturers, Youtubers and textbook authors have to make something that works for everybody, which means they might omit information you don't yet know while teaching you things you already understand. This is where LLMs shine, if there's a particular gap in your knowledge, LLMs can help you fill it, getting you unstuck.
Wonder what the compensation for this invaluable contribution was
But even with this feature in this very early state, it seems quite useful. I dropped in some slides from a class and pretended to be a student, and it handled questions reasonably. Right now it seems I will be happy for my students to use this.
Taking a wider perspective, I think it is a good sign that OpenAI is culturally capable of making a high-friction product that challenges and frustrates, yet benefits, the user. Hopefully this can help with the broader problem of sycophancy.
Importantly, these were _not_ critical questions that I was incorporating into any decision-making, so I wasn't having to double-check the AI's answers, which would make it tedious; but it's a great tool for satisfying curiosity.
> Under the hood, study mode is powered by custom system instructions we’ve written in collaboration with teachers, scientists, and pedagogy experts to reflect a core set of behaviors that support deeper learning including: encouraging active participation, managing cognitive load, proactively developing metacognition and self reflection, fostering curiosity, and providing actionable and supportive feedback.
I'm calling bullshit, show me the experts, I want to see that any qualified humans actually participated in this. I think they did their "collaboration" in ChatGPT which spit out this list.
Studying for a particular course or subject may be the LLM's second killer application, and OpenAI's ChatGPT is now providing that service too. Probably not the pioneer, but with this announcement most probably one of the significant providers. If, in the near future, GenAI study assistants can adopt and adapt 3Blue1Brown's approach of more visualization, animation, and interactive learning, they will be far more intuitive and engaging.
Please check this excellent LLM-RAG AI-driven course assistant at UIUC for an example of a university course [1]. It provides citations and references, mainly to the course notes, so students can verify the answers and study the course materials further.
[1] AI-driven chat assistant for the ECE 120 course at UIUC (only 1 comment, by the website creator)
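For what it's worth, the retrieval-plus-citation pattern such assistants use is simple to sketch. The following is a generic, minimal illustration (not the UIUC implementation), using TF-IDF in place of real embeddings and invented note filenames:

    # Sketch of the retrieval step behind a course-notes assistant.
    # Production systems use embeddings; TF-IDF keeps this self-contained.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    notes = {  # placeholder course notes
        "lec01.md": "Combinational logic: gates, truth tables, Boolean algebra.",
        "lec02.md": "Sequential logic: flip-flops, clocks, finite state machines.",
    }

    def retrieve(question, k=2):
        """Return the k note files most similar to the question."""
        names = list(notes)
        matrix = TfidfVectorizer().fit_transform([notes[n] for n in names] + [question])
        scores = cosine_similarity(matrix[-1], matrix[:-1]).ravel()
        return sorted(zip(names, scores), key=lambda x: x[1], reverse=True)[:k]

    def build_prompt(question):
        """Assemble a prompt that asks the model to cite the notes it used."""
        context = "\n".join(f"[{name}] {notes[name]}" for name, _ in retrieve(question))
        return ("Answer using only the notes below, and cite the [file] you used.\n"
                + context + "\n\nQuestion: " + question)

The citations are what make the answers checkable against the actual course material.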
Having experience teaching the subject myself, what I saw on that page is about the first five minutes of the first class of the semester at best. The devil will very much be in the other 99% of what you do.
human: damn kids are using this to cheat in school
openai: release an "app"/prompt that seems really close to solving this stated problem
kids: I never wanted to learn anything, I just want to do bare minimum to get my degree, let my parents think they are helping my future, and then i can get back to ripping that bong
<world continues slide into dunce based oblivion>
Whatever the problem statement, it seems an 80%-or-less solution can be made, and rather quickly. A huge percentage of the population judges technology solutions as "good enough" at a far lower bar than they should. This is even roping in people who used to hold a higher standard of "rigorous correctness", because they keep thinking, "damn, just a bit more work and it will get infinitely better; let's create the biggest economic house of cards this world will ever collapse under".
Sure, it was crafted by educational experts, but this is not a feature! It's a glorified constant!
Then I tried to migrate it to ChatGPT to try this thing out, but it seems like it's just prompt engineering behind it. Nothing fancy.
And this study mode isn't even available in ChatGPT Projects, which students need for adding coursework, notes, and transcripts.
Honestly, just release gpt-5!!!
If LLMs continue to improve, we are going to be learning a lot from them, they will be our internet search and our teachers. If we want to retain some knowledge for ourselves, then we are going to need to learn and memorize things for ourselves.
Integrating spaced repetition could make it explicit which things we want to offload to the LLM and which things we want to internalize. For example, maybe I use Python a lot and occasionally use Perl, so I explicitly choose to memorize some Python APIs, but I'm happy to just ask the LLM for reminders whenever I use Perl. So I ask the LLM to set up some spaced repetition whenever it teaches me something new about Python, etc.
The spaced repetition could be done with voice during a drive or something. The LLM would ask the questions for review, judge how well we did in answering, and then rely on the spaced-repetition algorithm to keep track of when to review next.
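As a rough sketch of that last piece, here is the classic SM-2 scheduling rule (the algorithm Anki descends from) in Python; in this scenario the LLM's only job is to turn your spoken answer into the 0-5 quality grade:

    # SM-2 spaced-repetition scheduling: decides how many days until the
    # next review of a card, given a grade from 0 (blackout) to 5 (perfect).
    from dataclasses import dataclass

    @dataclass
    class Card:
        question: str
        answer: str
        easiness: float = 2.5   # never allowed to drop below 1.3
        repetitions: int = 0    # consecutive successful reviews
        interval: int = 1       # days until the next review

    def review(card, quality):
        if quality < 3:                      # failed recall: start over
            card.repetitions = 0
            card.interval = 1
        else:
            if card.repetitions == 0:
                card.interval = 1
            elif card.repetitions == 1:
                card.interval = 6
            else:
                card.interval = round(card.interval * card.easiness)
            card.repetitions += 1
        card.easiness = max(1.3, card.easiness
                            + 0.1 - (5 - quality) * (0.08 + (5 - quality) * 0.02))
        return card

The interesting part is exactly the bit described above: the LLM turning a free-form spoken answer into that single quality grade.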
https://arxiv.org/abs/2409.15981
It is definitely a great use case for LLMs, and it challenges the assumption that LLMs can only "increase brain rot", so to speak.
So much so that the first method would take me an hour, as opposed to an entire evening of reading/repeating.
Having such a tool would have been a game changer to me.
I don't know, though, whether it's possible to throw an entire chapter of a textbook at it.
What these really need IMO is an integration where they generate just a few Anki flashcards per session, or even multiple-choice quizzes that you can then review with spaced repetition (a rough sketch of the idea below). I've been doing this manually, but having it integrated would remove another hurdle.
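A minimal sketch of what that could look like, assuming a hypothetical llm_generate_cards() helper that turns the session transcript into (front, back) pairs; Anki can import the resulting tab-separated file directly:

    # Write LLM-generated Q/A pairs from a study session to a TSV file that
    # Anki's import dialog accepts (one card per line: front <tab> back).
    import csv

    def export_cards(cards, path="session_cards.tsv"):
        """cards: iterable of (front, back) tuples, e.g. from an LLM call."""
        with open(path, "w", newline="", encoding="utf-8") as f:
            writer = csv.writer(f, delimiter="\t")
            for front, back in cards:
                writer.writerow([front, back])

    # cards = llm_generate_cards(session_transcript)  # hypothetical helper
    export_cards([("What does the forgetting curve describe?",
                   "How retention decays over time without review.")])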
On the other hand, I'm unsure whether we're training ourselves to be lazy with even this, in the sense of "brain atrophy" that's been talked about regarding LLMs. Where I used to need to pull information from several sources and synthesize my own answer by transferring several related topics onto mine, now I get everything pre-chewed, even if in the form of a tutor.
Does anyone know how this is handled with human tutors? Is it just that the time is limited with the human so you by necessity still do some of the "crawl-it-yourself" style?
The internet, Wikipedia, SO, etc.: all these things had the EXACT same arguments against them, and guess what? People who want to use TOOLS that help them study better will gain, and people who are lazy will... be worse off, as it has always been.
I don't know why I bother to engage in these threads except to offer my paltry 2 cents. For being such a tech-minded and forward-thinking community, there's almost this knee-jerk reaction against ANYTHING LLM (which I suppose I understand). A lot of us are missing the forest for the trees here.
Happy Tuesday!
There's a lot of specificity that AI can give over human instruction; however, it still suffers from a lack of rigor and true understanding. If you follow well-trodden paths it's better, but that negates the benefit.
The future is bright for education though.
Sure, for some people it will be insanely good: you can ask questions as stupid as you need without feeling judged, go deeper into specific topics, discuss certain things, skip some easy parts, etc.
But we are talking about averages. In the past we thought that the collective human knowledge available via the Internet would allow everyone to learn. I think it is fair to say that it didn't change much in the grand scheme of things.
(Joke/criticism intended)