So, in learning environments we might have no option but to open the floodgates to AI use while abandoning most testing techniques that are not, more or less, pen-and-paper and in-person. Use AI as much as you want, but know that as a student you'll be answering tests armed only with your brain.
I do pity English teachers who have relied on essays to grade proficiency for hundreds of years. STEM fields have an easier way through this.
Andrej and Garry Trudeau are in agreement that "blue book exams" (i.e. the teacher gives you a blank exam booklet, traditionally blue, to fill out in person for the test, after confiscating devices) are the only way to assess students anymore.
My 7 year old hasn't figured out how to use any LLMs yet, but I'm sure the day will come very soon. I hope his school district is prepared. They recently instituted a district-wide "no phones" policy, which is a good first step.
I guess high schools and junior highs will have to adopt something similar, too. Better condition those wrists and fingers, kids :-)
It's a shame that some students will again be limited by how fast they can get their thoughts down on a piece of paper. This is such an artificial limitation and totally irrelevant to real world work now.
This sounds as if you expect that it will become possible to access an LLM in class without a phone or other similar device. (Of course, using a laptop would be easily noticed.)
All for a calculator that can lie.
I'd be much more in favour of oral examinations. Yes, they're more resource-intensive than grading written booklets, but it's not infeasible. Separately, I also hope it might go some way to lessening the attitude of "teaching to the test".
Maybe this is a case for "learning styles", but it's probably logistically prohibitive to offer both options.
1. Corporate interests want to sell product
2. Administrators want a product they can use
3. Compliance people want a checkbox they can check
4. Teachers want to be able to continue what they have been doing thus far within the existing ecosystem
5. Parents either don't know, don't care, or do care but are unable to provide a viable alternative, or can and do provide one
We have had this conversation (although without the AI component) before. None of it is really secret. The question is really what is the actual goal. Right now, in the US, education is mostly in name only -- unless you are involved (which already means you are taking steps to correct it) or are in the right zip code (which is not a guarantee, but it makes your kids' odds better).
This assumes we even need more Terence Taos by the time these kids are old enough. AI has gone from being completely useless to solving challenging math problems in less than 5 years. That trajectory doesn't give me much hope that education will matter at all in a few years.
Putting aside the ludicrous confidence score, the student's question was: how could his sister convince the teacher she had actually written the essay herself? My only suggestion was for her to ask the teacher to sit down with her and have a 30-60 minute oral discussion on the essay so she could demonstrate she in fact knew the material. It's a dilemma that an increasing number of honest students will face, unfortunately.
Just speaking in general here -- I don't know what specific phrasing TurnItIn uses.
False positives with technology that is non-deterministic are guaranteed.
It's more than slightly comedic people being amazed when LLM math works as it's created to.
My point is that accuracy is a terrible metric here and sensitivity, specificity tell us much more relevant information to the task at hand. In that formulation, a specificity < 1 is going to have false positives and it isn't fair to those students to have to prove their innocence.
If we're being literal, accuracy is (number correct guesses) / (total number of guesses). Maybe the folks at turnitin don't actually mean 'accuracy', but if they're selling an AI/ML product they should at least know their metrics.
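To make the accuracy-versus-specificity point concrete, here is a minimal sketch. Every number in it is a made-up assumption for illustration (1,000 essays, 100 of them AI-written, and hypothetical detector rates); none of it comes from Turnitin or any real detector.

```python
def detector_metrics(tp, fp, tn, fn):
    """Return accuracy, sensitivity and specificity from confusion-matrix counts."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,      # correct guesses / total guesses
        "sensitivity": tp / (tp + fn),      # share of cheaters that get flagged
        "specificity": tn / (tn + fp),      # share of honest students left alone
    }

# Hypothetical class: 100 AI-written essays, 900 honest ones.
# Assumed detector: catches 90 of the 100, wrongly flags 18 of the 900.
m = detector_metrics(tp=90, fp=18, tn=882, fn=10)

# Of everyone flagged, what fraction did nothing wrong?
innocent_share_of_flags = 18 / (90 + 18)

print(m)
print(round(innocent_share_of_flags, 3))
```

Under these assumed numbers the detector is "97% accurate" and still accuses 18 honest students, roughly one in six of everyone it flags. That is why a headline accuracy figure says little about how often an accusation is wrong.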
What happened is that I did a Q&A worksheet but in each section of my report I reiterated the question in italics before answering it.
The reiterated questions of course came up as 100% plagiarism because they were just copied from the worksheet.
Wow I'd have been screwed, so many of my high school papers were just rewrites and improvements on stuff I wrote in earlier years.
All it takes is one moron with power and a poor understanding of statistics.
It's shit software for schools and teachers to cover their ass. Nothing more, and deserves no more attention.
The professor noticed it (presumably via seeing poor "show your work") and gave zero points on the question to everyone. And once you went to complain about your grade, she would ask you to explain the answer there in her office and work through the problem live.
I thought it was a clever and graceful way to deal with it.
Tests were created to save money, more students per teacher, we're just going back to the older, actually useful, method of talking to people to see if they understand what they've been taught.
You weren't asked to write an essay because someone wanted to read your essay, only to intuit that you've understood something.
I'm skeptical. Tests are a way of standardizing the curriculum and objectively determining if the lessons were learned.
The lesson of how to swim sometimes only comes in applying the learning.
Some tests require memorized knowledge, like what is the stall speed of your airplane. Some tests require reasoning skills, like what is the stress in this beam.
There are learning frameworks that explain it all well enough.
Learning what something is, vs applying what you learned has different terminology in the learning world, but not everyone might use that.
Personally I don't believe that any of the problems caused by AI are going to be solved by "more AI"
People do learn how to use the web, social media, mobile devices to ultimately work for them or against them.
How is this working out in practice? Every piece of technology is absolutely adversarial nowadays and people are getting ground to bits by it.
The world at large rarely accommodates shy people. Coping skills are essential, even if they are unpleasant.
They learned that cheating gives advantage to the cheating individual. They also learned that reporting cheating harms them and non cheaters.
In Germany, the traditional sharp-tongued answer of pupils to the question "How could both of you get the exact same WRONG answer (in the test)?" is: "Well, we both have the same teacher." :-)
How the heck is that even possible? :o
Because what the cheater is trying to accomplish is to avoid having to think.
It's an act motivated by either laziness, apathy or rebellion (or some combination thereof). Not motivated by trying to get a good grade.
He just goes to our local public elementary school.
Of course, when he asked for help with his practice problems I had no idea how they were meant to solve it so I taught him to solve it algebraically.
Knowing the way a lot of professors act, I'm not surprised, but it's always disheartening to see how many behave like petty tyrants who are happy to throw around their power over the young.
Since high school, the expectation is that you show your work. I remember my high school calculus teacher didn't even LOOK at the final answer - only the work.
The nice thing was that if you made a trivial mistake, like adding 2 + 2 = 5, you got 95% of the credit. It worked out to be massively beneficial for students.
The same thing continued in programming classes. We wrote our programs on paper. The teacher didn't compile anything. They didn't care much if you missed a semicolon, or called a library function by a wrong name. They cared if the overall structure and algorithms were correct. It was all analyzed statically.
1. they skip what are to them the obvious steps (we all do as we achieve mastery) and then get penalized for not showing their work.
2. they inherently know and understand the task but not the mechanized minutiae. Think of learning a new language. A diligent student can work through the problem and complete an a->b translation, then go the other way, and repeat. Someone with mastery doesn't do this; they think within one language and then only pass the contextual meaning back and forth when explicitly required.
"showing your work" is really the same thing as "explain how you think" and may be great for basics in learning, but also faces levels of abstraction as you ascend towards mastery.
Unless you're 100% sure that a student cheated, you don't punish them. And you don't ask them to prove they're innocent.
Because the teacher was knowingly giving zeroes to students who didn't cheat, and expecting them to take it upon themselves to reverse this injustice.
The teacher effectively filtered out the shy boys/girls who are not brave enough to "hustle." Gracefully.
The time spent challenging exam grades is usually better spent studying for the next exam. I've never gotten a significant grade improvement from it.
She didn't ask them to challenge them, she asked them additional questions. The test already asks them questions.
If you are really shy, a culture where no one cheats is far better because your actual ability and intelligence shines through
Cheaters and non cheaters were punished in exactly the same way. Effectively cheating gave you an advantage and being shy gave you disadvantage.
This has nothing to do with American Hustle culture and just with that professor's judgment.
> the final exam where somehow people got their hands on the hardest question of the exam.
They got the question but not the answer so they had to work it out before the test. They couldn't explain it later?
In general, I don’t really understand educators hyperventilating about LLM use. If you can’t tell what your students are independently capable of and are merely asking them to spit back content at you, you’re not doing a good job.
Sounds as though you do understand it.
If it looks like AI cheating software will be a problem for my children (and currently it has not been an issue), then I'm considering recording them doing all of their homework.
I suspect school admin only has so much appetite for dealing with an irate parent demanding a real time review of 10 hours of video evidence showing no AI cheating.
We are already (in the US) living in a system of soft social-credit scores administered by ad tech firms and non-profits. So “the algorithms says you’re guilty” has already been happening in less dramatic ways.
https://news.ycombinator.com/item?id=14285116 ("Justice.exe: Bias in Algorithmic sentencing (justiceexe.com)")
https://news.ycombinator.com/item?id=43649811 ("Louisiana prison board uses algorithms to determine eligibility for parole (propublica.org)")
https://news.ycombinator.com/item?id=11753805 ("Machine Bias (propublica.org)")
> language models are more likely to suggest that speakers of [African American English] be assigned less-prestigious jobs, be convicted of crimes and be sentenced to death.
This one is just so extra insidious to me, because it can happen even when a well-meaning human has already "sanitized" overt references to race/ethnicity, because the model is just that good at learning (bad but real) signals in the source data.
The only thing that prevents them from doing so is the fact that Google is too big to sell a "plagiarism assistant."
- Write it in google docs, and share the edit history in the google docs, it is date and time stamped.
- Make a video of writing it in the google docs tab.
If this is available, and sufficient, I would pursue a written apology to remind the future detectors.
Edit: clarity
It should be way easier than TSA's goal because you don't need to stop cheaters. You instead just need to ensure that you seed skills into a minimal number of achievers so that the rest of the kids see what the real target of education looks like. Kids try their best not to learn, but when the need kicks in they learn way better spontaneously from their peers than any other method.
Of course, this all assumes an effective pre-K reading program in the first place.
Often it is more work to cheat than just learn it.
Pre-k is preschool aka kindergarten?
Is this really needed? It's really stressful for kids under 5 or 6 to read and is there a big enough statistical difference in outcome enough to rob them of some of their early youth?
I started reading around 6 years old and I was probably ahead of the vast majority of kids within 6 months.
Kids starting around 6 years old have much better focus and also greatly enhanced mental abilities overall.
If this is insufficient, then there are tools specifically for education contexts that track student writing process.
Detecting the whole essay being copied and pasted from an outside source is trivial. Detecting artificial typing patterns is a little more tricky, but also feasible. These methods dramatically increase the effort required to get away with having AI do the work for you, which diminishes the benefit of the shortcut and influences more students to do the work themselves. It also protects the honest students from false positives.
Keystroke dynamics can detect artificial typing patterns (copying another source by typing it out manually). If a student has to go way out of their way to make their behavior appear authentic, then it decreases the advantage of cheating and fewer students will do it.
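As a rough illustration of the two signals mentioned above (a whole essay pasted in, and an unnaturally even typing rhythm), here is a toy sketch. The event format, the function name `flag_writing_session`, and both thresholds are assumptions invented for this example; real process-tracking tools are proprietary and I'm not claiming they work this way.

```python
from statistics import mean, pstdev

def flag_writing_session(events, paste_threshold=200, min_cv=0.3):
    """events: ordered list of (timestamp_seconds, chars_inserted) tuples.

    Returns a list of flags. Both thresholds are arbitrary placeholders.
    """
    flags = []

    # Signal 1: a single insertion far larger than anyone can type at once.
    if any(chars >= paste_threshold for _, chars in events):
        flags.append("large_paste")

    # Signal 2: gaps between single keystrokes that barely vary.
    # Human typing is bursty, so a low coefficient of variation is suspicious.
    key_times = [t for t, chars in events if chars == 1]
    gaps = [b - a for a, b in zip(key_times, key_times[1:])]
    if len(gaps) >= 2 and mean(gaps) > 0:
        cv = pstdev(gaps) / mean(gaps)
        if cv < min_cv:
            flags.append("uniform_rhythm")

    return flags

pasted = [(0.0, 1), (1.0, 3000)]                       # one huge insertion
metronome = [(i * 0.2, 1) for i in range(50)]          # perfectly even keystrokes
bursty = [(t, 1) for t in [0, 0.1, 0.6, 0.75, 2.75, 2.95, 3.85]]

print(flag_writing_session(pasted))
print(flag_writing_session(metronome))
print(flag_writing_session(bursty))
```

A flag here is only a prompt for a human to look closer. An honest student who drafts elsewhere and pastes the result would trip the first check, which is exactly the false-positive problem discussed elsewhere in this thread.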
If the student is integrating answers from multiple AI responses then maybe that's a good thing for them to be learning and the assessment should allow it.
The best solutions are in student motivations and optimal pedagogical design. Students who want to learn, and learning systems that are optimized for rate of learning.
Online programs, limited infrastructure, and dishonest students exploiting accessibility programs are some examples where what you're suggesting is easier said than done.
Also AI can help students cheat in class too. Smart glasses, pens with cameras and LED screens on them (yes really), or just regular smart phones. Even switching to pen and paper won't reduce the ease of access.
Instructors don't want to police cheating, they want to teach (or do research). Either way, they don't want to police.
Students cheat when they think what they're learning is low value, the learning process is too clunky, or they place too high a value on the grade. All these imbalances can be improved with better pedagogy.
The only enduring way to actually solve the cheating crisis isn't to make it harder, it's to reduce the value of cheating. Everything else is either temporary or performative.
Manually re-typing another source is something these tools were originally designed to detect. The original issue was "essay mills", not AI.
I guess you could use AI to guide this, at which point it's basically a research tool and grammar checker.
Crude tools (like Google docs revision history) can protect an honest student who engages in a typical editing process from false allegations, but it can also protect a dishonest student who fabricated the evidence, and fail to protect an honest student who didn't do any substantial editing.
More sophisticated tools can do a better job of untangling the fractal, but as with fractal shaped problems the layers of complexity keep going and there's no perfect solutions, just tools that help in some situations when used by competent users.
The higher Ed professors who really care about academic integrity are rare, but they are layering many technical and logistical solutions to fight back against the dishonest students.
I guess some people can type out a 5,000 word assignment linearly from start to finish in 2 hours at 40wpm but that's both incredibly rare and easy to verify upon further investigation.
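The arithmetic behind that figure is easy to check. This is just the division, with the word count and typing speed taken from the comment above and an assumption of nonstop typing with no pauses or edits:

```python
def minutes_to_type(words, words_per_minute):
    """Time needed to type `words` at a constant speed, with no pauses or edits."""
    return words / words_per_minute

minutes = minutes_to_type(5000, 40)
hours, leftover_minutes = divmod(minutes, 60)

print(minutes)                    # 125.0
print(hours, leftover_minutes)    # 2.0 5.0
```

So 5,000 words at 40 wpm is about 2 hours 5 minutes of uninterrupted typing, before any thinking, rereading, or revising.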
It's only an obvious choice if you have total faith that your teacher will be fair, which you might doubt if the situation starts with "You're a cheater unless you prove me otherwise". In the worst-case scenario you'll be grilled for one hour and still be marked as a cheater because you didn't convince the teacher.
Once this becomes routine the class can become e.g. 10 minutes conversation on yesterday's topic, 40 minutes lecturing and live exercises again. Which is really just reinventing the "daily quiz" approach, but again the thing we are trying to optimize for is compliance.
Honestly, students should have a course in "how the justice system works" (or at least should work). So should the teachers.
Student unions and similar entities should exist and be ready to intervene to help students in such situations.
This is nothing new, AI will just make this happen more often, revealing how stupid so many teachers are. But when someone spent thousands for a tool, which purports to be reliable, and is so quick to use, how can an average person resist it? The teacher is as lazy as the cheaters they intend to catch.
We learned how government and justice worked.
The only way to reliably prevent the use of AI tools without punishing innocent students is to monitor the students while they work.
Schools can either do that by having essays be written on premise, either by hand or by using computers managed by the school.
But students that are worried that they will be targeted can also do this themselves, by setting up their phone to film them while working.
And if they do this, and the teacher tries to punish someone who can prove they wrote the essay themselves, either the teacher or the school should hopefully learn that such tools can't be trusted.
And to add to that, there should be a justice system there. The idea of due process is laughable in most educational settings.
Bizarre and unfair
https://decrypt.co/286121/ai-detectors-fail-reliability-risk...
I could see why he didn’t, so I wasn’t offended or defensive and started to tell him the steps required to build web apps and explained it in a manner he could understand using analogies. Towards the end of our conversation he could see I both knew about the topic and was enthusiastic about it. I think he was still a bit shocked that I wrote that paper, but he could tell from the way I talked about it that it was authentic.
It will be interesting to see how these situations evolve as AI gets even better. I suspect assessment will be more manual and in-person.
To wit, show the teacher that YOU did the work and not someone else. If the teacher is not willing to do this with every student they accuse of malfeasance, they need to find another job. They're lazy as hell and suck at teaching.
Computer, show "my" work and explain to the teacher why "I" wrote what "I" did, describe why that particular approach to the narrative appealed to "me" and "I" chose that as the basis of "my" work. Produce an outline on which the paper could have been based and possible rough drafts, then explain how I could have revised the work to produce the final result.
This sounds like, a good solution? It’s the exception case, so shouldn’t be constant (false positives), although I suppose this fails if everyone cheats and everyone wants to claim innocence.
I guess we could go back to giving exams soviet Russia style where you get a couple of questions that you have to answer orally in front of the whole class and that’s your grade. Not fun…
You can’t keep hiding behind being an introvert your whole life.
2. Speaking about your work in front of 1-2-5 people is one thing, but being tested in front of an entire class (30 people?) is a totally different thing.
It’s not that different from speaking up in your neighborhood group.
Speaking in front of people and advocating for yourself is a core human skill. If you don’t practice it you’re setting yourself up for failure.
I would even say it has nothing to do with introverts and extroverts.
In this particular resolution example, it would be quicker to ask the student some probing questions versus have them re-write (and potentially regurgitate) an essay.
For exams you’d need a proctored environment of some sort, say a row of conference booths so students can’t just bring notes.
You’d want to have some system for ephemeral recording so the teachers can do a risk-based audit and sample some %, eg one-two questions from each student.
Honestly for regular weekly assignments you might not even need the heavyweight proctoring and could maybe allow notes, since you can tell if someone knows what they are talking about in conversation; it’s impossible to crib-sheet your way to fluent conversational understanding.
So why is the issue you described an issue? Because it's about a grade. And the reason that's relevant is because that credential will then be used to determine where she can go to university which, in turn, is a credential that will determine her breadth of options for starting her career, and so on. But why is this all done by credentials instead of simple demonstrations of skill? What somebody scored in a high school writing class should matter far less than the output somebody is capable of producing when given a prompt and an hour in a closed setting. This is how you used to apply to colleges. Here [1], for instance, is Harvard's exam from 1869. If you pass it, you're in. Simple as that.
Obviously this creates a problem of institutions starting to 'teach the test', but with sufficiently broad testing I don't see this as a problem. If a writing class can teach somebody to write a compelling essay based on an arbitrary prompt, then that was simply a good writing class! As an aside this would also add a major selling point to all of the top universities that offer free educational courses online. Right now I think 'normal' people are mostly disinterested in those because of the lack of widely accepted credentials, which is just so backwards - people are actively seeking to maximize credentials over maximizing learning.
This is one of the very few places I think big tech in the US has done a great job. Coding interviews can be justifiably critiqued in many ways, but it's still a much better system than raw credentialization.
[1] - https://graphics8.nytimes.com/packages/pdf/education/harvard...
Just so we're clear, the coding tests are in addition to credentialisation. I'll never forget when I worked at Big Tech (from Ireland) and I would constantly hear recruiters talk about the OK school list (basically the Ivy league). Additionally, I remember having to check the University a candidate had attended before she had an interview with one of our directors.
He was fine with her, because she had gone to Oxford. Honestly, I'm surprised that I was able to get hired there given all this nonsense.
I'm a drop out (didn't finish BSc) from a no name Northern European university and I've worked at or gotten offers from:
- Meta
- Amazon
- Microsoft
- Uber
- xAI
+ some unicorns that compete with FAANG+ locally.
I didn't include some others that have reached out for interviews which I declined at the time. The lack of a degree has literally never come up for me.
It seems to be a US role thing in my experience.
In a way, I think the hiring process at second-tier (not FAANG) companies is actually better because you have to "moneyball" a little bit - you know that you're going to lose the most-credentialed people to other companies that can beat you dollar for dollar, so you actually have to think a little more deeply about what a role really needs to find the right person.
Sure, but it takes < 1 second to read a GPA.
We need some way to distill the unbelievable amount of data in human brains into something that can be processed in a reasonable amount of time. We need a measurement - a degree, a GPA, something.
Imagine if in every job interview they could assume absolutely nothing. They know nothing about your education. They might start by asking you to recite your ABCs and then, finally at sunset, you might get to a coding exam. Which still won't work, because you'll just AI cheat the coding exam.
We require gatekeepers to make the system work. If we allow the gatekeepers to just rubber stamp based off of if stuff seems correct, that tells us nothing about the person itself. We want the measurement to get close to the real understanding.
That means AI papers have to be given a 0, which means we need to know if something is AI generated. And we want to catch this at the education level, not above.
But assuming in-person day long batteries of tests for universities and companies is probably not very practical.
You can argue whether university is a very efficient use of time or money but it presumably does involve some learning and offers potential employers some level of a filter that roughly aligns with what they're looking for.
We should expect this if employers can efficiently and objectively evaluate a candidate's skills without relying on credentials. When they're unable to, we should worry about this information asymmetry leading to a "market for lemons" [0]. I found an article [1] about how this could play out:
> This scenario leads to a clear case of information asymmetry since only the graduate knows whether their degree reflects real proficiency, while employers have no reliable way to verify this. This mirrors the classic “Market for Lemons” concept introduced by economist George Akerlof in 1970, where the presence of low-quality goods (or in this case, under-skilled graduates) drives down the perceived value of all goods, due to a lack of trustworthy signals.
[0] https://quickonomics.com/terms/market-for-lemons/
[1] https://competitiveness.in/how-ai-could-exacerbate-the-skill...
I mean, it certainly seems to. I've been in hiring roles in tech for 20-ish years, and have definitely seen changes in college hiring patterns based on credential values. Some schools have gone way up in how much we value their credentials (Waterloo), some have gone somewhat down (MIT), etc.
It's also the only way that students can actually be held to the same standards. When I was a freshman in college with a 3.4 high school GPA, I was absolutely gobsmacked by how many kids with perfect >= 4.0 GPAs couldn't pass the simple algebra test that the university administered to all undergraduates as a prerequisite for taking any advanced mathematics course.
Goodhart's law.
In education, regarding exams, Goodhart's law just means that you should randomize your test questions instead of telling the students the questions before the exam. Have a wide set of questions, randomize them. The only way for students to pass is to learn the material.
A randomized standardized test is not more susceptible to Goodhart's law than a randomized personal test. The latter however has many additional problems.
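The "wide set of questions, randomize them" idea can be sketched in a few lines. The question bank, the function name `draw_exam`, and seeding by a student identifier are all assumptions made up for this example, not a description of any real testing system.

```python
import random

def draw_exam(question_bank, n_questions, seed):
    """Draw n distinct questions from the bank, reproducibly for a given seed."""
    rng = random.Random(seed)          # separate generator, so draws are repeatable
    return rng.sample(question_bank, n_questions)

# Hypothetical bank of 50 questions; each student gets 5 of them.
bank = [f"Q{i}" for i in range(1, 51)]
exam_a = draw_exam(bank, 5, seed="student-a")
exam_b = draw_exam(bank, 5, seed="student-b")

print(exam_a)
print(exam_b)
```

With 50 questions and 5 per exam there are over two million possible papers, so memorizing a leaked paper helps far less than learning the material the bank covers. Seeding per student also lets the examiner regenerate exactly which questions someone was given.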
"The only way for students to pass is to learn the material."
Part of Goodhart's law in this context is precisely that it overdetermines "the material" and there is no way around this.
I wish Goodhart's law was as easy to dodge as you think it is, but it isn't.
School needs to provide opportunities to practice applying important skills like empathy, tenacity, self-regulation, creativity, patience, collaboration, critical thinking, and others that cannot be assessed using a multiple choice quiz taken in silence. When funding is tied to performance on trivia, all of the above suffers.
They promise education, but really they give you tests and scores
And they predictin' prison population by who scoring the lowest
Run The Jewels - Walking In The Snow
> https://www.youtube.com/watch?v=6-M15L4BTqI
> https://genius.com/Run-the-jewels-walking-in-the-snow-lyrics
If you don't even know that the American Civil War ended in 1865, how could you do any meaningful analysis of its downstream implications or causes and its relationship to other events?
I'd imagine millions if not billions of people have found basic math useful without ever learning what "commutative" even means.
Also the teachers have a vested interest in giving the highest grades they can to as many students as they can without making it obvious that they aren't actually grading them fairly. I don't mean this as an accusation against anybody or some sort of insult against teachers as a whole; I merely mean to point out that this is what they are incentivized to do by virtue of the fact that they are indirectly grading themselves by grading their students.
Parents of the children who can't pass the standardized tests also get a vote.
I wish I would agree with you, but I think that having a degree (or rather the right degree) is more important than ever.
Basically grades exist to decide who gets a laid-back high-paying job, and who has to work 2 low-paying labor-intensive jobs just to live paycheck to paycheck.
As one teacher told me once: we could have all of you practice chess, make a big tournament and you get to choose your university based on your chess ranking. It wouldn't be any less stupid than the current system.
Rather:
Grades exist to decide who gets a stressful, but rather high-paying job, and who has to work 2 low-paying labor-intensive jobs just to live paycheck to paycheck.
Laid-back high-paying jobs (unfortunately) barely exist.
I mean, what is the problem? It's my report! I know all the ins and outs, I take full responsibility for it. I'm the one taking this to the board of directors who will grill me on all the details. I'm up for it. So why is this so "not done"? Why do you assume I let the AI do the "thinking"? I'm appalled by your lack of trust in me.
If no, why not?
Personally I would rather read a human's output than their selection of machine outputs.
Nowadays, often I put my text into the LLM, and say: Make more concise, include all original points, don't be enthusiastic, use business style writing. And then it will come with some lines of which I think: Yes! That is what I meant!
I can't imagine you'd rather read my Dunglish. Sure, I could have "studied harder", but one simply is just much more clever in their native tongue, I know more words, more subtleties etc. Over time, and I believe due to LLM use I do get better at it myself! It's a language model after all, not a facts model. I can trust it to make nice sentences.
I understand the sentiment, even appreciate it, but there are books that draw you into a story when your eyes hit the paper, and there are books that don't and induce yawning instead (on the same topic). That is a skill issue.
Perhaps I should add that using the LLM does not make me faster in any way, maybe even slower. But it makes the end results so much more pleasant.
"If I Had More Time, I Would Have Written a Shorter Letter". Now I can, but in similar time.
Recently there was a non-native English speaker heavily using an LLM to review their answers on a Show HN post, and it was incredibly annoying. The author did not realize it (because of their lack of skills in the language), but the AI-edited version felt fake and mechanical in tone. In that case yes, the broken original is better because it preserves the humanity of the original answers, mistakes and all.
You know maybe it is annoying for native speakers to pick up subtle AI signals, but for non-natives it can be annoying to find the correct words that express what you want to say as precisely as in your mother tongue. So don’t judge too much. It’s an attempt at better communication as well.
Perhaps it's an artifact of LLMs being trained on terabytes of autistic internet commenters like me. Maybe being detected as AI by Turnitin even has some diagnostic value.
One of the funniest things was being accused of plagiarising Wikipedia, when I'd actually written most of the Wikipedia article on said subject. The irony... Wikipedia doesn't just use unpaid labour, it ends up undermining the people who wrote it.
Surely it would be relatively easy to offer to show the edit history to prove that you actually contributed to the article? And, by doing so, would flip the situation in your favour by demonstrating your expertise?
The fact that you should have to is pretty annoying but also fairly edge case. And if a teacher or institute refuses to review that evidence then I don't think the credential on the table is worth the paper it's printed on anyway.
It turned out he ran it through a plagiarism detector and multiple lines of code were identical to lines in their database.
It was very silly because there’s a lot of boilerplate code in win32 projects.
Of course, there will be complaints from many students. However, as a prof for decades, I can say that some will prefer an exam-based solution. This includes the students who are working their way through university and don't have much time for busy-work, along with students who write their essays themselves and get lower grades than those who do not.
but if u talk like this boss i had, then obv ur a human, kthx
Great incentives. /s
She can't because she didn't write the essay herself, obviously.
This reminds me of when GPS routing devices first came onto the scene. Lots of people drove right into a lake or ocean because the device said keep going straight. (because of poorly classified multi-modal routing data)
The great thing about AI is that with a bit of imagination it can be used to amplify teachers too.
In this case, yes, you need to do a viva voce to convince the teacher (though I suspect they should be able to get fairly confident in 10-15 minutes).
But you could also have students convince an AI (probably in a proctored space?) if you need to scale this approach out.
I think AI gave me some brain rot: I worry about finishing stuff on time, and I can't bear to spend brain energy on it (and end up spending it anyway because AI sucks).
He also told me that he had in fact used AI, but asked AI multiple times to simplify the text, and he had entered the simplified version. He liked the first version best, but was aware his teacher would consider it written by AI.
Guess the teachers have already lost...
"AI detection" wasn't even a solution in the short term and it won't be going forward. Take-home essays are dead, the teachers are collectively just hoping some superhero will swoop in and somehow save them. Sometimes such a thing is possible, but it isn't going to happen this time.
Bart Simpson, we need you.
It is becoming an awful situation where these companies are selling tools that undermine the student. Education is supposed to be the great equalizer in society, not another toggle or tool for oppression.
Zero homework grades will be ideal. Looking forward to this.
1. Assume printing press exists
2. Now there's no need for a teacher to stand up and deliver information by talking to a class for 60 mins
3. Therefore students can read at home (or watch prepared videos) and test their learning in class where there's experts to support them
4. Given we only need 1 copy of the book/video/interactive demo, we can spend wayyyyy more money making it the best it can possibly be
What's sad is it's 500 years later and education has barely changed
From my extensive experience of four years of undergrad, the problem in your plan is "3. Therefore students can read at home " - half the class won't do the reading, and the half that did won't get what it means until they go to lecture[1].
[1] If the lecturer is any good at all. If he spends most of his time ranting about his ex-wife...
Granted, this was much less the case in grade school - but if students are going to see homework for the first time in college, I can see problems coming up.
If you got rid of homework throughout all of the "standard" education path (grade school + undergrad), I would bet a lot of money that I'd be much dumber for it.
If the concept is too foreign for them, I'm sure we could figure out how to replicate the grade school environment. Give them their 15 hours/week of lecture, and then lock them in a classroom for the 30 hours they should spend on homework.
I think that AI has the possibility of weakening some aspects of education but I agree with Karpathy here. In-class work, in-person defenses of work, verbal tests. These were cornerstones of education for thousands of years and have been cut out over the last 50 years or so outside of a few niche cases (thesis defense), and it might be a good thing that these come back.
Source: I did prépa.
I did nothing in high school and then by 19 for fun on Saturdays I was checking out 5 non-fiction books from the library and spending all Saturday reading.
There was no inspiring teacher or anything like that for me that caused this. At 16 I only cared about girls and maneuvering within the high school social order.
The only thing I can think of that would have changed things for me is if the math club were the cool kids and the football team were the outcasts.
At 16 anything intellectual seemed too remote to bother. That is why I would suspect the real variable is ultimately how much the parents care about grades. Mine did not care at all so there was no way my 16 year old self was going to become intrinsically motivated to grow intellectually.
All AI would have done for me in high school would have been swapping a language model for copying my friend's homework.
For background, I grew up in the US and my wife grew up in China. Where she grew up (at a high-tier Shanghai high school), she says that is kind of how it was. The top of the social order was basically the rich and politically connected (not different from anywhere, I guess) but also the really good students; the best students are looked up to. Also, everyone asks you how you're doing in school all of the time. There are students who focus more on sports and go to sports schools, but unless they end up going to the Olympics or something, it's really looked down upon compared to those who specialize in STEM or more difficult subjects.
In my high school, honors/AP students weren't outcasts, we were kind of just a separate set mostly our own clique with some being popular and some not independently of being AP students. Like I happened to be Football Team Captain and in AP classes, 3 other Captains weren't in AP. Academic success was just a non factor.
Talking to students in order to gauge their understanding is not as easy or reliable as some people make it out to be.
There are many students who are basically just useless when required to answer on the spot, some of whom are likely to score top-of-the-class given an assignment and time to work on it alone (in a proctored setting).
And then there are students who are very likable and swift to pick up on subtle signals the examiners might be giving off, constantly adjusting course accordingly.
Grading objectively is extremely hard when doing oral exams, especially when you're doing them back-to-back for an entire workday, which is quite likely to happen if most examination is to be done in this way.
On the other hand, I had a neighbour ask me if he could do his 1 month apprenticeship when he finished his 3rd year of CS High School (i.e. ~18 years old, 3 of 4 years of 'CS trade school') 6 months ago or so. I was totally gobsmacked by his lack of basic understanding of how computers work; I am confident that he did not confidently know the difference between a file and a folder. But he was very confident in the AI slop he produced. I had a grand plan of giving him tasks that would show him the pitfalls of AI, but there was no need for that: he blindly copied whatever AI gave him (he did not figure out that Claude Code exists), even when the results were very visibly bad, even from afar. I tried explaining stuff to him to no avail. I know this is a sample size of 1, but damn, I did not expect it to be that bad.
GPT 5.1 Pro made the same mistake ("Face the legs away from the door.") Claude Sonnet 4.5 agreed but added "Note: Most toaster ovens max out around 10-12 pounds for a whole turkey."
Gemini 3 acknowledged that toaster ovens are usually very compact and that the legs shouldn't be positioned where they will touch the glass door. When challenged, it hand-waved something to the effect of "Well, some toaster ovens are large countertop convection units that can hold up to a 12-pound turkey." When asked for a brand and model number of such an oven, it backtracked and admitted that no toaster oven would be large enough.
Changing the prompt to explicitly specify a 12-pound turkey yielded good answers ("A 12-pound turkey won't fit in a toaster oven - most max out at 4-6 pounds for poultry. Attempting this would be a fire hazard and result in dangerously uneven cooking," from Sonnet.)
So, progress, but not enough.
I don't yet know how we get AI to teach unruly kids, or kids with neurodivergencies. Perhaps, though, the AI can eventually be vastly superior to an adult because of the methods it can use to get through to the child, keep the child interested and how it presents the teaching in a much more interactive way.
A much bigger question is what to teach assuming we get models much more powerful than those we have today. I'm still confident there's an irreducible hard core in most subjects that's well worth knowing/training, but it might take some soul searching.
That is just such a wildly cynical point of view, and it is incredibly depressing. There is a whole huge cohort of kids out there who genuinely want to learn and want to do the work, and feel like using AI is cheating. These are the kids who, ironically, AI will help the most, because they're the ones who will understand the fundamentals being taught in K-12.
I would hope that any "solution" to the growing use of AI-as-a-crutch can take this cohort of kids into consideration, so their development isn't held back just to stop the less-ethical student from, well, being less ethical.
Well, it seems the vast majority doesn't care about cheating, and is using AI for everything. And this is from primary school to university.
It's not just that AI makes it simpler, so many pupils cannot concentrate anymore. Tiktok and others have fried their mind. So AI is a quick way out for them. Back to their addiction.
There’s a reason this stuff is banned in China. Their pupils suffer no such opiate.
Whatever solution we implement in response to AI, it must avoid hurting the students who genuinely want to learn and do honest work. Treating AI detection tools as infallible oracles is a terrible idea because of the staggering number of false reports. The solution many people have proposed in this thread, short one-on-one sessions with the instructor, seems like a great way to check if students can engage with and defend the work they turned in.
There was a reddit thread recently that asked the question, are all students really doing worse, and it basically said that, there are still top performers performing toply, but that the middle has been hollowed out.
So I think, I dunno, maybe depressing. Maybe cynical, but probably true. Why shy away from the truth?
And by the way, I would be both. Probably would have used AI to further my curiosity and to cheat. I hated school, would totally cheat to get ahead, and am now wildly curious and ambitious in the real world. Maybe this makes me a bad person, but I don't find cheating in school to be all that unethical. I'm paying for it, who cares how I do it.
People aren't one thing.
I'm curious if we instead gave students an AI tool, but one that would intentionally throw in wrong things that the student had to catch. Instead of the student using LLMs, they would have one paid for by the school.
This is more brainstorming than a well thought-out idea, but I generally think "opposing AI" is doomed to fail. If we follow a Montessori approach, kids are naturally inclined to want to learn things; if students are trying to lie/cheat, we've already failed them by turning off their natural curiosity for something else.
AIs _do_ currently throw in an occasional wrong thing. Sometimes a lot. A student's job needs to be verifying and fact-checking the information the AI is telling them.
The student's job becomes asking the right questions and verifying the results.
Learning how to prepare for in-class tests and writing exercises is a very particular skillset which I haven't really exercised a lot since I graduated.
Never mind teaching the humanities, for which I think this is a genuine crisis, in class programming exams are basically the same thing as leetcode job interviews, and we all know what a bad proxy those are for "real" development work.
Preparing for a test requires understanding what the instructor wants. Concentrate on the wrong thing and you get marked down.
Same applies to working in a corporation. You need to understand what management wants. It’s a core requirement.
Confusing university learning for "real industry work" is a mistake and we've known it's a mistake for a while. We can have classes which teach what life in industry is like, but assuming that the role of university is to teach people how to fit directly into industry is mistaking the purpose of university and K-12 education as a whole.
Writing long-form prose and essays isn't something I've done in a long time, but I wouldn't say it was wasted effort. Long-form prose forces you to do things that you don't always do when writing emails and powerpoints, and I rely on those skills every day.
The key issue with schools is that they crush your soul and turn you into a low-agency consumer of information within a strict hierarchy of mind-numbing rules, rather than helping you develop your curiosity hunter muscles to go out and explore. In an ideal world, we would have curated gardens of knowledge and information which the kids are encouraged to go out and explore. If they find some weird topic outside the garden that's of interest to them, figure out a way to integrate it.
I don't particularly blame the teachers for the failings of school though, since most of them have their hands tied by strict requirements from faceless bureaucrats.
Doing derivatives, learning the periodic table, basic language and alphabet skills, playing an instrument are foundational skills that will require deliberate practice to learn, something that isn't typically part of project based learning. At some point in education with most fields, you will have to move beyond concepts and do some rote memorization and repetition of principles in order to get to higher level concepts. You can't gamify your way out of education, despite our best attempts to do so.
AI has potential to smooth out all curves so that students can learn faster and maximize time in flow.
I've spent literally thousands of hours thinking about this (and working on it). The future of education will be as different from today as today is to 300 years ago.
Kids used to get smacked with a stick if they spelled a word wrong.
People thought the threat of physical violence was a good way to teach. We have learned better. What else is there for us to learn? What have we already learned but just don't have the resources to apply?
I've met many educators who have told me stories of ambitious learning goals for students that didn't work because there wasn't the time or resources to facilitate them properly.
Often instructors are stuck trading off between inauthentic assessments that have scalable evaluation methods or authentic exercises that aren't feasible to evaluate at scale and so evaluation is sparse, incomplete or students only receive credit for completion.
In software engineering we often come across build environments that make code iteration really difficult and slow, and speeding up that iteration cycle usually results in being able to experiment more and ship faster changes.
I don't know if we'll ever be successful, but the entire point of gamification is to make the rote parts more palatable. A lot of gamification techniques try to model after MMO gaming for a reason, as that's a genre where people willingly subject themselves to a lot of rote tasks.
Not something everyone learns. My kids seemed to enjoy it. My older daughter learned quite a lot of algebra etc. by doing physics.
> learning the periodic table
You do not need to rote learn all of it, and you remember enough by learning about particular elements etc.
> basic language and alphabet skills
My kids learned to read firstly through reading with me (or others), enjoying the story and learning words as we went, and guessing words on flashcards. Then they went on to reading because they liked it.
Admittedly none of the above was in school, but my point is that it's not intrinsic to learning.
> At some point in education with most fields, you will have to move beyond concepts and do some rote memorization and repetition of principles in order to get to higher level concepts.
Not a great deal and it does not feel like as much of a grind if you enjoy the subject and know where you are going.
You know what kick started my kid's ability to read? A reading teacher sitting with her every single day and teaching her explicitly the drudgery of what reading was. And then me doing the same at home.
Rote is for kids like this and a lot of kids have areas like this. No my kid doesn't need as much math facts practice as she gets. But her cousin? That kid isn't learning anything without doing lines about how to add.
> A reading teacher sitting with her every single day and teaching her explicitly the drudgery of what reading was
Individual attention makes a huge difference.
If it's something I need to do regularly, I eventually learn it through natural repetition while working towards the high level goal I was actually trying to achieve. Derivatives were like this for me. I still don't fully know the periodic table though, because it doesn't really come up in my life; if it's not something I need to do regularly, I just don't learn it.
My guess is this doesn't work for everything (or for everyone), and it probably depends on the learning curve you experience. If there are cliff edges in the curve that are not aligned with useful or enjoyable output, dedicated practice of some sort is probably needed to overcome them, which may take the form of rote learning, or, maybe better, spaced repetition or quizzing or similar. However at least for me, I've not encountered anything like that.
If I was to speculate why rote learning doesn't work well for me, I don't seem to experience a feeling of reward during it, and it seems like my ability to learn is really heavily tied somehow to that feeling. I learn far more quickly if it's a problem I've been struggling with for a while that I solve, or it's a problem I really wanted to solve, as the reward feeling is much higher.
Doing rather than memorizing outdated facts in a textbook.
All of schooling breaks down to costs and society’s willingness and desire to invest in child nutrition, education, and training.
We simply do not even have the wherewithal to have the conversation about it, without getting blackholed by cultural minefields and assumptions of child rearing, parental responsibility, morality and religion.
I soon changed my mind; I think those of us who become experts often have really rich memories of a project where we learnt so much, but we just don't remember episodically all the accumulated learning that happened in boring classrooms to enable the project-induced higher-order synthesis.
The problem is that the structure pushes for teaching productivity which basically directly opposes good pedagogy at this point in the optimization.
Some specifics:
1. Multiple choice sucks. It's obvious that written response better evaluates students and oral is even better. But multiple choice is graded instantly by a computer. Written response needs TAs. Oral is such a time sink and needs so many TAs and lots of space if you want to run them in parallel.
1.5 Similarly, having students do things on computers is nice because you don't have to print things, and even errors in the question can be fixed live: you just ask students to refresh the page. But if chatbots let them cheat too easily on computers, you are back to handwritten assessments, which suck because you have to go arrange printing and scanning.
2. Designing labs is a clear LLM tradeoff. Autograded labs with testbenches and fill-in-the-middle style completions or API completions are incredibly easy to grade. You just pull the commit before some specific deadline and run some scripts.
You can do 200 students in the background while doing other work, it's so easy. But the problem is that LLMs are so good at fill-in-the-middle and at making testbenches pass.
I've actually tried some more open-ended labs before and it's actually very impressive how creative students are. They are obviously not LLMs: there is a diversity in thought and simplicity of code that you do not get with ChatGPT.
But it is ridiculously time consuming to pull people's code and try to run open ended testbenches that they have created.
3. Having students do class presentations is great for evaluating them. But you can only do like 6 or 7 presentations in a 1 hr block. You will need to spend like a week even in a relatively small class.
4. What I will say LLMs are fun for is having students do open-ended projects faster with faster iterations. You can scope creep them if you expect them to use AI coding.
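The "pull the commit before some specific deadline and run some scripts" step from item 2 could be sketched roughly as below. This is only an illustration, not the commenter's actual tooling: the `(hash, timestamp)` tuples and the function name `last_commit_before` are hypothetical, and a real setup would read the history from the students' repositories before running the testbench.

```python
from datetime import datetime, timezone


def last_commit_before(commits, deadline):
    """Return the hash of the newest commit made at or before the deadline.

    `commits` is a list of (hash, timestamp) tuples (a hypothetical format).
    Returns None when the student pushed nothing in time.
    """
    eligible = [(ts, sha) for sha, ts in commits if ts <= deadline]
    if not eligible:
        return None
    # max() compares timestamps first, so this picks the latest eligible commit.
    return max(eligible)[1]


deadline = datetime(2025, 3, 1, 23, 59, tzinfo=timezone.utc)
commits = [
    ("a1f3", datetime(2025, 2, 27, 10, 0, tzinfo=timezone.utc)),
    ("b7c9", datetime(2025, 3, 1, 22, 30, tzinfo=timezone.utc)),
    ("c2d4", datetime(2025, 3, 2, 8, 15, tzinfo=timezone.utc)),  # pushed late
]
print(last_commit_before(commits, deadline))  # b7c9
```

The selected commit is what the grading scripts would then check out and run; anything pushed after the deadline is simply ignored.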
Can AI not grade written responses?
I was using a local LLM around 4B to 14B; I tried Phi, Gemma, Qwen, and Llama. The idea was to prompt the LLM with the question, the answer key/rubric, and the student answer. Putting the student answer at the end allowed some prompt caching, which made it much faster.
It was okay but not good, there were a lot of things I tried:
* Endlessly messing with the prompt.
* A few examples of grading.
* Messing with the rubric to give more specific instructions.
* Average of K.
* Think step by step then give a grade.
It was janky and I'll chalk it up to local LLMs at the time being somewhat too stupid for this to be reasonable. They basically didn't follow the rubric very well. Qwen in particular was very strict, giving zeros regardless of the part marks described in the answer key, as I recall.
I'm sure with the correct type of question and correct prompt and a good GPU it could work but it wasn't as trivially easy as I had thought at the time.
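For readers who want to picture the setup described above, here is a minimal sketch of the prompt layout and the "average of K" idea. Everything in it is an assumption made for illustration: `generate` stands in for whatever function calls the local model, the `GRADE:` output convention is invented, and whether a shared prompt prefix is actually cached depends on the inference server.

```python
import re
from statistics import mean


def build_prompt(question, rubric, student_answer):
    # Question and rubric come first and are identical for every student;
    # the student answer goes last so the shared prefix can be reused.
    return (
        "You are grading a short written answer.\n"
        f"Question:\n{question}\n\n"
        f"Rubric (award part marks as described):\n{rubric}\n\n"
        "Think step by step, then finish with a line 'GRADE: <number>'.\n\n"
        f"Student answer:\n{student_answer}\n"
    )


def parse_grade(reply):
    # Take the last 'GRADE: x' in the reply, in case the reasoning mentions others.
    matches = re.findall(r"GRADE:\s*(\d+(?:\.\d+)?)", reply)
    return float(matches[-1]) if matches else None


def grade_average_of_k(generate, prompt, k=5):
    # Sample the model k times and average the grades that could be parsed.
    grades = [parse_grade(generate(prompt)) for _ in range(k)]
    grades = [g for g in grades if g is not None]
    return mean(grades) if grades else None
```

Averaging only smooths out sampling noise; it would not fix a model that ignores the part marks in the rubric, which was the failure described above.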
Since the testing tool they use does notice and register 'paste'-events they've resorted to simply assigning 0 points to every answer that was pasted.
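As a rough sketch of that policy (the event log format here is hypothetical, not the actual testing tool's), zeroing out any answer with a registered paste event could look like this:

```python
def score_answers(answers):
    """Return final points per answer, giving 0 to any answer with a paste event.

    Each answer is a dict with 'points' (the normal score) and 'events'
    (a list of event-type strings recorded by the testing tool; assumed format).
    """
    return [0 if "paste" in a["events"] else a["points"] for a in answers]


answers = [
    {"points": 4, "events": ["focus", "keydown", "keydown"]},
    {"points": 5, "events": ["focus", "paste"]},
]
print(score_answers(answers))  # [4, 0]
```

Note that this penalizes every paste, including text a student wrote elsewhere and moved into the answer box.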
A few of us have been telling her to move to in-class testing etc. but like you also notice everything in the school organization pushes for teaching productivity so this does require convincing management / school board etc. which is a slow(er) process.
School is packed with inefficiency and busywork that is completely divorced from the way people learn on their own. In fact, it's pretty safe to say you could learn something about 10x faster by typing it into an AI chat bot and having it tailor the experience to you.
It seems like AI will destroy education but it's only breaking the old education system, it will also enable a new and much better one. One where students make more and faster progress developing more relevant and valuable skills.
The education system uses multiple choice quizzes and tests because their grading can be automated.
But when evaluation of any exercise can be automated with AI, such that students can practice any skill with iterative feedback at the pace of their own development, so much human potential will be unlocked.
No, lots of classes are focused on producing papers which aren't just memorization and regurgitation, but generative AI is king at... Generating text... So that class of work output is suspect now
It's the softer, no memorizing, no tests, just assignments that you can hand in at anytime because there's no deadlines, and grades don't matter, type of education that is particularly useless with AI.
This applies both to education, and to what people need to know to do work. Knowing all the written stuff is less valuable. Automated tools have been able to look it up since the Google era. Now they can work with what they look up.
There was a time when programmers pored over Fundamental Algorithms. No one does that today. When needed, you find existing code that does that stuff. Probably better than you could write. Who codes a hash table today?
> focused upon memorization and regurgitation
This is what is easy to test in-class.
Teachers worry about AI because they do not just care about memorization. Before AI, being able to write cohesive essays about a subject was a good proxy for understanding beyond simple memorization. Now it's gone.
A lazy, irresponsible teacher who only cares about memorization will just grade students via in-class multiple-choice tests exclusively and call it a day. They don't need to worry about AI at all.
Take-homes were never a good proxy for anything because any student can pay for private "lessons" and get their homework done for them.
> A lazy, irresponsible teacher who only cares about memorization will just grade students via in-class multiple-choice tests exclusively and call it a day. They don't need to worry about AI at all.
What stops a diligent responsible teacher from doing in-class essays?
Who do you think will "learn" archery quicker? The kid writing an essay about it or the kid shooting a bow?
> Who do you think will "learn" archery quicker? The kid writing an essay about it or the kid shooting a bow?
The kid who imitates a good archer's posture and motion.
But the foundations start with memorisation.
Effect of AI applied to coding is precisely the opposite though?
AI code review has unquestionably increased the quality of my code by helping me find bugs before they make it to production.
AI coding tools give me speed to try out more options to land on a better solution. For example, I wrote a proxy, figured out problems with that approach, and so wrote a service that could accomplish the same thing instead. Being able to get more contact with reality, and seeing how solutions actually work before committing to them, gives you a lot of information to make better decisions.
But then you still need good practices like code review, maintaining coding standards, and good project management to really keep code quality high. AI doesn’t really change that.
AI helps people who "write" (i.e. generate) low-quality code more than people who write high-quality code. This means AI will lead to a larger percentage of new code being low-quality.
This will _never_ happen. Output will increase and quality will decrease.
It's simply too complex to fix. I think we'll see increased investment by corporates who do keep hiring on remediating the gaps in their workforce.
Most elite institutions will probably increase their efforts spent on interviewing, including work trials. I think we're already seeing this, with many of the elite institutions talking about judgment, emotional intelligence and critical thinking as more important skills.
My worry is that hiring turns into a test of likeability rather than meritocracy (everyone is a personality hire when cognition is done by the machines)
Source: I'm trying to build a startup (Socratify), a bridge for upskilling early-stage professionals from a flawed education system to the workforce.
Not sure if those two were just old school (they'd occasionally hit us/ pull ears too) but damn was I ahead of all the other kids when I came back to the USA
Also the two teachers had the same class of kids for all of elementary school, teaching 1st through 5th grades sequentially So they got to know the kids quite well
So it is feasible (in principle) to give every student a different exam!
You’d use AI to generate lots of unique exams for your material, then ensure they’re all exactly the same difficulty (or extremely extremely close) by asking an LLM to reject any that are relatively too hard or too easy. Once you have generated enough individual exams, assign them to your students in your no-AI setting.
Code that the AI writes would be used to grade them.
- AI is great at some things.
- Code is great at other things.
- AI is bad at some things code is great for.
- AI is great at coding.
Therefore, leverage AI to quickly code up deterministic and fast tools for the tasks where code is best.
And to help exams be markable by code, it makes sense to be smart about exam structure - eg. only ask questions with binary answers or multiple choice so you don’t need subjective judgment of correctness.
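A minimal sketch of the "different exam per student, marked by code" idea, under stated assumptions: the question bank format, the function names and the seeding scheme are all invented for illustration, and the LLM-based difficulty balancing described above is not covered here.

```python
import random


def make_exam(bank, student_id, n_questions, seed="midterm-1"):
    # Seed from exam name + student id so each student gets a stable,
    # reproducible selection that differs between students.
    rng = random.Random(f"{seed}:{student_id}")
    return rng.sample(bank, n_questions)


def grade_exam(exam, responses):
    # Multiple-choice answers are compared exactly, so no judgment is needed.
    return sum(1 for q in exam if responses.get(q["id"]) == q["answer"])


bank = [
    {"id": "q1", "text": "2 + 2 = ?", "choices": ["3", "4"], "answer": "4"},
    {"id": "q2", "text": "Is 7 prime?", "choices": ["yes", "no"], "answer": "yes"},
    {"id": "q3", "text": "3 * 3 = ?", "choices": ["6", "9"], "answer": "9"},
    {"id": "q4", "text": "Is 9 prime?", "choices": ["yes", "no"], "answer": "no"},
]

exam = make_exam(bank, "student-42", n_questions=2)
responses = {q["id"]: q["answer"] for q in exam}  # a perfect paper
print(grade_exam(exam, responses))  # 2
```

Because the selection is seeded rather than stored, the same exam can be regenerated later for regrading or appeals.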
It ended up being harder than writing an ordinary paper but taught us all a ton about citation and originality. It was a really cool exercise.
I imagine something similar could be done to teach students to use AI as a research tool rather than as a plagiarism machine.
Contemporary alternative: Copy/paste an essay entirely from LLM output, but make sure none of the information contained checks out. One would want to use an older model for this. :-)
Following to see what they do in the future.
Well it would give you a similar final artifact but it wouldn't be doing the exercise at all.
Also, all of these AI threats to public education can be mitigated if we just step 1-2 decades back and go the pen-and-paper way. I am yet to see any convincing argument that digital/screen-based teaching methods are superior in any way to the traditional ones; on the contrary, I have seen thousands of arguments against them.
IS NO ONE GOING TO POINT OUT MULTIPLE OF THOSE DOODLES ARE WRONG???
She started grading the conversations that the students have with LLMs.
From the questions that the students ask, it is obvious who knows the material and who is struggling.
We do have a custom setup, so that she creates a homework assignment. There is a custom prompt to keep the LLM from answering the homework question. But that's pretty much it.
The results seem promising, with students spending 30m or so going back and forth with the LLMs.
If any educator wants to try it or is interested in more information, let me know and we can see how we collaborate.
This topic has been an interesting part of the discourse in a group of friends the past few weeks because one of us is a teacher who has to deal with this on an almost daily basis and is struggling to get her students to not cheat and the options available to her are limited (yes, physical monitoring would probably work but requires concessions from the school management etc. it's not something that has an easy or quick fix available.)
[0] https://oxide-and-friends.transistor.fm/episodes/ai-in-highe...
Also, just like how calculators are allowed in the exam halls, why not allow AI usage in exams? In real-life job you are not going to avoid use of calculator or AI. So why test people in a different context? I think the tests should focus on the skills in using calculator and AI.
A calculator can be used to do things you know how to do _faster_ imho but in most jobs it still requires you to at least somewhat understand what is happening under the hood. The same principle applies to using LLMs at work imho. You can use it to do stuff you know how to do faster but if you don't understand the material there's no way you can evaluate the LLM's answer and you will be at fault when there's AI slop in your output.
eta: Maybe it would be possible to design labs with LLMs in such a way that you teach students how to evaluate the LLM's answer? This would require them to have knowledge of the underlying topic. That's probably possible with specialized tools / LLM prompts but is not going to help against them using a generic LLM like ChatGPT or a cheating tool that feeds into a generic model.
What you are describing is that they should use an LLM only after they know the topic. A dilemma.
I think you should be able to use the LLM at home to help you better understand the topic (they have endless patience and you can usually keep asking until you actually grok the topic) but during the test I think it's fair to expect that basic understanding to be there.
Dig deeper into this. When are calculators allowed, and when are they not? If it is kids learning to do basic operations, do we really allow them to use calculators? I doubt it, and I suspect that places that do end up with students who struggle with more advanced math because they off loaded the thinking already.
On the other hand, giving a calculus student a 4 function calculator is pretty standard, because the type of math it can do isn't what is being tested, and having a student be able to plug 12 into x^3 - 4x^2 + 12 very quickly instead of having to work it out doesn't impact their learning. More advanced calculators, though, are often not allowed when they trivialize the content.
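For what it's worth, the plug-in step mentioned above is exactly the kind of thing a 4 function calculator speeds up without doing any calculus for you. A minimal sketch in Python (the function name `p` is just a label picked for illustration):

```python
# Hypothetical helper name; the polynomial is the one from the comment above.
def p(x: int) -> int:
    return x**3 - 4 * x**2 + 12

print(p(12))  # 1728 - 576 + 12 = 1164
```

The arithmetic is mechanical; deciding what to evaluate and why is still on the student.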
LLMs are much more powerful than a calculator, so finding where in education they don't trivialize the learning process is pretty difficult. Maybe at grad level or research, but for anything grade school it is as bad as letting a kid learning their times tables use a calculator.
Now, if we could create custom LLMs that are targeted at certain learning levels? That would be pretty nice. A lot more work. Imagine a Chemistry LLM that can answer questions, but knows the homework well enough to avoid solving problems for students. Instead, it can tell them what chapter of their textbook to go read, or it can help them when they are having a deep dive beyond the level of material and give them answers to the sorts of problems they aren't expected to solve. The difficulty is that current LLMs aren't this selective and are instead too helpful, immediately answering all problems (even the ones they can't).
It may not be obvious in a country with smaller student to teacher ratios, but for a place like India, you never have enough teachers for students.
Being able to provide courses, and homework digitally, reduced the amount of work required to grade and review work.
Then to add insult to injury, AI is removing entry level roles, removing other chances for people to do work which is easy to verify, practice and learn from.
Yes, yes, eventually tool use will result in increases in GDP. Except our incentives are not to hire more teachers, build more schools, and improve educational outcomes. Those are all public goods, not private goods. We aren’t going to tax firms further, because commerce must be protected, yet we will socialize the costs to society.
Not sure why they don't just do that? It worked fine and would be compatible with LLM use.
You must have a pretty broad definition of useless / toxic if you think that reading, writing and basic math, but also geometry, calculus, linear algebra, probability theory, foreign languages, a broad overview of history, and basic competency in physics / electronics fall under these categories.
Sure, I learned a lot in school that turned out to be pretty useless for me (chemistry, basically anything I learned in PE, French), but I did not know that at the time and I am still grateful that I was being exposed to these topics. Some of my classmates developed successful careers from these early exposures.
Out of that list you mention there (from my personal experience), we were never taught calculus, linear algebra or probability theory. The maths was very basic and uninspiring. The foreign language teaching was next to useless. (I learnt more in six months learning German after high school than six years of French in high school.) The science teaching was okay. The history teaching was appalling (I do like history but we were taught it in such a dull fashion, and from only one or two angles.)
It would have been useful for me to learn basic cookery, how to open a bank account and so on. Sewing and clothing repair would also have been handy. We did do some carpentry and I.T. (which is obsolete, but was a useful foundation). I do use some of what I learnt in Geography class.
Against this, they tried to instill some horrible habits in us. Like they would punish all of us when one person did something wrong (and they didn't know who). I saw that across multiple schools, and I still resent it. There was also the notion that we should obey teachers without question and accept everything they say (and they were frequently wrong). In my last school, you either went straight into university or the military but at that point in life neither was an option.
Um. yea. This is the first time a non-deterministic technology has achieved mass adoption for everyday use. Despite repeated warnings (which are not even close to the tenor of warnings they should broadcast), folks don’t understand that AI will likely hallucinate some or all of their answer.
A calculator will not, and even the closest aspect of buggy behavior for a calculator (exploring the fringes of floating point numbers, for example) is light years away from the hallucination of generative AI for general, everyday questions.
The mass exuberance over generative AI has been clouding folks from the very real effects of over-adoption of AI, and we aren’t going to see the full impact of that for some time, and when we do, folks are going to ask questions like “how were we so dumb?” And of course the answer will be “no one saw this coming.”
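To make the floating point "fringe" concrete, here is a small Python sketch (Python floats are IEEE 754 doubles; a given handheld calculator may use a different number format, so treat this as an analogy rather than a claim about any specific device). The point is that the oddity is small, well understood, and the same every time:

```python
import math

total = 0.1 + 0.2
print(total == 0.3)              # False: 0.1 and 0.2 have no exact binary representation
print(total)                     # 0.30000000000000004
print(math.isclose(total, 0.3))  # True: compare with a tolerance instead

# The "bug" is deterministic: the same inputs give the same result every time.
print(all(0.1 + 0.2 == total for _ in range(1000)))  # True
```

You can learn this rule once and work around it, which is a very different situation from an answer that may or may not be made up.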
My spouse is an educator with nearly 20 years in the industry, and even her school has adopted AI. It’s shocking how quickly it has taken hold, even in otherwise lagging adoption segments. Her school finally went “1-1” with devices in 2020, just prior to COVID.
The calculator analogy is extremely inaccurate, though understandably people keep making this comparison. The premise is that the calculator didn't take bookkeepers' jobs, but instead helped them.
First of all, a calculator does one job and does it very well; you never question it because it works solely with numbers. But AI wants to be everything: calculator, translator, knowledge base, etc. And it's very confident about everything all the time until you start to question it, and even then it continues to lie. Sadly, the purpose of current AI products isn't to give you an accurate answer; it's to make you believe that they're giving you credible information.
More importantly, calculators are not connected to the internet, and they are not capable of creating a profile of an individual.
It's sad to see big players push this agenda to make people believe that they don't need to think anymore, AI will do everything for them.
I've seen assignments that were clearly graded by ChatGPT. The signs are obvious: suggestions that are unrelated to the topic or corrections for points the student actually included. But of course, you can't 100% prove it. It's creating a strange feedback loop: students use an LLM to write the essay, and teachers use an LLM to grade it. It ends up being just one LLM talking to another, with no human intelligence in the middle.
However, we can't just blame the teachers. This requires a systemic rethink, not just personal responsibility. Evaluating students based on this new technology requires time, probably much more time than teachers currently have. If we want teachers to move away from shortcuts and adapt to a new paradigm of grading, that effort needs to be compensated. Otherwise, teachers will inevitably use the same tools as the students to cope with the workload.
Education seemed slow to adapt to the internet and mobile phones, usually treating them as threats rather than tools. Given the current incentive structure and the lack of understanding of how LLMs work, I'm not optimistic this will be solved anytime soon.
I guess the advantage will be for those that know how to use LLMs to learn on their own instead of just as a shortcut. And teachers who can deliver real value beyond what an LLM can provide will (or should) be highly valued.
Is using AI to support grading such a bad idea? I think that there are probably ways to use it effectively to make grading more efficient and more fair. I'm sure some people are using good AI-supported grading workflows today, and their students are benefiting. But of course there are plenty of ways to get it wrong, and the fact that we're all pretending that it isn't happening is not facilitating the sharing of best practices.
Of course, contemplating the role of AI grading also requires facing the reality of human grading, which is often not pretty. Particularly the relationship between delay and utility in providing students with grading feedback. Rapid feedback enables learning and change, while once feedback is delayed too long, its utility falls to near zero. I suspect this curve actually goes to zero much more quickly than most people think. If AI can help educators get feedback returned to students more quickly, that may be a significant win, even if the feedback isn't quite as good. And reducing grading burden also opens up opportunities for students to directly respond to the critical feedback through resubmission, which is rare today on anything that is human-graded.
And of course, a lot of times university students get the worst of both worlds: feedback that is both unhelpful and delayed. I've been enrolling in English courses at my institution—which are free to me as a faculty member. I turned in a 4-page paper for the one I'm enrolled in now in mid-October. I received a few sentences of written feedback over a month later, and only two days before our next writing assignment was due. I feel lucky to have already learned how to write, somehow. And I hope that my fellow students in the course who are actual undergraduates are getting more useful feedback from the instructor. But in this case, AI would have provided better feedback, and much more quickly.
This was the plot to a recent South Park episode: https://m.imdb.com/title/tt27035146/
A one hour lecture where students (especially <20 year old kids) need to proactively interject if they don't understand something is a pretty terrible format.
> "Education seemed slow to adapt to the internet and mobile phones, usually treating them as threats rather than tools. Given the current incentive structure and the lack of understanding of how LLMs work"
Good point, it is less like a threat and more like... "how do we shoehorn this into our current processes without adapting them at all? Oh cool now the LLM generates and grades the worksheets for me!".
We might need to adjust to more long term projects, group projects, and move away from lectures. A teacher has 5*60=300 minutes a week with a class of ~26. If you broke the class into groups of 4 - 5 you could spend a significant amount of time with each group and really get a feel for the students beyond what grade the computer gives to their worksheet.
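Rough numbers for that, as a Python sketch. The 300 minutes and the class of 26 come from the comment above; the assumption that all of the class time goes to group work, split evenly, is mine and is obviously optimistic:

```python
import math

WEEKLY_MINUTES = 5 * 60   # from the comment: 300 minutes a week
CLASS_SIZE = 26           # from the comment: a class of ~26

def minutes_per_group(group_size: int) -> tuple[int, float]:
    """Return (number of groups, weekly minutes per group) for a maximum group size."""
    groups = math.ceil(CLASS_SIZE / group_size)
    return groups, WEEKLY_MINUTES / groups

for size in (4, 5):
    groups, minutes = minutes_per_group(size)
    print(f"groups of up to {size}: {groups} groups, {minutes:.0f} min each per week")
```

That works out to 7 groups at about 43 minutes each, or 6 groups at 50 minutes each, before subtracting any whole-class time.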
If my son should grow up to run into the same kinds of cognitive limitations, I really don't know what I will tell him and do about it. I just wish there was a university in a Faraday cage somewhere where I could send him, so that he can have the same opportunities I had.
Fun fact on the side: Cambridge (UK) getting a railway station was a hugely controversial event at the time. The corrupting influence of London being only a short journey away was a major put-off.
I see collapsing under pressure to be either a kind of anxiety or a fixation on perfect outcomes. Teaching a tolerance for some kinds of failure is the fix for both.
One idea: Have students generate videos with their best "ELI5" explanations for things, or demos/teaching tools. Make the conciseness and clarity of the video and the quality/originality of the teaching tools the grading criteria. Make the videos public, so classmates can compare themselves with their peers.
Students will be forced to learn the material and memorize it to make a good video. They'll be forced to understand it to create really good teaching tools. The public aspect will cause students to work harder not to feel foolish in front of their peers.
The beauty of this is that most kids these days want to be influencers, so they're likely to invest time into the assignment even if they're not interested in the subject.
My wife is a teacher. Her school did this a long time ago, long before AI. But they also gave every kid a laptop and forced the teachers to move all tests/assignments to online applications with the curriculum picked out by the administrators (read as: some salesperson talked them into it). Even with assignments done in class, it's almost impossible to catch kids using AI when they're all on laptops all the time and she can't teach and monitor them all at the same time.
Bring back pencil and paper. Bring back calculators. Internet connected devices do not belong in the classroom.
It's getting rid of cheap methods.
Scantrons and bluebooks were always a way that made it cheap for institutions to produce results. Now those methods kinda seem silly, right?
500 person freshman lectures seem kinda absurd now, right?
Teaching via adjuncts that had 3 days notice for the class and are paid nothing is kinda scammy, right?
R1 professors' tenure evals having nothing to do with teaching is kinda wrong, right?
The Oxbridge model of 5-10 person classes with a proctor is what the education with AI is going to be about. It's small, intimate, and expensive.
If you imagine students take 4 classes per semester and faculty teach 4 per semester… it seems stunningly feasible.
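A back-of-the-envelope version of that claim, sketched in Python. The 4-and-4 course loads and the 5-10 class sizes are from the comments above; the 10,000-student enrollment is a made-up figure for illustration, and the sketch ignores research time, sabbaticals, and uneven course demand:

```python
def faculty_needed(students: int, class_size: int,
                   classes_per_student: int = 4,
                   classes_per_faculty: int = 4) -> float:
    """Faculty required so every enrollment lands in a section of class_size."""
    seats = students * classes_per_student   # total enrollments per semester
    sections = seats / class_size            # sections needed at this class size
    return sections / classes_per_faculty    # each faculty member covers this many sections

# Hypothetical enrollment of 10,000 students.
for size in (5, 10):
    print(f"class size {size}: {faculty_needed(10_000, size):.0f} faculty")
```

When students take as many classes as faculty teach, the required student-to-faculty ratio simply equals the class size, so 5-10 person classes mean one teaching faculty member per 5-10 students.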
"We think that regulatory intervention by governments will be critical to mitigate the risks of increasingly powerful models"
sama in 2024:
"using technology to create abundance--intelligence, energy, longevity, whatever--will not solve all problems and will not magically make everyone happy. but it is an unequivocally great thing to do, and expands our option space. to me, it feels like a moral imperative."
sama in 2025:
...proposals requiring AI developers to vet their systems before rolling them out would be 'disastrous' for the industry.
Nothing to see here.
These "some" are founders of AI companies and investors who put a lot of money into such companies. Of course, the statements that these people "excrete" serve an agenda ...
Maybe all this social stuff that AI brings into focus may prove a catalyst for radical change?
Unlikely, but one can dream!
https://www.npr.org/2025/01/08/nx-s1-5246200/demographic-cli...
PDF warning: https://www.cdc.gov/nchs/data/nvsr/nvsr73/nvsr73-02.pdf
Colleges will need to reduce class sizes, or close entirely, for the next decade at least. Smaller class sizes bring the opportunity for course instructors to provide more time per pupil, so that things like in-person homework and project review are possible.
when we've just made personalized education viable for everybody. schools already have the computers, laptops, tablets and study spaces. only thing left is to put these AI education apps on them. classes could also just remain the same size. students will get personalized education through the app. students can still interact with the teacher when necessary or for class wide sessions with the teacher.
Schools need to become tech free zones. Education needs to reorient around more frequent standardized tests. Any "tech" involved needs to be exclusively applied towards solving the supply and demand issue - the number of "quality teachers" to "students per classroom."
I admire Karpathy for advocating common sense, but none of this will happen because SV is full of IQ realists who only see "education" as a business opportunity and the bureaucratic process is too dysfunctional for common sense decisions to prevail. The future is Chromebooks with GPT browsers for every student.
Take away the internet, except in a research/library scenario. Give them a limited time to complete tasks. This would promote a stronger work ethic, better memory/recall and more realistic time management skills. They need to learn to rely on themselves, not technology. The only effective way is to remove tech from the equation; otherwise the temptation to cheat to compete/complete is too strong.
Rather: don't grade homework. Make the homework preparation instead: if you did it seriously, it will prepare you for the test (and if you didn't do it seriously, you won't have the skills that are necessary to pass the test).
Then allow students who want their homework evaluated for feedback to turn it in, but no homework will be graded.
This relegates the use of AI to personal choice of learning style and any misuse of AI is only hurting the student.
I'm a teacher. Kids don't have the capacity to make this choice without guidance. There are so so many that don't (can't?) make the link between what we teach and how they grow as learners. And this is at a rich school with well-off parents who largely value education.
Don't grade homework. Only grade work done in school. Students who cheat on their homework are just hurting themselves and won't do as well on the in-class, graded exams.
I'm not minimizing Karpathy in any way, but this is obviously the right way to do this.