New AI tutor achieves 0.71-1.30 SD effect size in Dartmouth course [pdf]

https://intextbooks.science.uu.nl/workshop2026/files/itb26_s1s2.pdf
44•jonahbard•1h ago

Comments

boulos•55m ago
Do you have a larger study planned for the Fall? It definitely seems promising.

I'm curious how well you feel this worked because the subject was Statistics (objective grading) versus something more subjective like Civics or Literature.

PS - I'd say this qualifies for Show HN, too!


ilaksh•32m ago
They were using Sonnet 4.6 for some free-form responses, so that could be applied to something subjective.
albinahlback•50m ago
Very nicely typeset.
kubb•44m ago
Too bad the educational use case doesn't make any money. Good LLMs are a game changer for people motivated to learn.
TheLML•35m ago
I don't want to learn from hallucinations, where it will change its answers when I question what it taught. I use it for conversations in a language I'm learning, but I quickly learned that asking it grammar questions, for example, is not a wise decision.
afro88•25m ago
Curious whether you were just asking it questions bare, or whether you provided it with lessons one by one, with the instruction that the lesson is the baseline truth, etc.
treis•19m ago
Are we talking about human teachers or LLMs here?
Robotbeat•34m ago
Wikipedia doesn’t make much money but is still helpful. LLMs don’t need to make a whole bunch of money to be helpful.
kubb•26m ago
People aren't paying trillions to train them to be helpful. They want to make quadrillions.
Rperry2174•40m ago
Honestly whether or not this was effective seems less important to me than the adoption numbers.

Textbook reading in this course was 10-15% at baseline ... but this AI thing got 90% voluntary, ungraded usage.

Even if it's worse per hour than a textbook, you're now teaching 6x as many students _something_ instead of teaching a small minority everything.

So really it just becomes an optimization problem at that point, because most students are at least in the funnel/in the running to learn something.

The paper kind of proves this itself ... they tweaked the quiz formats mid-semester and were able to iterate, which you can't do on a textbook that nobody opens in the first place.

baq•28m ago
I'd argue the results are even better: just reading a textbook doesn't really teach you much. You have to do exercises, but they're expensive to create and grade. LLMs with a proper harness (see paper) tackle both.
rusbus•37m ago
This is exciting because the effect size is so large. But as the authors acknowledged, selection bias is nearly impossible to control for in this non-randomized study:

> and lacks randomized controls. Self-selection is the central threat: students who complete more quizzes may be more motivated or higher-performing generally

But this is still a strong result. I'm excited to see more in this space.

rahimnathwani•12m ago
They tried to control for this. It's described in the first paragraph of section 4.
constantius•35m ago
Interesting, congrats.

Are you planning on opening access to Phosphor?

baq•30m ago
I'm on record saying that a system like this with some extra hardware (i.e. a way for the LLM to have live understanding of the student's paper notebook or handout which are being written in with a plain old pencil) combines the best of both worlds - individual tutoring with approximately zero screen time which scales linearly with the number of students. The role of the teacher or professor then becomes a manager of the student - agentic tutor pairs, a referee when the student and model disagree, etc. and most importantly still being the human teacher you can just talk to in the human education process.

I'm convinced this is the future of education - models are there, we need the classroom tech to catch up. The alternative is obvious and quantified in the paper - students just use models to do their work for them and learn nothing.

terribleperson•5m ago
A 'smart pen' that records the student's writing in some way, maybe? My first thought was a tablet that boots straight into writing software, but students should not be subjected to any amount of latency in their writing.
ilaksh•25m ago
Shocking that a well executed AI tutor improves outcomes.

Hasn't computer assisted interactive learning already been proven for years? Why does there seem to be so much skepticism about enhancing it with AI?

Is this just something like, astoundingly slow adoption or poor execution? Being held back by paper textbook makers? Teachers unions dragging their feet?

How can interactive AI driven individually paced learning _not_ be obviously dramatically more effective?

dominotw•24m ago
It's like anything else: it benefits students who are already motivated to learn.

Very few are actually motivated to learn; most are just there to get a job, or it's just the next thing they have to do in life.

mmarian•22m ago
Conflicted about this study. On one hand, LLMs have been incredible for my personal learnings of new concepts.

On the other, I'm sceptical that it'll have "strong benefits" at scale; I'd be more in favor if the wording was "some"/"moderate". I reckon self-selection plays a huge part, as mentioned in the "Limitations" section of the paper.

I'd also caution against attaching the tool to grading. That means students have to put more effort into the course, which increases the chances that they will use LLMs to save time rather than make the investment.

MoneyBurning•11m ago
Curious how this holds up across different learning styles. SD effect sizes look impressive, but I'd want to see retention data at 30/90 days before drawing conclusions.
isomorphic_duck•8m ago
Why did you make a new account to spam AI comments?

Trouble Transitioning (2025)

https://www.lrb.co.uk/the-paper/v47/n01/adam-tooze/trouble-transitioning
1•measurablefunc•36s ago•0 comments

Speech and Noise Corpora for Pitch Estimation of Human Speech

https://zenodo.org/records/3920591
2•q7m•5m ago•0 comments

Cursed circuits #5: capacitance multiplier

https://lcamtuf.substack.com/p/cursed-circuits-capacitance-multiplier
2•surprisetalk•5m ago•0 comments

Understanding B-Tree Indexes in PostgreSQL: A Comprehensive Guide– Part 1

https://medium.com/@devli0/b-tree-indexes-in-postgresql-part-1-theory-eb2668c52520
1•corvus-cornix•9m ago•0 comments

How to Get a Healthy, Shiny Coat on Your Dog: The Ultimate Guide

https://pawcaremedia.com/how-to-get-a-healthy-shiny-coat-on-your-dog-the-ultimate-guide/
1•Han25•12m ago•0 comments

New contributors to GNU Emacs over time

https://old.reddit.com/r/emacs/comments/1uo4t5e/new_contributors_to_gnu_emacs_over_time/
2•srijan4•12m ago•0 comments

WebGlean – API that turns any site into clean Markdown for LLMs

https://www.webglean.com
1•qubomax•12m ago•0 comments

"12-year-old girl had been shot in the chest with a crane-mounted gun"

https://www.unicef.org/press-releases/geneva-palais-briefing-child-day-deadly-illusion-gazas-ceas...
1•embedding-shape•15m ago•0 comments

Slow Tuesday Night (1965)

https://www.baen.com/Chapters/9781618249203/9781618249203___2.htm
1•et1337•16m ago•0 comments

The Strange Locomotion of Spirocuta

https://chriskiehl.com/article/euglenid-motion-in-flagellates
1•goostavos•18m ago•0 comments

Show HN: 3·6·9 COMMANDER, a turn-based strategy card game

https://forgottenmachine.itch.io/369-commander
1•forgatmachine•19m ago•0 comments

The Mental Models I Use to Work with AI

https://metedata.substack.com/p/015-the-mental-models-i-use-to-work
3•young_mete•20m ago•0 comments

State Sponsored Media? No Thanks [video]

https://www.youtube.com/watch?v=sFgSzsusIwQ
2•dp-hackernews•22m ago•0 comments

View from the Shifting Mound

https://thesolarprincess.github.io/blog/en/shiftingmound.html
1•paulpauper•23m ago•0 comments

Contributor Visualization for Superset: top contributors own 90% of lines

https://twitter.com/Principal_ADE/status/2073853855545143427
4•fernando-ram•24m ago•0 comments

Social media management for AI Agents

https://schedpilot.com/
1•schedpilot•25m ago•0 comments

Eclipse Enclave

https://projects.eclipse.org/projects/ecd.enclave
2•Tomte•26m ago•0 comments

A 3D-printed Raman spectrometer

https://hackaday.com/2026/07/05/2026-frikkin-lasers-challenge-a-3d-printed-raman-spectrometer/
2•ikbdsk•27m ago•0 comments

CommaAgents V2 Sharable Agent Orchestrator Release Candidate

https://github.com/CloAI/CommaAgents
1•NateAGeek•31m ago•1 comments

Turn Your AI Agent into an MCP Server for ChatGPT, Claude and Cursor

https://quickchat.ai/post/expose-ai-agent-as-mcp-server
1•piotrgrudzien•33m ago•0 comments

The full stack of terminals explained

https://ahmadawais.com/the-full-stack-of-terminals-explained-terminal-shell-tty-console-posix-ans...
2•ludicrousdispla•34m ago•0 comments

Large planets lighter than cotton candy

https://www.cbsnews.com/news/super-puff-planets-lighter-than-cotton-candy-found/
2•gmays•35m ago•0 comments

We'll fight the platform war against big AI

https://www.anildash.com/2026/06/23/fight-ai-platform-war/
1•bnj•36m ago•0 comments

Raylib 6.x gamejam – Make a 720x720 wasm game with raylib in 6 days

https://itch.io/jam/raylib-6x-gamejam
2•vyrotek•40m ago•0 comments

Group project, but make it 1776 – Google Workspace ad [video]

https://www.youtube.com/watch?v=Q3RjZY-rSsc
1•ChrisArchitect•42m ago•0 comments

Delta flight hit by firework while landing at Midway Airport on Fourth of July

https://www.nbcchicago.com/news/local/delta-flight-hit-by-firework-while-landing-at-midway-airpor...
2•randycupertino•42m ago•0 comments

Show HN: TrainSim – a browser train tycoon

https://aashishh15.github.io/3DTrainSim/
1•aashishharishch•44m ago•1 comments

Can AI do fact-checking?

https://www.wired.com/story/fact-checking-ai/
1•simianwords•44m ago•0 comments

Show HN: Make No Mistakes – AI coding agents must prove their work

https://github.com/momomuchu/make-no-mistakes
1•mohamedmaache•46m ago•0 comments

Tanenbaum–Torvalds Debate

https://en.wikipedia.org/wiki/Tanenbaum%E2%80%93Torvalds_debate
1•chistev•46m ago•0 comments