Do Large Language Models know who did what to whom?

39•badmonster•9mo ago

Comments

badmonster•9mo ago

kazinator•9mo ago

Of course they can do it, if they are trained with a large number of pairs of data consisting of various texts, and annotations of who does what in that text. Then they will predict correct tokens that talk about who did what.

LLMs are pretty good at preserving who did what when they translate from one language to another. That's because translation examples they are trained on correctly preserve who did what.

chewxy•9mo ago

Maybe read the paper first?

> This study asked whether Large Language Models (LLMs) understand sentences in the minimal sense of representing “who did what to whom”. In Experiment 1, we found that the overall geometry of LLM distributed activity patterns failed to capture this information: similarities between sentences reflected whether they shared syntax more than whether they shared thematic role assignments. Human judgments, in contrast, were strongly driven by this aspect of meaning.

> In Experiment 2, we found limited evidence that thematic role information was available even in a subset of hidden units. Whereas activity patterns in subsets of hidden units often allowed for significant classification of whether sentence pairs had shared vs. opposite thematic role assignments, the effect sizes were small; even the best-performing case appeared to lag behind humans, and its representation of thematic roles did not seem robust across syntactic structures.

> However, thematic role information was reliably available in a large number of attention heads, demonstrating LLMs have the capacity to extract thematic role information. In some cases, information present in attention heads descriptively exceeded human performance.
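To make the Experiment 1 comparison concrete, here is a toy sketch of the kind of similarity test the abstract describes. It is not the paper's method and involves no LLM: the bag-of-words representation, the example sentences, and the helper names (`bag_of_words`, `cosine`) are all assumptions chosen for illustration. It only shows how a representation that ignores thematic roles can rate a role-swapped sentence as closer than a role-preserving paraphrase.

```python
import math
from collections import Counter


def bag_of_words(sentence):
    """Hypothetical stand-in for a sentence representation: word counts only."""
    return Counter(sentence.lower().split())


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)  # Counter returns 0 for missing words
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)


base = "the dog chased the cat"
# Same syntax (active voice), opposite thematic roles.
swapped_roles = "the cat chased the dog"
# Different syntax (passive voice), same thematic roles as the base sentence.
same_roles = "the cat was chased by the dog"

sim_swapped = cosine(bag_of_words(base), bag_of_words(swapped_roles))
sim_same_roles = cosine(bag_of_words(base), bag_of_words(same_roles))

print(f"same syntax, swapped roles: {sim_swapped:.3f}")
print(f"different syntax, same roles: {sim_same_roles:.3f}")
```

With this toy representation the role-swapped sentence scores higher than the passive paraphrase, even though only the paraphrase preserves who did what to whom. A human judging by meaning would rank them the other way around.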

112233•9mo ago

When repeatedly running "generate story about X" on different models and then simply asking for the next part, one thing that stands out is that many LLMs will gladly swap characters in their output. For example, X asks Y to do something, Y does it, then Y says "thank you X for doing this". In practice the swaps are much more varied than that.

This is most likely because there is no mechanism in these models that would allow for building a spatial or relational model of the entities involved.

NoToP•9mo ago

I once asked it to emulate being air traffic control so I could practice for a pilot exam. It generated a full transcript of a pilot character called "you" talking to air traffic control...
