
Do Large Language Models know who did what to whom?

https://arxiv.org/abs/2504.16884
39•badmonster•9mo ago

Comments

badmonster•9mo ago
op: https://arxiv.org/abs/2504.16884
kazinator•9mo ago
Of course they can do it, if they are trained with a large number of pairs of data consisting of various texts, and annotations of who does what in that text. Then they will predict correct tokens that talk about who did what.

LLMs are pretty good at preserving who did what when they translate from one language to another. That's because translation examples they are trained on correctly preserve who did what.

chewxy•9mo ago
Maybe read the paper first?

> This study asked whether Large Language Models (LLMs) understand sentences in the minimal sense of representing “who did what to whom”. In Experiment 1, we found that the overall geometry of LLM distributed activity patterns failed to capture this information: similarities between sentences reflected whether they shared syntax more than whether they shared thematic role assignments. Human judgments, in contrast, were strongly driven by this aspect of meaning.

> In Experiment 2, we found limited evidence that thematic role information was available even in a subset of hidden units. Whereas activity patterns in subsets of hidden units often allowed for significant classification of whether sentence pairs had shared vs. opposite thematic role assignments, the effect sizes were small; even the best-performing case appeared to lag behind humans, and its representation of thematic roles did not seem robust across syntactic structures.

> However, thematic role information was reliably available in a large number of attention heads, demonstrating LLMs have the capacity to extract thematic role information. In some cases, information present in attention heads descriptively exceeded human performance.

112233•9mo ago
When repeatedly running "generate story about X" on different models and then simply asking for next part, one thing that stands out is many LLMs will gladly swap characters in their output. Like X asks Y to do something, Y does, then Y says "thank you X for doing this". But obviously it is much more varied.

Most likely because there is no mechanism in this thing that would allow for building a spatial or relationship model between entities.

NoToP•9mo ago
I once asked it to emulate being air traffic control so I could practice for a pilot exam. It generated a full transcript of a pilot character called "you" talking to air traffic control...