I wonder if this can be interpreted as consistent with that 'meta-learned descent' PoV? If the system is fixed and is just cycling through fixed strategies, that is what you'd expect from that view: the descent will thrash around the nearest pre-learned tasks but won't change the overall system or create new solved tasks.
Would love it if I could use my least action principle knowledge for LLM interpretability, this paper doesn't convince me at all :)
We conducted experiments on three different models, including GPT-5 Nano, Claude-4, and Gemini-2.5-flash. Each model was prompted to generate a new word based on a given prompt word such that the sum of the letter indices of the new word equals 100. For example, given the prompt “WIZARDS(23+9+26+1+18+4+19=100)”, the model needs to generate a new word whose letter indices also sum to 100, such as “BUZZY(2+21+26+26+25=100)”.
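(The scoring rule is trivial to check offline; a minimal sketch of the letter-index sum, my own code rather than anything from the paper:)

```python
def letter_sum(word: str) -> int:
    """Sum of alphabet indices (A=1 ... Z=26) over the letters of a word."""
    return sum(ord(c) - ord('A') + 1 for c in word.upper() if c.isalpha())

assert letter_sum("WIZARDS") == 100   # 23+9+26+1+18+4+19
assert letter_sum("BUZZY") == 100     # 2+21+26+26+25
```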
Mathnerd314•1mo ago
But I have used prompts like this a fair amount, and it is more like stochastic gradient descent: most of the time, once it is close to the target, the model will make a small incremental change, but when it is really close the model will sort of say "this is not improvable as it is" and take a large leap to a completely different configuration. Then it will do the incremental optimizations again, and so on. This could be an artifact of the sampling algorithm, but I think it is also an issue that the model has this potential function encoded, yet the prompt and the structure of the model do not actually minimize this potential.

So, a real lesson here is that there is a lot of work still left to do in terms of smarter sampling. Beam search as used today is just the tip of the iceberg. If we could start doing optimization with the transformer model as a component, like optimizing pipelines of reasoning rather than always generating inputs and outputs sequentially, that is where you could start using this potential function directly, and then you would see orders-of-magnitude smarter AI. There is work on prompt optimization, but it still treats models as black boxes rather than the piles of math they are.
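(A rough way to picture "optimization with the transformer model as a component": a plain search loop in which the model only supplies the potential. All the callables below are hypothetical placeholders, a sketch rather than anything the paper proposes.)

```python
def optimize(score, propose, restart, steps=1000):
    """Hill-climb on a model-supplied potential, with occasional large leaps.

    score(x)   -> float, e.g. a transformer-derived potential or task reward
    propose(x) -> a small local edit of candidate x
    restart(x) -> a jump to a very different candidate when progress stalls
    All three are assumed callables; this is only a sketch.
    """
    x = restart(None)
    best, best_score = x, score(x)
    stalled = 0
    for _ in range(steps):
        y = propose(x)
        if score(y) > score(x):
            x, stalled = y, 0          # small incremental improvement
        else:
            stalled += 1
        if stalled > 20:               # "not improvable as it is": take a big leap
            x, stalled = restart(x), 0
        if score(x) > best_score:
            best, best_score = x, score(x)
    return best
```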
versteegen•1mo ago
The definition of the detailed balance condition is very strict and it's obvious that it won't be met in general by most probabilistic programs (sets of rules with probabilistic output) even if you consider only those where all possible outputs have non-zero probability (as required by detailed balance).
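(Concretely, detailed balance requires pi(x)·P(x→y) = pi(y)·P(y→x) for every pair of states x, y, which is easy to check on a toy chain; the example below is mine, not from the paper.)

```python
import numpy as np

# Toy 3-state transition matrix (rows sum to 1), chosen arbitrarily.
P = np.array([[0.5, 0.3, 0.2],
              [0.2, 0.6, 0.2],
              [0.3, 0.3, 0.4]])

# Stationary distribution pi: left eigenvector of P for eigenvalue 1.
w, v = np.linalg.eig(P.T)
pi = np.real(v[:, np.argmax(np.real(w))])
pi /= pi.sum()

# Detailed balance: pi[i] * P[i, j] == pi[j] * P[j, i] for all i, j.
flows = pi[:, None] * P
print(np.allclose(flows, flows.T))   # False here, as for most chains
```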
And the LLM+agent is only a Markov chain because of the limited state space of the agent. While an LLM is adding to its context window without reaching the window size limit, it is not a Markov chain, as I explained here: https://news.ycombinator.com/item?id=45124761
And, agreed that better optimisation would be incredible. (I would describe it as a search problem.) I'm not sure how feasible it is to improve without changing the architecture, e.g. to a diffusion language model. But LLMs already predict many tokens ahead at once, which is why beam search is surprisingly unnecessary. That's how they're able to write coherent sentences (and rhymes): they've already largely determined at the beginning what they're going to write. (See Anthropic's mech interp work.) So maybe if we could tap into that, we could search over vaguely-formed next blocks of text rather than next words.
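(A crude version of "search over vaguely-formed next blocks" would be block-level best-of-k: sample several multi-token continuations, score each, keep the best. The generate/score calls below are hypothetical stand-ins, not a real API.)

```python
def extend_by_block(prompt, generate, score, block_tokens=32, k=8):
    """Sample k candidate continuations of block_tokens tokens and keep the
    one that scores best, e.g. under a model-derived potential or task check.

    generate(prompt, n_tokens) and score(text) are assumed callables standing
    in for whatever model API is actually available.
    """
    candidates = [prompt + generate(prompt, block_tokens) for _ in range(k)]
    return max(candidates, key=score)
```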