So the maximal intelligence is actually not an agent at all (it has zero agency itself); it's a place. You can imagine the final direct-democratic simulated multiverse: that's the final absolute superintelligence. It has all the agents inside of it, while it itself is static spacetime. Agents (like us and others) are 3D and dynamic, while the multiverse is 4D static spacetime. Everything already happened, so there is no future, only the past; you can forget something in order to relive it.
Maximal agency (= shape-changing), by contrast, is the Big Bang: it has almost zero intelligence (it's a dot) but infinite potential future intelligence (it can become a multiversal simulation).
[0] A protected species for its sentience.
1. For any concept you're interested in, get inputs with and without it. For images: 100 with, say, a pink elephant, 100 without.
2. Calculate the difference between these two distributions, represented as an "Optimal Transport Map".
3. Apply the map at the desired strength, and voila - you don't have a pink elephant anymore. These maps can stack.
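A minimal sketch of steps 1-3, assuming the simplest possible "map": a mean activation difference applied as a steering vector. The random arrays, sizes, and `apply_map` helper here are all illustrative stand-ins; whatever the paper's actual OT map is, it's presumably richer than this baseline.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 16  # toy hidden size

# Stand-ins for hidden activations collected from ~100 inputs
# with the concept ("pink elephant") and ~100 without it.
acts_with = rng.normal(loc=1.0, size=(100, d))
acts_without = rng.normal(loc=0.0, size=(100, d))

# Simplest possible "map": the mean activation difference.
direction = acts_with.mean(axis=0) - acts_without.mean(axis=0)

def apply_map(h, strength=1.0):
    """Steer activations away from the concept at a given strength."""
    return h - strength * direction

steered = apply_map(acts_with, strength=1.0)
# On average, steered activations now sit near the "without" mean.
```

Stacking maps for multiple concepts would just mean subtracting several such directions, each at its own strength.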
There are lots of obvious and interesting applications here in LLMs - there's some research showing that LLMs have honesty/dishonesty parameter groupings, for instance.
But I can't really figure out what this OT map actually is. Is it a single-layer tensor? Is it multidimensional? If it were the size of the original model (which they say it is not), I'd understand how to apply it: just add the weights and rerun. If it's not a full copy, where and when is the map applied? Put another way: how is this different from calculating the average difference and storing it in a low-rank adapter? I have no idea.
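For what it's worth, if each activation set is modeled as a Gaussian, optimal transport has a known closed form (the Monge map between Gaussians) that is strictly richer than a mean difference: it's a full d×d linear map per layer that matches covariances, not just means. Whether the paper does this is pure speculation on my part; the sketch below only illustrates the distinction the question is drawing.

```python
import numpy as np

def sqrtm_psd(M):
    """Matrix square root of a symmetric PSD matrix via eigendecomposition."""
    w, V = np.linalg.eigh(M)
    return (V * np.sqrt(np.clip(w, 0, None))) @ V.T

def gaussian_ot_map(X_a, X_b):
    """Closed-form Monge map between Gaussians fit to the two samples:
    T(x) = m_b + A (x - m_a), with A a symmetric d x d matrix."""
    m_a, m_b = X_a.mean(0), X_b.mean(0)
    eps = 1e-6 * np.eye(X_a.shape[1])  # regularize for invertibility
    C_a = np.cov(X_a, rowvar=False) + eps
    C_b = np.cov(X_b, rowvar=False) + eps
    C_a_half = sqrtm_psd(C_a)
    C_a_half_inv = np.linalg.inv(C_a_half)
    A = C_a_half_inv @ sqrtm_psd(C_a_half @ C_b @ C_a_half) @ C_a_half_inv
    return lambda x: m_b + (x - m_a) @ A.T

rng = np.random.default_rng(0)
d = 8
X_with = rng.normal(1.0, 1.0, (200, d))     # "with concept" activations
X_without = rng.normal(0.0, 0.5, (200, d))  # "without concept" activations

T = gaussian_ot_map(X_with, X_without)
mapped = T(X_with)
# mapped now matches the "without" mean AND covariance,
# which a plain mean-difference vector cannot do.
```

Under this reading, the map is a d×d matrix (plus an offset) per intervention point, so it's far smaller than the model but not rank-1 like an averaged difference stored in a low-rank adapter.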