2 + 2 + 2
<=> reversible
2 + 2 + 2, 2 + 4
<=> reversible
2 + 2 + 2, 2 + 4, 6
=> irreversible
6
Edit: I see now. Well, this is much less exciting than I thought. Still, I'm excited for all the other people who are excited.
> What does it mean for computation to have a direction?
Actually, all computation has a directionality! This is a subject I get really excited about ^__^ Think about it: we have a function f, with input x and output y. We'd even write it that way: f(x) -> y. That arrow is our direction.
Now, the reverse actually gets a bit tricky. If the reverse is straightforward, our function has an inverse. But it might not always have one. Our function f(x) = mx + b is invertible, because we can write x = (f(x) - b)/m (well... as long as m isn't 0), which gives a unique solution: every x corresponds to a unique f(x), and vice versa. But if instead we have the function f(x) = x^2, this is not true! x = ±sqrt(f(x)), and here every f(x) > 0 corresponds to both x and -x. The preimage is not unique.
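A quick sketch of the contrast, with hypothetical helper names just for illustration:

```python
# f(x) = m*x + b is invertible (as long as m != 0): each output
# comes from exactly one input.
def f_linear(x, m=3.0, b=1.0):
    return m * x + b

def f_linear_inverse(y, m=3.0, b=1.0):
    return (y - b) / m  # unique preimage

# f(x) = x**2 is NOT invertible over the reals: both x and -x
# land on the same output, so the "inverse" is a set, not a point.
def f_square(x):
    return x * x

def f_square_preimage(y):
    r = y ** 0.5
    return {r, -r}  # two preimages (one only when y == 0)

assert f_linear_inverse(f_linear(2.0)) == 2.0
assert f_square_preimage(f_square(2.0)) == {2.0, -2.0}
```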
We can start adopting the language of images and preimages if you want to head down that route. There are a lot of interesting phenomena here, and if you hadn't already guessed it, yes, this is related to the P = NP problem!
An easy way to see this visually is to write down the computational circuit. Polylog actually has a really good video that makes the connection to P vs NP [0].
In the context of machine learning, a normalizing flow is invertible, while a diffusion model is reversible. A pet peeve of mine (I'm an ML researcher) is that people in ML call it the "inverse problem", as in GAN-Inversion, but that is a misnomer and we shouldn't propagate it... This also has to do with the naivety of statements like these [1,2]. If you understand this, you'll understand how one could make accurate predictions in one direction but fail in the other, which really puts a damper on that causality stuff. Essentially, we run into the problem of generating counterfactuals.
> Said direction does not seem to refer to causality
Understanding this, I think you can actually tell that there's a direct relationship to causality here! In physics we love to manipulate equations around because... well... the point of physics is generating a causal mapping of the universe. But there are some problems... Entropy is the classic prime example (but there are many more in QM), and perhaps this led to his demise [3]. (This is also related to the phenomena of emergence and chaos.) Here the issue is that we can take some gas molecules, run our computation forward, and get a new "state" (configuration of our molecules). But now... how do we run this in reverse? We will not get a unique solution; instead we have a whole family of solutions.
Funnily enough, you ran into this when you took calculus! That's why your professor always got mad when you integrated and dropped the "+C"! So here you can see that differentiation isn't (necessarily) an invertible process: every f(x) + c maps to the same f'(x). It's a many-to-one relationship, just like with f(x) = x^2.
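A minimal numerical check of the "+C" point, assuming nothing beyond standard Python: every f(x) + c has the same derivative, so differentiation collapses a whole family of functions onto one.

```python
# Central finite difference: d/dx [x**2 + c] should be ~2x for every c.
def derivative(f, x, h=1e-6):
    return (f(x + h) - f(x - h)) / (2 * h)

x = 1.5
for c in [0.0, 5.0, -42.0]:
    d = derivative(lambda t: t**2 + c, x)
    print(f"c={c:6.1f}  f'({x}) ~ {d:.6f}")  # same ~3.0 every time

# Many functions -> one derivative: integration can't pick the
# original back out without extra information (the +C).
```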
> So this is just about preserving state by default to make backtracking easier?
I think this could use some more clarity? If not, think about our gas problem. If instead of sampling only at time 0 and time T we sampled at {0, t0, t1, ..., T}, we greatly reduce the solution space, right? Because now our mapping from T -> 0 needs to pass through all those intermediate states. That's still a lot of potential paths, but it is fewer... [4]

[0] https://www.youtube.com/watch?v=6OPsH8PK7xM
[1] https://www.reddit.com/r/singularity/comments/1dhlvzh/geoffr...
[2] https://www.youtube.com/watch?v=Yf1o0TQzry8&t=449s
[3] The opening of Goodstein's States of Matter book (the standard graduate textbook on statistical mechanics). Be sure to also read the first line of the second paragraph: https://i.imgur.com/Dm0PeJU.png
[4] I know...
First it says we lose electrons by deleting information. But AFAIK we are losing electrons everywhere; most gates operate on negating a current, which I understand is what they refer to as losing electrons. So, are all gates bad now?
Also, why would keeping a history of all memory changes prevent losing heat? You still have to keep all that memory powered, so...
And finally, why would this be useful? Who needs to go back in time in their computations??
Edit: and yes, most of the logical operations in a regular chip, like AND, OR, NAND, etc., are irreversible (in isolation, anyway)
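To make the "irreversible in isolation" point concrete, here's a truth-table check (my own sketch, not from the article): AND collapses three input pairs onto output 0, while a CNOT-style gate is a bijection on its inputs.

```python
from itertools import product

def preimages(gate, output, n_inputs=2):
    """All input tuples that the gate maps to `output`."""
    return [bits for bits in product([0, 1], repeat=n_inputs)
            if gate(*bits) == output]

AND = lambda a, b: a & b
print(preimages(AND, 0))  # [(0,0), (0,1), (1,0)] -- 3 preimages: info lost

# CNOT keeps both bits around: (a, b) -> (a, a XOR b). It's a bijection,
# so every output has exactly one preimage and nothing is forgotten.
CNOT = lambda a, b: (a, a ^ b)
for out in product([0, 1], repeat=2):
    assert len(preimages(CNOT, out)) == 1
```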
The Landauer limit at ambient temperature is on the order of 10⁻²¹ J per irreversibly flipped bit, while, if I read this paper [1] correctly, current transistors are around 10⁻¹⁵ J per switch. So, definitely not coming to AI "soon".
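The arithmetic behind those numbers, for anyone who wants to check (k·T·ln 2 at ~300 K):

```python
import math

k_B = 1.380649e-23  # Boltzmann constant, J/K
T = 300.0           # ambient temperature, K

landauer = k_B * T * math.log(2)
print(f"Landauer limit: {landauer:.2e} J/bit")  # ~2.87e-21 J

transistor = 1e-15  # rough per-switch energy cited above
print(f"Headroom: ~{transistor / landauer:.0e}x above the limit")
```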
Obviously, in real life, most power consumed by computers is lost by wire resistance, not through "forgetting" memory in logic gates. You would need superconducting wires and gates to build an actually reversible CPU.
Also, you would need to "uncompute" the result of a computation to bring your reversible computer from its result back to its initial state, which may be problematic. Or you can expend energy to erase the state.
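A toy sketch of that compute / copy-out / uncompute pattern (often attributed to Bennett); the XOR trick and names here are my own illustration, not from the comment:

```python
# State: (x, scratch). Each step is a bijection, so it can be undone.
def step_forward(x, scratch):
    return x, scratch ^ (x * x)    # XOR result into scratch (self-inverse)

def copy_out(scratch, out):
    return scratch, out ^ scratch  # XOR-copy the answer (reversible)

x, scratch, out = 5, 0, 0
x, scratch = step_forward(x, scratch)  # compute x*x into scratch
scratch, out = copy_out(scratch, out)  # save the answer
x, scratch = step_forward(x, scratch)  # uncompute: scratch back to 0

assert (x, scratch, out) == (5, 0, 25)  # machine reset, answer kept
```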
Quantum computers are reversible computers, if you seek a real life example. Quantum logic gates are reversible and can all be inverted.
How much power does persistent storage (a hard drive, an SSD) require to preserve its stored data? Zero, which is why it emits zero heat.
> Who needs to go back in time in their computations??
At its most basic level, erasing/overwriting data requires energy. This generates a lot of heat. Heat dissipation is a major obstacle to scaling chips down even further. If you can design a computer that doesn't need to erase nearly as much data, you generate orders of magnitude less heat, which opens up more scaling headroom and considerable power savings.
Edit: One of their white papers mentions "Application Framework: A PyTorch-compatible interface supports both AI applications and general-purpose computing, ensuring versatility without sacrificing performance."
This idea of reversible computing was new to me. I didn’t know it was even possible to run computations “backwards” to save power. It’s interesting that slowing things down might actually save more energy in the long run. I’ll definitely be reading more about this.
Since then, diffusion models have become popular. Generating from one can be seen as a special case of a continuous-time normalizing flow, and so is (in theory) a reversible computation. Although the distilled/fast generation that's run in production is probably not!
Simulating differential equations is usually not actually reversible in practice, due to round-off errors. But when done carefully, simulations performed on a computer can be exactly, bit-for-bit reversible: https://arxiv.org/abs/1704.07715
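One common way to get bit-for-bit reversibility is to keep the state in integers, so there is no round-off to lose. A minimal sketch under that assumption (the linked paper does this far more carefully):

```python
# Fixed-point leapfrog: every update is += / -= on integers,
# so running the steps in reverse order undoes them exactly.
def force(x):            # toy integer-valued restoring force
    return -x // 4

def step(x, v):
    v += force(x)
    x += v
    return x, v

def unstep(x, v):
    x -= v
    v -= force(x)
    return x, v

x, v = 1000, 0
for _ in range(10_000):
    x, v = step(x, v)
for _ in range(10_000):
    x, v = unstep(x, v)
assert (x, v) == (1000, 0)  # exact, no drift
```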
People tend to ignore a problem if it's someone else's. The costs of [insert disruptive technology here] are largely externalised - on our natural environment, on individuals' livelihoods, on violated copyrights, on independent hosts' infrastructure, on pedestrians, on about-to-be burnt-out/jobless/homeless, etc. What you gain in efficiency, you will use to bring more for yourself, not to bring less harm to someone else. ¯\_(ツ)_/¯
But in any case, I don't understand the claim of the article. If you can reverse the computation (say, only use reversible matrices), you can do it for less energy?
In current chips we just charge and dump a bunch of parasitic capacitances every clock cycle.
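Rough numbers for that, using the usual ½CV² energy per charge/dump cycle (the capacitance and voltage below are illustrative guesses, not from the comment):

```python
C = 1e-16  # ~0.1 fF parasitic capacitance (illustrative)
V = 0.7    # supply voltage, volts

energy_per_switch = 0.5 * C * V**2
print(f"~{energy_per_switch:.1e} J per charge/dump cycle")  # ~2.5e-17 J
```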
In current computers, we’re nowhere near those limits anyway. Reversible computing is interesting research for the future.
If you're also batching matmuls, isn't there an unavoidable information loss when you evict the old data and drop in the new batch?
Let f: V -> V.
Then g: V -> V x V is the reversible form of f, where g(v) = (v, f(v)),
and g'((v, w)) = v.
g' can be "fooled" with a fake w, but that is of no concern here. We trust our own chip!
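That construction in a few lines of Python (a direct transcription of the g and g' above):

```python
def make_reversible(f):
    """Wrap any f into g(v) = (v, f(v)): nothing is forgotten."""
    def g(v):
        return (v, f(v))
    def g_inverse(pair):
        v, _w = pair  # the carried-along input is the inverse
        return v
    return g, g_inverse

square = lambda v: v * v
g, g_inv = make_reversible(square)
assert g(7) == (7, 49)
assert g_inv(g(7)) == 7  # trivially invertible, as claimed
```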
f: V -> V
g: V x A -> V x A
with g(x, a) = (f(x), b) for some value(s) of a. And b cannot simply be set to x, because then you can't get a back with g', and your function is not invertible.
add(x, a) = (x + a, x - a), and add†(y, b) = ((y + b) / 2, (y - b) / 2)
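A quick check that this pair really round-trips:

```python
def add(x, a):
    return (x + a, x - a)

def add_dagger(y, b):
    return ((y + b) / 2, (y - b) / 2)

x, a = 10.0, 3.0
assert add_dagger(*add(x, a)) == (x, a)  # add† really inverts add
```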
It's a well-known thing in thermodynamics that a reversible process doesn't increase entropy (dissipate heat). So in theory, a reversible computer consumes no power whatsoever, since it doesn't leak any. In practice, most power loss in real-life computers is due to wire resistance, not the irreversibility of computations. Also, even with a perfectly reversible CPU, you would need to expend some energy to (re)initialize the state of your computer (input) and to copy its results out (output). Alternatively, once a computation is done, you can always "uncompute" it to get back to the initial state without using any power, at the cost of time.
If you want an example of a real reversible computer, look into quantum computers, which are unitary (hence reversible) by necessity, in accordance with the laws of quantum physics.
* Actually, you can represent reversible gates with invertible matrices, and that has quite profound implications. A gate/operation is reversible if and only if its corresponding matrix is invertible. But let's not get into that here.
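To make that concrete, here's the CNOT gate written as a 4×4 permutation matrix; it's invertible (in fact its own inverse), in line with "reversible iff invertible". A small sketch with numpy:

```python
import numpy as np

# Basis order: |00>, |01>, |10>, |11>. CNOT flips the second bit
# when the first bit is 1, which permutes |10> <-> |11>.
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]])

assert np.array_equal(CNOT @ CNOT, np.eye(4, dtype=int))  # self-inverse

# An AND-like map that collapses distinct states can't be written as
# an invertible matrix: information loss == singular matrix.
```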