If you ask your cache for a value, it could choose to reply now, with the information that it has - favouring A.
Or it could wait and hope for more accurate information to return to you later, favouring C.
'Cache' seems to imply that it's built for availability purposes.
In a specific use case that might apply. For example, if two people edit the same document and fix the same typo, the visual outcome is the same, no matter who made the change first or last.
But that is very niche. Take program code: someone can change a line that someone else is changing as well, and the two changes might be identical, but other lines might not be, and you end up with code that won't compile. In other words, if we focus on a single change in isolation, this makes sense. But that is essentially never the case in distributed environments in this context; we have to look at the broader picture, where multiple changes made by someone are related or tied to each other and do not live in isolation.
Either way, I see nothing useful here. You can "render" your local changes immediately or wait for them to be propagated through the system and come back to you. There is very little difference, and in the end it is mostly about a proper diffing approach and has little to do with the distributed system itself.
PS: the problem here is not really the order in which changes are applied for a local consumer, as in the case of editing a shared Word document. The problem is when we have a database and commit a change locally, but someone else commits a different change elsewhere, like "update users set email = foo@bar where id = 5", and before we receive that other, later change we serve clients invalid data. That is the main issue with eventual consistency here. As I am running a system like this, I have to use "waiters" to ensure I get the correct data. For example, when a user creates some content via the web UI and is redirected back to the list of all content, this happens so fast that the distributed system has not had enough time to propagate the changes. So this user will not see their new content in the list - yet. For this scenario, I take the correlation id that I receive when the content is created and put it into the redirect. When the user lands on the page that lists all the content, the correlation id is detected and a network call is made to the appropriate server, whose sole purpose is to keep the connection open until that server's state has caught up to the provided correlation id. Then I refresh the list of content to present the user with the correct information, all while a loading indicator is shown on the page. There is simply no way around this in distributed systems, and so I find this article of no value (at least to me).
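A minimal sketch of the "waiter" idea described above, not the actual system. It assumes correlation ids are monotonically increasing integers and that a replica can report the highest id it has applied; the `Replica` class and its method names are hypothetical.

```python
import threading


class Replica:
    """Hypothetical replica that tracks the highest correlation id it has applied."""

    def __init__(self) -> None:
        self._applied = 0
        self._cond = threading.Condition()

    def apply(self, correlation_id: int) -> None:
        # Called when a replicated change arrives; wakes up any waiters.
        with self._cond:
            self._applied = max(self._applied, correlation_id)
            self._cond.notify_all()

    def wait_until_caught_up(self, correlation_id: int, timeout: float) -> bool:
        # Blocks (keeps the "connection" open) until the replica has applied
        # the given correlation id, or the timeout expires.
        with self._cond:
            return self._cond.wait_for(
                lambda: self._applied >= correlation_id, timeout=timeout
            )


replica = Replica()
# Simulate the change for correlation id 42 arriving a little later.
threading.Timer(0.05, replica.apply, args=(42,)).start()
caught_up = replica.wait_until_caught_up(42, timeout=2.0)
```

Once `wait_until_caught_up` returns `True`, the page can refresh the list; on `False` it would have to fall back to a retry or an error message.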
Just a basic example for a task tracker:
* first update sets task cancelled_at and cancellation_reason
* second update wants the task to be in progress, so sets started_at
If code just uses the timestamps to determine the task state, it would now assume the task is cancelled, which is unexpected since the later user update set it to in progress.
Easy fix, we just add a state field 'PENDING|INPROGRESS|CANCELLED|...'.
Okay, but now you have a task that is in progress, but also has a cancellation timestamp, which seems inconsistent.
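The task tracker example can be reproduced with a small sketch. It assumes a per-field last-write-wins merge where every field carries a logical timestamp; the `merge_lww` helper and the sample values are made up for illustration.

```python
def merge_lww(a: dict, b: dict) -> dict:
    """Per-field last-write-wins merge. Each field maps to (timestamp, value)."""
    merged = dict(a)
    for field, (ts, value) in b.items():
        if field not in merged or ts > merged[field][0]:
            merged[field] = (ts, value)
    return merged


# First update: cancel the task (logical time 1).
cancel = {
    "cancelled_at": (1, "2024-01-01T10:00"),
    "cancellation_reason": (1, "duplicate"),
    "state": (1, "CANCELLED"),
}
# Second update: start the task (logical time 2).
start = {
    "started_at": (2, "2024-01-01T10:05"),
    "state": (2, "INPROGRESS"),
}

merged = merge_lww(cancel, start)
task = {field: value for field, (_, value) in merged.items()}
# task["state"] is "INPROGRESS", yet task["cancelled_at"] is still set.
```

The merge converges regardless of the order in which the updates arrive, but the merged record still mixes fields from two different intentions.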
The point is:
With CRDTs you have to consider how partial, out-of-order merges affect the state, and make sure your logic is always written so that these are handled properly. That is *not easy*!
I'd love it if someone came up with a framework that allows defining application semantics on top of CRDTs, and have the framework ensure types remain consistent.
The point is that you always have to think about merging behaviour for every piece of state.
And many CRDT implementations have already solved this for the styled-text domain (e.g. bold and italic can be additive, but color cannot).
But something user-definable would be really useful.
The gist is replicating intentions (actions, immutable function call definitions that advance state) instead of state + hybrid logical clocks for total ordering + some client side db magic to make action functions deterministic. This ensures application semantics are always preserved with no special conflict resolution considerations while still having strong eventual consistency. Check out the readme for more info. I haven’t gotten to take it much further beyond an experiment but the approach seems promising.
Well, this all depends on the definition of «function properly». Convergence ensures that everyone observes the same state, not that it's a useful state. For instance, the Imploding Hashmap is a very easy CRDT to implement. The rule is that when there are concurrent changes to the same key, the final value becomes null. This gives Strong Eventual Consistency, but it isn't really a very useful data structure. All the data would just disappear!
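One possible reading of the Imploding Hashmap, as a rough sketch. It assumes each entry carries a version vector (replica id to counter) so that concurrent writes can be detected, and that equal version vectors imply equal values; all names here are hypothetical.

```python
def dominates(a: dict, b: dict) -> bool:
    """True if version vector a has seen everything that b has seen."""
    return all(a.get(replica, 0) >= n for replica, n in b.items())


def merge_entry(x: tuple, y: tuple) -> tuple:
    """Merge two (value, version_vector) entries for the same key."""
    (_, xvv), (_, yvv) = x, y
    if dominates(xvv, yvv):
        return x
    if dominates(yvv, xvv):
        return y
    # Concurrent writes: the value implodes to None.
    joined = {r: max(xvv.get(r, 0), yvv.get(r, 0)) for r in xvv.keys() | yvv.keys()}
    return (None, joined)


def merge(a: dict, b: dict) -> dict:
    """Merge two replica states key by key."""
    out = dict(a)
    for key, entry in b.items():
        out[key] = merge_entry(out[key], entry) if key in out else entry
    return out


# Two replicas write the same key without having seen each other.
r1 = {"color": ("red", {"r1": 1})}
r2 = {"color": ("blue", {"r2": 1})}
state = merge(r1, r2)  # the value for "color" is now None on every replica
```

Both replicas end up with the same state whichever way the merge runs, which is exactly the point: they converge on a value nobody wanted.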
So yes, CRDT is a massively useful property which we should strive for, but it’s not going to magically solve all the end-user problems.
The basic CRDT ideas are actually pretty easy to implement: add some metadata here, keep some history there. The difficulty, for the past 20 years or so, is making the overheads low, and the APIs understandable.
Many projects revolve around some JSON-ish data format that is also a CRDT:
- Automerge https://automerge.org (the most tested one, but feels like legacy at times, the design is ~10yrs old, there are more interesting new ways)
- JsonJoy https://jsonjoy.com/
- RDX (mine) https://replicated.wiki/ https://github.com/gritzko/go-rdx/
- Y.js https://yjs.dev/
Others are trying to retrofit CRDTs into SQLite or Postgres. IMO, those end up using last-write-wins in most cases. Relational logic steers you that way.
ijaym•4h ago
Do people really distinguish "Strong Eventual Consistency" from "Eventual Consistency"? To me, when I say "Eventual Consistency" I always mean "Strong Eventual Consistency".
nl•2h ago
In an eventually consistent system replicas can diverge. A "last write" system can be eventually consistent, but at a given point in time different replicas can read differently.
E.g., take two operations:
1) Add "AA" to end of string
2) Split string in middle
Replicas R1 and R2 both have the string "ZZZZ"
If R1 sees operations (1) then (2) it will get "ZZZZAA", then "ZZZ", "ZAA"
If R2 sees (2) then (1) it will get:
"ZZ", "ZZ", then "ZZAA", "ZZ".
Strong Eventual Consistency doesn't have this problem because the operations have the time vector on them so the replicas know what order to apply them.
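A small sketch of the example above. It assumes the document is a list of string fragments and that both operations act on the first fragment, which reproduces the outputs listed; the integer tags are a hypothetical stand-in for the "time vector".

```python
def append_aa(doc: list) -> list:
    """Operation (1): add "AA" to the end of the first fragment."""
    return [doc[0] + "AA"] + doc[1:]


def split_middle(doc: list) -> list:
    """Operation (2): split the first fragment in the middle."""
    head = doc[0]
    mid = len(head) // 2
    return [head[:mid], head[mid:]] + doc[1:]


def apply_all(doc: list, ops: list) -> list:
    for op in ops:
        doc = op(doc)
    return doc


# Same operations, different arrival order: the replicas diverge.
r1 = apply_all(["ZZZZ"], [append_aa, split_middle])
r2 = apply_all(["ZZZZ"], [split_middle, append_aa])

# If every operation carries an ordering tag, each replica can sort
# before applying and reach the same result whatever the arrival order.
seen_by_r1 = [(1, append_aa), (2, split_middle)]
seen_by_r2 = [(2, split_middle), (1, append_aa)]


def replay(tagged: list) -> list:
    ordered = [op for _, op in sorted(tagged, key=lambda t: t[0])]
    return apply_all(["ZZZZ"], ordered)


same_1 = replay(seen_by_r1)
same_2 = replay(seen_by_r2)
```

This replays from the initial string, which is only practical for a toy; it is meant to show why an agreed order removes the divergence, not how a real system would apply it.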
aatd86•55m ago