I wanted to share something I've been working on for the past couple of months, which may be of interest to developers working with distributed architectures (e.g., microservices).
I'm a backend developer, and last year at my day job we started building a distributed app - by that, I mean two or more services communicating via some sort of messaging system, like Kafka. This was my first foray into distributed systems. Having been exposed to structured concurrency by Nathaniel J. Smith's beautiful article on the subject (https://vorpus.org/blog/notes-on-structured-concurrency-or-g...), I started noticing similarities between the challenges of message-based communication and those of concurrent programming, and of GOTO-based programming before that: action at a distance, non-trivial tracing of failures, synchronization issues, etc. I started suspecting that if the symptoms were similar, maybe the root cause, and therefore the solution, could be as well.
This led me to design something I'm calling "structured cooperation", which is what you get when you apply the principles of structured concurrency to distributed systems. It's something like a "protocol", in the sense that it's a set of rules that isn't tied to any particular language or framework. As it turns out, obeying those rules has some pretty powerful consequences, including:
- Pretty much eliminates race conditions caused by eventual consistency
- Allows you to recover something resembling distributed exceptions - stack traces and the equivalent of stack unwinding, but across service boundaries
- Makes it much easier to reason about the system as a whole
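To make the idea a bit more concrete, here is a toy, in-memory sketch in Python (not Scoop's actual API or implementation). It assumes the core rule mirrors structured concurrency: a step that emits a message does not move on until every handler of that message, and of anything those handlers emit in turn, has finished, and a failure in any of them surfaces in the emitter. The names ToyBus, emit_and_wait, HandlerFailed and the order/stock handlers are all made up for illustration, and delivery is synchronous here, whereas a real system would have to track handler completion asynchronously across services.

```python
from collections import defaultdict


class HandlerFailed(Exception):
    """Surfaces in the emitter when any downstream handler fails."""

    def __init__(self, cause, trace):
        super().__init__(f"{cause!r} via {' -> '.join(trace)}")
        self.cause = cause  # the original exception raised by a handler
        self.trace = trace  # hops from the emitter down to the failing handler


class ToyBus:
    """Hypothetical in-memory stand-in for a message broker."""

    def __init__(self):
        self.handlers = defaultdict(list)
        self.log = []

    def subscribe(self, topic, handler):
        self.handlers[topic].append(handler)

    def emit_and_wait(self, topic, payload):
        """Deliver to every handler; return only once all of them are done."""
        for handler in self.handlers[topic]:
            hop = f"{topic}:{handler.__name__}"
            try:
                handler(self, payload)
            except HandlerFailed as exc:
                # A handler further down failed: prepend this hop to the trace.
                raise HandlerFailed(exc.cause, [hop] + exc.trace) from exc
            except Exception as exc:
                # This handler itself failed: start a new trace.
                raise HandlerFailed(exc, [hop]) from exc
            self.log.append(f"{handler.__name__} finished")


def notify_warehouse(bus, order):
    if order["qty"] > 100:
        raise ValueError("warehouse cannot handle this quantity")


def reserve_stock(bus, order):
    # A handler may emit messages of its own; the same rule applies to them.
    bus.emit_and_wait("stock.reserved", order)


def place_order(bus, order):
    bus.log.append("step 1: order saved")
    bus.emit_and_wait("order.placed", order)
    # Reached only after every downstream handler has finished successfully.
    bus.log.append("step 2: confirmation sent")


def build_bus():
    bus = ToyBus()
    bus.subscribe("order.placed", reserve_stock)
    bus.subscribe("stock.reserved", notify_warehouse)
    return bus


if __name__ == "__main__":
    bus = build_bus()
    place_order(bus, {"qty": 1})
    print(bus.log)

    try:
        place_order(build_bus(), {"qty": 500})
    except HandlerFailed as failure:
        print(failure.trace)
```

In the failing case, step 2 never runs and the emitter receives an exception whose trace lists each hop across the (pretend) service boundaries, which is the flavor of "distributed stack trace" I have in mind.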
I put together three articles that explain:
1) what structured cooperation is (https://developer.porn/posts/introducing-structured-cooperat...),
2) one way you could implement it (https://developer.porn/posts/implementing-structured-coopera...), and
3) why it works (https://developer.porn/posts/framing-structured-cooperation/).
I also put together a heavily documented POC implementation in Kotlin, called Scoop (linked in the title). I guess you could call it an orchestration library, similar to e.g. Temporal (https://temporal.io/), although I want to stress that it's just a POC, and not meant for production use.
I was hoping to bounce this idea off the community and see what people think. If it turns out to be a useful way of doing things, I'd try to drive the implementation of something similar in existing libraries (e.g. the aforementioned Temporal, Axon (https://www.axoniq.io/products/axon-framework), etc. - let me know if you know of others where this would make sense). As I mention in the articles, because the technology landscape is so heterogeneous, I'm not sure it's a good idea to actually try to build a library, in the same way that it wouldn't make sense to build a "structured concurrency library", since there are many ways that "concurrency" is implemented. Rather, I tried to build something like a "reference implementation" that other people can use as a stepping stone to build their own implementations.
Above and beyond that, I think that this has educational value as well, and I did my best to make everything as understandable as possible. Some things I think are interesting:
- Implementation of distributed coroutines on top of Postgres
- Has both reactive and blocking implementations, so it can be used as a learning resource for people new to reactive programming
- I documented various interesting issues that arise when you use Postgres as an MQ (see, in particular, https://github.com/gabrielshanahan/scoop/blob/09db323bf6c8a7... and https://github.com/gabrielshanahan/scoop/blob/09db323bf6c8a7...)
Let me know what you think.