Turns out it's pretty much required in a distributed system. A common question in microservice architectures is whether to validate permissions only at the API gateway layer, or at every point of use. If you validate everywhere, what happens when an async job is running and the user's access gets revoked mid-flight? In Zanzibar you attach the cookie as the user's context, and Zanzibar will always return the same answer. (This isn't meant for cron jobs where the user sets something up once and it repeats daily, but for quick, one-off background jobs like generating a report and emailing it to the user.) If you remove the internal store, the application's API must support point-in-time queries, and I've never seen a single application do that, let alone a whole microservice environment.
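A minimal sketch of that idea (the consistency cookie is called a "zookie" in the Zanzibar paper): the store is versioned, a write returns an opaque revision token, and a check pinned to that token keeps returning the same answer even after a later revocation. All class and method names here are illustrative, not the real API.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Snapshot:
    """Immutable view of the relation tuples at one revision."""
    tuples: frozenset  # {(object, relation, user), ...}

@dataclass
class ZanzibarLike:
    """Toy versioned tuple store; the zookie is just a revision index here."""
    revisions: list = field(default_factory=lambda: [Snapshot(frozenset())])

    def write(self, obj, relation, user):
        current = set(self.revisions[-1].tuples)
        current.add((obj, relation, user))
        self.revisions.append(Snapshot(frozenset(current)))
        return len(self.revisions) - 1  # the "zookie": an opaque revision token

    def delete(self, obj, relation, user):
        current = set(self.revisions[-1].tuples)
        current.discard((obj, relation, user))
        self.revisions.append(Snapshot(frozenset(current)))
        return len(self.revisions) - 1

    def check(self, obj, relation, user, at=None):
        """Evaluate at the pinned revision, so an async job sees a stable answer."""
        rev = at if at is not None else len(self.revisions) - 1
        return (obj, relation, user) in self.revisions[rev].tuples
```

So the background job carries the zookie it was enqueued with: a check pinned to that revision still passes after revocation, while a fresh check fails.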
Another problem is cache invalidation: when a permission is added or removed, users expect that to take effect quickly. I can't remember how the paper handles this, but in any case, since the permissions are stored in Zanzibar, every change goes through Zanzibar. If you remove the internal data store, you lose that change notification.
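To make the "every change goes through the store" point concrete, here is a toy sketch: because writes and deletes all funnel through one place, that place can fan out invalidation events to caches. The names are hypothetical and this is much simpler than whatever the paper actually does.

```python
from collections import defaultdict

class TupleStoreWithWatch:
    """Toy ACL store that notifies watchers on every change."""
    def __init__(self):
        self.tuples = set()
        self.watchers = []  # callbacks: fn(event, tuple)

    def watch(self, callback):
        self.watchers.append(callback)

    def _notify(self, event, t):
        for cb in self.watchers:
            cb(event, t)

    def write(self, obj, relation, user):
        t = (obj, relation, user)
        self.tuples.add(t)
        self._notify("write", t)

    def delete(self, obj, relation, user):
        t = (obj, relation, user)
        self.tuples.discard(t)
        self._notify("delete", t)

class PermissionCache:
    """Drops all cached answers for an object whenever its ACL changes."""
    def __init__(self, store):
        self.entries = defaultdict(dict)  # obj -> {user: allowed}
        store.watch(self.on_change)

    def on_change(self, event, t):
        obj, _, _ = t
        self.entries.pop(obj, None)  # coarse invalidation: whole object
```

If permission checks instead hit each service's own database directly, there is no single choke point to hang this invalidation hook on.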
The pseudo-Zanzibar lives in production today, but I feel it's one of the mistakes of my career.
jauntywundrkind•6h ago
> Most noticeably, Zanzibar is built with Spanner Google’s distributed database, and Spanner has the ability to order timestamps using TrueTime, which relies on atomic clocks and GPS antennae: this is not standard equipment for a server. Even CockroachDB, which is explicitly modeled off of Spanner, can’t rely on having GPS & atomic clocks around so it has to take a very different approach.
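The TrueTime idea quoted above can be sketched roughly: the clock API returns an uncertainty interval rather than a point, and a transaction "commit-waits" until the interval has moved past its chosen timestamp, so any later transaction is guaranteed a strictly larger one. `EPSILON` and the function names are my own assumptions, not Spanner's API.

```python
import time

EPSILON = 0.007  # assumed clock-uncertainty bound, in seconds

def tt_now():
    """TrueTime-style interval: the true time lies in [earliest, latest]."""
    t = time.time()
    return (t - EPSILON, t + EPSILON)

def commit():
    """Commit-wait, sketched: pick a timestamp at the top of the uncertainty
    interval, then spin until the interval's earliest bound has passed it
    before acknowledging the commit."""
    ts = tt_now()[1]
    while tt_now()[0] <= ts:  # wait until TT.now().earliest > ts
        time.sleep(EPSILON / 4)
    return ts
```

The cost is that every commit waits out roughly twice the uncertainty bound, which is why Spanner invests in GPS and atomic clocks to keep that bound small, and why systems without such clocks (like CockroachDB) take a different approach.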
GPS-based timing is very accurate (though not atomic-clock accurate), and very good boards can be had for a couple hundred dollars, built around chips like the U-blox LEA-M8F or its newer variants. @jeffgeerling has been going through a bunch of the various offerings. https://news.ycombinator.com/item?id=28380002 https://news.ycombinator.com/item?id=36893922
If that's not good enough, chip-scale atomic clocks like the CASC-SA65 are "only" $5-$3k. https://www.microchipdirect.com/product/090-02789-001?srslti...
It'd be very interesting to assess what the requirements really are, what the threat analysis really says. My instinct is that even advanced attacks are unlikely to be a problem, that whether access is cut off this millisecond or the next will rarely make a huge difference. But then, most people aren't safeguarding extremely high-value systems that would incentivize advanced persistent threats to sit there probing.
Really cool to see skip-lists involved; it's very fun to have a data structure that integrates statistics. I'm kind of surprised how little advancement there's been here since Pugh introduced them in 1989.
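For anyone who hasn't met them: Pugh's skip list gives each node a random tower height, so searches skim along the tall towers and drop down, with expected O(log n) cost and no rebalancing. A minimal sketch (insert and lookup only):

```python
import random

class SkipNode:
    def __init__(self, key, level):
        self.key = key
        self.forward = [None] * level  # one next-pointer per level of the tower

class SkipList:
    """Pugh-style probabilistic skip list: coin flips decide tower heights."""
    MAX_LEVEL = 16
    P = 0.5  # probability of growing a tower by one level

    def __init__(self):
        self.head = SkipNode(None, self.MAX_LEVEL)
        self.level = 1

    def _random_level(self):
        lvl = 1
        while random.random() < self.P and lvl < self.MAX_LEVEL:
            lvl += 1
        return lvl

    def insert(self, key):
        update = [self.head] * self.MAX_LEVEL
        node = self.head
        for i in range(self.level - 1, -1, -1):  # descend from the top level
            while node.forward[i] and node.forward[i].key < key:
                node = node.forward[i]
            update[i] = node  # rightmost node before `key` at level i
        lvl = self._random_level()
        self.level = max(self.level, lvl)
        new = SkipNode(key, lvl)
        for i in range(lvl):  # splice the new tower into each level
            new.forward[i] = update[i].forward[i]
            update[i].forward[i] = new

    def contains(self, key):
        node = self.head
        for i in range(self.level - 1, -1, -1):
            while node.forward[i] and node.forward[i].key < key:
                node = node.forward[i]
        node = node.forward[0]
        return node is not None and node.key == key
```

The "statistics" hook is that the structure's balance comes entirely from those coin flips: expected shape, not enforced shape.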