For db stuff - what if we flipped the way we write schemas around? A schema is something you derive as the current state, rather than something you start by writing and then generate migrations from. That way you can reason over the whole chain rather than a particular snapshot.
[0] https://docs.datomic.com/datomic-overview.html#information-m...
Gameplay tags for example [0], and quite a lot of SQL.
[0] https://dev.epicgames.com/documentation/en-us/unreal-engine/...
The value of a schema in a db like Postgres isn’t in the description of the storage format or indexing, but in the invariants the database can enforce on the data coming in. That includes everything from uniqueness and foreign key constraints to complex functions that can pull in dozens of tables to decide whether a row is valid or should be rejected. How do you derive declarative rules from a Turing-complete language built into the database?
E.g. for some table Users, you start with `add user_id`, `add org_id`, `remove org_id`, so the current state is `Users{ user_id }`. But you trust the compiler to derive that, and when you want to do something with Users you scope into it, or tell it how to handle different steps in that chain.
I'm not saying this isn't equivalent at the end of the day, just asking whether anything surfaces it this way, or makes it more ergonomic.
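A minimal sketch of what I mean, with a hypothetical migration-step type and the Users example above (not any particular tool's API):

```haskell
import qualified Data.Set as Set

-- Hypothetical migration steps for the Users example above.
data Step = AddColumn String | RemoveColumn String

-- The schema isn't written by hand; it's derived by folding over the
-- whole chain of steps.
currentColumns :: [Step] -> Set.Set String
currentColumns = foldl apply Set.empty
  where
    apply cols (AddColumn c)    = Set.insert c cols
    apply cols (RemoveColumn c) = Set.delete c cols

main :: IO ()
main = print (currentColumns [AddColumn "user_id", AddColumn "org_id", RemoveColumn "org_id"])
-- prints: fromList ["user_id"]   i.e. Users{ user_id }
```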
So when you call writeXY your caller has absolutely no need to know what actually happens. Catching and modifying old versions is just another morphism. You can even keep the layout and just accept version and payload as input.
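A rough sketch of that read path, with made-up record layouts (nothing here is a real API): the old version is upgraded by an ordinary function, and callers only ever see the current shape.

```haskell
-- Made-up record layouts; v1 stored a single name, v2 splits it.
data RecordV1 = RecordV1 { fullName :: String }
data RecordV2 = RecordV2 { firstName :: String, lastName :: String } deriving (Show)

-- Upgrading an old version is just another function applied on read.
upgrade :: RecordV1 -> RecordV2
upgrade (RecordV1 n) = case words n of
  (f : rest) -> RecordV2 f (unwords rest)
  []         -> RecordV2 "" ""

-- Callers never see which layout was actually stored.
readRecord :: Either RecordV1 RecordV2 -> RecordV2
readRecord = either upgrade id

main :: IO ()
main = print (readRecord (Left (RecordV1 "Ada Lovelace")))
```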
You define an Ecto schema and write your own migrations, but the state of the app seems tied to that snapshot of the schema. It's not like you have some lens into the overall chain of all the migrations together.
The correct answer here is that a database ought to be modeled as a data type. SQL treats data type changes as separate from the value transform; to call that a design failure is an understatement.
The actual answer is that a schema update is a typed function from schema1 to schema2. The type / schema of the db is carried in the function's type, but the actual data is moved by the function's computation.
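A minimal sketch of that idea with made-up schemas: the migration's type mentions both schemas, and running the function is what actually moves the data (here, splitting inline org names out of a users table).

```haskell
import Data.List (nub)

-- Made-up schemas: v1 stores org names inline, v2 splits orgs into their own table.
newtype SchemaV1 = SchemaV1 { users1 :: [(Int, String)] }   -- (user_id, org name)

data SchemaV2 = SchemaV2
  { users2 :: [(Int, Int)]      -- (user_id, org_id)
  , orgs2  :: [(Int, String)]   -- (org_id, org name)
  } deriving (Show)

-- The type carries both schemas; the body moves the data.
migrate :: SchemaV1 -> SchemaV2
migrate (SchemaV1 us) = SchemaV2 [ (uid, orgIdOf name) | (uid, name) <- us ] orgTable
  where
    orgTable   = zip [1 ..] (nub (map snd us))
    orgIdOf nm = head [ i | (i, n) <- orgTable, n == nm ]

main :: IO ()
main = print (migrate (SchemaV1 [(1, "acme"), (2, "acme"), (3, "globex")]))
```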
Keeping multiple databases around is honestly a potentially good use of homotopy type theory extended with support for partial / one-way equivalences.
I'm not sure what the point of the post is. Minimizing side effects in your code (keeping functions pure) is exactly what gives you the flexibility he says FP misses?!
There are experimental systems like Unison that keep old versions of your code alive as data, which are fascinating solutions to the underlying problem.
But none of these things are functional programming? This is more the tradition of 'expressive static types' than it is of FP.
What about Lisp, Racket, Scheme, Clojure, Erlang / Elixir...
> The immutability of the log is the entire value proposition.
In practice, much of the article seems to be about the problems introduced by rigid typing. Your quote, for instance, is used in the context of reading old logs using a typed schema if that schema changes. But that's a non-issue in the FP languages mentioned above since they tend towards the use of unstructured data (maps, lists) and lambdas. Conversely, reading state from old, schema-incompatible logs might be an issue in something like Java or C++, which certainly are not FP languages as the term is usually understood.
So overall, not really an FP issue at all, and yet the article is called "what functional programmers get wrong". The author's points might be very valid for his version of FP, but his version of FP seems to be 'FP in the ML tradition, with types so rigid you might want to consult a doctor after four hours'.
But the solutions and tools functional programming provides help achieve higher verifiability, fewer side effects and more compile-time checks of code in the deployment unit. That other challenges exist is fully correct. But without a functional approach, you have these challenges plus all the others.
So while FP is not a one-size-fits-all solution to distributed systems challenges, it does help solve a subset of the challenges on a single system level.
> A deploy pipeline that queries your orchestrator for running image tags, checks your migration history against the schema registry, diffs your GraphQL schema against collected client operations, and runs Buf’s compatibility checks: this is buildable today, with off-the-shelf parts.
that was largely successful at eliminating interservice compatibility errors, but it felt like we were scrambling around in a dark and dusty corner of software engineering that not many people really cared about. Good to see that there's a body of academic work, which I was not entirely aware of, looking into this.
The article successfully identifies that, yes, there are ways, with sufficient effort, to statically verify that a schema change is (mostly) safe or unsafe before deployment. But even with that determination, a lot of IDLs still make it very hard to evolve types! The compatibility checker will successfully identify that strengthening a type from `optional` to `required` isn't backwards compatible: _now what_? There isn't a safe pathway for schema evolution, and you need to reach for something like Cambria (https://www.inkandswitch.com/cambria/, cited in the article) to handle the evolution.
I've only seen one IDL try to do this properly by baking evolution semantics directly into its type model: typical (https://github.com/stepchowfun/typical), with its `asymmetric` type label, which makes a field required in the constructor but not the deserialiser. I would like to see these "asymmetric" types find their way into IDLs like Protocol Buffers so that their compatibility checkers, like Buf's `buf breaking` (https://buf.build/docs/reference/cli/buf/breaking/), can formally reason about them. I feel like there is a rich design space of asymmetric types that is poorly described (enum values that can only be compared against but never constructed, paired validations, fallback values sent over the wire, etc.)
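A rough sketch of the asymmetric idea (hypothetical names, not typical's actual machinery): the field stays optional on the read side, but the only exported way to construct a new value requires it.

```haskell
-- Hypothetical message type; in a real module only mkMessage would be exported.
data Message = Message
  { messageId :: Int
  , traceId   :: Maybe String   -- optional on the deserialise side
  } deriving (Show)

-- The field is required when building a new message...
mkMessage :: Int -> String -> Message
mkMessage mid tid = Message mid (Just tid)

-- ...but old payloads that predate the field still decode fine.
decodeLegacy :: Int -> Message
decodeLegacy mid = Message mid Nothing

main :: IO ()
main = mapM_ print [mkMessage 1 "abc123", decodeLegacy 2]
```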
Another one that I think makes a pretty good attempt is the config language CUE and its type system (https://cuelang.org/docs/concept/the-logic-of-cue/), which allows you to do first-class version compatibility modelling. But I have yet to investigate whether CUE is fit for purpose for modelling compatibility over the wire / between service deploys.
But yes, interfacing between formally proven code and the wild wild West is always an issue. This is similar to when a person from a country with a strong legal system suddenly finds themselves subject to the rule of the jungle.
The better question to ask is why we chose to build a chaotic jungle in the first place, and how we get out of it.
I still want to believe that future programming languages will be capable of tackling these issues, but exploring the design space will require being cognizant of those issues in the first place.
I don't think the author's criticism applies to Lisp-style functional programming, which is much more accommodating to embracing chaos.
Ideas that helped me situate many of the issues we struggled with at various companies:
- distinguishing programs (the code as designed) from (distributed) systems (the whole mess, including maintenance and migration, as deployed)
- versioning code (reversibly) and data (irreversibly)
- walk-through of software solutions for distributed systems, and their blind spots wrt data versioning; temporal databases
- semantic drift: schema-less change
- limitations of type systems for managing invariants
- co-evolving code and data; running old code on old data
The real benefit of systems thinking is recognizing whether an issue is unsolvable and/or manageable with the current approach, or whether it warrants a technology upgrade or directional shift. It's like big-O evaluation of algorithms, not for space/time but for issue tractability.
An FP programmer admitting they live in ivory towers. I never thought I'd see the day.
I love it. HN needs 10 times more of this.
This person has grown up - the rest of you? Get a grip.
Whether or not it's AI, the content is in fact correct.
There is no 'one' state of your application, unless you literally do maintenance-window deploys, have zero queues, and keep everything synchronous.
So either his prompt is making the writing more palatable, or maybe he's just prolific.
In either case, I would count this as a W.
Fundamentally, all your systems become distributed systems after a certain scale, and accuracy at the local level is usually not where systemic risks come from. A distributed system of fully accurate and correct local programs can still implode or explode. Migration, compatibility and system stability are surprisingly dependent on engineer experience to this day.
First, put all your services in a monorepo and have it all build as one under CI. That’s a static check across the entire system. When you do a polyrepo you are opening up a mathematical hole where two repos can be mismatched.
Second, don’t use databases. Databases don’t type-check and aren’t compatible with the functional code written on your servers.
The first one is stupid: everyone just makes the mistake, and most of the time, unless you were there at the beginning to enforce a monorepo, you have to live with someone deploying a breaking change that affects a repo somewhere else.
The second one is also stupid, but you had to be there at the very, very beginning to have stopped it. Databases should have been conceived such that any query you run has to be compiled against the database and is thus statically checked. This is something that could easily have been done but wasn’t, and now we all just make do with testing.
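A tiny sketch of what statically checked queries could look like if the table schema were an ordinary type in the host language (hypothetical names, not an existing database API):

```haskell
-- Hypothetical: the table's schema as an ordinary type.
data User = User
  { userId :: Int
  , orgId  :: Maybe Int
  , email  :: String
  } deriving (Show)

-- A "query" is then just a typed function. Drop or rename the email
-- column in the schema type and this stops compiling, instead of
-- failing at runtime in production.
usersWithoutOrg :: [User] -> [String]
usersWithoutOrg users = [ email u | u <- users, orgId u == Nothing ]

main :: IO ()
main = print (usersWithoutOrg [User 1 Nothing "a@example.com", User 2 (Just 7) "b@example.com"])
```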
That’s like half the problems with distributed systems completely solved. And these problems just exist because of raw stupidity. Stuff like race conditions and deadlocks are solvable too, but much harder.