Can you elaborate?
attrs originally had

    import attr

    @attr.s
    class C:
        x = attr.ib()
as its main API (with `attr.attrs` and `attr.attrib` as serious-business aliases so you didn't have to use it). That API was always polarizing: some loved it, some hated it.
I will point out, though, that it predates type hints, and it was an effective way to declare classes with little "syntax noise", which made it easy to write but also easy to read, because the import name was part of the API.
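For comparison, the same class written with the long-form aliases (a minimal sketch; both spellings exist in attrs and behave identically):

    import attr

    @attr.attrs
    class C:
        x = attr.attrib()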
Here is more context: https://www.attrs.org/en/stable/names.html
I REGRET NOTHING
(I’m the author of dataclasses, and I owe an immeasurable debt to Hynek).
i believe that design pressure sense is a form of taste, and like taste it needs to be cultivated, and that it can't be easily verbalized or measured. you just know that your architecture is going to have advantageous properties, but to sit down and explain why would take an inordinate amount of effort. the goal is to be able to look at the architecture and see its failure states as it evolves through other people working with it, external pressures, requirement changes, etc. over the course of 2, 3, ... 10, etc. years into the future. i stay in touch with former colleagues from projects where i was architect, just so that i can learn how the architecture evolved, what the pain points were, etc.
i've met other architects who have that sense, and it's a joy to work with them, because it is vibing. conversely, "best practices or bust" sticklers are insufferable. i make sure that i don't have to contend with such people.
Which is what the person I was replying to said with "Code is for communicating with humans primarily, even though it needs to be run on a machine." If the primary purpose were communication with other humans, we wouldn't choose such awkward languages. The primary purpose of code is to run and provide some kind of features supporting use cases. It's really nice, however, if humans can understand it well.
The code does also need to be understandable by other humans, but that is not its primary purpose.
The only thing that matters to the machine is opcodes and bits. But that's alien to humans, so we map it to assembly. Any abstraction higher than that is mostly for reasoning about the code and sharing that with other people. And in the process we find some very good abstractions, which we then embed into programming languages: procedures, namespacing, OOP, pattern matching, structs, traits/protocols,...
All these abstractions are good because they are useful when modeling a problem. Some are so good that it's worth writing a whole VM to get them (Lisp homoiconicity, Smalltalk's consistent world representation,...)
Saying that reading code is the point of writing code is crazy, that's like saying the point of writing scripts is to read them, or the point of writing sheet music is to look at it.
No - the point of writing a script is to have it performed as a play, the point of writing music is to hear it and enjoy it. The point of writing code is to run it.
> All these abstractions are good because they are useful when modeling a problem.
Then what do you do after modeling the problem? You solve it! You run the program! Everything is in service to that.
No one does it in isolation. The goal of having a common formal notation is for everyone to share solutions unambiguously with each other. We have mathematical notation, choreographic notation, music notation, electrical notation,... because when you've created something, you want to share it as well as possible with others. If not, you could just ship the end result and be done with it.
So no, the point of writing music is not to hear it and enjoy it. To do that you just pick up an instrument and perform; you don't need to write anything. But to have someone else perform it, you either rely on their ear, their sight, and their memory to pick things up, or you use the common notation to exchange the piece of music.
Yes, they do. There are plenty of solo developers out there. And plenty of solo musicians who write and perform music.
So you are already not interpreting it literally: none of us can avoid our biases (also, machine code is code too, yet nobody misinterpreted that).
I took that quote to mean that we go through the extra trouble of writing nice code for humans to be able to reason about the code, and especially to update when changes are needed: that makes it the primary reason we invent programming languages instead of going with machine code directly.
Also, it is good to remember what game is actually being played. When someone comes up with and popularizes a given "best practice", why are they doing so? In many cases, Uncle Bob types are doing this just as a form of self-promotion. Most best practices are fundamentally indefensible, with proponents resorting to ad-hominem attacks if their little church is threatened.
I find this topic difficult to navigate because of the many trade-offs. One aspect that wasn't mentioned is temporal. A lot of the time, it makes sense to start with a "database-oriented design" (in the pejorative sense), where your types are just whatever shape your data has in Postgres.
However, as time goes on and your understanding of the domain grows, you start to realize the limitations of that approach. At that point, it probably makes sense to introduce a separate domain model and use explicit mapping. But finding that point in time where you want to switch is not trivial.
Should you start with a domain model from the get-go? Maybe, but it's risky because you may end up with domain objects that don't actually do a better job of representing the domain than whatever you have in your SQL tables. It also feels awkward (and is hard to justify in a team) to map back and forth between domain model, SQL SELECT rows, and JSON response bodies if they're pretty much the same, at least initially.
So it might very well be that, rather than starting with a domain model, the best approach is to refactor your way into it once you have a better feel for the domain. Err on the side of little or no abstraction, but don't hesitate to introduce abstraction when you feel the pain from too much "concretion". Again, it takes judgment so it's hard to teach (which the talk does an admirable job in pointing out).
By domain model do you mean something like what a scientist would call a theory? A description of your domain in terms of some fundamental concepts, how they relate to each other, their behaviour, etc? Something like a specification?
Which could of course have many possible concrete implementations (and many possible ways to represent it with data). Where I get confused with this is I'm not sure what it means to map data to and from your domain model (it's an actual code entity?), so I'm probably thinking about this wrong.
For context: https://fsharpforfunandprofit.com/posts/designing-with-types...
So both the storage and the presentation layer are strings, but they differ. To reconcile the two, you need an intermediate layer, which contains the structures that are the domain models, plus the logic that manipulates them. To jump from one layer to another you map the data: in this example, strings to structs, then back to strings.
With MVC and CRUD apps, the layers often have similar models (or the same ones, especially with dynamic languages), so you don't bother with mapping. But as the use cases become more complex, they alter the domain layer and the models within, and then you need to add mapping code. Your storage layer may have many tables (if using SQL) that map to a single struct at the domain layer, which then becomes many models at the presentation layer, with duplicated information.
Note: that's why a lot of people don't like most ORM libraries. They're great when the models are similar, but when they start to diverge, you always need to resort to raw SQL queries, and then it becomes a pain to refactor. The good ORM libraries rely on metaprogramming, so they're just weird SQL.
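As a minimal sketch of that kind of explicit mapping (all names invented for illustration): a storage-layer row, a domain struct with behavior, and a presentation-layer dict, with small functions to jump between layers:

    from dataclasses import dataclass
    from datetime import date

    # Storage layer: what the driver hands back, e.g. a tuple of strings.
    row = ("42", "alice", "2024-05-01")

    # Domain layer: a struct with real types and behavior.
    @dataclass
    class User:
        id: int
        name: str
        signed_up: date

        def is_new(self) -> bool:
            return (date.today() - self.signed_up).days < 30

    def from_row(r) -> User:
        return User(id=int(r[0]), name=r[1], signed_up=date.fromisoformat(r[2]))

    # Presentation layer: back to strings, shaped for the API response.
    def to_json_dict(u: User) -> dict:
        return {"id": str(u.id), "name": u.name, "signedUp": u.signed_up.isoformat()}

    user = from_row(row)
    assert to_json_dict(user)["signedUp"] == "2024-05-01"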
The way the different classes are associated with each other by method calls makes evident a kind of "theory" of our system: what kinds of objects there are in the system, what operations they can perform returning other types of objects as results, and so on. So it looks much like a "theory" might in ecological biology: multiple species interacting with each other.
In that model, you can navigate from anywhere to anywhere by following references.
The domain model, at least from a DDD perspective, is different in at least a couple of ways: your domain classes expose business behaviours, and you can hide certain entities entirely.
For example, imagine an e-commerce application where you have to represent an order.
In the DB model, you will have the `order` table as well as the `order_line` table, where each row of the latter references a row of the former. In your domain model, instead, you might decide to have a single Order class with order lines only accessed via methods and in the form of strings, or tuples, or whatever - just not with an entity. The Order class hides the existence of the order_line table.
Plus, the Order class will have methods such as `markAsPaid()` etc, also hiding the implementation details of how you persist this type of information - an enum? a boolean? another table referencing rows of `order`? It does not matter to callers.
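A minimal sketch of such an Order in Python (invented names, persistence elided):

    from dataclasses import dataclass, field

    @dataclass
    class Order:
        id: int
        _lines: list = field(default_factory=list)  # order_line rows, hidden
        _paid: bool = False                         # persistence detail, hidden

        def add_line(self, sku: str, qty: int) -> None:
            self._lines.append((sku, qty))

        def line_summaries(self) -> list[str]:
            # Callers get plain strings, never an OrderLine entity.
            return [f"{qty} x {sku}" for sku, qty in self._lines]

        def mark_as_paid(self) -> None:
            # How this persists (enum? boolean? extra table?) is not the caller's concern.
            self._paid = True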
There’s no one data model ideal for all scenarios — so why not have a different model for each scenario? Then I just need to figure out a way to transform between one model and the next, and whatever logic depending on that idealized data model can now be implemented fairly simply (since that’s the nature of a good data model - the rest of the logic often just falls out).
So the data model you're using is localized to the domain/subject in question. You're just transitioning the data between models as needed. A domain is just an arbitrary context: the persistence layer, or the UI logic, or even something as specific as "I want my model for an accountant to reflect how an accountant UI page would organize it, because I only understand 30% of what they're asking me to do, so keeping it 'in their terms' makes things much easier to implement blindly." Or perhaps the primary purpose of this particular function is various aggregations for reporting, so I start off by organizing my dataset into a hierarchy that largely aligns with the aggregation groups. Once it's aligned properly, the aggregation logic itself becomes utterly trivial to express.
You could even say that every time you query the database beyond a single table select *, you’re creating a new domain-specific data model. You’re just transforming from the original table representations to a new one.
All domain modeling is specifically choosing a representation that best fits the logic you’re about to write, and then figuring out how to take the model you have and turn it into the model you want. Everything else on the subject is just implementation detail.
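For instance (a toy sketch with invented data), reshaping flat rows into a hierarchy that matches the report's groupings makes the aggregation itself a one-liner:

    from collections import defaultdict

    # Flat rows as they come from the database: (region, category, amount).
    rows = [("EU", "books", 10), ("EU", "toys", 5), ("US", "books", 7)]

    # Reshape into the reporting model: region -> category -> amounts.
    tree = defaultdict(lambda: defaultdict(list))
    for region, category, amount in rows:
        tree[region][category].append(amount)

    # Once the shape matches the report, the aggregation is trivial.
    report = {r: {c: sum(a) for c, a in cats.items()} for r, cats in tree.items()}
    assert report == {"EU": {"books": 10, "toys": 5}, "US": {"books": 7}}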
I know this was a minor point, but I think it speaks to the overall topic, so I'll poke at it.
Adjacency lists are perhaps the worst way to store a graph or tree in an RDBMS. They may be the easiest to understand, but they have some of the worst performance characteristics, especially if your RDBMS doesn't have recursive CTEs. This starts to matter at a much lower scale than you might think; several million rows is enough to start showing slowdowns.
This book [0] (Joe Celko's Trees and Hierarchies in SQL For Smarties) shows many other options, though it does lack the closure table approach [1], which is my preferred approach.
And here, we come full circle back to the long-held friction between DBs and applications. You start mentioning triggers, and devs flinch, stating that they don't want logic in the DB. In every case I've ever seen, the replacements they come up with are incredibly convoluted and prone to errors, but hey, it's not in the DB. There is no reason to fear triggers, if and only if you treat them the same way that you'd treat code (because they are code): added/modified/removed only via PRs, with careful review and testing.
[0]: https://ia804505.us.archive.org/19/items/0411-pdf-celko-tree...
[1]: https://dirtsimple.org/2010/11/simplest-way-to-do-tree-based...
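For a flavor of the closure-table approach, here's a toy sketch in Python/SQLite that maintains the closure rows in application code (the article linked above does it with triggers). Every (ancestor, descendant) pair is materialized, so subtree queries need no recursion:

    import sqlite3

    con = sqlite3.connect(":memory:")
    con.executescript("""
    CREATE TABLE node (id INTEGER PRIMARY KEY, label TEXT);
    -- Closure table: one row per (ancestor, descendant) pair, incl. depth-0 self-rows.
    CREATE TABLE node_paths (
        ancestor   INTEGER NOT NULL REFERENCES node(id),
        descendant INTEGER NOT NULL REFERENCES node(id),
        depth      INTEGER NOT NULL,
        PRIMARY KEY (ancestor, descendant)
    );
    """)

    def add_node(label, parent=None):
        node_id = con.execute("INSERT INTO node (label) VALUES (?)", (label,)).lastrowid
        con.execute("INSERT INTO node_paths VALUES (?, ?, 0)", (node_id, node_id))
        if parent is not None:
            # Copy every path leading to the parent, extended down one level.
            con.execute("""INSERT INTO node_paths (ancestor, descendant, depth)
                           SELECT ancestor, ?, depth + 1
                           FROM node_paths WHERE descendant = ?""", (node_id, parent))
        return node_id

    root = add_node("root")
    child = add_node("child", root)
    add_node("grandchild", child)

    # The whole subtree under root in one indexed query, no recursive CTE:
    print(con.execute("""SELECT n.label, p.depth FROM node_paths p
                         JOIN node n ON n.id = p.descendant
                         WHERE p.ancestor = ? AND p.depth > 0""", (root,)).fetchall())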
You absolutely should go with a domain model from the get-go. You can take some shortcuts if absolutely necessary, such as simply using a typealias like `type User = PostgresUser`. But you should definitely NOT use postgres-types inside all the rest of your code - that just asks for a terrible refactoring later.
> It also feels awkward (and is hard to justify in a team) to map back and forth between domain model, sql SELECT row and JSON response body if they're pretty much the same, at least initially.
Absolutely not. This is the most normal thing in the world. And, in fact, they won't be the same anyway. Don't you want to use at least decent calendar/datetime types and descriptive names? Don't you want to at least structure things a bit? And you should really, really use proper types for IDs.
User(name: string, posts: string[]) is terrible.
User(name: UserName, posts: PostId[]) is acceptable. So you will have to do some kind of mapping even in the vast majority of trivial cases.
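In Python, for example, zero-runtime-cost NewType wrappers get you most of that (a minimal sketch):

    from typing import NewType

    UserName = NewType("UserName", str)
    PostId = NewType("PostId", str)

    def rename_user(name: UserName) -> None: ...

    rename_user(UserName("alice"))  # fine
    rename_user("alice")            # flagged by a type checker, harmless at runtime

The mapping at the boundary is then often just wrapping, e.g. `PostId(row["post_id"])`.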
Impedance mismatch, ORM, type generators, query parameterisation, async, etc... all stem from treating data as this "external" thing instead of the beating heart of the application.
It terrifies me to say this, but sooner or later someone is going to cook up a JavaScript database engine that also has web capability, along with a native client-side cache component... and then it'll be curtains for traditional databases.
Oh, the performance will be atrocious and grey-bearded wise old men will waggle their fingers in warning, but nobody will care. It'll be simple, consistent, integrated, and productive.
it hurts that it's not exactly wrong.
but i don't think it's 100% right either; there are some things that you just can't do reliably, in current db engines at least.
As soon as you start baking this kind of support into the db, all you have is a db engine that has all the other bits stuffed in it.
They'll still have most of the issues you describe, it'll just be all in the "db layer" of the engine.
But again, at that point you're really just moving the surface rather than addressing the issues.
Something like Java's Akka or .NET Orleans combined with React.
So the "data" would be persisted by stateful Actors, which can run either on the server or in the browser, using the exact same code. Actors running in the browser can persist to localStorage and on the server the actors can persist to blob storage or whatever.
Actors have unique addresses that can be used to activate them. In this system these become standard HTTP URIs so that there is a uniform calling convention.
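A toy sketch of the addressing idea (everything here is invented; a real Orleans/Akka-style runtime also handles distribution, concurrency, and lifecycle):

    import json

    STORAGE = {}  # stand-in for localStorage in the browser / blob storage on the server

    class CounterActor:
        def __init__(self, uri):
            self.uri = uri
            self.state = json.loads(STORAGE.get(uri, '{"count": 0}'))

        def increment(self):
            self.state["count"] += 1
            STORAGE[self.uri] = json.dumps(self.state)  # persist on every change

    _active = {}

    def activate(uri):
        # Activation by address: the same code runs client- or server-side;
        # only what backs STORAGE differs.
        if uri not in _active:
            _active[uri] = CounterActor(uri)
        return _active[uri]

    activate("https://example.com/actors/counter/42").increment()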
Bringing model definition and usage closer to the storage layer, thereby reducing the need for translation and transport, might cut down on repeated and variant definitions across layers, but it doesn't remove the other issues related to data storage.
There will need to be a system for storage, and that will have to deal with transactional state management as well as consistency schemes. Having that in an actor layer that is shared with other parts of the system might solve some issues, but it'd need to be really carefully managed so as not to inflict the solutions to those problems on the other parts of the system using the same transport mechanisms.
i dislike javascript with a passion but I'd be down for using a wasm based system that does what you say, my skepticism is usually just me shouting at clouds so it'd be interesting to see a working model.
First and foremost, what if there is not "the" database? What if you have multiple places that store data? For example, a postgres for ACID stuff, something like Kafka/RabbitMQ or similar to easily communicate with other services (or even yourself) - and sure, you could do that in postgres, but it's a trade-off. Then maybe something like redis/memcache for quick lookups/caching, then maybe elasticsearch for indexed search queries, and so on. And you usually also have some http API.
Sure you can say "I just do all with postgres" and honestly, that's often a good choice.
But it shows that it's not where "code (...) really belongs". Even IF you move a lot of logic into your database engine (and you often should), you will most of the time have another API, and there will be a connection. Well, unless you use shared database tables with another application for communication.
All you do is push it out further to a later point - and often forcefully so.
> It terrifies me to say this, but sooner or later someone is going to cook up a JavaScript database engine that also has web capability, along with a native client-side cache component... and then it'll be curtains for traditional databases.
Not going to happen. Services like https://spacetimedb.com exist. Also, solutions like Spark (where you send your code to the database(s)) exist. And for certain things, they are great. However, it is a trade-off. There is no one-fits-all solution.
These are different things, and the fact that they're so often conflated is IMO prima facie evidence that they're misused. If you need a queue, you shouldn't be reaching for Kafka, and vice versa.
> Then maybe something like redis/memcache for quick lookups/caching
With proper RDBMS schema, indexing, and queries, these are _very_ frequently not needed. At FAANG scale, sure, but I think most would be shocked at how performant a properly-tuned RDBMS can be. That's of course the problem; RDBMSs are incredibly difficult to run optimally at scale, and so the easier solution is to scale them up and slap a cache in front of them.
> elasticsearch for indexed search queries
Again, probably not needed for most. Postgres, MySQL, and SQLite (and I assume others) all have FTS that works quite well. ES is massively complicated to administer.
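For instance, SQLite's FTS5 (a minimal sketch; assumes an SQLite build with FTS5 enabled, which the standard Python builds ship):

    import sqlite3

    con = sqlite3.connect(":memory:")
    con.execute("CREATE VIRTUAL TABLE docs USING fts5(title, body)")
    con.execute("INSERT INTO docs VALUES (?, ?)",
                ("intro", "full text search without a separate search cluster"))
    # MATCH gives tokenized full-text search out of the box.
    print(con.execute("SELECT title FROM docs WHERE docs MATCH 'search'").fetchall())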
Disclaimer: I am the developer of SpacetimeDB which is spiritually similar. We absolutely intend to run client side as well. We need to for client side prediction. And eventually we’ll probably do web rendering at some point as well.
In any case, there is a slide in the talk that has both the Pydantic and SQL Alchemy logos. As far as I know, there’s only one (somewhat popular) library that ties these two together. I think the speaker makes a persuasive case that data, domain, API, and other models should remain related but distinct.
No wonder there are so many single-monitor, no-LSP savants out there.
I tried pulling out the YouTube transcript, but it was very uncomfortable to read, with asides and jokes and "ums" that are all native artifacts of speaking in front of a crowd but that only represent noise when converted to long written form.
---
FWIW, I'm the speaker and let me be honest with you: I'm super unmotivated to write nowadays.
In the past, my usual MO was writing a bunch of blog posts and submitting the ones that resonated to CfPs (e.g. <https://hynek.me/articles/python-subclassing-redux/> → <https://hynek.me/talks/subclassing/>).
However, nowadays, thanks to the recent-ish changes in Twitter and Google, my only chance to have my stuff read by a nontrivial number of people is hitting the HN frontpage, which is a lottery. It's so bad I even got into YouTubing to get a roll at the algorithm wheel.
It takes (me) a lot of work to crystallize and compress my thoughts like this. Giving it as a talk at a big conference at least opens the door to interesting IRL interactions, which are important (to me) because I'm an introvert.
I can't stress enough how we're currently eating the seed corn by killing the public web.
I just pasted the YouTube link into AI Studio and gave it this prompt if you want to replicate:
reformat this talk as an article. remove ums/ahs, but do not summarize, the context should be substantively the same. include content from the slides as well if possible.