You have way fewer tools for actually providing nice, straightforward APIs. I appreciate that Pydantic gives you type safety, but at some point the actual ease of writing correct code goes beyond type safety.
Just real straightforward stuff like loading in user input becomes a whole song and dance, because Pydantic is an extremely basic validation tool… the hacks in DRF, like request contexts, are useful (see the sketch below)!
I’ve seen many projects do this and it feels like such a step back in offering simple-to-maintain APIs. Maybe I’m just biased cuz I “get” DRF (and did lose half a day recently to weird DRF behavior…)
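For reference, the DRF request-context hack mentioned above looks roughly like this (a sketch; the serializer and field names are made up, not anyone's production code):

# Sketch of the DRF "request context" pattern; MyThingSerializer is hypothetical.
from rest_framework import serializers

class MyThingSerializer(serializers.Serializer):
    name = serializers.CharField()
    owner = serializers.SerializerMethodField()

    def get_owner(self, obj):
        # The view passes the request in via context, so the serializer
        # can adapt its output to the current user.
        request = self.context.get("request")
        return request.user.username if request else None

# In the view:
# serializer = MyThingSerializer(thing, context={"request": request})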
I can stretch my imagination about Astral monetizing their tools, but this one is too difficult
In 2022, the project evolved into a commercial entity called Pydantic Services Inc., founded by Samuel Colvin and Adrian Garcia Badaracco, to build products around the open-source library. The company raised $4.7 million in seed funding in February 2023, led by Sequoia Capital, with participation from Partech, Irregular Expressions, and other investors. This was followed by a $12.5 million Series A round in October 2024, again led by Sequoia Capital and including Partech Partners, bringing the total funding to approximately $17.2 million across rounds. The Series A funding coincided with the launch of Pydantic Logfire, a commercial observability platform for backend applications, aimed at expanding beyond the core open-source validation framework. As of mid-2025, no additional funding rounds have been publicly reported.
https://techcrunch.com/2023/02/16/sequoia-backs-open-source-...
In a more just world, Python's typing story would be closer to TypeScript's, and we could have a fully realized tool like it that supports the asymmetric nature of serializing/deserializing and offers nice abstractions through the stack.
Right now Pydantic for me is like “you can validate a straightforward data structure! Now it’s up to you to actually build up a useful data structure from the straightforward one”. Other tools give me both in one go. At the cost of safety (that you can contain, but you gotta do it right)
But if I had to roll the clock back I'd recommend marshmallow and that entire ecosystem. It's definitely way less bloated than Pydantic currently, and only lacks some features. Beyond that, just use plain-old dataclasses.
I have had good success with DRF model serializers in like Django projects with 100+ apps (was the sprawling nature of the apps itself a problem? Sure, maybe). Got the job done
As with anything, you gotta build your own wrappers around these things to get value in larger projects though.
For example, some systems interact with several different vendor, tracking, and payment systems that are all kinda the same, but also kinda different. Here it makes sense to have an internal domain model and to normalize all of these other systems into your domain model at a very early stage. Otherwise complexity rises very, very quickly, with n things interacting with n other things.
On the other hand, for a lot of our smaller and simpler systems that output JSON based off a database for other systems... it's a realistic question whether maintaining the domain model and API translation for every endpoint on every change is actually less work than ripping out the API modelling framework, which happens once every few years, if at all. Some teams would probably rewrite from scratch with new knowledge, especially if they have API tests available.
The biggest benefit you get is being able to have much more flexibility around validation when the input model (Pydantic here) isn't the same as the database model. The canonical example here would be something like a user, where the validation rules vary depending on context: at signup you might be creating a new stub user where only a username and password are required, but you also want a password confirmation. At a different point you're updating the user's profile, and in that case you have a bunch of fields that might be required now, but password isn't one of them and the username can't be changed.
By having distinct input models you make that all much easier to reason about than having a single model which represents the database record, but also the input form, and has a bunch of flags on it to indicate which context you’re talking about.
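A minimal sketch of what that looks like with two separate Pydantic input models (the names and fields here are invented for illustration):

from pydantic import BaseModel, model_validator

class UserSignup(BaseModel):
    username: str
    password: str
    password_confirmation: str

    @model_validator(mode="after")
    def passwords_match(self):
        # Signup-only rule: confirmation must match.
        if self.password != self.password_confirmation:
            raise ValueError("passwords do not match")
        return self

class UserProfileUpdate(BaseModel):
    # No username (can't be changed) and no password (separate flow);
    # profile fields are what matter here.
    display_name: str
    bio: str | None = None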
This wasn't mentioned, but the constant validation on construction also costs something. Sometimes it's a cost you're willing to pay (again, dealing with external inputs), sometimes it's extraneous because e.g. a typechecker would suffice to catch discrepancies at build time.
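For data you already trust, Pydantic v2 lets you skip that cost with model_construct, at your own risk (a sketch):

from pydantic import BaseModel

class Point(BaseModel):
    x: int
    y: int

p1 = Point(x=1, y=2)                  # validated on construction
p2 = Point.model_construct(x=1, y=2)  # no validation, noticeably cheaper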
Is it? I read the blog a couple of times and never was able to divine any kind of thesis beyond the title, but as you said, the content never actually explains why.
Perhaps there is a reason, but I didn’t walk away from the post with it.
Now I mainly do Python and I don't see that kind of boilerplate duplication anywhere near as much as I used to. Not going to say the same kind of thing never happens in Python, but the frequency of it sure seems to have declined a lot; often you get a smattering of it in a big Python project rather than it having been done absolutely everywhere.
The thesis is simple:
1) A DTO is a projection or a view of a given entity.
2) The "domain entity" itself is a projection of the actual storage in a database table.
3) At different layers (vertical separation), the representation of this conceptual entity changes
4) In different entry/exit points (horizontal separation), the projection of the entity may also change.
In some cases, the domain entity can be used in different modules/routes and is projected to the API with different shapes -- fewer properties, more properties, transformed properties, etc.

Typically, when code has a very well-defined domain layer and separation of the DTO and storage representation, the code has a very predictable quality: if you are working with a `User` domain entity, it behaves consistently across all of your code and in different modules. Sometimes a developer intermixes a database `User` or a DTO `User`, and all of a sudden the code behaves unpredictably; you suddenly have to be cognizant of whether the `user` instance you're handling is a `DBUser`, a `UserDTO`, or the domain entity. It has extra properties, missing properties, missing functions, can't be passed into some methods, etc.
Does this matter? I think it depends on 1) the size of the team, 2) how much re-use of the modules is needed, 3) the nature of the service. For a small team, it's overkill. For a module that will be reused by many teams, it has long term dividends. For a one-off, lightweight service, it probably doesn't matter. But for sure, for some core behaviors, having a delineated domain model really makes life easy when working with multiple teams reusing a module.
I find that the code I've worked with over the years that I like has this quality. So if I'm responsible for writing some very core service or shared module, I will take the extra effort to separate my models -- even if there's more duplication required on my behalf because it makes the code more predictable to use if everything inside of the service expects to have only one specific shape and set of behaviors and project shapes outwards as needed for the use case (DTO and storage).
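Roughly what that separation tends to look like in practice (a sketch; the class and field names are invented):

from dataclasses import dataclass
from pydantic import BaseModel

@dataclass
class DBUser:              # storage projection: mirrors the table
    id: int
    email: str
    password_hash: str

@dataclass
class User:                # domain entity: what the service logic works with
    id: int
    email: str

class UserDTO(BaseModel):  # API projection: what leaves the service
    id: int
    email: str

def to_domain(row: DBUser) -> User:
    return User(id=row.id, email=row.email)

def to_dto(user: User) -> UserDTO:
    return UserDTO(id=user.id, email=user.email)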
External representations have the goal of serving your consumers:
- make it easy to provide
- make it simple to understand
- make it familiar
- deal with security and authentication
- be easily serializable through your communication layer
On the other hand, internal representations have the goal of helping you with your private calculations:
- make it performant
- make it work with different subsystems such as persistence, caching, queuing
- provide convenience shortcuts or precalculations for your own benefits
Sometimes they overlap, or the system is not big enough that it matters.
But the bigger or older the system gets, the less likely they will.
However, I often pass around pydantic objects if I have them, and I do this until it becomes a problem. And I rarely reach that point.
It's like using Python until you have performance problems.
Practicality beats premature optimization.
You can translate many things into a Thing; model_validate will help you with that (with validation context, etc.).
You can translate your Thing into multiple output formats with model_dump / model_dump_json.
In your model, you shall put every check required to ensure that some input is, indeed, a Thing.
And from there, you can use this object everywhere, certain that this is, indeed, a Thing, and that it has all the properties that make a thing a Thing.
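A sketch of that flow with Pydantic v2 (the field names and the context key are invented):

from pydantic import BaseModel, ValidationInfo, field_validator

class Thing(BaseModel):
    name: str
    size: int

    @field_validator("size")
    @classmethod
    def check_size(cls, v: int, info: ValidationInfo) -> int:
        # The validation context lets the same model validate
        # differently depending on the caller.
        limit = (info.context or {}).get("max_size", 100)
        if v > limit:
            raise ValueError(f"size above {limit}")
        return v

thing = Thing.model_validate({"name": "box", "size": 3}, context={"max_size": 10})
as_dict = thing.model_dump()       # one output shape
as_json = thing.model_dump_json()  # another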
Outside of I/O, the whole machinery has little use. And since pydantic models are used by introspection to build APIs, automatic deserializer and arg parsing, making it fit the I/O is where the money is.
Also, remember that despite all the improved perf of pydantic recently, they are still more expensive than dataclasses, themselves more expensive than plain classes. They are 8 times more expensive to instantiate than regular classes, but above all, attribute access is 50% slower.
Now I get that in Python this is not a primary concern, but still, pydantic is not a free lunch.
I'd say it's also important to state what it conveys. When I see a Pydantic objects, I expect some I/O somewhere. Breaking this expectation would take me by surprise and lower my trust of the rest of the code. Unless you are deep in defensive programming, there is no reason to validate input far from the boundaries of the program.
Apart from what has been said, I find pydantic interesting even in the middle of my code: it can be seen as an overpowered assert
It helps make sure that the complex data structure returned by some method is valid (for instance).
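For instance (a sketch, with invented names), TypeAdapter makes a decent one-liner assert on an internal return value:

from pydantic import BaseModel, TypeAdapter

class Row(BaseModel):
    id: int
    score: float

rows_adapter = TypeAdapter(list[Row])

def compute_scores(raw: list[dict]) -> list[Row]:
    result = [{"id": r["id"], "score": r["points"] / 10} for r in raw]
    # Blows up here, close to the bug, instead of three layers later.
    return rows_adapter.validate_python(result)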
If you have two layers of types, then it becomes much easier to ensure that the interface is stable over time. But the downside is that it will take longer to write and maintain the code.
From the typing lens, it may be useful to consider it through Rice's theorem, and the oversimplification that typing converts a semantic property into a trivial property. (Damas-Hindley-Milner inference usually takes advantage of a pathological case; it is not formally trivial.)
There are no hard and fast rules IMHO, because the Rice, Rice-Shapiro, and Kreisel-Lacombe-Shoenfield-Tseitin theorems relate to generalized solutions, as do most undecidable problems.
But Kreisel-Lacombe-Shoenfield-Tseitin deals with programs that are expected to HALT, yet it is still undecidable if one fixed program is equivalent to a fixed other program that always terminates.
When you start stacking framework, domain, and language restrictions, the restrictions form a type of coupling, but as the decisions about integration vs disintegration are always tradeoffs it will always be context specific.
Combinators (maybe not the Y combinator) and finding normal forms is probably a better lens than my attempt at the flawed version above.
If you consider using plain objects as the adapter part of the hex pattern, and notice how a service mesh is less impressive but often clearer in hex form, it may help build intuition for where the appropriate application of the author's suggestions may fit.
But it really is primarily decoupling of restrictions IMHO. Sometimes the tradeoffs go the other way and often they change over time.
If pydantic packages valid input, use that for as long as you can.
Loading stuff from the db, you need validation again: either go from the binary response to one validated type with pydantic, or use an ORM object that already validates (sketched below).
Then stop having any extra data types.
Keeping pydantic only at the edge and then abandoning it by reshaping it into another data type is a weird exercise. It might make sense if you have N input types and 1 computation flow but I don’t see how in the world of duck typing you’d need an extra unified data type for that.
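If you do revalidate on the way out of the database, Pydantic v2 can read straight off ORM objects; a minimal sketch, assuming some ORM session and a hypothetical UserORM class:

from pydantic import BaseModel, ConfigDict

class User(BaseModel):
    model_config = ConfigDict(from_attributes=True)
    id: int
    email: str

# orm_user = session.get(UserORM, 42)   # whatever your ORM returns
# user = User.model_validate(orm_user)  # reads attributes, validates once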
You shouldn't need to validate data coming from the database. IMO, the perceived need to do so is a natural consequence of teams abandoning traditional RDBMS best practices like normalization and constraints in favor of heavy denormalization and strings for everything.
If you strictly follow 3NF (or higher, when necessary), it is literally impossible to have referential integrity violations. There may be some other edge cases that can be difficult to enforce, but a huge variety of data bugs simply don’t exist if you don’t treat the RDBMS as a dumb KV store.
If you have one person maintaining a CRUD app, splitting out DTOs and APIs and all of these abstractions is completely unnecessary. Usually, you don't even know yet what the right abstraction is, and making a premature wrong abstraction is WAY worse. Building stuff because you might need it later is a massive momentum killer.
But at some point when the project has grown (if it grows, which it won’t if you spend all your time making wrong abstractions early on), the API team doesn’t want their stuff broken because someone changed a pydantic model. So you start to need separation, not because it’s great or because it’s “the right way” but because it will collapse if you don’t. It’s the least bad option.
Where I'm with you, is that you should take care of your boundaries and muddling the line between your Pydantic domain models and your CRUD models will be painful at some point. If your domain model is changing fast compared to the API you're exposing, that could be an issue.
But that's not a "Pydantic in the domain layer" issue, that's a separation of concerns issue.
> That’s when concerns like loose coupling and separation of responsibilities start to matter more.
1. We were using .dict to introduce pydantic into the mix of other entity schemes, and handling this change later was a significant pain in the neck. Some Python introspection mechanism that could facilitate deep object recasting might've been nice, if possible.
from pydantic import BaseModel

class MyModel(BaseModel):
    name: str

    # Keep old Pydantic v1 .dict() call sites working by delegating
    # to v2's model_dump().
    def dict(self, *args, **kwargs):
        return self.model_dump(*args, **kwargs)
PS: thank you, I can think on my own, and even failing that, ChatGPT is not in closed beta any more.
Come on. We know you’ve seen JavaScript.
The argument against using API models internally is something I agree with but it’s a separate question.
I've authored tens of thousands of lines of Python code in that time - both for research tools and for "production".
I use type hints everywhere in the Python I write but it's simply not enough.
This issue is political and not so much technical, as TypeScript demonstrates how you can add a beautifully orthogonal and comprehensive type system to a dynamic language, thus improving the language's ergonomics and scalability.
The political aspect is the fact that early Python promoters decided that sanity checking arguments was not "pythonic", and this dogma/ideology has persisted to this day. The only philosophical basis for this position was that Python offered no support for simple type checking. And apparently if you didn't/don't "appreciate" this philosophy, it reflected poorly on your software engineering abilities or skill with Python.
To be fair, Python isn't the only language of that era, where promoters went to great lengths to invent alternate-reality bubbles to avoid facing the fact that their pet language had some deep flaws - and actually Perl and C++ circles were even worse and more inward facing.
So the "pythonic" approach suggests having functions just accept anything, whether it makes sense or not, and allowing your code to blow up somewhere deep in some library that you probably didn't even know you were using.
So instead of an error like "illegal create_user(name: str) call: name should be a str but was a float", it's apparently better (more "pythonic") to not provide such feedback to users of your functions and instead make them deal with an exception in a 40-line stack trace with something like "illegal indexing of float by dict object" in some library source file your users haven't even heard of.
And yes, I include TypeScript with Java there, because it has its own version of the Java class ecosystem hell; we just don't notice it yet. Look at any TypeScript library that's reasonably complicated and try to deduce what some of those input types actually do or mean - be honest. Heck, a few weeks back someone posted how they solved a complicated combinatorial problem using TypeScript's type system alone.
As to the first problem, I recommend the "Parse, don't validate" post [0]. The essential idea is to stop using god objects that do it all, and to use specific types to make contracts about what is known. Separate out concerns so there is an UnvalidatedUser (not serialized and lacking a primary key) and a ValidatedUser (committed to the database, has a unique username, etc.); a rough sketch follows the link below. Basic type hinting should get you the rest of the way to cleaning up code paths where you get some type certainty.
[0] https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-va...
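In Python terms, that split might look something like this (a sketch, not from the post; names invented):

from dataclasses import dataclass

@dataclass
class UnvalidatedUser:
    username: str      # not yet checked for uniqueness, no primary key

@dataclass
class ValidatedUser:
    id: int            # primary key: only exists once committed
    username: str

def register(raw: UnvalidatedUser) -> ValidatedUser:
    # ...check uniqueness, insert into the database...
    # Downstream code can require ValidatedUser and never re-check.
    return ValidatedUser(id=1, username=raw.username)  # hypothetical id from the insert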
BUT when doing heavy computation (C++, not Python!) don't forget to convert to plain vectors; protobufs are horribly inefficient.
Protobufs only really make sense if either A: you control both ends of the serialized line, or B: the other end of the line expects protobufs.
There are many [de]serialization scenarios where you are interfacing with a third party API. (HTTP/JSON web API, a given IC's comm protocol as defined in its datasheet etc)
It might still be challenging to convince proto to output exactly what you want.
JSON: UTF-8 text serialization format, with brackets, commas, fields represented as strings, etc.
Protobuf: binary serialization format that makes liberal use of varints, including for field numbers, lengths, etc. Kind of verbose, but not heinous.
So, you could start and end your journey with the same structs and serialize with either. If you try to send a protobuf to an HTTP API that expects JSON, it won't work! If you try to send JSON to an ESP32 running ESP-Hosted, likewise.
It seems like the author doesn't like depending on `pydantic`, simply because it's a third-party dependency. To solve this they introduce another, more obscure, third-party dependency called `dacite`, which converts `pydantic` models to `dataclasses`.
It's more likely that `dacite` is going to break your application than `pydantic`, a library used by millions of users in huge projects, ever will. Not to mention the complexity overhead introduced by this nonsense mapping.
Not simply. This is one of the most important reasons NOT to propagate something through your code. How many millions of codebases use it is irrelevant.
It is relevant, because it speaks to the reliability of the dependency. `pydantic` has 24.7k GitHub stars and was last updated 52 minutes ago.
Adding a random dependency like `dacite`, which has 1.9k GitHub stars, which no one has ever heard of, and which was last updated 4 months ago, introduces way more complexity and sources of instability than propagating `pydantic`.
While it uses pydantic, SQLModel was not written by those guys.
"Why are there no laws requiring device manufacturers to open source all software and hardware for consumer devices no longer sold?"
I think it's because people (us here included) love to yap and argue about problems instead of just implementing them and iterating on solutions in an organized manner. A good way these days to go about it would be to forgo the facade of civility and use your public name to publicly tell your politician to just fuck it, do it badly, and have a plan to UNfuck it after you fuck it up, until the fucking problem is fucking solved.
Same goes for UBI and other semi-infuriating issues that seem to (and probably do) have obvious solutions that we just don't try.
I can’t relate yet. Itch how? It doesn’t really go into what the problem is they’re solving.
Thank me later.
And don't forget, you get to duplicate this shit on the frontend too.
And what is a modern app if we aren't doing event-driven microservice architecture? That won't scale!!!! So now I also have to worry about my Avro schema/Protobufs/whateverthefuck. But how does everyone else know about the schema? Avro schema registry! Otherwise we won't know what data is on the wire!
And so on and so on into infinity until I have to tell a PM that adding a column will take me 5 pull requests and 8 deploys amounting to several days of work.
Congratulations on making your own small contribution to a fucking ridiculous clown fiesta.
IshKebab•9h ago
He doesn't even say why you should tediously duplicate everything instead of just using the Pydantic objects - just "You know you don’t want that"! No I don't.
The only reason I've heard is performance... but... you're using Python. You don't give a shit about performance.
photios•7h ago
You're going from a straightforward "Pydantic everywhere" solution to a weird concoction of:
1. Pydantic models
2. "Poor man's Pydantic models" (dataclasses)
3. Obscure third party dependencies (Dacite)
Thanks, I'll pass.
ensignavenger•5h ago
Pydantic docs do clearly state that multiple levels of nesting of Pydantic objects can make it much slower, so it isn't particularly surprising that such models were slow.
franktankbank•5h ago
That's dumb. You may not care about max performance, but you've got some threshold where shit gets obviously way too slow to be workable. I've worked with a library heavy on pydantic where it was the bottleneck.
hxtk•5h ago
I worked on a project with a codebase on the order of millions of lines, and many times a response was made by taking an ORM object or an app internal data structure and JSON serializing it. We had a frequent problem where we’d make some change to how we process a data structure internally and oops, breaking API change. Or worse yet, sensitive data gets added to a structure typically processed with that data, not realizing it gets serialized by a response handler.
It was hard to catch this in code review because it was hard to even know when a type might be involved in generating a response elsewhere in the code base.
Switching to a schema-first API design meant that if you were making a change to a response data type, you knew it. And the CODEOWNERS file also knew it, and would bring the relevant parties into the code review. Suddenly those classes of problems went away.
slt2021•3h ago
Maybe it is true if you artificially limit yourself to a single-instance, single-thread model, due to the GIL.
But because nowadays apps can easily be scaled out across many instances, this argument is irrelevant.
One may say that Python has a large overhead when using a lot of objects, or that it has the GIL, but people have learned how to serve millions of users with Python easily.
dontlaugh•2h ago
And you may be able to scale to many users, worst case with more machines. But it'll still cost you a lot more than a faster language would. That is extremely relevant, even today.