>>> from dataclasses import dataclass
>>> @dataclass
... class C: pass
...
>>> C().x = 1
>>> @dataclass(slots=True)
... class D: pass
...
>>> D().x = 1
Traceback (most recent call last):
File "<python-input-4>", line 1, in <module>
D().x = 1
^^^^^
AttributeError: 'D' object has no attribute 'x' and no __dict__ for setting new attributes
Most of the time this is not a thing you actually need to do. If you're using dataclasses it's less of an issue, because dataclasses.asdict gives you a plain dict you can modify.
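For example, a quick sketch of what I mean with asdict (assuming a slotted dataclass with a single field, unlike the empty one above):
>>> from dataclasses import dataclass, asdict
>>> @dataclass(slots=True)
... class D:
...     x: int
...
>>> d = asdict(D(x=1))  # plain dict copy of the instance
>>> d['y'] = 2          # extra keys go on the dict, not the instance
>>> d
{'x': 1, 'y': 2}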
Are you able to share a snippet that reproduces what you're seeing?
>>> from typing import Any, Dict
>>> from pydantic import BaseModel, Field
>>> class A(BaseModel):
...     a: int
...
>>> class B(BaseModel):
...     b: A
...
>>> class C(BaseModel):
...     c: B | Dict[str, Any]
...
>>> C.model_validate({'c':{'b':{'a':1}}})
C(c=B(b=A(a=1)))
>>> C.model_validate({'c':{'b':{'a':"1"}}})
C(c={'b': {'a': '1'}})
>>> class C(BaseModel):
...     c: B | Dict[str, Any] = Field(union_mode='left_to_right')
...
>>> C.model_validate({'c':{'b':{'a':"1"}}})
C(c=B(b=A(a=1)))
I know nothing about your context, but in what context would a single model need to support so many permutations of a data structure? Just because software can, doesn't mean it should.
Just tracking payments through multiple tax regions will explode the places where things need to be tweaked.
You can have nested dataclasses, as well as specify custom serializers/loaders for things which aren't natively supported by json.
Calling `x: str = json.dumps(MyClass(...).serialize())` will get you JSON you can recover the original object from, nested classes and custom types and all, with `MyClass.load(json.loads(x))`.
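Roughly how a hand-rolled serialize/load pair can look (a sketch; `MyClass`, `Inner`, `serialize`, and `load` here are illustrative names, not a particular library):

import json
from dataclasses import dataclass, asdict
from datetime import datetime

@dataclass
class Inner:
    when: datetime  # not natively JSON-serializable

@dataclass
class MyClass:
    name: str
    inner: Inner

    def serialize(self) -> dict:
        # asdict recurses into nested dataclasses; convert custom types here
        d = asdict(self)
        d["inner"]["when"] = self.inner.when.isoformat()
        return d

    @classmethod
    def load(cls, d: dict) -> "MyClass":
        inner = Inner(when=datetime.fromisoformat(d["inner"]["when"]))
        return cls(name=d["name"], inner=inner)

x = json.dumps(MyClass("a", Inner(datetime(2024, 1, 1))).serialize())
roundtripped = MyClass.load(json.loads(x))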
For example, if you are querying a DB that returns a column as a JSON string, it's trivial with Pydantic to JSON-parse that column as part of deserialization with an annotation.
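A minimal sketch of that annotation in Pydantic v2 (the `Row` model and its field names are made up for illustration):

from typing import Any, Dict
from pydantic import BaseModel, Json

class Row(BaseModel):
    id: int
    payload: Json[Dict[str, Any]]  # the raw JSON string is parsed during validation

Row.model_validate({'id': 1, 'payload': '{"amount": 42}'})
# Row(id=1, payload={'amount': 42})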
Pydantic is definitely slower and not a 'zero cost abstraction', but you do get a lot for it.
Automatic, statically typed deserialization is worth the trouble, in my opinion.
This looks highly reminiscent (though not exactly the same, pedants) of why people used to get excited about using SAX instead of DOM for xml parsing.
1. Inefficient parser implementation. It's just... very easy to allocate way too much memory if you don't think about large-scale documents, and very difficult to measure. Common problem with many (but not all) JSON parsers.
2. CPython's in-memory representation is large compared to compiled languages. So e.g. a 4-digit integer is 5-6 bytes in JSON, 8 in Rust if you do i64, and 25-ish in CPython. An empty dictionary is 64 bytes.
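You can check this yourself on a recent 64-bit CPython:
>>> import sys
>>> sys.getsizeof(1234)  # a boxed int object
28
>>> sys.getsizeof({})    # an empty dict
64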
The ijson article linked from the original article was the inspiration for the talk: https://pythonspeed.com/articles/json-memory-streaming/
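For reference, streaming parsing with ijson looks roughly like this (a sketch; the file name and the assumption of a top-level JSON array are mine):

import ijson

# Iterate items of a top-level JSON array one at a time,
# without materializing the whole document in memory.
with open("items.json", "rb") as f:
    for item in ijson.items(f, "item"):
        print(item)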