That is, it isn't just knowing whether or not the data is ever used; it's useful to know whether it was used in this specific run. And oftentimes, seeing which parts of the data were not used is a good clue as to what went wrong. At the least, you can use it to rule out code that was not hit.
> The need for this class has been partially supplanted by the ability to subclass directly from dict; however, this class can be easier to work with because the underlying dictionary is accessible as an attribute.
Sounds like (unless you need the dict as a separate data member) this class is a historical artefact. Unless there's some other issue you know of not mentioned in the documentation?
[1] https://typing.python.org/en/latest/spec/overload.html
Even then, to be honest I'm a bit sceptical. Can you point at a link in the official documentation that says overriding methods of dictionaries may not work? I would have thought the link to UserDict would have mentioned that if true. What do you mean they are "runtime dependent"?
UserDict isn't just some historical artifact of a bygone era like some of the posters below are miscorrecting me on.
The UserDict class is mostly defunct and is only still in the standard library because there were a few existing uses that were hard to replace (such as avoiding base class conflicts in multiple inheritance).
Another way to get data out would be to use the new | operator (i.e. x = {} | y essentially copies dictionary y into x), or the update method, or the ** unpacking operator (e.g. x = {**y}). But maybe those come under the umbrella of iterating as you mentioned.
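A quick sketch of the three copy spellings mentioned (the | operator needs Python 3.9+) — all of them produce shallow copies:

```python
y = {"a": 1, "b": 2}

x1 = {} | y     # PEP 584 merge operator: new dict containing y's items
x2 = {**y}      # PEP 448 unpacking: also a shallow copy

x3 = {}
x3.update(y)    # update() copies y's items into an existing dict

assert x1 == x2 == x3 == y
assert x1 is not y  # copies, not aliases
```

Note that all three only copy the top level; nested dicts are still shared references.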
d.setdefault(k, computevalue())
defaultdict takes a factory function, so it's only called if the key is not already present: d = defaultdict(computevalue)
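A quick way to see the difference — `computevalue` here is a stand-in that just counts how often it runs. With setdefault the argument is evaluated before the call, hit or miss; with defaultdict the factory only runs on a miss:

```python
from collections import defaultdict

calls = 0

def computevalue():
    # stand-in for an expensive default factory
    global calls
    calls += 1
    return []

d = {"k": "existing"}
d.setdefault("k", computevalue())  # argument evaluated even though "k" exists
assert calls == 1

dd = defaultdict(computevalue)
dd["k"] = "existing"
dd["k"]                            # hit: factory never invoked
assert calls == 1
dd["missing"]                      # miss: factory runs now
assert calls == 2
```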
This applies to some extent even if the default value is just an empty dictionary (as it often is in my experience). You can use dict() as the factory function in that case. But I have never benchmarked!
I'd say "or" rather than "and": defaultdict has higher overhead to initialise the default (especially if you don't need a function call in the setdefault call) but because it uses a fallback of dict lookup it's essentially free if you get a hit. As a result, either a very high redundancy with a cheap default or a low amount of redundancy with a costly default will have the defaultdict edge out.
For the most extreme case of the former,

```python
d = {}
for i in range(N):
    d.setdefault(0, [])
```

versus

```python
d = defaultdict(list)
for i in range(N):
    d[0]
```

has the defaultdict edge out at N=11 on my machine (561 ns for setdefault versus 545 ns for defaultdict). And that's with a literal list being quite a bit cheaper than a list() call.

UserDict will route '.get', '.setdefault', and even iteration via '.items()' through the '__getitem__' method.
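That routing is easy to check with a small subclass: UserDict gets its get/setdefault/items from the collections.abc Mapping/MutableMapping mixins, and those all funnel through `__getitem__` (a sketch):

```python
from collections import UserDict

class TrackedDict(UserDict):
    """UserDict subclass that records every key read through __getitem__."""

    def __init__(self, *args, **kwargs):
        self.accessed = set()
        super().__init__(*args, **kwargs)

    def __getitem__(self, key):
        self.accessed.add(key)
        return super().__getitem__(key)

d = TrackedDict({"a": 1, "b": 2, "c": 3})
d.get("a")            # Mapping.get() calls self[key]
d.setdefault("b", 0)  # MutableMapping.setdefault() calls self[key]
list(d.items())       # ItemsView iteration calls self[key] for every key
assert d.accessed == {"a", "b", "c"}
```

The same subclass of a plain dict would miss all three of those calls, because dict's C implementations of get/setdefault/items don't go through the overridden `__getitem__`.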
edited to remove "(maybe all?) edge cases". As soon as I posted, I thought of several less common/obvious edge cases.
self.accessed_keys = set()
instead of @property
def accessed_keys(self):
return self._accessed_keys
Therefore, I really wanted to know that I was actually pulling in all of the data I needed, so I tracked what was seen vs. not seen, and compared that against what the code attempted to access.
In the end it was basically a wrapper around the JSON object itself that allowed lookup of data via a string in "dot notation" (so you could do "keyA.key2" to get the same thing you would have directly in JSON). Then it would either return a simple value (if there was one), or another instance of the wrapper if the result was itself an object (or an array of wrapped objects). All instances would share the "seen" list.
It's unfortunately locked behind NDA/copyright stuff, but the implementation was only 67 lines.
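Since the real implementation isn't public, here's a hypothetical reconstruction from the description above (all names invented) — dot-notation lookup, child wrappers for nested objects, and a "seen" set shared by every wrapper:

```python
class SeenWrapper:
    """Hypothetical reconstruction: wraps decoded JSON and records every
    dotted path that was looked up, in a set shared by all child wrappers."""

    def __init__(self, obj, seen=None, prefix=""):
        self._obj = obj
        self._prefix = prefix
        self.seen = set() if seen is None else seen

    def get(self, dotted):
        value = self._obj
        for part in dotted.split("."):
            # list indices arrive as strings in dot notation, e.g. "items.0.id"
            value = value[int(part)] if isinstance(value, list) else value[part]
        path = f"{self._prefix}.{dotted}" if self._prefix else dotted
        self.seen.add(path)
        if isinstance(value, (dict, list)):
            # nested objects come back wrapped, sharing the same seen set
            return SeenWrapper(value, self.seen, path)
        return value

data = {"keyA": {"key2": 42, "unused": 0}, "items": [{"id": 7}]}
w = SeenWrapper(data)
assert w.get("keyA.key2") == 42
assert w.get("items.0.id") == 7
assert w.seen == {"keyA.key2", "items.0.id"}
```

Diffing `w.seen` against the set of all leaf paths in the JSON then gives the "pulled in vs. never looked at" report described above.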
However: the dict in this case would also contain dataclasses, and I'd be interested in finding which exact attributes within those dataclasses were accessed. I'd also want to mark all attributes in a dataclass as accessed if the parent dataclass is accessed, and, since those dataclasses are config objects, do the same for their own children, so that the topmost dictionary ends up with a tree of all accessed keys.
I couldn’t figure out how to do that, but welcome to ideas.
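One idea (a sketch only, untested against real config objects, all names invented): give the dataclasses a mixin that overrides `__getattribute__` to record reads of declared fields, and that marks a whole subtree as accessed when a field holding a nested config is read:

```python
from dataclasses import dataclass, field, fields

class Tracked:
    """Mixin sketch: records which dataclass fields were read; reading a
    field that holds a nested Tracked dataclass marks its whole subtree."""

    def __getattribute__(self, name):
        value = object.__getattribute__(self, name)
        cls_fields = getattr(type(self), "__dataclass_fields__", {})
        if name in cls_fields:
            # lazily create the per-instance set of accessed field names
            d = object.__getattribute__(self, "__dict__")
            d.setdefault("_accessed", set()).add(name)
            if isinstance(value, Tracked):
                value.mark_all_accessed()
        return value

    def mark_all_accessed(self):
        acc = object.__getattribute__(self, "__dict__").setdefault("_accessed", set())
        for f in fields(self):
            acc.add(f.name)
            child = object.__getattribute__(self, f.name)
            if isinstance(child, Tracked):
                child.mark_all_accessed()

    def accessed_tree(self):
        d = object.__getattribute__(self, "__dict__")
        tree = {}
        for name in sorted(d.get("_accessed", ())):
            child = object.__getattribute__(self, name)
            tree[name] = child.accessed_tree() if isinstance(child, Tracked) else True
        return tree

@dataclass
class Inner(Tracked):
    x: int = 1
    y: int = 2

@dataclass
class Outer(Tracked):
    name: str = "cfg"
    inner: Inner = field(default_factory=Inner)

cfg = Outer()
cfg.name   # leaf read: recorded as accessed
cfg.inner  # nested config read: entire subtree marked
```

The `object.__getattribute__` calls are there to avoid the tracking hook recursing into itself; it assumes the config tree has no cycles.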
I am currently teaching (typed) Python to a team of Windows sysadmins and it's been incredibly difficult to explain when to use a dataclass, a NamedTuple, a Pydantic model, or a dictionary.
Let me make this more concrete: Those sysadmins frequently need to process and pass around complex (as in heavily nested) structured data. The data often comes in the form of singleton objects, i.e. they are built in single place, then used in another place and then thrown away (or merged into some other structure). In other words, any class hierarchy you build represents boilerplate code you'll only ever use once and which will be annoying to maintain as you refactor your code. Do you pick dataclasses or TypedDicts (or something else) for your map data structures?
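For that build-once/consume-once nested case, nested TypedDicts are one candidate: you keep plain dict literals and zero runtime boilerplate, while the type checker validates the shape (a sketch, names invented):

```python
from typing import TypedDict

class Disk(TypedDict):
    device: str
    size_gb: int

class Host(TypedDict):
    hostname: str
    disks: list[Disk]

def collect_host() -> Host:
    # plain dict literal; mypy/pyright verify the keys and value types
    return {"hostname": "srv01", "disks": [{"device": "C:", "size_gb": 512}]}

def total_disk_gb(host: Host) -> int:
    return sum(d["size_gb"] for d in host["disks"])

assert total_disk_gb(collect_host()) == 512
```

The types are erased at runtime (a `Host` is just a dict), so there's no class hierarchy to maintain — the usual trade-off versus dataclasses being attribute access and methods.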
In TypeScript you would just use `const data = <heavily nested object> as const` and be done with it.
I mean, I agree w.r.t. the blurriness in general, but this PEP is not going to change anything about that, in either direction.
I get the rationale for "anonymous strict" return types, but then I think a better way would be to think up some way to accomplish that for dataclasses.
Rust front: Here's a faster ls called ls-rs with different defaults, you should use this!
Go front: Here's reverse proxy #145728283 it is an open source project that has slightly different parameters than all the others.
Python hobo front: Uhh guys here's a dict that kinda might remember what you've accessed if you used it in a particular way.
This can be used to evaluate the migration quality and spot what can be improved.
https://github.com/xwiki-contrib/confluence/blob/7a95bf96787...