It's a heck of a lot of work (easily as much work as writing the actual code!), but typing has already paid dividends as I've been starting to use the SDK for one-off scripts.
The type annotations say `float32`, `float64`, etc. are type aliases for `Floating[32]` and `Floating[64]`. In reality, they are concrete classes.
This means that `isinstance(foo, float32)` works as expected, but pisses off static analysis tools.
I’ve found the library `beartype` to be very good for runtime checks built into partially annotated `Annotated` types with custom validators, so instead of instance checks you rely directly on signatures and typeguards.
There's not a good way to say "Expects a 3D array-like" (i.e. something convertible into an array with at least 3 dimensions). Similarly, things like "At least 2 dimensional" or similar just aren't expressible in the type system and potentially could be. You wind up relying on docstrings. Personally, I think typing in docstrings is great. At least for me, IDE (vim) hinting/autocompletion/etc all work already with standard docstrings and strictly typed interpreters are a completely moot point for most scientific computing. What happens in practice is that you have the real info in the docstring and a type "stub" for typing. However, at the point that all of the relevant information about the expected type is going to have to be the docstring, is the additional typing really adding anything?
In short, I'd love to see the ability to indicate expected dimensionality or dimensionality of operation in typing of numpy arrays.
But with that said, I worry that typing for these use cases adds relatively little functionality at the significant expense of readability.
"array-like" has real meaning in the python world and lots of things operate in that world. A very common need in libraries is indicating that things expect something that's either a numpy array or a subclass of one or something that's _convertible_ into a numpy array. That last part is key. E.g. nested lists. Or something with the __array__ interface.
In addition to dimensionality that part doesn't translate well.
And regardless, if the type representation is not standardized across multiple libraries (i.e. in core numpy), there's little value to it.
https://github.com/ramonhagenaars/nptyping/blob/master/USERD...
The real challenge is denoting what you can accept as input. `NDArray[np.floating] | pd.Series[float] | float` is a start but doesn't cover everything especially if you are a library author trying to provide a good type-hinted API.
In the same line, I would love to have more Pandas-Pydantic interoperability at the type level.
Does this "type system" ensure that you can't get a runtime type error? I don't think so.
tecoholic•2h ago
Let’s take a pause here for a second - the ‘CanIndex’ and ‘SupportsIndex’ from the looks are just “int”. I have a hard time dealing with these custom types because they are so obscure.
Does anyone find these useful? How should I be reading them? I understand custom types that package data like classes. But feel completely confused for this yadayava is string, floorpglooep is int style types.
jkingsman•2h ago
CaliforniaKarl•2h ago
The PR for the change is https://github.com/numpy/numpy/pull/28913 - The details of files changed[0] shows the change was made in 'numpy/__init__.pyi'. Looking at the whole file[1] shows SupportsIndex is being imported from the standard library's typing module[2].
Where are you seeing SupportsIndex being defined as an int?
> I have a hard time dealing with these custom types because they are so obscure.
SupportsIndex is obscure, I agree, but it's not a custom type. It's defined in stdlib's typing module[2], and was added in Python 3.8.
[0]: https://github.com/numpy/numpy/pull/28913/files
[1]: https://github.com/charris/numpy/blob/c906f847f8ebfe0adec896...
[2]: https://docs.python.org/3/library/typing.html#typing.Support...
tecoholic•29m ago
I looked at the function in the article taking in an int and I stopped there. And then I read the docs for SupportsIndex, I just can’t think of an occasion I would have used it.
masspro•2h ago
* some arithmetic/geometry problems for example 2D layout where there are several different things that are “an integer number of pixels” but mean wildly different things
In either case it can help pick apart dense code or help stay on track while writing it. It can instead become anti-helpful by causing distraction or increasing the friction to make changes.