What fields are present and what types they have is extremely non uniform: it depends heavily on ORM objects’ internal configuration and the way a given model class relates to other models, including circular dependencies.
(And when I say “fields” here, I’m not only referring to data fields; even simple models include many, many computed method-like fields, complex lazily-evaluatable and parametrizable query objects, fields whose types and behavior change temporally or in response to far-distant settings, and more).
Some of this complexity is inherent to what ORMs are as a class of tool—many ORMs in all sorts of languages provide developer affordances in the form of highly dynamic, metaprogramming-based DSL-ish APIs—but Django really leans into that pattern more than most.
Add to that a very strong community tendency to lazily (as in diligence, not caching) subclass ORM objects in ways that shadow computed fields—and often sloppily override the logic used to compute what fields are available and how they act—and you have a very thorny problem space for type checkers.
I also want to emphasize that this isn’t some rare Django power-user functionality that is seldom used, nor is it considered deprecated or questionable—these computed fields are the core API of the Django ORM, so not only are they a moving target that changes with Django (and Django extension module) releases, but they’re also such a common kind of code that even minor errors in attempts to type-check them will be extremely visible and frustrating to a wide range of users.
None of that should be taken as an indictment of the Django ORM’s design (for the most part I find it quite good, and most of my major issues with it have little to do with type checking). Just trying to answer the question as directly as possible.
Ultimately you can get typing for the usual cases, but it won't be complete because you can outright change the shape of your models in Django at runtime (actions that aren't type safe of course)
(I'm not commenting on it being possible or not to fix; but the current status)
> pyrefly, mypy, and pyright all assume that my_list.append("foo") is a typing error, even though it is technically allowed (Python collections can have multiple types of objects!)
> If this is the intended behavior, ty is the only checker that implicitly allows this without requiring additional explicit typing on my_list.
EDIT: I didn't intend my comment to be this sharp, I am actually rooting for ty to succeed :)
ORIGINAL: I am strongly against ty's behaviour here. In production code you almost always have single-type lists, and it is critical that the typechecker assumes this, especially if the list already has same-type _literal_ items.
The fact that Python allows this has no bearing at all. To me having list[int | str] implicitly allowed by the typechecker seems like optimizing for beginner-level code.
Imo a type checker in a dynamic language is primarily there to avoid runtime errors. In a list with multiple types, the typechecker should instead force you to check the type before using an element in that list.
If you want static types python is the wrong language
Why is it critical though? If having a `list[int]` was a requirement I would expect a type error where that's explicit.
Instead it could mark it as an error (as all the other checkers do), and if that's what the user really intended they can declare the type as list[str | int] and everything down the line is checked correctly.
So in short, this seems like a great place to start pushing the user towards actually (gradually) typing their code, not just pushing likely bugs under the rug.
In the real world, a row of CSV data is not type checked -- and the world hasn't pushed the spreadsheet industry to adopt typed CSV data.
[ty developer here]
Please note that ty is not complete!
In this particular example, we are tripped up because ty does not do anything clever to infer the type of a list literal. We just infer `list[Unknown]` as a placeholder, regardless of what elements are present. `Unknown` is a gradual type (just like `Any`), and so the `append` call succeeds because every type is assignable to `Unknown`.
We do have plans for inferring a more precise type of the list. It will be more complex than you might anticipate, since it will require "bidirectional" typing to take into account what you're doing with the list in the surrounding context. We have a tracking issue for that here: https://github.com/astral-sh/ty/issues/168
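For concreteness, the example under discussion reduces to something like this (Python itself allows it at runtime; the disagreement is only about what the checkers infer):

```python
my_list = [1, 2]       # pyright/mypy/pyrefly infer list[int]; ty currently infers list[Unknown]
my_list.append("foo")  # flagged under list[int]; allowed under list[Unknown]

# At runtime Python has no objection: lists are heterogeneous.
print(my_list)  # [1, 2, 'foo']
```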
I am talking from some experience as I had to convert circa 40k lines of untyped code (dicts passed around etc) to fully typed. IIRC this behaviour would have masked a lot of bugs in my situation. (I relied on mypy at first, but migrated to pyright about 1/4 in).
But otherwise it's good to hear that this is still in progress and I wish the project the best of luck.
Not at all! :-) Just wanted to clarify for anyone else reading along
But less snarkily, we do talk to them often (and the authors of other tools like mypy and pyright) to make sure we aren't introducing gross incompatibilities between the different type checkers. When there are inconsistencies, we want to make sure they are mindful rather than accidental; for good reasons; spec-compliant; and well documented.
>ty, on the other hand, follows a different mantra: the gradual guarantee. The principal idea is that in a well-typed program, removing a type annotation should not cause a type error. In other words: you shouldn’t need to add new types to working code to resolve type errors.
It seems like `ty`'s current behaviour is compatible with this, but changing it won't be (unless it just becomes impossible to type a list of different types). If your code does do that, then your program isn't well typed according to Python's typing semantics... I think.
So you can have lists of multiple types, but then you get consequences from that in needing type guards.
Of course you still have stuff like `tuple[int, int, int, str]` to get more of the way there. Maybe one day we'll get `FixedList[int, int, int, str]`....
For an internal tool at Meta, this is fine. Just make all your engineers adopt the style guide.
For introducing a tool gradually at an organization where this sort of change isn't one of the top priorities of engineering leadership, being more accepting is great. So I prefer the way ty does this, even though in my own personal code I would like my tool to warn me if I mix types like this.
Yes, let's base our tooling on your opinion rather than on what is allowed in Python.
my_list = [BarWidget(...), FooWidget(...)] ?
my_list.append(BazWidget(...))
my_list.append(7)
Wouldn't it be nice if the type checker could infer the type hint there, which is almost certainly intended to be list[Widget], and allow the first append and flag the second one?
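Sketching that out (all the Widget names here are the commenter's hypothetical ones):

```python
class Widget: ...
class BarWidget(Widget): ...
class FooWidget(Widget): ...
class BazWidget(Widget): ...

# A checker that infers the common base ("join") could type this as list[Widget]...
my_list = [BarWidget(), FooWidget()]
my_list.append(BazWidget())  # ...making this fine,
# my_list.append(7)          # ...and this a flagged error: int is not a Widget.

print(all(isinstance(w, Widget) for w in my_list))  # True
```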
I'm curious to see which way the community will lean.
https://github.com/astral-sh/ty/blob/main/docs/README.md#oth...
I've been using pyright with neovim for years and have never experienced any kind of noticeable lag.
We are happy with the attention that ty is starting to receive, but it's important to call out that both ty and pyrefly are still incomplete! (OP mentions this, but it's worth emphasizing again here.)
There are definitely examples cropping up that hit features that are not yet implemented. So when you encounter something where you think what we're doing is daft, please recognize that we might have just not gotten around to that yet. Python is a big language!
The subject of a "scripting language for Rust" has come up a few times [1]. A language that fits nicely with the syntax of Rust, can compile right alongside rust, can natively import Rust types, but can compile/run/hot reload quickly.
Do you know of anyone in your network working on that?
And modulo the syntax piece, do you think Python could ever fill that gap?
I would never ever want a full-fledged programming language for building type-checking plugins, and doubly so in cases where one expects the tool to run in a read-write context.
I am not saying that Skylark is the solution, but its sandboxed mental model aligns with what I'd want for such a solution.
I get the impression the wasm-adjacent libraries could also help here, since the WASI boundary already limits what mutations are allowed.
These are the strong vs weak, static vs dynamic axes.
You probably want strong, but dynamic typing. eg., a function explicitly accepts only a string and won't accept or convert a float into a string implicitly or magically.
You're free to bind or rebind variables to anything at any time, but using them in the wrong way leads to type errors.
JavaScript has weak dynamic typing.
Python has strong dynamic typing (though since types aren't annotated in function definitions, you don't always see it until a type is used in the wrong way at the leaves of the AST / call tree).
Ruby has strong dynamic typing, but Rails uses method_missing and monkey patching to make it weaker through lots of implicit type coercions.
C and C++ have weak static typing. You frequently deal with unstructured memory and pointers, casting, and implicit coercions.
Java and Rust have strong static typing.
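The strong-vs-dynamic combination in Python, sketched (rebinding is free, but misuse is a runtime TypeError rather than a silent coercion):

```python
x = "hello"   # dynamic: x can be rebound...
x = 42        # ...to a value of a different type at any time

# strong: no implicit coercion between str and int
try:
    "1" + 2   # JavaScript would happily produce "12" here
    outcome = "coerced"
except TypeError:
    outcome = "TypeError"

print(outcome)  # TypeError
```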
let { (*>), (<*), wrap } = import! std.applicative
Can you explain how you came up with this solution? Rust docs code-examples inspired?
https://groups.google.com/g/comp.lang.python/c/DfzH5Nrt05E/m...
Following these projects with great interest though. At the end of the day, a good type checker should let us write code faster and more reliably, which I feel isn't yet the case with the current state of the art of type checking for python.
Good luck with the project!
If you have a large codebase that you want to be typesafe, Pyrefly’s approach means writing far fewer type annotations overall, even if the initial lift is much steeper.
Ty effectively has noImplicitAny set to false.
I'm not very fond of mypy as it struggles even with simple typing at times.
To complement this, can I share my favorite _run-time_ type checker? Beartype: it reads your type annotations (the same ones Pyrefly and ty check statically) and enforces the types at runtime. It is blazingly fast, as in, incredibly fast. I use it for all my deployed code.
https://beartype.readthedocs.io/en/latest/
I suspect either of Pyrefly or Ty will be highly complementary with Beartype in terms of editor additions, and then runtime requirements.
The docs also have a great sense of humour.
What about interacting with other libraries?
For anyone interested in using these tools, I suggest reading the following:
https://www.reddit.com/r/Python/comments/10zdidm/why_type_hi...
That post should probably be taken lightly, but I think that the goal there is to understand that even with the best typing tools, you will have troubles, unless you start by establishing good practices.
For example, Django is a large code base, and if you look at it, you will observe that the code is consistent in which features of Python are used and how; this project passes the stricter type checking test without trouble. Likewise, Meta certainly has a very large code base (why develop a type checker otherwise?), and they must have figured out that they cannot let their programmers write code however they like; I guess their type checker is the stricter one for that reason.
Python, AFAIK, has many features, a very permissive runtime, and perhaps (not unlike C++) only some limited subset should be used at any time to ensure that the code is manageable. Unfortunately, that subset is probably different depending on who you ask, and what you aim to do.
(Interestingly, the Reddit post somehow reminded me of the hurdles Rust people have getting the Linux kernel guys to accept their practice: C has a much simpler and carefree type system, but Rust being much more strict rubs those C guys the wrong way).
Obviously that isn't always possible but you can spend far too long trying to make python work.
State of the art AI models are all closed source and accessible through an API anyways. APIs that any language can easily access.
As for AI model development itself, yes, it's mostly Python, but that's a niche.
This is honestly a thing, at least in the startup world.
Sure, a system that only relies on token factory LLM APIs can be written in any language, but that is not the full width and breadth of the AI hype.
You realize model training costs millions, right? "A lot of businesses" doesn't pass the sniff test here.
I'm not even counting the large swaths of data required to train. And the expensive specialists.
And then you'll have to retrain outdated models every so often.
There's a reason that AI has only a handful of players delivering SoTA models and these players are all worth $5B+.
I occasionally spend like 2h working on some old python code. I will spend say 15 minutes of that time adding type annotations (sometimes requires some trivial refactoring). This has an enormous ROI, the cost is so low and the benefit is so immediate.
In these cases migrating code to a proper language and figuring out interop is not on my radar, it would be insane. So having the option to get some best-effort type safety is absolutely fantastic.
I can definitely see your point, it's a useful analysis for projects under heavy development. But if you have a big Python codebase that basically just works and only sees incremental changes, adding type annotations is a great strategy.
Have fun with that!
You annotate enough functions and you get a really good linter out of it!
Python has flaws and big ones at that, but there's a reason it's popular. Especially with tools like pydantic and fastapi and uv (and streamlit) you can do insane things in hours what would take weeks and months before. Not to mention how good AI is at generating code in these frameworks. I especially like typing using pydantic, any method is now able to dump and load data from files and dbs and you get extremely terse validated code. Modern IDEs also make quick work of extracting value even from partially typed code. I'd suggest you just open your mind up to imperfect things and give them a shot.
I'll get started on the subset of Python that I personally do not wish to use in my own codebase: metaclasses, descriptors, callable objects using __call__, object.__new__(cls), names that trigger the name mangling rules, self.__dict__. In my opinion, all of the above features involve too much magic and hinder code comprehension.
* Meta classes: You're writing Pydantic or an ORM.
* Descriptors: You're writing Pydantic or an ORM.
* Callable objects: I've used these for things like making validators you initialize with their parameters in one place, then pass them around so other functions can call them. I'd probably just use closures if at all possible now.
* object.__new__: You're writing Pydantic or an ORM.
* Name mangling: I'm fine with using _foo and __bar where appropriate. Those are nice. Don't ever, ever try to de-mangle them or I'll throw a stick at you.
* self.__dict__: You're writing Pydantic or an ORM, although if you use this as shorthand for "doing things that need introspection", that's a useful skill and not deep wizardry.
Basically, you won't need those things 99.99% of the time. If you think you do, you probably don't. If you're absolutely certain you do, you might. It's still good and important to understand what they are, though. Even if you never write them yourself, at some point you're going to want to figure out why some dependency isn't working the way you expected, and you'll need to read and know what it's doing.
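For readers who haven't met descriptors: a minimal, hypothetical ORM-flavored sketch of the kind of magic a dependency might be doing (all names here are invented for illustration):

```python
class Field:
    """Descriptor: a class-level attribute that intercepts instance access."""

    def __set_name__(self, owner, name):
        self.name = name                # called automatically at class creation

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self                 # Model.title gives you the Field object itself
        return obj.__dict__[self.name]  # m.title gives you the stored value

    def __set__(self, obj, value):
        obj.__dict__[self.name] = value


class Model:
    title = Field()


m = Model()
m.title = "hello"
print(m.title)                       # hello
print(type(Model.title).__name__)    # Field -- same attribute, different type: why checkers struggle
```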
That's kind of my point. If you don't need a language feature 99.99% of the time perhaps it is better to cut it out from your language altogether. Well unless your language is striving to have the same reputation as C++. In Python's case here's a compromise: such features can only be used in a Python extension in C code, signifying their magic nature.
> If you have a super-generic function like that and type hinting enforced, you just use Any and don't care about it.
It's a stupid example, but even within the context of a `slow_add` function in a library: maybe the author originally never even thought people would pass in non-numeric values, so in the next version update instead of a hardcoded `time.sleep(0.1)` they decide to `time.sleep(a / b)`. Oops, now it crashes for users who passed in strings or tuples! If only there were a way to declare that the function is only intended to work with numeric values, instead of forcing yourself to provide backwards compatibility for users who used that function in unexpected ways that happened to work.
IMO: for Python meant to run non-interactively with any sort of uptime guarantees, type checking is a no-brainer. You're actively making a mistake if you choose to not add type checking.
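To make the hypothetical concrete (slow_add and its v2 sleep are this thread's invented example, not a real library; the sleep is capped here so the demo runs quickly):

```python
import time

def slow_add(a: int, b: int) -> int:
    # v1 was a hardcoded time.sleep(0.1) -- so "a" + "b" happened to work.
    # v2 sleeps proportionally to the inputs, which only makes sense for numbers:
    time.sleep(min(a / b, 0.1))
    return a + b

# With the annotations above, a checker flags slow_add("foo", "bar")
# before the runtime TypeError inside the division ever happens.
print(slow_add(1, 2))  # 3
```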
The purpose was to show different ideologies and expectations on the same code don't work, such as strict backwards compatibilities, duck typing, and strictly following linting or type hinting rules (due to some arbitrary enforcement). Although re-reading it now I wish I'd spent more than an evening working on it, it's full of issues and not very polished.
> If you have a super-generic function like that and type hinting enforced, you just use Any and don't care about it.
Following the general stupidness of the post: they are now unable to do that because a security consultant said they have to enable and can not break RUFF rule ANN401: https://docs.astral.sh/ruff/rules/any-type/
Okay, then your function which is extremely generic and needs to support 25 different use cases needs to have an insane type definition which covers all 25 use cases.
This isn't an indictment of the type system, this is an indictment of bad code. Don't write functions that support hundreds of input data types, most of which are unintended. Type systems help you avoid this, by the way.
But by its nature, duck typing supports an unbounded number of input types, and it is what Python was built on.
You've already decided duck typing is wrong and strict type adherence is correct, which is fine, but that doesn't fit the vast history of Python code, or in fact many of the core Python libraries.
You're trying to shove a square peg into a round hole. It's not about right or wrong. Either you want your function to operate on any type, attempt to add the two values (or perform any operation which may or may not be supported, i.e. duck typing), and throw a runtime error if it doesn't work--in which case you can leave it untyped or use `Any`--or you want stronger type safety guarantees so you can validate before runtime that nobody is calling your method with incorrect arguments, in which case you have to represent the types which you accept somehow.
If you want to have a method that's fully duck typed, you're supposed to use `Any`. That's exactly why it exists. Inventing contrived scenarios about how you can't use `Any` is missing the point. It's like complaining C doesn't work if you're not allowed to use pointers.
You're right that historically Python code was written with duck typing in mind, but now even highly flexible libraries like Pandas have type definition support. The ecosystem is way different from even 5-6 years ago, I can't think of any well-known libraries which don't have good typing support by now.
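What that `Any` escape hatch looks like in practice, as a sketch:

```python
from typing import Any

def slow_add(a: Any, b: Any) -> Any:
    """Deliberately duck-typed: accepts anything that supports +."""
    return a + b

# Every checker accepts all of these; failures become runtime errors by choice.
print(slow_add(1, 2))        # 3
print(slow_add("a", "b"))    # ab
print(slow_add((1,), (2,)))  # (1, 2)
```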
You just have to say the type implements Semigroup.
Yes, this would work if the arguments are lists, or integers, or strings. And it won't pass the typecheck for arguments that are not Semigroups.
It may not work with Python, but only because its designers weren't initially interested in typechecking.
    from dataclasses import dataclass
    from typing import Protocol, Self, TypeVar

    class Semigroup(Protocol):
        def __add__(self, other: Self) -> Self:
            ...

    T = TypeVar("T", bound=Semigroup)

    def join_stuff(first: T, *rest: T) -> T:
        accum = first
        for x in rest:
            accum += x
        return accum

    @dataclass
    class C:
        x: int

    @dataclass
    class D:
        x: int

        def __add__(self, other: Self) -> Self:
            return type(self)(self.x + other.x)

    @dataclass
    class E:
        x: int

        def __add__(self, other: Self) -> Self:
            return type(self)(self.x + other.x)

    _: type[Semigroup] = D
    _ = E

    def doit() -> None:
        print(join_stuff(1, 2, 3))
        print(join_stuff((1,), tuple(), (2,)))
        print(join_stuff("a", "b", "c"))
        print(join_stuff(D(1), D(2)))
        print(join_stuff(D(1), 3))
        print(D(1) + 3)                # caught by mypy
        print(D(1) + E(3))             # caught by mypy
        print(join_stuff(1, 2, "a"))   # Not caught by mypy
        print(join_stuff(C(1), C(2)))  # caught by mypy

    doit()
Now, this doesn't quite work to my satisfaction. Mypy lets you freely mix and match values of incompatible types, and I don't know how to fix that. Basically, if you directly try to add a D and an int, mypy will yell at you, but there's no way I've found to insist that the arguments to join_stuff, in addition to being Semigroups, are all of compatible types. It looks like mypy is checking join_stuff as if Semigroup were a concrete class, so once you're inside join_stuff, the actual types of the arguments become irrelevant. However, it will correctly tell you that it can't accept arguments that don't define addition at all, and that's better than nothing.
Imagine I say "the human body is dumb! Here's an example: if I stab myself, it bleeds!" Like is that stupid or absurd?
TypeScript's goal is to take a language with an unhinged duck type system that allows people to do terrible things and then allow you to codify and lock in all of those behaviours exactly as they're used.
Mypy's goal (and, since it was written by GvR and codified in the stdlib, by extension Python's and all other typecheckers') is to take a language with an unhinged duck type system that allows people to do terrible things, and then pretend that isn't the case and enforce strict academic rules and behaviours that don't particularly care about how real people write code and interact with libraries.
If you include type hints from the very beginning, then you are forced to use the very limited subset of behaviours that mypy allows you to codify, and everything will be "fine".
If you try to add type hints to a mature project, you will scream with frustration as you discover how many parts of the codebase literally cannot be represented in the extremely limited type system.
>I think that the goal there is to understand that even with the best typing tools, you will have troubles, unless you start by establishing good practices.
Like - what makes you think that Python developers don't understand stuff about Python, when they are actively using the language as opposed to you?
I must admit that I largely prefer static typing, which is why I got interested in that article. It's true that trying to shoehorn this feature in the Python ecosystem is an uphill battle: there's a lot of good engineering skill spent on this.
Perhaps there's a connection to make between this situation and an old theorem about incompleteness?
https://copilot.microsoft.com/shares/2LpT2HFBa3m6jYxUhk9fW
(was generated in quick mode, so you might want to double check).
> When you add type hints to your library's arguments, you're going to be bitten by Hyrum's Law and you are not prepared to accurately type your full universe of users
That's understandable. But they're making breaking changes, and those are just breaking change pains - it's almost exactly the same if they had instead done this:
    def slow_add(a, b):
        if not isinstance(a, int):
            raise TypeError
        ...
but anyone looking at that would say "well yeah, that's a breaking change, of course people are going to complain". The only real difference here is that it's a developer-breaking change, not a runtime-breaking one, because Python does not enforce type hints at runtime. Existing code will run, but existing tools looking at the code will fail. That offers an easier workaround (just ignore it), but is otherwise just as interruptive to developers because the same code needs to change in the same ways.
---
In contrast! Libraries can very frequently add types to their return values and it's immediately useful to their users. You're restricting your output to only the values that you already output - essentially by definition, only incorrect code will fail when you do this.
Unless you mean something like record prod for a few weeks to months, similar to how Netflix uses eBPF (except they run it all the time).
Use any of them at your own risk I suppose.
> The standard VC business model is to invest in stuff that FAANG will buy from them one day. The standard approach is to invest in stuff that's enough of a threat to FAANG that they'll buy it to kill it, but this seems more like they're gambling on an acqui-hire in the future.
why is type checking the exception? with google and facebook and astral all writing their own mypy replacements, i’m curious why this space is suddenly so busy
But Ruff is an even greater improvement over that
"package/dependency management" - Everything is checked into a monorepo, and built with [Buck2](https://buck2.build/). There's tooling to import/update packages, but no need to reinvent pip or other package managers. Btw, Buck2 is pretty awesome and supports a ton of languages beyond python, but hasn't gotten a ton of traction outside of Meta.
"linting, formatting" - [Black](https://github.com/psf/black) and other public ecosystem tooling is great, no need to develop internally.
"why is type checking the exception" - Don't know about Astral, but for Meta / Google, most everyone else doesn't design for the scale of their monorepos. Meta moved from SVN to Git to Mercurial, then forked Mercurial into [Sapling](https://sapling-scm.com/) because simple operations were too slow for the number of files in their repo, and how frequently they receive diffs.
There are obvious safety benefits to type checking, but with how much Python code Meta has, mypy is not an option - it would take far too much time / memory to provide any value.
senkora•1d ago
The gradual guarantee that Ty offers is intriguing. I’m considering giving it a try based on that.
With a language like Python with existing dynamic codebases, it seems like the right way to do gradual typing.
yoyohello13•1d ago
RandomBK•1d ago
More restrictive requirements (ie `noImplicitAny`) could be turned on one at a time before eventually flipping the `strict` switch to opt in to all the checks.
rendaw•1d ago
I get where they're coming from, but the endgame was a huge issue when I tried mypy - there was no way to actually guarantee that you were getting any protection from types. A way to assert "no graduality to this file, it's fully typed!" is critical, but gradual typing is not just about migrating but also about the crazy things you can do in dynamic languages and being terrified of false positives scaring away the people who didn't value static typing in the first place. Maybe calling it "soft" typing would be clearer.
I think gradual typing is an anti-pattern at this point.
mmoskal•1d ago
belmont_sup•1d ago
Sorbet (Ruby typechecker) does this where it introduces a runtime checks on signatures.
Similarly in ts, we have zod.
MeetingsBrowser•1d ago
That's the problem with bugs though, there's always something that could have been done to avoid it =)
Pydantic works great in specific places, like validating user supplied data, but runtime checks as a replacement for static type checkers are not really feasible.
Every caller would need to check that the function is being called correctly (number and position of args, kwarg names, etc) and every callee would need to manually validate that each arg passed matches some expected type.
eternityforest•1d ago
Type checking is real time in the IDE and lets you fix stuff before you waste fifteen minutes actually running it.
Spivak•1d ago
guappa•11h ago
belmont_sup•17h ago
But the reality is that teams have started with untyped Python, Ruby, and Javascript, have been productive, and now need to gradually add static types to remain productive.
> Every caller would need to check that the function...
The nice part here is where the gradual part comes in. As you are able to type more of your code, you're able to move where you add your runtime validation, and eventually you'll be able to move all validation to the edges of your system.
dcreager•1d ago
This is a good point, and one that we are taking into account when developing ty.
The benefit of the gradual guarantee is that it makes the onboarding process less fraught when you want to start (gradually) adding types to an untyped codebase. No one wants a wall of false positive errors when you first start invoking your type checker.
The downside is exactly what you point out. For this, we want to leverage that ty is part of a suite of tools that we're developing. One goal in developing ty is to create the infrastructure that would let ruff support multi-file and type-aware linter rules. That's a bit hand-wavy atm, since we're still working out the details of how the two tools would work together.
So we do want to provide more opinionated feedback about your code — for instance, highlighting when implicit `Any`s show up in an otherwise fully type-annotated function. But we view that as being a linter rule, which will likely be handled by ruff.
eternityforest•1d ago
ramses0•21h ago
genshii•1d ago
The difference here is that strict mode is a tsc option vs. having this kind of rule in the linter (ruff), but the end result is the same.
Anyway, that was a long winded way of saying that ty or ruff definitely needs its own version of a "strict" mode for type checking. :)
pydry•1d ago
josevalim•1d ago
That depends on the implementation of gradual typing. Elixir implements gradual set-theoretic types where dynamic types are a range of existing types and can be refined for typing violations. Here is a trivial example:
    def example(x) do
      Integer.to_string(x)
      Atom.to_string(x)
    end

Since the function is untyped, `x` gets an initial value of `dynamic()`, but it still reports a typing violation because it first gets refined as `dynamic(integer())`, which is then incompatible with the `atom()` type. We also introduced the concept of strong arrows, which allows dynamic and static parts of a codebase to interact without introducing runtime checks while remaining sound. More information here: https://elixir-lang.org/blog/2023/09/20/strong-arrows-gradua...
_carljm•6h ago
In your example, wouldn't `none()` be a type for `x` that satisfies both `Integer.to_string(x)` and `Atom.to_string(x)`? Or do you special-case `none()` and error if it occurs?
HelloNurse•5h ago
If the body of the function contained only the first or the second call, the verdict would have been that x is respectively an Integer or an Atom and the type of the function is the type of the contained expression.
tclancy•1d ago
rtpg•11h ago
You still have the "garbage in/garbage out" problem on the boundaries but at the very least you can improve confidence. And if you're hardcore... turn that on all over, turn off explicit Any, write wrappers around all of your untyped dependencies etc etc. You can get what you want, just might be a lot of work
guappa•11h ago
Less than it took you to write all that.
tialaramex•1d ago
IshKebab•1d ago
Hopefully they'll add some kind of no-implicit-any or "strict" mode for people who care about having working code...