I mashed together some stats from PyPI, GitHub, ClickHouse, and BigQuery to give a fuller picture of the dependencies you may want to use.
Pulling the download counts from ClickHouse and some more data from BigQuery takes seconds.
Getting the GitHub data takes hours: I use batched GraphQL queries and stay just under the various rate limits.
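
For anyone curious what batching GraphQL queries can look like, here is a rough sketch that packs several repository lookups into one request using aliases. It is not the code behind the site: the helper names `build_batch_query` and `chunked`, the selected fields, and any batch size you pick are assumptions for illustration.

```python
import json


def build_batch_query(repos):
    """Build one GraphQL query that looks up several repositories via aliases.

    `repos` is a list of (owner, name) pairs. Hypothetical helper.
    """
    parts = []
    for i, (owner, name) in enumerate(repos):
        # json.dumps quotes and escapes the values as string literals.
        parts.append(
            f"  r{i}: repository(owner: {json.dumps(owner)}, name: {json.dumps(name)}) {{\n"
            "    stargazerCount\n"
            "    pushedAt\n"
            "  }"
        )
    # Asking for rateLimit in the same query shows how much budget is left.
    parts.append("  rateLimit { cost remaining resetAt }")
    return "query {\n" + "\n".join(parts) + "\n}"


def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


if __name__ == "__main__":
    repos = [("pallets", "flask"), ("psf", "requests"), ("django", "django")]
    for batch in chunked(repos, 2):
        print(build_batch_query(batch))
```

Each batch would be sent as a single POST to the GraphQL endpoint, and the `rateLimit` values in the response can be used to decide how long to wait before sending the next one.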
I use FastAPI to serve the data.
About 70% of packages have a resolvable GitHub repo.
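
One plausible way to resolve a repo is to scan a package's PyPI metadata for a GitHub URL. The sketch below assumes the `info` object from the PyPI JSON API (its `project_urls` and `home_page` fields); the function name `resolve_github_repo` and the matching rule are hypothetical, not necessarily how the site does it.

```python
import re

# Matches https://github.com/<owner>/<name>, with or without "www.".
_GITHUB_RE = re.compile(
    r"https?://(?:www\.)?github\.com/([A-Za-z0-9_.-]+)/([A-Za-z0-9_.-]+)",
    re.IGNORECASE,
)


def resolve_github_repo(info):
    """Return (owner, name) for the first GitHub URL found, else None.

    `info` is assumed to be the "info" dict of a PyPI JSON API response.
    """
    urls = list((info.get("project_urls") or {}).values())
    urls.append(info.get("home_page") or "")
    for url in urls:
        match = _GITHUB_RE.match(url or "")
        if match:
            owner, name = match.groups()
            # Clone URLs sometimes carry a trailing ".git".
            if name.endswith(".git"):
                name = name[:-4]
            return owner, name
    return None


if __name__ == "__main__":
    info = {"project_urls": {"Source": "https://github.com/pallets/flask/"}}
    print(resolve_github_repo(info))
```

A rule this simple will also accept non-repository paths such as sponsor pages, so a real pipeline would need extra checks, and packages that link only to other hosts or to nothing at all stay unresolved.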
bramwick•1h ago
Feedback welcome.