Also, it's worth noting that this data only seems to cover 2012-2014?
Data analysis without adjusting groups by popularity is a bit lame.
A cursory glance at the source code[1] reveals that it's using GitHub Archive data. Looking through the gharchive data[2], it seems like it was last updated in 2024. So there's 10 years of publicly accessible new data.
Is there any reason we (by "we" I mean random members of the community, as opposed to the developer of the project) can't rebuild GitHut with the new data, seeing as it's open source? It only processes the repo metadata, so it shouldn't even be that much data and should come in well under BigQuery's free 1TB/month query tier; the processed 2014 data stored in the repo[3] is only 71MB, though I assume the 2024 data will be larger. Cost shouldn't be a concern.
I'm not experienced enough to know whether creating an updated version of this would take an afternoon or several weeks.
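For anyone curious what the query side might look like, here's a minimal sketch (not GitHut's actual pipeline) that counts pull-request events per language for one month of GH Archive data, using the BigQuery Python client. The table naming (githubarchive.month.YYYYMM) and the payload JSON path are assumptions based on the gharchive docs, so adjust as needed:

    # Minimal sketch, not GitHut's actual pipeline: count PullRequestEvents
    # per language for one month of GH Archive data. Assumes the public
    # githubarchive.month.YYYYMM tables exist and that `payload` is a JSON
    # string containing pull_request.base.repo.language.
    from google.cloud import bigquery

    client = bigquery.Client()  # uses your default GCP project/credentials

    query = """
    SELECT
      JSON_EXTRACT_SCALAR(payload, '$.pull_request.base.repo.language') AS language,
      COUNT(*) AS events
    FROM `githubarchive.month.202401`
    WHERE type = 'PullRequestEvent'
    GROUP BY language
    ORDER BY events DESC
    """

    for row in client.query(query).result():
        print(row.language, row.events)

BigQuery bills by bytes scanned, so a single month-table scan like this should stay comfortably inside the free tier; scanning all ten years is where you'd want to check the cost estimate first.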
[1]: https://github.com/littleark/githut/
[2]: https://console.cloud.google.com/bigquery?project=githubarch...
[3]: https://github.com/littleark/githut/blob/master/server/data/...
There is also GitHut 2.0, which is updated through 2024: https://madnight.github.io/githut/#/pull_requests/2024/1