AITools.coffee – GitHub metrics observatory tracking 27K+ open-source AI repos

https://aitools.coffee
1•alexela84•1h ago

Comments

alexela84•1h ago
Hey HN! I'm the creator of AITools.coffee. This is a metrics observatory for the open-source AI ecosystem – think "GitHub Archive meets awesome-AI, but with daily time-series tracking."

What makes this different from awesome-lists? Awesome-lists are static Markdown files. They're great for discovery, but they:

- Require manual PRs to update
- Show current state only (no historical trends)
- Don't track metrics (stars, forks, contributors, etc.)
- Go stale quickly

AITools is a live database that:

- Syncs 27,769 repositories daily via the GitHub GraphQL API
- Tracks 16 metrics per repo (stars, forks, issues, PRs, releases, commits, contributors, etc.)
- Stores daily snapshots for time-series analysis (430K+ metric snapshots collected so far)
- Auto-removes dead/archived repos, auto-heals renamed repos with 301 redirects

Technical Architecture

Backend:

- PostgreSQL 18 (27K repos, 21K authors, 430K metric snapshots)
- PHP 8.3 REST API with JWT auth
- Nightly cron (00:01 UTC) running GitHub GraphQL sync (~25 min for full sync)

Discovery Pipeline:

- Python scripts sweep 50+ AI organizations (OpenAI, Meta, Google, Anthropic, Hugging Face, etc.)
- GitHub Search API monitors 30+ topics (machine-learning, LLM, transformers, etc.)
- Gemini 2.5 Flash classifies repos into 30+ categories
- 100% manual review before publish (3-layer quality filter)

Frontend:

- Tailwind CSS with glassmorphism design
- Alpine.js for interactivity
- Chart.js + D3.js for metrics visualization (star distribution, language breakdown, contributor growth)

Data Freshness:

- Last sync: typically <6 hours ago
- 440K+ datapoints added daily (27K repos × 16 metrics)
- Rate limit: 1 GraphQL query/sec (stays under GitHub's 5K pts/hr)

What I'm tracking

Per repository (16 datapoints): stars, forks, watchers, open issues, open PRs, releases, commits (last 100), contributors, size, archived status, default branch, pushed_at, created_at, license, language, topics
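Most of the per-repository fields above have direct counterparts in GitHub's public GraphQL schema. Here is a minimal sketch of what one per-repo request body could look like. The query selection and the `build_payload` helper are assumptions for illustration, not the site's actual sync code, and commit and contributor counts are left out to keep the sketch short.

```python
import json

# Hypothetical query: the field names exist in GitHub's GraphQL schema,
# but this particular selection is an assumption, not AITools' real query.
REPO_METRICS_QUERY = """
query RepoMetrics($owner: String!, $name: String!) {
  repository(owner: $owner, name: $name) {
    stargazerCount
    forkCount
    watchers { totalCount }
    issues(states: OPEN) { totalCount }
    pullRequests(states: OPEN) { totalCount }
    releases { totalCount }
    diskUsage
    isArchived
    pushedAt
    createdAt
    defaultBranchRef { name }
    licenseInfo { spdxId }
    primaryLanguage { name }
    repositoryTopics(first: 20) { nodes { topic { name } } }
  }
}
"""


def build_payload(full_name: str) -> str:
    """Return a JSON body suitable for POSTing to https://api.github.com/graphql."""
    owner, name = full_name.split("/", 1)
    return json.dumps(
        {"query": REPO_METRICS_QUERY, "variables": {"owner": owner, "name": name}}
    )
```

The body would be sent with an `Authorization: bearer <token>` header. Keeping the query static and passing the repo as variables makes it easy to reuse for every tracked repository.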

Per author (8 datapoints): followers, following, public repos, gists, bio, company, location, created_at

All stored as daily snapshots → enables time-series analysis (star velocity, contributor growth, issue trends).
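As a rough illustration of what daily snapshots make possible, star velocity can be computed as the difference between two snapshots divided by the days between them. This is a sketch under assumptions: the `star_velocity` function and the in-memory `dict` of snapshots are hypothetical stand-ins for whatever the database query actually returns.

```python
from datetime import date, timedelta
from typing import Dict, Optional


def star_velocity(
    snapshots: Dict[date, int], as_of: date, window_days: int = 7
) -> Optional[float]:
    """Average stars gained per day over the window.

    Returns None when either endpoint snapshot is missing, rather than
    guessing from a neighbouring day.
    """
    start = as_of - timedelta(days=window_days)
    if start not in snapshots or as_of not in snapshots:
        return None
    return (snapshots[as_of] - snapshots[start]) / window_days


# Illustrative values only, not real data.
snaps = {date(2026, 2, 1): 1000, date(2026, 2, 8): 1070}
print(star_velocity(snaps, date(2026, 2, 8)))  # 10.0
```

The same shape works for contributor growth or issue trends by swapping the metric stored in the snapshot.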

Current Scale

- 27,769 AI repositories tracked
- 20,992 open-source authors
- 12.4M+ total GitHub stars (aggregated)
- 430K+ metric snapshots collected
- 440K datapoints added per day

Limitations & Future Plans

What's NOT implemented yet:

- Public API (planned Q2 2026, always free with rate limits)
- Historical charts (star growth over time) – data is there, visualization coming soon
- Trending repos (7-day star velocity ranking) – planned next month
- Email alerts for repo milestones – maybe later

Open Source?

Not yet. I'm considering open-sourcing the discovery pipeline + classification logic, but the full platform will likely remain closed-source (hosting costs, spam prevention, API abuse).

Why I built this

I got frustrated manually tracking AI repos across GitHub, Twitter, and Discord. There's no single place to:

- Compare similar tools by actual metrics (not just star count)
- See which projects are actively maintained vs abandoned
- Track contributor velocity (is the project growing or stagnating?)
- Filter by license, language, framework, use case

Awesome-lists are great for curated discovery, but terrible for data-driven analysis. I wanted both.

Questions I'm expecting

Q: How do you handle spam/SEO farms? A: 3-layer filter: (1) Gemini AI relevance check, (2) Manual review (100% of submissions), (3) Automated quality signals (min 10 stars, active within 2 years, not archived).
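The third layer is simple enough to sketch in a few lines. This is a hedged illustration, not the production filter: the `Repo` dataclass and `passes_quality_signals` are hypothetical names, "active" is assumed to mean the last push date, and "2 years" is approximated as 730 days.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class Repo:
    stars: int
    pushed_at: datetime  # last push, timezone-aware (assumed activity signal)
    archived: bool


def passes_quality_signals(repo: Repo, now: datetime) -> bool:
    """Automated signals: min 10 stars, active within 2 years, not archived."""
    active_cutoff = now - timedelta(days=2 * 365)  # "2 years" approximated
    return (
        repo.stars >= 10
        and repo.pushed_at >= active_cutoff
        and not repo.archived
    )
```

Passing `now` in explicitly, instead of calling `datetime.now()` inside the function, keeps the check deterministic and easy to test.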

Q: What about non-GitHub repos (GitLab, Bitbucket)? A: Not supported yet. 99% of open-source AI is on GitHub, so I focused there. May expand later if there's demand.

Q: Can I submit my own project? A: Yes! Use the "Submit Tool" form (requires GitHub login to prevent spam). Your repo will be queued for review. Alternatively, if you're in one of the 50+ orgs I monitor, your repo will be discovered automatically within a week.

Q: How accurate is Gemini classification? A: ~85% accurate on initial categorization. I manually review and re-categorize misclassifications. Common mistakes: RAG frameworks → agent frameworks, base models → fine-tuned models.
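One way an accuracy figure like this could be derived is by comparing the model's label with the label that survives manual review. The sketch below assumes that approach; `review_stats` and the label strings are hypothetical, and the sample pairs are toy values rather than real review data.

```python
from collections import Counter


def review_stats(pairs):
    """pairs: iterable of (model_label, reviewed_label) tuples.

    Returns (accuracy, confusions), where confusions is a list of
    ((reviewed_label, model_label), count) sorted by frequency.
    """
    pairs = list(pairs)
    correct = sum(1 for predicted, reviewed in pairs if predicted == reviewed)
    confusions = Counter(
        (reviewed, predicted) for predicted, reviewed in pairs if predicted != reviewed
    )
    accuracy = correct / len(pairs) if pairs else 0.0
    return accuracy, confusions.most_common()


# Toy input for illustration only.
sample = [("agent", "rag"), ("rag", "rag"), ("agent", "agent"), ("agent", "rag")]
accuracy, confusions = review_stats(sample)
print(accuracy)    # 0.5
print(confusions)  # [(('rag', 'agent'), 2)]
```

Sorting the confusions by count surfaces the most frequent mix-ups first, which is where prompt or category changes would pay off most.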

Q: Will you add X feature? A: Probably! Top requests: historical star charts, trending page, email alerts, public API. Working through them in order of complexity vs impact.

Q: What's your business model? A: None yet. This is a side project that costs ~$30/month (SiteGround hosting + Gemini API). If it grows beyond hobby scale, I might add sponsored listings or premium API tiers, but the core data will stay free.

Feedback welcome! Especially:

- Missing repos/categories you'd like to see tracked
- UI/UX improvements (the homepage is dense with data, might be overwhelming)
- Technical architecture critiques (I'm sure there are better ways to do this)
- Feature requests (what metrics would actually be useful?)

Tech stack: PostgreSQL, PHP, Python, Gemini 2.5 Flash, GitHub GraphQL API, Chart.js, D3.js, TailwindCSS, Alpine.js

Live at: https://aitools.coffee

Should your developer company go open source?

https://extremefoundership.substack.com/p/should-your-developer-company-go
2•paraphrenia•1m ago•0 comments

Awesome Vibe Coding – 245 AI coding tools and resources

https://github.com/taskade/awesome-vibe-coding
1•johnxie•2m ago•0 comments

Accelerating Mathematical and Scientific Discovery with Gemini Deep Think

https://deepmind.google/blog/accelerating-mathematical-and-scientific-discovery-with-gemini-deep-...
2•zuzatm•2m ago•0 comments

Rebuilding the IBM 701C Butterfly Keyboard Laptop

https://hackaday.io/project/204953-rebuilding-the-ibm-701c-butterfly-keyboard-laptop
1•rbanffy•3m ago•0 comments

Cancer might protect against Alzheimer's – this protein helps explain why

https://www.nature.com/articles/d41586-026-00222-7
2•bookofjoe•3m ago•0 comments

Show HN: MOL – A programming language where pipelines trace themselves

https://github.com/crux-ecosystem/mol-lang
2•MouneshK•4m ago•0 comments

A Visual Source for Shakespeare's 'Tempest'

https://profadamroberts.substack.com/p/a-visual-source-for-shakespeares
2•seegodanddie•4m ago•0 comments

Python Steering Council pleased to accept PEP 814 – Add frozendict built-in type

https://discuss.python.org/t/pep-814-add-frozendict-built-in-type/104854?page=6
1•fratellobigio•5m ago•1 comments

We interfaced single-threaded C++ with multi-threaded Rust

https://antithesis.com/blog/2026/rust_cpp/
1•PaulHoule•5m ago•0 comments

Composer 1.5

https://cursor.com/blog/composer-1-5
2•gmays•6m ago•0 comments

Camera that captures photos to cassette tape

https://hackaday.io/project/205004-digital-analog-tape-picture-camera
2•Jun8•7m ago•0 comments

Adblock Filter List Fingerprinting

https://adbleed.eu/
1•drhonc•11m ago•0 comments

Quantum computing is near. Maryland wants to lead the way

https://www.washingtonpost.com/dc-md-va/2026/02/08/this-maryland-labs-quantum-computer-could-cure...
1•rbanffy•11m ago•0 comments

European Commission breached – investigating mobile hack

https://www.computing.co.uk/news/2026/security/european-commission-breached
4•rbanffy•12m ago•0 comments

Show HN: Wallfacer – Persistent development environments for AI coding agents

https://www.wallfacer.ai/blog/announcing-wallfacer
5•theunquietone•13m ago•0 comments

See no Evil(ginx) / Detecting and stopping AitM phishing threats

https://blog.kulkan.com/see-no-evil-ginx-detecting-and-stopping-aitm-phishing-threats-4b9b368166c3
1•laserspeed•13m ago•1 comments

"Discord alternatives" searches jump 10k% overnight

https://www.windowscentral.com/software-apps/discord-alternative-search-10000-percent-stoat
2•croes•13m ago•0 comments

Client-side EXIF removal instead of uploading photos

1•FrankTheBear•13m ago•0 comments

Just Ring The Bell – A 10-second pause when you feel a craving

https://justringthebell.forgefluir.com
1•forge_craft•16m ago•1 comments

Point of no return: a hellish 'hothouse Earth' getting closer, scientists say

https://www.theguardian.com/environment/2026/feb/11/point-of-no-return-hothouse-earth-global-heat...
3•hackernj•17m ago•0 comments

Ask HN: Got Sidetracked, How to Cope?

1•no_real_skills•17m ago•0 comments

Show HN: Brighten – Employee recognition platform with peer-to-peer rewards

https://www.hellobrighten.com
3•Palavir•18m ago•1 comments

Sac of Enshittification

https://digitalsorceress.com/b/2026-01-10_SAC_of_Enshittification
2•surprisetalk•18m ago•0 comments

The Perfect Device

https://sometimes.digital/posts/the-perfect-device/
1•surprisetalk•18m ago•0 comments

FWD: Re: radioactive fungus email from grandma (2024)

https://taylor.town/radioactive-fungi
1•surprisetalk•18m ago•0 comments

The pitch deck is dead. Write a pitch.md instead

https://www.joanwestenberg.com/the-pitch-deck-is-dead-write-a-pitch-md-instead/
2•surprisetalk•18m ago•0 comments

Show HN: AI People Search Engine for SF

https://fizzbase.com
2•JohnLins•19m ago•1 comments

The Battle for Prince's Estate (2024)

https://www.forbes.com/sites/matthewerskine/2024/01/17/the-battle-for-princes-estate-unending-con...
1•simonebrunozzi•19m ago•0 comments

Show HN: Veronium - A tool to gamify life and beat procrastination

https://veronium.com/demo.html
2•facelesartist•19m ago•1 comments

Claude Cowork Has No SOC2, No Audit Logs, No MultiUser. It Wiped $285B from SaaS

https://substack.com/@emwirty/note/p-187556851
1•emmawirt•20m ago•0 comments