I’ve always been frustrated by the lack of an accurate ranking for top open-source contributors on GitHub. The available lists either cap out early or are highly localized, completely missing developers with tens or hundreds of thousands of contributions.
So, I built DevIndex to rank the top 50,000 most active developers globally based on their lifetime contributions.
From an engineering perspective, the constraint I imposed was: *No backend API.* I wanted to host this entirely on GitHub Pages for free, meaning the browser had to handle all 50,000 data-rich records directly.
Here is how we made it work:
1. *The Autonomous Data Factory (Backend):* Because GitHub's API has no "Lifetime Contributions" endpoint, we built a Node.js pipeline running on GitHub Actions. It uses a "Network Walker" spider to traverse the social graph (to break out of algorithmic filter bubbles) and an Updater that chunks GraphQL queries to prevent 502 timeouts. The pipeline continuously updates a single `users.jsonl` file.
*Privacy Note:* We use a "Stealth Star" architecture for opt-outs. If a dev stars our opt-out repo, the pipeline cryptographically verifies them, instantly purges their data, and blocklists them. No emails required.
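The chunked GraphQL updater in point 1 boils down to batching users into one aliased query per request. A rough sketch of that pattern (the batch size and exact field selection are my assumptions, not the project's actual values):

```javascript
// Split a long list of logins into fixed-size chunks so each GraphQL
// request stays small enough to finish before the gateway times out.
function* chunks(items, size) {
    for (let i = 0; i < items.length; i += size) {
        yield items.slice(i, i + size);
    }
}

// Build one GraphQL document that fetches many users at once via
// aliases (u0, u1, ...) instead of one HTTP round-trip per user.
function buildBatchQuery(logins) {
    const fields = logins.map((login, i) => `
        u${i}: user(login: ${JSON.stringify(login)}) {
            login
            contributionsCollection {
                contributionCalendar { totalContributions }
            }
        }`).join('');
    return `query {${fields}\n}`;
}

// Usage sketch: POST each batch to the GraphQL endpoint and append
// the results to users.jsonl.
const batches = [...chunks(['alice', 'bob', 'carol'], 2)].map(buildBatchQuery);
```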
2. *Engine-Level Streaming (O(1) Memory Parsing):*
You can't buffer a 23MB JSONL file and hand it to `JSON.parse()` without freezing the UI. We built a Stream Proxy using `ReadableStream` and `TextDecoderStream` to parse the NDJSON incrementally, rendering the first 500 users instantly while the rest load in the background.

3. *Turbo Mode & Virtual Fields:* Instantiating 50k JS class instances crushes memory. The store holds the raw POJOs exactly as parsed. Complex calculated fields (like "Total Commits 2024") are prototype-based getters dynamically generated by a RecordFactory, so adding 60 new data columns adds 0 bytes of memory overhead per record.
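The incremental parse in point 2 hinges on buffering partial lines across chunk boundaries. A minimal sketch of that core (the class name and flush behavior are mine, not the actual Stream Proxy API):

```javascript
// Accumulates decoded text chunks and emits one parsed record per
// complete NDJSON line; a partial trailing line is carried over until
// the next chunk arrives, so memory use stays flat.
class NdjsonParser {
    #carry = '';

    // Returns the records completed by this chunk.
    push(chunk) {
        const lines = (this.#carry + chunk).split('\n');
        this.#carry = lines.pop(); // last piece may be incomplete
        return lines.filter(Boolean).map(line => JSON.parse(line));
    }

    // Call once the stream ends, in case the file lacks a final newline.
    flush() {
        const rest = this.#carry.trim();
        this.#carry = '';
        return rest ? [JSON.parse(rest)] : [];
    }
}

// In the browser this would be fed by the fetch body, e.g.
// response.body.pipeThrough(new TextDecoderStream()) -> parser.push(chunk)
const parser  = new NdjsonParser();
const records = [
    ...parser.push('{"login":"alice","rank":1}\n{"login":"b'),
    ...parser.push('ob","rank":2}\n'),
    ...parser.flush()
];
```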
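Point 3's zero-overhead columns work because getters live on a shared prototype rather than on each record. A sketch of what such a RecordFactory might generate (the field names and data shape are hypothetical):

```javascript
// Build one shared prototype whose getters compute fields on demand.
// Getters live on the prototype, so 50k records share a single function
// object per column: adding columns costs 0 bytes per record.
function createRecordPrototype(virtualFields) {
    const proto = {};
    for (const [name, compute] of Object.entries(virtualFields)) {
        Object.defineProperty(proto, name, {
            get() { return compute(this); }
        });
    }
    return proto;
}

// Hypothetical calculated columns over a raw per-year contribution map
const proto = createRecordPrototype({
    totalCommits2024: rec => rec.years?.['2024'] ?? 0,
    lifetimeTotal:    rec => Object.values(rec.years ?? {}).reduce((a, b) => a + b, 0)
});

// The store keeps the POJO exactly as parsed; we only swap its prototype.
const record = Object.setPrototypeOf(
    { login: 'alice', years: { '2023': 900, '2024': 1200 } },
    proto
);
```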
4. *The "Fixed-DOM-Order" Grid:* We had to rewrite our underlying UI engine (Neo.mjs). Traditional VDOMs die on massive lists because scrolling triggers thousands of `insertBefore`/`removeChild` mutations. We implemented a strict DOM pool. The VDOM array length never changes. Rows leaving the viewport are recycled in place via hardware-accelerated CSS `translate3d`. A 60fps vertical scroll across 50,000 records generates 0 structural DOM mutations.
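The fixed-DOM-order recycling in point 4 can be reduced to a pure layout function: assign each visible row to the pool node whose index matches `dataIndex % poolSize`, so DOM order never changes and only transform values do. This is my reconstruction of the scheme, not Neo.mjs internals:

```javascript
// For a given scroll position, decide which data row each pooled DOM
// node shows and where to translate it. The returned array is in fixed
// DOM order; only dataIndex and the transform change while scrolling,
// so no insertBefore/removeChild ever fires.
function layoutPool({scrollTop, rowHeight, poolSize, totalRows}) {
    const first = Math.min(
        Math.floor(scrollTop / rowHeight),
        Math.max(0, totalRows - poolSize)
    );
    return Array.from({length: poolSize}, (_, node) => {
        // Row assigned to this node: the index in the visible window
        // that is congruent to `node` modulo poolSize.
        const dataIndex = first + (((node - first % poolSize) + poolSize) % poolSize);
        return {
            node,
            dataIndex,
            transform: `translate3d(0, ${dataIndex * rowHeight}px, 0)`
        };
    });
}

const frame = layoutPool({scrollTop: 25, rowHeight: 10, poolSize: 3, totalRows: 50000});
```

Because the mapping is stable, a node keeps its row while that row stays in the window; only the node scrolling out of view gets a new `dataIndex` and a new transform.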
5. *Quintuple-Threaded Architecture:* To keep sorting fast and render "Living Sparklines" in the cells, we aggressively split the workload across workers. The Main Thread only applies DOM updates. The App Worker handles the 50k dataset, streaming, and VDOM generation. A dedicated Canvas Worker renders the sparklines independently at 60fps using `OffscreenCanvas`.
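Per sparkline, the Canvas Worker's job in point 5 is pure geometry before any drawing happens. A sketch of the coordinate mapping it might run off the main thread (the normalization choices are assumptions):

```javascript
// Map a contribution series onto canvas coordinates: x spreads evenly
// across the width, y is normalized to the series range and flipped,
// since canvas y grows downward.
function sparklinePoints(series, width, height) {
    const min   = Math.min(...series);
    const max   = Math.max(...series);
    const range = max - min || 1; // flat series -> draw a flat line
    const stepX = series.length > 1 ? width / (series.length - 1) : 0;
    return series.map((value, i) => [
        i * stepX,
        height - ((value - min) / range) * height
    ]);
}

// Inside the worker this would feed an OffscreenCanvas 2d context, e.g.
// ctx.moveTo(...points[0]); points.slice(1).forEach(p => ctx.lineTo(...p));
const points = sparklinePoints([0, 5, 10], 100, 20);
```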
The entire backend pipeline, streaming UI, and core engine rewrite were completed in one month by myself and my AI agent.
Live App (see where you rank): https://neomjs.com/apps/devindex/

Code / 26 Architectural Guides: https://github.com/neomjs/neo/tree/dev/apps/devindex
Would love to hear feedback on the architecture, especially from anyone who has tackled "Fat Client" scaling issues or massive GraphQL aggregation!