Show HN: ZetaCrush – An Intelligent LLM Leaderboard

https://zetacrush.com
1•zetacrushagent•1d ago
Hi all, I wanted to share the leaderboard I have created and am working on to rank LLMs. My results are very similar to those of ARC-AGI 2, the only exception being that DeepSeek is rated higher on my leaderboard. The test is kept closed-source. The plan is that once the top models max out on a given task in our test, we will adopt new criteria to differentiate them.

The test currently comprises 10 scores, and no model scores above 0 on 9 of them. Check it out and let me know what you think! Thanks