I'm the maker of MCPxel (https://mcpxel.com).
The Problem:
As an active user of Claude/MCP, I found the ecosystem fragmented. Searching GitHub for "agent skills" returns thousands of results, but many are broken, outdated, or just simple wrappers. Stars often don't correlate with reliability.
The Solution:
I built MCPxel to curate and rate these skills.
How it works:
LLM-as-a-Judge: We use an automated pipeline to evaluate skills on 5 dimensions (Clarity, Utility, Quality, Maintainability, Novelty).
S-Tier Filtering: We assign grades (S/A/B/C). S-Tier skills are verified to work out of the box.
Role-Based Search: Instead of keyword guessing, you can find tools tailored for specific workflows (e.g., "frontend dev"). A simplified sketch of the grading and role-filtering idea follows below.
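For the curious, here's a rough TypeScript sketch of the idea. The type names, thresholds, and plain averaging are illustrative only, not the production pipeline:

    // Illustrative sketch: per-dimension judge scores collapse into an
    // S/A/B/C grade, and role tags drive search. Names and cutoffs are
    // assumptions for explanation, not MCPxel's actual code.

    type Grade = "S" | "A" | "B" | "C";

    interface JudgeScores {
      clarity: number;         // 0-10, assigned by the LLM judge
      utility: number;
      quality: number;
      maintainability: number;
      novelty: number;
    }

    interface RatedSkill {
      name: string;
      repo: string;
      scores: JudgeScores;
      roles: string[];         // e.g. ["frontend dev", "data engineer"]
    }

    // Assumed cutoffs; the real weighting is more involved.
    function toGrade(s: JudgeScores): Grade {
      const avg =
        (s.clarity + s.utility + s.quality + s.maintainability + s.novelty) / 5;
      if (avg >= 9) return "S";
      if (avg >= 7.5) return "A";
      if (avg >= 6) return "B";
      return "C";
    }

    // Role-based lookup: match the role tag, keep only S/A-tier results.
    function searchByRole(skills: RatedSkill[], role: string): RatedSkill[] {
      return skills.filter((skill) => {
        const grade = toGrade(skill.scores);
        return skill.roles.includes(role) && (grade === "S" || grade === "A");
      });
    }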
Tech Stack:
Built with Next.js. We leaned heavily on TRAE (an AI IDE) and DeepSeek-V3 during development to accelerate the process.
Try it out:
Check out the LLM-judged ratings (S/A/B/C). I'd love to know if the scores match your experience with these tools!