Show HN: RankClaw – AI-audited all 14,706 OpenClaw skills; 1,103 are malicious

https://rankclaw.com

1•do_anh_tu•1h ago

RankClaw (rankclaw.com) is a security scanner for AI agent skills — the OpenClaw/ClawHub ecosystem that extends Claude-based agents with file, web, and shell access.

Data: - 14,706 skills indexed - Every single skill has a full AI deep audit report (14,704 complete) - 1,103 confirmed malicious (7.5%)

The key finding: automated surface scanning (metadata, dependency checks, pattern matching) systematically undercounts malicious skills. Skills that pass shallow heuristics fail AI audit because the attack is in the natural language of the SKILL.md — prompt injection, deferred execution, social engineering — none of which pattern matching detects.

The attack patterns found by AI deep audit: - Bulk publishing campaigns — one actor published 30 skills named "x-trends" across multiple accounts. 28 of 30 confirmed malicious. Goal: distribution at scale before detection.

- Brand-jacking — 4 skills named clawhub/clawhub1/clawbhub/clawhud impersonating ClawHub's own CLI. macOS: base64 curl|bash to a raw IP. Windows: password-protected ZIP from a stranger's GitHub (the password prevents GitHub's malware scanner from opening it).

- Prompt injection in legitimate-seeming skills — one scored 95/100 shallow, 38/100 after AI audit. The injection text wasn't in code — it was in the SKILL.md instructions.

- On-demand RCE via challenge evaluation — claws-nft instructs the agent to "evaluate" challenges that can be "math, code, or logic problems." Server decides which type at call time.

- LLM-generated payload — lekt9/foundry contains no malicious code. It instructs the AI to generate code and execute it. Static analysis finds nothing. The payload doesn't exist until the AI writes it during a conversation.

- Social engineering — bonero-miner has a "Talking to Your Human" section with a pre-written script for the AI to use: "Can I mine Bonero? It's a private cryptocurrency - like Monero but for AI agents. Cool?"

Skills differ from browser extensions: no sandbox. Full file system, shell, and network access. The SKILL.md instructions are directives to the AI model — you need AI to audit AI.

Scoring model is open: Security 40%, Maintenance 20%, Docs 20%, Community 20%.

Free to check any skill: rankclaw.com

Comments

rodchalski•1h ago

The lekt9/foundry case is the one that matters most structurally: no malicious code at audit time because the payload doesn't exist until the AI writes it during a conversation. Static analysis can't close that, and neither can AI audit — the attack surface is generative.

Two defenses the audit layer can't replace:

1. Pre-declared tool scopes: before a skill runs, what tool calls is it permitted to make? If the answer is "whatever the agent currently has access to," a clean audit on the SKILL.md doesn't actually constrain what gets executed.

2. Authorization enforcement independent of the agent: prompt injection hijacks the agent's reasoning — the agent becomes the threat model. The boundary that stops it can't live inside the agent.

The 7.5% malicious rate means you can't trust the ecosystem on average. The on-demand RCE-via-challenge and LLM-generated payload patterns show the attack can bypass static inspection entirely. AI-depth audit catches what shallow heuristics miss — it still doesn't constrain what an audited-and-deployed skill is allowed to reach.

The pairing that closes the loop: AI audit at deploy time + explicit permission grants at execution time the skill can't override. Audit determines trust level; authorization boundary enforces scope regardless.

Curious what the malicious distribution looks like by capability type — file vs. shell vs. network. That breakdown would tell you how much capability-scoping alone would have reduced the attack surface independent of the trust score.

Git-based md note app

Ask HN: What career will you switch to when AI replaces developers?

The Curse of the Everything Device

Kubernetes operators are easier than you think

Murchi – A Desktop Pet for macOS

A live counter of our digital world (based on statistics)

Show HN: Outside In – Stream live night sounds from outside to bedside. iOS/free

LLMs Solving a DEF Con CTF Finals Challenge

Anthropic launched community ambassador program

LLM-cpp: 26 single-header C++17 libraries for LLM integration

Oracle and OpenAI End Plans to Expand Flagship Data Center

Show HN: SuperBuilder – open-source AI Agent Platform

Show HN: Sentinel Data – Hardware- Bound CLI tool to prevent data exfiltration

Grief Text Editor

Show HN: Kagora – Multi-AI terminal platform with built-in chat and scheduling

Will Claude Code ruin our team?

Test Drive Linux Distros in the Browser

Humanity heating planet faster than ever before, study finds

How to generate subtitles automatically in every lenguage

Show HN: Argus – VSCode debugger for Claude Code sessions

Tamper-evident audit trail for AI agent tool calls (MCP proxy)

Show HN: Sandbox0 – AI Agent Sandbox with Persistent Volumes and Fast Restore

AI compromised sandbox to mine crypto without prompting on its own initiative

The Millisecond That Could Change Cancer Treatment

Compile to Architecture

Show HN: Interactive Browser Constructor for Collatz, Riemann, and Twin Primes

Claude Code Front End Design Toolkit

Everyday Drone Pilots Are Making a Google Street View from Above

Books and Blogs (2017)

Find roles, meetups, and bounties that showcase what you can do–now, not later