In January 2026, 1,200 malicious skills infiltrated the OpenClaw agent marketplace (ClawHavoc campaign). A month later, researchers catalogued 6,487 malicious agent tools that VirusTotal cannot detect. The first agent-software RCE was assigned CVE-2026-25253.
The response: a dozen heuristic scanning tools (pattern matching, LLM-as-judge, YARA rules). They all carry the same caveat: "no findings does not mean no risk."
SkillFortify takes a different approach. Instead of checking for known bad patterns, it formally verifies what a skill CAN do against what it CLAIMS to do. Five mathematical theorems guarantee soundness -- if SkillFortify says a skill is safe, it provably cannot exceed its declared capabilities.
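To make the verification idea concrete, here is a deliberately tiny sketch (my own illustration, not SkillFortify's actual formal model): capabilities are modeled as a set, and a skill is approved only if every capability it can exercise is covered by its declaration.

```python
# Illustrative sketch only -- the real tool uses a formal model with
# soundness theorems; this just shows the "CAN do vs CLAIMS to do" check
# as a subset relation over capability labels.

def is_safe(declared: set[str], inferred: set[str]) -> bool:
    """Approve the skill only if its inferred capabilities
    are a subset of its declared capabilities."""
    return inferred <= declared

declared = {"fs:read", "net:fetch"}          # what the skill claims
print(is_safe(declared, {"fs:read"}))                 # True: within claims
print(is_safe(declared, {"fs:read", "shell:exec"}))   # False: exceeds claims
```

The soundness property in this toy form: the check never approves a skill whose inferred capability set exceeds its declaration, regardless of how that skill is obfuscated, since approval depends only on the capability sets.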
What it does:

- skillfortify scan . -- discover and analyze all skills in a project
- skillfortify verify skill.md -- formally verify a skill against its capability declaration
- skillfortify lock -- generate skill-lock.json for reproducible configurations
- skillfortify trust skill.md -- compute a trust score (provenance + behavior)
- skillfortify sbom -- emit a CycloneDX 1.6 Agent Skill Bill of Materials
Supports Claude Code skills, MCP servers, and OpenClaw manifests.
Evaluated on 540 skills (270 malicious, 270 benign): F1=96.95%, zero false positives.
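A quick sanity check of those numbers (assuming F1 is computed the standard way): zero false positives means precision = 1.0, so F1 = 2R/(1+R) and recall can be recovered as R = F1/(2 - F1).

```python
# Back out recall from the reported F1, given zero false positives
# (precision = 1.0). Assumes the standard F1 definition.
f1 = 0.9695
recall = f1 / (2 - f1)
print(round(recall, 4))  # 0.9408 -- roughly 254 of the 270 malicious skills caught
```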
Paper: [ZENODO_DOI_URL]
Install: pip install skillfortify
Code: https://github.com/varun369/skillfortify
Built as part of the AgentAssert research suite. Happy to answer questions about the formal model, threat landscape, or benchmark methodology.