I keep yolo installing AI artifacts, so I built artguard and just open-sourced it.The core problem: traditional scanners are built for code packages. AI artifacts are hybrid — part code, part natural language instructions — and the real attack surface lives in the instructions.
Privacy posture — catches the gap between what an artifact claims to do with your data and what it actually does (undisclosed writes to disk, covert telemetry, retention mismatches)
Semantic analysis — LLM-powered detection of prompt injection, goal hijacking, and behavioral manipulation buried in instruction content
Output is a Trust Profile JSON- a structured AI BOM meant to feed policy engines and audit trails, not just spit out a binary safe/unsafe.
The repo is a prompt.md that Claude Code uses to scaffold the entire project autonomously. The prompt is the source of truth. I'm happy to share the actual code too if it's of interest.
spiffyamber•1h ago
https://github.com/spiffy-oss/artguard
Three detection layers:
Privacy posture — catches the gap between what an artifact claims to do with your data and what it actually does (undisclosed writes to disk, covert telemetry, retention mismatches)
Semantic analysis — LLM-powered detection of prompt injection, goal hijacking, and behavioral manipulation buried in instruction content
Static patterns — YARA, credential harvesting, exfiltration endpoint signatures, the usual
Output is a Trust Profile JSON- a structured AI BOM meant to feed policy engines and audit trails, not just spit out a binary safe/unsafe.
The repo is a prompt.md that Claude Code uses to scaffold the entire project autonomously. The prompt is the source of truth. I'm happy to share the actual code too if it's of interest.
Contributions welcome!