Hey HN! I built Skills-Kit, a TypeScript framework that lets you create, validate, and bundle self-contained "skills" – think of them as portable automation modules that AI agents (or humans) can execute.
The Problem: Most AI agent frameworks treat code execution as an afterthought. You get either sandboxed-but-limited environments or full system access with zero safety. Plus, sharing and versioning agent capabilities is a mess.
Skills-Kit's approach:
Each skill is a folder: metadata (YAML), a deterministic Node.js entrypoint, declarative security policies, and golden tests (see the sketch after this list)
Built-in linting validates structure and security declarations
Golden test runner ensures skills behave correctly
AI-powered creation: Use Claude (or mock templates) to generate skills from natural language
Bundle and distribute skills as validated packages
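
To give a flavor of the layout, here's a simplified sketch of a skill folder and its entrypoint. The file names and the run() signature are illustrative rather than an exact spec:

    my-skill/
      skill.yaml      # metadata: name, version, description, inputs/outputs
      policy.yaml     # declared capabilities: network, filesystem, exec
      index.ts        # deterministic entrypoint
      tests/golden/   # input -> expected-output fixtures

    // index.ts – hypothetical entrypoint shape. Deterministic: same input,
    // same output, which is what makes golden tests meaningful.
    export interface Input { name: string }
    export interface Output { greeting: string }

    export async function run(input: Input): Promise<Output> {
      return { greeting: `Hello, ${input.name}!` };
    }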
What makes it interesting:
Security-first: skills declare what they need (network, filesystem, exec) upfront via policy.yaml
Testable: golden tests catch regressions before deployment
Provider-agnostic: works with Anthropic's API today, designed to support other LLMs
Composable: skills can call other skills (orchestration primitives – see the sketch after this list)
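
And to make the composability point concrete, a simplified sketch of a skill that delegates to other skills. The ctx.invoke helper shown here is illustrative, not the exact interface:

    // Hypothetical orchestration shape: a skill that calls other skills by name.
    type InvokeSkill = (skill: string, input: unknown) => Promise<unknown>;

    export async function run(
      input: { url: string },
      ctx: { invoke: InvokeSkill }
    ): Promise<{ summary: string }> {
      // "fetch-page" would declare network access in its policy.yaml;
      // "summarize-text" would need no network at all.
      const page = await ctx.invoke("fetch-page", { url: input.url });
      const summary = await ctx.invoke("summarize-text", { text: page });
      return { summary: String(summary) };
    }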
Current state: early (v0.1.0), and interfaces may evolve. Looking for feedback on:
The skill format itself – too verbose? missing something critical?
Security model – how would you enforce policies at runtime?
Use cases I'm missing – what would you build with this?
I'm not running a hosted service (yet?) – this is CLI/library tooling you run locally. The goal is to make "agentic capabilities" as shareable and reliable as npm packages.
GitHub: https://github.com/gabrielekarra/skills-kit
Would love to hear what you think, especially from folks building agent systems. What's your experience with code generation and execution safety?