1. Capability gating - Don't declare sampling capability during init for external/untrusted servers. Keep it enabled only for internal trusted ones.
2. Human approval loops - Force manual review before any sampling request hits your LLM. The protocol says "SHOULD", not "MUST", so implementations vary.
3. Token rate limiting - Set max_tokens params client-side when calling LLM APIs. Again, relies on individual devs following policy.
4. True MCP proxy - Terminate & reestablish connections (not just network filtering). Enables granular controls like "sampling for tool A but not B."
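The first three strategies can be sketched in a few lines of client-side guard code. This is purely illustrative - the names (`TRUSTED_SERVERS`, `negotiate_capabilities`, `handle_sampling_request`) are hypothetical and not part of any MCP SDK:

```python
# Hypothetical client-side guard combining strategies 1-3.
# All identifiers here are illustrative, not real MCP SDK APIs.

TRUSTED_SERVERS = {"internal-travel", "internal-hr"}  # strategy 1: trust allowlist
MAX_TOKENS_CAP = 512                                  # strategy 3: hard per-request cap

def negotiate_capabilities(server_name: str) -> dict:
    """Only advertise the sampling capability to allowlisted internal servers."""
    caps = {"tools": {}}
    if server_name in TRUSTED_SERVERS:
        caps["sampling"] = {}
    return caps

def handle_sampling_request(server_name: str, request: dict, approve) -> dict:
    """Gate, review, and clamp a sampling request before it reaches the LLM."""
    if server_name not in TRUSTED_SERVERS:          # strategy 1: gate
        raise PermissionError(f"sampling not enabled for {server_name}")
    if not approve(request):                        # strategy 2: human approval hook
        raise PermissionError("request rejected by reviewer")
    request["max_tokens"] = min(                    # strategy 3: clamp tokens
        request.get("max_tokens", MAX_TOKENS_CAP), MAX_TOKENS_CAP
    )
    return request
```

The point of the sketch is the weakness called out below: every one of these checks lives in each client's codebase, so nothing enforces it centrally.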
The real issue: the first three strategies depend on individual developers following security policy. Only #4 gives you centralized control.
Sampling's a double-edged sword - it shifts LLM costs from server to client (good for internal workflows) but opens the door to denial-of-wallet attacks from malicious external servers.
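One way to bound the denial-of-wallet exposure is a cumulative per-server token budget on the client side, rather than just a per-request cap. A minimal sketch, with an invented `TokenBudget` class and made-up limits:

```python
# Hypothetical per-server token budget to cap denial-of-wallet exposure.
# A per-request max_tokens clamp doesn't stop a server from issuing
# thousands of small requests; a cumulative budget does.

class TokenBudget:
    """Tracks cumulative tokens spent per server and refuses overruns."""

    def __init__(self, limit_per_server: int):
        self.limit = limit_per_server
        self.spent: dict[str, int] = {}

    def charge(self, server: str, tokens: int) -> bool:
        """Record spend; return False once a server exhausts its budget."""
        used = self.spent.get(server, 0)
        if used + tokens > self.limit:
            return False
        self.spent[server] = used + tokens
        return True
```

A proxy (strategy 4) could enforce the same budget in one place instead of trusting every client to carry this code.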
Most orgs probably don't even know this feature exists yet.
Worth noting the travel booking example is compelling - instead of travel team paying tokens to format JSON responses, the requesting department's LLM budget handles it. Smart cost allocation if you can secure it properly.
schwentkerr•25m ago