The shared-state manifest pattern is the right call — we hit the same design question building VitalStack (supplement interaction MCP server, vital-stack.com). Ended up using Supabase instead of a file because our skills need live lookup, but the principle is identical: one source of truth that downstream skills read and annotate.
Your confidence tiers (validated/researched/assumed) resonates too — we distinguish PubMed RCT data from case reports from mechanistic inference. Ended up being the most important UX decision: users need to know why a result is flagged, not just that it is.
The anti-slop checks are clever. Does the swap test run at generation time or as a post-processing step? Curious whether you're prompting for it explicitly or checking output against a classifier.
VitalStack•36m ago
Your confidence tiers (validated/researched/assumed) resonates too — we distinguish PubMed RCT data from case reports from mechanistic inference. Ended up being the most important UX decision: users need to know why a result is flagged, not just that it is.
The anti-slop checks are clever. Does the swap test run at generation time or as a post-processing step? Curious whether you're prompting for it explicitly or checking output against a classifier.