I built Catelingo, a small constraint-based checker that flags semantically impossible LLM outputs without relying on likelihood, retrieval, or chain-of-thought.
Many LLM failures (temporal inconsistencies, numeric impossibilities, semantic type clashes) are fluent and high-likelihood, so scoring by plausibility does not catch them. Catelingo reframes “semantic validity” as constraint satisfiability rather than plausibility.
It is intentionally minimal and deterministic:
- small sense-level lexicon
- explicit constraint propagation (dependency-local)
- verdict: SAT / UNSAT / UNKNOWN
- optional degeneration rules for metaphor / domain adaptation
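To make the list above concrete, here is a rough toy sketch of the general idea in Python. It is not Catelingo's actual code or API: the sense IDs, the two lookup tables, the `Edge` type, and the `check` function are all made up for illustration, and degeneration rules are left out. It only shows how a tiny sense-level lexicon plus per-edge (dependency-local) constraints can yield a deterministic three-valued verdict.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    SAT = "SAT"
    UNSAT = "UNSAT"
    UNKNOWN = "UNKNOWN"


# Hypothetical sense-level lexicon (illustrative entries only).
NOUN_TYPES = {"water.n.1": "liquid", "stone.n.1": "solid"}
# Hypothetical selectional constraint: the type a verb sense requires of its object.
VERB_OBJECT_TYPE = {"drink.v.1": "liquid"}


@dataclass(frozen=True)
class Edge:
    """One dependency edge: head sense, relation label, dependent sense."""
    head: str
    relation: str
    dependent: str


def check_edge(edge: Edge) -> Verdict:
    """Check a single edge; only the two senses on the edge are consulted."""
    if edge.relation != "obj":
        return Verdict.UNKNOWN  # no constraint modelled for this relation
    required = VERB_OBJECT_TYPE.get(edge.head)
    actual = NOUN_TYPES.get(edge.dependent)
    if required is None or actual is None:
        return Verdict.UNKNOWN  # a sense is missing from the lexicon
    return Verdict.SAT if required == actual else Verdict.UNSAT


def check(edges: list[Edge]) -> Verdict:
    """Combine edge verdicts: any violation wins, then any gap, else SAT."""
    verdicts = [check_edge(e) for e in edges]
    if Verdict.UNSAT in verdicts:
        return Verdict.UNSAT
    if not verdicts or Verdict.UNKNOWN in verdicts:
        return Verdict.UNKNOWN
    return Verdict.SAT


if __name__ == "__main__":
    print(check([Edge("drink.v.1", "obj", "water.n.1")]).value)
    print(check([Edge("drink.v.1", "obj", "stone.n.1")]).value)
    print(check([Edge("drink.v.1", "obj", "idea.n.1")]).value)
```

In this toy version, UNKNOWN means "the lexicon has nothing to say", which keeps the checker from guessing when coverage is missing. No probabilities are involved, so the same input always gives the same verdict.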
Paper (Zenodo): https://doi.org/10.5281/zenodo.18148498