The core idea is that the structure of legal reasoning (which stages to run, how to select and interpret norms, how to balance competing interests, when to revise earlier conclusions) is expressed in a strongly typed pseudocode / meta-language. Some parts of this meta-algorithm are implemented directly in code (procedural checks, basic qualification, graph updates), some are mathematical (utilities, equilibria, fuzzy uncertainty), and some are written as high-level instructions in natural language, which the LLM interprets under tight constraints. In that setting, the LLM is not a predictor of outcomes but an interpreter of a given procedural script.
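To make the "typed meta-language" idea concrete, here is a minimal sketch in Python (all names are illustrative assumptions, not our actual implementation): each stage of the meta-algorithm is a typed value, and the LLM shows up only as an interpreter of a scripted instruction under an output schema.

```python
from dataclasses import dataclass
from typing import Callable, Union

# Illustrative sketch only: these names are assumptions, not our real meta-language.
# The point is the three kinds of stage described above: plain code, a mathematical
# model, and a natural-language instruction interpreted under constraints.

@dataclass(frozen=True)
class CodeStage:
    name: str
    run: Callable[[dict], dict]        # deterministic procedural check / graph update

@dataclass(frozen=True)
class MathStage:
    name: str
    evaluate: Callable[[dict], float]  # e.g. a utility or fuzzy-membership function

@dataclass(frozen=True)
class LLMStage:
    name: str
    instruction: str                   # high-level natural-language step
    output_schema: dict                # constraints the interpreted answer must satisfy

Stage = Union[CodeStage, MathStage, LLMStage]

def run_stage(stage: Stage, state: dict, llm: Callable[[str, dict], dict]) -> dict:
    """Dispatch one stage of the meta-algorithm against the shared case state."""
    if isinstance(stage, CodeStage):
        return stage.run(state)
    if isinstance(stage, MathStage):
        return {**state, stage.name: stage.evaluate(state)}
    # The LLM is called as an interpreter of a scripted step, not as a predictor:
    # the prompt is the fixed instruction, and the result is checked against the schema.
    return {**state, stage.name: llm(stage.instruction, stage.output_schema)}
```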
The system isn’t trained on case law and doesn’t try to “predict” courts. It reconstructs the reasoning pipeline itself: from extracting the parties’ factual narratives and evidence structure, through norm selection and weighting, to generating a decision that can be traced back step by step through the internal graph of operations. The same meta-algorithm can work with different jurisdictions by swapping norm packages; so far we’ve tested it on a set of international and domestic disputes.
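Roughly, "swap the norm package, keep the pipeline" looks like this (again a sketch with made-up names such as NormPackage and ReasoningGraph, not the engine's API): the jurisdiction-specific content is data, while the stage sequence and the traceable graph of operations stay fixed.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Norm:
    ref: str       # citation of the provision
    weight: float  # weight assigned during norm selection

@dataclass
class NormPackage:
    jurisdiction: str
    norms: List[Norm]

@dataclass
class ReasoningGraph:
    nodes: List[dict] = field(default_factory=list)

    def record(self, stage: str, inputs: dict, output: dict) -> None:
        # every operation appends a node, so the decision can be traced step by step
        self.nodes.append({"stage": stage, "inputs": inputs, "output": output})

def decide(claim_text: str, response_text: str, package: NormPackage) -> ReasoningGraph:
    graph = ReasoningGraph()
    facts = {"claimant": claim_text, "respondent": response_text}    # fact extraction (stubbed)
    graph.record("extract_facts", {"claim": claim_text, "response": response_text}, facts)
    selected = [n for n in package.norms if n.weight > 0.0]          # norm selection (stubbed)
    graph.record("select_norms", {"jurisdiction": package.jurisdiction},
                 {"selected": [n.ref for n in selected]})
    decision = {"outcome": "...", "basis": [n.ref for n in selected]}  # drafting (stubbed)
    graph.record("draft_decision", {"facts": facts}, decision)
    return graph
```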
There is an early public demo here: https://portal.judgeai.space/
If you upload a short statement of claim and a response, the engine runs the full pipeline and outputs a structured decision document.
We’d be grateful for feedback from people working on hybrid symbolic/semantic systems, “LLM as interpreter” architectures, or formal models of complex decision-making. Obvious open questions for us are: how best to test failure modes of this kind of meta-control, what formal tools to use for checking consistency of the reasoning graph, and how far one can push this approach before hitting hard theoretical limits.
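On the consistency question, the simplest checks we can imagine are purely structural; here is a sketch under our own framing (assuming the trace exposes explicit "depends on" edges; this is not a claim about any particular formal tool): every non-input node needs at least one supporting premise, and the trace must be acyclic.

```python
from typing import Dict, List, Set

def check_trace(deps: Dict[str, List[str]], inputs: Set[str]) -> List[str]:
    """Flag unsupported conclusions and cycles in a reasoning trace."""
    problems = []
    # every node that is not a raw input must rest on at least one premise
    for node, supports in deps.items():
        if node not in inputs and not supports:
            problems.append(f"{node}: no supporting premises")
    # cycle detection via depth-first search over the dependency edges
    visiting: Set[str] = set()
    done: Set[str] = set()

    def has_cycle(node: str) -> bool:
        if node in done:
            return False
        if node in visiting:
            return True
        visiting.add(node)
        cyclic = any(has_cycle(d) for d in deps.get(node, []))
        visiting.discard(node)
        done.add(node)
        return cyclic

    if any(has_cycle(n) for n in deps):
        problems.append("cycle detected in reasoning trace")
    return problems

# Example: an interpretation node with no support gets flagged.
trace = {"conclusion": ["fact_1", "interpretation_1"], "interpretation_1": [], "fact_1": []}
print(check_trace(trace, inputs={"fact_1"}))  # -> ['interpretation_1: no supporting premises']
```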