I’m a systems researcher (PhD, 30+ publications) with a health background who spent a career as a data analyst. Last year I dove into AI hard, focusing on multi-model meshes and model-to-model communication. This paper describes Kernel Language (KL), a compiled programming language for LLMs to communicate with each other, not with humans.
The problem: almost all multi-agent frameworks use natural language for agent communication. But natural language is lossy, and so much drift occurs when multiple models work on the same task that you are usually better off using a single agent per task, which creates a quality ceiling.
KL gets around this by replacing the primary communication medium with a compiled language built on a kernel periodic table (80 families comprising 577 reasoning primitives, covering optimization, inference, learning, creativity, mathematical proofs, etc.). A compiler rejects any model output that doesn’t meet the language specification, but it ignores comments, and this is key: models can and do read the comment layer, so you get the logical rigor of a compiled language and the nuance of natural language at the same time.
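To make the gate concrete, here is a minimal sketch of the idea, not KL's actual grammar or compiler (neither is shown in this post): the validator strips comments, checks the remaining lines against a toy spec, and hard-rejects anything that fails, while the comment layer passes through untouched for the receiving model to read. The `#` comment syntax and the `PRIMITIVE(args)` line format are assumptions for illustration only.

```python
import re

def strip_comments(src: str) -> str:
    # Assumption: KL uses '#' line comments (not specified in the post).
    return "\n".join(re.sub(r"#.*$", "", line) for line in src.splitlines())

def is_valid_kl(src: str) -> bool:
    # Stand-in for the real KL compiler check: here we only require that
    # every non-comment line is empty or matches a toy PRIMITIVE(args) form.
    code = strip_comments(src)
    pattern = re.compile(r"^\s*(?:[A-Z_]+\([^()]*\)\s*)?$")
    return all(pattern.match(line) for line in code.splitlines())

def gate(model_output: str) -> str:
    # The compiler acts as a hard gate: outputs that fail the spec are
    # rejected outright; comments are ignored by the check but preserved
    # in the message the next model receives.
    if not is_valid_kl(model_output):
        raise ValueError("rejected: output does not meet KL spec")
    return model_output

msg = "OPTIMIZE(route)  # constraint 3 feels too strict, see step 2\nPROVE(bound)"
accepted = gate(msg)  # passes: valid primitives, nuance rides in the comment
```

The point of the sketch is the asymmetry: the checker enforces structure on the code layer while the comment layer stays free-form, which is where the "rigor plus nuance" claim comes from.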
We tested KL vs. natural language on frontier models, mid-sized open-source models, and small open-source models individually, as well as on a mesh of the frontier models, across two unrelated complex problems. The result surprised us: KL is neutral to slightly negative for individual frontier models working solo, slightly negative for mid-sized models, and crushing for small models. Models trade creativity for logical rigor (or, in the case of small models, simply collapse). But for mesh coordination of frontier models, it was transformative. The KL-enabled mesh produced the highest-quality output across all conditions, including emergent capabilities (adversarial self-critique and iterative proof strengthening) that no solo model produced in either modality, and that the natural-language mesh did not produce either.
The test battery is small (six conditions, twelve total responses), which I am up front about in the paper. But the effect replicated across two unrelated domains, which is encouraging. The implication is that the communication medium matters as much as the models themselves, and that natural language is both a bottleneck and a necessity.
tmbird•1h ago