Take a giant model. Add a prompt and hope it does the whole job. The AI equivalent of spray-n-pray.
That works great for chatbots and creative outputs like code generation, website design, and email writing. But it breaks down fast for developer work that requires highly deterministic output. Think OCR for KYC at banks, or audio recognition for patient-doctor notes.
We have been training SLMs for specific tasks for a while now, and we kept seeing the same pattern: a SOTA language model with great benchmarks failing on the parts that actually matter. Reading a dense PDF like Form 1040, pulling clean data from a heavily protected website, transcribing a long call with multiple speakers in seconds, or returning output in a format code can trust.
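For the last failure mode, "a format code can trust" usually means machine-checkable structure. Here is a minimal sketch of that idea (the schema, field names, and sample values are hypothetical, not our actual output format): instead of silently accepting whatever text a model returns, the caller rejects anything that does not parse into the expected shape.

```python
import json

# Hypothetical expected shape for an extraction task (e.g. a tax-form field).
REQUIRED_FIELDS = {"field_name": str, "value": str, "confidence": float}

def parse_extraction(raw: str) -> dict:
    """Parse model output; raise instead of silently accepting malformed data."""
    data = json.loads(raw)  # raises ValueError on non-JSON output
    for key, expected_type in REQUIRED_FIELDS.items():
        if not isinstance(data.get(key), expected_type):
            raise ValueError(f"bad or missing field: {key}")
    return data

good = parse_extraction('{"field_name": "wages", "value": "52,340", "confidence": 0.97}')
```

With a gate like this, a malformed response is a hard error the pipeline can retry or escalate, rather than bad data flowing downstream.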
So we spent a year researching how to solve determinism. Like many of us, we assumed more data was the solution. Not really: while data is part of the problem, rethinking the architecture is the key to fixing issues like context drift.
Check out our paper: https://arxiv.org/abs/2602.04101 (Accepted into IEEE CAI 2026)
Traditional AI/ML models like YOLO, EasyOCR, and PaddleOCR are great at the single task they were trained for, and they produce consistent outputs like confidence scores. But they quickly become outdated at scale, are limited by their training data, and require a team of ML engineers to maintain, driving costs higher. LLMs offer more flexibility and the ability to interact in natural language, making them generalizable, but they hallucinate on sensitive tasks that have no room for error.
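To make the contrast concrete: a calibrated confidence score lets you write a deterministic accept-or-review rule, something free-form LLM text does not give you. A toy sketch (the labels, scores, and threshold are made up for illustration):

```python
# Toy detections as (label, confidence) pairs, as a single-task model
# like an OCR engine or object detector might emit them.
detections = [("passport_number", 0.98), ("date_of_birth", 0.61), ("name", 0.93)]

THRESHOLD = 0.90  # hypothetical cutoff tuned for the task's error budget

def triage(dets, threshold=THRESHOLD):
    """Deterministically split detections into auto-accepted vs human review."""
    accepted = [d for d in dets if d[1] >= threshold]
    review = [d for d in dets if d[1] < threshold]
    return accepted, review

accepted, review = triage(detections)
# Same input always yields the same split: no sampling, no surprises.
```

The point is not the threshold itself but the property: the same input always produces the same routing decision, which is exactly what compliance-sensitive pipelines need.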
We combined DNNs/CNNs with Transformers to build a single model that brings the advantages of both: the determinism and reliability of a traditional ML model, and the generalizability of an LLM.
Yoeven and I have over 14 years of combined research and development experience, and we’ve built a team of amazing researchers, software engineers, and infrastructure engineers convinced that AI models can do far more of the work in developer tasks. Our goal is to make controllable AI available in every part of your stack.