I'm a self-taught developer and researcher who left school at 16, and I've spent some time exploring a first-principles approach to system design for various frontier problems. In this case, the problem is AI, and the work challenges the 'bigger is better' transformer paradigm.
Lingo is the first piece of that research, a high-performance linguistic database designed to run on-device.
The full technical overview and manifesto is here: https://medium.com/@robm.antunes/bcd1e9752af6
The paper has been archived on Zenodo with a DOI: https://doi.org/10.5281/zenodo.17196613
The code is open source and can be found at https://github.com/RobAntunes/lingodb. It's currently broken and feature-incomplete, but I'm working on it; I just wanted to start getting some feedback.
All benchmarks are reproducible from the repo and are also reported in the overview and paper linked above.
As an independent researcher without academic affiliation, I'd be incredibly grateful for your feedback! I'm here to answer any questions.
Cheers!