The direction here is to take a task-specific reasoning surface and compress it into something that can live in flash and run on a very cheap MCU. The goal is not to replace cloud LLMs, but to serve a different endpoint entirely: an auditable, offline expert for constrained environments.
For me the interesting question is not "can I squeeze a tiny chatbot onto a board?" but "can useful benchmark-level behavior be crystallized into a deployable embedded runtime?" That is what this project sets out to explore.