Instead of treating each image as a one-off prompt, scanOS normalizes recurring visual inputs into the same schema over time, even when the source formats differ. The goal is to accumulate state from visual data, not just extract text.
It’s not an OCR tool and doesn’t rely on embeddings, RAG, or fine-tuning. The output is either human-readable text or explicit machine-readable JSON that can be stored, inspected, and reused as part of a file-based memory system.
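As a rough illustration of "same schema over time, stored as files", here is a minimal Python sketch. It is not the actual scanOS code or API: the schema keys, the alias table, the function names, and the JSON Lines state file are all assumptions made up for this example, and the step that turns an image into a raw dict is assumed to have happened already.

```python
import json
from pathlib import Path

# Hypothetical target schema: every record ends up with exactly these keys.
SCHEMA_KEYS = ("source", "date", "label", "value", "unit")

# Hypothetical aliases for fields that are named differently per source format.
ALIASES = {
    "date": ("date", "day", "timestamp"),
    "label": ("label", "name", "metric"),
    "value": ("value", "amount", "reading"),
    "unit": ("unit", "units"),
}


def normalize(raw: dict, source: str) -> dict:
    """Map one raw extraction onto the shared schema; missing fields become None."""
    record = {"source": source}
    for key, names in ALIASES.items():
        record[key] = next((raw[n] for n in names if n in raw), None)
    return record


def append_record(path: Path, record: dict) -> None:
    """Append one normalized record as a single JSON line (the accumulated state)."""
    with path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(record, sort_keys=True) + "\n")


def load_state(path: Path) -> list[dict]:
    """Read back every record stored so far; an absent file means empty state."""
    if not path.exists():
        return []
    with path.open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

Two inputs with different field names (say, a photo of a scale and an app screenshot) would both pass through `normalize` and land in the same file with identical keys, so later steps can read the state back with `load_state` without caring where each record came from.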
scanOS is part of a larger file-based architecture I’m using daily, but this module stands on its own.
Code + docs: https://github.com/johannes42x/scanOS