BloondAndDoom•1h ago
This is great stuff. Have you tried it with local models? Summarization is easy, but I haven't played with image-to-text models locally. Any ideas? I can run 32B models fine, and for summarization tasks they are extremely good, I'd even say more than necessary.
fidorka•1h ago
Nice, could I provide this memory to openclaw as well?
jzapletal•1h ago
What surprised us:
- Cost: $0.0002/screenshot (we budgeted 100x more), guess cloud vision APIs got cheap fast
- CPU: 5% (we expected 50%), and the laptop stays cool
- Quality: night and day vs. local models; we tried running vision locally first and it was mediocre
It works by triggering a screenshot on activity, sending it to a cloud vision model for summarization, then deleting the screenshot and storing only the text in local SQLite. You query it via MCP – "what was I working on before lunch?" and Claude actually knows.
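A minimal sketch of that flow in Python, not the actual implementation. Everything named here is an assumption for illustration: the `activity` table and its columns, the helper names, and the `summarize` callable that stands in for whichever cloud vision API is used. Screenshot capture and the MCP server are left out; the sketch only covers summarize, delete, store, and a time-range query.

```python
import os
import sqlite3
import tempfile
import time
from typing import Callable, List, Optional


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the local SQLite store. The schema is hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS activity ("
        " ts REAL NOT NULL,"        # Unix timestamp of the screenshot
        " summary TEXT NOT NULL)"   # text summary; the image itself is never stored
    )
    return conn


def process_screenshot(
    conn: sqlite3.Connection,
    image_path: str,
    summarize: Callable[[str], str],
    now: Optional[float] = None,
) -> str:
    """Summarize one screenshot, delete the image, and keep only the text.

    `summarize` is a placeholder for the cloud vision call: it takes an
    image path and returns a text summary.
    """
    ts = time.time() if now is None else now
    try:
        summary = summarize(image_path)
    finally:
        # Assumption: the image is removed even if summarization fails,
        # so no screenshot is left on disk.
        os.remove(image_path)
    conn.execute(
        "INSERT INTO activity (ts, summary) VALUES (?, ?)", (ts, summary)
    )
    conn.commit()
    return summary


def summaries_between(
    conn: sqlite3.Connection, start: float, end: float
) -> List[str]:
    """Return summaries with start <= ts < end, oldest first."""
    rows = conn.execute(
        "SELECT summary FROM activity WHERE ts >= ? AND ts < ? ORDER BY ts",
        (start, end),
    ).fetchall()
    return [row[0] for row in rows]


if __name__ == "__main__":
    conn = open_store()
    # Stand-in for a real screenshot file.
    fd, path = tempfile.mkstemp(suffix=".png")
    os.close(fd)
    # Stand-in for the cloud vision model.
    process_screenshot(conn, path, lambda p: "Editing README in a code editor")
    print(summaries_between(conn, 0.0, time.time() + 1.0))
```

A question like "what was I working on before lunch?" would then map to a `summaries_between` call over the morning's timestamps, with the returned text handed to the model as context.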
BloondAndDoom•1h ago