"Hey HN. I've been building a completely offline AI journal. The biggest hurdle was the memory footprint of running multiple agent personas. I ended up bypassing standard wrappers and using Meta's ExecuTorch to compile the PyTorch graphs ahead-of-time for the Apple Neural Engine, plus 4-bit quantization. Happy to answer any questions about the CoreML backend or managing the 'Blackboard' state object for the agents without killing the battery."