The insight that kicked this off: a small model with good tool calling is already useful. I tested Qwen 3.5 9B on my M1 Pro 16GB and gave it a simple tool (rename files matching a pattern). It worked, offline, reliably. The model doesn't need to be smart about everything; it just needs to know which tool to call next.
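The core loop is simple: the model emits a structured tool call, the app parses it and dispatches to a native handler. Here's a minimal sketch of that dispatch pattern; the `ToolCall` JSON shape, tool names, and arguments are illustrative assumptions, not Familiar's actual protocol:

```swift
import Foundation

// Hypothetical shape of a tool call the model emits as JSON, e.g.
// {"tool": "rename_files", "args": {"pattern": "IMG_*", "prefix": "trip_"}}
struct ToolCall: Decodable {
    let tool: String
    let args: [String: String]
}

// Registry mapping tool names to handlers. The model only has to pick
// the right name and arguments; the app does the actual work natively.
var tools: [String: ([String: String]) -> String] = [:]

tools["rename_files"] = { args in
    guard let pattern = args["pattern"], let prefix = args["prefix"] else {
        return "error: missing arguments"
    }
    // A real handler would enumerate the directory; stubbed here.
    return "renamed files matching \(pattern) with prefix \(prefix)"
}

// Parse the model's raw output and route it to the registered handler.
func dispatch(_ raw: String) -> String {
    guard let data = raw.data(using: .utf8),
          let call = try? JSONDecoder().decode(ToolCall.self, from: data),
          let handler = tools[call.tool] else {
        return "error: unrecognized tool call"
    }
    return handler(call.args)
}

let reply = #"{"tool": "rename_files", "args": {"pattern": "IMG_*", "prefix": "trip_"}}"#
print(dispatch(reply))
```

Because the hard part is pushed into native handlers, a small model that reliably produces well-formed JSON is enough to be useful.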
What Familiar does now:
- Hardware detection on first launch → recommends the best model your machine can run without becoming a heater
- Ships with a small default model so you're running in minutes
- File tools out of the box (create, rename, move, delete)
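The hardware detection step can be as simple as reading installed RAM and mapping it to a model tier. A minimal sketch, where the thresholds and tier labels are my assumptions rather than Familiar's actual recommendation table:

```swift
import Foundation

// Map installed RAM to a recommended model tier. Thresholds and labels
// are illustrative; a real table would also consider quantization and
// leave headroom for the OS and other apps.
func recommendModel(ramBytes: UInt64) -> String {
    let gb = Double(ramBytes) / 1_073_741_824
    switch gb {
    case ..<12: return "3B-class model, 4-bit quantized"
    case ..<24: return "7-9B-class model, 4-bit quantized"
    default:    return "14B+ model"
    }
}

// ProcessInfo.physicalMemory is the standard Foundation way to read RAM.
let ram = ProcessInfo.processInfo.physicalMemory
print("Detected \(ram / 1_073_741_824) GB RAM → \(recommendModel(ramBytes: ram))")
```

On a 16GB machine like the M1 Pro mentioned above, this lands in the middle tier.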
What I'm building toward: Night shift mode — when you sleep, the app switches to a larger model using your full resources. The reasoning: you're not at the machine, so it can use 100% of what's available. You wake up to work done by a smarter local agent, no cloud involved.
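The switch itself reduces to a scheduling decision: during presumed-idle hours, load the larger model. A sketch under assumed hour boundaries and placeholder model identifiers (a real version would also check actual user activity, not just the clock):

```swift
import Foundation

// Pick a model based on the hour of day. The 23:00-07:00 window and the
// model names are assumptions for illustration.
func modelForHour(_ hour: Int,
                  dayModel: String = "small-model",
                  nightModel: String = "large-model") -> String {
    // Night shift: the machine is presumably idle, so use full resources.
    (hour >= 23 || hour < 7) ? nightModel : dayModel
}

let hour = Calendar.current.component(.hour, from: Date())
print("Active model right now: \(modelForHour(hour))")
```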
Current limitations I won't gloss over: sub-3B models struggle with tool call reliability. Complex multi-step reasoning needs bigger models. The night shift feature exists specifically to unlock larger models during low-usage hours.
Stack: Swift/SwiftUI, MLX for inference on Apple Silicon. Tested on M1 Pro 16GB — intentionally not exotic hardware.
iOS companion is in progress (lighter capabilities, same local-first approach).
GitHub going public shortly. Happy to answer questions about the MLX inference setup or tool calling architecture.