What's new in v0.9.1 - Built-in Web UI:
Links:
GitHub: https://github.com/scouzi1966/maclocal-api
Release: https://github.com/scouzi1966/maclocal-api/releases/tag/v0.9...
You can now run afm -w to start both the API server and a chat web interface in one command. It integrates llama.cpp's webui and auto-opens your browser. No need to set up Open-webui separately if you just want a quick chat interface.
afm -w
That's it. Browser opens to http://localhost:9999 with a chat UI talking to Apple's on-device 3B model.
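If you want to confirm the API server is reachable alongside the web UI, a quick check from another terminal works. This assumes the API listens on the same port 9999 shown above and that the local server needs no API key:
curl http://localhost:9999/v1/models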
Other changes:
/props endpoint for webui compatibility
model field now optional in chat completion requests (see the examples after this list)
llama.cpp pinned as a submodule for reproducible builds
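A rough sketch of what the first two changes look like over HTTP, assuming the server is running on port 9999 as above and that no API key is required locally:
curl http://localhost:9999/props
curl http://localhost:9999/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "Hello"}]}'
The second request omits the model field entirely; the server now accepts that and answers with the on-device model.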
What afm does:
Runs Apple's 3B-parameter on-device LLM
OpenAI-compatible API (/v1/chat/completions, /v1/models)
Single-prompt mode: afm -s "your question"
Pipe mode: cat file.txt | afm
LoRA adapter support for fine-tuned models
Vision capabilities (text extraction, table OCR)
Works as a backend for Open-webui too
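For the Open-webui backend setup, a minimal sketch, assuming Open WebUI's documented Docker image and its OPENAI_API_BASE_URL / OPENAI_API_KEY environment variables (double-check the exact flags against the Open WebUI docs), with afm already running on port 9999:
docker run -d -p 3000:8080 \
  -e OPENAI_API_BASE_URL=http://host.docker.internal:9999/v1 \
  -e OPENAI_API_KEY=none \
  --name open-webui \
  ghcr.io/open-webui/open-webui:main
Then open http://localhost:3000 and the afm model should show up in the model picker.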
Install:
brew tap scouzi1966/afm
brew install afm
Requirements:
macOS 26 (Tahoe)
Apple Silicon (M1/M2/M3/M4)
Apple Intelligence enabled
Questions welcome.