I’ve been experimenting with running small language models directly on mobile devices and built an Android app called EdgeDox.
The idea was to make document AI usable without sending files to a cloud service. Many existing tools require uploading documents to a server, which can be a privacy concern.
EdgeDox runs a lightweight language model (Qwen3.5-0.8B) locally on the device so documents stay on the phone.
Current features:
• Ask questions about PDFs
• Document summarization
• Extract key points from long documents
• Works completely offline
• No accounts or server processing
The model runs on-device via the MNN inference framework. I'm experimenting with quantized models and other optimizations to keep memory usage and latency reasonable on mid-range Android devices.
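To give a sense of why quantization matters here, a rough back-of-envelope calculation of weight memory for a model of this size (illustrative numbers only; the real footprint also includes the KV cache and runtime overhead, and the exact parameter count is an assumption):

```java
// Rough estimate of model weight memory at different quantization levels,
// assuming ~0.8B parameters. This is a sketch, not EdgeDox's actual sizing.
public class QuantFootprint {
    static double weightMemoryGiB(long params, double bitsPerWeight) {
        // params * bits / 8 = bytes; divide by 2^30 for GiB
        return params * bitsPerWeight / 8.0 / (1024.0 * 1024.0 * 1024.0);
    }

    public static void main(String[] args) {
        long params = 800_000_000L;
        System.out.printf("fp16: %.2f GiB%n", weightMemoryGiB(params, 16)); // ~1.49 GiB
        System.out.printf("int8: %.2f GiB%n", weightMemoryGiB(params, 8));  // ~0.75 GiB
        System.out.printf("int4: %.2f GiB%n", weightMemoryGiB(params, 4));  // ~0.37 GiB
    }
}
```

Dropping from fp16 to 4-bit cuts weights to roughly a quarter, which is the difference between fitting and not fitting alongside a running app on a mid-range phone.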
Some challenges so far:
• Balancing context size with mobile memory limits
• Improving response latency on CPU-only devices
• Reducing model load time
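As an illustration of the context-size tradeoff, here is a minimal document-chunking sketch (not the app's actual code): split a long document into pieces that fit a small on-device context window. The ~4-characters-per-token ratio is a rough assumption that varies by tokenizer.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: greedily pack paragraphs into chunks under a token budget,
// using a crude characters-per-token heuristic. A paragraph longer than
// the budget is kept whole here; real code would split it further.
public class Chunker {
    static final int CHARS_PER_TOKEN = 4; // rough heuristic, tokenizer-dependent

    static List<String> chunk(String text, int maxTokens) {
        int budget = maxTokens * CHARS_PER_TOKEN;
        List<String> chunks = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (String para : text.split("\n\n")) {
            // Flush the current chunk if adding this paragraph would overflow it
            if (current.length() > 0 && current.length() + para.length() > budget) {
                chunks.add(current.toString());
                current.setLength(0);
            }
            if (current.length() > 0) current.append("\n\n");
            current.append(para);
        }
        if (current.length() > 0) chunks.add(current.toString());
        return chunks;
    }

    public static void main(String[] args) {
        String doc = "First paragraph.\n\nSecond paragraph.\n\nThird paragraph.";
        for (String c : chunk(doc, 8)) { // tiny budget to force splitting
            System.out.println("---\n" + c);
        }
    }
}
```

Smaller chunks keep peak memory down but force more inference passes per question, which is exactly the latency/memory tension listed above.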
The project is still in early beta, and I’m mainly looking for feedback from people experimenting with on-device AI or mobile inference.
Play Store: https://play.google.com/store/apps/details?id=io.cyberfly.ed...