I'm a 20-year-old solo builder from India.
I got frustrated that every capable AI model assumes you have a GPU, a credit card, or reliable internet. Most of the world has none of those, and neither do I.
So I started digging into the model-compression literature to see how I could solve this problem.
What I found:
- DeepSeek distilled its 671B reasoning model (R1) into a 1.5B model that runs on a laptop
- TRM (Samsung, 2025) beat DeepSeek R1 on ARC-AGI with 7M parameters by iterating instead of scaling
- RWKV runs inference in constant memory, with no quadratic attention cost
- GRPO lets you specialize a tiny model on a narrow domain in hours on CPU
The techniques exist. What doesn't exist: a systematic effort to apply all of them together, specifically for low-resource languages and low-end hardware, and to give the results away for free.
I'm building this. Calling it KIRO.
The goal is simple: take every major open-source frontier model, compress it into domain-specific versions under 500MB, and deploy them offline on the cheapest Android hardware available.
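To make the 500MB target concrete, here is a rough back-of-the-envelope sketch of how parameter count and quantization bit width trade off against file size. The helper names (`model_size_mb`, `max_params`) and the `overhead_frac` parameter are my own assumptions for illustration, and the estimate ignores tokenizer files, embeddings kept at higher precision, and runtime memory.

```python
def model_size_mb(n_params: int, bits_per_weight: float) -> float:
    """Approximate weight-file size in MB (1 MB = 1,000,000 bytes)."""
    return n_params * bits_per_weight / 8 / 1_000_000


def max_params(budget_mb: float, bits_per_weight: float,
               overhead_frac: float = 0.0) -> int:
    """Largest parameter count that fits in budget_mb.

    overhead_frac is a hypothetical allowance for metadata and
    non-quantized tensors; 0.0 means the whole budget goes to weights.
    """
    usable_bytes = budget_mb * 1_000_000 * (1 - overhead_frac)
    return int(usable_bytes * 8 / bits_per_weight)


if __name__ == "__main__":
    # How big is a 1.5B-parameter model at common bit widths?
    for bits in (16, 8, 4, 2):
        size = model_size_mb(1_500_000_000, bits)
        print(f"1.5B params @ {bits}-bit: {size:.0f} MB")

    # How many parameters fit in 500 MB at 4 bits per weight?
    print(max_params(500, 4))
```

Under these assumptions a 1.5B model at 4 bits per weight comes out around 750MB, so staying under 500MB means either roughly 1B parameters at 4-bit or going below 4 bits per weight.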
Starting with math/physics education because that's the problem I know personally, then expanding to healthcare triage, legal aid, and agricultural advisory.
Currently running my first experiment on my i3: R1-1.5B vs Qwen-7B on Hindi math problems. Will post results when training finishes.
Two honest questions for HN:
1. Is anyone else working on this specific intersection: compression + low-resource languages + offline deployment?
2. What would make this genuinely useful vs just technically interesting to you?
Everything will be open source.