I built a web app that splits bills just by taking a photo and talking to it — even in different languages and with complex rules like “Anna pays for drinks, Bob only pays 20%, the rest 60/40”.
The interesting part for me was getting the AI to understand split instructions phrased the way you'd explain them to another person. Friends can join the same bill from their phones and add their parts, and everything syncs in real time.
### Why I built it

I was frustrated with manual bill-splitting apps. I wanted to see how far modern OCR, multilingual speech recognition, and LLMs could go if combined properly.
### Tech details

- *Frontend*: React + Vite + TypeScript, Tailwind + shadcn/ui for UI, TanStack Query for data, Socket.IO for real-time sync.
- *Backend*: Node.js + Express, PostgreSQL with Prisma ORM, Google OAuth and JWT for auth, Socket.IO for events.
- *AI pipeline*:
  - OCR to extract text from the photo
  - Whisper for multilingual speech-to-text
  - An LLM parses the free-form speech/text into structured math rules for splitting (see the sketch after this list)
- *Real-time*: when you share a bill, everyone joins a Socket.IO room and sees updates live (second sketch below).
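To make "structured math rules" concrete, here's a stripped-down sketch of the kind of output the LLM step produces, validated with Zod. The rule kinds and field names are simplified for illustration; the real schema covers more cases.

```typescript
import { z } from "zod";

// One rule per spoken instruction, e.g. "Anna pays for drinks" ->
// item_assignment, "Bob only pays 20%" -> fixed_percentage,
// "the rest 60/40" -> ratio_split.
const SplitRule = z.discriminatedUnion("kind", [
  z.object({
    kind: z.literal("item_assignment"),
    person: z.string(),                  // "Anna"
    itemCategory: z.string(),            // "drinks"
  }),
  z.object({
    kind: z.literal("fixed_percentage"),
    person: z.string(),                  // "Bob"
    percent: z.number().min(0).max(100), // 20
  }),
  z.object({
    kind: z.literal("ratio_split"),
    people: z.array(z.string()),         // everyone not covered above
    ratios: z.array(z.number()),         // [60, 40]
  }),
]);

const ParseResult = z.object({ rules: z.array(SplitRule) });

// Validate whatever JSON the model produced; a thrown error here is a
// good trigger to retry the call with the validation message fed back.
export function parseLlmRules(raw: string) {
  return ParseResult.parse(JSON.parse(raw));
}
```

Getting the model to emit a fixed, validatable shape like this is what lets the arithmetic stay deterministic: the LLM only translates language into rules, and plain code applies them to the bill.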
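And a minimal sketch of the room-based sync, omitting auth and persistence (the event names here are illustrative, not the app's actual protocol):

```typescript
import { Server } from "socket.io";

const io = new Server(3001, { cors: { origin: "*" } });

io.on("connection", (socket) => {
  // Each shared bill is a Socket.IO room; joining subscribes the
  // client to that bill's live updates.
  socket.on("bill:join", (billId: string) => {
    socket.join(billId);
  });

  // When someone claims an item or tweaks a split, rebroadcast the
  // change to everyone else viewing the same bill.
  socket.on("bill:update", (billId: string, change: unknown) => {
    socket.to(billId).emit("bill:update", change);
  });
});
```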
### Things I’d love feedback on

- How well it handles tricky or “unfair” split instructions
- Testing non-English voice commands
- UX around collaborative splitting
You can try it here: killbill.top
(I might open-source some of the parsing pipeline once it’s stable.)