The constraint: Two devices need to share an expense ledger, but I don't want to run a database. No Postgres, no Firebase, no Supabase. The user's financial data should never leave their devices.
Storage: IndexedDB only
Every transaction lives in IndexedDB. Schema has 7 object stores — transactions, version history, device identity, pairings, sync state, recurring templates, and alerts. Performance is fine at personal finance scale (thousands of records). I added a simple query cache layer on top to avoid redundant reads — nothing fancy, just memoization with timestamp invalidation.
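For anyone wondering what "memoization with timestamp invalidation" could look like, here is a minimal sketch, not the actual code. `QueryCache`, `markWritten`, and the loader callback are hypothetical names, and I'm assuming invalidation is tracked per object store. The loader stands in for whatever IndexedDB read you would normally run.

```typescript
// Hypothetical sketch: memoize reads, invalidate by comparing timestamps.
class QueryCache {
  // cache key -> in-flight or resolved read, plus when it was started
  private entries = new Map<string, { promise: Promise<unknown>; cachedAt: number }>();
  // object store name -> time of the most recent write to that store
  private lastWrite = new Map<string, number>();
  private now: () => number;

  constructor(now: () => number = Date.now) {
    this.now = now;
  }

  // Call after every put/delete on an object store.
  markWritten(store: string): void {
    this.lastWrite.set(store, this.now());
  }

  get<T>(store: string, key: string, load: () => Promise<T>): Promise<T> {
    const cacheKey = `${store}:${key}`;
    const hit = this.entries.get(cacheKey);
    const written = this.lastWrite.get(store) ?? -Infinity;
    // A hit is valid only if the read started strictly after the last write.
    if (hit !== undefined && hit.cachedAt > written) {
      return hit.promise as Promise<T>;
    }
    const promise = load();
    this.entries.set(cacheKey, { promise, cachedAt: this.now() });
    // Do not keep failed reads around.
    promise.catch(() => {
      if (this.entries.get(cacheKey)?.promise === promise) {
        this.entries.delete(cacheKey);
      }
    });
    return promise;
  }
}
```

Storing the promise rather than the resolved value means two identical reads issued back to back share one IndexedDB request. The timestamp is taken when the read starts, so a write that lands while the read is in flight still invalidates it.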
Sync: WebRTC DataChannel, no relay
Two devices pair using a 4-digit code. A signaling server relays SDP offers/answers and ICE candidates between them, and that's all it does. Once the WebRTC DataChannel opens, transactions sync as small JSON payloads directly between devices. The signaling server never sees transaction data.
A TURN relay is configured as a fallback for NATs that block direct connections. Even then it only forwards DTLS-encrypted packets, so it can't read transaction data either. In practice most connections establish peer-to-peer on the first try.
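To make the "that's all it does" part concrete, here is a rough in-memory sketch of the relay logic. Everything in it is an assumption for illustration: the `SignalingMessage` shapes, the `PairingRelay` class, and the idea that the pairing code doubles as a room key. A real server would sit behind WebSockets or similar, which this leaves out.

```typescript
// Hypothetical message shapes: only connection metadata, never ledger data.
type SignalingMessage =
  | { kind: "offer"; sdp: string }
  | { kind: "answer"; sdp: string }
  | { kind: "ice"; candidate: string };

type Send = (msg: SignalingMessage) => void;

class PairingRelay {
  // pairing code -> delivery callbacks of the (at most two) peers in that room
  private rooms = new Map<string, Send[]>();

  // Registers a peer under a pairing code and returns the function that peer
  // uses to send a message to the other peer in the same room.
  join(code: string, deliver: Send): Send {
    const peers = this.rooms.get(code) ?? [];
    if (peers.length >= 2) {
      throw new Error("pairing code already in use");
    }
    peers.push(deliver);
    this.rooms.set(code, peers);
    return (msg: SignalingMessage) => {
      for (const peer of peers) {
        if (peer !== deliver) peer(msg);
      }
    };
  }
}
```

The message type has no variant that could carry a transaction, so the relay has nothing to forward except offers, answers, and candidates. With only 10,000 possible 4-digit codes, a real server would presumably also want codes to expire quickly, which this sketch does not do.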
Conflict resolution:
Both devices can edit the same transaction offline. Every edit creates a version entry. On sync, a version vector detects edits that happened concurrently, and the one with the latest timestamp wins. Both versions are preserved in history, so nothing is lost. Simple but sufficient for a two-device household.
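Here is a small sketch of that resolution rule, assuming each version carries a per-device edit counter (the version vector) plus a wall-clock timestamp. The type and function names are hypothetical, and the device-id tie-break for equal timestamps is my own addition so both devices pick the same winner.

```typescript
// All names are hypothetical; the post does not show its data model.
type VersionVector = Record<string, number>; // deviceId -> edit counter

interface TxVersion {
  txId: string;
  amount: number;
  deviceId: string; // device that made this edit
  editedAt: number; // wall-clock ms at edit time
  vector: VersionVector;
}

type Order = "before" | "after" | "equal" | "concurrent";

function compareVectors(a: VersionVector, b: VersionVector): Order {
  let aAhead = false;
  let bAhead = false;
  for (const id of Object.keys({ ...a, ...b })) {
    const av = a[id] ?? 0;
    const bv = b[id] ?? 0;
    if (av > bv) aAhead = true;
    if (bv > av) bAhead = true;
  }
  if (aAhead && bAhead) return "concurrent";
  if (aAhead) return "after";
  if (bAhead) return "before";
  return "equal";
}

// Element-wise maximum, so the result dominates both inputs.
function mergeVectors(a: VersionVector, b: VersionVector): VersionVector {
  const merged: VersionVector = {};
  for (const id of Object.keys({ ...a, ...b })) {
    merged[id] = Math.max(a[id] ?? 0, b[id] ?? 0);
  }
  return merged;
}

function resolve(
  local: TxVersion,
  remote: TxVersion
): { winner: TxVersion; history: TxVersion[] } {
  const order = compareVectors(local.vector, remote.vector);
  if (order === "equal") return { winner: local, history: [local] };
  const history = [local, remote]; // both versions are kept
  if (order === "after") return { winner: local, history };
  if (order === "before") return { winner: remote, history };
  // Concurrent edits: latest timestamp wins, device id breaks exact ties.
  const localIsLater =
    local.editedAt !== remote.editedAt
      ? local.editedAt > remote.editedAt
      : local.deviceId > remote.deviceId;
  const picked = localIsLater ? local : remote;
  const vector = mergeVectors(local.vector, remote.vector);
  return { winner: { ...picked, vector }, history };
}
```

The timestamp only matters when the vectors say the edits were concurrent. If one version causally follows the other, it wins even when its device clock is behind. The concurrent case still depends on the two clocks being roughly in sync.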
Voice pipeline (the fun part):
Users can speak an expense instead of typing it. Audio goes to a speech-to-text API, then the transcript goes to Gemini Flash with a structured output schema (Zod-validated). It extracts amount, category, payment method, and date from natural language. Accuracy is surprisingly good — the model handles ambiguous input like "lunch with coworkers, split three ways, I paid 1500" correctly.
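As a rough picture of the validation step, here is a dependency-free stand-in for the Zod schema. It is not the actual schema: the field names and the rules (positive amount, `YYYY-MM-DD` date) are assumptions based on the four fields listed above, and `parseExpense` is a hypothetical helper that would run on the model's JSON output.

```typescript
// Hypothetical field names; the post only lists what gets extracted.
interface ParsedExpense {
  amount: number;
  category: string;
  paymentMethod: string;
  date: string; // calendar date as YYYY-MM-DD
}

function requireString(obj: Record<string, unknown>, key: string): string {
  const value = obj[key];
  if (typeof value !== "string" || value.trim() === "") {
    throw new Error(`"${key}" must be a non-empty string`);
  }
  return value;
}

function parseExpense(modelOutput: string): ParsedExpense {
  const data: unknown = JSON.parse(modelOutput); // throws on malformed JSON
  if (typeof data !== "object" || data === null || Array.isArray(data)) {
    throw new Error("expected a JSON object");
  }
  const obj = data as Record<string, unknown>;

  const amount = obj["amount"];
  if (typeof amount !== "number" || !isFinite(amount) || amount <= 0) {
    throw new Error('"amount" must be a positive number');
  }

  const date = requireString(obj, "date");
  const parsed = /^\d{4}-\d{2}-\d{2}$/.test(date) ? Date.parse(date) : NaN;
  // Round-trip through toISOString to reject dates like 2024-02-31.
  if (isNaN(parsed) || new Date(parsed).toISOString().slice(0, 10) !== date) {
    throw new Error('"date" must be a real YYYY-MM-DD date');
  }

  return {
    amount,
    category: requireString(obj, "category"),
    paymentMethod: requireString(obj, "paymentMethod"),
    date,
  };
}
```

Anything that fails here never reaches IndexedDB, so a malformed model response turns into a retry or a manual-entry prompt instead of a bad ledger row.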
Curious if anyone has built similar zero-server-DB architectures. How did you handle sync beyond two devices? I'd also love to hear from anyone who's run WebRTC DataChannel in production: did you end up needing TURN more than expected?