Hardest part: getting accurate word-level bounding boxes out of Vision's line-level text observations. No undocumented APIs, no manual position estimation, just candidate.boundingBox(for:) called on each word's range.
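
For anyone curious what that looks like in practice, here is a minimal sketch, not the app's actual code. It assumes words are whitespace-separated tokens of the top candidate's string; wordRanges and WordBox are hypothetical names made up for this illustration. The Vision calls used are topCandidates(_:) and VNRecognizedText's boundingBox(for:).

```swift
import Foundation
#if canImport(Vision)
import Vision
#endif

/// Hypothetical helper: ranges of whitespace-separated tokens in `text`.
/// Assumption: a "word" is any run of non-whitespace characters.
func wordRanges(in text: String) -> [Range<String.Index>] {
    var ranges: [Range<String.Index>] = []
    var start: String.Index? = nil
    for index in text.indices {
        if text[index].isWhitespace {
            // A whitespace character closes the current token, if any.
            if let s = start {
                ranges.append(s..<index)
                start = nil
            }
        } else if start == nil {
            start = index
        }
    }
    // Close a token that runs to the end of the string.
    if let s = start {
        ranges.append(s..<text.endIndex)
    }
    return ranges
}

#if canImport(Vision)
/// Hypothetical result type: one word and its normalized bounding box.
struct WordBox {
    let text: String
    let rect: CGRect  // normalized (0...1), origin at the lower left
}

/// Splits one line-level observation into word-level boxes.
func wordBoxes(in observation: VNRecognizedTextObservation) -> [WordBox] {
    guard let candidate = observation.topCandidates(1).first else { return [] }
    let line = candidate.string
    return wordRanges(in: line).compactMap { range in
        // boundingBox(for:) can throw or return nil; such words are skipped.
        guard let box = try? candidate.boundingBox(for: range) else { return nil }
        return WordBox(text: String(line[range]), rect: box.boundingBox)
    }
}
#endif
```

The rects come back in Vision's normalized coordinate space with the origin at the lower left, so they still need converting (for example with VNImageRectForNormalizedRect) and a vertical flip before drawing in UIKit or SwiftUI.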
Everything runs offline. Live on iOS; Android build is ready and heading to Google Play. Built solo in Swift/SwiftUI/AVFoundation. Would love feedback, especially on OCR accuracy on tricky labels.