Inspired by this Hacker News post: https://news.ycombinator.com/item?id=43048698
Backstory: I was having trouble producing transcriptions of Colonial American documents, which pose their own challenges for OCR; tools like Tesseract fail miserably on them. So I built something. It uses Gemini and seems to work pretty well (disclaimer: you need your own API key). I didn't build in Claude support, but I expect it would work similarly well.
FWIW: largely vibe coded, with human review and intervention as required.