Excited to share Nanonets-OCR2, a state-of-the-art suite of models designed for advanced image-to-markdown conversion and Visual Question Answering (VQA).
Wow, OCR is now basically a general-purpose domain. I remember spending about a year trying to build one for receipts; the data curation alone took me 6 months.
Nice job, the scores are superb.
prats226•5h ago
Yes, and it's not just OCR (Optical Character Recognition): it understands layouts and captures signatures, charts, watermarks, etc., so it goes way beyond just characters.
PixelPanda•5h ago
Live Demo -> https://docstrange.nanonets.com/
Blog -> https://nanonets.com/research/nanonets-ocr-2/