Hi HN! I built Bank Parser to solve a problem CPAs face: importing old Chase and Bank of America statements into QuickBooks.
Chase limits CSV downloads to 18 months, so for 5+ years of historical data CPAs end up entering transactions by hand (45-60 min per statement).
Generic PDF converters (Tabula, PDFTables) fail on bank statements because:
- Banks use multiple formats within the same year (Chase v2 and v3 both appear in 2024)
- Some PDFs have no consistent column headers
- Columns have to be detected heuristically
My solution:
- Structure-based format detection (not year-based)
- Heuristic column inference for headerless PDFs
- 99% accuracy, tested on 70+ PDFs
- QuickBooks-ready 16-field format
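To give a feel for what "structure-based" detection means, here is a minimal sketch that classifies a statement by the shape of its transaction rows and never looks at the year. It is an illustration under stated assumptions, not Bank Parser's actual rules: the layout names, the `DATE_PREFIX` and `MONEY` patterns, and the majority threshold are all hypothetical.

```typescript
// Hypothetical layout labels; a real parser would map these to concrete bank formats.
type Layout = "amount-and-balance" | "amount-only" | "unknown";

// Assumed patterns: rows start with MM/DD and amounts look like 1,234.56.
const DATE_PREFIX = /^\d{2}\/\d{2}\b/;
const MONEY = /^-?\$?[\d,]+\.\d{2}$/;

// Count how many consecutive money-like tokens end the line.
function trailingMoneyCount(line: string): number {
  const tokens = line.trim().split(/\s+/);
  let count = 0;
  for (let i = tokens.length - 1; i >= 0 && MONEY.test(tokens[i] ?? ""); i--) {
    count++;
  }
  return count;
}

// Classify by the majority shape of the date-prefixed rows.
function detectLayout(lines: string[]): Layout {
  const rows = lines.filter((l) => DATE_PREFIX.test(l.trim()));
  if (rows.length === 0) return "unknown";
  const withBalance = rows.filter((l) => trailingMoneyCount(l) >= 2).length;
  const amountOnly = rows.filter((l) => trailingMoneyCount(l) === 1).length;
  if (withBalance > rows.length / 2) return "amount-and-balance";
  if (amountOnly > rows.length / 2) return "amount-only";
  return "unknown";
}

const layout = detectLayout([
  "01/15 COFFEE SHOP 4.50 1,200.00",
  "01/16 PAYROLL 2,000.00 3,200.00",
  "Page 1 of 3",
]);
console.log(layout);
```

Because the decision depends only on row shape, two statements from the same year with different layouts land in different branches.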
Tech stack: Node.js, pdfjs-dist (Mozilla PDF.js), TypeScript, Bull queue.
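Since pdfjs-dist hands back positioned text items rather than table cells, column inference for a headerless PDF can start from x positions alone. The sketch below is a simplified illustration, not the production code: `PositionedText`, `inferColumns`, `tolerance`, and `minCount` are hypothetical names and tuning knobs. With pdfjs-dist, `x` would come from `item.transform[4]` on the items returned by `page.getTextContent()`.

```typescript
// Minimal stand-in for a pdfjs-dist text item (hypothetical shape for this sketch).
interface PositionedText {
  str: string;
  x: number; // left edge in PDF units
}

interface Column {
  x: number;     // mean left edge of the cluster
  count: number; // number of items that start near this x
}

// Sort left edges and start a new cluster whenever the gap to the
// previous edge exceeds `tolerance`; drop clusters with too few items.
function inferColumns(items: PositionedText[], tolerance = 5, minCount = 2): Column[] {
  const xs = items.map((i) => i.x).sort((a, b) => a - b);
  const columns: Column[] = [];
  let sum = 0;
  let count = 0;
  let last = Number.NEGATIVE_INFINITY;

  const flush = (): void => {
    if (count >= minCount) columns.push({ x: sum / count, count });
    sum = 0;
    count = 0;
  };

  for (const x of xs) {
    if (count > 0 && x - last > tolerance) flush();
    sum += x;
    count++;
    last = x;
  }
  flush();
  return columns;
}

const sample: PositionedText[] = [
  { str: "01/15", x: 40 }, { str: "COFFEE SHOP", x: 100 }, { str: "4.50", x: 400 },
  { str: "01/16", x: 40.5 }, { str: "PAYROLL", x: 100 }, { str: "2,000.00", x: 402 },
  { str: "01/17", x: 41 }, { str: "GROCERY", x: 101 }, { str: "82.10", x: 401 },
  { str: "continued", x: 250 }, // stray item, should not become a column
];
const columns = inferColumns(sample);
console.log(columns.map((c) => c.x));
```

Amount columns are usually right-aligned, so clustering on the right edge (`x + width`) would likely work better for them; left edges are used here only to keep the sketch short.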
Free trial: 200 operations (3-4 statements).
Looking for feedback from anyone working with financial data or PDF parsing!