I've been building a bank statement converter (Bankstatemently) and kept discovering edge cases across different banks. At some point, I started cataloging them as "quirks" and I'm currently at 36 documented challenges and counting (think: dates without years across year boundaries, credit card charges shown as positive instead of negative, dates hiding inside description text etc)
Real bank data is private, so there's no shared dataset to test parsers against. Once I had these quirks, I realized I can use them to reconstruct statements that deliberately include these challenges so more people can use them
There's also a free evaluation API: submit your parsed JSON and get field-level accuracy scores back. Ground truth is held server-side, but that's not necessarily bullet-proof against overfitting
Would appreciate feedback on which edge cases are missing. I'm planning to make the next 10 statements a bit harder (scanned PDFs, multi-currency across multi-table, Buddhist era dates)
https://github.com/bankstatemently/bank-statement-parsing-be...
You can browse all of the quirks here with real-world examples: https://bankstatemently.com/benchmark/challenges