Show HN: Open-source synthetic bank statements for testing parsers

1•Maesh•1h ago
I open-sourced a dataset of 5 synthetic bank and credit card statement PDFs designed for testing extraction/parsing accuracy. Each PDF models a fictional bank from a different country, with realistic formatting.

I've been building a bank statement converter (Bankstatemently) and kept discovering edge cases across different banks. At some point I started cataloging them as "quirks", and I'm currently at 36 documented challenges and counting (think: dates without years across year boundaries, credit card charges shown as positive instead of negative, dates hiding inside description text, etc.).
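To make the first quirk concrete, here is a minimal sketch of one way to resolve year-less dates when a statement crosses a year boundary. It is not taken from Bankstatemently or the benchmark; the function name `resolve_yearless_date`, the `"%d %b"` format (English month abbreviations), and the idea of anchoring on the statement's end date are all assumptions made for illustration.

```python
from datetime import date, datetime


def resolve_yearless_date(text, period_end, fmt="%d %b"):
    """Attach a year to a year-less date string such as "28 Dec".

    Assumption: every transaction falls on or before the statement's
    end date (period_end) and within the year leading up to it.
    """
    # Try the statement's own year first, then fall back to the year before.
    for year in (period_end.year, period_end.year - 1):
        try:
            candidate = datetime.strptime(f"{text} {year}", f"{fmt} %Y").date()
        except ValueError:
            # The day does not exist in this year, e.g. "29 Feb" in a non-leap year.
            continue
        if candidate <= period_end:
            return candidate
    raise ValueError(f"cannot place {text!r} before {period_end}")


# Hypothetical statement ending 15 Jan 2025 that lists December and January rows.
period_end = date(2025, 1, 15)
rows = ["28 Dec", "31 Dec", "03 Jan"]
resolved = [resolve_yearless_date(r, period_end) for r in rows]
```

With this anchor the December rows land in 2024 and the January row in 2025. A statement that only exposes its start date would need the mirrored rule.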

Real bank data is private, so there's no shared dataset to test parsers against. Once I had these quirks, I realized I could use them to construct statements that deliberately include these challenges, so more people can test against them.

There's also a free evaluation API: submit your parsed JSON and get field-level accuracy scores back. Ground truth is held server-side, though that alone isn't necessarily bullet-proof against overfitting.
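For a rough sense of what a field-level score means, here is a local sketch. It does not call the evaluation API and is not the benchmark's scoring method; the transaction schema (`date`, `description`, `amount`), the index-based row alignment, and the function name `field_accuracy` are all assumptions.

```python
def field_accuracy(parsed, truth, fields=("date", "description", "amount")):
    """Return the fraction of ground-truth rows matched exactly, per field.

    Assumption: rows are aligned by position, so parsed[i] is compared
    with truth[i]. A row missing from the parser output counts as wrong
    for every field.
    """
    correct = {field: 0 for field in fields}
    for i, expected in enumerate(truth):
        got = parsed[i] if i < len(parsed) else {}
        for field in fields:
            if field in got and got[field] == expected.get(field):
                correct[field] += 1
    total = len(truth)
    return {field: (correct[field] / total if total else 0.0) for field in fields}


# Hypothetical ground truth and parser output; the second amount has the
# wrong sign, as in the credit card quirk described above.
truth = [
    {"date": "2024-12-28", "description": "GROCERY STORE", "amount": "-42.10"},
    {"date": "2025-01-03", "description": "CARD PAYMENT", "amount": "-15.00"},
]
parsed = [
    {"date": "2024-12-28", "description": "GROCERY STORE", "amount": "-42.10"},
    {"date": "2025-01-03", "description": "CARD PAYMENT", "amount": "15.00"},
]
scores = field_accuracy(parsed, truth)
```

Exact string equality is the strictest possible comparison. A real scorer might normalize whitespace, date formats, or amounts before comparing.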

I'd appreciate feedback on which edge cases are missing. I'm planning to make the next 10 statements a bit harder (scanned PDFs, multi-currency statements spread across multiple tables, Buddhist era dates).

https://github.com/bankstatemently/bank-statement-parsing-be...

You can browse all of the quirks here with real-world examples: https://bankstatemently.com/benchmark/challenges
