Show HN: Open-source synthetic bank statements for testing parsers

2•Maesh•5h ago
I open-sourced a dataset of 5 synthetic bank and credit card statement PDFs designed for testing extraction/parsing accuracy. Each PDF uses a fictional bank with realistic formatting from a different country.

I've been building a bank statement converter (Bankstatemently) and kept discovering edge cases across different banks. At some point I started cataloging them as "quirks", and I'm currently at 36 documented challenges and counting (think: dates without years across year boundaries, credit card charges shown as positive instead of negative, dates hiding inside description text, etc.).
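
To make the first quirk concrete, here is a minimal Python sketch of one way to recover years for dates printed without them. It is not how Bankstatemently handles it; the function name `infer_years`, the `DD Mon` date format, and the assumptions that rows are in chronological order and that the statement end date is known are all hypothetical choices for illustration.

```python
from datetime import date

# English month abbreviations; a real statement may use other languages.
MONTHS = {
    "Jan": 1, "Feb": 2, "Mar": 3, "Apr": 4, "May": 5, "Jun": 6,
    "Jul": 7, "Aug": 8, "Sep": 9, "Oct": 10, "Nov": 11, "Dec": 12,
}

def infer_years(day_month_strings, statement_end):
    """Attach years to 'DD Mon' strings.

    Assumes (hypothetically) that rows are chronological and that the
    last row falls on or before statement_end.
    """
    year = statement_end.year
    prev_month = statement_end.month
    result = []
    # Walk backwards from the statement end: whenever the month number
    # jumps up (e.g. Jan -> Dec), we have crossed into the previous year.
    for text in reversed(day_month_strings):
        day_str, month_str = text.split()
        month = MONTHS[month_str]
        if month > prev_month:
            year -= 1
        prev_month = month
        result.append(date(year, month, int(day_str)))
    result.reverse()
    return result

rows = ["30 Dec", "31 Dec", "02 Jan", "05 Jan"]
dated = infer_years(rows, date(2025, 1, 15))
```

With a statement ending on 2025-01-15, the two December rows resolve to 2024 and the two January rows to 2025. A parser that simply stamps every row with the statement year would put the December rows almost a year in the future.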

Real bank data is private, so there's no shared dataset to test parsers against. Once I had these quirks, I realized I could use them to construct statements that deliberately include these challenges, so more people can test against them.

There's also a free evaluation API: submit your parsed JSON and get field-level accuracy scores back. Ground truth is held server-side, though that's not necessarily bulletproof against overfitting.
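
For anyone wondering what "field-level accuracy" could look like, here is a rough local sketch in Python. The post does not describe the API's actual scoring method, so everything here is an assumption: the function name `field_accuracy`, the field names `date`, `description`, and `amount`, exact-match comparison, and aligning rows by position.

```python
def field_accuracy(predicted, truth, fields=("date", "description", "amount")):
    """Return the fraction of ground-truth rows matched exactly, per field.

    Rows are paired by position (an assumption). Ground-truth rows with
    no predicted counterpart count as wrong; extra predicted rows are
    ignored.
    """
    total = len(truth)
    scores = {}
    for field in fields:
        correct = sum(
            1 for p, t in zip(predicted, truth) if p.get(field) == t.get(field)
        )
        scores[field] = correct / total if total else 0.0
    return scores

truth = [
    {"date": "2024-12-31", "description": "COFFEE", "amount": -4.5},
    {"date": "2025-01-02", "description": "SALARY", "amount": 2000.0},
]
# The parser got the sign of the first amount wrong.
predicted = [
    {"date": "2024-12-31", "description": "COFFEE", "amount": 4.5},
    {"date": "2025-01-02", "description": "SALARY", "amount": 2000.0},
]
scores = field_accuracy(predicted, truth)
```

Here `scores` comes out as 1.0 for `date` and `description` and 0.5 for `amount`, which shows why per-field scores are more useful than a single pass/fail: the sign-flip quirk is visible on its own.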

I'd appreciate feedback on which edge cases are missing. I'm planning to make the next 10 statements a bit harder (scanned PDFs, multi-currency across multiple tables, Buddhist era dates).

https://github.com/bankstatemently/bank-statement-parsing-be...

You can browse all of the quirks here with real-world examples: https://bankstatemently.com/benchmark/challenges

Show HN: Three new Kitten TTS models – smallest less than 25MB

https://github.com/KittenML/KittenTTS
129•rohan_joshi•2h ago•39 comments

Show HN: Dumped Wix for an AI Edge agent so I never have to hire junior staff

7•axotopia•2h ago•9 comments

Show HN: Local Document Parsing for Agents

https://www.llamaindex.ai/blog/liteparse-local-document-parsing-for-ai-agents
17•cheesyFish•1h ago•0 comments

Show HN: Oku – One tab to filter out noise from feeds and content sources

https://oku.io
3•oan•1h ago•0 comments

Show HN: Duplicate 3 layers in a 24B LLM, logical deduction .22→.76. No training

https://github.com/alainnothere/llm-circuit-finder
223•xlayn•21h ago•78 comments

Show HN: BamBuddy – a self-hosted print archive for Bambu Lab 3D printers

https://bambuddy.cool
3•maziggy•1h ago•0 comments

Show HN: I built 48 lightweight SVG backgrounds you can copy/paste

https://www.svgbackgrounds.com/set/free-svg-backgrounds-and-patterns/
357•visiwig•1d ago•67 comments

Show HN: AgentClick – Human-in-the-loop review UI for AI coding agents

https://github.com/agentlayer-io/AgentClick
3•harvenstar•2h ago•0 comments

Show HN: PearlOS: we gave AI a talking desktop environment instead of a text box

2•stephanieriggs•3h ago•0 comments

Show HN: RustFS – Migrate from MinIO via simple binary replacement

https://rustfs.dev/binary-replacement-a-simple-way-to-migrate-from-minio-to-rustfs/
9•elvinagy•5h ago•9 comments

Show HN: Mavera – Predict audience response with GANs, not LLM sentiment

https://docs.mavera.io/introduction
4•jaxline506•2d ago•3 comments

Show HN: Will my flight have Starlink?

267•bblcla•1d ago•343 comments

Show HN: 3 AI agent trust systems cross-verified each other's delegation chains

https://github.com/kanoniv/agent-auth/issues/2
2•dreynow•2h ago•0 comments

Show HN: Browser grand strategy game for hundreds of players on huge maps

https://borderhold.io/play
49•sgolem•3d ago•22 comments

Show HN: MDX Docs – a lightweight React framework for documentation sites

https://mdxdocs.com
3•thequietmind•3h ago•0 comments

Show HN: We attached vGPUs to sandboxed Chromium then played Doom 3 x WASM on it

https://www.kernel.sh/blog/gpu
7•rgarcia•3h ago•0 comments

Show HN: Playing LongTurn FreeCiv with Friends

https://github.com/ndroo/freeciv.andrewmcgrath.info
81•verelo•23h ago•34 comments

Show HN: Dear Aliens (Writing Contest)

https://www.dearaliens.net/
3•surprisetalk•4h ago•0 comments

Show HN: React isn't the terminal UI bottleneck, the output pipeline is

2•nathan-cannon•1h ago•0 comments

Show HN: Ripl – A unified 2D/3D engine for Canvas, SVG, WebGPU, and the Terminal

https://www.ripl.rocks
5•andrewcourtice•7h ago•0 comments

Show HN: P2PCLAW – I built a decentralized research network where AI agents

3•FranciscoAngulo•5h ago•0 comments

Show HN: Tmux-IDE, OSS agent-first terminal IDE

https://tmux.thijsverreck.com
83•thijsverreck•1d ago•37 comments

Show HN: mtp-rs – pure-Rust MTP library, up to 4x faster than libmtp

https://github.com/vdavid/mtp-rs
2•vdavid•5h ago•1 comments

Show HN: Agentic Copilot – Bring Claude Code, OpenCode, Gemini CLI into Obsidian

https://github.com/spencermarx/obsidian-ai
5•mrxdev•6h ago•0 comments

Show HN: Pgit – A Git-like CLI backed by PostgreSQL

https://oseifert.ch/blog/building-pgit
122•ImGajeed76•2d ago•61 comments

Show HN: ShadowStrike EDR/XDR Kernel Sensor Development

2•Soocile•6h ago•0 comments

Show HN: Play 90s classic X-Com – UFO Defense in the browser via WASM

https://playxcom.online/
4•mrmrcoleman•6h ago•0 comments

Show HN: High Output Software Engineering (Book)

2•MaxMussio•7h ago•0 comments

Show HN: LLMadness – March Madness Model Evals

https://llmadness.com/2026/
5•rjkeck2•7h ago•2 comments