
Show HN: I got fired so I built a bank statement converter

https://aussiebankstatements.com
16•matherslabs•3mo ago
I recently got fired and decided to channel my energy into something productive. Over two weeks, I spent 16-hour days building a tool that converts Australian bank PDFs into clean, reliable CSVs, tailored specifically for Aussie banks.

Most Aussie banks only provide statements as PDFs, and generic converters often fail on them: columns drift, multi-line descriptions break parsing, and headers shift. I wanted a tool that just works.

To get started, I used my own bank statements to build the initial parsers. There was a "duh" moment when I realised how hard it is to get more realistic test data. People don't just hand over their financial ledgers. This solidified my core principle: trust and privacy had to be the absolute top priority.

I initially tried building everything client-side in JavaScript for maximum privacy, but performance and reliability were poor, and exposing the parsers on the front-end would have made them easy to copy.

I settled on a middle ground: a Python and FastAPI backend on Google Cloud Run. This lets me balance reliability with a strict privacy architecture. Files are processed in real time, and the temp file is deleted as soon as the request completes. There is no persistent storage and no logging of request bodies.
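To illustrate the "delete the temp file as soon as the request completes" idea, here is a minimal sketch using only the standard library. It is not the actual backend code: `process_upload` and the `parse` callback are hypothetical names, and in a FastAPI handler the bytes would come from the uploaded file. The point is only that a `try`/`finally` removes the file whether parsing succeeds or raises.

```python
import os
import tempfile
from typing import Callable


def process_upload(pdf_bytes: bytes, parse: Callable[[str], str]) -> str:
    """Write the upload to a temp file, parse it, and always delete the file.

    `parse` is a hypothetical callback that takes a file path and returns CSV text.
    """
    fd, path = tempfile.mkstemp(suffix=".pdf")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(pdf_bytes)
        return parse(path)
    finally:
        # Runs on success and on parser errors alike, so nothing is left on disk.
        if os.path.exists(path):
            os.remove(path)
```

Only the returned CSV text leaves the function; the PDF itself exists on disk only for the duration of the call.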

My technical approach is straightforward and focused on reliability:

- I use pdfplumber to extract text, avoiding complex and error-prone OCR.

- I apply a set of bank-specific regex patterns to pinpoint dates, amounts, and descriptions.

- A lookahead heuristic correctly merges multi-line transactions. Each parser is customised to its bank's unique PDF layout quirks.
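
The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the real parsers: the line layout, the `TXN_START` pattern, and the names `parse_lines` and `to_csv` are all made up for the example, and the input is assumed to be text lines already extracted from the PDF (for instance via pdfplumber's `page.extract_text()`). A line that starts with a date opens a transaction; the lookahead then absorbs following lines until the next date-led line or a blank line.

```python
import csv
import io
import re
from dataclasses import dataclass

# Hypothetical layout: "01 Mar 2024  COFFEE SHOP SYDNEY  -4.50"
TXN_START = re.compile(
    r"^(?P<date>\d{2} [A-Z][a-z]{2} \d{4})\s+"
    r"(?P<desc>.+?)\s+"
    r"(?P<amount>-?\$?[\d,]+\.\d{2})$"
)


@dataclass
class Transaction:
    date: str
    description: str
    amount: str


def parse_lines(lines):
    """Turn extracted text lines into transactions, merging continuation lines."""
    txns = []
    i = 0
    while i < len(lines):
        m = TXN_START.match(lines[i].strip())
        i += 1
        if not m:
            continue  # header or stray line before a transaction
        desc = [m["desc"]]
        # Lookahead: keep absorbing lines until the next transaction or a blank line.
        while (
            i < len(lines)
            and lines[i].strip()
            and not TXN_START.match(lines[i].strip())
        ):
            desc.append(lines[i].strip())
            i += 1
        amount = m["amount"].replace("$", "").replace(",", "")
        txns.append(Transaction(m["date"], " ".join(desc), amount))
    return txns


def to_csv(txns):
    """Serialise transactions as CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["date", "description", "amount"])
    for t in txns:
        writer.writerow([t.date, t.description, t.amount])
    return buf.getvalue()
```

A real parser would also need to stop the lookahead at page footers and balance columns, which is where the per-bank customisation comes in.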

The project is deliberately focused. Instead of supporting hundreds of banks with mediocre results, I'm concentrating on a small set to get them right. It currently supports CommBank, Westpac, UBank, and ING, with ANZ and NAB next. The whole thing is deployed on Cloudflare Pages and outputs clean CSVs ready for Excel, Google Sheets, Xero, or MYOB.

It was a fun challenge in reverse-engineering messy, real-world data.

Try it out here: https://aussiebankstatements.com

I'd love to hear feedback. If it breaks on your statement, a redacted sample would be a huge help for improving the parser.

I'm also curious to hear how others here have tackled similar messy data extraction challenges.

Comments

devrundown•3mo ago
Looks really good. Simple and clean UI. Nice work.
matherslabs•3mo ago
Thank you, really appreciate that!!
abc03•3mo ago
I appreciate that you described your approach. I don't live in Australia, but because of that I still took a look.
ApolloRising•3mo ago
"Technical guarantee: Processing happens in secure, isolated server instances that immediately purge all data. Your financial information never touches persistent storage." - May I ask how you are doing this?