Show HN: I got fired so I built a bank statement converter

https://aussiebankstatements.com

3•matherslabs•3h ago

I recently got fired and decided to channel my energy into something productive. For two weeks I worked 16-hour days building a tool that converts Australian bank statement PDFs into clean, reliable CSVs, tailored specifically to Aussie banks.

Most Aussie banks only provide statements as PDFs, and generic converters often fail on them: columns drift, multi-line descriptions break parsing, and headers shift. I wanted a tool that just works.

To get started, I used my own bank statements to build the initial parsers. There was a "duh" moment when I realised how hard it is to get more realistic test data. People don't just hand over their financial ledgers. This solidified my core principle: trust and privacy had to be the absolute top priority.

I initially tried building everything client-side in JavaScript for maximum privacy, but performance and reliability were poor, and exposing the parsers on the front-end would have made them easy to copy.

I settled on a middle ground: a Python and FastAPI backend on Google Cloud Run. This lets me balance reliability with a strict privacy architecture. Files are processed in real time, and the temp file is deleted as soon as the request completes. There is no persistent storage and no logging of request bodies.
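For anyone curious what a delete-after-request pattern can look like, here is a minimal sketch using only the Python standard library. It is an illustration, not the site's actual code: `convert_upload` and the `parse` callable are hypothetical names, and the FastAPI request handling is left out.

```python
import os
import tempfile


def convert_upload(pdf_bytes, parse):
    """Write an upload to a temp file, parse it, and always delete the file.

    `parse` is a hypothetical callable that takes a file path and returns
    the converted output (for example, CSV text).
    """
    fd, path = tempfile.mkstemp(suffix=".pdf")
    try:
        with os.fdopen(fd, "wb") as fh:
            fh.write(pdf_bytes)
        return parse(path)
    finally:
        # Runs whether parsing succeeded or raised, so nothing is left on disk.
        os.remove(path)
```

Putting the cleanup in `finally` means a parser crash on a malformed statement still removes the file.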

My technical approach is straightforward and focused on reliability:

- I use pdfplumber to extract text, avoiding complex and error-prone OCR.

- I apply a set of bank-specific regex patterns to pinpoint dates, amounts, and descriptions.

- A lookahead heuristic correctly merges multi-line transactions. Each parser is customised to its bank's unique PDF layout quirks.
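The regex and lookahead steps above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not one of the real parsers: the line layout (`date  description  amount  balance`), the `TXN_START` pattern, and the function names are all hypothetical, and real statements differ per bank. The input lines stand in for text that could come from pdfplumber's `page.extract_text()`.

```python
import csv
import io
import re

# Hypothetical layout: "01 Mar 2024  Description text  -12.34  1,234.56"
TXN_START = re.compile(
    r"^(?P<date>\d{2} [A-Z][a-z]{2} \d{4})\s+"
    r"(?P<desc>.+?)\s+"
    r"(?P<amount>-?[\d,]+\.\d{2})\s+"
    r"(?P<balance>-?[\d,]+\.\d{2})$"
)


def parse_lines(lines):
    """Group text lines into transactions, merging continuation lines."""
    rows = []
    i = 0
    while i < len(lines):
        match = TXN_START.match(lines[i].strip())
        if not match:
            i += 1  # skip headers and other lines outside a transaction
            continue
        row = match.groupdict()
        # Lookahead: non-blank lines that do not start a new transaction
        # are treated as a continuation of the current description.
        j = i + 1
        while j < len(lines):
            nxt = lines[j].strip()
            if not nxt or TXN_START.match(nxt):
                break
            row["desc"] += " " + nxt
            j += 1
        # Drop thousands separators so spreadsheets read the values as numbers.
        row["amount"] = row["amount"].replace(",", "")
        row["balance"] = row["balance"].replace(",", "")
        rows.append(row)
        i = j
    return rows


def to_csv(rows):
    """Serialise parsed rows to CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["date", "desc", "amount", "balance"])
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()
```

One known weakness of this sketch: a page footer that directly follows a transaction would be merged into its description, which is the kind of layout quirk a bank-specific parser has to handle.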

The project is deliberately focused. Instead of supporting hundreds of banks with mediocre results, I'm concentrating on a small set to get them right. It currently supports CommBank, Westpac, UBank, and ING, with ANZ and NAB next. The whole thing is deployed on Cloudflare Pages and outputs clean CSVs ready for Excel, Google Sheets, Xero, or MYOB.

It was a fun challenge in reverse-engineering messy, real-world data.

Try it out here: https://aussiebankstatements.com

I'd love to hear feedback. If it breaks on your statement, a redacted sample would be a huge help for improving the parser.

I'm also curious to hear how others here have tackled similar messy data extraction challenges.

Show HN: A CSS-Only Terrain Generator

Show HN: I built a local-first daily planner for iOS

Show HN: I built a highly customizable mental arithmetic trainer for iOS

Show HN: Pion/rtwatch – Watch video in sync with friends, pause/seek on back end

Show HN: Yourshoesmells.com – Find the most smelly boulder gym

Show HN: Nallely a modular reactive Python framework for custom MIDI instruments

Show HN: I got fired so I built a bank statement converter

Show HN: MyTimers.app offline-first PWA with no build step and zero dependencies

Show HN: a Rust ray tracer that runs on any GPU – even in the browser

Show HN: Agor → Figma for AI Coding (Open Source)

Show HN: Tamagotchi P1 for FPGAs

Show HN: I made a website that vibe-codes itself

Show HN: I Built a Prototype for a Universal Causal Language (UCL)

Show HN: FinBodhi – Local-first, double-entry app/PWA for your financial journey

Show HN: Centia.io – Open PostgreSQL/PostGIS back end for developers

Show HN: Strange Attractors

Show HN: Anki-LLM – Bulk process and generate Anki flashcards with LLMs

Show HN: Serie – A rich Git commit graph in your terminal

Show HN: Why write code if the LLM can just do the thing? (web app experiment)

Show HN: Chess960v2 – Stockfish tournament with different starting positions

Show HN: Pipelex – Declarative language for repeatable AI workflows

Show HN: Glitch Text Generator – Create stunning unicode text effects

Show HN: In a single HTML file, an app to encourage my children to invest

Show HN: An AI to match your voice to songs and artists you should sing

Show HN: Quibbler – A critic for your coding agent that learns what you want

Show HN: Learn German with Games

Show HN: Secret Management for Local Development

Show HN: WebAudio Data-Driven audio engine

Show HN: AgentML – SCXML for Deterministic AI Agents (MIT)

Show HN: An AI that keeps your internal documentation alive
