I built a neural classifier to replace Plaid's transaction categories

2•WilliXL•9mo ago
I recently shut down a startup I was building. It was a rewards platform for health-related spending. My users were scattered across the US, but mostly in SF, NYC, LA, Chicago, and Boston.

The core product relied on inferring whether a transaction was health-related or not. I quickly realized that adding rules and heuristics on top of Plaid's categories wouldn't work. On top of that, Plaid's categorization was far too inaccurate to base financial rewards on.

Here's an account of what I built to make it work, verified with a cleaned dataset of 6k data points collected from my platform.

First of all, Plaid's baseline categorization accuracy was low:
- Categorization accuracy was 65.22% overall
- Accuracy was better for well-known merchants (where Plaid identified an "Entity ID") at 83.99%

I tried RAG to start, but that immediately fell apart due to name collisions and regional duplication.

Thankfully I was able to start with Plaid's already cleaned transaction data. To better resolve entities, my pipeline took in:
- Transaction amount (for product band heuristics)
- Location
- POS method (in-person vs. online)
- A list of known bank-specific formatting quirks that I collected as I tried to build this pipeline (for now limited to the Big Banks ™)

Using that data, I could much better:
- Figure out which entity a purchase was made from among entities with duplicate names (mostly SMBs)
- Collapse regional identifiers into a single parent organization

Side note: did you know that Orangetheory has a different regional identifier for every location? For example, "Orangetheory", "OTF", "otf", "otf {city}", and "orangetheory {city}" are all possible names. This one took a long time to solve robustly.
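To make the regional-identifier collapse concrete, here is a minimal sketch of the idea. It is not the actual pipeline: the alias table, the parent name "Orangetheory Fitness", and the function name collapse_to_parent are all assumptions made up for illustration, and it only handles the "stem plus optional city" pattern from the examples above.

```python
import re
from typing import Optional

# Hypothetical alias table: parent organization -> name stems seen in raw descriptors.
PARENT_ALIASES = {
    "Orangetheory Fitness": ("orangetheory", "otf"),
}

def collapse_to_parent(raw_name: str) -> Optional[str]:
    """Return the parent organization for a raw merchant name, or None if unknown."""
    # Lowercase and turn punctuation into spaces so "OTF-Austin" and "otf austin" match.
    cleaned = re.sub(r"[^a-z0-9]+", " ", raw_name.lower()).strip()
    for parent, stems in PARENT_ALIASES.items():
        for stem in stems:
            # The stem must be the whole name or the leading token before a city suffix.
            if cleaned == stem or cleaned.startswith(stem + " "):
                return parent
    return None

for name in ["Orangetheory", "OTF", "otf chicago", "Otford Bakery"]:
    print(name, "->", collapse_to_parent(name))
```

Matching on the leading token rather than a substring keeps an unrelated name like "Otford Bakery" from being folded into the parent. Telling apart different businesses that share the exact same name would still need the location and amount signals listed earlier.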

This setup also let me supply a custom category to look for. In my case it was "health-related" or not, which I defined with the FSA/HSA eligibility rules (in JSON format), plus a few other properties such as fitness/studio class merchants and supplements.
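As a rough illustration of what a JSON-defined custom category could look like, here is a small sketch. The rule schema, the merchant type labels, and the function is_in_category are hypothetical; the real FSA/HSA rule file isn't reproduced in this post, so treat the field names as assumptions.

```python
import json

# Hypothetical rule file. Field names and merchant types are illustrative only.
RULES_JSON = """
{
  "category": "health-related",
  "eligible_merchant_types": ["pharmacy", "dental", "optometry"],
  "extra_merchant_types": ["fitness_studio", "supplements"]
}
"""

def is_in_category(merchant_type: str, rules: dict) -> bool:
    """Check a resolved merchant type against the custom category definition."""
    allowed = set(rules["eligible_merchant_types"]) | set(rules["extra_merchant_types"])
    return merchant_type in allowed

rules = json.loads(RULES_JSON)
print(is_in_category("fitness_studio", rules))
print(is_in_category("restaurant", rules))
```

Keeping the category definition in data rather than code means a different category (say, business expenses) would only need a different rule file, assuming the entity resolution step already yields a merchant type.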

The results:
- 87.28% accuracy on classifying "health-related" spend (with a "needs more info" tag for marketplace cases like Amazon)
- 95.78% accuracy on personal finance category classification, with only 300 known entities logged in my database. So this can definitely improve with more effort put into expanding the known entities list
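For anyone wanting to run a similar evaluation, here is a minimal sketch of scoring predictions when some transactions get a "needs more info" tag. It assumes, purely for illustration, that tagged transactions are reported separately instead of being counted as right or wrong; the function name score and the toy data are made up and are not the 6k dataset.

```python
from typing import List, Tuple

NEEDS_MORE_INFO = "needs more info"

def score(pairs: List[Tuple[str, str]]) -> Tuple[float, int]:
    """Take (predicted, actual) pairs; return (accuracy over decided cases, flagged count)."""
    # Assumption: flagged predictions are excluded from the accuracy denominator.
    decided = [(p, a) for p, a in pairs if p != NEEDS_MORE_INFO]
    flagged = len(pairs) - len(decided)
    if not decided:
        return 0.0, flagged
    correct = sum(1 for p, a in decided if p == a)
    return correct / len(decided), flagged

# Toy labels only, to show the shape of the input.
toy = [
    ("health-related", "health-related"),
    ("not health-related", "health-related"),
    (NEEDS_MORE_INFO, "health-related"),
    ("not health-related", "not health-related"),
]
accuracy, flagged = score(toy)
print(f"accuracy={accuracy:.2%} flagged={flagged}")
```

Counting flagged cases as errors instead would lower the headline number, so it is worth stating which convention is used whenever accuracy figures are compared.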

I made this writeup mostly as catharsis after shutting down my startup, and to warn about things to look out for when trying to properly use transaction data.

But I really do believe that this kind of infra, semantic understanding of financial data, is becoming increasingly valuable as financial data becomes more available, and that new businesses can be built with it. I am considering expanding this infra into a developer API or toolkit. So if you're working on financial rewards, personal finance apps, FSA/HSA/expense platforms, accounting tools, etc., I'd love to hear from you!
