Show HN: Replacing $50k manual forensic audits with a deterministic .py engine

2•cd_mkdir•2h ago
I’m a software architect, and I recently built Exit Protocol (https://exitprotocols.com), an automated forensic accounting engine for high-conflict litigation.

Problem: If you get divorced and need to prove that a specific $250k in a heavily commingled joint bank account is your "separate property" (e.g., from a pre-marital startup exit), the burden of proof is strictly mathematical. Historically, this meant paying a forensic CPA $500/hour to dump years of blurry bank PDFs into Excel and manually trace every dollar. It takes weeks and routinely costs over $50,000.

I looked at the legal standard courts use for this—the Lowest Intermediate Balance Rule (LIBR)—and realized it wasn’t an accounting problem. It is a Distributed Systems state-machine problem.

Why didn't we just "throw AI at it"?

There are a hundred legal-tech startups right now trying to use LLMs to summarize bank data. In a courtroom, GenAI is a serious liability. If an LLM hallucinates a single transaction, opposing counsel can challenge the reliability of the entire ledger under the Daubert standard.

To make this court-ready, we had to build a strictly deterministic pipeline:

1. Vision-Native Ingestion (Beating Tesseract): Bank statements are the final boss of OCR (merged cells, overlapping debit/credit columns). Standard linear OCR fails catastrophically. We built a spatial-grid OCR pipeline (using Azure Document Intelligence with a local Surya OCR fallback) that maps the geometric structure of the page. It reconstructs tabular ledgers perfectly, even from multi-generational "PDFs from hell."
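To illustrate what "mapping the geometric structure of the page" can mean, here is a minimal sketch that rebuilds table rows from word boxes. It assumes an OCR engine has already returned each word with its center coordinates. The `Word` class, `rebuild_rows`, the column edges, and the sample coordinates are all hypothetical; none of this is the Azure Document Intelligence or Surya API, and it is not Exit Protocol's actual pipeline.

```python
import bisect
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    x: float  # horizontal center of the word box (hypothetical units)
    y: float  # vertical center of the word box


def rebuild_rows(words, column_edges, row_tolerance=5.0):
    """Group words into rows by y, then place each word in a column by x."""
    rows = []
    for word in sorted(words, key=lambda w: (w.y, w.x)):
        # A word joins the current row if it sits close to that row's anchor y.
        if rows and abs(rows[-1]["y"] - word.y) <= row_tolerance:
            row = rows[-1]
        else:
            row = {"y": word.y, "cells": [[] for _ in range(len(column_edges) + 1)]}
            rows.append(row)
        # Column index = number of column edges to the left of the word.
        column = bisect.bisect_right(column_edges, word.x)
        row["cells"][column].append(word)
    return [
        [" ".join(w.text for w in sorted(cell, key=lambda w: w.x)) for cell in row["cells"]]
        for row in rows
    ]


# Assumed columns: date | description | debit | credit
edges = [100.0, 380.0, 480.0]
words = [
    Word("03/01", 40, 100), Word("ATM", 150, 101), Word("WITHDRAWAL", 200, 99),
    Word("500.00", 420, 100),
    Word("03/02", 40, 120), Word("PAYROLL", 160, 120), Word("2,000.00", 520, 121),
]
table = rebuild_rows(words, edges)
```

The point of the sketch is that debit and credit amounts are separated by where they sit on the page, not by the order in which the OCR engine emitted them.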

2. The Deterministic Engine (LIBR): The LIBR algorithm acts as a one-way ratchet. If an account balance drops below your separate property claim amount, your claim is permanently capped at that new floor. Subsequent marital deposits do not refill it (the "replenishment fallacy"). The engine replays thousands of transactions chronologically, continuously evaluating S_t = min(S_{t-1}, B_t), where S_t is the traceable separate-property claim and B_t is the account balance after transaction t.
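A minimal sketch of the ratchet described above, assuming the transactions are already in chronological order and expressed as signed amounts (deposits positive, withdrawals negative). The function name `trace_libr` and the sample figures are hypothetical, and flooring the claim at zero on an overdraft is an added assumption.

```python
from decimal import Decimal


def trace_libr(opening_balance, separate_claim, amounts):
    """Replay signed amounts in order; return (balance, claim) after each one."""
    balance = Decimal(opening_balance)
    # The claim can never exceed what is actually in the account.
    claim = min(Decimal(separate_claim), balance)
    history = []
    for amount in amounts:
        balance += Decimal(amount)
        # One-way ratchet: the claim only moves down, never back up.
        # Assumption: an overdrawn account leaves a claim of zero, not a negative one.
        claim = max(Decimal(0), min(claim, balance))
        history.append((balance, claim))
    return history


# $300k account holding a $250k separate claim: withdraw $200k, then deposit $150k.
history = trace_libr("300000", "250000", ["-200000", "150000"])
final_balance, final_claim = history[-1]
```

In this example the balance recovers to $250,000, but the traceable claim stays at $100,000, the lowest balance reached along the way.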

3. Resolving Timestamp Ambiguity: Bank PDFs give you dates, not timestamps. If a $10k deposit and a $10k withdrawal happen on the same day, order matters. We built a simulation toggle that forces "Worst Case" (withdrawals processed first) or "Best Case" (deposits processed first) sorting, establishing a mathematically irrefutable "Zone of Truth" for settlement negotiations.
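The same-day ordering problem can be sketched as two replays of the same ledger with different tie-breaking. This assumes ISO-formatted date strings and signed amounts. `replay_claim` and `zone_of_truth` are hypothetical names, and treating "Best Case" as deposits-first is an assumption.

```python
from decimal import Decimal


def replay_claim(opening_balance, claim, transactions, worst_case):
    """transactions: list of (iso_date, signed_amount). Returns the final claim."""
    def sort_key(tx):
        date, amount = tx
        amount = Decimal(amount)
        # Within a day: worst case puts withdrawals (negative) first,
        # best case puts deposits (positive) first.
        return (date, amount if worst_case else -amount)

    balance = Decimal(opening_balance)
    claim = min(Decimal(claim), balance)
    for _, amount in sorted(transactions, key=sort_key):
        balance += Decimal(amount)
        claim = max(Decimal(0), min(claim, balance))
    return claim


def zone_of_truth(opening_balance, claim, transactions):
    """Return (worst_case_claim, best_case_claim) for the same ledger."""
    low = replay_claim(opening_balance, claim, transactions, worst_case=True)
    high = replay_claim(opening_balance, claim, transactions, worst_case=False)
    return low, high


# $10k account, all of it claimed as separate; same-day $10k deposit and withdrawal.
same_day = [("2024-03-01", "10000"), ("2024-03-01", "-10000")]
low, high = zone_of_truth("10000", "10000", same_day)
```

Here the withdrawal-first ordering wipes the claim out, while the deposit-first ordering preserves all of it, so the two replays bracket every possible intra-day ordering for this ledger.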

4. Cryptographic Chain of Custody & Sovereign Mode: Lawyers are terrified of cloud SaaS breaches. We containerized the entire monolith (Django 5.0/Postgres/Celery) via Docker so enterprise firms can run it air-gapped on their own hardware (Sovereign Mode). Furthermore, every generated PDF dossier is sealed with a SHA-256 hash of the underlying data snapshot, proving to a judge that the output hasn't been tampered with since generation.
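A minimal sketch of sealing a data snapshot with SHA-256, assuming the snapshot can be serialized to canonical JSON (sorted keys, fixed separators) so that the same data always yields the same digest. The snapshot layout and the names `seal_snapshot` and `verify_snapshot` are hypothetical, not the product's actual format.

```python
import hashlib
import hmac
import json


def seal_snapshot(snapshot):
    """Return the SHA-256 hex digest of a canonical JSON encoding of the snapshot."""
    canonical = json.dumps(snapshot, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def verify_snapshot(snapshot, expected_digest):
    """Recompute the digest and compare it against the one printed on the dossier."""
    return hmac.compare_digest(seal_snapshot(snapshot), expected_digest)


# Amounts are kept as strings so float formatting cannot change the digest.
snapshot = {
    "account": "joint-001",
    "transactions": [["2024-03-01", "10000.00"], ["2024-03-01", "-10000.00"]],
}
digest = seal_snapshot(snapshot)

reordered = {"transactions": snapshot["transactions"], "account": "joint-001"}
tampered = {"account": "joint-001", "transactions": [["2024-03-01", "10000.00"]]}
```

A bare hash only shows that the data matches the digest. To support a tamper-evidence claim, the digest itself would need to be recorded somewhere the document's holder cannot alter, or be digitally signed.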

If you want to see the math in action, we set up a "Demo Sandbox" populated with a synthetic, highly complex 3-year commingled ledger. You can run the engine yourself here (Desktop recommended): https://exitprotocols.com/simulation/uplink/

Here is the exact "Attorney Work Product" (the Forensic Audit Dossier) that our system generates from raw PDFs: https://exitprotocols.com/static/documents/Forensic_Audit_Sa...

I'd love feedback from the HN crowd on the architecture—specifically handling edge-case data ingestion and maintaining cryptographic integrity in B2B enterprise deployments.

Cheers!

Comments

cd_mkdir•2h ago
Not a lawyer, so the Go-To-Market side in the legal space has been a steep learning curve. If anyone here has experience selling/deploying air-gapped, on-prem solutions to highly risk-averse, non-technical clients (like law firms), I would love to hear your battle stories.

Happy to answer any questions about the math, the OCR pipeline, or the architecture!

Sandbox link again: https://exitprotocols.com/simulation/uplink/
