Show HN: AI-Powered Receipt and Invoice Generator (LLM-Agnostic, Prompt-Based)

1•maxime_wellapp•1d ago
GitHub: https://github.com/WellApp-ai/Well/tree/main/ai-receipt-gene...

Example output: https://imgur.com/a/YtFSodj

We’ve open-sourced a small tool to generate synthetic receipts and invoices using LLMs — no templates, no HTML, just prompts.

Why? We needed large, diverse, semi-realistic documents to evaluate our document extraction pipeline (OCR + AI). Most generators were either too rigid (static templates) or tied to a single provider. So we built our own:

Features:

LLM-agnostic (OpenAI, local models, etc.)

Prompt-driven JSON output (not rendered docs)

Optional Faker-based fallback if the model isn't used

Configurable schemas, locales, and output count

Lightweight and hackable — runs with a single config
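
To make the feature list concrete, here is a minimal Python sketch of a prompt-driven generator with a pluggable backend and a model-free fallback. It is not the project's code: every name (`Backend`, `SCHEMA_FIELDS`, `build_prompt`, `fallback_receipt`, `generate`) and the schema itself are hypothetical, and the fallback uses the standard library's `random` as a stand-in for Faker so the snippet stays self-contained.

```python
import json
import random
from typing import Callable, Optional

# Hypothetical: a backend is any function mapping a prompt string to raw model text.
Backend = Callable[[str], str]

SCHEMA_FIELDS = ["merchant", "currency", "items", "total"]  # assumed schema


def build_prompt(locale: str) -> str:
    """Ask for JSON only, naming the fields we expect back."""
    return (
        f"Generate one realistic receipt for locale {locale}. "
        f"Reply with a single JSON object with keys: {', '.join(SCHEMA_FIELDS)}."
    )


def fallback_receipt(rng: random.Random) -> dict:
    """Model-free fallback (stand-in for a Faker-based generator)."""
    items = [
        {"name": f"item-{n}", "price": round(rng.uniform(1, 20), 2)}
        for n in range(rng.randint(1, 4))
    ]
    return {
        "merchant": f"Shop {rng.randint(1, 999)}",
        "currency": "USD",
        "items": items,
        "total": round(sum(item["price"] for item in items), 2),
    }


def generate(locale: str, backend: Optional[Backend] = None, seed: int = 0) -> dict:
    """Use the backend if given and its output parses; otherwise fall back."""
    if backend is not None:
        try:
            doc = json.loads(backend(build_prompt(locale)))
            if isinstance(doc, dict) and all(k in doc for k in SCHEMA_FIELDS):
                return doc
        except json.JSONDecodeError:
            pass  # malformed model output: use the fallback below
    return fallback_receipt(random.Random(seed))
```

Because a backend here is just a function from prompt to text, swapping providers would mean writing one small adapter per provider rather than touching the generator.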

This makes it easy to:

Generate test data at scale

Create edge cases (missing fields, odd currencies, broken totals)

Evaluate or fine-tune document understanding models
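
One way the evaluation step could look, as a rough Python sketch: since the generated JSON is the ground truth, an extraction pipeline can be scored field by field against it. The function name `field_accuracy` and the exact-match scoring rule are assumptions for illustration, not something the project prescribes.

```python
from typing import Any, Dict, Iterable, Tuple


def field_accuracy(
    pairs: Iterable[Tuple[Dict[str, Any], Dict[str, Any]]]
) -> Dict[str, float]:
    """Per-field exact-match rate of (ground_truth, predicted) document pairs."""
    hits: Dict[str, int] = {}
    seen: Dict[str, int] = {}
    for truth, predicted in pairs:
        for field, expected in truth.items():
            seen[field] = seen.get(field, 0) + 1
            # A field missing from the prediction counts as a miss.
            if field in predicted and predicted[field] == expected:
                hits[field] = hits.get(field, 0) + 1
    return {field: hits.get(field, 0) / count for field, count in seen.items()}
```

Exact match is the strictest choice; a real evaluation would likely want tolerance for amounts and fuzzy matching for names.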

Example use case: You can ask the model to generate a receipt from a Vietnamese coffee shop, with broken math and a typo in the merchant name — and get structured JSON output that mimics what real OCR might return.
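
The same kind of edge case can also be produced without a model by corrupting a clean record. The Python sketch below is only an illustration under assumed field names (`merchant`, `items`, `price`, `total`); `corrupt_receipt` and `total_is_consistent` are hypothetical helpers, not part of the tool.

```python
import copy
import random


def corrupt_receipt(receipt: dict, rng: random.Random) -> dict:
    """Return a copy with a wrong total and two adjacent merchant letters swapped."""
    noisy = copy.deepcopy(receipt)
    # Broken math: shift the total by a nonzero amount.
    noisy["total"] = round(receipt["total"] + rng.choice([-1.0, 1.0, 10.0]), 2)
    # Typo: swap two adjacent characters, if the name is long enough.
    name = receipt["merchant"]
    if len(name) >= 2:
        i = rng.randrange(len(name) - 1)
        noisy["merchant"] = name[:i] + name[i + 1] + name[i] + name[i + 2:]
    return noisy


def total_is_consistent(receipt: dict) -> bool:
    """True if the stated total matches the sum of item prices (to the cent)."""
    expected = round(sum(item["price"] for item in receipt["items"]), 2)
    return abs(expected - receipt["total"]) < 0.005
```

Running `total_is_consistent` over generated documents gives a quick check that "broken totals" really are broken and that clean documents are not.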

Would love feedback from folks working on:

LLM eval tooling

Synthetic data generation

OCR / document AI

AI agents that touch financial data

Happy to support anyone who wants to extend it or wire up other backends (Claude, Mistral, LM Studio, etc.).

Comments

chelm•1d ago
Cool work! I have not yet added vendors that generate “synthetic data” or collect documents to my list. If you like, create a PR and I'll merge it. https://idp-software.com/contribution/
