Show HN: Undatas.io – A pay-on-accept document parsing API

https://undatas.io/
1•jojogh•5mo ago
Hey HN, Alex here, founder of undatas.io.

Our journey started from a place of deep frustration with RAG (Retrieval-Augmented Generation). I was helping companies build internal knowledge bases on their own data, and the promise was huge. But in practice, the results were often mediocre. Important information was frequently missed during retrieval, and we kept hitting dead ends.

After endless debugging, we realized the problem wasn't the LLM; it was classic "garbage in, garbage out." We traced the retrieval failures back to the very first step: document parsing.

Whether we used open-source libraries or expensive paid APIs, the story was the same. Precision was lost. Key phrases, critical numbers, and entire table rows would just vanish during the parsing process. We spent countless hours manually comparing the original PDFs to the parsed output to find what went wrong. It was a soul-crushing, time-consuming nightmare.

The biggest pain points were:

1. Complex Tables: Most tools collapsed when faced with real-world documents. Borderless tables, cells merged across rows and columns, or tables containing handwritten notes were consistently mangled.

2. Lack of a Feedback Loop: When the parser got something wrong, there was no easy way to manually annotate and correct it. You were stuck with the bad output.

I got so frustrated that I decided to build the tool I wished I had: a precision-obsessed parsing engine that makes the entire data extraction process transparent. That’s what undatas.io is. And today, we're launching our API.

We built our API around a simple principle: you only pay for results you actually accept.

To solve the transparency problem, every piece of extracted data in the JSON response includes its positional coordinates (bbox). This lets you build your own "glass box" validator that maps each extracted value directly back to its location in the source document, so the data prep stage for RAG becomes transparent.
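
As a rough illustration of the "glass box" idea, here is a minimal Python sketch of a validator that flags extracted elements whose bbox does not fit on the page they claim to come from. The element shape (`page`, `text`, and `bbox` as `[x0, y0, x1, y1]`), the function names, and the page sizes are assumptions made for this example, not the documented undatas.io response schema; check the docs for the real field names.

```python
from typing import Dict, List, Tuple

# Hypothetical element shape, assumed for illustration only:
# {"page": int, "text": str, "bbox": [x0, y0, x1, y1]}
Element = Dict[str, object]


def bbox_is_valid(bbox: List[float], page_width: float, page_height: float) -> bool:
    """Return True if bbox is a well-formed [x0, y0, x1, y1] box inside the page."""
    if len(bbox) != 4:
        return False
    x0, y0, x1, y1 = bbox
    return 0 <= x0 < x1 <= page_width and 0 <= y0 < y1 <= page_height


def find_suspect_elements(
    elements: List[Element],
    page_sizes: Dict[int, Tuple[float, float]],
) -> List[Element]:
    """Collect elements whose page is unknown or whose bbox falls outside it."""
    suspects = []
    for el in elements:
        size = page_sizes.get(el["page"])
        if size is None or not bbox_is_valid(el["bbox"], *size):
            suspects.append(el)
    return suspects


# Made-up sample data: one element fits the page, one runs past the right edge.
sample = [
    {"page": 1, "text": "Total: 1,250", "bbox": [72.0, 90.0, 200.0, 104.0]},
    {"page": 1, "text": "Q3 revenue", "bbox": [500.0, 700.0, 700.0, 720.0]},
]
page_sizes = {1: (612.0, 792.0)}  # assumed US Letter page measured in points

suspects = find_suspect_elements(sample, page_sizes)
print([el["text"] for el in suspects])
```

Anything this check flags is a candidate for manual review against the original PDF. A fuller validator would also crop the source page at each bbox and compare the crop with the extracted text.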

Our goal is to build the best and most trustworthy parsing tool for developers. We're just getting started and would be grateful for your feedback.

You can check out the docs and try it out here: https://doc.undatas.io/

I’ll be here all day to answer any questions. Let me know what you think.
