Parse complex documents in LangChain with new provider UndatasIO

https://docs.langchain.com/oss/python/integrations/document_loaders/undatasio
1•jojogh•1h ago

Comments

jojogh•1h ago
Hey HN, Alex here, founder of undatas.io.

Huge news: We just launched as a LangChain Core Provider—and we’re here to kill the #1 pain point of RAG: garbage document parsing.

Let’s cut to it: Building reliable AI used to feel like rolling the dice. Existing loaders mangle tables, drop critical data, and give you no way to verify outputs. You’d blindly feed messy text into embeddings, waste compute on garbage, and wonder why your app failed. I started Undatasio because this frustration broke more of my projects than I can count.

Our fix? Two non-negotiables: absolute parsing precision and total transparency—wrapped in a model no one else offers: pay only for the parses you accept. Bad output? It’s free. No excuses, no gotchas.

This isn’t "another loader" for LangChain. As a Core Provider, `UndatasioLoader` bakes quality control into the start of your chain:

- Programmatically check parsed JSON before it hits embeddings
- Reject docs that miss key fields (e.g., no `invoice_total`, wrong table columns)
- See exactly where data came from with positional `bbox` coordinates (build your own validation UI in minutes)
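
To make the "check before you embed" idea concrete, here is a minimal sketch in plain Python. It does not call `UndatasioLoader` or any UndatasIO API. The dict layout (`fields`, `tables`, `value`, `bbox` as `[x0, y0, x1, y1]`), the expected column names, and the function names are all assumptions made up for illustration, not the library's actual output schema.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List

# Hypothetical schema: these names are illustrative only.
REQUIRED_FIELDS = ("invoice_total",)
EXPECTED_TABLE_COLUMNS = ["description", "quantity", "unit_price"]


@dataclass
class ValidationResult:
    accepted: bool
    problems: List[str] = field(default_factory=list)


def is_valid_bbox(bbox: Any) -> bool:
    """Assumes a bbox is four numbers [x0, y0, x1, y1] with positive area."""
    if not isinstance(bbox, (list, tuple)) or len(bbox) != 4:
        return False
    if not all(isinstance(v, (int, float)) for v in bbox):
        return False
    x0, y0, x1, y1 = bbox
    return x0 < x1 and y0 < y1


def validate_parsed_doc(doc: Dict[str, Any]) -> ValidationResult:
    """Collect every problem found; accept the document only if there are none."""
    problems: List[str] = []
    fields = doc.get("fields", {})

    for name in REQUIRED_FIELDS:
        entry = fields.get(name)
        if entry is None or entry.get("value") in (None, ""):
            problems.append(f"missing field: {name}")
            continue
        # A field without a usable bbox cannot be traced back to the page.
        if not is_valid_bbox(entry.get("bbox")):
            problems.append(f"bad bbox for field: {name}")

    for i, table in enumerate(doc.get("tables", [])):
        if table.get("columns") != EXPECTED_TABLE_COLUMNS:
            problems.append(f"table {i}: unexpected columns")

    return ValidationResult(accepted=not problems, problems=problems)
```

In a pipeline, you would run a check like this on each parsed document and pass only the accepted ones on to chunking and embedding, logging `problems` for the rest.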

No more wasting time or money on downstream garbage. Data prep should be the reliable part of your stack—not the scary one.

We’ve been grinding to make this integration feel native to LangChain, and partnering with their team to push it live has been a blast.

If you’re tired of RAG failing because your inputs are broken, give it a spin. We’re here all day to answer questions, and we need your feedback to make this even better.

Links to get started are in the comments—fire away!

Here’s how to get started:

1. Install the Package: `pip install langchain-undatasio` (PyPI Link: https://pypi.org/project/langchain-undatasio/)

2. Check out the Official Docs: (LangChain Provider Page: https://docs.langchain.com/oss/python/integrations/providers...)

3. Try the Live Demo: We've set up a Colab notebook with examples. (Google Colab Notebook: https://colab.research.google.com/drive/1k_UhPjNoiUXC7mkMOEI...)

I'll be here all day to answer any questions. Let me know what you think.
