Show HN: Contextual AI Document Parser – Infer hierarchy for long, complex docs

1•ishan_sinha•3h ago
Hey HN,

I’m Ishan, Product Manager at Contextual AI.

We're excited to announce our document parser that combines the best of custom vision, OCR, and vision language models to deliver unmatched accuracy.

There are a lot of parsing solutions out there. Here’s what makes ours different:

1) Document hierarchy inference: Unlike traditional parsers that process documents as isolated pages, our solution infers a document’s hierarchy and structure. This allows you to add metadata to each chunk that describes its position in the document, which then lets your agents understand how different sections relate to each other and connect information across hundreds of pages.

2) Minimized hallucinations: Our multi-stage pipeline minimizes severe hallucinations while also providing bounding boxes and confidence levels for table extraction to simplify auditing its output.

3) Superior handling of complex modalities: Technical diagrams, complex figures and nested tables are efficiently processed to support all of your data.
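To make point 1 concrete, here is a minimal sketch of what "hierarchy metadata on each chunk" can look like. It is not Contextual AI's implementation or API: the `Block` type, the `attach_hierarchy` function, and the sample headings are all hypothetical, and it assumes heading levels have already been inferred upstream.

```python
from dataclasses import dataclass


@dataclass
class Block:
    kind: str       # "heading" or "text" (hypothetical block types)
    text: str
    level: int = 0  # heading depth, 1 = top level; unused for text blocks


def attach_hierarchy(blocks):
    """Return text chunks, each tagged with the heading path above it."""
    path = []    # stack of (level, title) for the currently open headings
    chunks = []
    for b in blocks:
        if b.kind == "heading":
            # Close any heading at the same or a deeper level first.
            while path and path[-1][0] >= b.level:
                path.pop()
            path.append((b.level, b.text))
        else:
            chunks.append({"text": b.text, "hierarchy": [t for _, t in path]})
    return chunks


# Made-up sample input, loosely shaped like a 10-K outline.
blocks = [
    Block("heading", "Item 7. MD&A", 1),
    Block("heading", "Liquidity", 2),
    Block("text", "Cash increased during the quarter."),
    Block("heading", "Item 8. Financial Statements", 1),
    Block("text", "See the consolidated balance sheet."),
]

chunks = attach_hierarchy(blocks)
for c in chunks:
    print(" > ".join(c["hierarchy"]), "|", c["text"])
```

Each chunk carries its section path (for example `Item 7. MD&A > Liquidity`), so a retriever or agent can tell which part of the document a passage belongs to even when the chunk text itself never names the section.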

In an end-to-end RAG evaluation of a dataset of SEC 10Ks and 10Qs (containing 70+ documents spanning 6500+ pages), we found that including document hierarchy metadata in chunks increased the equivalence score from 69.2% to 84.0%.

Getting started

The first 500+ pages in our Standard mode (for complex documents that require VLMs and OCR) are free if you want to give it a try. Just create a Contextual AI account (https://app.contextual.ai/?signup=1) and visit the Components tab to use the Parse UI playground, or get an API key and call the API directly.

Documentation

1) /parse API: https://docs.contextual.ai/api-reference/parse/parse-file

2) Python SDK: https://github.com/ContextualAI/contextual-client-python/blo...

3) Code examples: https://github.com/ContextualAI/examples/blob/main/03-standa...

4) Blog post: https://contextual.ai/blog/document-parser-for-rag/

Happy to answer any questions about how our document parser works or how you might integrate it into your RAG systems!

New York's attempt to cut off public records access has been stopped

https://mailchi.mp/reclaimtherecords/reclaim-the-records-supports-new-legislation-in-new-york-for-better-public-records-access
1•toomuchtodo•1m ago•1 comments

Airbnb is trying to win travelers back from hotels

https://www.washingtonpost.com/travel/2025/05/13/airbnb-services-chefs-massage-haircuts/
1•pseudolus•3m ago•1 comments

"Google wanted that": Nextcloud decries Android permissions as "gatekeeping"

https://arstechnica.com/gadgets/2025/05/nextcloud-accuses-google-of-big-tech-gatekeeping-over-android-app-permissions/
1•thunderbong•3m ago•0 comments

Vision Language Models (Better, Faster, Stronger)

https://huggingface.co/blog/vlms-2025
2•jimmcslim•4m ago•0 comments

Google tests replacing 'I'm Feeling Lucky' with 'AI Mode'

https://techcrunch.com/2025/05/13/google-tests-replacing-im-feeling-lucky-with-ai-mode/
1•MarcoDewey•5m ago•0 comments

An End to Dead App Design

https://quality.ghost.io/an-end-to-dead-app-design/
1•notkoalas•7m ago•0 comments

The Database Row That Did and Didn't Exist

https://www.mistys-internet.website/blog/blog/2025/05/13/the-database-row-that-did-and-didnt-exist/
1•ingve•7m ago•0 comments

Crane: Reasoning with Constrained LLM Generation

https://arxiv.org/abs/2502.09061
1•tough•9m ago•0 comments

New online courses based on Handbook of Applied Cryptography

https://cacr.uwaterloo.ca/hac/
2•teleforce•12m ago•0 comments

Vibe coders, would you use this?

https://vybecheck.com
1•spencerh21•13m ago•1 comments

Ask HN: What do you think of this novel Rubik's Cube variant?

1•amichail•18m ago•0 comments

New Intel CPU Flaw Bypasses Spectre v2 Defenses to Leak Kernel Memory

https://cyberinsider.com/new-intel-cpu-flaw-bypasses-spectre-v2-defenses-to-leak-kernel-memory/
4•pierremenard•19m ago•0 comments

Ron Buckton laid off from Microsoft

https://twitter.com/rbuckton/status/1922364558426911039
7•latchkey•23m ago•0 comments

AI Tool Uses Face Photos to Estimate Biological Age and Predict Cancer Outcomes

https://www.massgeneralbrigham.org/en/about/newsroom/press-releases/ai-face-photos-tool-estimate-age-predict-cancer-outcomes
1•pseudolus•23m ago•0 comments

Microsoft to lay off 6k workers despite streak of profitable quarters

https://www.theguardian.com/technology/2025/may/13/microsoft-layoffs
3•ommz•24m ago•0 comments

José Mujica, Uruguay's modest leader who transformed the country, dies at 89

https://www.cnn.com/2025/05/13/americas/uruguay-president-jose-pepe-mujica-obit-intl-latam
1•marcodiego•25m ago•0 comments

D Slices

https://dlang.org/articles/d-array-article.html
1•teleforce•25m ago•0 comments

'A New Era' of Cancer Therapies

https://undark.org/2025/05/12/cancer-therapies-new-era/
1•EA-3167•26m ago•0 comments

Emacs: My New Doric Themes

https://protesilaos.com/codelog/2025-05-13-emacs-doric-themes/
2•robenkleene•28m ago•0 comments

Ask HN: AI Founders, how do you handle liability clauses?

1•midgard27•31m ago•2 comments

Intelligence on Earth Evolved Independently at Least Twice

https://www.wired.com/story/intelligence-evolved-at-least-twice-in-vertebrate-animals/
1•hydrolox•31m ago•0 comments

Dusk OS C Compiler

https://git.sr.ht/~vdupras/duskos/tree/master/item/fs/doc/comp/c.txt?__goaway_challenge=js-refresh&__goaway_id=09245079d4cf1efcea00bf4a44558faf&__goaway_referer=https%3A%2F%2Fduskos.org%2F
3•doener•32m ago•0 comments

Google might replace the 'I'm Feeling Lucky' button with AI Mode

https://www.theverge.com/news/665560/google-search-ai-mode-feeling-lucky-tests
1•mfiguiere•35m ago•0 comments

Ferroid: High-Throughput Snowflake-Like ID Generator in Rust

https://old.reddit.com/r/rust/comments/1klkd7z/ferroid_a_customizable_snowflakestyle_id/
2•s0l0ist•35m ago•0 comments

Binary Formats Are Better Than JSON in Browsers

https://adamfaulkner.github.io/binary_formats_are_better_than_json_in_browsers.html
2•adamkf•35m ago•0 comments

China's AI-powered humanoid robots aim to transform manufacturing

https://www.reuters.com/world/china/chinas-ai-powered-humanoid-robots-aim-transform-manufacturing-2025-05-13/
1•gray_amps•35m ago•0 comments

Saudi Arabia's Humain Partners with Nvidia on AI Goals as Trump Visits

https://www.reuters.com/world/middle-east/saudi-arabia-partners-with-nvidia-spur-ai-goals-trump-visits-2025-05-13/
1•bit_qntum•36m ago•0 comments

Nvidia CEO's net worth nears $120B as shares surge on Saudi chip deal

https://www.reuters.com/technology/nvidia-ceos-net-worth-nears-120-billion-shares-surge-saudi-chip-deal-2025-05-13/
2•byte-bolter•37m ago•0 comments

Ask HN: What's your best "data to wisdom" hack for SaaS?

2•korky•38m ago•0 comments

Flattening Rust's Learning Curve

https://corrode.dev/blog/flattening-rusts-learning-curve/
2•birdculture•41m ago•0 comments