Benchmarking the Most Reliable Document Parsing API

https://www.tensorlake.ai/blog/benchmarks

12•calavera•1h ago

Comments

serjester•1h ago

This is just a company advertisement, not even one that’s well done. They didn’t benchmark any of the real leaders in the space (Reducto, Extend, etc.) and left Gemini out of the first two tests, presumably because it was the best performer (while also being multiple orders of magnitude cheaper).

JLO64•1h ago

Personally I use OpenAI models via the API for transcription of PDF files. Is there a big difference between them and Gemini models?

diptanu•43m ago

Hey! I am the founder of Tensorlake. We benchmarked the models that our customers consider using in enterprises or regulated industries, where there is a big need for processing documents for various kinds of automation. Benchmarking takes a lot of time, so we focussed on the ones that we get asked about.

On Gemini and other VLMs - we excluded these models because they don't do visual grounding, i.e. they don't provide page layouts or bounding boxes of elements on the pages. This is a table-stakes feature for the use cases customers are building with Tensorlake. It wouldn't be possible to build citations without bounding boxes.

On pricing - we are probably the only company to offer pure on-demand pricing without any tiers. With Tensorlake, you can get back markdown from every page, summaries of figures, tables and charts, structured data, page classification, etc - in ONE API call. This means we are running a bunch of different models under the hood. If you add up the token count and the complexity of the infrastructure needed to build a pipeline around Gemini and other OCR/layout detection models, I bet the price you would end up with won't be any cheaper than what we provide :) Plus doing this at scale is very, very complex - it requires building a lot of sophisticated infrastructure, which is another source of cost behind modern document ingestion services.

ianhawes•33m ago

I just tested a non-English document and it rendered English text. Does your model not support anything other than English?

diptanu•14m ago

It does, we have users in Europe and Asia using it with non-English languages. Can you please send me a message at diptanu at tensorlake dot ai? I would love to see why it didn’t work.

coderintherye•27m ago

Google's Vertex API for document processing absolutely does bounding boxes. In fact, some of the document processors are just a wrapper around Google's product.

diptanu•15m ago

OP mentioned Gemini, not Google’s Vertex OCR API, which has very different performance and accuracy characteristics than Gemini.

hotpaper75•43m ago

Thanks for mentioning them. Indeed, their post seems to surface only a couple of names in the field, and maybe not the most relevant ones.

karakanb•36m ago

I have been recently looking into extracting a bunch of details from a set of legacy invoice PDFs and had a subpar experience. Gemini was the best among the ones that I tried, but even that missed quite a bit. I'll definitely give this a look.

It seems like such a crowded space, with many tools doing document extraction. I wonder if there's anything in particular pulling more attention into the space?

recursive4•23m ago

Curious how it compares to https://github.com/datalab-to/chandra

diptanu•11m ago

We haven’t tested Chandra yet, because it’s very new. Under the hood Tensorlake is very similar to Marker - it’s a pipeline-based OCR API: we do layout detection, text detection and recognition, table structure understanding, etc. We then use VLMs to enrich the results. Our models are much bigger than Marker's, and thus take a little longer to parse documents. We optimized for accuracy. We will have a faster API soon.
