Show HN: Browser-based PDF form fields detection (YOLO-based)

https://commonforms.simplepdf.com/

26•nip•3mo ago

Hey HN!

Last week, Joe Barrow released CommonForms [1], a set of open models for automatically detecting form fields in PDFs.

He trained two models, FFDNet-S and FFDNet-L, on a dataset of 55k documents. You can read more about his approach in the arXiv paper [2].

As someone who's been searching for reliable models to auto-detect form fields (one of the last hard problems in PDF form filling), I was seriously impressed by the quality of these models. I wanted to give them the attention and distribution they deserve, so I created a fully browser-based implementation that handles both detection and field addition.

My implementation relies on his models and ONNX Runtime Web, plus some post-processing. I plan to publish a small browser library that encapsulates it in the coming days, to make it easier to deploy anywhere (currently you'd have to fork or copy my code).
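For readers unfamiliar with what post-processing usually means for a YOLO-style detector, here is a minimal sketch: drop low-confidence boxes, then apply non-maximum suppression so overlapping duplicates of the same field collapse into one. This is an illustration under assumptions, not the code used in the demo. The `Detection` shape, the `filterDetections` name, and the default thresholds are all hypothetical, and the real model output would first need decoding into this form.

```typescript
// Hypothetical detection shape; the actual model output layout may differ.
interface Detection {
  x1: number; // left edge, in page pixels
  y1: number; // top edge
  x2: number; // right edge
  y2: number; // bottom edge
  score: number; // confidence in [0, 1]
  label: string; // e.g. "text", "checkbox" (illustrative class names)
}

// Intersection over union of two axis-aligned boxes.
function iou(a: Detection, b: Detection): number {
  const ix1 = Math.max(a.x1, b.x1);
  const iy1 = Math.max(a.y1, b.y1);
  const ix2 = Math.min(a.x2, b.x2);
  const iy2 = Math.min(a.y2, b.y2);
  const inter = Math.max(0, ix2 - ix1) * Math.max(0, iy2 - iy1);
  const areaA = (a.x2 - a.x1) * (a.y2 - a.y1);
  const areaB = (b.x2 - b.x1) * (b.y2 - b.y1);
  const union = areaA + areaB - inter;
  return union > 0 ? inter / union : 0;
}

// Confidence filtering followed by greedy, per-class non-maximum suppression.
// minScore and maxIou are assumed defaults, not values taken from the project.
function filterDetections(
  dets: Detection[],
  minScore = 0.3,
  maxIou = 0.5,
): Detection[] {
  const candidates = dets
    .filter((d) => d.score >= minScore)
    .sort((a, b) => b.score - a.score); // highest confidence first
  const kept: Detection[] = [];
  for (const d of candidates) {
    // Suppress a box only if a stronger box of the same class overlaps it heavily.
    const suppressed = kept.some(
      (k) => k.label === d.label && iou(k, d) > maxIou,
    );
    if (!suppressed) kept.push(d);
  }
  return kept;
}
```

Lowering `minScore` trades precision for recall: more candidate fields survive, at the cost of more false positives to clean up by hand.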

Happy to answer any questions about the browser-based implementation!

Questions about the models themselves should be directed to Joe, who I believe is also on HN [3].

[1] https://github.com/jbarrow/commonforms

[2] https://arxiv.org/abs/2509.16506

[3] https://news.ycombinator.com/user?id=jbarrow

Comments

jbarrow•3mo ago

Hey, Benjamin, thanks for the attribution! Happy to field any questions HN users have.

It's really gratifying to see people building on the work, and I love that it's possible to do browser-side/on-device.

Shindi•3mo ago

Tbh this model is extremely bad. I tried a couple of our medical form examples and it found almost none of the fields.

jbarrow•3mo ago

Super interesting. Would you be willing to try the Python package (https://github.com/jbarrow/commonforms) or share the PDFs?

For the non-ONNX models there are some inference tricks that generally improve performance, and lowering the confidence threshold might also help.
