Show HN: Free API to extract PDF data

6•leftnode•9h ago
Hi HN,

Like everyone, I'm working on a product that uses LLMs to extract data from photos and documents. Part of the processing pipeline is extracting data from PDFs as raw text or a raster image.

As part of our leadgen strategy, we've opened our REST API that lets you process pages of a PDF. The API is completely free to use anonymously, but is rate-limited to 1 page per 30 seconds. Creating a free account removes this restriction.
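
For anonymous use, a client can simply space its calls out to stay under the limit. Below is a minimal sketch of that idea, not anything the API provides: `PageThrottle` and its parameters are hypothetical names, and the 30-second default is taken from the limit stated above. The clock and sleep functions are injectable so the logic can be checked without actually waiting.

```python
import time


class PageThrottle:
    """Client-side spacing: allow at most one call per `interval` seconds.

    Hypothetical helper for illustration; the 30 s default mirrors the
    anonymous limit of 1 page per 30 seconds mentioned in the post.
    """

    def __init__(self, interval=30.0, clock=time.monotonic, sleep=time.sleep):
        self.interval = interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the most recent allowed call

    def wait(self) -> float:
        """Block until the next call is allowed; return the seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._last is not None:
            delay = max(0.0, self._last + self.interval - now)
        if delay > 0:
            self._sleep(delay)
        self._last = now + delay
        return delay
```

Calling `throttle.wait()` before each page request is enough for a single-process script; how the server responds when the limit is exceeded is not described in the post, so this sketch does not try to handle that case.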

The two endpoints are:

- https://extract.dev/api/pages/extract/raster - Rasterize a page of a PDF

- https://extract.dev/api/pages/extract/text - Extract text from a page of a PDF

Both have the same request format:

    {
        "file": "https://assets.extract-cdn.com/data/hd-receipt.pdf",
        "page": 1
    }

I've outlined more of the documentation here: https://extract.dev/docs
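
As a rough illustration of the request format above, here is a small Python sketch using only the standard library. The endpoint URLs and the `file`/`page` fields come from the post. The HTTP method (POST), the JSON `Content-Type` header, and the helper names `build_request` and `extract_page` are assumptions, and the response format is not described here, so the body is returned as raw bytes; check the docs before relying on any of it.

```python
import json
import urllib.request

# Base path shared by the two endpoints listed in the post.
BASE_URL = "https://extract.dev/api/pages/extract"


def build_request(kind: str, file_url: str, page: int) -> urllib.request.Request:
    """Build (but do not send) a request for one page of a PDF.

    `kind` is "text" or "raster", matching the two endpoints.
    Assumption: the API accepts a POST with a JSON body; the post
    shows the body but does not state the method.
    """
    if kind not in ("text", "raster"):
        raise ValueError(f"unknown endpoint kind: {kind!r}")
    body = json.dumps({"file": file_url, "page": page}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/{kind}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_page(kind: str, file_url: str, page: int, timeout: float = 30.0) -> bytes:
    """Send the request and return the raw response body, uninterpreted."""
    request = build_request(kind, file_url, page)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

For example, `extract_page("text", "https://assets.extract-cdn.com/data/hd-receipt.pdf", 1)` would request the text of the first page of the sample receipt.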

Under the hood, the API uses Poppler to extract text and rasterize pages. Note that text extraction returns the actual text encoded in the PDF and does not employ an OCR model. Give it a spin; I'm interested in your feedback on whether this is useful.
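
For anyone who wants to compare against a local run, the same two operations can be approximated with Poppler's command-line tools, `pdftotext` and `pdftoppm`. This is only a sketch of one way to do it: the post does not say how the service invokes Poppler, the helper names and the 150 DPI default are made up for illustration, and the functions take a local file path rather than a URL.

```python
import subprocess


def pdftotext_cmd(pdf_path: str, page: int) -> list:
    """Command to print the embedded text of a single page to stdout."""
    # -f / -l select the first and last page; "-" sends output to stdout.
    return ["pdftotext", "-f", str(page), "-l", str(page), pdf_path, "-"]


def pdftoppm_cmd(pdf_path: str, page: int, out_prefix: str, dpi: int = 150) -> list:
    """Command to rasterize a single page to <out_prefix>.png."""
    # -r sets the resolution in DPI; -singlefile writes one file
    # without appending a page number to the output name.
    return [
        "pdftoppm", "-f", str(page), "-l", str(page),
        "-r", str(dpi), "-png", "-singlefile",
        pdf_path, out_prefix,
    ]


def extract_text(pdf_path: str, page: int) -> str:
    """Run pdftotext (requires Poppler on PATH) and return the page text."""
    result = subprocess.run(
        pdftotext_cmd(pdf_path, page), capture_output=True, check=True
    )
    return result.stdout.decode("utf-8", errors="replace")
```

As with the API, a scanned PDF with no embedded text layer will come back empty from `pdftotext`, since no OCR is involved.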

Show HN: An open source access logs analytics script to block bot attacks

https://github.com/tempesta-tech/webshield
22•krizhanovsky•6h ago•2 comments

Show HN: Metorial (YC F25) – Vercel for MCP

https://github.com/metorial/metorial
43•tobihrbr•11h ago•15 comments

Show HN: Wispbit - Linter for AI coding agents

https://wispbit.com
23•dearilos•6h ago•11 comments

Show HN: CSS Extras

https://github.com/sindresorhus/css-extras
97•mofle•6d ago•59 comments

Show HN: PlayMyMood – Generate YouTube Music playlists based on your mood

https://playmymood.com/
2•speeq•4h ago•0 comments

Show HN: Relaya – Agent calls businesses for you

https://relaya.ai/
5•rishavmukherji•4h ago•0 comments

Show HN: Free API to extract PDF data

6•leftnode•9h ago•0 comments

Show HN: SQLite Online – 11 years of solo development, 11K daily users

https://sqliteonline.com/
448•sqliteonline•1d ago•138 comments

Show HN: Pathwave.io – MCP and mobile app to manually approve AI actions

https://web.pathwave.io/docs
2•felipe-pathwave•5h ago•0 comments

Show HN: Nofan Framework 16 Fan Controller

https://github.com/laktak/nofan
2•laktak•5h ago•0 comments

Show HN: AI toy I worked on is in stores

https://www.walmart.com/ip/SANTA-SMAGICAL-PHONE/16364964771
146•Sean-Der•2d ago•164 comments

Show HN: I built a simple ambient sound app with no ads or subscriptions

https://ambisounds.app/
295•alpaca121•2d ago•117 comments

Show HN: I built a free AI tool that scans and sorts financial news for traders

https://www.fxradar.live/
4•LuckyAleh•8h ago•1 comment

Show HN: Get a PMF score for your website, based on simulated user data

https://semilattice.ai/demos/pmf-report
2•jtewright•9h ago•0 comments

Show HN: I made an esoteric programming language that's read like a spellbook

https://github.com/sirbread/spellscript
171•sirbread•2d ago•55 comments

Show HN: GoHPTS-TCP/UDP Transparent Proxy with ARP Spoofing and Traffic Sniffing

https://github.com/shadowy-pycoder/go-http-proxy-to-socks
2•shadowy-pycoder•11h ago•0 comments

Show HN: Aidlab – Health Data for Devs

55•guzik•3d ago•17 comments

Show HN: Daily install trends of AI coding extensions in VS Code

https://bloomberry.com/coding-tools.html
23•AznHisoka•12h ago•9 comments

Show HN: Baby's first international landline

https://wip.tf/posts/telefonefix-building-babys-first-international-landline/
221•nbr23•6d ago•54 comments

Show HN: A Digital Twin of my coffee roaster that runs in the browser

https://autoroaster.com/
155•jvkoch•1w ago•37 comments

Show HN: docker/model-runner – an open-source tool for local LLMs

https://github.com/docker/model-runner
17•ericcurtin•13h ago•9 comments

Show HN: Wordle-Style Daily Wikipedia Game

https://hyperlinked.wiki
4•Mistri•13h ago•1 comment

Show HN: A Lisp Interpreter for Shell Scripting

https://github.com/gue-ni/redstart
113•quintussss•6d ago•25 comments

Show HN: I extracted BASIC listings for Tim Hartnell's 1986 book

https://github.com/nzduck/hartnell-exploring-ai-book
60•nzduck•4d ago•6 comments

Show HN: I invented a new generative model and got accepted to ICLR

https://discrete-distribution-networks.github.io/
649•diyer22•4d ago•90 comments

Show HN: Lights Out: my 2D Rubik's Cube-like Game

https://raymondtana.github.io/projects/pages/Lights_Out.html
80•raymondtana•4d ago•25 comments

Show HN: AI visuals that feel the music

https://www.trackart.io/
2•feskk•18h ago•0 comments

Show HN: Rift – A tiling window manager for macOS

https://github.com/acsandmann/rift
212•atticus_•3d ago•120 comments

Show HN: Open source, logical multi-master PostgreSQL replication

https://github.com/pgEdge/spock
150•pgedge_postgres•5d ago•60 comments

Show HN: FFTN, faster than FFTW in 700 lines of C

https://gitlab.sac-home.org/sac-group/fftn
7•thomaskoopman•1d ago•0 comments