Ask HN: Best on device LLM tooling for PDFs?

4•martinald•11mo ago
I've got very used to using the "big" LLMs for analysing PDFs.

Now that llama.cpp has vision support, I tried out PDFs with it locally (via LM Studio), but the results weren't as good as I'd hoped. One time it insisted it couldn't do "OCR", but gave me an example of what the data _could_ look like, which was in fact the data.

The other major problem is that some PDFs are actually made up of images, and it got very confused by those as well.

Given this is so new, I'm struggling to find any tools that make this easier.

Comments

raymond_goo•11mo ago
Try something like this:

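  # Notebook (e.g. Colab) setup; in a plain shell, run these without the "!"
  !pip install pytesseract pdf2image pillow
  !apt install poppler-utils
  #!apt install tesseract-ocr  # uncomment if the tesseract binary is missing
  from pdf2image import convert_from_path
  import pytesseract

  # Render every PDF page to an image (pdf2image needs poppler for this)
  pages = convert_from_path('k.pdf', dpi=300)

  # OCR each page image and append the text under a page header
  all_text = ""
  for page_num, img in enumerate(pages, start=1):
      text = pytesseract.image_to_string(img)
      all_text += f"\n--- Page {page_num} ---\n{text}"

  print(all_text)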
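
One possible refinement of the snippet above, given that the question mentions PDFs that mix real text with scanned images: OCR only the pages that have no usable text layer. This is a rough sketch, not a tested pipeline. It assumes `pypdf` is installed for reading the text layer (it is not part of the snippet above), and `combine_text`, `extract_pdf_text`, and the `min_chars` threshold are hypothetical names and values chosen for illustration.

```python
from typing import Callable, List


def combine_text(native_pages: List[str],
                 ocr_page: Callable[[int], str],
                 min_chars: int = 20) -> str:
    """Use each page's embedded text when present; otherwise call ocr_page.

    native_pages: text already extracted per page (empty for scanned pages).
    ocr_page: callable taking a 1-based page number and returning OCR text.
    min_chars: assumed threshold below which a page is treated as a scan.
    """
    parts = []
    for page_num, native in enumerate(native_pages, start=1):
        if len(native.strip()) >= min_chars:
            text = native
        else:
            text = ocr_page(page_num)
        parts.append(f"\n--- Page {page_num} ---\n{text}")
    return "".join(parts)


def extract_pdf_text(path: str, dpi: int = 300) -> str:
    """Wire combine_text to pypdf (text layer) and pytesseract (OCR)."""
    # Third-party imports stay local so combine_text has no dependencies.
    from pdf2image import convert_from_path
    from pypdf import PdfReader
    import pytesseract

    reader = PdfReader(path)
    native_pages = [page.extract_text() or "" for page in reader.pages]

    def ocr_page(page_num: int) -> str:
        # Render only the requested page to keep memory use down.
        images = convert_from_path(
            path, dpi=dpi, first_page=page_num, last_page=page_num
        )
        return pytesseract.image_to_string(images[0])

    return combine_text(native_pages, ocr_page)
```

Calling `extract_pdf_text('k.pdf')` would then return text in the same `--- Page N ---` layout as above. The threshold is a guess: a scanned page with a stray embedded page number could still slip past it, so it may need tuning per document set.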
constantinum•11mo ago
Give https://pg.llmwhisperer.unstract.com/ a try.