Ask HN: Best on device LLM tooling for PDFs?

4•martinald•7mo ago
I've got very used to using the "big" LLMs for analysing PDFs.

Now that llama.cpp has vision support, I tried out PDFs with it locally (via LM Studio), but the results weren't as good as I'd hoped. One time it insisted it couldn't do "OCR", then gave me an example of what the data _could_ look like, which turned out to be the actual data.

The other major problem is that some PDFs are actually made up of images, and it got super confused by those as well.

Given this is so new I'm struggling to find any tools which make this easier.

Comments

raymond_goo•7mo ago
Try something like this

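  # Notebook (Colab/Jupyter) setup: lines starting with "!" are shell commands.
  !pip install pytesseract pdf2image pillow
  # pdf2image needs poppler to render PDF pages.
  !apt install poppler-utils
  # Uncomment if the tesseract binary is not already installed:
  #!apt install tesseract-ocr

  from pdf2image import convert_from_path
  import pytesseract

  # Render every page of the PDF to a PIL image at 300 dpi.
  pages = convert_from_path('k.pdf', dpi=300)

  # OCR each page and label its text with the page number.
  chunks = []
  for page_num, img in enumerate(pages, start=1):
      text = pytesseract.image_to_string(img)
      chunks.append(f"\n--- Page {page_num} ---\n{text}")

  all_text = "".join(chunks)
  print(all_text)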
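For PDFs that mix real text pages with scanned image pages (the second problem in the question), a rough sketch of one possible approach is to read the embedded text layer first and only OCR the pages where it comes back nearly empty. This is an assumption-heavy illustration, not something from the thread: `pypdf` is an extra dependency, and `needs_ocr`, `extract_pages`, and the `min_chars` threshold of 20 are hypothetical names and values chosen for the example.

```python
def needs_ocr(extracted_text: str, min_chars: int = 20) -> bool:
    """Guess that a page is image-only when its text layer is (nearly) empty.

    min_chars is an arbitrary illustrative threshold, not a tuned value.
    """
    return len(extracted_text.strip()) < min_chars


def extract_pages(pdf_path: str, min_chars: int = 20) -> list[str]:
    """Return one text string per page, falling back to OCR where needed."""
    # Imported here so the helper above can be used without these packages.
    from pypdf import PdfReader              # assumption: pip install pypdf
    from pdf2image import convert_from_path  # needs poppler installed
    import pytesseract                       # needs the tesseract binary

    reader = PdfReader(pdf_path)
    texts = []
    for index, page in enumerate(reader.pages):
        text = page.extract_text() or ""
        if needs_ocr(text, min_chars):
            # Render only this page; pdf2image page numbers are 1-based.
            image = convert_from_path(
                pdf_path, dpi=300, first_page=index + 1, last_page=index + 1
            )[0]
            text = pytesseract.image_to_string(image)
        texts.append(text)
    return texts
```

Calling `extract_pages('k.pdf')` would then give a list of per-page strings that can be pasted into a local model as plain text, which sidesteps the vision path for pages that already have a usable text layer.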
constantinum•7mo ago
give https://pg.llmwhisperer.unstract.com/ a try
