
Ask HN: Best on device LLM tooling for PDFs?

4•martinald•6mo ago
I've got very used to using the "big" LLMs for analysing PDFs.

Now that llama.cpp has vision support, I tried PDFs with it locally (via LM Studio), but the results weren't as good as I'd hoped. One time it insisted it couldn't do "OCR", but gave me an example of what the data _could_ look like, which was the actual data.

The other major problem is that some PDFs are actually made up of images, and it got very confused by those as well.

Given this is so new, I'm struggling to find any tools that make this easier.

Comments

raymond_goo•6mo ago
Try something like this:

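  # Notebook setup (Colab/Jupyter): the "!" lines run as shell commands
  !pip install pytesseract pdf2image pillow
  !apt install poppler-utils
  # Uncomment if the tesseract binary is not already installed
  #!apt install tesseract-ocr
  from pdf2image import convert_from_path
  import pytesseract

  # Render every PDF page to an image; 300 dpi is usually enough for OCR
  pages = convert_from_path('k.pdf', dpi=300)

  # OCR each page image and collect the text under a page header
  chunks = []
  for page_num, img in enumerate(pages, start=1):
      text = pytesseract.image_to_string(img)
      chunks.append(f"\n--- Page {page_num} ---\n{text}")

  all_text = "".join(chunks)
  print(all_text)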
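
Since the question mentions that some PDFs are made up of images while others carry real text, one possible refinement is to OCR only the pages whose embedded text layer looks empty. This is a rough sketch, not a tested pipeline: the helper names, the `min_chars` threshold, and the use of pypdf to read the text layer are all assumptions on my part and not something from the thread.

```python
def pages_needing_ocr(page_texts, min_chars=25):
    """Return 1-based page numbers whose embedded text layer looks empty.

    page_texts: one string per page (the text layer; may be empty or None).
    min_chars: hypothetical threshold, tune it for your documents.
    """
    needs = []
    for page_num, text in enumerate(page_texts, start=1):
        # Count only non-whitespace characters so a layer of blanks does not pass.
        visible = sum(1 for ch in (text or "") if not ch.isspace())
        if visible < min_chars:
            needs.append(page_num)
    return needs


def embedded_text_per_page(pdf_path):
    """Read each page's text layer with pypdf (assumed installed: pip install pypdf)."""
    # Imported lazily so pages_needing_ocr has no third-party dependency.
    from pypdf import PdfReader

    reader = PdfReader(pdf_path)
    return [page.extract_text() or "" for page in reader.pages]
```

With that, something like `pages_needing_ocr(embedded_text_per_page('k.pdf'))` gives the pages to send through the pdf2image and pytesseract loop above; `convert_from_path` accepts `first_page` and `last_page` if you only want to render those. Pages that already have text can go to the model directly as text.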
constantinum•6mo ago
Give https://pg.llmwhisperer.unstract.com/ a try.
