Ask HN: Best on device LLM tooling for PDFs?

4•martinald•1y ago
I've got very used to using the "big" LLMs for analysing PDFs.

Now that llama.cpp has vision support, I tried out PDFs with it locally (via LM Studio), but the results weren't as good as I'd hoped. One time it insisted it couldn't do "OCR", but gave me an example of what the data _could_ look like, which was in fact the actual data.

The other major problem is that PDFs are sometimes made up entirely of images, and it got super confused by those as well.

Given this is so new, I'm struggling to find any tools that make this easier.

Comments

raymond_goo•1y ago
Try something like this:

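  # Notebook (Colab/Jupyter) setup; drop the leading "!" in a regular shell.
  !pip install pytesseract pdf2image pillow
  !apt install poppler-utils
  # Uncomment if the Tesseract binary isn't already installed:
  #!apt install tesseract-ocr
  from pdf2image import convert_from_path
  import pytesseract

  # Render each PDF page to an image (needs poppler), then OCR it.
  pages = convert_from_path('k.pdf', dpi=300)

  page_texts = []
  for page_num, img in enumerate(pages, start=1):
      text = pytesseract.image_to_string(img)
      page_texts.append(f"\n--- Page {page_num} ---\n{text}")

  all_text = "".join(page_texts)
  print(all_text)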
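For PDFs that mix real text pages with scanned ones (the image-only case mentioned in the question), one option is to run OCR only on pages whose embedded text layer is empty or nearly so. The following is a rough sketch, not a tested pipeline: `extract_pages`, `join_pages`, `read_text_layer`, `run_ocr`, and the `MIN_TEXT_CHARS` threshold are all hypothetical names and values chosen for illustration, and the two callables are left for you to wire up to whatever libraries you use.

```python
from typing import Callable, List, Optional

# Hypothetical threshold: pages whose text layer has fewer characters than
# this are treated as scanned images. Tune it for your documents.
MIN_TEXT_CHARS = 20


def extract_pages(
    num_pages: int,
    read_text_layer: Callable[[int], Optional[str]],
    run_ocr: Callable[[int], str],
    min_chars: int = MIN_TEXT_CHARS,
) -> List[str]:
    """Return one string per page, using OCR only where the text layer is (nearly) empty."""
    results = []
    for page_index in range(num_pages):
        # Some extractors return None for pages without text; treat that as empty.
        text = (read_text_layer(page_index) or "").strip()
        if len(text) < min_chars:
            # Little or no embedded text: assume an image-only page and OCR it.
            text = run_ocr(page_index).strip()
        results.append(text)
    return results


def join_pages(pages: List[str]) -> str:
    """Join page texts using the same '--- Page N ---' markers as the snippet above."""
    return "".join(
        f"\n--- Page {num} ---\n{text}" for num, text in enumerate(pages, start=1)
    )
```

With the snippet above, `run_ocr` could be something like `lambda i: pytesseract.image_to_string(pages[i])`, and `read_text_layer` could wrap a text extractor such as pypdf's `page.extract_text()`. Skipping OCR on pages that already have a text layer is usually faster and avoids adding recognition errors to text that was already exact.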

constantinum•1y ago
Give https://pg.llmwhisperer.unstract.com/ a try.
