Ask HN: Best on device LLM tooling for PDFs?

4•martinald•1y ago
I've got very used to using the "big" LLMs for analysing PDFs.

Now that llama.cpp has vision support, I tried out PDFs with it locally (via LM Studio), but the results weren't as good as I'd hoped. One time it insisted it couldn't do "OCR", but then gave me an example of what the data _could_ look like, which was in fact the actual data.

The other major problem is that some PDFs are actually made up of images, and it got super confused by those as well.

Given this is so new, I'm struggling to find any tools that make this easier.

Comments

raymond_goo•1y ago
Try something like this

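  # Notebook (Colab/Jupyter) cells: the "!" prefix runs a shell command.
  !pip install pytesseract pdf2image pillow
  # poppler-utils provides the PDF renderer that pdf2image calls.
  !apt install poppler-utils
  # Uncomment if the tesseract binary is not already installed:
  #!apt install tesseract-ocr
  from pdf2image import convert_from_path
  import pytesseract

  # Render every PDF page to an image at 300 dpi.
  pages = convert_from_path('k.pdf', dpi=300)

  # OCR page by page, labelling each page's text.
  chunks = []
  for page_num, img in enumerate(pages, start=1):
      text = pytesseract.image_to_string(img)
      chunks.append(f"\n--- Page {page_num} ---\n{text}")

  all_text = "".join(chunks)
  print(all_text)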
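For PDFs that mix real text with scanned pages, a possible variation is to read the embedded text layer first and only OCR the pages where that comes back (nearly) empty. This is a rough sketch, not a tested pipeline: it assumes pypdf is installed alongside the packages above, and the names `has_text_layer`, `extract_pdf_text` and the 20-character threshold are made up for illustration.

```python
def has_text_layer(page_text: str, min_chars: int = 20) -> bool:
    """Heuristic: a page counts as 'real text' if extraction yields
    at least min_chars non-whitespace characters (threshold is a guess)."""
    return sum(1 for ch in page_text if not ch.isspace()) >= min_chars


def extract_pdf_text(path: str, dpi: int = 300, min_chars: int = 20) -> str:
    """Use the embedded text layer where present, OCR the page otherwise."""
    # Imported here so the heuristic above works without these packages.
    from pypdf import PdfReader
    from pdf2image import convert_from_path
    import pytesseract

    reader = PdfReader(path)
    parts = []
    for page_num, page in enumerate(reader.pages, start=1):
        text = page.extract_text() or ""
        if not has_text_layer(text, min_chars):
            # Likely a scanned page: rasterize just this page and OCR it.
            images = convert_from_path(
                path, dpi=dpi, first_page=page_num, last_page=page_num
            )
            text = pytesseract.image_to_string(images[0])
        parts.append(f"\n--- Page {page_num} ---\n{text}")
    return "".join(parts)
```

Calling `print(extract_pdf_text('k.pdf'))` would then give the same page-labelled output as the snippet above, and the plain text can be pasted into the local model instead of relying on its vision support.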

constantinum•1y ago
Give https://pg.llmwhisperer.unstract.com/ a try.
