Ask HN: Best on device LLM tooling for PDFs?

4•martinald•7mo ago
I've got very used to using the "big" LLMs for analysing PDFs.

Now that llama.cpp has vision support, I tried PDFs with it locally (via LM Studio), but the results weren't as good as I'd hoped. One time it insisted it couldn't do "OCR", but gave me an example of what the data _could_ look like, which was the actual data.

The other major problem is that some PDFs are actually made up of images, and it got super confused by those as well.

Given this is so new, I'm struggling to find any tools that make this easier.

Comments

raymond_goo•7mo ago
Try something like this:

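  # Notebook cell: the "!" lines are shell commands; drop the "!" to run them in a terminal.
  !pip install pytesseract pdf2image pillow
  !apt install poppler-utils
  # Uncomment if the tesseract binary isn't installed yet:
  #!apt install tesseract-ocr
  from pdf2image import convert_from_path
  import pytesseract

  # Render each PDF page to an image (this is the step that needs poppler).
  pages = convert_from_path('k.pdf', dpi=300)

  # OCR every page and label the output with its page number.
  chunks = []
  for page_num, img in enumerate(pages, start=1):
      text = pytesseract.image_to_string(img)
      chunks.append(f"\n--- Page {page_num} ---\n{text}")

  all_text = "".join(chunks)
  print(all_text)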
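
For PDFs that mix real text pages with scanned-image pages (the second problem in the question), one possible variation is to keep the embedded text layer where it exists and only OCR the pages that come back empty. This is a rough sketch, not a tested pipeline: it assumes `pypdf` is installed alongside the packages above, and the names `merge_text_and_ocr`, `pdf_to_text` and the `min_chars` threshold are made up for illustration.

```python
from typing import Callable, List


def merge_text_and_ocr(native_texts: List[str],
                       ocr_page: Callable[[int], str],
                       min_chars: int = 20) -> List[str]:
    """Keep a page's embedded text if it looks usable, otherwise OCR that page.

    native_texts: text-layer output per page (empty for scanned pages).
    ocr_page: callback taking a 0-based page index and returning OCR text.
    min_chars: assumed cut-off for "this page has a real text layer"; tune it.
    """
    merged = []
    for index, text in enumerate(native_texts):
        if len(text.strip()) >= min_chars:
            merged.append(text)
        else:
            merged.append(ocr_page(index))
    return merged


def pdf_to_text(path: str, dpi: int = 300) -> str:
    # Imported lazily so merge_text_and_ocr works without these packages.
    from pypdf import PdfReader
    from pdf2image import convert_from_path
    import pytesseract

    reader = PdfReader(path)
    native = [page.extract_text() or "" for page in reader.pages]

    def ocr_page(index: int) -> str:
        # pdf2image page numbers are 1-based, so shift the index by one.
        images = convert_from_path(path, dpi=dpi,
                                   first_page=index + 1, last_page=index + 1)
        return pytesseract.image_to_string(images[0])

    pages = merge_text_and_ocr(native, ocr_page)
    return "\n".join(f"--- Page {n} ---\n{t}"
                     for n, t in enumerate(pages, start=1))
```

Calling `pdf_to_text('k.pdf')` would then return one string with a page marker per page, and the plain text can be handed to a local model instead of relying on its vision support.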
constantinum•7mo ago
Give https://pg.llmwhisperer.unstract.com/ a try.
