Ask HN: Best on device LLM tooling for PDFs?

4•martinald•7mo ago
I've got very used to using the "big" LLMs for analysing PDFs.

Now that llama.cpp has vision support, I tried out PDFs with it locally (via LM Studio), but the results weren't as good as I'd hoped. One time it insisted it couldn't do "OCR", but gave me an example of what the data _could_ look like, which was the actual data.

The other major problem is that some PDFs are actually made up of images, and it got super confused by those as well.

Given this is so new, I'm struggling to find any tools that make this easier.

Comments

raymond_goo•7mo ago
Try something like this

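  # Notebook setup: the leading "!" runs the line as a shell command
  !pip install pytesseract pdf2image pillow
  !apt install poppler-utils
  # Uncomment if the tesseract binary is not already installed:
  #!apt install tesseract-ocr
  from pdf2image import convert_from_path
  import pytesseract

  # Render each PDF page to an image (pdf2image relies on poppler)
  pages = convert_from_path('k.pdf', dpi=300)

  # OCR each page and label the output with its page number
  all_text = ""
  for page_num, img in enumerate(pages, start=1):
      text = pytesseract.image_to_string(img)
      all_text += f"\n--- Page {page_num} ---\n{text}"

  print(all_text)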
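
The loop above OCRs every page, which is slow and can be less accurate than the embedded text when a page already has a text layer. For PDFs that mix real text with scanned images, one option is to OCR only the pages whose text layer is missing or nearly empty. Below is a minimal sketch of that routing, not a tested pipeline: the helper names (`needs_ocr`, `extract_pages`, `format_pages`) and the 20-character threshold are assumptions made up for illustration, and the OCR step is passed in as a function so the logic does not depend on any particular library.

```python
from typing import Callable, List, Optional


def needs_ocr(embedded_text: Optional[str], min_chars: int = 20) -> bool:
    """Treat a page as scanned if its text layer is missing or nearly empty.

    min_chars is an arbitrary illustrative threshold, not a recommended value.
    """
    return len((embedded_text or "").strip()) < min_chars


def extract_pages(
    embedded_texts: List[Optional[str]],
    ocr_page: Callable[[int], str],
    min_chars: int = 20,
) -> List[str]:
    """Return one string per page, calling ocr_page(index) only when needed."""
    results = []
    for index, embedded in enumerate(embedded_texts):
        if needs_ocr(embedded, min_chars):
            # No usable text layer: fall back to OCR for this page only.
            results.append(ocr_page(index))
        else:
            results.append(embedded.strip())
    return results


def format_pages(texts: List[str]) -> str:
    """Join page texts using the same page separator as the loop above."""
    return "".join(
        f"\n--- Page {num} ---\n{text}" for num, text in enumerate(texts, start=1)
    )
```

To wire it up, `embedded_texts` could come from a text extractor such as pypdf (`[p.extract_text() for p in PdfReader('k.pdf').pages]`), and `ocr_page` could be `lambda i: pytesseract.image_to_string(pages[i])` using the `pages` list from above. Both are suggestions, not something I've benchmarked on real documents.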
constantinum•7mo ago
Give https://pg.llmwhisperer.unstract.com/ a try.
