Ask HN: Best on device LLM tooling for PDFs?

4•martinald•5mo ago
I've got very used to using the "big" LLMs for analysing PDFs.

Now that llama.cpp has vision support, I tried out PDFs with it locally (via LM Studio), but the results weren't as good as I had hoped. One time it insisted it couldn't do "OCR", but gave me an example of what the data _could_ look like, which was in fact the actual data.

The other major problem is that some PDFs are actually made up of images, and it got super confused by those as well.

Given this is so new, I'm struggling to find any tools that make this easier.

Comments

raymond_goo•5mo ago
Try something like this

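  # Notebook (Colab/Jupyter) cell: lines starting with "!" run as shell commands.
  !pip install pytesseract pdf2image pillow
  !apt install -y poppler-utils   # pdf2image needs Poppler to rasterize pages
  #!apt install -y tesseract-ocr  # uncomment if the Tesseract binary is missing
  from pdf2image import convert_from_path
  import pytesseract

  # Render every page of the PDF to a PIL image at 300 dpi.
  pages = convert_from_path('k.pdf', dpi=300)

  # OCR each page and collect the text under a per-page header.
  chunks = []
  for page_num, img in enumerate(pages, start=1):
      text = pytesseract.image_to_string(img)
      chunks.append(f"\n--- Page {page_num} ---\n{text}")

  all_text = "".join(chunks)
  print(all_text)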
constantinum•5mo ago
Give https://pg.llmwhisperer.unstract.com/ a try.
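
For the mixed case raised in the question (some pages carry a real text layer, others are only images), one possible approach is to try the embedded text first and OCR only the pages where it comes back nearly empty. The following is a minimal sketch, not a tested pipeline. `text_with_ocr_fallback`, `extract_text`, `ocr_text`, and `min_chars` are all hypothetical names introduced here for illustration: you would supply the two callables yourself, for example `ocr_text` could wrap the `pytesseract.image_to_string` call from the OCR loop earlier in the thread. The 20-character threshold is an arbitrary assumption.

```python
from typing import Callable, List, Tuple


def text_with_ocr_fallback(
    num_pages: int,
    extract_text: Callable[[int], str],  # hypothetical: embedded text for a 0-based page index
    ocr_text: Callable[[int], str],      # hypothetical: OCR result for a 0-based page index
    min_chars: int = 20,                 # assumed cutoff for "this page has a usable text layer"
) -> List[Tuple[int, str, str]]:
    """Return (page_number, source, text) per page; source is "text" or "ocr"."""
    results = []
    for i in range(num_pages):
        text = extract_text(i) or ""
        if len(text.strip()) >= min_chars:
            # Enough embedded text: keep it and skip the slower OCR step.
            results.append((i + 1, "text", text))
        else:
            # Little or no embedded text: treat the page as an image and OCR it.
            results.append((i + 1, "ocr", ocr_text(i)))
    return results
```

Keeping the `source` label alongside each page makes it easy to see afterwards which pages went through OCR, which are the ones most worth spot-checking before handing the text to a local model.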
