Ask HN: Best on device LLM tooling for PDFs?

4•martinald•1y ago
I've got very used to using the "big" LLMs for analysing PDFs.

Now that llama.cpp has vision support, I tried PDFs with it locally (via LM Studio), but the results weren't as good as I'd hoped. One time it insisted it couldn't do "OCR", but gave me an example of what the data _could_ look like, which was in fact the actual data.

The other major problem is that some PDFs are actually made up of images, and it got very confused by those as well.

Given this is so new, I'm struggling to find any tools that make this easier.

Comments

raymond_goo•1y ago
Try something like this:

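  # Notebook cells (e.g. Colab): install the Python packages and system tools.
  !pip install pytesseract pdf2image pillow
  !apt install poppler-utils   # pdf2image needs poppler to render pages
  # Uncomment if the tesseract binary is not already installed:
  #!apt install tesseract-ocr

  from pdf2image import convert_from_path
  import pytesseract

  # Render every PDF page to an image at 300 dpi.
  pages = convert_from_path('k.pdf', dpi=300)

  # OCR each page and label the output with its page number.
  chunks = []
  for page_num, img in enumerate(pages, start=1):
      text = pytesseract.image_to_string(img)
      chunks.append(f"\n--- Page {page_num} ---\n{text}")

  all_text = "".join(chunks)
  print(all_text)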
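
For PDFs that mix real text with scanned pages, one option is to use the embedded text layer where it exists and only OCR the pages that lack one. A minimal sketch, assuming pypdf for the text layer (it isn't mentioned in this thread); the helper names and the 20-character threshold are illustrative and untuned:

```python
from typing import Callable, Iterable, List


def needs_ocr(embedded_text: str, min_chars: int = 20) -> bool:
    """Heuristic: a page with a (nearly) empty text layer is probably a scan."""
    return len(embedded_text.strip()) < min_chars


def extract_pages(
    embedded_texts: Iterable[str],
    ocr_page: Callable[[int], str],
    min_chars: int = 20,
) -> List[str]:
    """Keep the embedded text where present; call ocr_page(page_num) otherwise."""
    results = []
    for page_num, embedded in enumerate(embedded_texts, start=1):
        if needs_ocr(embedded, min_chars):
            results.append(ocr_page(page_num))
        else:
            results.append(embedded)
    return results


def pdf_to_text(path: str, dpi: int = 300) -> str:
    """Hybrid extraction: text layer first, OCR fallback per page."""
    # Third-party imports are local so the helpers above work without them.
    from pypdf import PdfReader
    from pdf2image import convert_from_path
    import pytesseract

    reader = PdfReader(path)
    embedded = [(page.extract_text() or "") for page in reader.pages]

    def ocr_page(page_num: int) -> str:
        # Render only the single page that needs OCR.
        images = convert_from_path(
            path, dpi=dpi, first_page=page_num, last_page=page_num
        )
        return pytesseract.image_to_string(images[0])

    texts = extract_pages(embedded, ocr_page)
    return "".join(
        f"\n--- Page {n} ---\n{t}" for n, t in enumerate(texts, start=1)
    )
```

Calling `print(pdf_to_text('k.pdf'))` would give the same page-labelled output as above, while skipping OCR on pages that already carry text. The character threshold is a rough guess and may misclassify pages with very little text, so treat it as a starting point.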
constantinum•1y ago
Give https://pg.llmwhisperer.unstract.com/ a try.