Ask HN: Best on device LLM tooling for PDFs?

4•martinald•1y ago
I've got very used to using the "big" LLMs for analysing PDFs.

Now that llama.cpp has vision support, I tried PDFs with it locally (via LM Studio), but the results weren't as good as I'd hoped. One time it insisted it couldn't do "OCR", but then gave me an example of what the data _could_ look like, which turned out to be the actual data.

The other major problem is that some PDFs are actually made up of images, and it got very confused by those as well.

Given this is all so new, I'm struggling to find any tools that make this easier.

Comments

raymond_goo•1y ago
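Try something like this:

  # Setup (shell commands; prefix each with "!" in a notebook):
  #   pip install pytesseract pdf2image pillow
  #   apt install poppler-utils    # pdf2image needs Poppler to render pages
  #   apt install tesseract-ocr    # pytesseract needs the Tesseract binary
  from pdf2image import convert_from_path
  import pytesseract

  # Render each PDF page to an image; 300 dpi is a common setting for OCR.
  pages = convert_from_path('k.pdf', dpi=300)

  # OCR each page and collect the text with a page marker in front.
  chunks = []
  for page_num, img in enumerate(pages, start=1):
      text = pytesseract.image_to_string(img)
      chunks.append(f"\n--- Page {page_num} ---\n{text}")

  all_text = "".join(chunks)
  print(all_text)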
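
For PDFs that mix real text with scanned pages, one possible refinement is to OCR only the pages whose embedded text layer is nearly empty. This is a rough sketch, not something tested on your files. It assumes pypdf is installed in addition to the packages above, and `looks_scanned`, `pick_text`, `extract_pdf`, and the 20-character threshold are made-up names and values for illustration.

```python
from typing import Callable, List


def looks_scanned(embedded_text: str, min_chars: int = 20) -> bool:
    """Guess that a page is an image if its text layer is (nearly) empty.

    min_chars is an arbitrary threshold picked for illustration.
    """
    return len(embedded_text.strip()) < min_chars


def pick_text(embedded: List[str], ocr_page: Callable[[int], str]) -> List[str]:
    """Keep embedded text where it exists; call ocr_page(index) otherwise."""
    return [
        ocr_page(i) if looks_scanned(text) else text
        for i, text in enumerate(embedded)
    ]


def extract_pdf(path: str, dpi: int = 300) -> List[str]:
    """Wire the helpers to pypdf + pdf2image + pytesseract (imported lazily)."""
    from pdf2image import convert_from_path
    from pypdf import PdfReader
    import pytesseract

    reader = PdfReader(path)
    embedded = [page.extract_text() or "" for page in reader.pages]

    def ocr_page(index: int) -> str:
        # pdf2image page numbers are 1-based, so shift the 0-based index.
        images = convert_from_path(
            path, dpi=dpi, first_page=index + 1, last_page=index + 1
        )
        return pytesseract.image_to_string(images[0])

    return pick_text(embedded, ocr_page)
```

Something like `print("\n".join(extract_pdf("k.pdf")))` would then give one block of text per page, and pages that already have a text layer skip the slow OCR step.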

constantinum•1y ago
give https://pg.llmwhisperer.unstract.com/ a try
