Ask HN: Best on device LLM tooling for PDFs?

4•martinald•1y ago
I've got very used to using the "big" LLMs for analysing PDFs.

Now that llama.cpp has vision support, I tried out PDFs with it locally (via LM Studio), but the results weren't as good as I'd hoped. One time it insisted it couldn't do "OCR", but gave me an example of what the data _could_ look like, which turned out to be the actual data.

The other major problem is that some PDFs are actually made up of images, and it got super confused on those as well.

Given this is so new, I'm struggling to find any tools which make this easier.

Comments

raymond_goo•1y ago
Try something like this:

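  # Notebook setup (Colab/Jupyter); in a plain shell, drop the leading "!"
  !pip install pytesseract pdf2image pillow
  # pdf2image needs Poppler to render PDF pages
  !apt install poppler-utils
  # pytesseract needs the Tesseract binary; uncomment if it is not installed
  #!apt install tesseract-ocr
  from pdf2image import convert_from_path
  import pytesseract

  # Render each PDF page to an image at 300 DPI
  pages = convert_from_path('k.pdf', dpi=300)

  # OCR page by page and label each page's text
  parts = []
  for page_num, img in enumerate(pages, start=1):
      text = pytesseract.image_to_string(img)
      parts.append(f"\n--- Page {page_num} ---\n{text}")

  all_text = "".join(parts)
  print(all_text)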
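
For PDFs that mix real text pages with scanned image pages, one possible refinement is to OCR only the pages whose text layer is empty. This is a rough sketch, not a tested pipeline: `needs_ocr`, `extract_pages`, `join_pages`, the `min_chars` threshold, and the `ocr_page` callback are all hypothetical names, and it assumes you have already pulled the per-page text layer out with whatever PDF library you prefer.

```python
from typing import Callable, List, Sequence


def needs_ocr(text_layer: str, min_chars: int = 20) -> bool:
    """Heuristic (assumption): a nearly empty text layer suggests a scanned page."""
    return len(text_layer.strip()) < min_chars


def extract_pages(
    text_layers: Sequence[str],
    ocr_page: Callable[[int], str],
    min_chars: int = 20,
) -> List[str]:
    """Return one string per page, calling OCR only where the text layer is missing.

    text_layers: per-page text already extracted by any PDF library (assumption).
    ocr_page: hypothetical callback taking a 0-based page index and returning
              OCR text, e.g. a wrapper around pytesseract.image_to_string.
    """
    results = []
    for index, layer in enumerate(text_layers):
        if needs_ocr(layer, min_chars):
            # Text layer is empty or too short: fall back to OCR for this page
            results.append(ocr_page(index))
        else:
            # Text layer looks usable: keep it and skip the slower OCR step
            results.append(layer.strip())
    return results


def join_pages(pages: Sequence[str]) -> str:
    """Join pages with the same '--- Page N ---' markers as the snippet above."""
    return "".join(
        f"\n--- Page {n} ---\n{text}" for n, text in enumerate(pages, start=1)
    )
```

In practice `ocr_page` could be something like `lambda i: pytesseract.image_to_string(pages[i])`, using the `pages` list from the snippet above. The 20-character threshold is a guess and probably needs tuning per document.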
constantinum•1y ago
Give https://pg.llmwhisperer.unstract.com/ a try.
