Ask HN: Best on device LLM tooling for PDFs?

4•martinald•1y ago
I've got very used to using the "big" LLMs for analysing PDFs.

Now that llama.cpp has vision support, I tried PDFs with it locally (via LM Studio), but the results weren't as good as I'd hoped. One time it insisted it couldn't do "OCR", but gave me an example of what the data _could_ look like, which was in fact the actual data.

The other major problem is that some PDFs are actually made up of images, and it got very confused by those as well.

Given this is so new, I'm struggling to find any tools that make this easier.

Comments

raymond_goo•1y ago
Try something like this:

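  # Notebook cell (Colab/Jupyter): lines starting with "!" run shell commands.
  !pip install pytesseract pdf2image pillow
  !apt install poppler-utils
  # Uncomment if the Tesseract binary is not already installed:
  #!apt install tesseract-ocr
  from pdf2image import convert_from_path
  import pytesseract

  # Render every PDF page to an image (pdf2image relies on Poppler for this).
  pages = convert_from_path('k.pdf', dpi=300)

  # OCR each page and label its text with the page number.
  chunks = []
  for page_num, img in enumerate(pages, start=1):
      text = pytesseract.image_to_string(img)
      chunks.append(f"\n--- Page {page_num} ---\n{text}")

  all_text = "".join(chunks)
  print(all_text)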
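
For PDFs that mix real text with image-only pages (the case the question describes), one possible refinement is to OCR only the pages whose embedded text layer is empty. The sketch below is an illustration under assumptions, not a tested pipeline: `needs_ocr`, `extract_pages`, `join_pages`, the `ocr_page` callback, and the 20-character threshold are all hypothetical names and values chosen for this example.

```python
from typing import Callable, List, Sequence


def needs_ocr(text_layer: str, min_chars: int = 20) -> bool:
    """Treat a page as scanned when its embedded text is (nearly) empty.

    The 20-character default is an arbitrary assumption; tune it per corpus.
    """
    return len(text_layer.strip()) < min_chars


def extract_pages(
    text_layers: Sequence[str],
    ocr_page: Callable[[int], str],
    min_chars: int = 20,
) -> List[str]:
    """Return one string per page, calling OCR only where text is missing.

    text_layers: embedded text for each page, from any PDF text extractor.
    ocr_page: hypothetical callback that takes a 0-based page index and
              returns the OCR text for that page.
    """
    results = []
    for index, layer in enumerate(text_layers):
        if needs_ocr(layer, min_chars):
            results.append(ocr_page(index))  # image-only page: fall back to OCR
        else:
            results.append(layer.strip())    # text layer present: use it as-is
    return results


def join_pages(pages: Sequence[str]) -> str:
    """Label each page with its 1-based number, as in the snippet above."""
    return "".join(
        f"\n--- Page {num} ---\n{text}" for num, text in enumerate(pages, start=1)
    )
```

In practice `ocr_page` could wrap `pytesseract.image_to_string(pages[i])` from the snippet above, and the text layers could come from a text extractor such as `pdftotext` (part of poppler-utils). Skipping OCR on pages that already carry text avoids both the extra runtime and the recognition errors OCR can introduce.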

constantinum•1y ago
Give https://pg.llmwhisperer.unstract.com/ a try.
