Ask HN: Best on device LLM tooling for PDFs?

4•martinald•1y ago
I've got very used to using the "big" LLMs for analysing PDFs.

Now that llama.cpp has vision support, I tried out PDFs with it locally (via LM Studio), but the results weren't as good as I'd hoped. One time it insisted it couldn't do "OCR", but gave me an example of what the data _could_ look like, which turned out to be the actual data.

The other major problem is that some PDFs are actually made up of images, and it got super confused by those as well.

Given this is so new, I'm struggling to find any tools that make this easier.

Comments

raymond_goo•1y ago
Try something like this:

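  # Notebook cells (Colab/Jupyter); run the "!" lines in a shell otherwise.
  !pip install pytesseract pdf2image pillow
  !apt install -y poppler-utils    # pdf2image needs Poppler to rasterise pages
  # !apt install -y tesseract-ocr  # uncomment if the Tesseract binary isn't installed
  from pdf2image import convert_from_path
  import pytesseract

  # Render each PDF page to an image; 300 dpi is a common choice for OCR.
  pages = convert_from_path('k.pdf', dpi=300)

  # OCR page by page, labelling each page in the combined output.
  all_text = ""
  for page_num, img in enumerate(pages, start=1):
      text = pytesseract.image_to_string(img)
      all_text += f"\n--- Page {page_num} ---\n{text}"

  print(all_text)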
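
For PDFs that mix real text pages with scanned ones (the second problem in the question), one possible variation is to read the embedded text layer first and OCR only the pages where it is missing. This is a rough sketch, not something from the thread: it assumes the pypdf package is installed alongside the ones above, and `needs_ocr`, `pdf_to_text` and the 20-character threshold are made-up names and values for illustration.

```python
def needs_ocr(extracted_text: str, min_chars: int = 20) -> bool:
    """Heuristic (assumed threshold): a page whose text layer is nearly
    empty is treated as image-only and sent to OCR."""
    return len(extracted_text.strip()) < min_chars


def pdf_to_text(path: str, dpi: int = 300) -> str:
    """Return the text of every page, using OCR only where the PDF has no text layer."""
    # Imported lazily so needs_ocr() can be used without these packages installed.
    from pypdf import PdfReader
    from pdf2image import convert_from_path
    import pytesseract

    reader = PdfReader(path)
    chunks = []
    for page_num, page in enumerate(reader.pages, start=1):
        # Embedded text is usually cleaner than OCR output when it exists.
        text = page.extract_text() or ""
        if needs_ocr(text):
            # Rasterise just this one page, then run Tesseract on it.
            images = convert_from_path(
                path, dpi=dpi, first_page=page_num, last_page=page_num
            )
            text = pytesseract.image_to_string(images[0])
        chunks.append(f"\n--- Page {page_num} ---\n{text}")
    return "".join(chunks)


if __name__ == "__main__":
    import sys

    # Usage: python pdf_text.py some.pdf
    if len(sys.argv) > 1:
        print(pdf_to_text(sys.argv[1]))
```

The resulting plain text could then be given to the local model directly, which sidesteps the vision path entirely. The threshold will probably need tuning for documents with very sparse pages.
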
constantinum•1y ago
Give https://pg.llmwhisperer.unstract.com/ a try.
