
Ask HN: Best on device LLM tooling for PDFs?

4•martinald•6mo ago
I've got very used to using the "big" LLMs for analysing PDFs.

Now that llama.cpp has vision support, I tried out PDFs with it locally (via LM Studio), but the results weren't as good as I had hoped. One time it insisted it couldn't do "OCR", but gave me an example of what the data _could_ look like, which turned out to be the actual data.

The other major problem is that some PDFs are actually made up of images, and it got very confused by those as well.

Given this is so new, I'm struggling to find any tools that make this easier.

Comments

raymond_goo•6mo ago
Try something like this:

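  # Setup (run in a shell, or prefix with ! in a notebook cell):
  #   pip install pytesseract pdf2image pillow
  #   apt install poppler-utils    # pdf2image needs Poppler
  #   apt install tesseract-ocr    # pytesseract needs the Tesseract binary
  from pdf2image import convert_from_path
  import pytesseract

  # Render each PDF page to an image at 300 dpi
  pages = convert_from_path('k.pdf', dpi=300)

  # OCR each page and label the output with its page number
  chunks = []
  for page_num, img in enumerate(pages, start=1):
      text = pytesseract.image_to_string(img)
      chunks.append(f"\n--- Page {page_num} ---\n{text}")

  all_text = "".join(chunks)
  print(all_text)
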
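
For the mixed case raised in the question (some pages carry a real text layer, others are just scanned images), one option is to OCR only the pages whose embedded text is missing or too short. This is a rough sketch using only the standard library. `needs_ocr`, `extract_pages`, the `min_chars` threshold, and the `ocr_page` callback are hypothetical names for illustration, not part of pytesseract or pdf2image. In practice `ocr_page` could wrap the `pytesseract.image_to_string` call above, and the text layers would come from whatever PDF text extractor you already use.

```python
from typing import Callable, List


def needs_ocr(text_layer: str, min_chars: int = 20) -> bool:
    """Return True when a page's embedded text is too short to trust.

    min_chars is an assumed threshold; tune it for your documents.
    """
    # Count only non-whitespace characters, so blank pages count as empty
    visible = sum(1 for ch in text_layer if not ch.isspace())
    return visible < min_chars


def extract_pages(
    text_layers: List[str],
    ocr_page: Callable[[int], str],
    min_chars: int = 20,
) -> List[str]:
    """Keep the embedded text where it exists, fall back to OCR otherwise.

    text_layers: embedded text per page (empty string for image-only pages).
    ocr_page: hypothetical callback that OCRs the page at a 0-based index.
    """
    results = []
    for index, layer in enumerate(text_layers):
        if needs_ocr(layer, min_chars):
            results.append(ocr_page(index))
        else:
            results.append(layer)
    return results


if __name__ == "__main__":
    # Stand-in OCR function so the sketch runs without Tesseract installed
    def fake_ocr(index: int) -> str:
        return f"[OCR text for page {index + 1}]"

    layers = ["This page already has a usable embedded text layer.", "", " \n "]
    for page_text in extract_pages(layers, fake_ocr):
        print(page_text)
```

Skipping OCR on pages that already have text avoids introducing recognition errors into content that was extractable as-is, and it is faster on long documents.
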
constantinum•6mo ago
Give https://pg.llmwhisperer.unstract.com/ a try.
