Ask HN: Best on device LLM tooling for PDFs?

4•martinald•6mo ago
I've got very used to using the "big" LLMs for analysing PDFs.

Now that llama.cpp has vision support, I tried out PDFs with it locally (via LM Studio), but the results weren't as good as I'd hoped. One time it insisted it couldn't do "OCR", but gave me an example of what the data _could_ look like, which was in fact the actual data.

The other major problem is that some PDFs are actually made up of images, and it got super confused by those as well.

Given this is so new, I'm struggling to find any tools which make this easier.

Comments

raymond_goo•6mo ago
Try something like this

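  # Notebook cells (Jupyter/Colab): install the Python packages and poppler.
  !pip install pytesseract pdf2image pillow
  !apt install poppler-utils
  # Uncomment if the tesseract binary is not already installed:
  #!apt install tesseract-ocr
  from pdf2image import convert_from_path
  import pytesseract

  # Render every PDF page to an image (needs poppler).
  pages = convert_from_path('k.pdf', dpi=300)

  # OCR each page image and label the text with its page number.
  chunks = []
  for page_num, img in enumerate(pages, start=1):
      text = pytesseract.image_to_string(img)
      chunks.append(f"\n--- Page {page_num} ---\n{text}")

  all_text = "".join(chunks)
  print(all_text)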
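
For the mixed case mentioned in the question (some pages have a real text layer, others are only images), one option is to read the text layer where it exists and run OCR only on the pages where it is missing. The following is a minimal sketch of that idea, not a tested pipeline. It assumes pypdf is installed alongside the packages above. The names `needs_ocr`, `merge_pages`, `pdf_to_text` and the 20-character threshold are hypothetical, chosen only for illustration.

```python
from typing import Callable, List


def needs_ocr(embedded_text: str, min_chars: int = 20) -> bool:
    """Return True when a page's text layer is empty or nearly empty.

    min_chars is an arbitrary, hypothetical threshold; tune it for your files.
    """
    return len(embedded_text.strip()) < min_chars


def merge_pages(
    embedded_texts: List[str],
    ocr_page: Callable[[int], str],
    min_chars: int = 20,
) -> str:
    """Use the text layer where present; call ocr_page(page_num) otherwise."""
    chunks = []
    for page_num, embedded in enumerate(embedded_texts, start=1):
        text = ocr_page(page_num) if needs_ocr(embedded, min_chars) else embedded
        chunks.append(f"\n--- Page {page_num} ---\n{text}")
    return "".join(chunks)


def pdf_to_text(path: str, dpi: int = 300) -> str:
    """Wire the helpers to pypdf (text layer) and Tesseract (OCR fallback)."""
    # Imported lazily so the helpers above work without these packages.
    from pdf2image import convert_from_path
    from pypdf import PdfReader
    import pytesseract

    reader = PdfReader(path)
    # Fall back to "" so pages without a usable text layer count as empty.
    embedded_texts = [page.extract_text() or "" for page in reader.pages]

    def ocr_page(page_num: int) -> str:
        # Render only the requested page (1-based) to keep memory use down.
        image = convert_from_path(
            path, dpi=dpi, first_page=page_num, last_page=page_num
        )[0]
        return pytesseract.image_to_string(image)

    return merge_pages(embedded_texts, ocr_page)
```

Calling `print(pdf_to_text('k.pdf'))` would then give the same page-separated output as the snippet above, but OCR only runs on pages that look image-only. A plain character count is a crude test. A scanned page with a poor hidden text layer would pass it and skip OCR, so the threshold is worth checking against real files.
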
constantinum•6mo ago
Give https://pg.llmwhisperer.unstract.com/ a try.
