Ask HN: Best on device LLM tooling for PDFs?

4•martinald•6mo ago
I've got very used to using the "big" LLMs for analysing PDFs.

Now that llama.cpp has vision support, I tried out PDFs with it locally (via LM Studio), but the results weren't as good as I'd hoped. One time it insisted it couldn't do "OCR", but gave me an example of what the data _could_ look like, which was the actual data.

The other major problem is that some PDFs are actually made up of images, and it got super confused by those as well.

Given this is so new, I'm struggling to find any tools that make this easier.

Comments

raymond_goo•6mo ago
Try something like this

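  # Notebook cell (Colab/Jupyter): lines starting with "!" run as shell commands
  !pip install pytesseract pdf2image pillow
  !apt install poppler-utils
  #!apt install tesseract-ocr
  from pdf2image import convert_from_path
  import pytesseract

  # Render every PDF page to an image (pdf2image needs poppler for this)
  pages = convert_from_path('k.pdf', dpi=300)

  # OCR each page image and append its text under a page marker
  all_text = ""
  for page_num, img in enumerate(pages, start=1):
      text = pytesseract.image_to_string(img)
      all_text += f"\n--- Page {page_num} ---\n{text}"

  print(all_text)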
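
For the mixed case raised in the question, where some PDFs carry a real text layer and others are just page images, one possible variation on the snippet above is to OCR only the pages whose embedded text looks empty. This is a sketch under assumptions, not a tested recipe: `pypdf` is not mentioned anywhere in the thread and is swapped in here to read the text layer, and the helper names (`merge_text_and_ocr`, `extract_pdf_text`) and the `min_chars=20` threshold are made up for illustration.

```python
from typing import Callable, Iterable, List


def merge_text_and_ocr(
    embedded: Iterable[str],
    ocr_page: Callable[[int], str],
    min_chars: int = 20,  # hypothetical threshold for "this page has no usable text layer"
) -> List[str]:
    """Per page, keep the embedded text if it looks usable, otherwise fall back to OCR."""
    result = []
    for index, text in enumerate(embedded):
        text = (text or "").strip()
        if len(text) < min_chars:
            # Probably a scanned/image-only page: ask the OCR callback for this page.
            text = ocr_page(index).strip()
        result.append(text)
    return result


def extract_pdf_text(path: str, dpi: int = 300) -> str:
    """Hypothetical wrapper: text layer via pypdf, OCR fallback via pdf2image + pytesseract."""
    # Imported lazily so the helper above can be used without these packages installed.
    from pdf2image import convert_from_path
    from pypdf import PdfReader  # assumption: pypdf is installed (pip install pypdf)
    import pytesseract

    reader = PdfReader(path)
    embedded = [page.extract_text() or "" for page in reader.pages]

    def ocr_page(index: int) -> str:
        # pdf2image page numbers are 1-based; render just the page that needs OCR.
        images = convert_from_path(
            path, dpi=dpi, first_page=index + 1, last_page=index + 1
        )
        return pytesseract.image_to_string(images[0])

    pages = merge_text_and_ocr(embedded, ocr_page)
    return "".join(f"\n--- Page {n} ---\n{t}" for n, t in enumerate(pages, start=1))
```

Calling `print(extract_pdf_text('k.pdf'))` would give output in the same shape as the snippet above. Skipping OCR on pages that already have text avoids re-recognising characters the PDF stores exactly, though a page that mixes real text with scanned figures would still lose whatever is inside the images.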
constantinum•6mo ago
give https://pg.llmwhisperer.unstract.com/ a try
