Ask HN: Best on device LLM tooling for PDFs?

4•martinald•11mo ago
I've got very used to using the "big" LLMs for analysing PDFs.

Now that llama.cpp has vision support, I tried out PDFs with it locally (via LM Studio), but the results weren't as good as I'd hoped. One time it insisted it couldn't do "OCR", but gave me an example of what the data _could_ look like, which was the actual data.

The other major problem is that some PDFs are actually made up of images, and it got super confused by those as well.

Given this is so new, I'm struggling to find any tools that make this easier.

Comments

raymond_goo•11mo ago
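Try something like this in a notebook:

  # Notebook shell commands: Python packages plus the poppler system dependency
  !pip install pytesseract pdf2image pillow
  !apt install poppler-utils
  # Uncomment if the tesseract binary is not already installed
  #!apt install tesseract-ocr
  from pdf2image import convert_from_path
  import pytesseract

  # Render each PDF page to an image (300 DPI is a reasonable OCR resolution)
  pages = convert_from_path('k.pdf', dpi=300)

  # OCR every page and label its text with the page number
  all_text = ""
  for page_num, img in enumerate(pages, start=1):
      text = pytesseract.image_to_string(img)
      all_text += f"\n--- Page {page_num} ---\n{text}"

  print(all_text)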
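
For the mixed case the question mentions (some pages have a real text layer, others are just images), one possible extension is to OCR only the pages that need it. This is a minimal sketch, not a tested pipeline: `needs_ocr`, `merge_pages`, and the `MIN_CHARS` threshold are hypothetical names and values chosen for illustration, and the caller supplies both the embedded text and the OCR function.

```python
from typing import Callable, Iterable, List

MIN_CHARS = 20  # assumed threshold; below this a page is treated as scanned


def needs_ocr(embedded_text: str, min_chars: int = MIN_CHARS) -> bool:
    """Return True when a page's text layer is missing or too short to trust."""
    return len(embedded_text.strip()) < min_chars


def merge_pages(
    embedded_texts: Iterable[str],
    ocr_page: Callable[[int], str],
    min_chars: int = MIN_CHARS,
) -> str:
    """Join page texts, calling ocr_page(page_num) only for pages that need it."""
    parts: List[str] = []
    for page_num, embedded in enumerate(embedded_texts, start=1):
        # Keep the embedded text when it looks usable; otherwise fall back to OCR.
        text = ocr_page(page_num) if needs_ocr(embedded, min_chars) else embedded
        parts.append(f"\n--- Page {page_num} ---\n{text}")
    return "".join(parts)
```

Here `ocr_page` could wrap `pytesseract.image_to_string(pages[n - 1])` from the snippet above, and `embedded_texts` could come from a PDF text extractor (for example pypdf's `page.extract_text()`, which is an assumption about your setup rather than something the thread used).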
constantinum•11mo ago
Give https://pg.llmwhisperer.unstract.com/ a try.