Ask HN: Best on device LLM tooling for PDFs?

4•martinald•11mo ago
I've got very used to using the "big" LLMs for analysing PDFs.

Now that llama.cpp has vision support, I tried it on PDFs locally (via LM Studio), but the results weren't as good as I'd hoped. One time it insisted it couldn't do "OCR", but gave me an example of what the data _could_ look like, which was in fact the actual data.

The other major problem is that some PDFs are actually made up of images, and it got super confused by those as well.

Given this is so new, I'm struggling to find any tools that make this easier.

Comments

raymond_goo•11mo ago
Try something like this:

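  # Notebook (Colab/Jupyter) cells: install the Python packages.
  !pip install pytesseract pdf2image pillow
  # pdf2image needs Poppler to render PDF pages to images.
  !apt install poppler-utils
  # pytesseract needs the tesseract binary; uncomment if it is not installed.
  #!apt install tesseract-ocr
  from pdf2image import convert_from_path
  import pytesseract

  # Render every page of the PDF to an image at 300 DPI.
  pages = convert_from_path('k.pdf', dpi=300)

  # OCR each page and label its text with the page number.
  chunks = []
  for page_num, img in enumerate(pages, start=1):
      text = pytesseract.image_to_string(img)
      chunks.append(f"\n--- Page {page_num} ---\n{text}")

  all_text = "".join(chunks)
  print(all_text)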
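
For the image-only PDFs mentioned in the question, one option is to OCR only the pages that lack a usable text layer. A minimal sketch, assuming you already have each page's extracted text as a string (for example from pypdf's `page.extract_text()`); the function name `pages_needing_ocr` and the `min_chars` threshold are made up for illustration and would need tuning:

```python
def pages_needing_ocr(page_texts, min_chars=25):
    """Return 1-based numbers of pages whose text layer is empty or nearly empty.

    page_texts: one string (or None) per page, as returned by a PDF text extractor.
    min_chars:  hypothetical threshold; pages with fewer non-whitespace
                characters than this are treated as scanned images.
    """
    flagged = []
    for page_num, text in enumerate(page_texts, start=1):
        # Count only non-whitespace characters so blank lines don't count as text.
        visible = "".join((text or "").split())
        if len(visible) < min_chars:
            flagged.append(page_num)
    return flagged


# Illustrative input: page 1 has a real text layer, pages 2 and 3 do not.
sample = [
    "Invoice 2024-001\nTotal due: 41.50 EUR, payable within 30 days",
    "",
    "  \n ",
]
print(pages_needing_ocr(sample))  # [2, 3]
```

Only the flagged pages would then go through the image conversion and OCR step, while the rest keep their original text layer.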
constantinum•11mo ago
Give https://pg.llmwhisperer.unstract.com/ a try.
