Ask HN: Best on device LLM tooling for PDFs?

4•martinald•9mo ago
I've got very used to using the "big" LLMs for analysing PDFs.

Now that llama.cpp has vision support, I tried out PDFs with it locally (via LM Studio), but the results weren't as good as I'd hoped. One time it insisted it couldn't do "OCR", but then gave me an example of what the data _could_ look like, which was in fact the actual data.

The other major problem is that some PDFs are actually made up of images, and it got super confused by those as well.

Given this is so new, I'm struggling to find any tools that make this easier.

Comments

raymond_goo•9mo ago
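Try something like this:

  # Notebook cell (Colab/Jupyter): lines starting with "!" run as shell commands.
  !pip install pytesseract pdf2image pillow
  !apt install poppler-utils
  # pytesseract also needs the Tesseract binary; uncomment if it is not installed.
  #!apt install tesseract-ocr
  from pdf2image import convert_from_path
  import pytesseract

  # Render every PDF page to an image (pdf2image uses poppler for this).
  pages = convert_from_path('k.pdf', dpi=300)

  # OCR each page image and label its text with the page number.
  all_text = ""
  for page_num, img in enumerate(pages, start=1):
      text = pytesseract.image_to_string(img)
      all_text += f"\n--- Page {page_num} ---\n{text}"

  print(all_text)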
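
Since the question mentions that some PDFs are really just images, one possible refinement of the snippet above is to OCR only the pages that have no usable text layer. This is a rough sketch, not something from the thread: the names `needs_ocr` and `extract_document`, the `min_chars` threshold, and the idea of passing in an `ocr_page` callback are all assumptions made for illustration.

```python
from typing import Callable, Sequence


def needs_ocr(embedded_text: str, min_chars: int = 20) -> bool:
    """Treat a page as image-only if its text layer is (nearly) empty.

    min_chars is an arbitrary illustrative threshold, not a tuned value.
    """
    return len(embedded_text.strip()) < min_chars


def extract_document(
    embedded_texts: Sequence[str],
    ocr_page: Callable[[int], str],
    min_chars: int = 20,
) -> str:
    """Combine per-page text, calling ocr_page only where it is needed.

    embedded_texts: text already extracted from each page's text layer.
    ocr_page: hypothetical callback taking a 1-based page number and
              returning OCR text for that page.
    """
    parts = []
    for page_num, embedded in enumerate(embedded_texts, start=1):
        if needs_ocr(embedded, min_chars):
            text = ocr_page(page_num)  # scanned page: fall back to OCR
        else:
            text = embedded  # real text layer: use it as-is
        parts.append(f"\n--- Page {page_num} ---\n{text}")
    return "".join(parts)
```

In practice `embedded_texts` could come from a PDF library's text extraction (for example pypdf's `page.extract_text()`), and `ocr_page` could wrap `pytesseract.image_to_string` on the matching image from `convert_from_path`. Both pairings are assumptions here and would need checking against your own files.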
constantinum•9mo ago
Give https://pg.llmwhisperer.unstract.com/ a try.
