
What are people's experiences with using LLMs to mine information from scientific papers?

My own experience: I first attempted to extract the anti-drug antibody (ADA) rate from each of 3,730 clinical-trial papers, all indexed in PubMed. I started from PDFs. Claude Opus 4.7 analyzed each PDF against a written rules doc that we had formulated. Running all the papers took about a week because I kept hitting session limits; the total cost was ~$25 (USD). We got actual rates from 909 papers. In most of the rest, the rate was either not reported or did not meet our criteria, one of which was that only one drug be administered at a time.

I read thirty of the papers and re-read those where I got a different answer from Claude, concluding that it had erred one time and I had erred three times.
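For anyone who wants the numbers above as rates, here is a small worked calculation. It uses only the figures quoted in this post (the ~$25 total is approximate), and a hand-checked sample of 30 is small, so the accuracy figures are rough.

```python
# Figures quoted in the post; nothing here is newly measured.
total_papers = 3730
papers_with_rate = 909
total_cost_usd = 25.0   # approximate, per the post
checked = 30            # papers read by hand
claude_errors = 1
human_errors = 3

yield_fraction = papers_with_rate / total_papers       # share of papers that gave a rate
cost_per_paper = total_cost_usd / total_papers          # USD per paper analyzed
claude_accuracy = (checked - claude_errors) / checked   # on the hand-checked sample
human_accuracy = (checked - human_errors) / checked     # my own first-pass reading

print(f"yield: {yield_fraction:.1%}")
print(f"cost per paper: ${cost_per_paper:.4f}")
print(f"Claude accuracy on sample: {claude_accuracy:.1%}")
print(f"human accuracy on sample: {human_accuracy:.1%}")
```

That works out to roughly a quarter of the papers yielding a rate, well under a cent per paper, and about 97% versus 90% accuracy on the 30 papers checked.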

So this works, but it is not entirely convenient: session limits mean that I can't start a run and walk away, or at least I don't know how to engineer that. I was also curious how local models would perform.
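One way the "start it and walk away" part might be engineered is a loop that checkpoints each result and backs off when a limit is hit. This is a minimal sketch, not what I actually ran: `extract_fn`, the `RateLimited` exception, and the JSONL results file are all hypothetical stand-ins for whatever the real extraction step and its limit signal look like.

```python
import json
import time
from pathlib import Path


class RateLimited(Exception):
    """Hypothetical signal that the backend wants us to wait before retrying."""


def load_done(results_path):
    """Return the set of paper IDs already recorded in the JSONL results file."""
    path = Path(results_path)
    if not path.exists():
        return set()
    with path.open() as f:
        return {json.loads(line)["paper_id"] for line in f if line.strip()}


def run_batch(paper_ids, extract_fn, results_path,
              max_retries=5, base_delay=60.0, sleep=time.sleep):
    """Process papers one at a time, appending each result so a restart resumes."""
    done = load_done(results_path)
    for paper_id in paper_ids:
        if paper_id in done:
            continue  # already processed in an earlier run
        for attempt in range(max_retries):
            try:
                result = extract_fn(paper_id)
                break
            except RateLimited:
                # Exponential backoff: 60 s, 120 s, 240 s, ...
                sleep(base_delay * 2 ** attempt)
        else:
            # Retries exhausted: stop cleanly; a later run picks up from here.
            return False
        with open(results_path, "a") as f:
            f.write(json.dumps({"paper_id": paper_id, "result": result}) + "\n")
        done.add(paper_id)
    return True
```

Because every result is appended as soon as it is produced, killing the process or hitting a limit loses at most the paper in flight, and rerunning the same command skips everything already in the file.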

To that end I tried Llama 3.3 70B on my Mac (M5 Max, 128 GB memory). I ran it through Ollama with Q4_K_M quantization and a 128k context; the input was ~80k tokens after pdftotext -layout.
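For concreteness, the local setup can be sketched roughly like this. It assumes Ollama is serving on its default local port and that the model is installed under the tag `llama3.3:70b` (an assumption; the name on a given machine may differ). The prompt wording and the helper names are illustrative, not the actual rules doc.

```python
import json
import subprocess
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint
MODEL_TAG = "llama3.3:70b"  # assumed tag; check `ollama list` for the local name


def pdf_to_text(pdf_path):
    """Run `pdftotext -layout` and return the extracted text from stdout."""
    proc = subprocess.run(
        ["pdftotext", "-layout", pdf_path, "-"],
        capture_output=True, text=True, check=True,
    )
    return proc.stdout


def build_payload(paper_text, rules_text, num_ctx=131072):
    """Build a non-streaming /api/generate request body with a 128k context."""
    prompt = (
        f"{rules_text}\n\n--- PAPER ---\n{paper_text}\n--- END PAPER ---\n"
        "Report the ADA rate, or 'not reported'."  # illustrative instruction
    )
    return {
        "model": MODEL_TAG,
        "prompt": prompt,
        "stream": False,
        "options": {"num_ctx": num_ctx, "temperature": 0},
    }


def ask_ollama(payload, timeout=3600):
    """POST the payload to the local Ollama server and return the reply text."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read())["response"]
```

Usage would be something like `ask_ollama(build_payload(pdf_to_text("paper.pdf"), rules))`. Passing `num_ctx` explicitly matters here: Ollama's default context window is much smaller than 128k, and a prompt longer than the window gets truncated, so it is worth confirming the setting is actually applied.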

One paper took 18 minutes, and the model was unable to determine the ADA rate even though it is clearly stated in the paper. One paper is not a proper benchmark, but at this speed a proper test is impractical. Clearly part of the speed issue is that Claude runs on a server farm, whereas I'm running on just one Mac. This is part of the practical problem that anyone would face with local computation.
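To put the speed in perspective, here is a back-of-the-envelope estimate for the full set. It assumes the single 18-minute measurement is representative of every paper, which it may well not be.

```python
minutes_per_paper = 18   # the single local measurement in the post
total_papers = 3730

total_minutes = minutes_per_paper * total_papers
total_days = total_minutes / (60 * 24)

print(f"{total_minutes} minutes, about {total_days:.1f} days of continuous running")
```

That is on the order of a month and a half of nonstop compute on one machine, compared with about a week (session-limit waits included) for the Claude run.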

What is the state of the art on this type of problem, for answering questions one paper at a time or using many papers at once? I'd love to hear success stories!

Ask HN: Mining Scientific Papers
