We had previously shown that this helps with research work and wanted to understand whether it also helps with everyday software engineering tasks. We built nine tasks to measure this and compared a coding agent alone (Opus 4.6, baseline) against the same agent with Paper Lantern access.
(Blog post with full breakdown: https://www.paperlantern.ai/blog/coding-agent-benchmarks)
Some interesting results: 1. We asked the agent to write tests that maximize mutation score (the fraction of injected bugs caught by the test suite). The baseline caught 63% of injected bugs. Baseline + Paper Lantern found mutation-aware prompting techniques in recent research (MuTAP, Aug 2023; MUTGEN, Jun 2025), which suggest enumerating every possible mutation via AST analysis and then writing tests to target each one. This caught 87%.
2. Extracting legal clauses from 50 contracts. The baseline sent the full document to the LLM and correctly extracted 44% of clauses. Baseline + Paper Lantern found two papers from March 2026 (BEAVER, for section-level relevance scoring, and PAVE, for post-extraction validation). Accuracy jumped to 76%.
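To make result 1 concrete, here is a minimal sketch of what AST-based mutation enumeration looks like. This is not the MuTAP/MUTGEN implementation, just an illustration of the idea: walk a function's AST and list every operator a mutation tool could flip, so each site becomes a target for a test.

```python
import ast

# A tiny set of operator swaps, a simplified stand-in for the mutation
# operators that mutation-testing tools enumerate.
MUTATIONS = {
    ast.Add: ast.Sub, ast.Sub: ast.Add,
    ast.Lt: ast.LtE, ast.LtE: ast.Lt,
    ast.Gt: ast.GtE, ast.GtE: ast.Gt,
}

def enumerate_mutants(source: str):
    """Yield (lineno, original_op, mutated_op) for every mutable operator."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, ast.BinOp):
            ops = [node.op]
        elif isinstance(node, ast.Compare):
            ops = node.ops
        else:
            continue
        for op in ops:
            replacement = MUTATIONS.get(type(op))
            if replacement is not None:
                yield (node.lineno, type(op).__name__, replacement.__name__)

src = """
def clamp(x, lo, hi):
    if x < lo:
        return lo
    return x + 0 if x <= hi else hi
"""

# Each yielded site is one mutant a targeted test should kill.
for site in enumerate_mutants(src):
    print(site)
```

In a full pipeline, each enumerated site would be applied to produce a mutant program, and the agent would be prompted to write a test that fails on that mutant but passes on the original.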
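The score-then-validate pipeline in result 2 can be sketched in a few lines. The scoring and validation below are crude stand-ins (keyword overlap and verbatim matching) for whatever BEAVER and PAVE actually do; the point is the shape of the pipeline: rank sections before sending them to the LLM, then reject extracted clauses that don't appear in the source.

```python
def score_sections(sections: list[str], query_terms: set[str]) -> list[tuple[float, str]]:
    """Rank sections by the fraction of query terms they contain
    (a crude stand-in for section-level relevance scoring)."""
    scored = []
    for sec in sections:
        words = set(sec.lower().split())
        scored.append((len(query_terms & words) / len(query_terms), sec))
    return sorted(scored, reverse=True)

def validate_clause(clause: str, section: str) -> bool:
    """Post-extraction validation: accept a clause only if it appears
    verbatim in the section it was supposedly extracted from."""
    return clause.lower() in section.lower()

sections = [
    "Section 4. Payment terms are net 30 days from invoice.",
    "Section 9. The vendor shall indemnify the client against all claims.",
]
query_terms = {"indemnify", "claims"}

# Only the top-ranked section goes to the LLM; extracted clauses that
# fail validation are discarded instead of being reported as hits.
ranked = score_sections(sections, query_terms)
print(ranked[0][1])
```

Filtering out irrelevant sections shrinks the context the LLM has to search, and the validation step turns hallucinated clauses into rejections rather than false positives, which is where the accuracy gain comes from.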
Five of the nine tasks improved by 30-80%, and the difference came down to technique selection: ten of the 15 most-cited papers across all experiments were published in 2025 or later.
Everything is open source: https://github.com/paperlantern-ai/paper-lantern-challenges
Each experiment has its own README with detailed results and an approach.md showing exactly what Paper Lantern surfaced and how the agent used it.
Quick setup: `npx paperlantern@latest`
vunderba•51m ago
In my experience it's been a better solution than just asking the LLM directly to search the web for this kind of information via search engine tooling.
Also just FYI the link provided in your Show HN (https://github.com/paper-lantern-ai/paper-lantern-challenges) is a 404. I think it should be:
https://github.com/paperlantern-ai/paper-lantern-challenges
paperlantern•12m ago
thanks for catching the link issue!
if you get a chance to try it out with your coding agents, i'd love to hear what you think.