Show HN: Paper Lantern – improving Autoresearch with research knowledge

https://www.paperlantern.ai/code

2•paperlantern•1h ago

Hi, we've been working on Paper Lantern - an MCP server that searches 2M+ CS research papers for coding agents. The coding agent describes its problem and PL returns ranked techniques with implementation steps, hyperparameters, and failure modes.

We tested it on Karpathy's autoresearch framework, where the task is to find better LLM architectures and training configs. In autoresearch, the agent proposes an optimization, does a 5-minute training run, computes the val loss, and then keeps the change if the val loss went down or discards it if the val loss went up.
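For readers who haven't seen autoresearch, the keep/discard loop described above can be sketched roughly like this. This is a minimal illustration, not the framework's actual code: `keep_or_discard_loop`, `toy_propose`, and `toy_val_loss` are hypothetical names, and the toy loss stands in for a real 5-minute training run.

```python
import random
from typing import Callable, Dict, Tuple

Config = Dict[str, float]


def keep_or_discard_loop(
    base: Config,
    propose: Callable[[Config], Config],
    evaluate: Callable[[Config], float],
    steps: int,
) -> Tuple[Config, float]:
    """Greedy search: keep a proposal only if it lowers the val loss."""
    best, best_loss = dict(base), evaluate(base)
    for _ in range(steps):
        candidate = propose(best)
        loss = evaluate(candidate)  # stands in for one short training run
        if loss < best_loss:
            best, best_loss = candidate, loss  # keep
        # otherwise discard and continue from the current best
    return best, best_loss


def toy_val_loss(cfg: Config) -> float:
    # Hypothetical stand-in for a training run: minimum at lr = 0.01.
    return (cfg["lr"] - 0.01) ** 2


def toy_propose(cfg: Config, rng=random.Random(0)) -> Config:
    # Hypothetical proposer: perturb the learning rate by up to +/-20%.
    return {**cfg, "lr": cfg["lr"] * rng.uniform(0.8, 1.2)}


start = {"lr": 0.05}
best, best_loss = keep_or_discard_loop(start, toy_propose, toy_val_loss, steps=50)
```

In the real setup the proposer is the coding agent and each evaluation is an actual training run, so the number of steps is limited by compute rather than by the loop itself.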

We compared a strong baseline agent (Opus 4.6 + web search) vs that same agent + Paper Lantern.

  - agent + Paper Lantern iterated to a config that got a much lower val loss on 5-min runs  

  - we trained the two final configs for 2 hours: the config from Paper Lantern got a 3.2% lower val loss

Two concrete examples:

  1. Both agents tried halving the batch size. The Paper Lantern agent pulled a 2022 paper and scaled the learning rate by 1/sqrt(2) as the paper prescribed. It worked, and further halving kept working. The web-search agent made the same batch change, got worse loss, and moved on without diagnosing the LR.

  2. The Paper Lantern agent also implemented AdaGC (adaptive gradient clipping, arxiv 2502.11034, published Feb 2025), and it worked on the first try with no tuning. The baseline agent did not try it at all.
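To make the first example concrete, here is a small sketch of the learning-rate adjustment. It assumes a square-root scaling rule (LR proportional to the square root of the batch size), which is consistent with the 1/sqrt(2) factor mentioned above; the post doesn't name the 2022 paper, so treat the exact rule and the `scale_lr_sqrt` helper as assumptions.

```python
import math


def scale_lr_sqrt(base_lr: float, base_batch: int, new_batch: int) -> float:
    """Assumed square-root rule: LR scales with sqrt(new_batch / base_batch)."""
    return base_lr * math.sqrt(new_batch / base_batch)


base_lr, base_batch = 1e-3, 64  # illustrative values, not from the experiment

# Halving the batch size multiplies the LR by 1/sqrt(2).
lr_half = scale_lr_sqrt(base_lr, base_batch, base_batch // 2)

# Halving again compounds: two halvings give a factor of 1/2 overall.
lr_quarter = scale_lr_sqrt(base_lr, base_batch, base_batch // 4)
```

Under this rule each further halving applies the same 1/sqrt(2) factor, which fits the observation that repeated halving kept working once the LR was adjusted alongside it.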

If you want to deep-dive:

  - (code) https://github.com/paperlantern-ai/autoresearch-experiment

  - (blog) https://www.paperlantern.ai/blog/autoresearch

If you want to try Paper Lantern yourself:

  - Quick setup: `npx paperlantern@latest`

Comments

parima08•1h ago

That's an impressive jump in performance by providing the agent with access to relevant literature.

Is there a breakdown of which wins came from hyperparameter values (where BO would likely match this) vs. wins from techniques the agent wouldn’t have tried without the paper?

paperlantern•1h ago

yes - the blog post has a figure showing all the improvements and how big they were.

also, sometimes the baseline agent tries the same idea but doesn't get as big a boost as the baseline + Paper Lantern agent. We looked into it and found the reason: the baseline tries changes in isolation, whereas the research-backed ideas account for the interactions between parameters and suggest multiple changes at the same time, which the baseline agent never discovers.
