The pipeline chains five agents: a Retriever that selects reference diagrams, a Planner that generates a textual description of the figure, a Stylist that refines that description for visual aesthetics, a Visualizer that renders the image (Gemini for diagrams, Matplotlib for plots), and a Critic that evaluates the result and triggers iterative refinement.
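For anyone who wants the control flow at a glance, here is a rough Python sketch of how five stages like these could be chained. It is not the repo's actual API: every function name, the `Result` type, and the `max_rounds` parameter are hypothetical, and each agent is a stub standing in for a model call. It also assumes that the Critic's feedback is folded back into the description before the Visualizer reruns, which is only one plausible reading of "iterative refinement".

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Result:
    description: str  # final text description handed to the renderer
    image: str        # stand-in for the rendered image
    rounds: int       # number of render/critique rounds actually used
    accepted: bool    # whether the critic approved the last render


# Hypothetical stubs: each one stands in for an LLM or rendering call.
def retriever(source_text: str) -> List[str]:
    return ["reference-diagram-a", "reference-diagram-b"]


def planner(source_text: str, references: List[str]) -> str:
    return f"Figure for '{source_text}' guided by {len(references)} references"


def stylist(description: str) -> str:
    return description + " [styled]"


def visualizer(description: str) -> str:
    return f"<render of: {description}>"


def critic(image: str, round_index: int) -> Tuple[bool, str]:
    # Stub policy: reject the first render, accept every later one.
    if round_index >= 1:
        return True, ""
    return False, "increase label contrast"


def run_pipeline(source_text: str, max_rounds: int = 3) -> Result:
    if max_rounds < 1:
        raise ValueError("max_rounds must be at least 1")

    # Stages 1-3 run once: retrieve references, plan, then style.
    references = retriever(source_text)
    description = stylist(planner(source_text, references))

    image = ""
    accepted = False
    rounds = 0
    # Stages 4-5 loop: render, critique, and revise until accepted
    # or until the round budget is spent.
    for round_index in range(max_rounds):
        rounds = round_index + 1
        image = visualizer(description)
        accepted, feedback = critic(image, round_index)
        if accepted:
            break
        # Assumption: feedback is appended to the description for the next render.
        description = f"{description} | revise: {feedback}"

    return Result(description, image, rounds, accepted)


result = run_pipeline("encoder-decoder method section")
print(result.rounds, result.accepted)
```

The round cap matters in a design like this: without it, a critic that never approves would keep the render loop (and the API bill) running indefinitely.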
It uses Google Gemini as the default backend (the free tier works) and ships with an MCP server, so you can use it directly from Claude Code or Cursor. Quick start, from a local clone of the repo: pip install -e ".[dev,google]" and then paperbanana generate --input method.txt --output diagram.png
Repo: https://github.com/llmsresearch/paperbanana
This is an unofficial reimplementation and will differ from the original system. I plan to link to the official release once it drops. Happy to answer questions about the architecture or prompt engineering decisions.