Producing publication-ready illustrations is a labor-intensive bottleneck in the research workflow. While AI scientists can now handle literature reviews and code, they struggle to communicate complex discoveries visually. A research team from Google and Peking University has introduced a new framework called ‘PaperBanana’ that aims to change this by using a multi-agent system to automate high-quality academic diagrams and plots.


5 Specialized Agents: The Architecture
PaperBanana does not rely on a single prompt. It orchestrates a collaborative team of 5 agents to transform raw text into professional visuals.


Phase 1: Linear Planning
- Retriever Agent: Identifies the 10 most relevant reference examples from a database to guide the style and structure.
- Planner Agent: Translates technical methodology text into a detailed textual description of the target figure.
- Stylist Agent: Acts as a design consultant to ensure the output matches the “NeurIPS Look” using specific color palettes and layouts.
Phase 2: Iterative Refinement
- Visualizer Agent: Transforms the description into a visual output. For diagrams, it uses image models like Nano-Banana-Pro. For statistical plots, it writes executable Python Matplotlib code.
- Critic Agent: Inspects the generated image against the source text to find factual errors or visual glitches. It provides feedback for 3 rounds of refinement.
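
The two-stage workflow described above can be sketched as a small orchestration function. This is a minimal illustration, not PaperBanana's actual code: `run_pipeline`, `FigureDraft`, and the five callables are hypothetical names, and the way the critic's note is folded back into the description is an assumption made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# All names here are hypothetical stand-ins, not PaperBanana's real API.


@dataclass
class FigureDraft:
    description: str  # textual plan for the figure
    output: str       # placeholder for the rendered image or plot code
    feedback: List[str] = field(default_factory=list)


def run_pipeline(
    source_text: str,
    retrieve: Callable[[str, int], List[str]],
    plan: Callable[[str, List[str]], str],
    style: Callable[[str], str],
    visualize: Callable[[str], str],
    critique: Callable[[str, str], str],
    top_k: int = 10,
    rounds: int = 3,
) -> FigureDraft:
    # Phase 1: linear planning (Retriever -> Planner -> Stylist).
    references = retrieve(source_text, top_k)
    description = style(plan(source_text, references))

    # Phase 2: iterative refinement (Visualizer <-> Critic).
    draft = FigureDraft(description=description, output=visualize(description))
    for _ in range(rounds):
        note = critique(source_text, draft.output)
        draft.feedback.append(note)
        # Assumption: the critic's note is appended to the description
        # and the figure is regenerated from the revised description.
        draft.description = f"{draft.description}\nRevision note: {note}"
        draft.output = visualize(draft.description)
    return draft


# Usage with trivial stub agents (real agents would call language or image models).
draft = run_pipeline(
    "We propose a two-stage retrieval model.",
    retrieve=lambda text, k: [f"ref-{i}" for i in range(k)],
    plan=lambda text, refs: f"Diagram of: {text} ({len(refs)} references)",
    style=lambda desc: desc + " [soft pastel palette]",
    visualize=lambda desc: f"<render from {len(desc)} chars>",
    critique=lambda text, out: "check arrow labels",
)
print(len(draft.feedback))
```

Passing the agents in as callables keeps the control flow (one linear pass, then a fixed number of critique rounds) separate from whatever model backs each agent.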
Outperforming Baselines on a NeurIPS 2025 Benchmark


The research team introduced PaperBananaBench, a dataset of 292 test cases curated from actual NeurIPS 2025 publications. Using a VLM-as-a-Judge approach, they compared PaperBanana against leading baselines.
| Metric | Improvement over Baseline |
| --- | --- |
| Overall Score | +17.0% |
| Conciseness | +37.2% |
| Readability | +12.9% |
| Aesthetics | +6.6% |
| Faithfulness | +2.8% |
The system excels in ‘Agent & Reasoning’ diagrams, achieving a 69.9% overall score. It also provides an automated ‘Aesthetic Guideline’ that favors ‘Soft Tech Pastels’ over harsh primary colors.
Statistical Plots: Code vs. Image
Statistical plots require numerical precision that standard image models often lack. PaperBanana addresses this by having the Visualizer Agent write code instead of drawing pixels.
- Image Generation: Excels in aesthetics but often suffers from ‘numerical hallucinations’ or repeated elements.
- Code-Based Generation: Ensures 100% data fidelity by using the Matplotlib library to render the final plot.
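
To make the code-based route concrete, here is a minimal sketch of the kind of Matplotlib script a plotting agent could emit. The data values, axis labels, and the `render_line_plot` helper are made up for illustration and are not taken from the paper. The point is that the plotted values are read straight from the input arrays, so they cannot drift the way pixels drawn by an image model can.

```python
import io

import matplotlib

matplotlib.use("Agg")  # headless backend so the script runs without a display
import matplotlib.pyplot as plt

# Made-up example data; in practice this would be the raw data supplied with the request.
steps = [1, 2, 3, 4, 5]
accuracy = [0.52, 0.61, 0.70, 0.74, 0.79]


def render_line_plot(x, y, out):
    """Plot y against x, save a PNG to `out` (a path or file-like object),
    and return the Line2D so the plotted values can be inspected."""
    fig, ax = plt.subplots(figsize=(4, 3))
    (line,) = ax.plot(x, y, marker="o")
    ax.set_xlabel("Training step")
    ax.set_ylabel("Accuracy")
    fig.tight_layout()
    fig.savefig(out, format="png", dpi=200)
    plt.close(fig)
    return line


buffer = io.BytesIO()
line = render_line_plot(steps, accuracy, buffer)

# The values held by the plot are exactly the input values.
assert list(line.get_ydata()) == accuracy
```

Passing a filename such as `"accuracy.png"` instead of the in-memory buffer writes the figure to disk.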
Domain-Specific Aesthetic Preferences in AI Research
According to the PaperBanana style guide, aesthetic choices often shift with the research domain to match the expectations of different scholarly communities.
| Research Domain | Visual ‘Vibe’ | Key Design Elements |
| --- | --- | --- |
| Agent & Reasoning | Illustrative, Narrative, “Friendly” | 2D vector robots, human avatars, emojis, and “User Interface” aesthetics (chat bubbles, document icons) |
| Computer Vision & 3D | Spatial, Dense, Geometric | Camera cones (frustums), ray lines, point clouds, and RGB color coding for axis correspondence |
| Generative & Learning | Modular, Flow-oriented | 3D cuboids for tensors, matrix grids, and “Zone” strategies using light pastel fills to group logic |
| Theory & Optimization | Minimalist, Abstract, “Textbook” | Graph nodes (circles), manifolds (planes), and a restrained grayscale palette with single highlight colors |
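
One way to picture how such domain preferences could be fed to a stylist step is a plain lookup table that turns a domain key into a style instruction. This is only a sketch: the `DOMAIN_STYLES` dictionary, its keys, and the `style_hint` helper are hypothetical, and the entries paraphrase the table above rather than quote the actual style guide.

```python
# Hypothetical lookup table; keys and wording paraphrase the table above.
DOMAIN_STYLES = {
    "agent_reasoning": {
        "vibe": "illustrative, narrative, friendly",
        "elements": ["2D vector robots", "human avatars", "chat bubbles", "document icons"],
    },
    "vision_3d": {
        "vibe": "spatial, dense, geometric",
        "elements": ["camera frustums", "point clouds", "RGB-coded axes"],
    },
    "generative_learning": {
        "vibe": "modular, flow-oriented",
        "elements": ["3D cuboids for tensors", "matrix grids", "light pastel zones"],
    },
    "theory_optimization": {
        "vibe": "minimalist, abstract, textbook",
        "elements": ["graph nodes", "manifolds", "grayscale palette with one highlight color"],
    },
}


def style_hint(domain: str) -> str:
    """Return a one-line style instruction for a domain key in DOMAIN_STYLES."""
    style = DOMAIN_STYLES[domain]
    return f"Style: {style['vibe']}. Use: {', '.join(style['elements'])}."


print(style_hint("theory_optimization"))
```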
Comparison of Visualization Paradigms
For statistical plots, the framework highlights a clear trade-off between using an image generation model (IMG) and using executable code (Coding).
| Feature | Plots via Image Generation (IMG) | Plots via Coding (Matplotlib) |
| --- | --- | --- |
| Aesthetics | Often higher; plots look more “visually appealing” | Professional and standard academic look |
| Fidelity | Lower; prone to “numerical hallucinations” or element repetition | 100% accurate; strictly represents the raw data provided |
| Readability | High for sparse data but struggles with complex datasets | Consistently high; handles dense or multi-series data without error |
Key Takeaways
- Multi-Agent Collaborative Framework: PaperBanana is a reference-driven system that orchestrates 5 specialized agents (Retriever, Planner, Stylist, Visualizer, and Critic) to transform raw technical text and captions into publication-quality methodology diagrams and statistical plots.
- Dual-Phase Generation Process: The workflow consists of a Linear Planning Phase to retrieve reference examples and set aesthetic guidelines, followed by a 3-round Iterative Refinement Loop where the Critic agent identifies errors and the Visualizer agent regenerates the image for higher accuracy.
- Superior Performance on PaperBananaBench: Evaluated against 292 test cases from NeurIPS 2025, the framework outperformed vanilla baselines in Overall Score (+17.0%), Conciseness (+37.2%), Readability (+12.9%), and Aesthetics (+6.6%).
- Precision-Focused Statistical Plots: For statistical data, the system switches from direct image generation to executable Python Matplotlib code; this hybrid approach ensures numerical precision and eliminates “hallucinations” common in standard AI image generators.