Hey HN — Akshay & Ashwin here, co-founders of Spine AI (YC S23).
We've been rethinking how AI agents work together. Instead of a single model in a chat loop or agents reading/writing to a file system, we built a visual canvas where multiple agents collaborate across connected blocks — and it turns out this architecture significantly outperforms both single and multi-agent systems on hard tasks.
The approach has three parts:
1. Canvas-based workspace — Agents operate on an infinite canvas of intelligent blocks (web browsing, prompts, tables, memos) that connect and pass context to each other. Instead of a flat file system, agents get a structured, non-linear environment that mirrors how complex problems actually decompose.
2. Tiered multi-agent orchestration — An orchestrating agent decomposes tasks, delegates to specialized persona agents (researcher, analyst, reviewer), and manages dependencies. Agents validate each other's work before passing it downstream, catching errors before they compound across long chains.
3. Dynamic multi-model ensembling — Rather than one model for everything, we select from 300+ models per subtask. When confidence is low, we pull in additional models and treat disagreement as a signal for deeper scrutiny — like classical ML ensembling, but at the agent level.
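To make (1) concrete, the canvas can be thought of as a small DAG where each block runs only after the blocks feeding it, and their outputs become its context. This is just an illustrative sketch (block types and wiring are ours, not Spine's actual API):

```python
# Toy canvas: blocks form a DAG and pass context along edges.
# Block names and the run/connect interface are illustrative only.

class Block:
    def __init__(self, name, fn):
        self.name, self.fn = name, fn
        self.inputs = []

    def connect(self, upstream):
        """Wire an upstream block's output into this block's context."""
        self.inputs.append(upstream)
        return self

    def run(self, cache=None):
        """Run upstream blocks first, then this block, memoizing results."""
        cache = {} if cache is None else cache
        if self.name not in cache:
            context = [b.run(cache) for b in self.inputs]
            cache[self.name] = self.fn(context)
        return cache[self.name]

# A tiny canvas: a web block and a table block feed a memo block.
web = Block("web", lambda ctx: "search results")
table = Block("table", lambda ctx: "structured data")
memo = Block("memo", lambda ctx: " + ".join(ctx)).connect(web).connect(table)
```

Here `memo.run()` pulls context from both upstream blocks, which is the non-linear decomposition the flat-file approach lacks.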
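The validate-before-handoff idea in (2) looks roughly like this in miniature (the persona functions and the validation rule are made up for illustration):

```python
# Sketch of tiered orchestration: an orchestrator chains persona agents,
# and a reviewer gate validates each output before it moves downstream.
# Personas and the validation check are illustrative, not Spine's API.

def researcher(task: str) -> str:
    return f"findings for: {task}"

def analyst(findings: str) -> str:
    return f"analysis of ({findings})"

def reviewer(output: str) -> str:
    # Validation gate: reject empty or suspiciously short results
    # so an upstream failure can't compound across the chain.
    if not output or len(output) < 10:
        raise ValueError(f"review failed: {output!r}")
    return output

def orchestrate(task: str) -> str:
    findings = reviewer(researcher(task))  # validated before handoff
    return reviewer(analyst(findings))
```

The point is that every edge in the chain has a checkpoint, so an error surfaces where it happens instead of three steps later.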
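And the disagreement-as-signal loop in (3) reduces to something like majority voting with an escalation step. The model stubs and threshold below are hypothetical placeholders for real model calls:

```python
from collections import Counter

# Stand-ins for calls to different frontier models (hypothetical).
def model_a(task): return "paris"
def model_b(task): return "paris"
def model_c(task): return "lyon"

def ensemble_answer(task, models, threshold=0.7, escalation_pool=()):
    """Majority-vote ensembling: agreement below `threshold` is treated
    as a signal to pull in additional models and re-vote."""
    votes = Counter(m(task) for m in models)
    answer, count = votes.most_common(1)[0]
    confidence = count / sum(votes.values())
    if confidence < threshold and escalation_pool:
        # Low agreement triggers deeper scrutiny: widen the ensemble
        # and accept the widened vote (threshold=0.0 stops recursion).
        return ensemble_answer(task, list(models) + list(escalation_pool),
                               threshold=0.0)
    return answer, confidence

answer, conf = ensemble_answer("capital of France?",
                               [model_a, model_b, model_c],
                               escalation_pool=[model_a])
```

With 2/3 initial agreement (below the 0.7 threshold), the pool is widened and the final vote settles the answer, which is the agent-level analogue of classical ensembling.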
The results: 61.5% on GAIA Level 3 (vs Manus 57.7%, OpenAI Deep Research 47.6%) and 87.6% on DeepSearchQA (vs Perplexity 79.5%, Gemini Deep Research 66.1%). Same frontier models available to everyone — the difference is architecture.
Because everything runs on the canvas, we could audit our agents' work step by step. That's how we caught what appear to be mislabeled questions in the GAIA dataset itself — we link to sample canvases in the post so you can see the reasoning traces.
Spine Swarms is open to try at www.getspine.ai. Happy to go deep on any part of the architecture.