Counting Down Capabilities to AGI

https://shash42.substack.com/p/counting-down-capabilities-to-agi

1•shash42•7mo ago

Comments

shash42•7mo ago

This is a living document where I'll track my evolving thoughts on what remains on the path to building generally-intelligent agents. Why does this matter? Three compelling reasons:

Top-down view: AI research papers (and product releases) move bottom-up, starting from what we have right now and incrementally improving, in the hope that we eventually converge on the end-goal. This is good; that’s how concrete progress happens. At the same time, to direct our efforts, it is important to have a top-down view of what we have achieved and what bottlenecks remain on the way to the end-goal. Besides, known unknowns are better than unknown unknowns.

Research prioritisation: I want this post to serve as a personal compass, reminding me which capabilities I believe are most critical for achieving generally-intelligent agents, that is, capabilities we haven't yet figured out. I suspect companies have internal roadmaps for this, but it’s good to also discuss this in the open.

Forecasting AI Progress: Recently, there has been much debate about the pace of AI advancement, and for good reason: this question deserves deep consideration. Generally-intelligent agents will be transformative, requiring both policymakers and society to prepare accordingly. Unfortunately, I think AI progress is NOT a smooth exponential that we can extrapolate to make predictions. Instead, the field moves by shattering one (or more) wall(s) every time a new capability gets unlocked. These breakthroughs show up as large increases in benchmark performance over a short period of time, but the absolute performance jump on a benchmark provides little information about when the next breakthrough will occur. This is because, for any given capability, it is hard to predict when we will know how to make a model learn it. But it’s still useful to know which capabilities are important and what kinds of breakthroughs are needed to achieve them, so we can form our own views about when to expect a capability. This is why this post is structured as a countdown of capabilities which, as we build them out, will get us to “AGI” as I think about it.

*Framework* To be able to work backwards from the end-goal, I think it’s important to use accurate nomenclature to intuitively define the end-goal. This is why I’m using the term generally-intelligent agents. I think it encapsulates the three qualities we want from “AGI”:

Generality: Be useful for as many tasks and fields as possible.

Intelligence: Learn new skills from as few experiences as possible.

Agency: Plan and perform a long chain of actions.

Click and read the blog for:

Introduction

…. Framework

…. AI 2024 - Generality of Knowledge

Part I on The Frontier: General Agents

…. Reasoning: Algorithmic vs Bayesian

…. Information Seeking

…. Tool-use

…. Towards year-long action horizons

…. …. Long-horizon Input: The Need for Memory

…. …. Long-horizon Output

…. Multi-agent systems

Part II on The Future: Generally-Intelligent Agents [TBA]
