Vision agents (browser-use, computer-use) are the default way to let AI agents operate web apps that have no API. The alternative is writing an MCP server or REST API per app, but every app needs its own, and enterprise teams have 20+ internal tools.

We ran both kinds of agent on a Reflex port of a React demo (a small business's admin panel). The task was to find the "Smith" with the most orders, accept their pending reviews, and mark their most recent order as delivered.
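To make the task concrete, here is a minimal sketch of the three steps as plain operations over in-memory data. Everything in it is hypothetical: the record shapes, the fixture rows, and the `run_task` helper are illustrations only, not the demo app's schema or the endpoints the API agent actually called.

```python
from dataclasses import dataclass

@dataclass
class Order:
    id: int
    customer_id: int
    date: str    # ISO date, so string order matches chronological order
    status: str

@dataclass
class Review:
    id: int
    customer_id: int
    status: str

# Hypothetical fixture data, not taken from the benchmark.
customers = {1: "Ann Smith", 2: "Bob Smith", 3: "Cy Jones"}
orders = [
    Order(10, 1, "2024-01-05", "delivered"),
    Order(11, 2, "2024-02-01", "delivered"),
    Order(12, 2, "2024-03-09", "ordered"),
    Order(13, 3, "2024-03-10", "ordered"),
]
reviews = [
    Review(20, 2, "pending"),
    Review(21, 2, "accepted"),
    Review(22, 1, "pending"),
]

def run_task(last_name):
    # Step 1: among customers with this last name, pick the one with the most orders.
    matches = [cid for cid, name in customers.items() if name.split()[-1] == last_name]
    target = max(matches, key=lambda cid: sum(o.customer_id == cid for o in orders))
    # Step 2: accept that customer's pending reviews.
    for r in reviews:
        if r.customer_id == target and r.status == "pending":
            r.status = "accepted"
    # Step 3: mark that customer's most recent order as delivered.
    latest = max((o for o in orders if o.customer_id == target), key=lambda o: o.date)
    latest.status = "delivered"
    return target, latest.id

result = run_task("Smith")
```

With structured access each step is a lookup or an update; a vision agent has to reach the same state by navigating, filtering, and clicking through rendered pages.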

Results (medians, n=5 API / n=3 vision):

- Vision agent: 47 steps, 495k tokens, ~14 min
- API agent: 8 calls, 12k tokens, 19.7s
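For scale, the gap between the two medians can be worked out directly. A quick sketch, assuming "~14 min" means about 840 seconds (the post only gives the rounded figure), and treating vision steps and API calls as comparable round-trips:

```python
# Reported medians; the vision wall time is an approximation of "~14 min".
vision = {"round_trips": 47, "tokens": 495_000, "seconds": 14 * 60}
api = {"round_trips": 8, "tokens": 12_000, "seconds": 19.7}

# Ratio of vision to API for each metric.
ratios = {metric: vision[metric] / api[metric] for metric in vision}

for metric, ratio in ratios.items():
    print(f"{metric}: {ratio:.1f}x")
```

That is roughly 6x the round-trips, 41x the tokens, and 43x the wall time for the vision agent.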

The vision agent failed on the abstract task and needed a 14-step UI walkthrough before completing it, and even with the walkthrough it made 47 round-trips each carrying a full-page screenshot.

Vision-run variance was wide enough (853-1296s, 407k-751k tokens) that a single run isn't representative, while API runs were tightly clustered. This is the cost of being lazy about making an agent-friendly interface.

The endpoints in Path B were auto-generated by a plugin shipped in Reflex 0.9 this week. You can find full methodology here: https://reflex.dev/blog/vision-agents-vs-api-calls/


Vision agents vs. structured APIs on the same internal tool task