I built this tool to solve the "flakiness" problem in UI testing. Existing AI agents often struggle with precise interactions, while traditional frameworks (Selenium/Playwright) break whenever the DOM changes.
The Approach: Instead of relying on hard-coded selectors or pure computer vision, I’m using a multi-agent system powered by multimodal LLMs. We pass both the screenshot (pixels) and the browser context (network requests, console logs, etc.) to the model, which lets the agent "see" the UI the way a user does and map semantic intent ("Click the Signup button") to precise coordinates, even when the layout shifts.
The goal is to mimic natural user behavior rather than follow a predefined script: the agent can do exploratory testing and catch visual bugs that code-based assertions miss.
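In the same spirit, a visual assertion can be phrased as a question to the model rather than a check on a specific selector. Again just a sketch under the same assumptions, reusing the hypothetical `ask_model` helper from the snippet above:

```python
# Sketch of an LLM-side "visual assertion": ask the model whether anything on
# the page looks broken, instead of asserting on a particular element's state.
# `ask_model` is the hypothetical helper defined in the previous snippet.


def visual_check(page, ask_model) -> list[str]:
    """Return the visual defects the model reports, or an empty list if none."""
    prompt = (
        "List any visual defects a user would notice on this page "
        "(overlapping text, clipped or off-screen buttons, unreadable "
        "contrast, broken layout). One issue per line, or the word 'none'."
    )
    report = ask_model(prompt, page.screenshot())
    return [] if report.strip().lower() == "none" else report.splitlines()


issues = visual_check(page, ask_model)
assert not issues, f"Visual defects found: {issues}"
```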
I’d love feedback on the implementation or to discuss the challenges of using LLMs for deterministic testing.