When AI generates code, I first instruct the model to find, fix, and verify any issues. After that, I start the server and test whether it actually works from the user’s perspective.

What I’m looking for is a workflow in which issues are received, fixed, tested, and deployed, but current AI agents don’t seem to be very good at browser testing from the user’s perspective.

I’ve tried using the built-in browsers in Codex and Cursor, but they often only checked whether the page loaded. In the end, I had to instruct them step by step on what to do, and it turned out to be cheaper and faster for me to test it myself.

So I’m curious to know how you’ve set up test automation. Are there any services that do this (for individuals, not just enterprises)? If you’re using a harness like Codex, I’d like to know what instructions and skills are needed to get it to perform tests from the user’s perspective.

Ask HN: How do you test AI-generated code?
