Ask HN: Experience automating E2E manual testing with AI

1•rudderdev•5mo ago

I see lots of discussions around using AI in testing. Let's make this discussion more objective and useful by sharing our experiences. Here's my experience of using AI to automate manual E2E testing (especially where user interaction is required):

What I’m testing: the RudderStack iOS SDK, which is used to track customer event data and send it to various product, marketing, and business tools.

The problem in my current testing workflow: Manual testing is important for quality assurance, but testing the RudderStack SDK requires multiple time-consuming and error-prone steps: planning the specific steps for the test, performing the interactions, reviewing long stretches of log text, and then verifying the logs, which includes comparing long IDs.

The solution I experimented with: I used an LLM to plan the test steps, used mobile-mcp to simulate user interactions (tapping buttons such as track, reset, track, etc.), had the LLM review the logs (to verify the ID changes in the events sent to the server), and had it prepare a final comprehensive report. It is all packaged as an MCP server that works in my IDE (Cursor), with test cases written as plain-English prompts.

Result: My agent clicked through track → reset → track and caught the anonymous ID change (which confirms that the SDK's tracking worked properly).
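To make the log check concrete, here is a minimal Python sketch of the comparison the agent performed: collect the anonymous IDs seen before and after a reset and confirm they differ. The log format is an assumption for illustration (a text prefix followed by a JSON payload with an `anonymousId` field, and a line containing "reset" for the reset call). The real SDK logs may look different, and the function names are hypothetical.

```python
import json


def split_ids_around_reset(log_lines):
    """Collect anonymousId values seen before and after the first reset line.

    Assumed (hypothetical) log format: event lines end with a JSON payload,
    and the reset call is logged as a line containing the word "reset".
    """
    before, after = [], []
    seen_reset = False
    for line in log_lines:
        if "{" not in line:
            if "reset" in line.lower():
                seen_reset = True
            continue
        try:
            payload = json.loads(line[line.index("{"):])
        except json.JSONDecodeError:
            continue  # not a parseable payload, skip the line
        anon_id = payload.get("anonymousId")
        if anon_id is None:
            continue
        (after if seen_reset else before).append(anon_id)
    return before, after


def reset_changed_anonymous_id(log_lines):
    """True only if IDs exist on both sides of the reset and none are shared."""
    before, after = split_ids_around_reset(log_lines)
    if not before or not after:
        return False  # not enough evidence to claim the ID changed
    return set(before).isdisjoint(after)


# Made-up sample log for a track -> reset -> track run.
SAMPLE_LOG = [
    'DEBUG track {"event": "Track 1", "anonymousId": "a1b2"}',
    "DEBUG reset",
    'DEBUG track {"event": "Track 2", "anonymousId": "c3d4"}',
]
```

A deterministic check like this could run alongside the LLM review, so that the long-ID comparison does not depend on the model reading the logs correctly.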

What actually worked:

- Once set up, it did catch the regression correctly

- Consistent results vs my manual testing, where I sometimes miss things

Issues I ran into:

- Had to write extremely detailed step-by-step instructions and extensive context. If I missed anything, it just failed

- WebDriver setup on port 4723 was finicky

- It is slow. Took 2 minutes for what should be a 30-second manual test
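For the WebDriver setup issue listed above, a preflight check may save a failed run. The following is a small sketch, assuming the WebDriver endpoint listens on localhost port 4723 as in my setup. The function name is hypothetical, and it only checks that something accepts TCP connections on the port, not that the server is healthy.

```python
import socket


def webdriver_port_open(host="127.0.0.1", port=4723, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False
```

Calling `webdriver_port_open()` before handing the test prompt to the agent gives a fast, explicit failure instead of a 2-minute run that dies partway through.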

Biggest problem: The amount of upfront work to get it running properly. I spent more time writing instructions than I would have just testing manually.

The real value might be in consistency for regression testing, not speed. But the initial investment is rough.

What would make this useful:

I need to create a workflow where, based on the feature or fix, agents automatically generate test cases (including edge cases) that target the code affected by the change, and then perform a thorough end-to-end QA pass.
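One possible first step toward that workflow, sketched in Python: pick which plain-English test prompts to run based on the files a change touches. This covers only selection, not LLM-driven generation of new cases. The path patterns and prompts are hypothetical and do not reflect the actual repository layout. The changed-file list could come from something like `git diff --name-only`.

```python
from fnmatch import fnmatchcase

# Hypothetical mapping from source path patterns to plain-English test prompts.
TEST_CASES = {
    "Sources/Identity/*": [
        "Tap track, tap reset, tap track; verify the anonymous ID changes.",
    ],
    "Sources/Network/*": [
        "Tap track; verify the event is sent to the server.",
    ],
}


def select_test_cases(changed_files, mapping=TEST_CASES):
    """Return the prompts whose path pattern matches any changed file."""
    selected = []
    for pattern, prompts in mapping.items():
        if any(fnmatchcase(path, pattern) for path in changed_files):
            for prompt in prompts:
                if prompt not in selected:  # keep order, avoid duplicates
                    selected.append(prompt)
    return selected
```

For example, `select_test_cases(["Sources/Identity/Reset.swift"])` would return only the track → reset → track prompt, and a change touching only unmapped files would select nothing.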

Has anyone else tried automating QA using AI? How was your experience, and how did you resolve the challenges you faced? (I want to learn practices that I can incorporate into my workflow.)
