Open-Source Agentic QA Harness with Memory

50•pranshuchittora•51m ago

Comments

pranshuchittora•50m ago

Hey, I am the creator of agent-qa.

Coding agents have accelerated software development, letting folks ship features at lightning speed, but whether a feature works in production without breaking existing behavior is still an open question.

Conventionally, either a software engineer or a QA engineer converts user stories / feature PRDs into composable end-to-end tests, allowing teams to catch regressions.

But with AI writing code, tests become the bottleneck. You can ask the coding agent to write tests, and it does so with reasonable correctness, but AI greedily chases passing tests and sometimes bends the rules. Having access to the code also lets it write tests with shortcuts that might not mimic real user behavior.

With agent-qa, you can write tests in plain English (natural language). It is built on battle-tested testing frameworks (Playwright for web and Appium for mobile). Playwright and Appium act as a kernel that executes the planned actions, while the AI runs in the harness and loops through observation -> planning -> execution (via the kernel) -> self-healing (when a planned action fails) -> verification.
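
To make the loop above concrete, here is a minimal sketch in Python. It is not agent-qa's actual code or API: `FakeKernel`, `run_step`, and the `plan_submit` / `heal_submit` / `verify_submit` callbacks are hypothetical names invented for illustration, the kernel is a toy stand-in for Playwright/Appium, and the callbacks are hard-coded where a real harness would presumably call an LLM.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Action:
    kind: str    # e.g. "click"
    target: str  # selector the kernel should act on


class FakeKernel:
    """Toy stand-in for Playwright/Appium (hypothetical, for illustration)."""

    def __init__(self, selectors: List[str]) -> None:
        self.selectors = list(selectors)
        self.performed: List[Tuple[str, str]] = []

    def observe(self) -> List[str]:
        # A real kernel would return a DOM snapshot or screenshot.
        return list(self.selectors)

    def execute(self, action: Action) -> None:
        if action.target not in self.selectors:
            raise LookupError(f"no element matches {action.target!r}")
        self.performed.append((action.kind, action.target))


def run_step(
    kernel: FakeKernel,
    plan: Callable[[List[str]], Action],
    heal: Callable[[Action, List[str]], Action],
    verify: Callable[[FakeKernel], bool],
    max_heals: int = 1,
) -> bool:
    """Observe -> plan -> execute -> self-heal on failure -> verify."""
    action = plan(kernel.observe())
    for attempt in range(max_heals + 1):
        try:
            kernel.execute(action)
            break
        except LookupError:
            if attempt == max_heals:
                return False  # out of healing attempts
            # Re-observe and let the planner repair the failed action.
            action = heal(action, kernel.observe())
    return verify(kernel)


# Hypothetical planner callbacks; an LLM would sit behind these in practice.
def plan_submit(observation: List[str]) -> Action:
    return Action("click", "#submit")  # deliberately stale selector


def heal_submit(failed: Action, observation: List[str]) -> Action:
    candidates = [s for s in observation if "submit" in s]
    return Action(failed.kind, candidates[0]) if candidates else failed


def verify_submit(kernel: FakeKernel) -> bool:
    return ("click", "button.submit-v2") in kernel.performed


kernel = FakeKernel(["input#email", "button.submit-v2"])
passed = run_step(kernel, plan_submit, heal_submit, verify_submit)
print(passed)
```

In this sketch the first planned action targets a selector that no longer exists, the kernel raises, the heal callback picks a replacement from a fresh observation, and verification checks what the kernel actually performed rather than trusting the plan.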

The agent also evolves with every test run. It generates learning and product memories from each run, improving itself over time.

This is in an early stage, and I’m looking forward to your feedback.

Thanks!

Live Demo - https://vostride.com/demo/agent-qa

GitHub - https://github.com/vostride/agent-qa (consider giving it a star)

Good day!

mkdsf01•29m ago

That looks interesting

pranshuchittora•2m ago

Do give it a try https://vostride.com/docs/agent-qa/quickstart

willowwd9•27m ago

What's the need for this? I run Codex in a loop, and it writes and runs the Playwright tests without any intervention.

pranshuchittora•29s ago

This is what teams are doing today. But LLMs have a tendency to greedily write tests, which leads to hacky tricks to make the test succeed.

agent-qa is a harness where Playwright works as an execution kernel and the LLM works as an observer, planner, and verifier.

ofdgdfkg9034•27m ago

Can I use it with Claude Code?

pranshuchittora•2m ago

Yes, https://vostride.com/docs/agent-qa/configuration/global-conf...
