
Show HN: StageWright – A performance-focused Playwright reporter with AI

https://stagewright.dev/

1•qagaryparker•1h ago

Hi HN,

I’m the creator of StageWright (and the open-source playwright-smart-reporter).

I’ve been frustrated by the "black box" nature of E2E test failures. Standard reporters tell you that a test failed, but they don't help you understand why it’s failing across 50 different runs or whether its execution time is trending toward a regression.

I built StageWright to treat test results as a performance and stability dataset.

Key Technical Features:

Historical Flakiness Detection: Playwright's built-in retries only flag a test as flaky within a single run. StageWright tracks failures across runs, so a test earns a high "Stability Grade" only if it passes consistently over time.

Flamechart Step Timelines: We added a color-coded flamechart for test steps (v1.0.8). It categorizes steps into Navigation, Action, and API, making it easy to see if a 10s test is hanging on a locator or a slow backend response.

2-Sigma Anomaly Detection: The trends view uses moving averages and 2-sigma outlier detection to flag performance regressions that might otherwise go unnoticed.

AI-Powered Failure Clustering: We batch failures and use Claude/GPT-4 to cluster similar errors. Instead of 20 separate failures, you see "1 cluster: TimeoutError on payment-submit-btn."

Virtual Scroll Performance: We optimized the UI with virtual scrolling to handle suites with 500+ tests without the browser freezing—a common issue with the default HTML reporter.

Native Trace & Network Logs: Traces and network waterfalls are embedded directly in the report. No downloading .zip files from CI; they open instantly in an inline viewer.
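To make the 2-sigma idea from the feature list concrete, here is a minimal sketch of one way such a check could work. It is not StageWright's actual implementation: the function name `flag_regressions`, the trailing window of 10 runs, and the sample durations are all assumptions chosen for illustration.

```python
from statistics import mean, stdev


def flag_regressions(durations_ms, window=10, sigmas=2.0):
    """Return indices of runs that are slower than the trailing
    moving average by more than `sigmas` standard deviations.

    Hypothetical helper: names and defaults are illustrative only.
    """
    flagged = []
    for i in range(window, len(durations_ms)):
        # Baseline is the previous `window` runs, excluding the current one.
        history = durations_ms[i - window:i]
        mu = mean(history)
        sd = stdev(history)
        # A perfectly flat baseline (sd == 0) is skipped in this sketch.
        if sd > 0 and durations_ms[i] > mu + sigmas * sd:
            flagged.append(i)
    return flagged


# Made-up durations (ms) for one test across 12 runs; run 10 is the spike.
runs = [1000, 1020, 990, 1010, 1005, 995, 1015, 1000, 985, 1010, 1800, 1005]
print(flag_regressions(runs))  # prints [10]
```

Note that the spike at run 10 inflates the baseline for run 11, which is one reason a real implementation might exclude already-flagged runs from the window.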

The Architecture: StageWright is built to be "Playwright-native." It hooks into the reporter API and can run locally (outputting a standalone HTML/JSON history) or via our new Starter/Pro cloud tiers. The Pro tier provides a centralized dashboard for teams, long-term history retention, and cross-project analytics.
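As a rough illustration of what a local JSON history makes possible, the sketch below derives a per-test stability grade from pass rates across runs. The record schema (`test`, `status`), the grade thresholds, and the `min_runs` cutoff are all hypothetical; the post does not describe StageWright's real history format or grading formula.

```python
import json
from collections import defaultdict


def stability_grades(history_json, min_runs=5):
    """Grade each test by its pass rate across recorded runs.

    Assumes a hypothetical history format: a JSON list of
    {"test": <name>, "status": "passed" | "failed" | ...} records.
    """
    outcomes = defaultdict(list)
    for record in json.loads(history_json):
        outcomes[record["test"]].append(record["status"] == "passed")

    grades = {}
    for test, results in outcomes.items():
        if len(results) < min_runs:
            grades[test] = "N/A"  # not enough history to judge
            continue
        rate = sum(results) / len(results)
        # Illustrative thresholds, not StageWright's.
        if rate >= 0.98:
            grades[test] = "A"
        elif rate >= 0.90:
            grades[test] = "B"
        elif rate >= 0.75:
            grades[test] = "C"
        else:
            grades[test] = "D"
    return grades


# Made-up history: "login" always passes, "checkout" fails 2 of 10 runs.
sample = json.dumps(
    [{"test": "login", "status": "passed"} for _ in range(10)]
    + [{"test": "checkout", "status": "passed"} for _ in range(8)]
    + [{"test": "checkout", "status": "failed"} for _ in range(2)]
    + [{"test": "signup", "status": "passed"} for _ in range(2)]
)
print(stability_grades(sample))
```

A test that passes only after a retry would need its own status in the history for this kind of grading to catch it; the sketch treats anything other than "passed" as a failure.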

StageWright currently supports both Node.js and Python (pytest-playwright) environments.

I’d love to hear what the community thinks—especially regarding how you handle "test debt" in large CI pipelines. I'm here for any questions!
