Show HN: Finalrun – Spec-driven testing using English and vision for mobile apps

https://github.com/final-run/finalrun-agent
16•ashish004•3h ago
I wanted to test mobile apps in plain English instead of relying on brittle selectors like XPath or accessibility IDs.

With a vision-based agent, that part actually works well. It can look at the screen, understand intent, and perform actions across Android and iOS.

The bigger problem showed up around how tests are defined and maintained.

When test flows are kept outside the codebase (written manually or generated from PRDs), they quickly go out of sync with the app. Keeping them updated becomes a lot of effort, and they lose reliability over time.

I then tried generating tests directly from the codebase (via MCP). That improved sync, but introduced high token usage and slower generation.

The shift for me was realizing test generation shouldn’t be a one-off step. Tests need to live alongside the codebase so they stay in sync and have more context.
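As a rough illustration of why co-location helps (a sketch under my own assumptions, not how Finalrun is implemented): if each test flow records a fingerprint of the source files it was generated from, stale flows can be flagged whenever those files change. The `FlowRecord` class, the `covers` field, and the file paths below are hypothetical names used only for this example.

```python
import hashlib
from dataclasses import dataclass


def fingerprint(source_text: str) -> str:
    """Return a short, stable hash of a source file's contents."""
    return hashlib.sha256(source_text.encode("utf-8")).hexdigest()[:12]


@dataclass
class FlowRecord:
    """Hypothetical bookkeeping for one test flow stored in the repo."""
    flow_name: str
    covers: dict[str, str]  # source path -> fingerprint at generation time


def stale_flows(records: list[FlowRecord], current_sources: dict[str, str]) -> list[str]:
    """List flows whose covered sources have changed or disappeared."""
    stale = []
    for record in records:
        for path, old_fp in record.covers.items():
            text = current_sources.get(path)
            if text is None or fingerprint(text) != old_fp:
                stale.append(record.flow_name)
                break  # one changed file is enough to mark the flow stale
    return stale


# Fingerprints captured when the flows were generated (made-up sources).
records = [
    FlowRecord("login_flow", {"app/login.py": fingerprint("def login(user, pw): ...")}),
    FlowRecord("cart_flow", {"app/cart.py": fingerprint("def add(item): ...")}),
]

# Current state of the codebase: login changed, cart did not.
current = {
    "app/login.py": "def login(user, pw, otp): ...",
    "app/cart.py": "def add(item): ...",
}

print(stale_flows(records, current))  # ['login_flow']
```

A check like this only works cheaply when the flows and the code share a repo; with externally maintained tests there is nothing to compare against.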

I kept the execution vision-based (no brittle selectors), but moved test generation closer to the repo.

I’ve open sourced the core pieces:

1. Test generation from codebase context
2. YAML-based test flows
3. Vision-based execution across Android and iOS
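To give a feel for what a plain-English, YAML-based flow could look like, here is a minimal Python sketch that builds one and renders it as YAML text. The `FlowSpec` class and the `name`/`steps` keys are assumptions for illustration only; they are not Finalrun's actual schema, so check the repo for the real format.

```python
import json
from dataclasses import dataclass, field


@dataclass
class FlowSpec:
    """Hypothetical test flow: a name plus plain-English steps."""
    name: str
    steps: list[str] = field(default_factory=list)


def to_yaml(flow: FlowSpec) -> str:
    """Render the flow as YAML text using only the standard library.

    JSON string literals are valid YAML double-quoted scalars, so
    json.dumps is used to quote and escape each value safely.
    """
    if not flow.steps:
        raise ValueError("a flow needs at least one step")
    lines = [f"name: {json.dumps(flow.name)}", "steps:"]
    lines += [f"  - {json.dumps(step)}" for step in flow.steps]
    return "\n".join(lines) + "\n"


flow = FlowSpec(
    name="login",
    steps=[
        "Open the app",
        "Log in as a test user",
        "Verify the home screen is visible",
    ],
)
print(to_yaml(flow))
```

Because the steps describe intent rather than selectors, a file like this can be reviewed and updated in the same pull request as the feature it covers.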

Repo: https://github.com/final-run/finalrun-agent
Demo: https://youtu.be/rJCw3p0PHr4

The demo video shows the "post-development hand-off": an AI builds a feature in an IDE, and Finalrun immediately generates and runs a vision-based test that verifies the AI-built feature.

Comments

arnold_laishram•3h ago
Looks pretty cool. How does your agent understand plain English?
ashish004•3h ago
We have built a QA agent that understands your plain-English intent and uses vision to reason about and navigate the app to test it. You can check our benchmark here https://finalrun.app/benchmark/ and how we architected our agent for the benchmark here https://github.com/final-run/finalrun-android-world-benchmar.... It's all open source.
sahilahuja•1h ago
Agentic testing. Kudos to your decision to open-source it!
avikaa•21m ago
This solves a massive headache. The drift between externally generated tests and an active codebase is a brutal problem to manage.

Using vision-based execution instead of brittle XPaths is a great baseline, but moving the test definitions to live directly alongside the repo context is definitely the real win here.

Did you find that generating the YAML from the codebase context entirely eliminated the "stale test" issue, or do developers still need to manually tweak the generated YAML when mobile UI layouts change drastically? Great project!

ashish004•13m ago
Hi Avikaa, Finalrun provides skills that you can integrate with any IDE of your choice. You can just ask the finalrun-generate-test skill to update all the tests for your new feature.
