Show HN: Finalrun – Spec-driven testing using English and vision for mobile apps

https://github.com/final-run/finalrun-agent
16•ashish004•3h ago
I wanted to test mobile apps in plain English instead of relying on brittle selectors like XPath or accessibility IDs.

With a vision-based agent, that part actually works well. It can look at the screen, understand intent, and perform actions across Android and iOS.

The bigger problem showed up around how tests are defined and maintained.

When test flows are kept outside the codebase (written manually or generated from PRDs), they quickly go out of sync with the app. Keeping them updated becomes a lot of effort, and they lose reliability over time.

I then tried generating tests directly from the codebase (via MCP). That improved sync, but introduced high token usage and slower generation.

The shift for me was realizing test generation shouldn’t be a one-off step. Tests need to live alongside the codebase so they stay in sync and have more context.

I kept the execution vision-based (no brittle selectors), but moved test generation closer to the repo.

I’ve open sourced the core pieces:

1. Generate tests from codebase context
2. YAML-based test flows
3. Vision-based execution across Android and iOS
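To make the "YAML-based test flows" idea concrete, here is a minimal sketch of what a plain-English flow checked into the repo might look like. The field names (`name`, `platform`, `steps`), the `Flow` class, and the `to_yaml` helper are all hypothetical and chosen for illustration; this is not Finalrun's actual schema or API, so check the repo for the real format.

```python
from dataclasses import dataclass, field


@dataclass
class Flow:
    """Hypothetical in-memory form of a plain-English test flow."""
    name: str
    platform: str  # assumed values: "android" or "ios"
    steps: list[str] = field(default_factory=list)


def to_yaml(flow: Flow) -> str:
    """Serialize a flow to a small YAML document by hand (no third-party deps)."""
    lines = [f"name: {flow.name}", f"platform: {flow.platform}", "steps:"]
    # Double-quote each step so punctuation in the English text stays a plain string.
    for step in flow.steps:
        escaped = step.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'  - "{escaped}"')
    return "\n".join(lines) + "\n"


# Steps describe intent in English; no XPath or accessibility IDs appear anywhere.
login_flow = Flow(
    name="login_happy_path",
    platform="android",
    steps=[
        "Open the app and wait for the login screen",
        "Enter a valid email and password",
        'Tap the "Sign in" button',
        "Verify that the home screen is visible",
    ],
)

print(to_yaml(login_flow))
```

Because a file like this is plain text, it can be committed next to the feature code and reviewed in the same pull request, which is the "tests live alongside the codebase" point above.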

Repo: https://github.com/final-run/finalrun-agent
Demo: https://youtu.be/rJCw3p0PHr4

In the demo video, you’ll see the "post-development hand-off": an AI builds a feature in an IDE, and Finalrun immediately generates and executes a vision-based test that verifies the AI-built feature.

Comments

arnold_laishram•3h ago
Looks pretty cool. How does your agent understand plain English?
ashish004•3h ago
We have built a QA agent that understands your plain-English intent and uses vision to reason about and navigate the app to test it. You can check our benchmark here https://finalrun.app/benchmark/ and how we architected our agent for the benchmark https://github.com/final-run/finalrun-android-world-benchmar.... It's all open source.
sahilahuja•1h ago
Agentic testing. Kudos to your decision to open-source it!
avikaa•34m ago
This solves a massive headache. The drift between externally generated tests and an active codebase is brutal to keep up with.

Using vision-based execution instead of brittle XPaths is a great baseline, but moving the test definitions to live directly alongside the repo context is definitely the real win here.

Did you find that generating the YAML from the codebase context entirely eliminated the "stale test" issue, or do developers still need to manually tweak the generated YAML when mobile UI layouts change drastically? Great project!

ashish004•26m ago
Hi Avikaa, Finalrun provides skills that you can integrate with any IDE of your choice. You can just ask the finalrun-generate-test skill to update all the tests for your new feature.

Show HN: Brutalist Concrete Laptop Stand (2024)

https://sam-burns.com/posts/concrete-laptop-stand/
563•sam-bee•7h ago•180 comments

Show HN: A cartographer's attempt to realistically map Tolkien's world

https://www.intofarlands.com/atlasofarda
125•intofarlands•6h ago•23 comments

Show HN: Pion/handoff – Move WebRTC out of browser and into Go

https://github.com/pion/handoff
79•Sean-Der•6h ago•11 comments

Show HN: Stop paying for Dropbox/Google Drive, use your own S3 bucket instead

https://locker.dev
200•Zm44•7h ago•176 comments

Show HN: A reasoning hierarchical robotics pipeline you can run in the browser

https://avikde.github.io/vla-pipeline/
3•avikde•44m ago•0 comments

Show HN: A (marginally) useful x86-64 ELF executable in 298 bytes

https://github.com/meribold/btry
4•meribold•2h ago•0 comments

Show HN: Clawcast – A peer-to-peer podcast network for agents

https://www.clawcast.dev/
5•PiersonMarks•1h ago•4 comments

Show HN: GovAuctions lets you browse government auctions at once

https://www.govauctions.app/
304•player_piano•1d ago•86 comments

Show HN: AdaShape-3D modeler for intuitive 3D printing parts / Windows 11

https://adashape.com
27•fsloth•3d ago•24 comments

Show HN: Ghost Pepper – Local hold-to-talk speech-to-text for macOS

https://github.com/matthartman/ghost-pepper
442•MattHart88•22h ago•193 comments

Show HN: Anos – a hand-written ~100KiB microkernel for x86-64 and RISC-V

https://github.com/roscopeco/anos
102•noone_youknow•3d ago•31 comments

Show HN: Hippo, biologically inspired memory for AI agents

https://github.com/kitfunso/hippo-memory
116•kitfunso•20h ago•22 comments

Show HN: Tusk for macOS and Gnome

https://shapemachine.xyz/tusk/
112•factorialboy•3d ago•42 comments

Show HN: Output.ai - OSS framework we extracted from 500+ production AI agents

https://output.ai/
34•bnchrch•3h ago•6 comments

Show HN: TTF-DOOM – A raycaster running inside TrueType font hinting

https://github.com/4RH1T3CT0R7/ttf-doom
62•4RH1T3CT0R•22h ago•12 comments

Show HN: The King James Bible deserved a better website

https://officialkingjamesbible.com/
4•L23234•4h ago•2 comments

Show HN: BitBang – P2P tunnels to localhost, no account required

https://github.com/richlegrand/bitbang
3•narragansett•4h ago•0 comments

Show HN: Veil a Drop-in PII redaction proxy for any LLM API

https://veil-api.com/
2•A5omic•4h ago•0 comments

Show HN: I built a tiny LLM to demystify how language models work

https://github.com/arman-bd/guppylm
886•armanified•1d ago•133 comments

Show HN: Real-time AI (audio/video in, voice out) on an M3 Pro with Gemma E2B

https://github.com/fikrikarim/parlor
281•karimf•2d ago•35 comments

Show HN: A social feed with no algo where communities decide what gets seen

https://veridonia.com
3•smnkgv•5h ago•4 comments

Show HN: Gemma Gem – AI model embedded in a browser – no API keys, no cloud

https://github.com/kessler/gemma-gem
153•ikessler•1d ago•21 comments

Show HN: td – a CLI to manage tasks, sessions, and worktrees for agentic coding

https://github.com/rosgoo/td
6•rosgoo•5h ago•0 comments

Show HN: Bx – macOS native sandbox for AI and coding tools

https://github.com/holtwick/bx-mac
4•holtwick•6h ago•1 comments

Show HN: Weird Clocks

https://clocks.specr.net
48•vunderba•1d ago•16 comments

Show HN: SwellSlots – Grid Based Surf Forecast App with a Street Fighter 2 UI

https://swellslots.com
4•rawoke083600•6h ago•2 comments

Show HN: I made a YouTube search form with advanced filters

https://playlists.at/youtube/search/
315•nevernothing•1d ago•201 comments

Show HN: LookAway 2.0 – a break reminder for Mac that respects what you're doing

https://lookaway.com
2•_kush•6h ago•0 comments

Show HN: A game where you build a GPU

https://jaso1024.com/mvidia/
951•Jaso1024•3d ago•186 comments