Show HN: Browser-use, Qwen 2.5 3B, Sentience – Jest assertions for AI web agents

1•tonyww•1h ago
Hi HN,

This is an integration between browser-use and Sentience SDK, where Sentience acts like Jest for AI web agents: it lets agents assert what changed on the page, instead of guessing when they’re “done”.

Most web agents today rely on screenshots or raw DOM dumps and hope the model infers layout, state, and completion correctly. That can work for demos, but it’s expensive, brittle, and very hard to validate or debug.

Sentience takes a different approach.

The goal isn't to replace vision models, but to avoid paying for them when geometric processing of page structure is sufficient, which is the case for most web pages.

At each step, Sentience builds a semantic snapshot of the page that captures what matters for interaction, not pixels or raw DOM:

* interactive elements (links, buttons, inputs)
* roles + normalized text
* bounding boxes + relative position
* dominant groups (main lists / feeds)
* ordinal structure (“first”, “top”, “last”)
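To make the shape of such a snapshot concrete, here is a minimal sketch in plain Python. It is not the Sentience SDK API: the `Element` class, the `render_snapshot` function, and the output line format are all hypothetical, chosen only to show how roles, normalized text, positions, groups, and ordinals could fit into a compact text form.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Element:
    """One interactive element in a hypothetical semantic snapshot."""
    role: str                        # e.g. "link", "button", "input"
    text: str                        # visible text, normalized on render
    bbox: Tuple[int, int, int, int]  # x, y, width, height in pixels
    group: Optional[str] = None      # e.g. "main_list" for the dominant feed


def normalize(text: str) -> str:
    """Collapse runs of whitespace so text is stable across renders."""
    return " ".join(text.split())


def render_snapshot(elements: List[Element]) -> str:
    """Serialize elements top-to-bottom, one compact text line each."""
    ordered = sorted(elements, key=lambda e: (e.bbox[1], e.bbox[0]))
    ordinals: Dict[str, int] = {}  # running position inside each group
    lines = []
    for idx, el in enumerate(ordered):
        parts = [f"[{idx}]", el.role, f'"{normalize(el.text)}"',
                 f"@({el.bbox[0]},{el.bbox[1]})"]
        if el.group:
            ordinals[el.group] = ordinals.get(el.group, 0) + 1
            parts.append(f"{el.group}#{ordinals[el.group]}")
        lines.append(" ".join(parts))
    return "\n".join(lines)


# Made-up page content, for illustration only.
page = [
    Element("link", "Second   story", (40, 160, 300, 20), "main_list"),
    Element("button", "Search", (500, 10, 80, 30)),
    Element("link", "First story", (40, 120, 300, 20), "main_list"),
]
snapshot_text = render_snapshot(page)
print(snapshot_text)
```

Sorting by vertical position before numbering is what lets a text-only model resolve phrases like "first" or "top" without seeing pixels; a real implementation would need a more careful notion of reading order.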

This snapshot is passed to the agent as structured text (≈ 0.6–1.2k tokens per step in practice), which enables:

* browser-use agents to run without screenshots by default
* text-based local LLMs (e.g. Qwen 2.5 3B) to work reliably using text-only prompts
* Jest-style assertions over semantic state (e.g. “main list exists”, “first item clicked”, “task complete”)
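The idea of Jest-style assertions over semantic state can be sketched as small named predicates evaluated against a snapshot. The example below is an illustration under assumptions, not the SDK's interface: the snapshot is modeled as a list of dicts, and `group_exists`, `text_visible`, and `run_assertions` are hypothetical helpers.

```python
from typing import Callable, Dict, List

Snapshot = List[Dict[str, object]]   # hypothetical: one dict per element
Assertion = Callable[[Snapshot], bool]


def group_exists(group: str, min_items: int = 1) -> Assertion:
    """Pass when at least min_items elements belong to the given group."""
    def check(snapshot: Snapshot) -> bool:
        count = sum(1 for el in snapshot if el.get("group") == group)
        return count >= min_items
    return check


def text_visible(needle: str) -> Assertion:
    """Pass when any element's text contains needle (case-insensitive)."""
    def check(snapshot: Snapshot) -> bool:
        return any(needle.lower() in str(el.get("text", "")).lower()
                   for el in snapshot)
    return check


def run_assertions(snapshot: Snapshot,
                   checks: Dict[str, Assertion]) -> Dict[str, bool]:
    """Evaluate every named check and report pass/fail per name."""
    return {name: check(snapshot) for name, check in checks.items()}


# Made-up snapshot, for illustration only.
snapshot = [
    {"role": "link", "text": "First story", "group": "main_list"},
    {"role": "link", "text": "Second story", "group": "main_list"},
    {"role": "button", "text": "Search"},
]
results = run_assertions(snapshot, {
    "main list exists": group_exists("main_list", min_items=2),
    "task complete": text_visible("Order confirmed"),
})
print(results)
```

Because each check returns a named boolean rather than a free-form model judgment, a failed run shows which expectation broke, much like a failing test case.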

In short:

* browser-use acts
* Sentience SDK asserts

That separation makes agent behavior inspectable, debuggable, and testable, instead of opaque and heuristic-driven. It also reduces reliance on vision models when they aren’t strictly necessary.

Sentience SDK includes an agent runtime with per-step and task-level assertions, so agents can explicitly verify progress rather than relying on implicit “done” signals. If an assertion still fails after retries, Sentience SDK falls back to vision models.
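The act, verify, retry, then fall back flow can be sketched as a small loop. This is a hedged illustration, not the SDK's runtime: `run_step`, its parameters, and the string return values are hypothetical, and the retry count is an arbitrary choice.

```python
from typing import Callable, List


def run_step(act: Callable[[], None],
             verify: Callable[[], bool],
             vision_fallback: Callable[[], None],
             max_retries: int = 2) -> str:
    """Run one agent step: act, assert, retry, then fall back to vision."""
    for _ in range(1 + max_retries):   # first try plus the retries
        act()
        if verify():                   # per-step assertion on page state
            return "verified"
    vision_fallback()                  # assertions kept failing
    return "fallback"


# Simulated step that only succeeds on the second attempt.
attempts = {"count": 0}
fallback_calls: List[str] = []


def click_first_item() -> None:
    attempts["count"] += 1


def first_item_opened() -> bool:
    return attempts["count"] >= 2


outcome = run_step(click_first_item, first_item_opened,
                   lambda: fallback_calls.append("vision"))
print(outcome, attempts["count"], fallback_calls)

# Simulated step whose assertion never passes, forcing the fallback.
stuck = run_step(lambda: None, lambda: False,
                 lambda: fallback_calls.append("vision"), max_retries=1)
print(stuck, fallback_calls)
```

Keeping the vision path as an explicit last resort means the more expensive model is only invoked when the cheaper text-only checks have demonstrably failed.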

Example (browser-use + local LLM): Multi-step agent using Qwen 2.5 3B with semantic assertions: https://github.com/SentienceAPI/browser-use/pull/6/files

Example for failed task assertion logs: https://justpaste.it/lixt2

Example for successful task run logs: https://justpaste.it/izg2y

Full write-up with design rationale, tradeoffs, and examples: https://medium.com/@rcholic/beyond-clicking-how-we-taught-ai...

Open source SDK:

Python: https://github.com/SentienceAPI/sentience-python

TypeScript: https://github.com/SentienceAPI/sentience-ts

browser-use integrations:

Jest-style assertions for agents: https://github.com/SentienceAPI/browser-use/pull/5

Browser-use + Local LLM (Qwen 2.5 3B) demo: https://github.com/SentienceAPI/browser-use/pull/4

Token usage comparison (semantic snapshots vs screenshots in browser-use): https://github.com/SentienceAPI/browser-use/pull/1

Happy to answer questions or share minimal examples if you’re curious how this works in practice.

Show HN screenshots from the test runs: https://jpcdn.it/img/small/435580414b5e9bc2236ca025573f3724....

Comments

tonyww•1h ago
A useful way to think about this: browser-use is the runner, Sentience is the assertion layer.

Agents act → Sentience verifies.

If you’ve built flaky E2E tests or agent demos that “usually work”, this is an attempt to make those workflows inspectable and testable.
