
Show HN: Browser-use, Qwen 2.5 3B, Sentience – Jest assertions for AI web agents

1•tonyww•3w ago
Hi HN,

This is an integration between browser-use and Sentience SDK, where Sentience acts like Jest for AI web agents: it lets agents assert what changed on the page, instead of guessing when they’re “done”.

Most web agents today rely on screenshots or raw DOM dumps and hope the model infers layout, state, and completion correctly. That can work for demos, but it’s expensive, brittle, and very hard to validate or debug.

Sentience takes a different approach.

The goal isn't to replace vision models, but to avoid paying for them when geometric processing of the page structure is enough, which is the case for most web pages.

At each step, Sentience builds a semantic snapshot of the page that captures what matters for interaction, not pixels or raw DOM:

* interactive elements (links, buttons, inputs)
* roles + normalized text
* bounding boxes + relative position
* dominant groups (main lists / feeds)
* ordinal structure (“first”, “top”, “last”)
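To make the snapshot idea concrete, here is a minimal Python sketch of how such a structure could be modeled and rendered as compact text. It is not the Sentience SDK's actual schema: `Element`, `normalize`, and `render_snapshot` are hypothetical names, and the output format is an assumption chosen for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Element:
    """Hypothetical snapshot entry; not the SDK's real type."""
    role: str                         # e.g. "link", "button", "textbox"
    text: str                         # visible text, normalized on render
    bbox: Tuple[int, int, int, int]   # x, y, width, height in pixels
    group: str = ""                   # dominant group label, e.g. "main_list"


def normalize(text: str) -> str:
    """Collapse runs of whitespace and lowercase the text."""
    return " ".join(text.split()).lower()


def render_snapshot(elements: List[Element]) -> str:
    """Render elements as one line each, in top-to-bottom, left-to-right order."""
    ordered = sorted(elements, key=lambda e: (e.bbox[1], e.bbox[0]))
    counters: Dict[str, int] = {}
    lines = []
    for idx, e in enumerate(ordered):
        ordinal = ""
        if e.group:
            # Position within the group gives the agent "first" / "last" cues.
            counters[e.group] = counters.get(e.group, 0) + 1
            ordinal = f" group={e.group}#{counters[e.group]}"
        x, y, w, h = e.bbox
        lines.append(
            f'[{idx}] {e.role} "{normalize(e.text)}" @({x},{y},{w}x{h}){ordinal}'
        )
    return "\n".join(lines)


page = [
    Element("link", "  Second   Story ", (10, 200, 300, 20), "main_list"),
    Element("button", "Login", (500, 10, 60, 20)),
    Element("link", "First Story", (10, 100, 300, 20), "main_list"),
]
print(render_snapshot(page))
```

Sorting by position before numbering is what lets a text-only model resolve phrases like "first item" without seeing pixels.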

This snapshot is passed to the agent as structured text (≈ 0.6–1.2k tokens per step in practice), which enables:

* browser-use agents to run without screenshots by default
* text-based local LLMs (e.g. Qwen 2.5 3B) to work reliably using text-only prompts
* Jest-style assertions over semantic state (e.g. “main list exists”, “first item clicked”, “task complete”)

In short:

* browser-use acts
* Sentience SDK asserts

That separation makes agent behavior inspectable, debuggable, and testable, instead of opaque and heuristic-driven. It also reduces reliance on vision models when they aren’t strictly necessary.
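The act/assert split can be sketched as a small loop in which every action is paired with an explicit post-condition and the outcome is logged. This is a toy illustration only: the page is a plain dict, and `run_task` and the step tuples are hypothetical, not part of browser-use or the Sentience SDK.

```python
from typing import Any, Callable, Dict, List, Tuple

Page = Dict[str, Any]
# (step name, action that mutates the page, post-condition on the page)
Step = Tuple[str, Callable[[Page], None], Callable[[Page], bool]]


def run_task(
    page: Page,
    steps: List[Step],
    task_done: Callable[[Page], bool],
) -> Tuple[bool, List[str]]:
    """Run each action, check its post-condition, and keep a readable log."""
    log: List[str] = []
    for name, act, check in steps:
        act(page)                      # the runner acts
        passed = check(page)           # the assertion layer verifies
        log.append(f"{name}: {'PASS' if passed else 'FAIL'}")
        if not passed:
            return False, log          # stop at the first failed assertion
    done = task_done(page)             # task-level assertion
    log.append(f"task complete: {'PASS' if done else 'FAIL'}")
    return done, log


page: Page = {}
steps: List[Step] = [
    ("open list", lambda p: p.update(items=["a", "b"]), lambda p: bool(p.get("items"))),
    ("click first", lambda p: p.update(clicked=p["items"][0]), lambda p: p.get("clicked") == "a"),
]
ok, log = run_task(page, steps, task_done=lambda p: "clicked" in p)
print(ok, log)
```

Because each step records a named pass or fail, a broken run points at the exact step that diverged instead of an ambiguous "done" signal.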

Sentience SDK includes an agent runtime with per-step and task-level assertions, so agents can explicitly verify progress rather than relying on implicit “done” signals. On assertion failure after retries, Sentience SDK will trigger a fallback to vision models.
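A rough sketch of the retry-then-fallback behavior described above, under stated assumptions: `verify_step`, its parameters, and the result dict are hypothetical, the snapshot is a plain dict, and the vision fallback is just a callable standing in for a vision-model check. The real SDK's runtime may work differently.

```python
from typing import Any, Callable, Dict, List, Optional

Snapshot = Dict[str, Any]
Assertion = Callable[[Snapshot], bool]


def verify_step(
    take_snapshot: Callable[[], Snapshot],
    assertions: Dict[str, Assertion],
    max_retries: int = 2,
    vision_fallback: Optional[Callable[[], bool]] = None,
) -> Dict[str, Any]:
    """Re-snapshot and re-check up to max_retries times, then fall back."""
    failed: List[str] = []
    attempts = max_retries + 1
    for attempt in range(1, attempts + 1):
        snap = take_snapshot()
        failed = [name for name, check in assertions.items() if not check(snap)]
        if not failed:
            return {"ok": True, "attempts": attempt, "used_vision": False, "failed": []}
    if vision_fallback is not None:
        # Text-only checks kept failing; hand the decision to the costlier path.
        return {"ok": vision_fallback(), "attempts": attempts,
                "used_vision": True, "failed": failed}
    return {"ok": False, "attempts": attempts, "used_vision": False, "failed": failed}


# The list appears only on the second snapshot, e.g. after the page finishes loading.
snaps = iter([{"groups": []}, {"groups": ["main_list"]}])
result = verify_step(
    lambda: next(snaps),
    {"main list exists": lambda s: "main_list" in s["groups"]},
)
print(result)
```

Keeping the names of the failed assertions in the result is what makes the fallback decision auditable afterwards.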

Example (browser-use + local LLM): Multi-step agent using Qwen 2.5 3B with semantic assertions: https://github.com/SentienceAPI/browser-use/pull/6/files

Example logs from a failed task assertion: https://justpaste.it/lixt2

Example logs from a successful task run: https://justpaste.it/izg2y

Full write-up with design rationale, tradeoffs, and examples: https://medium.com/@rcholic/beyond-clicking-how-we-taught-ai...

Open source SDK:

Python: https://github.com/SentienceAPI/sentience-python
TypeScript: https://github.com/SentienceAPI/sentience-ts

browser-use integrations:

Jest-style assertions for agents: https://github.com/SentienceAPI/browser-use/pull/5

Browser-use + Local LLM (Qwen 2.5 3B) demo: https://github.com/SentienceAPI/browser-use/pull/4

Token usage comparison (semantic snapshots vs screenshots in browser-use): https://github.com/SentienceAPI/browser-use/pull/1

Happy to answer questions or share minimal examples if you’re curious how this works in practice.

ShowHN screenshots from the test runs: https://jpcdn.it/img/small/435580414b5e9bc2236ca025573f3724....

Comments

tonyww•3w ago
A useful way to think about this: browser-use is the runner, Sentience is the assertion layer.

Agents act → Sentience verifies.

If you’ve built flaky E2E tests or agent demos that “usually work”, this is an attempt to make those workflows inspectable and testable.