Show HN: Browser-use, Qwen 2.5 3B, Sentience – Jest assertions for AI web agents

1•tonyww•3w ago
Hi HN,

This is an integration between browser-use and Sentience SDK, where Sentience acts like Jest for AI web agents: it lets agents assert what changed on the page, instead of guessing when they’re “done”.

Most web agents today rely on screenshots or raw DOM dumps and hope the model infers layout, state, and completion correctly. That can work for demos, but it’s expensive, brittle, and very hard to validate or debug.

Sentience takes a different approach.

The goal isn't to replace vision models, but to avoid paying for them when geometric processing of the page structure is enough, which is the case for most web pages.

At each step, Sentience builds a semantic snapshot of the page that captures what matters for interaction, not pixels or raw DOM:

* interactive elements (links, buttons, inputs)
* roles + normalized text
* bounding boxes + relative position
* dominant groups (main lists / feeds)
* ordinal structure (“first”, “top”, “last”)
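As a rough illustration of the fields listed above, here is a minimal sketch of how such a snapshot could be modeled and rendered as structured text. It is not the Sentience SDK API: `Element`, `normalize`, `snapshot_to_text`, and the line format are all hypothetical names and choices made up for this example.

```python
from dataclasses import dataclass


@dataclass
class Element:
    """One interactive element in a hypothetical semantic snapshot."""
    role: str                        # e.g. "link", "button", "textbox"
    text: str                        # visible text, normalized on output
    bbox: tuple[int, int, int, int]  # x, y, width, height in pixels
    group: str = ""                  # dominant group (e.g. main list), if any
    ordinal: int = -1                # 0-based position in its group, -1 if none


def normalize(text: str) -> str:
    """Collapse runs of whitespace so the model sees stable text."""
    return " ".join(text.split())


def snapshot_to_text(elements: list[Element]) -> str:
    """Render elements top-to-bottom, then left-to-right, one per line."""
    ordered = sorted(elements, key=lambda e: (e.bbox[1], e.bbox[0]))
    lines = []
    for i, e in enumerate(ordered):
        x, y, w, h = e.bbox
        line = f'[{i}] {e.role} "{normalize(e.text)}" @({x},{y},{w}x{h})'
        if e.group:
            line += f" group={e.group}#{e.ordinal}"
        lines.append(line)
    return "\n".join(lines)
```

Sorting by position gives the model a stable reading order, so words like “first” or “top” map onto concrete line indices instead of having to be inferred from pixels.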

This snapshot is passed to the agent as structured text (≈ 0.6–1.2k tokens per step in practice), which enables:

* browser-use agents to run without screenshots by default
* text-based local LLMs (e.g. Qwen 2.5 3B) to work reliably using text-only prompts
* Jest-style assertions over semantic state (e.g. “main list exists”, “first item clicked”, “task complete”)
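To make “Jest-style assertions over semantic state” concrete, here is a small sketch of a chainable assertion helper over a snapshot. It assumes the snapshot is a plain list of element dicts; `expect`, `Expect`, `group_exists`, `has_element`, and `AssertionFailed` are hypothetical names for illustration, not the real Sentience SDK interface.

```python
class AssertionFailed(Exception):
    """Raised when the page state does not match an expectation."""


class Expect:
    """Tiny Jest-like wrapper over a snapshot (a list of element dicts)."""

    def __init__(self, snapshot: list[dict]):
        self.snapshot = snapshot

    def group_exists(self, group: str, min_items: int = 1) -> "Expect":
        """Pass if at least min_items elements belong to the group."""
        count = sum(1 for e in self.snapshot if e.get("group") == group)
        if count < min_items:
            raise AssertionFailed(
                f"expected group {group!r} with >= {min_items} items, found {count}"
            )
        return self

    def has_element(self, role: str, text: str) -> "Expect":
        """Pass if some element has this role and contains the text."""
        wanted = text.lower()
        for e in self.snapshot:
            if e["role"] == role and wanted in e["text"].lower():
                return self
        raise AssertionFailed(f"no {role} containing {text!r}")


def expect(snapshot: list[dict]) -> Expect:
    return Expect(snapshot)


# Usage: both checks pass, so no exception is raised.
page = [
    {"role": "link", "text": "First story", "group": "main_list"},
    {"role": "link", "text": "Second story", "group": "main_list"},
    {"role": "button", "text": "Log in"},
]
expect(page).group_exists("main_list", min_items=2).has_element("button", "log in")
```

Because a failed check raises with a readable message, a failing step leaves a log line that says which expectation broke rather than a silent “done”.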

In short:

* browser-use acts
* Sentience SDK asserts

That separation makes agent behavior inspectable, debuggable, and testable, instead of opaque and heuristic-driven. It also reduces reliance on vision models when they aren’t strictly necessary.

Sentience SDK includes an agent runtime with per-step and task-level assertions, so agents can explicitly verify progress rather than relying on implicit “done” signals. On assertion failure after retries, Sentience SDK will trigger a fallback to vision models.
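The control flow described here (act, assert, retry, then fall back to vision) can be sketched roughly as follows. This is an assumption about the shape of the loop, not the SDK's actual runtime: `run_step`, its parameters, and the default retry count are hypothetical.

```python
from typing import Callable


def run_step(
    act: Callable[[], None],
    verify: Callable[[], bool],
    vision_fallback: Callable[[], bool],
    max_retries: int = 2,
) -> str:
    """Run one agent step and report which path confirmed it.

    act             -- performs the action (click, type, navigate)
    verify          -- cheap assertion over the semantic snapshot
    vision_fallback -- expensive path, used only once retries are exhausted
    """
    # One initial attempt plus max_retries retries on the cheap path.
    for attempt in range(1 + max_retries):
        act()
        if verify():
            return f"verified on attempt {attempt + 1}"
    # The semantic assertion kept failing, so hand over to the vision model.
    if vision_fallback():
        return "verified by vision fallback"
    raise RuntimeError("step failed after retries and vision fallback")
```

Keeping the vision call behind the retry loop means the expensive model is only paid for on steps the text-only path could not confirm.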

Example (browser-use + local LLM): Multi-step agent using Qwen 2.5 3B with semantic assertions: https://github.com/SentienceAPI/browser-use/pull/6/files

Example logs from a failed task assertion: https://justpaste.it/lixt2

Example logs from a successful task run: https://justpaste.it/izg2y

Full write-up with design rationale, tradeoffs, and examples: https://medium.com/@rcholic/beyond-clicking-how-we-taught-ai...

Open source SDK:

Python: https://github.com/SentienceAPI/sentience-python

TypeScript: https://github.com/SentienceAPI/sentience-ts

browser-use integrations:

Jest-style assertions for agents: https://github.com/SentienceAPI/browser-use/pull/5

Browser-use + Local LLM (Qwen 2.5 3B) demo: https://github.com/SentienceAPI/browser-use/pull/4

Token usage comparison (semantic snapshots vs screenshots in browser-use): https://github.com/SentienceAPI/browser-use/pull/1

Happy to answer questions or share minimal examples if you’re curious how this works in practice.

Show HN screenshots from the test runs: https://jpcdn.it/img/small/435580414b5e9bc2236ca025573f3724....

Comments

tonyww•3w ago
A useful way to think about this: browser-use is the runner, Sentience is the assertion layer.

Agents act → Sentience verifies.

If you’ve built flaky E2E tests or agent demos that “usually work”, this is an attempt to make those workflows inspectable and testable.