I made a zero cost browser-use tool – let AI click and type on webpages for you

https://github.com/pdufour/browser-use-wasm
1•pdufour•1h ago

Comments

pdufour•1h ago
Link: https://github.com/pdufour/browser-use-wasm

One of the big constraints of browser-use models is that they require a server running a vision language model (VLM) to process the screenshots and convert them to actions.

That means if, for instance, you are a site owner and want to include an AI widget that lets users control the page they are on (e.g. ask the page to fill out a form), you would need a complicated server setup running a VLM.

I decided to build something different. We have had WebGPU and client-side models for a while, so I built a library that does the following:

[Live page (iframe)] ──► [SnapDOM screenshot] ──► [ShowUI VLA WASM worker] ──► [DOM action at [x, y]]

Essentially this creates a browser-use model that runs entirely in your browser (no servers). There are a couple of libraries that make this possible:

- wllama, for instance, lets you run any GGUF model, which means easy access to VLA models on Hugging Face (I found ShowUI-2B to be the best, but I want to try Nvidia LocateAnything)

- snapdom - as mentioned, this renders your webpage to an SVG, which is then passed to the VLA
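To make the pipeline above a bit more concrete, here is a minimal TypeScript sketch of a single capture, predict, act step. Everything in it is an assumption for illustration: the `Stages` interface, `runStep`, and `toPagePoint` are hypothetical names and not the API of wllama, snapdom, or the linked repo, and it assumes the model reports a click position normalized to the 0..1 range, which then has to be scaled to the iframe's pixel size.

```typescript
// Hypothetical types for illustration only; not taken from wllama, snapdom, or ShowUI.
interface Screenshot { width: number; height: number; data: Uint8Array; }
interface ModelAction { kind: "click" | "type"; position: [number, number]; text?: string; }
interface PagePoint { x: number; y: number; }
interface Viewport { width: number; height: number; }

// The three stages are injected so the same flow works with real or stubbed parts.
interface Stages {
  capture: () => Promise<Screenshot>;                                          // e.g. a DOM screenshot
  predict: (shot: Screenshot, instruction: string) => Promise<ModelAction>;    // e.g. a VLA in a worker
  act: (point: PagePoint, action: ModelAction) => Promise<void>;               // e.g. click or type
}

// Assumption: position is normalized to [0, 1]. Out-of-range values are clamped
// so the resulting pixel always falls inside the viewport.
function toPagePoint(position: [number, number], width: number, height: number): PagePoint {
  const clamp = (v: number): number => Math.min(1, Math.max(0, v));
  return {
    x: Math.min(width - 1, Math.floor(clamp(position[0]) * width)),
    y: Math.min(height - 1, Math.floor(clamp(position[1]) * height)),
  };
}

// One pass through the pipeline: screenshot, model prediction, DOM action.
async function runStep(stages: Stages, instruction: string, viewport: Viewport): Promise<PagePoint> {
  const shot = await stages.capture();
  const action = await stages.predict(shot, instruction);
  const point = toPagePoint(action.position, viewport.width, viewport.height);
  await stages.act(point, action);
  return point;
}
```

In a real page, the `act` stage could look up the target with something like `document.elementFromPoint(x, y)` on the iframe's document (same-origin only) and dispatch the click or key events there.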

After creating the workflow with those libraries, the rest is cake (not).

Some difficulties I had and my solutions for them:

- Snapdom had 1px rendering differences due to inconsistencies when rendering HTML that uses a system font inside a foreignObject tag in an SVG - the fix is to use fonts from a CDN, which provide font metrics for leading values

- Image resizing - you have to do some resizing to fit everything into limited space - this involved trying many different resizing approaches

- Accuracy - finding out what increased my accuracy was quite hard at first, until I found some evals such as MiniWoB++ (a web interaction test suite)

- Multi-step planning - my half-baked solution is to let the LLM generate the multiple steps, but for it to be comprehensive I would need to capture the page, generate, capture the page, generate, etc. in a loop. I haven't done that yet
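The capture/generate loop from the last bullet might look roughly like the TypeScript sketch below. This is not something the project implements (the post says it has not been done yet); the `Agent` interface, the `Step` shape, the `"done"` stop signal, and the `maxSteps` cap are all assumptions made up for illustration.

```typescript
// Hypothetical step shape; "done" is an assumed stop signal, not a real model output.
type Step =
  | { kind: "click"; x: number; y: number }
  | { kind: "type"; text: string }
  | { kind: "done" };

// Capture, generation, and execution are injected so the loop itself stays generic.
interface Agent<Shot> {
  capture: () => Promise<Shot>;
  nextStep: (shot: Shot, goal: string, history: Step[]) => Promise<Step>;
  apply: (step: Step) => Promise<void>;
}

// Capture the page, ask for ONE next step, apply it, and repeat.
// maxSteps bounds the loop in case the model never signals completion.
async function runUntilDone<Shot>(agent: Agent<Shot>, goal: string, maxSteps = 10): Promise<Step[]> {
  const history: Step[] = [];
  for (let i = 0; i < maxSteps; i++) {
    // Re-capture every iteration so the model sees the effect of the previous action.
    const shot = await agent.capture();
    const step = await agent.nextStep(shot, goal, history);
    if (step.kind === "done") break;
    await agent.apply(step);
    history.push(step);
  }
  return history;
}
```

The design choice here is to generate one step per screenshot instead of a full plan up front, which costs one model call per action but lets each step react to what actually happened on the page.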

I am very interested in the client-side LLM space, so let me know if you have any thoughts!
