> Under the hood, we're building a client-sourced RAG for the DOM. An agent's first move on a page is to check a vector DB for a known "map." ... This creates a wild side-effect: the system is self-healing for everyone. One person's failed automation accidentally fixes it for the next hundred users.
I think I'd like to know exactly what kind of data is extracted from the DOM to build that shared map.
I guess Playwright can do it in "record" mode; I'm curious how you do it from a Chrome extension.
Spitballing here: you inject an event listener on the page and, when the click happens, grab the element and run some code to synthesize a selector that refers to just that element? (Presumably you could just reuse Playwright's element-to-locator code at this point.)
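To make the guess above concrete, here is a minimal sketch of selector synthesis in TypeScript. It is not Playwright's locator algorithm and not necessarily what the extension does; `ElementLike`, `quoteAttr`, and `synthesizeSelector` are hypothetical names, and the sketch assumes that `data-testid` and `id` values are unique on the page.

```typescript
// Minimal structural type so the sketch also runs outside a browser;
// a real DOM Element has all of these members.
interface ElementLike {
  tagName: string;
  id: string;
  parentElement: ElementLike | null;
  children: ArrayLike<ElementLike>;
  getAttribute(name: string): string | null;
}

// Quote a value for use inside an attribute selector such as [id="..."].
function quoteAttr(value: string): string {
  return '"' + value.replace(/\\/g, "\\\\").replace(/"/g, '\\"') + '"';
}

// Walk up from the element, anchoring on the first ancestor that has a
// data-testid or id (assumed unique), and use :nth-of-type below that.
function synthesizeSelector(el: ElementLike): string {
  const parts: string[] = [];
  let node: ElementLike | null = el;
  while (node) {
    const testId = node.getAttribute("data-testid");
    if (testId) {
      parts.unshift(`[data-testid=${quoteAttr(testId)}]`);
      break;
    }
    if (node.id) {
      parts.unshift(`[id=${quoteAttr(node.id)}]`);
      break;
    }
    const tag = node.tagName.toLowerCase();
    const parent: ElementLike | null = node.parentElement;
    if (!parent) {
      parts.unshift(tag); // reached the root without finding an anchor
      break;
    }
    // 1-based position among siblings that share the same tag name.
    let count = 0;
    let position = 0;
    for (let i = 0; i < parent.children.length; i++) {
      const sibling = parent.children[i];
      if (sibling.tagName === node.tagName) {
        count++;
        if (sibling === node) position = count;
      }
    }
    parts.unshift(`${tag}:nth-of-type(${position})`);
    node = parent;
  }
  return parts.join(" > ");
}
```

In a content script, this could be wired to a capturing `click` listener on `document` that passes `event.target` to `synthesizeSelector`. Positional selectors like these are exactly the kind that break on highly dynamic pages, which is why a stable attribute is preferred whenever one exists.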
If you run this against, let's say, a typical e-commerce page where the navigation and all screen elements are super dynamic (user-specific data, language, etc.), this problem becomes even harder.
Self-healing CSS selectors are also only one part of the story. The other part is a cohesive interface through which the agent itself uses these selectors.
So it does the first pass (based on your goals) and makes memories (these are local).
Now you tell the agent you want to do this repeatedly, so it builds a workflow for you based on these memories and interactions. These workflows are saved on the server; they are all public for now, but we are working out permissions/group-based access.
The problem is that, many times, what the agent thinks is stable really isn't, so there is a feedback loop for the agent to test the workflows and improve them. (It's basically Claude Code/Codex sitting in the browser.)
Workflow details are appended to the prompt based on a user query match or an opened-tabs match.
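As a rough illustration of that last step, here is a sketch of how saved workflows might be selected and appended to a prompt. The `Workflow` shape, the keyword and URL-fragment matching, and the function names are all assumptions made for this example; the actual matching logic is not described in this thread.

```typescript
// Hypothetical shape of a saved workflow; the field names are assumptions.
interface Workflow {
  name: string;
  keywords: string[];      // terms that suggest the workflow fits a query
  urlFragments: string[];  // substrings of tab URLs the workflow applies to
  details: string;         // instructions handed to the agent
}

// A workflow matches if the query mentions one of its keywords
// or any open tab URL contains one of its URL fragments.
function selectWorkflows(
  workflows: Workflow[],
  query: string,
  openTabUrls: string[],
): Workflow[] {
  const q = query.toLowerCase();
  return workflows.filter(
    (w) =>
      w.keywords.some((k) => q.includes(k.toLowerCase())) ||
      w.urlFragments.some((f) => openTabUrls.some((u) => u.includes(f))),
  );
}

// Append the details of every matching workflow to the base prompt.
function buildPrompt(
  basePrompt: string,
  workflows: Workflow[],
  query: string,
  openTabUrls: string[],
): string {
  const matched = selectWorkflows(workflows, query, openTabUrls);
  if (matched.length === 0) return basePrompt;
  const section = matched.map((w) => `- ${w.name}: ${w.details}`).join("\n");
  return `${basePrompt}\n\nKnown workflows:\n${section}`;
}
```

Plain substring matching keeps the sketch short; a real system would more plausibly use embeddings or proper URL pattern matching, as the quoted vector DB lookup suggests.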
I ask it to fix something. It claims to know what the problem is, changes the code, and then claims it's fixed. I open the app, and it's still broken. I have to tell it, repeatedly and way too often, what is broken.
Now, supposing I'm "vibe coding" and don't really care about the obvious fact that the AI doesn't actually know what it is doing, it's still frustrating that I have to be in the loop just to provide very basic information like that.
Are there any agentic coding setups that allow the agent to interact with the app it's working on to check if it actually works?
It's sped up debugging a lot since I can just give it instructions like "found a bug that does XYZ, I think it's a problem with functionABC(); connect to app, click these four buttons in this order, examine the internal state, then trace through the code to figure out what's going wrong and present a solution"
I was pretty resistant at first to delegating debugging blindly like that, but it's made the workflow pretty smooth. I can occasionally just open the app, run through it as a human user, take notes on bugs and flow issues that I find, log them with steps to reproduce, then give Claude a list of bugs to noodle on while I focus on stuff LLMs are terrible at (design, UI, frontend work, etc.).
Hm somehow I feel like this is a giant step in the wrong direction.
It seems like I'm only able to use the pre-existing/canned workflows that are provided under different "Persona"s? And there's no way for me to just create a new workflow from scratch for my specific use case.
Am I missing something obvious?
The one you refer to will be taken down soon. Ping me on Discord if you need help trying it.
Basically, how do I use this self-healing DOM that the article is all about?
Related - the new extension only works if I allow it to be my new tab default page? That's pretty intrusive, if I'm honest.
You can also use the floating window or the side panel for the agent. The new tab page is just for convenience.
one off putting thing about installing the extension is all the reviewers seem to be Indian and I've seen similar patterns across Google Reviews where there is a flood of reviews from Indian users and they are almost always fraud or some weird scam
not saying this is the case here but whenever I see a bunch of reviews from Indian names, it automatically makes me trust whatever service or product less just fyi.
phgn•2h ago
There's this error in the console: Failed to load module script: Expected a JavaScript-or-Wasm module script but the server responded with a MIME type of "text/html". Strict MIME type checking is enforced for module scripts per HTML spec.