Show HN: PageAgent, A GUI agent that lives inside your web app

https://alibaba.github.io/page-agent/
42•simon_luv_pho•2h ago

Hi HN,

I'm building PageAgent, an open-source (MIT) library that embeds an AI agent directly into your frontend.

I built this because I believe there's a massive design space for deploying general agents natively inside the web apps we already use, rather than treating the web merely as a dumb target for isolated bots.

Currently, most AI agents operate from external clients or server-side programs, effectively leaving web development out of the AI ecosystem. I'm experimenting with an "inside-out" paradigm instead. By dropping the library into a page, you get a client-side agent that interacts natively with the live DOM tree and inherits the user's active session out of the box, which works perfectly for SPAs.
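One way a client-side agent could expose the live DOM to a language model is to list the interactive elements with stable indices, so the model can answer with "click [1]" instead of inventing selectors. The sketch below is only an illustration of that idea, not PageAgent's actual implementation; `ElementLike`, `INTERACTIVE_TAGS`, and `describeInteractive` are hypothetical names.

```typescript
// Hypothetical sketch: not taken from the PageAgent source.
// ElementLike is a structural subset of the DOM Element interface,
// so the function accepts real elements and also runs outside a browser.
interface ElementLike {
  tagName: string;
  textContent: string | null;
  getAttribute(name: string): string | null;
}

// Tags treated as interactive in this sketch (an assumption, not an exhaustive list).
const INTERACTIVE_TAGS = ["A", "BUTTON", "INPUT", "SELECT", "TEXTAREA"];

// Returns one line per interactive element, e.g. "[0] <button> Save draft".
function describeInteractive(elements: ArrayLike<ElementLike>): string[] {
  const lines: string[] = [];
  for (let i = 0; i < elements.length; i++) {
    const el = elements[i];
    const isInteractive =
      INTERACTIVE_TAGS.indexOf(el.tagName.toUpperCase()) !== -1 ||
      el.getAttribute("role") === "button";
    if (!isInteractive) continue;
    // Prefer an explicit accessible label, fall back to visible text.
    const raw = el.getAttribute("aria-label") ?? el.textContent ?? "";
    const label = raw.trim().replace(/\s+/g, " ");
    lines.push(`[${lines.length}] <${el.tagName.toLowerCase()}> ${label}`);
  }
  return lines;
}
```

In a browser, `describeInteractive(document.querySelectorAll("*"))` would produce the list for the current page, and the resulting lines could be placed in the prompt alongside the user's instruction.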

To handle cross-page tasks, I built an optional browser extension that acts as a "bridge". This allows the web-page agent to control the entire browser with explicit user authorization. Instead of a desktop app controlling your browser, your web app is empowered to act as a general agent that can navigate the broader web.

I'd love to start a conversation about the viability of this architecture, and what you all think about the future of in-app general agents. Happy to answer any questions!

Comments

simon_luv_pho•2h ago
This is highly experimental right now, but here are some quick links for anyone wanting to dig deeper:

- GitHub: https://github.com/alibaba/page-agent

- Live Demo (No sign-up): https://alibaba.github.io/page-agent/ (you can drag the bookmarklet from here to try it on other sites)

- Browser Extension: https://chromewebstore.google.com/detail/page-agent-ext/akld...

I'd be really interested in feedback on the security model of giving client-side agents extension-bridge access, and I'm happy to take questions on the implementation!

jauntywundrkind•2h ago
Not exactly the same, but I'd also point to Paul Kinlan's FolioLM as a very interesting project in this space. It's a very nice browser extension:

> Collect and query content from tabs, bookmarks, and history - your AI research companion. FolioLM helps you collect sources from tabs, bookmarks, and history, then query and transform that content using AI.

https://github.com/PaulKinlan/NotebookLM-Chrome https://chromewebstore.google.com/detail/foliolm/eeejhgacmlh...

simon_luv_pho•1h ago
Thanks for sharing! We need more projects like this in the JS ecosystem.
klueinc•49m ago
I've been trying to arrive at something like this with my own sidepanel extension called Klue, but it's more of a user notes + web page context approach. Nice to see another take on this! https://chromewebstore.google.com/detail/cackjmmgcmnkjnffabk...
pscanf•1h ago
Very cool!

I'm particularly impressed by the bookmark "trick" to install it on a page. Despite having spent 15 years developing for the browser, I had somehow missed that feature of the bookmarks bar. But awesome UX for people to try out the tool. Congrats!

simon_luv_pho•1h ago
Thanks!

Bookmarklets are such an underrated feature. It's super convenient to inject and test scripts on any page. Seemed like the perfect low-friction entry point for people to try it out.

Spent some time on that UX because the concept is a bit hard to explain. Glad it worked!
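For readers who have not used bookmarklets: a bookmarklet is a bookmark whose URL starts with `javascript:`, so clicking it runs code in the current page. A minimal sketch of how such a link could be generated is shown here; `makeBookmarklet` and the example script URL are hypothetical and do not come from the PageAgent demo page.

```typescript
// Hypothetical helper: builds a bookmarklet URL that injects an external script.
function makeBookmarklet(scriptUrl: string): string {
  // JSON.stringify yields a safely quoted JavaScript string literal for the URL.
  const payload =
    "(function(){var s=document.createElement('script');" +
    "s.src=" + JSON.stringify(scriptUrl) + ";" +
    "document.head.appendChild(s);})();";
  // Percent-encode the payload so it survives being stored as a bookmark URL.
  return "javascript:" + encodeURIComponent(payload);
}

// Example with a placeholder URL (an assumption, not the real script location).
const href = makeBookmarklet("https://example.com/agent.js");
```

The resulting `href` can be set on a link that users drag to the bookmarks bar. Sites with a strict Content Security Policy may still refuse to load the injected script.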

MeteorMarc•1h ago
Confusing name because of the existence of Pageant, the PuTTY agent.
kirth_gersen•54m ago
Came here to say missed opportunity to call it "PAgent". Rolls off the tongue better than Page Agent.
simon_luv_pho•46m ago
Darn. Pageant would've been a nice name though. Maybe `page-agent.js` is more relevant in the web dev community.
coreylane•51m ago
Looks cool! Are you open to adding AWS Bedrock or LiteLLM support?
simon_luv_pho•14m ago
Thanks!

It supports any OpenAI-compatible API out of the box, so AWS Bedrock, LiteLLM, Ollama, etc. should all work. The free testing LLM is just there for a quick demo. Please bring your own LLM for long-term use.
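To illustrate what "OpenAI-compatible" means in practice: such servers accept a POST to `<base URL>/chat/completions` with a JSON body containing `model` and `messages`. The sketch below builds that request for a configurable endpoint. The names `LlmEndpoint` and `buildChatRequest` are hypothetical and are not PageAgent's configuration API; the base URL shown is the default address of a local Ollama server's OpenAI-compatible endpoint, and the model name is a placeholder.

```typescript
// Hypothetical names throughout: this is not PageAgent's actual configuration API.
interface LlmEndpoint {
  baseURL: string; // e.g. "http://localhost:11434/v1" for a local Ollama server
  apiKey?: string; // optional: many local servers do not check it
  model: string;
}

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface PreparedRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Builds (but does not send) a chat-completions request for the given endpoint.
function buildChatRequest(endpoint: LlmEndpoint, messages: ChatMessage[]): PreparedRequest {
  const headers: Record<string, string> = { "Content-Type": "application/json" };
  if (endpoint.apiKey) {
    headers["Authorization"] = `Bearer ${endpoint.apiKey}`;
  }
  return {
    // Strip trailing slashes so the path is not doubled up.
    url: `${endpoint.baseURL.replace(/\/+$/, "")}/chat/completions`,
    method: "POST",
    headers,
    body: JSON.stringify({ model: endpoint.model, messages }),
  };
}

// Placeholder model name; substitute whatever model your server exposes.
const request = buildChatRequest(
  { baseURL: "http://localhost:11434/v1/", model: "my-local-model" },
  [{ role: "user", content: "Click the Save button" }],
);
```

Passing `request.url` and the remaining fields to `fetch` would send it. Only the base URL, key, and model change between providers, which is why swapping in another compatible backend needs no code changes.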

dzink•50m ago
Is this affiliated with the Chinese company Alibaba? Any chance data goes there too?
simon_luv_pho•20m ago
Full transparency: I work at Alibaba and published this under Alibaba's open-source org. I maintain it during work hours, so yes, Alibaba technically pays me for it. That said, this is my project — it's MIT-licensed, includes no backend service, and is open for anyone to audit.

The free testing LLM endpoint is hosted on Alibaba Cloud because I happen to have some company quota to spend, but it's not part of the library. Bring your own LLM and there is zero data transmission to Alibaba or anywhere else you haven't configured yourself.

I highly recommend using it with a local Ollama setup.

mentalgear•44m ago
> Data processed via servers in Mainland China

Appreciate the transparency, but maybe you could add some alternatives, preferably European ones?

simon_luv_pho•36m ago
Please use your own LLM API instead!

The free testing LLM is Qwen hosted by Aliyun. Qwen and DeepSeek are the only ones I can afford to offer for free. It's just there to lower the try-out barrier; please DO NOT rely on it.

The library itself does NOT include any backend service. Your data only goes to the LLM API you configured.

I tested it on local Ollama models and it works fine.

simon_luv_pho•18m ago
I'm looking into a European testing endpoint. The problem is I don't have enough resources to figure out all the legal and compliance requirements, and persuading my company to pay for that infrastructure is gonna be a tough sell.
general_reveal•25m ago
I’ve been thinking about something like this. If it’s just a one-line script import, how the heck are you trusting natural language to translate to commands for an arbitrary UI?

The only thing I can think of is you had the AI rewrite and embed selectors on the entire build file and work with that?

Mnexium•22m ago
Curious - how does it perform with captchas and other "are you human" stuff on the web?
simon_luv_pho•6m ago
I added in the system prompt that it should skip CAPTCHAs and hand control back to the user. Currently working on a proper human-in-the-loop feature. That's actually one of the key advantages of running the agent inside your own browser.
popalchemist•11m ago
Does it support long-click / click-and-drag?
simon_luv_pho•1m ago
Not yet. Currently focused on the more common interaction patterns. PRs welcome though!
