Show HN: Bonsai – Using agentic AI / browser / memory to replace ChatGPT

https://drive.google.com/drive/folders/1YUQ3tmcBSLEyBKLi5JdJgmod9mqXFTgl
2•coolwulf•1h ago

Comments

coolwulf•4m ago
Bonsai: A Local Agentic AI Harness Built Around Small Models

Since last year, I've been teaching a course at UT Southwestern Medical Center on how to build Agentic AI systems and harnesses for specialized domains.

One thing I've noticed is that as companies like OpenAI, Google, and Anthropic continue raising API prices, the cost of running frontier models in the cloud keeps increasing. At the same time, many users are using ChatGPT the same way they used Google years ago: asking questions and looking up information. Most of these use cases simply don't justify paying for GPT-5.5, Opus 4.8, or other expensive flagship models.

That led me to explore a different idea: combining efficient local models with a purpose-built harness that provides tools, memory, and domain-specific skills.

Part of the reason I named this project Bonsai is that I had some interactions with Stanford's Prism Lab.

The architecture follows an Agent + Skills + Memory design. Memory is implemented locally using embeddings and SQLite, allowing semantic retrieval through cosine similarity search. This helps compensate for the limited context windows of smaller local models.
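As a rough illustration of the memory design described above, here is a minimal sketch of an embedding store backed by SQLite with cosine similarity retrieval. It is not Bonsai's actual code: the `Memory` class, the table layout, and the `embed` function are all hypothetical, and `embed` is a toy hashed bag-of-words stand-in for whatever local embedding model the harness really uses.

```python
import math
import sqlite3
import struct
import zlib

DIM = 256  # toy vector size; a real embedding model dictates its own


def embed(text):
    """Placeholder embedding: hashed bag-of-words (NOT a real model)."""
    vec = [0.0] * DIM
    for word in text.lower().split():
        vec[zlib.crc32(word.encode("utf-8")) % DIM] += 1.0
    return vec


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class Memory:
    """Hypothetical memory store: text plus its vector, kept in SQLite."""

    def __init__(self, path=":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS memory "
            "(id INTEGER PRIMARY KEY, text TEXT, vec BLOB)"
        )

    def add(self, text):
        # Store the vector as packed 32-bit floats in a BLOB column.
        blob = struct.pack(f"{DIM}f", *embed(text))
        self.db.execute(
            "INSERT INTO memory (text, vec) VALUES (?, ?)", (text, blob)
        )
        self.db.commit()

    def search(self, query, k=3):
        # Brute-force scan: score every stored row against the query.
        q = embed(query)
        scored = []
        for text, blob in self.db.execute("SELECT text, vec FROM memory"):
            scored.append((cosine(q, struct.unpack(f"{DIM}f", blob)), text))
        scored.sort(reverse=True)
        return scored[:k]


memory = Memory()
memory.add("the user prefers dark mode")
memory.add("meeting with Dr. Lee on Friday")
memory.add("favorite language is Rust")
for score, text in memory.search("favorite programming language", k=1):
    print(f"{score:.2f}  {text}")
```

Only the top-k retrieved snippets would be placed in the prompt, which is how this kind of store can stand in for a longer context window. A brute-force scan is adequate for a few thousand rows; a larger store would likely want an index.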

I believe this approach can make small models much more capable than their parameter count would suggest.

Although Anthropic has never publicly disclosed the exact size of Claude Sonnet, my analysis suggests it is likely a Mixture-of-Experts (MoE) model with tens of billions of active parameters and hundreds of billions of total parameters.

The active parameter count largely determines how much computation each generated token costs during inference, while the total parameter count bounds how much knowledge the model can store. My hypothesis is that a dense thinking model with only tens of billions of parameters can still deliver strong performance if paired with effective harness engineering, specialized tools, memory, and retrieval systems.
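To make the active-versus-total distinction concrete, here is a back-of-the-envelope sketch. It uses the common rule of thumb of roughly 2 FLOPs per active parameter per generated token, and counts weight memory as total parameters times bytes per parameter. The model sizes are hypothetical round numbers chosen for illustration, not measurements of any real model.

```python
def flops_per_token(active_params):
    """Rough forward-pass cost: about 2 FLOPs per active parameter."""
    return 2 * active_params


def weight_memory_gb(total_params, bytes_per_param):
    """Memory needed just to hold the weights, in GB (10^9 bytes)."""
    return total_params * bytes_per_param / 1e9


# Hypothetical sizes, for illustration only.
models = {
    "MoE (30B active, 300B total)": {"active": 30e9, "total": 300e9},
    "dense (30B)": {"active": 30e9, "total": 30e9},
}

for name, size in models.items():
    gflops = flops_per_token(size["active"]) / 1e9
    mem = weight_memory_gb(size["total"], bytes_per_param=2)  # 16-bit weights
    print(f"{name}: ~{gflops:.0f} GFLOPs/token, ~{mem:.0f} GB of weights")
```

Under these assumptions the two models cost about the same compute per token, but the MoE needs ten times the memory to hold its weights. That memory gap is what a harness with tools and retrieval would have to make up for on a local machine.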

If that hypothesis is correct, local models could satisfy the majority of everyday ChatGPT-style use cases without requiring expensive cloud inference.

As a first step, I'm releasing an experimental version of Bonsai.

Bonsai communicates directly with a local Google Chrome instance and provides a collection of browser-oriented tools that allow a local LLM to interact with the web in an agentic fashion. The default model is Google Gemma 4B, although Qwen models can also be used.

(One reason I chose Gemma as the default is that some government agencies and schools in Texas prohibit the use of Chinese open-source models.)

Download https://drive.google.com/drive/folders/1YUQ3tmcBSLEyBKLi5JdJ...

Screenshot https://i.imgur.com/9MacuXk.png

The left side shows the chat interface, while the right side displays the agent operating the browser in real time.

The harness includes many browser-specific tools, including JavaScript injection capabilities that allow the agent to locate page elements, inspect DOM structures, click buttons, fill forms, and perform other browser interactions.
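A JavaScript-injection tool of this kind could be sketched as below. The post does not say how Bonsai delivers scripts to Chrome, so this example assumes the Chrome DevTools Protocol and its `Runtime.evaluate` method purely for illustration. The helper names `js_fill_and_click` and `cdp_evaluate` are hypothetical, and the sketch only builds the message; it does not open a connection to a browser.

```python
import json


def js_fill_and_click(field_selector, value, button_selector):
    """Build a JavaScript snippet that fills one input and clicks one button."""
    # json.dumps produces valid JS string literals, so quotes are escaped.
    return (
        "(() => {"
        f" const field = document.querySelector({json.dumps(field_selector)});"
        f" const button = document.querySelector({json.dumps(button_selector)});"
        " if (!field || !button) return {ok: false};"
        f" field.value = {json.dumps(value)};"
        " field.dispatchEvent(new Event('input', {bubbles: true}));"
        " button.click();"
        " return {ok: true};"
        " })()"
    )


def cdp_evaluate(message_id, expression):
    """Wrap a script in a DevTools Protocol Runtime.evaluate request."""
    return json.dumps(
        {
            "id": message_id,
            "method": "Runtime.evaluate",
            "params": {"expression": expression, "returnByValue": True},
        }
    )


script = js_fill_and_click("input[name=q]", "local LLM harness", "button[type=submit]")
print(cdp_evaluate(1, script))
```

Returning a small `{ok: ...}` object gives the model a result it can read back, so a failed selector becomes feedback for the next step rather than a silent no-op.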

Current features include:

Browser integration

VectorDB-based semantic memory for small-context local models

Custom browser-oriented skills and tools

Local embedding + SQLite memory system

Agentic web navigation

WebRTC-based communication layer (lower-level than MCP)

The current release was compiled for Windows and requires NVIDIA CUDA.

I've also added an Apple Silicon (M-series) Mac version to the same download directory.

The default model is a 4B thinking model because agent workflows benefit significantly from high token throughput. On my test system (Windows 11 + RTX 4090), Bonsai reaches roughly 140 tokens/sec. On an M4 Mac using Metal, I see around 50 tokens/sec.
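A quick calculation shows why throughput matters for agent loops. The step count and tokens per step below are hypothetical; only the two throughput figures come from the numbers quoted above. The estimate covers generation time only and ignores prompt processing, tool calls, and page loads.

```python
def generation_seconds(tokens, tokens_per_sec):
    """Time spent generating `tokens` at a steady decode rate."""
    return tokens / tokens_per_sec


STEPS = 10             # hypothetical number of agent steps in one task
TOKENS_PER_STEP = 500  # hypothetical thinking + tool-call tokens per step

total_tokens = STEPS * TOKENS_PER_STEP
for label, rate in [("RTX 4090", 140), ("M4 Mac", 50)]:
    seconds = generation_seconds(total_tokens, rate)
    print(f"{label}: {seconds:.0f} s for {total_tokens} tokens")
```

With these assumed numbers, the same task takes about 36 seconds of generation at 140 tokens/sec and 100 seconds at 50 tokens/sec, and the gap grows with every extra step the agent takes.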

I'm curious whether others think specialized harness engineering can make small local models practical for everyday AI workflows, rather than relying exclusively on increasingly large cloud-hosted models.
