Show HN: Vibe coded an AI chat app with features I wanted, Poe

1•SamInTheShell•2mo ago

It supports local inference with Ollama and LM Studio (going to add other provider support in the future).

The big thing for me in this project was really telegraphing where the working directory is.

Comments

renevanpelt•2mo ago

Very cool, always great to build projects because you need them yourself.

From the screenshots it seems that there's a "tool" in the list of tools provided to the LLM for command line utilities like `rm`, `mkdir`, `ls` and so forth.

Just a small piece of advice you might want to look into further: you can also expose the command line as a single tool, and most LLMs will be able to provide pretty well-formatted commands. You could still filter out invalid or non-allowed commands within the tool that's actually being called by the LLM.
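A minimal sketch of that filtering step, assuming a single command-line tool that receives the whole command as one string. The allowlist contents, the rejected characters, and the name `check_command` are hypothetical and chosen only for illustration; they are not taken from the app.

```python
import shlex

# Hypothetical allowlist; the thread only mentions rm, mkdir and ls as examples.
ALLOWED_COMMANDS = {"ls", "mkdir", "cat"}
# Shell metacharacters rejected so one call cannot chain a second command.
FORBIDDEN_CHARS = set(";&|`$<>\n")


def check_command(command: str) -> tuple[bool, str]:
    """Return (allowed, reason) for a command string proposed by the model."""
    if any(ch in FORBIDDEN_CHARS for ch in command):
        return False, "shell metacharacters are not allowed"
    try:
        argv = shlex.split(command)
    except ValueError as exc:  # raised for input such as unbalanced quotes
        return False, f"could not parse command: {exc}"
    if not argv:
        return False, "empty command"
    if argv[0] not in ALLOWED_COMMANDS:
        return False, f"'{argv[0]}' is not on the allowlist"
    return True, "ok"
```

The tool would call `check_command` before executing anything and return the reason to the model on rejection. A name-based allowlist like this is a coarse filter, not a sandbox: an allowed program can still be given harmful arguments.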

Just wanted to share that!

SamInTheShell•2mo ago

I've seen that before. Idk how I feel about that pattern in general. When I see any of the tools I use do stuff like `bash(cmd... I didn't ask your permissions - hehe!~)`, I get a bit pissed that it wasn't a straight up standalone tool. The number of times it's gaslit me into panicking isn't zero.

mutant•2mo ago

would be interested in reading your outlook on the execution model, isolation, guardrails, tool calling. those are some of the baseline things i evaluate before trying an agentic env

SamInTheShell•2mo ago

Isolation is a smart idea for these things, as you literally can't verify their behavior beyond "it's kinda doing the thing most of the time". My chat app just kinda settles for "you can require permissions and audit", which isn't a silver bullet when the AI can churn out more code than a human can read in a reasonable amount of time.
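A rough sketch of what "require permissions and audit" could look like, assuming each tool is a plain Python callable. The function name `run_with_permission`, the `ask` callback, and the log record fields are all hypothetical; this is not the app's actual code.

```python
import json
import time
from typing import Callable


def run_with_permission(
    tool_name: str,
    args: dict,
    tool: Callable[..., str],
    ask: Callable[[str], bool],
    audit_log: list,
) -> str:
    """Ask before running a tool and record the decision either way."""
    prompt = f"Allow {tool_name} with {json.dumps(args, sort_keys=True)}?"
    approved = ask(prompt)
    # Denied calls are logged too, so the audit trail shows what was attempted.
    audit_log.append(
        {"time": time.time(), "tool": tool_name, "args": args, "approved": approved}
    )
    if not approved:
        return "denied by user"
    return tool(**args)
```

In an interactive app, `ask` would be whatever shows the confirmation prompt. The limitation raised above still applies: the log records what ran, but someone has to read it.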

mutant•2mo ago

might be good to get your hands around this early.

reading up on how crush, goose, and opencode handle this may be a good idea.

i've been trying to build a web native terminal assistant for a while (just a side project) and this is easily the thing that keeps me up at night.

### Primary Sources:

- *Anthropic Engineering Blog: "Making Claude Code more secure and autonomous with sandboxing"* Detailed article on Claude Code's sandboxing features, including OS-level primitives (e.g., Linux Bubblewrap, macOS Seatbelt) for filesystem and network isolation. [Read here](https://www.anthropic.com/engineering/claude-code-sandboxing) (Published Oct 20, 2025).

- *Claude Code Documentation: Sandboxing* Official docs covering setup, configuration, security benefits (e.g., prompt injection protection), and limitations of filesystem/network isolation in Claude Code. [Read here](https://code.claude.com/docs/en/sandboxing).

- *Claude Blog: "Beyond permission prompts: making Claude Code more secure and autonomous"* Overview of sandboxing in Claude Code, emphasizing boundaries for safer agent execution. [Read here](https://claude.com/blog/beyond-permission-prompts-making-cla...) (Published Oct 31, 2025).

### Additional Resources:

For broader context on sandboxing agentic AI:

- *arXiv Paper: "Securing AI Agent Execution"* Research on isolation techniques for AI agents, including risk assessment. [Read here](https://arxiv.org/abs/2510.21236) (Published Oct 24, 2025).

- *HopX Documentation* Practical guide to sandboxing for AI agents (e.g., using Firecracker micro-VMs). [Read here](https://hopx.ai/) (Open-source SDK available at [GitHub](https://github.com/hopx-ai/sdk)).

### Cursor

Cursor uses local-first editing with optional sandboxing via Docker containers for isolated execution (no default vendor-owned sandboxes). It respects user-defined rules without overriding them.

- *Skywork AI Blog: Security in Cursor 2.0* Details Cursor's sandboxing for code execution, network protection, and isolation. [Read here](https://skywork.ai/blog/vibecoding/cursor-2-0-security-priva...) (Published Nov 1, 2025).

- *Skywork AI Blog: Cursor 2.0 vs Claude Code SDK* Compares isolation techniques, noting Cursor's local sandboxes vs. Claude's cloud-based ones. [Read here](https://skywork.ai/blog/vibecoding/cursor-2-0-vs-claude-code...) (Published Nov 1, 2025).

### OpenAI Codex

Codex primarily relies on API-based execution with optional user-managed sandboxes (e.g., via Firecracker or custom proxies). It emphasizes provider retention policies but lacks built-in native sandboxing like Claude Code.

- *Render Blog: Testing AI Coding Agents (2025)* Benchmarks Codex's handling of isolation in production tasks, including Docker-based sandboxes. [Read here](https://render.com/blog/ai-coding-agents-benchmark) (Published Aug 12, 2025).

- *Medium: Claude Code vs Cursor* Indirect comparison noting Codex's API retention and sandbox limitations vs. Cursor/Claude. [Read here](https://open-data-analytics.medium.com/claude-code-vs-cursor...) (Published Aug 6, 2025).

### Goose AI (Codename Goose)

Goose uses container-based isolation via tools like Container Use (built on Dagger) for git-branch-isolated environments, emphasizing safe experimentation without affecting the host.

- *Goose Blog: Isolated Dev Environments* Explains Goose's container-use for sandboxes, including lifecycle management and rollback. [Read here](https://block.github.io/goose/blog/2025/06/19/isolated-devel...) (Published Jun 19, 2025).

- *GitHub Discussion: Goose vs Claude Code* Community analysis comparing Goose's local isolation to Claude Code's cloud sandboxes. [Read here](https://github.com/block/goose/discussions/3133) (Ongoing, started Jun 27, 2025).

- *Slashdot: Compare Claude vs. Goose* High-level comparison including deployment isolation. [Read here](https://slashdot.org/software/comparison/Claude-vs-codename-...).

also: check out the open-source sandbox runtime from Anthropic: [GitHub Repo](https://github.com/anthropic-experimental/sandbox-runtime).

clearly i have a bias on this topic, lol

SamInTheShell•2mo ago

I'm inclined to isolate the chat processes only if I keep the bash tool. I'm undecided if I'm keeping it in. Implementing an MCP server in python is dead easy and so is making a bash tool.
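To illustrate how small a bash tool can be, here is a minimal sketch that runs one command in an explicit working directory and reports that directory back with the result. The name `run_bash`, the returned fields, and the default timeout are assumptions for illustration. It is not an MCP server and does no isolation or filtering.

```python
import subprocess
from pathlib import Path


def run_bash(command: str, workdir: str, timeout: float = 10.0) -> dict:
    """Run one command with bash in workdir and report where it ran."""
    cwd = Path(workdir).resolve()
    completed = subprocess.run(
        ["bash", "-c", command],
        cwd=cwd,
        capture_output=True,
        text=True,
        timeout=timeout,  # raises subprocess.TimeoutExpired if exceeded
    )
    return {
        "cwd": str(cwd),  # echoed back so the caller always sees the directory
        "exit_code": completed.returncode,
        "stdout": completed.stdout,
        "stderr": completed.stderr,
    }
```

Returning `cwd` with every result is one way to keep the working directory visible to both the user and the model.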

Right now I'm more interested in getting ACP working for gemini-cli and claude-code to serve it their models. My first goal is just to make the manually operated tool that just gets out of the way or whatever. Sane permissions out of the box.

If anyone wants to just come along and add an optional feature today, I'm happy to merge under the same license. Otherwise, I will eventually add this feature, I'm just not sure if it will be sooner or later.
