frontpage.
newsnewestaskshowjobs

Made with ♥ by @iamnishanth

Open Source @Github

fp.

Show HN: One-click AI employee with its own cloud desktop

https://cloudbot-ai.com
1•fainir•1m ago•0 comments

Show HN: Poddley – Search podcasts by who's speaking

https://poddley.com
1•onesandofgrain•2m ago•0 comments

Same Surface, Different Weight

https://www.robpanico.com/articles/display/?entry_short=same-surface-different-weight
1•retrocog•4m ago•0 comments

The Rise of Spec Driven Development

https://www.dbreunig.com/2026/02/06/the-rise-of-spec-driven-development.html
2•Brajeshwar•9m ago•0 comments

The first good Raspberry Pi Laptop

https://www.jeffgeerling.com/blog/2026/the-first-good-raspberry-pi-laptop/
2•Brajeshwar•9m ago•0 comments

Seas to Rise Around the World – But Not in Greenland

https://e360.yale.edu/digest/greenland-sea-levels-fall
1•Brajeshwar•9m ago•0 comments

Will Future Generations Think We're Gross?

https://chillphysicsenjoyer.substack.com/p/will-future-generations-think-were
1•crescit_eundo•12m ago•0 comments

State Department will delete Xitter posts from before Trump returned to office

https://www.npr.org/2026/02/07/nx-s1-5704785/state-department-trump-posts-x
2•righthand•15m ago•0 comments

Show HN: Verifiable server roundtrip demo for a decision interruption system

https://github.com/veeduzyl-hue/decision-assistant-roundtrip-demo
1•veeduzyl•16m ago•0 comments

Impl Rust – Avro IDL Tool in Rust via Antlr

https://www.youtube.com/watch?v=vmKvw73V394
1•todsacerdoti•16m ago•0 comments

Stories from 25 Years of Software Development

https://susam.net/twenty-five-years-of-computing.html
2•vinhnx•17m ago•0 comments

minikeyvalue

https://github.com/commaai/minikeyvalue/tree/prod
3•tosh•22m ago•0 comments

Neomacs: GPU-accelerated Emacs with inline video, WebKit, and terminal via wgpu

https://github.com/eval-exec/neomacs
1•evalexec•26m ago•0 comments

Show HN: Moli P2P – An ephemeral, serverless image gallery (Rust and WebRTC)

https://moli-green.is/
2•ShinyaKoyano•31m ago•1 comments

How I grow my X presence?

https://www.reddit.com/r/GrowthHacking/s/UEc8pAl61b
2•m00dy•32m ago•0 comments

What's the cost of the most expensive Super Bowl ad slot?

https://ballparkguess.com/?id=5b98b1d3-5887-47b9-8a92-43be2ced674b
1•bkls•33m ago•0 comments

What if you just did a startup instead?

https://alexaraki.substack.com/p/what-if-you-just-did-a-startup
5•okaywriting•40m ago•0 comments

Hacking up your own shell completion (2020)

https://www.feltrac.co/environment/2020/01/18/build-your-own-shell-completion.html
2•todsacerdoti•42m ago•0 comments

Show HN: Gorse 0.5 – Open-source recommender system with visual workflow editor

https://github.com/gorse-io/gorse
1•zhenghaoz•43m ago•0 comments

GLM-OCR: Accurate × Fast × Comprehensive

https://github.com/zai-org/GLM-OCR
1•ms7892•44m ago•0 comments

Local Agent Bench: Test 11 small LLMs on tool-calling judgment, on CPU, no GPU

https://github.com/MikeVeerman/tool-calling-benchmark
1•MikeVeerman•45m ago•0 comments

Show HN: AboutMyProject – A public log for developer proof-of-work

https://aboutmyproject.com/
1•Raiplus•45m ago•0 comments

Expertise, AI and Work of Future [video]

https://www.youtube.com/watch?v=wsxWl9iT1XU
1•indiantinker•46m ago•0 comments

So Long to Cheap Books You Could Fit in Your Pocket

https://www.nytimes.com/2026/02/06/books/mass-market-paperback-books.html
4•pseudolus•46m ago•2 comments

PID Controller

https://en.wikipedia.org/wiki/Proportional%E2%80%93integral%E2%80%93derivative_controller
1•tosh•50m ago•0 comments

SpaceX Rocket Generates 100GW of Power, or 20% of US Electricity

https://twitter.com/AlecStapp/status/2019932764515234159
2•bkls•50m ago•0 comments

Kubernetes MCP Server

https://github.com/yindia/rootcause
1•yindia•51m ago•0 comments

I Built a Movie Recommendation Agent to Solve Movie Nights with My Wife

https://rokn.io/posts/building-movie-recommendation-agent
4•roknovosel•51m ago•0 comments

What were the first animals? The fierce sponge–jelly battle that just won't end

https://www.nature.com/articles/d41586-026-00238-z
2•beardyw•1h ago•0 comments

Sidestepping Evaluation Awareness and Anticipating Misalignment

https://alignment.openai.com/prod-evals/
1•taubek•1h ago•0 comments
Open in hackernews

Show HN: Vibe coded an AI chat app with features I wanted, Poe

https://github.com/SamInTheShell/poe
1•SamInTheShell•2mo ago
It supports local inference with Ollama and LM Studio (going to add other provider support in the future).

The big thing for me in this project was really telegraphing where the working directory is.

Comments

renevanpelt•2mo ago
Very cool, always great to build projects because you need them yourself.

From the screenshots it seems that there's a "tool" in the list of tools provided to the LLM for command line utilities like `rm`, `mkdir`, `ls` and so forth.

Just a small piece of advice: you might want to look into further. You can also expose the command line as a single tool, and most LLMs will be able to provide pretty good formatted commands. You could still filter invalid or allowed and non-allowed commands out within the tool that's actually being called by the LLM.

Just wanted to share that!

SamInTheShell•2mo ago
I've seen that before. Idk how I feel about that pattern in general. When I see any of the tools I use do stuff like `bash(cmd... I didn't ask your permissions - hehe!~)`, I get a bit pissed that it wasn't a straight up standalone tool. The number of times it's gas lit me into panicing isn't zero.
mutant•2mo ago
would be interested in reading your outlook on the execution model, isolation, guardrails, tool calling. those are some of the baseline things i evaluate before try before trying an agentic env
SamInTheShell•2mo ago
Isolation is a smart idea for these things, as you literally can't verify their behavior beyond "it's kinda doing the thing most of the time". My chat app just kinda settles for "you can require permissions and audit", which it's a fool proof bullet, when the AI can churn out more code than a human can read in a reasonable amount of time.
mutant•2mo ago
might be good to get your hands around this early.

reading up on how crush, goose, and opencode handle this may be a good idea.

i've been trying to build a web native terminal assistant for a while (just a side project) and this is easily the thing that keeps me up at night.

### Primary Sources: - *Anthropic Engineering Blog: "Making Claude Code more secure and autonomous with sandboxing"* Detailed article on Claude Code's sandboxing features, including OS-level primitives (e.g., Linux Bubblewrap, macOS Seatbelt) for filesystem and network isolation. [Read here](https://www.anthropic.com/engineering/claude-code-sandboxing) (Published Oct 20, 2025).

- *Claude Code Documentation: Sandboxing* Official docs covering setup, configuration, security benefits (e.g., prompt injection protection), and limitations of filesystem/network isolation in Claude Code. [Read here](https://code.claude.com/docs/en/sandboxing).

- *Claude Blog: "Beyond permission prompts: making Claude Code more secure and autonomous"* Overview of sandboxing in Claude Code, emphasizing boundaries for safer agent execution. [Read here](https://claude.com/blog/beyond-permission-prompts-making-cla...) (Published Oct 31, 2025).

### Additional Resources: For broader context on sandboxing agentic AI: - *arXiv Paper: "Securing AI Agent Execution"* Research on isolation techniques for AI agents, including risk assessment. [Read here](https://arxiv.org/abs/2510.21236) (Published Oct 24, 2025). - *HopX Documentation* Practical guide to sandboxing for AI agents (e.g., using Firecracker micro-VMs). [Read here](https://hopx.ai/) (Open-source SDK available at [GitHub](https://github.com/hopx-ai/sdk)).

### Cursor Cursor uses local-first editing with optional sandboxing via Docker containers for isolated execution (no default vendor-owned sandboxes). It respects user-defined rules without overriding them.

- *Skywork AI Blog: Security in Cursor 2.0* Details Cursor's sandboxing for code execution, network protection, and isolation. [Read here](https://skywork.ai/blog/vibecoding/cursor-2-0-security-priva...) (Published Nov 1, 2025).

- *Skywork AI Blog: Cursor 2.0 vs Claude Code SDK* Compares isolation techniques, noting Cursor's local sandboxes vs. Claude's cloud-based ones. [Read here](https://skywork.ai/blog/vibecoding/cursor-2-0-vs-claude-code...) (Published Nov 1, 2025).

### OpenAI Codex Codex primarily relies on API-based execution with optional user-managed sandboxes (e.g., via Firecracker or custom proxies). It emphasizes provider retention policies but lacks built-in native sandboxing like Claude Code.

- *Render Blog: Testing AI Coding Agents (2025)* Benchmarks Codex's handling of isolation in production tasks, including Docker-based sandboxes. [Read here](https://render.com/blog/ai-coding-agents-benchmark) (Published Aug 12, 2025).

- *Medium: Claude Code vs Cursor* Indirect comparison noting Codex's API retention and sandbox limitations vs. Cursor/Claude. [Read here](https://open-data-analytics.medium.com/claude-code-vs-cursor...) (Published Aug 6, 2025).

### Goose AI (Codename Goose) Goose uses container-based isolation via tools like Container Use (built on Dagger) for git-branch-isolated environments, emphasizing safe experimentation without affecting the host.

- *Goose Blog: Isolated Dev Environments* Explains Goose's container-use for sandboxes, including lifecycle management and rollback. [Read here](https://block.github.io/goose/blog/2025/06/19/isolated-devel...) (Published Jun 19, 2025).

- *GitHub Discussion: Goose vs Claude Code* Community analysis comparing Goose's local isolation to Claude Code's cloud sandboxes. [Read here](https://github.com/block/goose/discussions/3133) (Ongoing, started Jun 27, 2025).

- *Slashdot: Compare Claude vs. Goose* High-level comparison including deployment isolation. [Read here](https://slashdot.org/software/comparison/Claude-vs-codename-...).

also: check out the open-source sandbox runtime from Anthropic: [GitHub Repo](https://github.com/anthropic-experimental/sandbox-runtime).

clearly i have a bias on this topic, lol

SamInTheShell•2mo ago
I'm inclined to isolate the chat processes only if I keep the bash tool. I'm undecided if I'm keeping it in. Implementing an MCP server in python is dead easy and so is making a bash tool.

Right now I'm more interested in getting ACP working for gemini-cli and claude-code to serve it their models. My first goal is just to make the manually operated tool that just gets out of the way or whatever. Sane permissions out of the box.

If anyone wants to just come along and add an optional feature today, I'm happy to merge under the same license. Otherewise, I will eventually add this feature, I'm just not sure if it will be sooner or later.