Show HN: Managed MCP Sandbox Environments for RL Training on Tool Use

3•wirehack•8h ago
Hi HN! We are Klavis AI (https://www.klavis.ai/) and we are launching a managed MCP Sandbox-as-a-Service for RL training on tool use.

If you want a model to learn tool use through RL, you need realistic environments where the model can take actions and you can observe the resulting state and compute a reward. For SaaS tools, this means managing dozens of test accounts, handling OAuth and token refresh, seeding realistic data for each episode, resetting state between runs, and ensuring isolation when you're running concurrent training sessions. Most research teams spend months building this plumbing per integration.

Klavis is a managed sandbox service that handles all of that. You call our API to get an isolated sandbox backed by a real service instance (not a mock), initialize it with whatever data state you need, let your model interact via MCP, then dump the final state to compute your reward. One more API call resets everything for the next episode.
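To make "dump the final state to compute your reward" concrete, here is a minimal sketch of one possible reward function. It assumes the dumped state arrives as nested JSON-like dicts; the function names and the field names in the example are hypothetical and not part of our API. It scores an episode by the fraction of target fields the final state reproduces.

```python
from typing import Any


def flatten(state: dict[str, Any], prefix: str = "") -> dict[str, Any]:
    """Flatten nested dicts into dotted-path keys so leaf values can be compared."""
    flat: dict[str, Any] = {}
    for key, value in state.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def state_reward(final_state: dict[str, Any], target_state: dict[str, Any]) -> float:
    """Return the fraction of target leaf fields matched exactly by the final state."""
    target = flatten(target_state)
    if not target:
        return 1.0  # nothing was required, so the episode trivially succeeds
    final = flatten(final_state)
    matched = sum(
        1 for path, want in target.items() if path in final and final[path] == want
    )
    return matched / len(target)


# Hypothetical dumped state and target; these field names are made up.
final = {"event": {"title": "Sync", "attendees": 3, "room": "A"}}
target = {"event": {"title": "Sync", "attendees": 4}}
print(state_reward(final, target))  # 0.5: title matches, attendees does not
```

Extra fields in the final state are ignored here. A stricter reward could penalize unintended side effects by also checking fields the target does not mention.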

The key thing is these are real services, not static mocks. When your model creates a calendar event or updates a Salesforce record, that action actually executes against real infrastructure. The state changes are real. This matters because you want training to reflect production behavior exactly.

We currently support 50+ integrations across productivity tools (Google Calendar, Outlook, Slack), CRM (Salesforce, HubSpot), dev tools (GitHub, Jira, Linear), databases (Postgres, Snowflake), and others. We handle the account pooling, auth management, and lifecycle orchestration so researchers can focus on the actual training.

Technically, the workflow is: create a sandbox, call the initialize API with a JSON payload defining your starting state, let the model interact via standard MCP tools, call the dump API to get a typed snapshot of the final state, compare it against your target to calculate the reward, then call reset or delete. We use strict Pydantic schemas for all inputs and outputs, so malformed data gets rejected immediately rather than causing silent failures mid-training.
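The episode loop above can be sketched roughly as follows. This is not our client library: `FakeSandbox` is an in-memory stand-in with hypothetical method names, and a standard-library dataclass stands in for the Pydantic schemas so the example runs without dependencies. It only illustrates the order of operations and the fail-fast validation.

```python
from dataclasses import asdict, dataclass, field
from typing import Callable


@dataclass(frozen=True)
class CalendarEvent:
    """Strictly validated record; a malformed payload fails at construction time."""
    title: str
    duration_minutes: int

    def __post_init__(self) -> None:
        if not isinstance(self.title, str) or not self.title:
            raise ValueError("title must be a non-empty string")
        if isinstance(self.duration_minutes, bool) or not isinstance(self.duration_minutes, int):
            raise ValueError("duration_minutes must be an integer")
        if self.duration_minutes <= 0:
            raise ValueError("duration_minutes must be positive")


@dataclass
class FakeSandbox:
    """In-memory stand-in for a remote sandbox (hypothetical interface)."""
    events: list[CalendarEvent] = field(default_factory=list)

    def initialize(self, payload: list[dict]) -> None:
        # Validate the whole payload before replacing any state.
        self.events = [CalendarEvent(**item) for item in payload]

    def create_event(self, title: str, duration_minutes: int) -> None:
        # Stands in for a tool call the model would make over MCP.
        self.events.append(CalendarEvent(title, duration_minutes))

    def dump(self) -> list[dict]:
        return [asdict(event) for event in self.events]

    def reset(self) -> None:
        self.events = []


def run_episode(
    sandbox: FakeSandbox,
    seed: list[dict],
    policy: Callable[[FakeSandbox], None],
    target: list[dict],
) -> float:
    """Initialize, let the policy act, score the dumped state, and always reset."""
    sandbox.initialize(seed)
    try:
        policy(sandbox)
        return 1.0 if sandbox.dump() == target else 0.0
    finally:
        sandbox.reset()


seed = [{"title": "Standup", "duration_minutes": 15}]
target = seed + [{"title": "Retro", "duration_minutes": 60}]
sandbox = FakeSandbox()
reward = run_episode(sandbox, seed, lambda sb: sb.create_event("Retro", 60), target)
print(reward)
```

Putting the reset in a `finally` block means a crashed policy or a failed comparison cannot leak state into the next episode.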

Here is a quick demo: https://youtu.be/10C18rpCYcA.

We look forward to your comments. Thanks for reading!
