Show HN: Managed MCP Sandbox Environments for RL Training on Tool Use

3•wirehack•1mo ago
Hi HN! We are Klavis AI (https://www.klavis.ai/) and we are launching a managed MCP Sandbox-as-a-Service for RL training on tool use.

If you want a model to learn tool use through RL, you need realistic environments where the model can take actions and you can observe the resulting state and compute a reward. For SaaS tools, this means managing dozens of test accounts, handling OAuth and token refresh, seeding realistic data for each episode, resetting state between runs, and ensuring isolation when you're running concurrent training sessions. Most research teams spend months building this plumbing per integration.

Klavis is a managed sandbox service that handles all of that. You call our API to get an isolated sandbox backed by a real service instance (not a mock), initialize it with whatever data state you need, let your model interact via MCP, then dump the final state to compute your reward. One more API call resets everything for the next episode.
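To make the episode lifecycle concrete, here is a minimal sketch in Python. It does not use the Klavis API: `InMemorySandbox`, `run_episode`, the `create_event` tool, and the calendar data are all hypothetical stand-ins that mirror the initialize, interact, dump, and reset steps described above with an in-memory dictionary instead of a real service.

```python
import copy
from typing import Any, Callable

State = dict[str, list[dict[str, Any]]]


class InMemorySandbox:
    """Hypothetical stand-in for a managed sandbox; not the Klavis client."""

    def __init__(self) -> None:
        self._state: State = {}

    def initialize(self, seed: State) -> None:
        # Deep-copy so the caller's seed is never mutated by the episode.
        self._state = copy.deepcopy(seed)

    def call_tool(self, name: str, args: dict[str, Any]) -> None:
        # A single toy tool; a real sandbox would expose tools over MCP.
        if name != "create_event":
            raise ValueError(f"unknown tool: {name}")
        self._state.setdefault("events", []).append(dict(args))

    def dump(self) -> State:
        # Return a copy so later resets do not change the snapshot.
        return copy.deepcopy(self._state)

    def reset(self) -> None:
        self._state = {}


Policy = Callable[[InMemorySandbox], None]


def run_episode(sandbox: InMemorySandbox, seed: State, target: State, policy: Policy) -> float:
    """Seed the sandbox, let the policy act, dump the state, and score it."""
    sandbox.initialize(seed)
    try:
        policy(sandbox)
        final_state = sandbox.dump()
    finally:
        sandbox.reset()  # always leave a clean sandbox for the next episode
    return 1.0 if final_state == target else 0.0


def scripted_policy(sandbox: InMemorySandbox) -> None:
    # Stands in for the model; it issues one fixed tool call.
    sandbox.call_tool("create_event", {"title": "Standup", "start": "2026-02-09T09:00"})


seed: State = {"events": []}
target: State = {"events": [{"title": "Standup", "start": "2026-02-09T09:00"}]}
sandbox = InMemorySandbox()
reward = run_episode(sandbox, seed, target, scripted_policy)
print(reward)
```

With the scripted policy the final state matches the target, so the episode scores 1.0. The reset sits in a `finally` block so that a failed tool call cannot leak state into the next episode.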

The key thing is these are real services, not static mocks. When your model creates a calendar event or updates a Salesforce record, that action actually executes against real infrastructure. The state changes are real. This matters because you want training to reflect production behavior exactly.

We currently support 50+ integrations across productivity tools (Google Calendar, Outlook, Slack), CRM (Salesforce, HubSpot), dev tools (GitHub, Jira, Linear), databases (Postgres, Snowflake), and others. We handle the account pooling, auth management, and lifecycle orchestration so researchers can focus on the actual training.

Technically, the workflow is: create a sandbox, call the initialize API with a JSON payload defining your starting state, let the model interact via standard MCP tools, call the dump API to get a typed snapshot of the final state, compare it against your target to calculate the reward, then call reset or delete. We use strict Pydantic schemas for all inputs and outputs, so malformed data gets rejected immediately rather than causing silent failures mid-training.
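As a rough illustration of the validate-then-compare step, the sketch below parses a dumped snapshot strictly and scores it against a target. Everything in it is an assumption made for illustration: the post says Klavis uses Pydantic, while this version uses standard-library dataclasses with manual checks so it runs without dependencies, and `CalendarEvent`, `parse_snapshot`, and the Jaccard-style `reward` are hypothetical, not the service's schema or scoring method.

```python
from dataclasses import dataclass, fields
from typing import Any


@dataclass(frozen=True)
class CalendarEvent:
    """Hypothetical snapshot record; not a Klavis schema."""

    title: str
    start: str  # ISO 8601 timestamp, kept as a string in this sketch

    @classmethod
    def parse(cls, raw: Any) -> "CalendarEvent":
        # Reject anything that is not exactly the expected shape.
        if not isinstance(raw, dict):
            raise ValueError("event must be a JSON object")
        expected = {f.name for f in fields(cls)}
        if set(raw) != expected:
            raise ValueError(f"expected keys {sorted(expected)}, got {sorted(raw)}")
        for name in expected:
            if not isinstance(raw[name], str):
                raise ValueError(f"{name} must be a string")
        return cls(**raw)


def parse_snapshot(raw: dict[str, Any]) -> frozenset[CalendarEvent]:
    """Validate a dumped state; identical events collapse into one set member."""
    if set(raw) != {"events"} or not isinstance(raw["events"], list):
        raise ValueError("snapshot must contain exactly one key, 'events', holding a list")
    return frozenset(CalendarEvent.parse(e) for e in raw["events"])


def reward(final: frozenset[CalendarEvent], target: frozenset[CalendarEvent]) -> float:
    """Jaccard overlap: 1.0 for an exact match, lower for missing or extra events."""
    if not final and not target:
        return 1.0
    return len(final & target) / len(final | target)


target = parse_snapshot({"events": [
    {"title": "Standup", "start": "2026-02-09T09:00"},
    {"title": "Retro", "start": "2026-02-09T15:00"},
]})
final = parse_snapshot({"events": [
    {"title": "Standup", "start": "2026-02-09T09:00"},
]})
score = reward(final, target)
print(score)
```

Here the model created one of the two target events, so the score is 0.5. A snapshot with a missing or mistyped field raises `ValueError` at parse time, which is the fail-fast behavior the paragraph above describes.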

Here is a quick demo: https://youtu.be/10C18rpCYcA.

We look forward to your comments. Thanks for reading!
