Show HN: Managed MCP Sandbox Environments for RL Training on Tool Use

3•wirehack•1mo ago
Hi HN! We are Klavis AI (https://www.klavis.ai/) and we are launching a managed MCP Sandbox-as-a-Service for RL training on tool use.

If you want a model to learn tool use through RL, you need realistic environments where the model can take actions and you can observe the resulting state and compute a reward. For SaaS tools, this means managing dozens of test accounts, handling OAuth and token refresh, seeding realistic data for each episode, resetting state between runs, and ensuring isolation when you're running concurrent training sessions. Most research teams spend months building this plumbing per integration.

Klavis is a managed sandbox service that handles all of that. You call our API to get an isolated sandbox backed by a real service instance (not a mock), initialize it with whatever data state you need, let your model interact via MCP, then dump the final state to compute your reward. One more API call resets everything for the next episode.
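To make "dump the final state to compute your reward" concrete, here is a minimal sketch of one possible reward function. It is not our API: the function name `state_reward`, the flat dictionary shape of the state, and the partial-credit scoring are all assumptions chosen for illustration.

```python
from typing import Any, Mapping


def state_reward(final_state: Mapping[str, Any], target_state: Mapping[str, Any]) -> float:
    """Return the fraction of target fields that the dumped state matches exactly."""
    if not target_state:
        # Nothing was required, so any final state counts as a success.
        return 1.0
    matched = sum(
        1
        for key, expected in target_state.items()
        if key in final_state and final_state[key] == expected
    )
    return matched / len(target_state)


# Hypothetical dumped state after the model tried to schedule a meeting.
final = {
    "title": "Design review",
    "start": "2026-03-02T10:00",
    "attendees": ["ana", "bo"],
}
# Hypothetical target: the model forgot one attendee.
target = {
    "title": "Design review",
    "start": "2026-03-02T10:00",
    "attendees": ["ana", "bo", "cy"],
}

print(state_reward(final, target))
```

Partial credit like this gives a denser signal than a strict pass/fail check. A binary reward (1.0 only on an exact match) is the stricter alternative.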

The key thing is these are real services, not static mocks. When your model creates a calendar event or updates a Salesforce record, that action actually executes against real infrastructure. The state changes are real. This matters because you want training to reflect production behavior exactly.

We currently support 50+ integrations across productivity tools (Google Calendar, Outlook, Slack), CRM (Salesforce, HubSpot), dev tools (GitHub, Jira, Linear), databases (Postgres, Snowflake), and others. We handle the account pooling, auth management, and lifecycle orchestration so researchers can focus on the actual training.

Technically, the workflow is: create a sandbox, call the initialize API with a JSON payload defining your starting state, let the model interact via standard MCP tools, call the dump API to get a typed snapshot of the final state, compare it against your target to calculate the reward, then call reset or delete. We use strict Pydantic schemas for all inputs and outputs, so malformed data gets rejected immediately rather than causing silent failures mid-training.
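The workflow above can be sketched as one episode loop. This is an illustration under stated assumptions, not the Klavis client: `FakeSandbox`, `CalendarEvent`, `run_episode`, and every method name are hypothetical, the sandbox lives in memory instead of calling a remote service, and a standard-library dataclass stands in for the strict Pydantic schemas.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass(frozen=True)
class CalendarEvent:
    """Hypothetical strict schema (a stdlib stand-in for a Pydantic model)."""
    title: str
    start: str

    def __post_init__(self) -> None:
        # Reject malformed data up front instead of failing mid-episode.
        for name in ("title", "start"):
            value = getattr(self, name)
            if not isinstance(value, str) or not value:
                raise ValueError(f"{name} must be a non-empty string, got {value!r}")


@dataclass
class FakeSandbox:
    """Hypothetical in-memory sandbox; a real one would wrap remote API calls."""
    events: list = field(default_factory=list)

    def initialize(self, payload: list) -> None:
        # Validate the whole payload before replacing any existing state.
        self.events = [CalendarEvent(**item) for item in payload]

    def create_event(self, title: str, start: str) -> None:
        # Plays the role of an MCP tool that the model would call.
        self.events.append(CalendarEvent(title=title, start=start))

    def dump(self) -> list:
        return list(self.events)

    def reset(self) -> None:
        self.events = []


def run_episode(
    sandbox: FakeSandbox,
    seed: list,
    policy: Callable[[FakeSandbox], None],
    target: list,
) -> float:
    """Initialize, let the policy act, dump, score, and always reset."""
    try:
        sandbox.initialize(seed)
        policy(sandbox)
        final_state = sandbox.dump()
        return 1.0 if final_state == target else 0.0
    finally:
        sandbox.reset()


def scripted_policy(sandbox: FakeSandbox) -> None:
    # Stands in for the model choosing a tool call.
    sandbox.create_event("Design review", "2026-03-02T10:00")


seed = [{"title": "Standup", "start": "2026-03-02T09:00"}]
target = [
    CalendarEvent("Standup", "2026-03-02T09:00"),
    CalendarEvent("Design review", "2026-03-02T10:00"),
]

sandbox = FakeSandbox()
reward = run_episode(sandbox, seed, scripted_policy, target)
print(reward)
```

Putting the reset in a `finally` block means a crashed policy or a rejected payload cannot leak state into the next episode, which is the isolation property that matters when episodes run back to back.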

Here is a quick demo: https://youtu.be/10C18rpCYcA.

We look forward to your comments. Thanks for reading!
