Show HN: Gambit, an open-source agent harness for building reliable AI agents

https://github.com/bolt-foundry/gambit
28•randall•1h ago
Hey HN!

Wanted to show our open source agent harness called Gambit.

If you’re not familiar, agent harnesses are sort of like an operating system for an agent: they handle tool calling, planning, and context window management, so they don’t require as much developer orchestration.

Normally you might see an agent orchestration framework pipeline like:

compute -> compute -> compute -> LLM -> compute -> compute -> LLM

We invert this, so with an agent harness it’s more like:

LLM -> LLM -> LLM -> compute -> LLM -> LLM -> compute -> LLM
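
To make the inversion concrete, here is a minimal TypeScript sketch of a harness-style loop. It is an illustration under assumptions, not Gambit's actual API: `runHarness`, `ModelStep`, and the stub model are hypothetical names, and the stub stands in for a real LLM call so the example runs offline.

```typescript
// Hypothetical sketch, not Gambit's API: the model drives control flow,
// and plain compute (a tool) runs only when the model asks for it.

type ModelStep =
  | { kind: "tool"; name: string; input: string }
  | { kind: "final"; text: string };

type Model = (transcript: string[]) => ModelStep;
type Tool = (input: string) => string;

function runHarness(
  model: Model,
  tools: Record<string, Tool>,
  userMessage: string,
  maxSteps = 8,
): { answer: string; trace: string[] } {
  const transcript: string[] = [`user: ${userMessage}`];
  const trace: string[] = [];
  for (let i = 0; i < maxSteps; i++) {
    const step = model(transcript); // every iteration starts with the model
    trace.push("LLM");
    if (step.kind === "final") {
      return { answer: step.text, trace };
    }
    const tool = tools[step.name];
    const result = tool ? tool(step.input) : `error: unknown tool ${step.name}`;
    trace.push("compute");
    transcript.push(`tool ${step.name}: ${result}`);
  }
  throw new Error("harness exceeded maxSteps without a final answer");
}

// Stub model standing in for a real LLM: requests one tool call, then answers.
const stubModel: Model = (transcript) => {
  const last = transcript[transcript.length - 1] ?? "";
  if (last.startsWith("tool ")) {
    const value = last.split(": ")[1] ?? "";
    return { kind: "final", text: `The result is ${value}` };
  }
  return { kind: "tool", name: "add", input: "2,3" };
};

const tools: Record<string, Tool> = {
  add: (input) =>
    String(input.split(",").map(Number).reduce((a, b) => a + b, 0)),
};

const run = runHarness(stubModel, tools, "What is 2 + 3?");
console.log(run.trace.join(" -> "), "|", run.answer);
```

In an orchestration pipeline the developer's code would decide when the model gets called; here the only developer-owned control flow is the loop and the step limit.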

Essentially, you describe each agent either in a self-contained markdown file or as a TypeScript program. Your root agent can bring in other agents as needed, and we create a type-safe way for you to define the interfaces between those agents. We call these agent definitions decks.

Agents can call agents, and each agent can be designed with whatever model params make sense for your task.
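
As a rough picture of what typed interfaces between agents could look like, here is a small TypeScript sketch. Everything in it is an assumption for illustration: the `Deck` interface, `callDeck`, the `modelParams` shape, and the model names are hypothetical and are not Gambit's deck format. The `run` functions are plain stand-ins for LLM calls.

```typescript
// Hypothetical names throughout; this is not Gambit's deck format.
interface Deck<I, O> {
  name: string;
  modelParams: { model: string; temperature: number };
  parseInput: (raw: unknown) => I; // throws on malformed input
  run: (input: I) => O; // stand-in for an LLM call
}

// Validate at the boundary, then run the deck with a typed input.
function callDeck<I, O>(deck: Deck<I, O>, raw: unknown): O {
  return deck.run(deck.parseInput(raw));
}

function isRecord(raw: unknown): raw is Record<string, unknown> {
  return typeof raw === "object" && raw !== null;
}

const summarizer: Deck<{ text: string }, { summary: string }> = {
  name: "summarizer",
  modelParams: { model: "small-model", temperature: 0 },
  parseInput: (raw) => {
    if (isRecord(raw)) {
      const text = raw.text;
      if (typeof text === "string") return { text };
    }
    throw new Error("summarizer expects { text: string }");
  },
  // Stub behavior: keep only the first sentence.
  run: ({ text }) => {
    const first = text.split(".")[0] ?? text;
    return { summary: `${first}.` };
  },
};

const root: Deck<{ question: string; document: string }, { reply: string }> = {
  name: "root",
  modelParams: { model: "large-model", temperature: 0.7 },
  parseInput: (raw) => {
    if (isRecord(raw)) {
      const question = raw.question;
      const document = raw.document;
      if (typeof question === "string" && typeof document === "string") {
        return { question, document };
      }
    }
    throw new Error("root expects { question: string, document: string }");
  },
  // The root agent calls another agent through its typed interface.
  run: ({ question, document }) => {
    const { summary } = callDeck(summarizer, { text: document });
    return { reply: `Q: ${question} | context: ${summary}` };
  },
};

const out = callDeck(root, {
  question: "What is Gambit?",
  document: "Gambit is an agent harness. It is open source.",
});
console.log(out.reply);
```

Each deck carries its own model parameters, so a cheap model can handle the summarizing step while the root agent uses a larger one.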

Additionally, each step of the chain gets automatic evals, which we call graders. A grader is another deck type, but it’s designed to evaluate and score conversations (or individual conversation turns).
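
A grader that scores individual turns and rolls them up into a conversation score might be sketched like this. This is an assumption-laden illustration, not Gambit's grader interface: `Turn`, `Grade`, `TurnGrader`, and `gradeConversation` are hypothetical names, and a regex check stands in for what would normally be an LLM-backed rubric.

```typescript
// Hypothetical sketch, not Gambit's grader API.
type Turn = { role: "user" | "assistant"; content: string };
type Grade = { score: number; reason: string };
type TurnGrader = (turn: Turn) => Grade;

// Stand-in rubric: fail any assistant turn that contains an email address.
const noEmailLeak: TurnGrader = (turn) => {
  const leaked =
    turn.role === "assistant" && /[\w.+-]+@[\w-]+\.[\w.]+/.test(turn.content);
  return leaked
    ? { score: 0, reason: "assistant turn contains an email address" }
    : { score: 1, reason: "no email address found" };
};

// Grade every turn, then average the per-turn scores.
function gradeConversation(
  turns: Turn[],
  grader: TurnGrader,
): { mean: number; perTurn: Grade[] } {
  const perTurn = turns.map((turn) => grader(turn));
  const total = perTurn.reduce((sum, g) => sum + g.score, 0);
  const mean = perTurn.length === 0 ? 0 : total / perTurn.length;
  return { mean, perTurn };
}

const conversation: Turn[] = [
  { role: "user", content: "Who should I contact about billing?" },
  { role: "assistant", content: "Email jane.doe@example.com for help." },
  { role: "user", content: "Is there another way?" },
  { role: "assistant", content: "You can also use the support form." },
];

const report = gradeConversation(conversation, noEmailLeak);
console.log(report.mean, report.perTurn.map((g) => g.score));
```

Keeping per-turn grades alongside the mean makes it possible to point at the exact turn that failed the rubric.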

We also have test agents that you can define on a deck-by-deck basis. They’re designed to mimic scenarios your agent would face and to generate synthetic data for either humans or graders to grade.

Prior to Gambit, we had built an LLM-based video editor, and we weren’t happy with the results, which is what brought us down this path of improving inference-time LLM quality.

We know it’s missing some obvious parts, but we wanted to get this out there to see how it could help people or start conversations. We’re really happy with how it’s working with some of our early design partners, and we think it’s a way to implement a lot of interesting applications:

- Truly open source agents and assistants, where logic, code, and prompts can be easily shared with the community.

- Rubric-based grading to guarantee you (for instance) don’t leak PII accidentally.

- Spin up a usable bot in minutes and have Codex or Claude Code use our command line runner / graders to build a first version that is pretty good w/ very little human intervention.

We’ll be around if y’all have any questions or thoughts. Thanks for checking us out!

Walkthrough video: https://youtu.be/J_hQ2L_yy60

Comments

franciscomello•1h ago
This looks quite interesting in terms of the architecture. Seems like a fresh take on stuff like LangChain, which at least last time I checked sucks.
randall•20m ago
thx!
sofdao•22m ago
this is awesome

are things like file system baked in?

fan of the design of the system. looks great architecturally

randall•20m ago
omg thank you so much. We're working on the file system stuff, that's an easier lift for us than the initial work, so we wanted to start with the big stuff and work backward. Claude Code and Codex are obviously really great at that stuff, and we'd like to be able to support a lot of that out of the box.
alberson•7m ago
I’m excited to give this a spin at Agentive! Really interesting approach.
pych•6m ago
wow this looks cool - been meaning to dig into harness stuff this looks like a good starting point

Show HN: OpenWork – an open-source alternative to Claude Cowork

https://github.com/different-ai/openwork
129•ben_talent•1d ago•24 comments

Show HN: Control Claude permissions using a cloud-based decision table UI

https://github.com/rulebricks/claude-code-guardrails
12•sidgarimella•7h ago•8 comments

Show HN: TinyCity – A tiny city SIM for MicroPython (Thumby micro console)

https://github.com/chrisdiana/TinyCity
117•inflam52•11h ago•19 comments

Show HN: Tabstack – Browser infrastructure for AI agents (by Mozilla)

103•MrTravisB•1d ago•20 comments

Show HN: Tusk Drift – Turn production traffic into API tests

https://github.com/Use-Tusk/tusk-drift-cli
18•jy-tan•7h ago•0 comments

Show HN: The Hessian of tall-skinny networks is easy to invert

https://github.com/a-rahimi/hessian
22•rahimiali•5h ago•20 comments

Show HN: Sparrow-1 – Audio-native model for human-level turn-taking without ASR

https://www.tavus.io/post/sparrow-1-human-level-conversational-timing-in-real-time-voice
112•code_brian•1d ago•47 comments

Show HN: Munimet.ro – ML-based status page for the local subways in SF

https://munimet.ro/
7•MrEricSir•7h ago•0 comments

Show HN: Webctl – Browser automation for agents based on CLI instead of MCP

https://github.com/cosinusalpha/webctl
122•cosinusalpha•1d ago•35 comments

Show HN: Ghostty Ambient – Terminal theme switcher that learns your preferences

https://github.com/gezibash/ghostty-ambient
2•zimzima•3h ago•1 comments

Show HN: ContextFort – Visibility and controls for browser agents

https://contextfort.ai/
11•ashwinr2002•1d ago•1 comments

Show HN: Voice Composer – Browser-based pitch detection to MIDI/strudel/tidal

https://dioptre.github.io/tidal/
29•dioptre•4d ago•6 comments

Show HN: Beni AI – Real-time face-to-face AI companion

https://thebeni.ai/
4•chaeeunlee9611•1d ago•0 comments

Show HN: GoGen – A simple template-based file generator written in Go

https://github.com/zaheershaikh936/gogen
2•zaheer9360•5h ago•1 comments

Show HN: Tiny FOSS Compass and Navigation App (<2MB)

https://github.com/CompassMB/MBCompass
131•nativeforks•1d ago•45 comments

Show HN: HyTags – HTML as a Programming Language

https://hytags.org
67•lassejansen•2d ago•32 comments

Show HN: A 10KiB kernel for cloud apps

https://github.com/ReturnInfinity/BareMetal-Cloud
66•ianseyler•1d ago•11 comments

Show HN: I built an 11MB offline PDF editor because mobile Acrobat is 500MB

https://revpdf.com/
6•pawandeepsingh•7h ago•1 comments

Show HN: Cache Explorer – The Compiler Explorer for CPU Cache Behavior

https://github.com/AveryClapp/Cache-Explorer
2•AveryClapp•8h ago•0 comments

Show HN: Xoscript

https://xoscript.com/history.xo
53•gabordemooij•1d ago•43 comments

Show HN: Keypost – Policy enforcement for MCP pipelines

https://keypost.ai
3•kxb4032•8h ago•1 comments

Show HN: I'm building an open-source AI agent runtime using Firecracker microVMs

https://github.com/moru-ai/moru
2•markoh49•8h ago•0 comments

Show HN: Digital Carrot – Block social media with programmable rules and goals

https://www.digitalcarrot.app/
38•newswangerd•1d ago•11 comments

Show HN: A fast CLI and MCP server for managing Lambda cloud GPU instances

https://github.com/Strand-AI/lambda-cli
23•odedfalik•1d ago•2 comments

Show HN: OSS AI agent that indexes and searches the Epstein files

https://epstein.trynia.ai/
205•jellyotsiro•2d ago•95 comments

Show HN: 1D-Pong Game at 39C3

https://github.com/ogermer/1d-pong
67•oger•4d ago•13 comments

Show HN: Nogic – VS Code extension that visualizes your codebase as a graph

https://marketplace.visualstudio.com/items?itemName=Nogic.nogic
128•davelradindra•2d ago•50 comments

Show HN: An iOS budget app I've been maintaining since 2011

https://primoco.me/en/
158•Priotecs•2d ago•59 comments

Show HN: The Tsonic Programming Language

https://tsonic.org
59•jeswin•2d ago•9 comments