frontpage.
newsnewestaskshowjobs

Made with ♥ by @iamnishanth

Open Source @Github

fp.

Hello

1•otrebladih•1m ago•0 comments

FSD helped save my father's life during a heart attack

https://twitter.com/JJackBrandt/status/2019852423980875794
1•blacktulip•3m ago•0 comments

Show HN: Writtte – Draft and publish articles without reformatting, anywhere

https://writtte.xyz
1•lasgawe•5m ago•0 comments

Portuguese icon (FROM A CAN) makes a simple meal (Canned Fish Files) [video]

https://www.youtube.com/watch?v=e9FUdOfp8ME
1•zeristor•7m ago•0 comments

Brookhaven Lab's RHIC Concludes 25-Year Run with Final Collisions

https://www.hpcwire.com/off-the-wire/brookhaven-labs-rhic-concludes-25-year-run-with-final-collis...
2•gnufx•9m ago•0 comments

Transcribe your aunts post cards with Gemini 3 Pro

https://leserli.ch/ocr/
1•nielstron•13m ago•0 comments

.72% Variance Lance

1•mav5431•14m ago•0 comments

ReKindle – web-based operating system designed specifically for E-ink devices

https://rekindle.ink
1•JSLegendDev•16m ago•0 comments

Encrypt It

https://encryptitalready.org/
1•u1hcw9nx•16m ago•1 comments

NextMatch – 5-minute video speed dating to reduce ghosting

https://nextmatchdating.netlify.app/
1•Halinani8•17m ago•1 comments

Personalizing esketamine treatment in TRD and TRBD

https://www.frontiersin.org/articles/10.3389/fpsyt.2025.1736114
1•PaulHoule•18m ago•0 comments

SpaceKit.xyz – a browser‑native VM for decentralized compute

https://spacekit.xyz
1•astorrivera•19m ago•0 comments

NotebookLM: The AI that only learns from you

https://byandrev.dev/en/blog/what-is-notebooklm
1•byandrev•19m ago•1 comments

Show HN: An open-source starter kit for developing with Postgres and ClickHouse

https://github.com/ClickHouse/postgres-clickhouse-stack
1•saisrirampur•20m ago•0 comments

Game Boy Advance d-pad capacitor measurements

https://gekkio.fi/blog/2026/game-boy-advance-d-pad-capacitor-measurements/
1•todsacerdoti•20m ago•0 comments

South Korean crypto firm accidentally sends $44B in bitcoins to users

https://www.reuters.com/world/asia-pacific/crypto-firm-accidentally-sends-44-billion-bitcoins-use...
2•layer8•21m ago•0 comments

Apache Poison Fountain

https://gist.github.com/jwakely/a511a5cab5eb36d088ecd1659fcee1d5
1•atomic128•23m ago•2 comments

Web.whatsapp.com appears to be having issues syncing and sending messages

http://web.whatsapp.com
1•sabujp•23m ago•2 comments

Google in Your Terminal

https://gogcli.sh/
1•johlo•24m ago•0 comments

Shannon: Claude Code for Pen Testing: #1 on Github today

https://github.com/KeygraphHQ/shannon
1•hendler•25m ago•0 comments

Anthropic: Latest Claude model finds more than 500 vulnerabilities

https://www.scworld.com/news/anthropic-latest-claude-model-finds-more-than-500-vulnerabilities
2•Bender•29m ago•0 comments

Brooklyn cemetery plans human composting option, stirring interest and debate

https://www.cbsnews.com/newyork/news/brooklyn-green-wood-cemetery-human-composting/
1•geox•29m ago•0 comments

Why the 'Strivers' Are Right

https://greyenlightenment.com/2026/02/03/the-strivers-were-right-all-along/
1•paulpauper•31m ago•0 comments

Brain Dumps as a Literary Form

https://davegriffith.substack.com/p/brain-dumps-as-a-literary-form
1•gmays•31m ago•0 comments

Agentic Coding and the Problem of Oracles

https://epkconsulting.substack.com/p/agentic-coding-and-the-problem-of
1•qingsworkshop•32m ago•0 comments

Malicious packages for dYdX cryptocurrency exchange empties user wallets

https://arstechnica.com/security/2026/02/malicious-packages-for-dydx-cryptocurrency-exchange-empt...
1•Bender•32m ago•0 comments

Show HN: I built a <400ms latency voice agent that runs on a 4gb vram GTX 1650"

https://github.com/pheonix-delta/axiom-voice-agent
1•shubham-coder•32m ago•0 comments

Penisgate erupts at Olympics; scandal exposes risks of bulking your bulge

https://arstechnica.com/health/2026/02/penisgate-erupts-at-olympics-scandal-exposes-risks-of-bulk...
4•Bender•33m ago•0 comments

Arcan Explained: A browser for different webs

https://arcan-fe.com/2026/01/26/arcan-explained-a-browser-for-different-webs/
1•fanf2•35m ago•0 comments

What did we learn from the AI Village in 2025?

https://theaidigest.org/village/blog/what-we-learned-2025
2•mrkO99•35m ago•0 comments
Open in hackernews

Show HN: Gambit, an open-source agent harness for building reliable AI agents

https://github.com/bolt-foundry/gambit
91•randall•3w ago
Hey HN!

Wanted to show our open source agent harness called Gambit.

If you’re not familiar, agent harnesses are sort of like an operating system for an agent... they handle tool calling, planning, context window management, and don’t require as much developer orchestration.

Normally you might see an agent orchestration framework pipeline like:

compute -> compute -> compute -> LLM -> compute -> compute -> LLM

we invert this so with an agent harness, it’s more like:

LLM -> LLM -> LLM -> compute -> LLM -> LLM -> compute -> LLM

Essentially you describe each agent in either a self contained markdown file, or as a typescript program. Your root agent can bring in other agents as needed, and we create a typesafe way for you to define the interfaces between those agents. We call these decks.

Agents can call agents, and each agent can be designed with whatever model params make sense for your task.

Additionally, each step of the chain gets automatic evals, we call graders. A grader is another deck type… but it’s designed to evaluate and score conversations (or individual conversation turns).

We also have test agents you can define on a deck-by-deck basis, that are designed to mimic scenarios your agent would face and generate synthetic data for either humans or graders to grade.

Prior to Gambit, we had built an LLM based video editor, and we weren’t happy with the results, which is what brought us down this path of improving inference time LLM quality.

We know it’s missing some obvious parts, but we wanted to get this out there to see how it could help people or start conversations. We’re really happy with how it’s working with some of our early design partners, and we think it’s a way to implement a lot of interesting applications:

- Truly open source agents and assistants, where logic, code, and prompts can be easily shared with the community.

- Rubric based grading to guarantee you (for instance) don’t leak PII accidentally

- Spin up a usable bot in minutes and have Codex or Claude Code use our command line runner / graders to build a first version that is pretty good w/ very little human intervention.

We’ll be around if ya’ll have any questions or thoughts. Thanks for checking us out!

Walkthrough video: https://youtu.be/J_hQ2L_yy60

Comments

tomhow•3w ago
[under-the-rug stub]

[see https://news.ycombinator.com/item?id=45988611 for explanation]

franciscomello•3w ago
This looks quite interesting in terms of the architecture. Seems like a fresh take on stuff like Langchain, which at least last time I checked sucks.
randall•3w ago
thx!
sofdao•3w ago
this is awesome

are things like file system baked in?

fan of the design of the system. looks great architecturally

randall•3w ago
omg thank you so much. We're working on the file system stuff, that's an easier lift for us than the initial work, so we wanted to start with the big stuff and work backward. Claude Code and Codex are obviously really great at that stuff, and we'd like to be able to support a lot of that out of the box.
alberson•3w ago
I’m excited to give this a spin at Agentive! Really interesting approach.
pych•3w ago
wow this looks cool - been meaning to dig into harness stuff this looks like a good starting point
randall•3w ago
Thx! Happy to help if you need it. :)
randall•3w ago
thx, i appreciate it, believe it or not. :)
Trufa•3w ago
Is this an alternative to https://mastra.ai/docs

How would it compare?

randall•3w ago
So I look at something like Mastra (or LangChain) as agent orchestration, where you do computing tasks to line up things for an LLM to execute against.

I look at Gambit as more of an "agent harness", meaning you're building agents that can decide what to do more than you're orchestrating pipelines.

Basically, if we're successful, you should be able to chain agents together to accomplish things extremely simply (using markdown). Mastra, as far as I'm aware, is focused on helping people use programming languages (typescript) to build pipelines and workflows.

So yes it's an alternative, but more like an alternative approach rather than a direct competitor if that makes sense.

iainctduncan•3w ago
You might want to know that Gambit is an open source Scheme implementation that has been around a very long time.
benban•3w ago
nice work. the idea of breaking agents into short-lived executors with explicit inputs/outputs makes a lot of sense - most failures i've seen come from agents staying alive too long and leaking assumptions across steps.

curious how you're handling context lifetimes when agents call other agents. do you drop context between calls or is there a way to bound it? that's been the trickiest part for us.

randall•2w ago
right now yeah we’re just dropping context… sub agents are short lived.

thinking about ways to deal with that but we haven’t yet done it.

elgrantomate•3w ago
I've been playing with this for the past 24 hours or so. I like the atomic containment of the LLM, and the clear separation of logic, code, and prompts.

You have some great working examples, but, for example: translate_text specifies the default language in three places: the card, the input schema, and the deck. This can't be necessary; I'll experiment, but shouldn't it just be defined in one place?

The descriptive language of the project is a bit dense for me too. I'm having a hard time figuring out how to do basic things like parameters -- let's say that I want to constrain summarize_text to a certain length... I've tried to write language in the cards/decks, but the model doesn't seem to be paying attention.

I also want to be able to load a file, e.g. not just "translate 'hello my friend' to Italian" but "translate '/test/hello_my_friend.txt' to Italian" and have it load the contents of the file as input text. How do I do that?

Super cool project!

randall•2w ago
yeah the way to do that stuff is through zod schemas… input and output schemas.

you can set up really complex validation.

thanks for checking it out!!

elgrantomate•3w ago
also, it seems like this works with openrouter, and perhaps OpenAI -- what about Gemini API?
randall•2w ago
thanks for the PR! :)
niyikiza•2w ago
Nice architecture. The typed deck composition pattern is exactly right for making agent workflows testable.

One thing I've been thinking about is that schema validation catches "is this data shaped correctly?" but not "is this action permitted given who initiated the request?" When you have deck → child deck → grandchild deck chains, a prompt injection at any level could trigger actions the root caller never intended.

I've been working on offline capability verification for this using cryptographically signed warrants that attenuate as they propagate down the call chain. Curious if you've thought about that layer, or if you're relying on the model to self-police tool selection?

randall•2w ago
So two things.

1/ crypto signing is totally the right way to think about this. 2/ I'm limiting prompt injection by using chain of command: https://model-spec.openai.com/2025-12-18.html#chain_of_comma...

we have a "gambit_init" tool call that is synthetically injected into every call which has the context. Because it's the result of a tool call, it gets injected into layer 6 of the chain of command, so it's less likely to be subject to prompt injections.

Also, relatedly, yes i have thought EXTREMELY deeply about cryptographic primitives to replace HTTP with peer-to-peer webs of trust as the primary units of compute and information.

Imagine being able to authenticate the source of an image using "private blockchains" ala holepunch's hypercore.

niyikiza•2w ago
Injecting context via tool outputs to hit Layer 6 is a clever way to leverage the model spec.

The gap I keep coming back to is that even at Layer 6, enforcement is probabilistic. You are still negotiating with the model's weights. "Less likely to fail" is great for reliability, but hard to sell on a security questionnaire.

Tenuo operates at the execution boundary. It checks after the model decides and before the tool runs. Even if the model gets tricked (or just hallucinates), the action fails if the cryptographic warrant doesn't allow that specific action.

Re: Hypercore/P2P, I actually see that as the identity layer we're missing. You need a decentralized root of trust (Provenance) to verify who signed the Warrant (Authorization). Tenuo handles the latter, but it needs something like Hypercore for the former.

Would be curious to see how Gambit's Deck pattern could integrate with warrant-based authorization. Since you already have typed inputs/outputs, mapping those to signed capabilities seems like a natural fit.

randall•2w ago
yaaaaa exactly. You're totally on the same wavelength as me. Let's be friends lol
yencabulator•2w ago
> - Rubric based grading to guarantee you (for instance) don’t leak PII accidentally

That does not sound like a "guarantee", at all.

randall•2w ago
once your team comes to a consensus on what PII is, you can roughly guarantee it... especially as models improve.