
Show HN: Lightbox – Flight recorder for AI agents (record, replay, verify)

https://uselightbox.app/
3•Berticus12•4h ago
I built Lightbox because I kept running into the same problem: an agent would fail in production, and I had no way to know what actually happened.

Logs were scattered, the LLM’s “I called the tool” wasn’t trustworthy, and re-running wasn’t deterministic.

This week, tons of Clawdbot incidents have driven the point home. Agents with full system access can expose API keys and chat histories. Prompt injection is now a major security concern.

When agents can touch your filesystem, execute code, and browse the web… you probably need a tamper-evident record of exactly what actions they took, especially when a malicious prompt or compromised webpage could hijack an agent mid-session.

Lightbox is a small Python library that records every tool call an agent makes (inputs, outputs, timing) into an append-only log with cryptographic hashes. You can replay runs with mocked responses, diff executions across versions, and verify the integrity of logs after the fact.

Think airplane black box, but for your hackbox.

*What it does:*

- Records tool calls locally (no cloud, your infra)

- Tamper-evident logs (hash chain, verifiable)

- Replay failures exactly with recorded responses

- CLI to inspect, replay, diff, and verify sessions

- Framework-agnostic (works with LangChain, Claude, OpenAI, etc.)
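To make the hash-chain idea concrete, here is a minimal sketch of an append-only, tamper-evident tool-call log. It is not Lightbox's actual API or on-disk format: `append_event`, `verify`, the field names, and the all-zero genesis value are hypothetical, chosen only to illustrate the technique.

```python
import hashlib
import json

# Hypothetical starting value for the chain; not taken from Lightbox.
GENESIS = "0" * 64


def _digest(body: dict) -> str:
    # Canonical JSON (sorted keys, fixed separators) so equal content hashes equally.
    payload = json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def append_event(log: list, tool: str, inputs: dict, output, duration_ms: float) -> dict:
    """Append one tool call; its hash covers the previous entry's hash."""
    body = {
        "index": len(log),
        "tool": tool,
        "inputs": inputs,
        "output": output,
        "duration_ms": duration_ms,
        "prev_hash": log[-1]["hash"] if log else GENESIS,
    }
    entry = dict(body, hash=_digest(body))
    log.append(entry)
    return entry


def verify(log: list) -> bool:
    """Recompute every hash and check that each entry links to the one before it."""
    prev_hash = GENESIS
    for i, entry in enumerate(log):
        body = {k: v for k, v in entry.items() if k != "hash"}
        if body.get("index") != i or body.get("prev_hash") != prev_hash:
            return False
        if _digest(body) != entry.get("hash"):
            return False
        prev_hash = entry["hash"]
    return True
```

Editing, reordering, or removing an entry in the middle breaks every hash after it. A chain like this does not by itself detect truncation of the tail, so the latest hash would need to be stored somewhere the agent cannot write.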

*What it doesn’t do:*

- Doesn’t replay the LLM itself (just tool calls)

- Not a dashboard or analytics platform

- Not trying to replace LangSmith/Langfuse (different problem)
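Replaying tool calls without replaying the LLM could look roughly like the sketch below: the agent code runs as usual, but each tool call is answered from the recording instead of hitting the real tool. The `Replayer` class and the record shape (`tool`, `inputs`, `output`) are assumptions for illustration, not Lightbox's interface.

```python
class ReplayMismatch(Exception):
    """Raised when a replayed call diverges from the recording."""


class Replayer:
    """Hypothetical helper: serves recorded outputs in their original order."""

    def __init__(self, recorded):
        self._recorded = list(recorded)
        self._cursor = 0

    def call(self, tool: str, inputs: dict):
        if self._cursor >= len(self._recorded):
            raise ReplayMismatch(f"unexpected extra call to {tool!r}")
        expected = self._recorded[self._cursor]
        # Fail loudly if the agent asks for something other than what was recorded.
        if tool != expected["tool"] or inputs != expected["inputs"]:
            raise ReplayMismatch(
                f"call {self._cursor}: expected {expected['tool']!r}, got {tool!r}"
            )
        self._cursor += 1
        return expected["output"]
```

Matching strictly on order and inputs is one design choice; a looser replayer could match on tool name only, at the cost of hiding behavioral drift.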

*Use cases I care about:*

- Security forensics: the agent behaved strangely. Was it prompt injection? Check the trace.

- Compliance: “prove what your agent did last Tuesday”

- Debugging: reproduce a failure without re-running expensive API calls

- Regression testing: diff tool call patterns across agent versions
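For the regression-testing case, diffing tool call patterns can be as simple as reducing each run to a sequence of call signatures and comparing the sequences. This is a sketch using Python's standard `difflib`, with a hypothetical record shape and helper names (`call_signature`, `diff_runs`); it says nothing about how Lightbox's own diff command works.

```python
import difflib


def call_signature(entry: dict) -> str:
    # Reduce a call to its tool name plus sorted argument names (values ignored).
    return f'{entry["tool"]}({", ".join(sorted(entry["inputs"]))})'


def diff_runs(old: list, new: list) -> list:
    """Return unified-diff lines between two recorded runs (empty if identical)."""
    a = [call_signature(e) for e in old]
    b = [call_signature(e) for e in new]
    return list(difflib.unified_diff(a, b, "old", "new", lineterm=""))
```

Ignoring argument values keeps the diff focused on which tools were called and in what order; comparing values as well would catch more changes but also more noise.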

As agents get more capable and more autonomous (Clawdbot/Molt, Claude computer use, Manus, Devin), I think we’ll need black boxes the same way aviation does.

This is my attempt at that primitive.

It’s early (v0.1), intentionally minimal, MIT licensed.

Site: <https://uselightbox.app>

Install: `pip install lightbox-rec`

GitHub: <https://github.com/mainnebula/Lightbox-Project>

Would love feedback, especially from anyone thinking about agent security or running autonomous agents in production.
