Show HN: OpenCastor Agent Harness Evaluator Leaderboard

https://craigm26.github.io/OpenCastor/
3•craigm26•4h ago
I've been building OpenCastor, a runtime layer that sits between a robot's hardware and its AI agent. One thing that surprised me: the order in which you arrange the skill pipeline (context builder → model router → error handler, etc.), together with parameters like thinking_budget and context_budget, affects task success rates as much as model choice does.

So I built a distributed evaluator. Robots contribute idle compute to benchmark harness configurations against OHB-1, a small benchmark of 30 real-world robot tasks (grip, navigate, respond, etc.) using local LLM calls via Ollama. The search space is 263,424 configs (8 dimensions: model routing, context budget, retry logic, drift detection, etc.). The demo leaderboard shows results so far, broken down by hardware tier (Pi5+Hailo, Jetson, server, budget boards).
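
For readers wondering how a search space like this gets enumerated, here is a minimal sketch of a config grid built as a Cartesian product. The dimension names and values below are hypothetical placeholders, not OpenCastor's actual schema, and the toy grid is far smaller than the 263,424 configs mentioned above.

```python
from itertools import product

# Hypothetical dimensions and values, for illustration only. The real
# search space has 8 dimensions; these names are assumptions.
SEARCH_SPACE = {
    "model_routing": ["local_small", "local_large"],
    "context_budget": [2048, 4096, 8192],
    "thinking_budget": [0, 256, 1024],
    "retry_logic": ["none", "retry_once"],
    "drift_detection": [False, True],
}


def iter_configs(space):
    """Yield one dict per point in the Cartesian product of all dimensions."""
    keys = list(space)
    for values in product(*(space[k] for k in keys)):
        yield dict(zip(keys, values))


def space_size(space):
    """Number of configs: the product of the sizes of each dimension."""
    size = 1
    for values in space.values():
        size *= len(values)
    return size


configs = list(iter_configs(SEARCH_SPACE))
print(space_size(SEARCH_SPACE), "configs; first one:", configs[0])
```

Because the size is a product of dimension sizes, adding one more option to any dimension multiplies the number of configs to benchmark, which is presumably why spreading the work across idle robots is attractive.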

The current champion config is free to download as a YAML and apply to any robot. P66 safety parameters are stripped on apply — no harness config can touch motor limits or ESTOP logic.
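
One way the "stripped on apply" behaviour could work is sketched below. This is an assumption about the general approach, not OpenCastor's code: the key names `motor_limits` and `estop` and the function `apply_harness_config` are hypothetical.

```python
# Hypothetical names for the protected settings; the post only says that
# motor limits and ESTOP logic cannot be touched by a harness config.
PROTECTED_KEYS = frozenset({"motor_limits", "estop"})


def apply_harness_config(robot, harness, protected=PROTECTED_KEYS):
    """Merge harness into a copy of robot, skipping protected keys at any dict depth.

    Values inside lists are not inspected in this sketch.
    """
    merged = dict(robot)
    for key, value in harness.items():
        if key in protected:
            continue  # drop the protected setting from the downloaded config
        if isinstance(value, dict):
            # Recurse so protected keys nested in a section are skipped too.
            base = merged.get(key) if isinstance(merged.get(key), dict) else {}
            merged[key] = apply_harness_config(base, value, protected)
        elif isinstance(merged.get(key), dict):
            continue  # do not replace a whole section that may hold protected keys
        else:
            merged[key] = value
    return merged


robot = {
    "motor_limits": {"max_rpm": 100},
    "agent": {"context_budget": 2048, "estop": "hardware"},
}
harness = {
    "motor_limits": {"max_rpm": 9999},
    "agent": {"context_budget": 8192, "estop": "off"},
    "retry": 2,
}
print(apply_harness_config(robot, harness))
```

Filtering at apply time, rather than trusting the downloaded YAML, means a shared config can be malformed or malicious without changing the robot's safety settings.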

Looking for feedback on: (1) whether the benchmark tasks are representative and (2) whether the hardware tier breakdown is useful. I'd also like to hear from (3) anyone who's run fleet-wide distributed evals of agent configs, for robotics or otherwise.

Show HN: Cq – Stack Overflow for AI coding agents

https://blog.mozilla.ai/cq-stack-overflow-for-agents/
72•peteski22•10h ago•27 comments

Show HN: The King Wen Permutation: [52, 10, 2]

https://gzw1987-bit.github.io/iching-math/
54•gezhengwen•18h ago•27 comments

Show HN: WhyThere.life – Compare cities side-by-side to decide where to move

https://whythere.life
7•daversa•5h ago•1 comment

Show HN: VoidLLM – privacy-first LLM proxy (Go, self-hosted)

https://github.com/voidmind-io/voidllm
3•chrisremo85•3h ago•0 comments

Show HN: Mutatr – an open source A/B testing agent

https://github.com/novynlabs-repo/mutatr
3•AhmedAshraf•3h ago•0 comments

Show HN: Littlebird – Screenreading is the missing link in AI

https://littlebird.ai/
37•delu•9h ago•20 comments

Show HN: Dgs-CLI – 63-command CLI for D-Link DGS-1100 switches via Selenium

https://github.com/bobberb/dgs-cli
2•ShellackGobln7•4h ago•0 comments

Show HN: Burn Room – ephemeral SSH chat, messages burn after 1 hour

https://burnroom.chat
6•joematrix•6h ago•0 comments

Show HN: Revise – An AI Editor for Documents

https://revise.io
77•artursapek•1d ago•68 comments

Show HN: Codala, a social network built on scanning barcodes

https://play.google.com/store/apps/details?id=com.hsynkrkye.codala&hl=en
61•hsynkrkye•5d ago•27 comments

Show HN: Shrouded, secure memory management in Rust

https://github.com/thesis/shrouded
5•mhluongo•7h ago•0 comments

Show HN: Agent Kernel – Three Markdown files that make any AI agent stateful

https://github.com/oguzbilgic/agent-kernel
39•obilgic•19h ago•19 comments

Show HN: Aerko_ – An offline-first, Vanilla JavaScript fitness PWA with local AI

https://github.com/SrPakura/AERKO_PWA
7•SrPakura•11h ago•2 comments

Show HN: Primer – build software with AI agents one milestone at a time

https://github.com/armgabrielyan/primer
3•armen99•8h ago•6 comments

Show HN: JulIDE – Lightweight Julia IDE Built with Tauri

https://github.com/sinisterMage/JulIde
4•SinisterMage2•8h ago•3 comments

Show HN: Minimalist library to generate SVG views of scientific data

https://github.com/alefore/mini_svg/
5•afc•8h ago•0 comments

Show HN: Lockpaw – One hotkey to cover your Mac screen without putting it to sleep

https://github.com/sorkila/lockpaw
5•eriknielsen•13h ago•1 comment

Show HN: Atomic – Self-hosted, semantically-connected personal knowledge base

https://github.com/kenforthewin/atomic
144•kenforthewin•2d ago•23 comments

Show HN: Story Trainer, a self-guided tool for learning story structure

https://planetofthepaul.github.io/StoryTrainer/
2•minviex•10h ago•0 comments

Show HN: I made a tool for converting text snippets to shareable image

https://snip2img.com
5•wesammikhail•17h ago•1 comment

Show HN: We built a terminal-only Bluesky / AT Proto client written in Fortran

https://github.com/FormerLab/fortransky
144•FormerLabFred•3d ago•82 comments

Show HN: Sonar – A tiny CLI to see and kill whatever's running on localhost

https://github.com/RasKrebs/sonar
200•raskrebs•3d ago•80 comments

Show HN: Threadprocs – executables sharing one address space (0-copy pointers)

https://github.com/jer-irl/threadprocs
63•jer-irl•10h ago•40 comments

Show HN: Time Keep – Location timezones, timers, alarms, countdowns in one place

25•jmbuilds•4d ago•8 comments

Show HN: Termcraft – Terminal-first 2D sandbox survival in Rust

https://github.com/pagel-s/termcraft
135•sebosch•2d ago•25 comments

Show HN: Duplicate 3 layers in a 24B LLM, logical deduction .22→.76. No training

https://github.com/alainnothere/llm-circuit-finder
263•xlayn•5d ago•81 comments

Show HN: Three new Kitten TTS models – smallest less than 25MB

https://github.com/KittenML/KittenTTS
555•rohan_joshi•4d ago•183 comments

Show HN: Refrax – my Arc Browser replacement I made from scratch

https://refrax.website/
11•kageroumado•1d ago•6 comments

Show HN: GladAITor – Judge AI Products for Free

https://glad-ia-tor.com/
4•Enjoyooor•17h ago•2 comments