Show HN: Morph Reflexes – Multi-head classifiers for agent traces

3•bhaktatejas922•2h ago
The most common failures for production agents are behavioral: looping, reasoning leakage, user frustration, and more. Using a frontier model like GPT or Sonnet to judge every turn is too expensive and slow to run at scale.

What Reflexes are: semantic signals from agent traces, served fast and cheap over API. Built on custom kernels and a custom inference engine forked from vLLM.

Under the hood, it is a small LLM architected around multi-head inference. Small models need to be trained for specific tasks, but running 50 separate small models on the same input for 50 tasks makes no sense.

How it works: We take a modern LLM with hybrid attention and remove the decode step. One shared backbone reads the trace once, then many heads classify different signals, similar in spirit to 2019-era BERT/HYDRA and other older multi-head techniques. Our inference engine reuses the KV cache across inputs and shares compute across all reflexes, so 99% of prefill compute is reused from reflex to reflex. That gives us sub-30ms inference with less than 0.1% overhead for each additional reflex.
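A minimal sketch of the shared-backbone idea, in plain Python. This is a toy under stated assumptions, not the Morph engine: `backbone`, `Head`, `MultiHeadClassifier`, and `toy_head` are hypothetical names, the "backbone" is a trigram bucketing function standing in for a real model, and the head weights are placeholders rather than trained values. It only shows the shape of the computation: the expensive pass runs once per trace, and each extra head costs one dot product.

```python
import math
from typing import Dict, List

DIM = 8  # size of the toy feature vector


def backbone(trace: str) -> List[float]:
    """Toy stand-in for the shared model: bucket character trigrams into a unit vector."""
    vec = [0.0] * DIM
    for i in range(len(trace) - 2):
        bucket = sum(ord(c) for c in trace[i:i + 3]) % DIM
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


class Head:
    """One lightweight classifier: a linear layer plus a sigmoid."""

    def __init__(self, weights: List[float], bias: float = 0.0) -> None:
        self.weights = weights
        self.bias = bias

    def score(self, features: List[float]) -> float:
        z = self.bias + sum(w * f for w, f in zip(self.weights, features))
        return 1.0 / (1.0 + math.exp(-z))


class MultiHeadClassifier:
    def __init__(self, heads: Dict[str, Head]) -> None:
        self.heads = heads
        self.backbone_calls = 0  # counts how often the expensive step runs

    def classify(self, trace: str) -> Dict[str, float]:
        self.backbone_calls += 1
        features = backbone(trace)  # expensive step, done once per trace
        # Every head reuses the same features, so adding a head adds one dot product.
        return {name: head.score(features) for name, head in self.heads.items()}


def toy_head(seed: int) -> Head:
    """Deterministic placeholder weights; a real head would be trained."""
    return Head([((i * 7 + seed) % 5 - 2) / 2 for i in range(DIM)])


clf = MultiHeadClassifier({
    "looping": toy_head(1),
    "reasoning_leakage": toy_head(2),
    "user_frustration": toy_head(3),
})
scores = clf.classify("user: this still does not work\nagent: retrying the same tool call")
print(scores, "backbone calls:", clf.backbone_calls)
```

Running 50 separate small models would instead call something like `backbone` 50 times on the same input, which is the waste the shared design avoids.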

We took the same high-level idea and did the hard work of making it work with a modern architecture and attention mechanism. With it, we run inference in under 30ms and serve the full request in under 90ms. Whether you run 4 reflexes or 100, the extra overhead is less than 2ms.
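A back-of-the-envelope cost model for the numbers above. The 30ms backbone figure is the upper bound quoted in the post; the per-head cost of 0.02ms is an assumption picked only so that going from 4 to 100 heads adds under 2ms, and the "separate models" case assumes each task pays for its own full forward pass. The function names are hypothetical.

```python
def separate_models_ms(n_tasks: int, per_model_ms: float) -> float:
    """Total model time when every task runs its own full forward pass."""
    return n_tasks * per_model_ms


def shared_backbone_ms(n_heads: int, backbone_ms: float, per_head_ms: float) -> float:
    """Total model time when one backbone pass feeds every head."""
    return backbone_ms + n_heads * per_head_ms


BACKBONE_MS = 30.0   # upper bound quoted in the post
PER_HEAD_MS = 0.02   # assumed for illustration, not a measured figure

for n in (4, 50, 100):
    separate = separate_models_ms(n, BACKBONE_MS)
    shared = shared_backbone_ms(n, BACKBONE_MS, PER_HEAD_MS)
    print(f"{n:>3} reflexes: separate={separate:.0f}ms shared={shared:.2f}ms")
```

The separate-model cost grows linearly with the number of tasks, while the shared cost stays close to a single backbone pass.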

Why does optimizing this matter?

If you’re even a medium-sized startup, you’re dealing with tens of thousands of agent runs and millions of turns. If you want to track things like user frustration rates over time, frontier LLM-as-judge does not scale.
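As an illustration of what "tracking a rate over time" means once every turn carries a signal, here is a small aggregation sketch. The record shape (a date string plus a boolean flag) and the function name `frustration_rate_by_day` are assumptions for the example, and the sample data is made up.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def frustration_rate_by_day(turns: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """Return the share of flagged turns per day, with days in sorted order."""
    total: Dict[str, int] = defaultdict(int)
    flagged: Dict[str, int] = defaultdict(int)
    for day, is_frustrated in turns:
        total[day] += 1
        if is_frustrated:
            flagged[day] += 1
    return {day: flagged[day] / total[day] for day in sorted(total)}


# Made-up sample data: (ISO date, whether the frustration signal fired on that turn).
sample_turns = [
    ("2026-01-01", False), ("2026-01-01", True),
    ("2026-01-02", False), ("2026-01-02", False),
    ("2026-01-02", True), ("2026-01-02", False),
]
rates = frustration_rate_by_day(sample_turns)
print(rates)
```

The aggregation itself is cheap; the hard part at millions of turns is producing the per-turn flag, which is where a frontier judge gets expensive.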

I built a similar stack at Tesla. When ML engineers needed to sample data across petabytes for signals like `is_camera_obfuscated=true`, along with 200 other things, they needed to 1) spin them up quickly and 2) run them efficiently at scale.

What it is not: a dashboard. 99% of dashboards go unused. Reflexes is 100% API-first and made for devs who want to use the signals to trigger their own stuff.
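One way "trigger your own stuff" could look on the consuming side, sketched without calling any real API. The score dictionary, the thresholds, and the names `dispatch` and `page_on_call` are all hypothetical; the actual response shape is whatever the docs specify.

```python
from typing import Callable, Dict, List, Tuple

Handler = Callable[[str, float], None]


def dispatch(scores: Dict[str, float], rules: Dict[str, Tuple[float, Handler]]) -> List[str]:
    """Call the handler for every signal whose score meets its threshold."""
    fired = []
    for name, (threshold, handler) in rules.items():
        score = scores.get(name)
        if score is not None and score >= threshold:
            handler(name, score)
            fired.append(name)
    return fired


alerts: List[str] = []


def page_on_call(name: str, score: float) -> None:
    alerts.append(f"{name}={score:.2f}")


# Hypothetical per-turn scores; the real API response shape may differ.
turn_scores = {"looping": 0.91, "user_frustration": 0.12}
rules = {
    "looping": (0.8, page_on_call),
    "user_frustration": (0.5, page_on_call),
}
fired = dispatch(turn_scores, rules)
print(fired, alerts)
```

Keeping thresholds and handlers in a plain mapping means new signals can be wired up without touching the dispatch loop.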

Vibetrain a custom reflex in our dashboard and/or let it self-improve in production: https://www.morphllm.com/dashboard/reflex

Docs: https://docs.morphllm.com/sdk/components/reflexes/index

I’d love feedback from people running agents in prod: what do you wish you could track over time across 100% of turns but can’t right now?

TLDR: semantic signals from agent traces, super fast, cheap via API

Show HN: My 13-year-old built an ant colony tracker

https://formicarium.es
30•abelgvidal•7h ago•22 comments

Show HN: Free Online GIS Viewer and Format Converter

https://geodataviewer.com/
2•twainyoung•14m ago•0 comments

Show HN: TakoVM – open-source sandboxing for your agent's code

https://github.com/Tako-Research/TakoVM
2•sakuraiben•1h ago•0 comments

Show HN: Open-source restreaming and live studio

https://github.com/muxshed/shed
3•franticstone•1h ago•2 comments

Show HN: Openleetcode – LeetCode runner where tests live in the repo

https://github.com/therepanic/openleetcode/releases/tag/v1.0.0
4•therepanic•3h ago•0 comments

Show HN: Kage, verification and freshness for Google's OKF agent memory

https://kage-core.com/
3•kage18•3h ago•0 comments

Show HN: Jensen – a Deus Ex: Human Revolution theme for 30 developer apps

https://tomaytotomato.github.io/jensen/
3•tomaytotomato•3h ago•0 comments

Show HN: Clusy – Cursor for data science notebooks in cloud

https://www.clusy.io/
5•eldar_hsnv•6h ago•0 comments

Show HN: Shot-scraper video tool for recording YAML-defined webapp feature demos

https://simonwillison.net/2026/Jun/30/shot-scraper-video/
5•simonw•6h ago•1 comments

Show HN: I made a heatmap of 3400 VCs who are open to cold emails

https://apparent.social/heat-map
23•west_subject•5h ago•27 comments

Show HN: Makes local LLMs faster and more reliable by optimizing for your device

https://www.autotunellm.com/
5•tanavc•5h ago•0 comments

Show HN: I built an AI agent to yell at me about my ADHD

https://0xff.nu/hex/
3•hxii•6h ago•0 comments

Show HN: fenic – LLMs as dataframe operators, query meaning and structure

https://github.com/typedef-ai/fenic
3•cpard•7h ago•0 comments

Show HN: Openleetcode – local LeetCode runner with open test suites

https://github.com/therepanic/openleetcode
3•therepanic•7h ago•0 comments

Show HN: Don't ask if devs cheat with AI, test if they're good with it

https://tryevaluator.com
5•skyepstein•8h ago•3 comments

Show HN: Classic Minesweeper

https://guokai.dev/minesweeper/
9•hanguokai•15h ago•8 comments

Show HN: OM Core – multidimensional models without spreadsheet cell formulas

https://github.com/cloudcell/om-core
2•cloudcell•8h ago•1 comments

Show HN: Curvytron 2, I rewrote my browser party game, 10 years later

https://curvytron2.com/
2•tom32i•9h ago•0 comments

Show HN: Shoaku – Your Coding Navigator

https://github.com/seachicken/intellij-shoaku
4•seachicken•9h ago•4 comments

Show HN: Second opinion – A skill to query different models

https://github.com/kmcheung12/second-opinion
4•a_c•9h ago•2 comments

Show HN: PDFMergely – In-browser PDF tools that never upload your files

https://pdfmergely.com
16•pdfmergely•17h ago•19 comments

Show HN: Agentic Orchestrator, a TUI for long-running coding agents

https://github.com/doordash-oss/agentic-orchestrator
15•ivrr•22h ago•2 comments

Show HN: DRM-Free Books

https://frequal.com/Perspectives/DrmFreeAuthors.html
118•TeaVMFan•2d ago•46 comments

Show HN: TraceAIO – open-source LLM visibility tracker

https://traceaio.org
6•owenthejumper•10h ago•1 comments

Show HN: Zanagrams

https://zanagrams.com/
390•pompomsheep•2d ago•104 comments

Show HN: NodePad – AI agent on a canvas instead of a linear chat

https://node-pad.com/
5•palazski•11h ago•0 comments

Show HN: Running Gemma-4 26B at 124 tokens/SEC on a CPU, no GPU

https://apeg.dev/writing/running-gemma4-26b-on-a-cpu/
10•arun-prasath•11h ago•1 comments

Show HN: Privacy policy generator for AI apps (LLM disclosure, EU AI Act)

https://ai-policy-gen.pages.dev
6•wyss0513•13h ago•2 comments

Show HN: Bash4LLM+ – A lightweight, dependency-free Bash wrapper for LLM APIs

https://github.com/kamaludu/bash4llm/
60•kamaludu•2d ago•22 comments