Show HN: LoopGain – Cut agent API spend by measuring when loops stop improving

1•fitz2882•1h ago

Hey HN — Dave here. While building a multi-agent system recently, I kept noticing that the diagrams of agents passing feedback to each other looked just like the electrical circuit diagrams from control theory. I got curious whether the established math transferred, and to my surprise it did. LoopGain is the first product of that research: an open-source library that replaces the max_iterations=N cap on agent loops with an actual measurement of whether the loop is still improving.

The cap is how nearly every verify-revise loop gets stopped today, and it's wrong in both directions. Stop too early and you clip a loop that was still improving. Stop too late and you pay for iterations after the loop already found its best answer — and ship the final attempt, which is sometimes worse than one it already had.

On a 2,000-trial benchmark (paired real-API runs, five loop patterns across six framework adapters, three model providers, pre-registered protocol with kill criteria), LoopGain cut total API spend by 92.8% vs max_iterations=20 ($27.05 → $1.94) and cut median wall-clock time by about 15× (30.9s → 2.1s). A cross-vendor judge preferred LoopGain's outputs on the weighted average (0.678 across 1,800 pairwise comparisons), mostly because LoopGain returns the iteration with the lowest error it saw, while a fixed cap ships whatever the final iteration produced. The raw data and methodology are public, and the full run is browsable at dashboard.loopgain.ai/benchmark.
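For anyone who wants to sanity-check the headline numbers, here is the arithmetic as a short Python sketch. It uses only the four figures quoted above; the variable names are mine and are not part of LoopGain.

```python
# Figures quoted in the post (fixed cap of 20 iterations vs LoopGain).
cap_cost, lg_cost = 27.05, 1.94          # total API spend in dollars
cap_latency, lg_latency = 30.9, 2.1      # median wall-clock in seconds

# Relative spend reduction, as a percentage.
spend_cut_pct = round((1 - lg_cost / cap_cost) * 100, 1)

# Wall-clock speedup factor.
speedup = round(cap_latency / lg_latency, 1)

print(f"spend cut: {spend_cut_pct}%  speedup: {speedup}x")
```

The speedup works out slightly under 15, which is why it is quoted as approximate.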

How it works: each iteration, LoopGain takes the ratio of the current error to the previous error — the loop's empirical loop gain (Aβ, borrowed from control theory). Aβ<1 means the error shrank — the loop is improving. Aβ≥1 means it held or grew — the loop is stuck or making things worse. A trajectory classifier reads the recent Aβ values, labels the loop (FAST_CONVERGE / CONVERGING / STALLING / OSCILLATING / DIVERGING), and decides whether to keep going, stop here, or stop and roll back to the lowest-error output so far.
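To make the ratio concrete, here is a minimal Python sketch of the idea described above. It is not LoopGain's actual classifier: the function names, the thresholds (0.5 and 0.95), the "any gain above 1 mixed with gains below 1 counts as oscillating" rule, and the UNKNOWN placeholder are all assumptions chosen for illustration.

```python
from typing import List


def loop_gains(errors: List[float]) -> List[float]:
    """Ratio of each error to the previous one (the empirical loop gain)."""
    gains = []
    for prev, cur in zip(errors, errors[1:]):
        if prev == 0:
            break  # error already hit zero; the ratio is undefined past here
        gains.append(cur / prev)
    return gains


def classify(gains: List[float], fast: float = 0.5, stall: float = 0.95) -> str:
    """Toy trajectory label. Thresholds are illustrative assumptions."""
    if not gains:
        return "UNKNOWN"  # placeholder: fewer than two observations
    above = [g > 1.0 for g in gains]
    if all(above):
        return "DIVERGING"      # error grew on every step
    if any(above):
        return "OSCILLATING"    # mix of growing and shrinking steps
    if max(gains) <= fast:
        return "FAST_CONVERGE"  # error at least halved on every step
    if max(gains) < stall:
        return "CONVERGING"     # steady but slower improvement
    return "STALLING"           # error barely moving


def best_index(errors: List[float]) -> int:
    """Index of the lowest-error iteration, i.e. the rollback target."""
    return min(range(len(errors)), key=lambda i: errors[i])


for trace in ([8, 4, 2, 1], [10, 8, 6.4], [5, 5, 5], [4, 6, 3, 5], [2, 3, 5]):
    print(trace, classify(loop_gains(trace)), "best at", best_index(trace))
```

The [4, 6, 3, 5] trace shows why rollback matters: a fixed cap would ship the final attempt (error 5), while the lowest-error attempt was the third one (error 3).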

Integration is a few lines around any loop that produces an error signal:

  from loopgain import LoopGain
  lg = LoopGain(target_error=0.0)
  output = generate(task)                # first attempt
  while lg.should_continue():
      errors = verify(output)            # e.g. count of failing tests
      lg.observe(errors, output=output)  # the only LoopGain call in the loop
      output = revise(output, errors)
  result = lg.result   # best_output, outcome, convergence_profile, savings_vs_fixed_cap

Honest limits, because they matter more than the headline: LoopGain detects convergence, not correctness — it inherits your verifier's blind spots. I re-graded my own benchmark and 4.5% of "converged" code-gen runs passed every check the loop ran but failed a fuller held-out test suite. And savings depend on workload: failure-heavy loops save ~78–84%, not 92.8%. There's a writeup on designing verifiers strong enough to trust on the blog.

Apache-2.0, pip install loopgain. Adapters for LangGraph, CrewAI, AutoGen, LangChain, OpenAI Agents SDK, and the Claude Agent SDK; the raw API works for anything with a measurable error.

What I'd really love from HN: if you run production agent loops, I'm interested in whether the stop decisions match what you see empirically — and what your loops' error signals actually look like, since the verifier is what makes or breaks the stop. Happy to answer anything.
