To manage operations for my independent video game studio, I built a trust system that works more like onboarding a new hire. Agents start in draft mode (every action needs approval), and earn autonomy over time based on their track record in specific task categories.
The core idea: each agent maintains a separate Beta distribution per task category (support triage, expense reports, publisher emails, etc.). A Beta distribution is basically a track record parameterized by successes and failures. But the raw expectation E[p] = α/(α+β) can't tell the difference between "9 successes, 1 failure" and "90 successes, 10 failures": both give E[p] = 0.90. So I use Jøsang's Subjective Logic to map these to opinion tuples that explicitly separate belief from uncertainty. High uncertainty means "not enough data yet," which is different from "we know this agent is bad."
Every action passes through a gate:
VoI = stakes × (1 - trust) × uncertainty
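The gate is just a product of three signals, each normalized to [0, 1]. A minimal sketch (the function names and the 0.05 threshold here are illustrative, not from the real system):

```python
def voi(stakes: float, trust: float, uncertainty: float) -> float:
    """Value of Information: high stakes, low trust, or high uncertainty
    each push VoI up, which pushes the action toward human review."""
    return stakes * (1 - trust) * uncertainty

def route(stakes: float, trust: float, uncertainty: float,
          threshold: float = 0.05) -> str:
    # Threshold is illustrative; tune it per deployment.
    return "draft" if voi(stakes, trust, uncertainty) > threshold else "auto"

# A trusted agent on a routine task sails through; the same agent on a
# high-stakes task with a thin track record gets drafted for review.
```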
Low VoI = auto-execute. High VoI = draft for human review. Static trust thresholds set the maximum autonomy level an agent can reach (Auto-Execute, Soft-Execute, Draft, Restricted), and VoI acts as a secondary gate that can restrict it further based on context: an agent might qualify for auto-execute in general, but a high-stakes situation still gets flagged.

Three things that made the biggest difference:
1. Edit distance feedback. If you rewrite half an email before hitting "approve," the system notices. A 0% edit = full trust credit. A 71%+ rewrite = penalty. This single change prevented agents from reaching auto-execute on work users were quietly fixing.
2. Time-based decay. Trust scores decay daily for inactive categories (λ = 0.95). If an agent hasn't done a task in two months, it gets supervised again. This also handles model upgrades, since the track record was earned on a different model.
3. Weakest-link chains. Multi-step workflows (send welcome email → create project → schedule meeting → notify team) use a weakest-link model. If any step needs approval, the whole chain surfaces as one inbox item. Nothing runs until you approve the full picture.
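The edit-distance feedback (item 1) can be approximated with the standard library's difflib; the 0% and 71% thresholds come from the post, but the linear mapping in between is my assumption:

```python
import difflib

def edit_fraction(draft: str, final: str) -> float:
    """Approximate fraction of the draft the user rewrote (0.0 to 1.0)."""
    return 1.0 - difflib.SequenceMatcher(None, draft, final).ratio()

def trust_credit(frac: float) -> float:
    # Untouched drafts earn full credit; heavy rewrites (>= 71% changed)
    # count against the agent. The linear middle band is illustrative.
    if frac == 0.0:
        return 1.0
    if frac >= 0.71:
        return -1.0
    return 1.0 - frac
```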
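For the time-based decay (item 2), the post doesn't spell out the mechanics; one plausible sketch shrinks the evidence counts toward the uniform Beta(1, 1) prior by λ per idle day, so stale track records regain uncertainty rather than keeping old belief:

```python
def decay_counts(alpha: float, beta: float, days_idle: int,
                 lam: float = 0.95) -> tuple[float, float]:
    """Decay the evidence (pseudo-counts beyond the Beta(1, 1) prior).

    After two idle months, 0.95**60 ~= 0.05, so almost all of the old
    evidence is gone and the agent is effectively supervised again.
    """
    factor = lam ** days_idle
    return 1 + (alpha - 1) * factor, 1 + (beta - 1) * factor
```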
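And the weakest-link rule (item 3) reduces to taking the most restrictive tier across a chain's steps. A sketch, assuming the four tiers from above ordered from most to least autonomous:

```python
def chain_tier(step_tiers: list[str]) -> str:
    """Weakest-link: the whole chain runs at the most restrictive tier
    of any step, so a single 'draft' step drafts the entire chain."""
    order = ["auto_execute", "soft_execute", "draft", "restricted"]
    return max(step_tiers, key=order.index)
```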
The core mapping from track record to opinion looks like this:
from collections import namedtuple

Opinion = namedtuple("Opinion", "belief disbelief uncertainty base_rate")

def beta_to_opinion(alpha, beta, base_rate=0.5):
    # With a uniform Beta(1, 1) prior, alpha - 1 and beta - 1 are the
    # observed successes and failures; belief + disbelief + uncertainty = 1.
    n = alpha + beta
    return Opinion(
        belief=(alpha - 1) / n,
        disbelief=(beta - 1) / n,
        uncertainty=2 / n,
        base_rate=base_rate,
    )
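To see why the separation matters, compare two track records with the same expected success rate (a self-contained rerun of the same mapping, returning a plain dict):

```python
def beta_to_opinion(alpha, beta, base_rate=0.5):
    n = alpha + beta
    return {"belief": (alpha - 1) / n, "disbelief": (beta - 1) / n,
            "uncertainty": 2 / n, "base_rate": base_rate}

thin = beta_to_opinion(9, 1)     # little evidence, E[p] = 9/10 = 0.90
thick = beta_to_opinion(90, 10)  # lots of evidence, E[p] = 90/100 = 0.90
# thin:  belief 0.80, uncertainty 0.20  -> "not enough data yet"
# thick: belief 0.89, uncertainty 0.02  -> "we actually know this agent"
```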
The math is all well-established (Beta distributions, Subjective Logic, Value of Information). The part that worked was combining them into something that mirrors how trust actually develops between people.

Article with full implementation details, code examples, and diagrams: https://kenschachter.substack.com/p/earned-autonomy