For the past 25 days I’ve been building CASPER 4 — a deterministic autonomy research stack focused entirely on governance, observability, and operator-over-the-loop control.
No weaponization, no actuation, no external commands.
This is a simulation environment for studying safety, clarity, and decision making in complex autonomy systems.
What makes CASPER different:
Most autonomy demos optimize for capability. CASPER is built around explainability, replayability, bounded authority, and human gating. The system must be able to justify its internal state transitions at every tick, or it isn’t allowed to act.
⸻
Core ideas
1. Determinism as a first-class requirement
Every domain (flight model, AO environment, swarm, vision, governance) runs on its own isolated deterministic RNG stream.
Replays reproduce bit-for-bit identical state sequences.
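As a rough sketch of the per-domain stream idea (simplified, using NumPy's SeedSequence rather than anything CASPER-specific):

```python
import numpy as np

# One master seed drives the whole run; each domain gets its own
# independently spawned child stream so domains never share RNG state.
MASTER_SEED = 42
DOMAINS = ["flight", "ao_environment", "swarm", "vision", "governance"]

def make_streams(master_seed: int) -> dict:
    root = np.random.SeedSequence(master_seed)
    children = root.spawn(len(DOMAINS))
    return {name: np.random.default_rng(seq) for name, seq in zip(DOMAINS, children)}

streams = make_streams(MASTER_SEED)
# Draws in one domain never perturb another, and re-running with the same
# master seed reproduces every stream bit-for-bit.
imu_noise = streams["flight"].normal(0.0, 0.01, size=3)
```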
2. Immutable telemetry model
All inputs and transitions produce a new state object. No mutation, no hidden flags.
This forces clean reasoning surfaces and makes post-hoc analysis tractable.
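In spirit, every transition is a frozen-dataclass hop from one state object to the next (illustrative field names, not the real telemetry schema):

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class TickState:
    tick: int
    altitude_m: float
    comms_quality: float

def apply_comms_fade(state: TickState, fade: float) -> TickState:
    # Transitions return a new object; the prior state stays intact
    # for replay and post-hoc diffing.
    return replace(state, tick=state.tick + 1,
                   comms_quality=max(0.0, state.comms_quality - fade))

s0 = TickState(tick=0, altitude_m=12000.0, comms_quality=1.0)
s1 = apply_comms_fade(s0, fade=0.05)
assert s0.comms_quality == 1.0  # earlier state was never mutated
```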
3. Factor-graph clarity engine + multi-vector risk model
Instead of collapsing the system into a single “confidence score,” CASPER computes clarity and risk as explicit factor contributions (envelope pressure, environmental noise, comms degradation, nav drift, threat fields, anomaly load).
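A toy version of the idea, with the factor names from above but made-up weights and a plain weighted sum standing in for the full factor graph:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FactorReport:
    contributions: dict  # each factor's signed contribution to clarity

    @property
    def total(self) -> float:
        return sum(self.contributions.values())

def clarity_report(envelope_pressure, env_noise, comms_degradation,
                   nav_drift, threat_field, anomaly_load) -> FactorReport:
    # Keep every factor visible so an operator can see *why* clarity
    # dropped, not just that it did.
    return FactorReport(contributions={
        "envelope_pressure":   -0.30 * envelope_pressure,
        "environmental_noise": -0.20 * env_noise,
        "comms_degradation":   -0.25 * comms_degradation,
        "nav_drift":           -0.10 * nav_drift,
        "threat_field":        -0.10 * threat_field,
        "anomaly_load":        -0.05 * anomaly_load,
    })

report = clarity_report(0.4, 0.1, 0.6, 0.05, 0.2, 0.0)
print(report.contributions, report.total)
```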
4. Governance-first autonomy
Operators review proposals, not actions.
Changes to the simulated autonomy environment require:
• a reversible proposal,
• a bounds check,
• and an operator decision packet.
Gate closed → no state transitions allowed.
Gate open → decisions applied through a deterministic corridor reshaper (CRS-1).
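A stripped-down illustration of the gate flow (the real proposal and decision packets carry much more, and CRS-1 is reduced here to a placeholder return value):

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

class Decision(Enum):
    APPROVE = "approve"
    REJECT = "reject"

@dataclass(frozen=True)
class Proposal:
    corridor_width_m: float   # the reversible change being requested
    rollback_width_m: float   # what to restore if the change is reversed

def within_bounds(p: Proposal, max_width_m: float = 5000.0) -> bool:
    return 0.0 < p.corridor_width_m <= max_width_m

def apply_if_gated(p: Proposal, decision: Decision, gate_open: bool) -> Optional[float]:
    # Gate closed, failed bounds check, or a rejection all mean no transition.
    if not gate_open or not within_bounds(p) or decision is not Decision.APPROVE:
        return None
    return p.corridor_width_m  # stand-in for the deterministic reshape step

new_width = apply_if_gated(Proposal(1200.0, 1500.0), Decision.APPROVE, gate_open=True)
```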
5. Full auditability
Every event, state diff, and governance action is written to a SHA-256/Merkle audit chain.
Replay mode verifies the chain and pinpoints any divergence at the exact tick where it occurs.
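The hash-chain part, reduced to its simplest form (one block per tick, no Merkle tree, illustrative field names):

```python
import hashlib
import json
from typing import Optional

def chain_blocks(events: list) -> list:
    """Link one block per tick: each block commits to its event payload and to
    the previous block's hash, so any edited tick breaks the chain from there on."""
    blocks, prev_hash = [], "0" * 64
    for tick, event in enumerate(events):
        payload = json.dumps(event, sort_keys=True)
        digest = hashlib.sha256((prev_hash + payload).encode()).hexdigest()
        blocks.append({"tick": tick, "event": event, "prev": prev_hash, "hash": digest})
        prev_hash = digest
    return blocks

def first_divergence(chain_a: list, chain_b: list) -> Optional[int]:
    # Replay verification: compare hashes tick-by-tick and report
    # the first tick where two runs disagree.
    for a, b in zip(chain_a, chain_b):
        if a["hash"] != b["hash"]:
            return a["tick"]
    return None
```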
6. Synthetic but realistic telemetry
High-speed flight envelope, thermal model, q-pressure, IMU drift, threat field evolution, civilian density, nav drift, and comms loss — all deterministic, parameterized, and reproducible.
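For a sense of scale, a single channel like IMU drift can be as simple as a seeded random walk, so the same parameters always reproduce the same series (numbers here are arbitrary):

```python
import numpy as np

def imu_drift_series(seed: int, ticks: int, sigma: float = 0.002) -> np.ndarray:
    """Deterministic random-walk drift: the same (seed, ticks, sigma)
    always yields the same series, so the telemetry is reproducible."""
    rng = np.random.default_rng(seed)
    return np.cumsum(rng.normal(0.0, sigma, size=ticks))

drift = imu_drift_series(seed=7, ticks=600)  # 600 ticks of repeatable drift
```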
7. Swarm model
Finite-state agents with deterministic comms degradation, role assignments, formation logic, and AO-constrained movement.
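The agent logic is finite-state; a minimal stand-in for one transition function might look like this (roles and thresholds are illustrative only):

```python
from enum import Enum

class Role(Enum):
    SCOUT = "scout"
    RELAY = "relay"
    HOLD = "hold"

def next_role(role: Role, comms_quality: float, in_ao: bool) -> Role:
    # Deterministic transitions: degraded comms pushes an agent toward
    # relaying; leaving the AO forces a hold.
    if not in_ao:
        return Role.HOLD
    if comms_quality < 0.5:
        return Role.RELAY
    return Role.SCOUT
```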
8. Synthetic vision
Tile-based renderer (PIL) with HUD parallax, scanline drift, and environment overlays. Stateless and deterministic per-frame.
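Stateless here means each frame is a pure function of the tick and whatever state is fed in; a bare-bones PIL sketch of that property (not the actual renderer):

```python
from PIL import Image, ImageDraw

def render_frame(tick: int, width: int = 320, height: int = 240) -> Image.Image:
    """Replaying the same tick reproduces the same pixels, because nothing
    is carried over between frames."""
    img = Image.new("RGB", (width, height), (8, 12, 10))
    draw = ImageDraw.Draw(img)
    offset = tick % height  # scanline drift derived deterministically from the tick
    for y in range(0, height, 8):
        yy = (y + offset) % height
        draw.line([(0, yy), (width, yy)], fill=(0, 80, 0))
    return img

frame = render_frame(tick=42)
```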
⸻
What CASPER is for
Not autonomy execution.
Not control loops.
Not targeting or actuation.
CASPER is a governance simulation lab for studying:
• explainable autonomy pipelines
• reversible decisions
• operator cognitive load
• auditability and replay
• swarm observability
• failure-mode reconstruction
• human-in-the-loop autonomy under stress
The intent is to push toward safe autonomy, where the system can always explain what it believed, why it acted, and how its internal confidence evolved.
⸻
Tech stack
Python, Streamlit UI, PyDeck maps, NumPy, PIL for rendering.
Both a single-file version and a modular version are included.
Runs fully offline.
⸻
Would appreciate feedback on
• whether deterministic micro-models per domain are a sensible abstraction,
• the audit chain approach (block-per-tick vs grouped blocks),
• how others think about maintaining legibility as agent count scales,
• where replayable autonomy stacks tend to break under more complex sensor loads.
Open to critique. The goal is to make this a solid research tool for operator-centered, transparent autonomy — even in a purely synthetic environment.