Let me introduce "fault", a Rust CLI that helps engineers rapidly explore the reliability of their applications, as close to the code as possible. Hopefully with a bit of fun while doing it.
Its features at a glance:
- Native networking faults: latency, bandwidth, packet loss...
- Native support for altering DNS or LLM responses
- OpenAPI scenario generator and runner with basic load test support built in
- Simple command to inject itself into GCP, AWS or Kubernetes alongside a running application
- Optional eBPF support for seamless interception
- LLM-based review of results with a system prompt geared towards SRE analysis
- MCP server for AI agents
- Fully Open Source and written in Rust
The premise is simple: a TCP proxy that you can configure to act on inbound and outbound streams.
For instance, injecting 300ms latency:
$ fault run \
--proxy "9090=127.0.0.1:7070" \
--with-latency --latency-mean 300
Point your client at it and see how it behaves.
This is a small tool that I believe is handy to have under your belt. A bit of low-friction Chaos Engineering.
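For anyone curious what "a TCP proxy that adds latency" boils down to, here is a minimal Python sketch of the idea. It is not how fault is implemented (fault is written in Rust and does far more); the names `pipe` and `start_proxy` are made up for illustration, and the sketch assumes a fixed delay applied only to client-to-upstream traffic rather than a latency distribution.

```python
import asyncio


async def pipe(reader, writer, delay):
    """Copy bytes one way, sleeping `delay` seconds before forwarding each chunk."""
    try:
        while data := await reader.read(4096):
            await asyncio.sleep(delay)  # the injected "fault"
            writer.write(data)
            await writer.drain()
    finally:
        # Closing the destination lets the other direction wind down too.
        writer.close()


async def start_proxy(listen_port, upstream_host, upstream_port, delay):
    """Listen on 127.0.0.1:listen_port and forward to the upstream address.

    Hypothetical helper: only traffic from the client to the upstream is delayed.
    """

    async def handle(client_reader, client_writer):
        try:
            up_reader, up_writer = await asyncio.open_connection(
                upstream_host, upstream_port
            )
        except OSError:
            client_writer.close()
            return
        await asyncio.gather(
            pipe(client_reader, up_writer, delay),  # delayed direction
            pipe(up_reader, client_writer, 0),      # untouched direction
            return_exceptions=True,
        )

    return await asyncio.start_server(handle, "127.0.0.1", listen_port)


async def main():
    # Same shape as the command above: listen on 9090, forward to 7070, add 300 ms.
    server = await start_proxy(9090, "127.0.0.1", 7070, 0.3)
    async with server:
        await server.serve_forever()
```

Running `asyncio.run(main())` with something listening on port 7070 gives a rough feel for the behavior: every chunk the client sends reaches the upstream about 300 ms late.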
Would love to hear how you guys handle reliability so I know how to keep improving it.