Slowlog and COMMANDLOG persistence. Valkey's slowlog holds 128 entries by default. Once the buffer fills, the oldest entries are overwritten and gone for good. The agent captures every entry before it rotates out and persists it. When your p99 spikes at 3am, the evidence is still there at 9am. COMMANDLOG (Valkey 8.1+) also tracks large requests and replies - that 50MB MSET killing your network is now visible.
Client attribution. Not just "something was slow" but "app-server-03 sent 80% of the HGETALL user:* queries that caused the slowdown." Slow queries are tied back to the specific clients that sent them.
Pattern analysis. Instead of scrolling through raw slowlog output, you see aggregated patterns: "HGETALL user:* accounts for 80% of slow queries, avg 12ms, mostly from 2 clients."
Per-slot metrics for clusters. If you're running clustered Valkey, you get per-slot key counts and memory distribution for spotting hot slots.
99 Prometheus metrics. If you already have Grafana, just point it at our /prometheus/metrics endpoint and get Valkey-specific data that redis_exporter doesn't expose.
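To make the pattern analysis and client attribution ideas concrete, here is a minimal Python sketch of that kind of aggregation. It is not BetterDB's implementation: the entry shape (`command`, `duration_us`, `client`), the `key_pattern` rule of collapsing everything after the first colon, and the sample data are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

def key_pattern(key):
    """Collapse the variable part of a key, e.g. 'user:1234' -> 'user:*' (assumed rule)."""
    prefix, sep, _ = key.partition(":")
    return f"{prefix}:*" if sep else key

def aggregate(entries):
    """Group slowlog-style entries by (command, key pattern) and attribute them to clients."""
    groups = defaultdict(lambda: {"count": 0, "total_us": 0, "clients": Counter()})
    for entry in entries:
        parts = entry["command"].split()
        name = parts[0].upper()
        pattern = key_pattern(parts[1]) if len(parts) > 1 else ""
        group = groups[(name, pattern)]
        group["count"] += 1
        group["total_us"] += entry["duration_us"]
        group["clients"][entry["client"]] += 1

    total = sum(group["count"] for group in groups.values())
    report = []
    for (name, pattern), group in groups.items():
        top_client, top_count = group["clients"].most_common(1)[0]
        report.append({
            "pattern": f"{name} {pattern}".strip(),
            "share": group["count"] / total,              # fraction of all slow entries
            "avg_ms": group["total_us"] / group["count"] / 1000,
            "top_client": top_client,
            "top_client_share": top_count / group["count"],
        })
    # Most frequent pattern first.
    return sorted(report, key=lambda row: row["share"], reverse=True)

# Hypothetical captured entries; a real agent would read these from SLOWLOG GET.
SAMPLE = [
    {"command": "HGETALL user:1001", "duration_us": 10000, "client": "app-server-03"},
    {"command": "HGETALL user:1002", "duration_us": 12000, "client": "app-server-03"},
    {"command": "HGETALL user:1003", "duration_us": 14000, "client": "app-server-03"},
    {"command": "HGETALL user:1004", "duration_us": 12000, "client": "app-server-01"},
    {"command": "SET session:abc x", "duration_us": 11000, "client": "app-server-02"},
]

if __name__ == "__main__":
    for row in aggregate(SAMPLE):
        print(row)
```

The point of grouping by command plus key prefix is that a thousand distinct `user:<id>` keys collapse into one line you can act on, with the dominant client attached.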
Important caveat on managed services: The agent works everywhere, but the value varies. On self-managed instances and standard ElastiCache, you get the full picture - slowlog, client lists, ACL logs, config monitoring. On fully managed services like ElastiCache Serverless that restrict admin commands (SLOWLOG, CLIENT LIST, ACL LOG), the agent is limited to what INFO provides. We surface core metrics (memory, CPU, connections, ops/sec, replication) but the historical persistence features that make BetterDB unique won't be available. The dashboard shows you exactly which features are available vs restricted for your instance.
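If you want to check up front which admin commands your own instance allows, one way is to probe each command and record whether it errors. The sketch below is an assumption about how such a check could look, not how the agent does it: `probe_features` and the `execute` callable are hypothetical names, and in practice `execute` could be something like a redis-py or valkey-py client's `execute_command` method.

```python
# Commands the restricted-feature caveat mentions, each asked for at most one entry.
PROBES = {
    "slowlog": ("SLOWLOG", "GET", "1"),
    "client_list": ("CLIENT", "LIST"),
    "acl_log": ("ACL", "LOG", "1"),
}

def probe_features(execute):
    """Run each probe through `execute` and report which features respond.

    `execute` is any callable that sends a command (as separate arguments)
    and raises an exception when the server rejects it.
    """
    available = {}
    for feature, command in PROBES.items():
        try:
            execute(*command)
            available[feature] = True
        except Exception:
            # The server refused the command (restricted, renamed, or no permission).
            available[feature] = False
    return available

# Stand-in for a managed service that blocks CLIENT and ACL commands.
def fake_restricted_execute(*args):
    if args[0] in ("CLIENT", "ACL"):
        raise RuntimeError(f"unknown command '{args[0]}'")
    return []

if __name__ == "__main__":
    print(probe_features(fake_restricted_execute))
```

Passing the executor in as a callable keeps the probe independent of any particular client library and makes it easy to try against a fake, as shown.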
Self-hosted: https://github.com/BetterDB-inc/monitor
Cloud: https://betterdb.com
Docs: https://docs.betterdb.com
Everything is free during beta - no credit card. Happy to answer architecture questions, take feature requests, or talk about the Valkey ecosystem.