It gives you the tools to measure changes in "behaviors that the users define". That makes it more of a hypothesis-testing framework for what the agent is doing than a way of predicting what the agent might do.
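To make the hypothesis-testing framing concrete, here is a rough sketch in plain Python. It is not the tool's actual API: `behavior_rate`, `permutation_test`, the run format (a list of tool-call names) and the sample data are all made up for illustration. It assumes a "behavior" is just a user-written predicate over a run, and uses a standard permutation test to ask whether the rate of that behavior changed between two batches of runs.

```python
import random

def behavior_rate(runs, behavior):
    """Fraction of runs in which the user-defined behavior occurs."""
    return sum(1 for run in runs if behavior(run)) / len(runs)

def permutation_test(runs_a, runs_b, behavior, n_resamples=10_000, seed=0):
    """Two-sided permutation test on the difference in behavior rates.

    Returns (observed_difference, p_value). Hypothetical helper, not part
    of any library.
    """
    rng = random.Random(seed)
    # Evaluate the behavior once per run; True counts as 1 when summed.
    flags = [bool(behavior(r)) for r in runs_a] + [bool(behavior(r)) for r in runs_b]
    n_a, n_b = len(runs_a), len(runs_b)
    observed = sum(flags[:n_a]) / n_a - sum(flags[n_a:]) / n_b
    extreme = 0
    for _ in range(n_resamples):
        rng.shuffle(flags)  # reassign runs to the two groups at random
        diff = sum(flags[:n_a]) / n_a - sum(flags[n_a:]) / n_b
        if abs(diff) >= abs(observed) - 1e-12:  # tolerance for float rounding
            extreme += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (extreme + 1) / (n_resamples + 1)

# Made-up data: each run is the sequence of tool calls the agent made.
runs_before = [["search", "answer"]] * 2 + [["answer"]] * 8
runs_after = [["search", "answer"]] * 9 + [["answer"]] * 1

# User-defined behavior: "the agent searched before answering".
def used_search(run):
    return "search" in run

diff, p_value = permutation_test(runs_before, runs_after, used_search)
print(f"rate before: {behavior_rate(runs_before, used_search):.2f}")
print(f"rate after:  {behavior_rate(runs_after, used_search):.2f}")
print(f"difference:  {diff:+.2f}, p-value: {p_value:.4f}")
```

A small p-value here only says the behavior you defined shifted between the two batches; it says nothing about what the agent will do next, which is the distinction being drawn above.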
The reasoning and derivations behind these tools are given here: https://technoyoda.github.io/agent-science.html
I'd be very happy to hear feedback and questions. (Please ignore the names given to the theory; they were just for shits and giggles.)