- "entropy" is information; "information", therefore, already is surprise; thus, it's dangerous to re-define "surprise" as -log P(x), which is already part of the definition of suprise, as that leads to ambiguity and a circularity;
- KL divergence is relative entropy (added surprise by a second distribution, given a first, so _relative_ surprise);
- I would caution about terms like "expected surprise" for the same reason as I object to "dry water"...
jll29•1h ago
- "entropy" is information; "information", therefore, already is surprise; thus, it's dangerous to re-define "surprise" as -log P(x), which is already part of the definition of suprise, as that leads to ambiguity and a circularity;
- KL divergence is relative entropy (added surprise by a second distribution, given a first, so _relative_ surprise);
- I would caution about terms like "expected surprise" for the same reason as I object to "dry water"...
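A minimal sketch (my own illustration, not part of the comment) of how these quantities relate under the standard definitions: -log P(x) is the surprisal of a single outcome, entropy is its expectation under P, and KL divergence is the expected *added* surprisal from coding for Q when outcomes actually follow P:

```python
import math

def surprisal(p_x):
    # Information content / surprisal of one outcome with probability p_x, in bits.
    return -math.log2(p_x)

def entropy(P):
    # Entropy of P: the expectation of surprisal over P's outcomes.
    return sum(p * surprisal(p) for p in P if p > 0)

def kl_divergence(P, Q):
    # D(P || Q): average extra surprisal from assuming Q when P is the true distribution.
    return sum(p * (math.log2(p) - math.log2(q)) for p, q in zip(P, Q) if p > 0)

P = [0.5, 0.25, 0.25]   # hypothetical "true" distribution
Q = [1/3, 1/3, 1/3]     # hypothetical assumed distribution

print(entropy(P))           # 1.5 bits
print(kl_divergence(P, Q))  # ~0.085 bits of added ("relative") surprise
```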