bigdict•26m ago
What's the point of the relu in the loss function? Its inputs are nonnegative anyway.
Nevermark•7m ago
Could you just try to be more positive?
js8•2m ago
You can also imagine a similar thing with binary vectors. There, two vectors are "orthogonal" if they share no one-bits. So you can encode a huge number of concepts using only a small number of one-bits each in modestly sized vectors, and most pairs of them will be orthogonal.
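js8's sparse-bits point can be sketched numerically. The sizes below (1000-bit vectors with 3 one-bits each) are illustrative assumptions, not figures from the thread: they show both the huge number of encodable concepts and that two random sparse codes almost never collide.

```python
import itertools
import math
import random

dim = 1000  # vector length (illustrative assumption)
k = 3       # one-bits per concept (illustrative assumption)

# Number of distinct concepts encodable with exactly k one-bits:
num_concepts = math.comb(dim, k)  # C(1000, 3) = 166,167,000

# Represent each random concept as the set of its one-bit positions,
# then measure how often two concepts are "orthogonal" in js8's sense,
# i.e. share no one-bits (empty set intersection).
random.seed(0)
concepts = [frozenset(random.sample(range(dim), k)) for _ in range(200)]
pairs = list(itertools.combinations(concepts, 2))
orthogonal = sum(1 for a, b in pairs if not (a & b))
frac = orthogonal / len(pairs)

print(f"{num_concepts:,} encodable concepts")
print(f"{frac:.3f} of sampled pairs are orthogonal")
```

The analytic collision rate here is tiny (the chance two random 3-subsets of 1000 positions are disjoint is about 0.991), so nearly all sampled pairs come out orthogonal even though far more concepts fit than there are dimensions.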