Looks interesting. It should start with a definition of the hyperbolic tangent. As written, the definition does not appear until about two-thirds of the way through, in a discussion of computing exp(x).
agalunar•34m ago
There’s an analysis of the Schraudolph approximation of the exponential function (along with an improvement upon it) that someone might find interesting at https://typ.dev/attention#affine-cast
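For anyone who hasn't seen the trick, here is a rough sketch of the basic Schraudolph idea in Python: scale x by 2^23/ln 2, add the float32 exponent bias, and reinterpret the resulting integer as a float. This is not taken from the linked analysis, and it does not include the improvement mentioned there. The names `schraudolph_exp`, `SCALE`, `BIAS`, and `SHIFT` are made up for illustration, and the shift constant is an assumption (the commonly quoted 60801 from the double-precision variant, scaled to 23 mantissa bits).

```python
import math
import struct

# Scale and bias for IEEE 754 single precision
# (23 mantissa bits, exponent bias 127).
SCALE = (1 << 23) / math.log(2)
BIAS = 127 << 23
# Assumption: shift constant scaled from the commonly quoted 60801 used in the
# double-precision variant; other values trade off different error measures.
SHIFT = 60801 * 8


def schraudolph_exp(x: float) -> float:
    """Approximate exp(x) by building the float32 bit pattern directly.

    Only meaningful for roughly -87 < x < 88, where the result is a normal
    float32. This sketch performs no range checks.
    """
    bits = int(SCALE * x + (BIAS - SHIFT))
    # Reinterpret the 32-bit integer as a little-endian float32.
    return struct.unpack("<f", struct.pack("<i", bits))[0]


if __name__ == "__main__":
    for x in (-5.0, -1.0, 0.0, 1.0, 5.0):
        approx = schraudolph_exp(x)
        print(f"{x:5.1f}  approx={approx:.6g}  exact={math.exp(x):.6g}")
```

With this choice of shift the relative error stays within a few percent across the valid range, which is the kind of accuracy the trick is usually quoted for. Pure Python is obviously not fast; the point is only to show the bit manipulation.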
raphlinus•19m ago
A different approach, refining the square-root-based sigmoid with a polynomial, is in my blog post "a few of my favorite sigmoids" [1]. I'm not sure which is faster without benchmarking, but I'm pretty sure its worst-case error is better than that of any of the fast approximations.
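As a point of reference, here is a small sketch of the unrefined starting point, assuming "square root based sigmoid" means x / sqrt(1 + x^2). The polynomial refinement from the blog post is not reproduced here, since its coefficients are not given in this thread. The name `sqrt_sigmoid` is hypothetical.

```python
import math


def sqrt_sigmoid(x: float) -> float:
    """Unrefined square-root sigmoid: odd, monotonic, bounded by -1 and 1.

    Assumption: this is the base function being refined; the polynomial
    correction described in the blog post is deliberately left out.
    """
    return x / math.sqrt(1.0 + x * x)


if __name__ == "__main__":
    # Compare against tanh on a coarse grid to see how far the
    # unrefined curve is from the target.
    for x in (0.0, 0.5, 1.0, 2.0, 4.0):
        print(f"{x:4.1f}  sqrt_sigmoid={sqrt_sigmoid(x):.4f}  tanh={math.tanh(x):.4f}")
```

On its own this curve approaches its asymptotes more slowly than tanh does, which is presumably why a refinement step is needed before it can compete on worst-case error.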
mjcohen•55m ago