IIRC, next thing on my TODO list was to add vectorization. Also (like OP) it uses log probabilities to avoid floating-point underflow.
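For anyone wondering why log probabilities matter here, a minimal sketch of the underflow problem. This is not the code from either project; the function names and the per-token probabilities are made up for illustration.

```python
import math

def naive_product(probs):
    """Multiply per-token probabilities directly (underflows on long messages)."""
    result = 1.0
    for p in probs:
        result *= p
    return result

def log_score(probs):
    """Sum log probabilities instead; the ordering of classes is preserved."""
    return sum(math.log(p) for p in probs)

# Hypothetical per-token likelihoods for a 200-token message under each class.
spam_probs = [1e-4] * 200
ham_probs = [2e-4] * 200

# Both direct products underflow to 0.0, so the classes cannot be compared.
print(naive_product(spam_probs), naive_product(ham_probs))

# The log scores stay finite, and ham correctly scores higher than spam.
print(log_score(spam_probs), log_score(ham_probs))
```

Since log is monotonic, comparing the summed logs picks the same class that comparing the true products would, if the products were representable.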
leetrout•2mo ago
Could you put a license on it so we know how it can be used?
throwaway81523•2mo ago
Bayesian spam classification is or soon will be destroyed by LLM spam. Yuck.
eikenberry•2mo ago
How so? Bayes learns to classify your email and unless you classify lots of LLM content as ham you should be fine. LLMs have speech patterns that make them pretty easy to sort out by humans, let alone statistics.
netdevphoenix•2mo ago
I remember during the 2010s, there was this wave of frenzied, almost cult-like talk of Bayes among the engineer types. Does anyone remember it? You had folks arguing Bayes was the ultimate stats tool, that it was superior to frequentist tools, etc. Some guy called Yudkowsky was a heavy supporter of it too.
sigwinch•2mo ago
In my experience, interest from sociology promoted it to those writing A/B tests. It’s not as if the difference between a degree in applied statistics and a degree in data science came down to frequentist versus Bayesian.
yearolinuxdsktp•2mo ago
Well, Bayesian stats can be a superior tool to frequentist stats for A/B testing applications, especially around digital traffic, where the sample builds up over time (which is different from calling up a fixed-size sample of people), and when you want to subdivide the samples or stop testing early. There are now some techniques for ending frequentist tests early in the presence of a strong signal, developed in response to the criticisms from the 2010s.
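To make that concrete, here is a rough sketch of one common Bayesian A/B setup, assuming a Beta-Binomial model with a uniform Beta(1, 1) prior. The function name, the prior, and the conversion counts are all assumptions for illustration, not anything a particular tool prescribes.

```python
import random

def prob_b_beats_a(conv_a, n_a, conv_b, n_b, draws=20000, seed=0):
    """Monte Carlo estimate of P(rate_B > rate_A) under Beta(1, 1) priors."""
    rng = random.Random(seed)  # seeded so the estimate is reproducible
    wins = 0
    for _ in range(draws):
        # Posterior for a conversion rate: Beta(1 + successes, 1 + failures).
        rate_a = rng.betavariate(1 + conv_a, 1 + n_a - conv_a)
        rate_b = rng.betavariate(1 + conv_b, 1 + n_b - conv_b)
        if rate_b > rate_a:
            wins += 1
    return wins / draws

# Hypothetical traffic so far: 100/1000 conversions for A, 150/1000 for B.
print(prob_b_beats_a(100, 1000, 150, 1000))
```

The posterior can be recomputed as traffic accumulates, which is the property that makes this style of analysis attractive when the sample grows over time. Whether a given threshold is a sound stopping rule is a separate design decision.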
esafak•2mo ago