The PDF got replaced for some reason (a bug, sensitive information in the metadata, or something like that), but the article seems to have stayed the same, except for the date.
[0]: https://arxiv.org/pdf/1201.2590v1.pdf
[1]: https://web.archive.org/web/0if_/https://arxiv.org/pdf/1201....
I need to look this up, but I recall that in the 90s a social psychology journal briefly had a policy of "if you show us you're handling your data ethically, you can just show us a self-explanatory plot for simple comparisons instead of NHST". That came after some early discussions about statistical reform in the 90s; Cohen's "The Earth is round (p < .05)", I think, kick-started things.
I wonder if the generality of the Bayesian approach is what's prevented its wide adoption. Having a prescribed algorithm you can just plug data into is mighty convenient! Frequentism lowered the barrier and let anyone run stats, but more isn't necessarily a good thing.
I fear you operate under the illusion that frequentist statistics are somehow limited to hypothesis testing. That is absolutely not the case.
Bayesian methods are more intuitive, and fit how most people reason when they reason probabilistically. Unfortunately, Bayesian computational methods are often less practical in non-trivial settings (they usually involve some form of MCMC).
I'm a Bayesian reasoner, but happily use frequentist computational methods (maximum likelihood estimation) because they're just more tractable.
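To make the tractability point concrete, here's a minimal sketch (the coin-flip counts and the uniform Beta(1, 1) prior are made-up assumptions for illustration): for a simple binomial model the maximum likelihood estimate is a one-liner, and a conjugate prior gives the Bayesian posterior in closed form; the moment the model stops being conjugate, you're usually reaching for MCMC instead.

    from scipy import stats

    # Made-up data: 100 coin flips, 62 heads
    n, heads = 100, 62

    # Frequentist: maximum likelihood estimate of the coin's bias
    # (closed form for the binomial model)
    mle = heads / n

    # Bayesian: a uniform Beta(1, 1) prior is conjugate to the binomial likelihood,
    # so the posterior is Beta(1 + heads, 1 + tails) -- no MCMC needed in this case
    posterior = stats.beta(1 + heads, 1 + (n - heads))
    cred_low, cred_high = posterior.ppf([0.025, 0.975])

    print(f"MLE: {mle:.3f}")
    print(f"95% credible interval: ({cred_low:.3f}, {cred_high:.3f})")

Once the model is hierarchical or the likelihood isn't conjugate, that closed form disappears and you're sampling, which is exactly where the practicality gap shows up.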
Frequentists view probability as a long-run frequency, while Bayesians view it as a degree of belief.
Frequentists treat parameters as fixed, while Bayesians treat them as random variables.
Frequentists don't use prior information, while Bayesians do.
Frequentists make inferences about parameters, while Bayesians make inferences about hypotheses.
---
If we state the full nature of our experiment, what we controlled and what we didn't... how can it be a "degree of belief"? Sure, it's impossible to be 100% objective, but it is easy to add enough background information to your paper so people can understand the context of your experiment and why you got your results: "we found that at our college this year, when you ask random students on the street this question, 40% say this, 30% say that...", and then consider how the campus sample might not fully represent the larger population you care about. What is different? You can say something confidently about the students you sampled, less so about the town as a whole, and less so about the state as a whole.
I don't know, I finished my science degree after 10 years and apparently have an even mix of these philosophies.
Would love to learn more if someone's inclined.
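For what it's worth, here's a small sketch of the campus-survey example (the sample size of 200 and the uniform prior are my own made-up assumptions): the two intervals for the 40% answer come out nearly identical numerically, but the frequentist one is a statement about the long-run behaviour of the procedure, while the Bayesian one is a degree of belief about the proportion itself.

    from scipy import stats

    # Hypothetical campus survey: 200 students asked, 80 (40%) give answer A
    n, k = 200, 80
    p_hat = k / n

    # Frequentist: 95% Wald confidence interval for the proportion
    se = (p_hat * (1 - p_hat) / n) ** 0.5
    ci = (p_hat - 1.96 * se, p_hat + 1.96 * se)

    # Bayesian: uniform Beta(1, 1) prior -> Beta posterior, 95% credible interval
    posterior = stats.beta(1 + k, 1 + (n - k))
    cred = posterior.ppf([0.025, 0.975])

    print(f"95% confidence interval: ({ci[0]:.3f}, {ci[1]:.3f})")
    print(f"95% credible interval:   ({cred[0]:.3f}, {cred[1]:.3f})")

Generalising from the students you sampled to the town or the state is then a question of how far you trust the sampling model and the prior, not of the arithmetic.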
jxjnskkzxxhx•5h ago
Because it looks more credible, obviously. In a sense it's cargo-cult science: people observe that this is the style of science, and so copy just the style; to a casual observer it appears to be science.
SoftTalker•4h ago
Yet it's what we train LLMs on.
verbify•3h ago
> We introduce phi-1, a new large language model for code, with significantly smaller size than competing models: phi-1 is a Transformer-based model with 1.3B parameters, trained for 4 days on 8 A100s, using a selection of "textbook quality" data from the web (6B tokens) and synthetically generated textbooks and exercises with GPT-3.5 (1B tokens). Despite this small scale, phi-1 attains pass@1 accuracy 50.6% on HumanEval and 55.5% on MBPP. It also displays surprising emergent properties compared to phi-1-base, our model before our finetuning stage on a dataset of coding exercises, and phi-1-small, a smaller model with 350M parameters trained with the same pipeline as phi-1 that still achieves 45% on HumanEval
We train on the internet because, for example, I speak a fairly niche English dialect influenced by Hebrew, Yiddish and Aramaic, and there are no digitised textbooks or dictionaries that cover it. I assume the base weights of models are still trained on high-quality materials.
birn559•4h ago
In addition, peer reviews are anonymous for both sides (as far as possible).
jxjnskkzxxhx•3h ago
> Why the gatekeeping. Only what is said matters, not who says it.
Tell me you have zero media literacy without telling me you have zero media literacy.
groceryheist•5h ago
1. Preprint servers create DOIs, making works easier to cite.
2. Preprint servers are archives, ensuring works remain accessible.
My blog won't outlive me for long. What happened to GeoCities could also happen to Medium.
mitthrowaway2•4h ago
If you're writing a paper about a longstanding math problem and the solution gets published on 4chan, you still need to cite it.
mousethatroared•2h ago
Both models are fallible, which is why discernment is so important.
jononor•1h ago
That something is unreviewed does not mean that it is bad or useless.
constantcrying•2h ago
It is weird how people use a platform exactly how it is supposed to be used.