Scientists have informal trust networks that I'd like to see made explicit. For example, I'd like to see a social media network for scientists where they can PRIVATELY specify trust levels in each other and in specific papers, and subscribe to each other's trust networks, to get an aggregated private view of how their personal trusted community views specific labs and papers.
That sounds fascinating, but I'd have a darned high bar to participate, to make sure I wasn't inadvertently disclosing my very personal trust settings. Past experiences with intentional or unintentional data deanonymization (or just insufficient anonymization) make me very wary of such claims.
I don't think you want to slow down publication (and peer review and prestige journals are probably useless/obsolete in the era of the internet); it's already crazy slow.
So let's see: you want to incentivize two things: (1) no false claims in original research, and (2) having people try to reproduce claims.
So here's a humble proposal for a funding source (say... the govt): set aside a pot of money specifically for people to try to reproduce research; let this be a valid career path. The goal should be to get research validated by reproduction before OTHER research starts to build on those premises (avoiding having the whole field go off on wild goose chases like what happened w/ Alzheimer's). And then, when results DON'T repro, blackball the original researchers from funding. (With whatever sort of due process is needed to make this reasonable.)
I think it'd sort things out.
We have money to fund direct reproducibility studies (this one is an example), and indirect replication by applying orthogonal methods to similar research topics can be more powerful than direct replication.
Given the way that science and statistics work, completely honest researchers who do everything correctly and make no mistakes at all will still have some research that fails to reproduce. And on the flip side, for some proportion of completely correct work that got the right answer, the reproduction attempt will incorrectly fail to reproduce it. Type I and Type II errors are both real and occur without any need for misconduct or mistakes.
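To make that concrete, here's a minimal simulation sketch. The numbers (a real 0.5 SD effect, 30 samples per arm, alpha = 0.05) are assumptions picked for illustration, not anything from the thread:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
N, EFFECT, ALPHA, TRIALS = 30, 0.5, 0.05, 10_000

def study_significant(true_effect):
    """One honest two-arm study: t-test on treatment vs. control."""
    treatment = rng.normal(true_effect, 1.0, N)
    control = rng.normal(0.0, 1.0, N)
    return stats.ttest_ind(treatment, control).pvalue < ALPHA

# The effect is REAL in every trial; any failure below is pure chance.
single = sum(study_significant(EFFECT) for _ in range(TRIALS)) / TRIALS
both = sum(study_significant(EFFECT) and study_significant(EFFECT)
           for _ in range(TRIALS)) / TRIALS

print(f"honest study finds the real effect:    {single:.0%}")   # roughly half
print(f"original AND replication both find it: {both:.0%}")     # well under a third
```

At these (not unusual) sample sizes, roughly half of significant originals will see a perfectly faithful replication come up empty, with zero misconduct involved.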
Surely that just means that we shouldn't spend too much effort achieving small marginal progress towards that ideal, rather than that it's not the ideal? I am a scientist (well, a mathematician), and I can maintain my idealism about my discipline in the face of the idea that we can't and shouldn't try to catch and stop all fraud, but I can't maintain it in the face of the idea that we should aim for a small but positive amount of fraud.
You CANNOT create a system that has zero fraud without rejecting a HUGE amount of legitimate work/requests.
This is as true for credit card processing as it is for scientific publishing.
There's no such thing as "Reject 100% of fraud, accept 100% of non-fraud". It wouldn't be "ideal" to make our spaceships with anti-gravity drives; it would be "science fiction".
The relationship between how aggressively you prevent fraud and how much legitimate traffic you let through is absurdly non-linear, and super dependent on context. Is there still low-hanging fruit in the fraud-prevention pipeline for scientific publishing?
That depends. Scientists claim that having to treat each other as hostile entities would basically destroy scientific progress. I wholeheartedly agree.
This should be obvious to anyone who has approved a PR from a coworker. Part of our job in code review is to prevent someone from writing code to do hostile things. I'm sure most of us put some effort towards preventing obvious problems, but if you've ever seen https://en.wikipedia.org/wiki/International_Obfuscated_C_Cod... or some of the famous bits of code used to hack nation states then you should recognize that the amount of effort it would take to be VERY SURE that this PR doesn't introduce an attack is insane, and no company could afford it. Instead, we assume that job interviews, coworker vibes, and reputation are enough to dissuade that attack vector, and it works for almost everyone except the juiciest targets.
Science is a high trust industry. It also has "juicy targets" like "high temp superconductor" or "magic pill to cure cancer", but scientists approach everything with "extreme claims require extreme results" and that seems to do alright. They mostly treated LK-99 with "eh, let's not get hasty" even as most of the internet was convinced it was a new era of materials. I think scientists have a better handle on this than the rest of us.
> You CANNOT create a system that has zero fraud without rejecting a HUGE amount of legitimate work/requests.
I think that we are using different definitions of "ideal." It sounds like your definition is something like "practically achievable," or even just "can exist in the real world," in which case, sure, zero fraud is not ideal in that sense. To check whether I am using the word completely idiosyncratically, I just looked it up in Apple Dictionary, and most of the senses seem to match my conception, but I meant especially "2b. representing an abstract or hypothetical optimum." It seems very clear to me that you would agree with zero fraud being ideal in sense "2a. existing only in the imagination; desirable or perfect but not likely to become a reality," but possibly we can even agree that it also fits sense 2b above.
> With whatever sort of due process is needed to make this reasonable
Is it not reasonable to stop funding scientists whose results consistently do not reproduce? And should we not spend the funds to verify that they do (or don't) reproduce (rather than, e.g., going down an incredibly expensive wild-goose chase like what recently happened w/ Alzheimer's research)?
Currently there is more or less no reason not to fudge results; your chances of getting caught are slim, and consequences are minimal. And if you don't fudge your results, you'll be at a huge disadvantage when competing against everyone that does!
Hence the replication crises.
So clearly something must be done. If not penalizing failures to reproduce and funding reproduction efforts, then what?
Science is a field with low wages, uncertain careers, and relatively little status. If you respond strongly to incentives, why would you choose science in the first place? People tend to choose science for other reasons. And, as a result, incentives are not a particularly effective tool for managing scientists.
And without the proper systemic arrangements, people with strong internal values will just tend to get pushed out. E.g., an example from today's NY times: https://archive.is/wV4Sn
I don't mean to seem too cynical about human nature; it's not that people with good motivations don't exist, it's that you need to create a broader ecosystem where those motivations are adaptive. Otherwise they'll just get pushed out.
By analogy, consider a competitive sport, like bicycling. Imagine if it were just an honor system not to use performance-enhancing drugs; even if 99% of cyclists were completely honest, the sport would still be dominated by cheaters, because you simply wouldn't be able to compete without cheating.
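A quick sketch of that claim. The field size and the doping edge are assumptions for illustration, not sports data:

```python
import random

random.seed(1)
RACES, HONEST, BOOST = 10_000, 99, 2.0  # 99 honest riders, 1 doper; BOOST = assumed edge in SDs

doper_wins = 0
for _ in range(RACES):
    field_best = max(random.gauss(0, 1) for _ in range(HONEST))
    doper = random.gauss(BOOST, 1)  # same talent distribution, shifted by the drug
    if doper > field_best:
        doper_wins += 1

print("fair share of wins for 1 rider in 100: 1%")
print(f"share actually taken by the doper:     {doper_wins / RACES:.0%}")
```

In a run like this the single doper takes on the order of 30% of the wins, roughly thirty times their fair share: winner-take-all rewards amplify even rare cheating.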
The dynamics are similar in science if you allow for bad research to go unchallenged.
(PS: Being a scientist is very high-status! I can imagine very few things with as much cachet at a dinner-party as saying "I'm a scientist".)
Science selects actively against people who react strongly to incentives. The common and incentivized path is not doing science. Competitive sports are the opposite, as they appeal more to externally motivated people. From a scientist's point of view, the honest 99% of cyclists would absolutely dominate the race, as they ride 99% of the miles. Maybe they won't win, but winning is overrated anyway. Just like prestigious awards, vanity journals, and top universities are nice but ultimately not that important.
I don't think this is true at all! If it were true, we would not have the reproducibility crises and various other scandals that we do, in fact, have.
Scientists are humans like any other, and respond to incentives.
Funding is a game -- you have to play the game in a way that wins to keep getting funding, so necessarily idealists that don't care about the rules of the game will be washed out and not get funding. It's in our collective interest, then, to make sure that winning the game equates to doing good science!
The reproducibility crisis seems to be mostly about applying the scientific method naively. You study a black box nobody really understands. You formulate a hypothesis, design and perform an experiment, collect data, and analyze the data under a simple statistical model. Often that's the best thing you can do, but you don't get reliable results that way. If you need reliability, you have to build models that explain and predict the behavior of the former black box. You need experiments that build on a large number of earlier experiments and are likely to fail in obvious ways if the foundations are not fundamentally correct.
I'm pretty bad at getting grants myself, but I've known some people who are really good at it. And they are not "playing the game", or at least that's not the important part. What sets them apart is the ability to see the big picture, the attention to details, the willingness to approach the topic from whatever angle necessary, and vision of where the field should be going. They are good at identifying the problems that need to be solved and the approaches that will likely solve them. And then finding the right people to solve them.
Just open up a comment section for institutional affiliates.
There is a huge amount of pressure to publish, publish, publish.
So many researchers prefer to write very simple things that are probably true, or applicative work (which is kind of useful), or to publish false/fake results.
(And may be add more points if in order to reproduce you didn't have to ask plenty of questions to the original team, ie the original paper didn't omit essential information)
Because a great many who comment on this site are infantile but self-congratulating idiots who just can't help themselves on downvoting anything that doesn't fit their pet dislikes. That button should be removed or at least made not to grey-out text.
But at the same time, I doubt that fields like physics and chemistry had better practices in, say, the 19th century. It would be interesting to conduct a reproducibility project on the empirical studies supporting electromagnetism or thermodynamics. There were probably a lot of crap papers!
Those fields had a backup, which was that studies and theories were interconnected, so that they tended to cross-validate one another. This also meant that individual studies were hot-pluggable. One of them could fail replication and the whole edifice wouldn't suddenly collapse.
My graduate thesis project was never replicated. For one thing, the equipment that I used had been discontinued before I finished, and cost about a million bucks in today's dollars. On the other hand, two labs built similar experiments that were considerably better, made my results obsolete, and enabled further progress. That was a much better use of resources.
I think fixing replication will have to involve fixing more than replication, but thinking about how science progresses as a whole.
Then you perform the experiment exactly* how you said you would based on the pre-registration, and you get to publish your results whether they are positive or negative.
* Changes are allowed, but must be explicitly called out and a valid reason given.
And really, if you want to be dishonest, it's easier to manipulate the raw data than it is to secretly perform the experiments ahead of time.
I take it you don't do research. 'Cause boring is nothing compared to wasting months of time and money only to get a negative result that nobody will publish.
I have a broad and open-ended focus. I work as usual on the things I find interesting; then sometimes I see a thing that looks interesting and decide to investigate. Sometimes my initial tests give good results, but more often than not they don't; still, they give me an idea to do something completely different, and some iterations later I have a result.
I imagine that depends on the field of research. IT is cheap, but I imagine a physicist who wants to do an experiment must secure funding first, because otherwise it's impossible to do anything. And that requires one to commit to a single topic of research.
That part is true in all fields. And one of the things that pre-registration enables is the publishing of those negative results.
Otherwise, once you've done the research and gotten the negative result, nobody wants to publish it (unless it's very flashy). Without being able to publish negative results, and therefore read about them, each researcher must conduct an experiment already known, if only in private, not to work.
IMO, the best way forward would be simply doubling every study with independent researchers (ideally they shouldn't have contact with each other beyond the protocol). That certainly doubles the costs, but it's really just about the only way to catch bad actors early.
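Some back-of-the-envelope arithmetic on why duplication catches fabrication early. All numbers here are assumed for illustration:

```python
# A fabricated positive only "replicates" as often as chance allows at the
# significance threshold, while a real effect studied with decent power
# usually survives its independent duplicate.
ALPHA = 0.05   # chance a null effect looks significant anyway
POWER = 0.80   # chance a well-powered duplicate confirms a real effect

p_fabrication_flagged = 1 - ALPHA      # duplicate contradicts the fake
p_honest_wrongly_flagged = 1 - POWER   # real finding fails its duplicate by chance

print(f"fabricated result contradicted by its duplicate: {p_fabrication_flagged:.0%}")
print(f"honest real result contradicted by chance:       {p_honest_wrongly_flagged:.0%}")
```

So a discordant duplicate is strong evidence of a problem, but not proof; it flags authors for scrutiny rather than convicting them, which is why the doubled cost buys early detection rather than certainty.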
True, although, as you doubtless know, as with most things that cost money, the alternative also costs money (for example, in funding experiments chasing after worthless science). It's just that we tend to set aside the costs that we have already priced in. So I tend to think in such settings that a useful approach might be to see how we can make such costs more visible, to increase the will to address them.
The flaw being that cost is everything. And, in particular, the initial cost matters a lot more than the true cost. This is why people don't install solar panels or energy efficient appliances.
When it comes to scientific research, proposing you do a higher cost study to avoid false results/data manipulation will be seen as a bug. Bad data/results that make a flashy journal paper (room temp superconductivity, for example) bring in more eyeballs and prestige to the institute vs a well-done study which shows negative results.
It's the same reason public/private cooperation is often a broken model for government spending. A government agency will happily pick a road builder that puts out the lowest bid and will later eat the cost when that builder ultimately needs more money because the initial bid was a fantasy.
Making costs more visible is a good goal, I just don't know how you accomplish that when surfacing those costs will be seen as a negative for anyone in charge of the budget.
> for example, in funding experiments chasing after worthless science
This is tricky. It's basically impossible to know when an experiment will be worthless. Further, a large portion of experiments will be worthless (like 90% of them).
An example of this is superglue. It was originally supposed to be a replacement glass for jet fighters. While running refractometry experiments on it and other compounds, the glue destroyed the machine. Funnily, it was known to be highly adhesive even before the experiment, but putting the "maybe we can sell this as a glue" thought to it didn't happen until after the machine was destroyed.
A failed experiment that led to a useful product.
How does someone budget for that? How would you start to surface that sort of cost?
That's where I think the current US grant system isn't a terrible way to do things, provided more guidelines are put in place to enforce reproducibility.
> This is tricky. It's basically impossible to know when an experiment will be worthless. Further, a large portion of experiments will be worthless (like 90% of them).
I don't mean "worthless science" in the sense "doesn't lead to a desired or exciting outcome." Such science can still be very worthwhile. I mean "worthless science" in the sense of "based on fraudulent methods." This might accidentally arrive at the right answer, but the answer, whether wrong or accidentally right, has no scientific value.
One issue is that internal science within a company/lab can move incredibly fast -- assays, protocols, datasets and algorithms change often. People tend to lose track of what data, what parameters, and what code they used to arrive at a particular figure or conclusion. Inevitably, some of those end up being published.
Journals requiring data and code for publication helps, but it's usually just one step at the end of a LONG research process. And as far as I'm aware, no one actually verifies that the code you submitted produces the figures in your paper.
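A hypothetical sketch of what that missing verification step could look like: re-run the authors' analysis and compare the regenerated artifacts against checksums submitted with the manuscript. Every name here (the manifest file, `make_figures.py`) is invented for illustration; this is not any journal's or GoFigr's actual API:

```python
import hashlib
import json
import subprocess
from pathlib import Path

def sha256(path: Path) -> str:
    return hashlib.sha256(path.read_bytes()).hexdigest()

def verify_submission(repo: Path) -> bool:
    # Hypothetical manifest mapping figure filenames to submitted hashes.
    manifest = json.loads((repo / "figures.manifest.json").read_text())
    # Re-run the pipeline exactly as submitted. Assumes a pinned
    # environment (lockfile or container), or the hashes won't match.
    subprocess.run(["python", "make_figures.py"], cwd=repo, check=True)
    ok = True
    for name, expected in manifest.items():
        if sha256(repo / name) != expected:
            print(f"MISMATCH: {name}")
            ok = False
    return ok
```

In practice byte-identical figures are rare (fonts, timestamps, nondeterminism), so a real system would compare the underlying data tables rather than rendered images; but even this level of checking is more than most venues do today.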
It's a big reason why we started https://GoFigr.io. I think making reproducibility both real-time and automatic is key to make this situation better.
You committed the same sin you are attempting to condemn, while sophomorically claiming it is obvious this sin deserves an intellectual death penalty.
It made me smile. :) Being human is hard!
Now I'm curious: will you acknowledge the elephant in this room? It's hard to, I know, but I have a strong feeling you have a commitment to honesty, even if it's hard to enact all the time. (I.e., being a human is hard :) )
Often famous/more-cited studies are not replicable. But if you want to work on a similar research problem and publish null/non-exciting results, you're in for a fight. Journals want new, fun, exciting results, but unfortunately the world doesn't work that way.
And then, once you got your PhD, only then you would be expected to publish new, original research.
I would like to think that the truly important papers receive some sort of additional validation before people start to build lives and livelihoods on them, but I’ve also seen some pretty awful citation chains where an initial weak result gets overegged by downstream papers which drop mention of its limitations.
I say "a form of Alzheimer's" because it is likely we are labelling a few different diseases as Alzheimer's.
Even Einstein tried to find flaws in his own theories. This is how science should actually work.
We need to actively try to falsify theories and beliefs. Only if we fail to falsify them should the theories be considered valid.
It would be worse if the experiments were not even falsifiable, yes.
But it's pretty damn bad when the conclusion of the original study can never be confirmed on the rare occasions someone does try.
I am saying we should be happy that the scientific method is working.
In your example, it’s the same as someone publishing a paper that disproves Relativity - only for us to find that the author fabricated the data.
They are continually prescribed because their actual mechanism doesn't matter: they demonstrably work. That is a matter of statistics, not science.
Anti-science types always point to the same EXTREMELY FEW examples of how science "fails", like Galileo (which had nothing to do with science) and ulcers.
They never seem to point to the much more common examples where people became convinced of something scientifically untrue for decades despite plenty of evidence otherwise. The British recognized a link between citrus and scurvy well before they were even called "Limeys"! They then screwed themselves over by changing some variables (cooking lime juice) and instead turned to a quack (a "respected doctor" from a time when most people recognized doctors were worse than the sickness they treated) who insisted on alternative treatment. For about a hundred years, British sailors suffered and died due to one quack's ego.
Phrenology was always, from day one, unscientific. You STILL find morons pushing its claims, using it to justify their godawful, hateful, and murderous world views.
Ivermectin is a great example, since you can create a "study" in Africa to show Ivermectin cures anything you want, because it is a parasite killer and most people in impoverished areas suffer from parasites, so will improve if they take it. It's entirely unrelated to the illness you claim to treat, but nobody on Facebook will ever understand that, because they tuned out science education decades ago.
How many people have died from alternative medicine quacks pushing outright disproven pseudoscience on people who have been told not to trust scientists by people pushing an agenda?
How much money is made selling sugar pills to idiots who have been told to distrust science? Not just "be skeptical of any paper," but told outright that scientists are in a conspiracy to lie to you!
That's not how it works. Science is hard, experiment design is hard, and a failure to reproduce could mean a bunch of different things. It could mean the original research failed to mention something critical, or you had a fluke, or you didn't understand the process right, or something about YOUR setup is unknowingly different. Or the process itself is somewhat stochastic.
This goes 10X for such difficult sciences as psychology (which is literally still in its infancy) and biology. In these fields, designing a proper experiment (controlling as much as you can) is basically impossible, so we have to tease signal out of noise, and that is failure-prone.
Hell, go watch YouTube chemists with PhDs fail to reproduce old papers. Were those papers fraudulent? No, science is just difficult and failure-prone.
If you treat "Paper published in Nature/Science" as a source of truth, you will regularly be wrong. Scientists do not do that. Nature is a magazine, and is a business, and sees themselves as trying to push the cutting edge of research, and they will happily publish an outright fraudulent paper if there is even the slightest chance it might be valid, and especially if it would be really cool if it's right.
When discussing how Jan Hendrik Schön got tens of outright fraudulent papers into Nature despite nobody being able to even confirm he ran any experiments, they said that "even false papers can push the field forward". One of the scientists who investigated and helped get Schön fired even said that peer review is no indicator of quality or correctness. Peer review wasn't even a formal part of science publishing until the '60s.
Science is "self correcting" because if the "effect" you saw isn't real, nobody will be able to build off your work. Alzheimer's Amyloid research has been really unproductive, which is how we knew it probably wasn't the magic bullet even before it had fraud scandals.
If you doubt this, look to China. They have ENORMOUS amounts of explicit fraud in their system, as well as a MUCH WORSE "publish or perish" state. Would you suggest it has slowed them down?
Stop trying to outsource your critical thinking to an authority. You cannot do science without publishing wrong or false papers. If you are reading about "science" in a news article, press release, or advertisement, you don't know science. I am continually flabbergasted by how often "Computer Scientists" don't even know the basics of the scientific method.
Scientists understood there was a strong link between cigarettes and cancer at least 20 years before we had comprehensive scientific studies to "prove" it.
That said, there are good things to do to mitigate the harms that "publish or perish" causes, like preregistration and an incentive to publish failed experiments, even though science progressed pretty well for 400 years without them. These reproducibility projects are great, but do not mistake their "these papers failed" as "these papers were written fraudulently, or by bad scientists, or were a waste".
Good programmers WILL ship bugs sometimes. Good scientists WILL publish papers that don't pan out. These are truths of human processes and imperfect systems.
Agreed. Lab technique is a thing. There is a reason for the dark joke that in Physics, theorists are washed up by age 30, but experimentalists aren't even competent until age 40.
For psychology replace "Difficult" with "Pseudo".
To lose that tag, psychology has to take a step back, do basic research, replicate that research multiple times, think about how to do replicable new research, and only then start actually letting psychologists do new research to advance science.
Instead of that, unreplicated pseudo-scientific nonsense psychology papers are being used to tell governments how to force us to live our lives.
In addition to his Substack, his Twitter is great and very accessible.
To my mind there is a nasty pressure that exists for some professions/careers, where publishing becomes essential. Because it’s essential, standards are relaxed and barriers lowered, leading to the lower quality work being published. Publishing isn’t done in response to genuine discovery or innovation, it’s done because boxes need to be checked. Publishers won’t change because they benefit from this system, authors won’t change because they’re bound to the system.
I want to note there is hope. Contrary to what the root comment says, some publishers do try to endorse reproducible results. See for example the ACM reproducibility initiative [1]. I have participated in this before and believe it is a really good initiative. Reproducing results can be very labor intensive, though, adding load to a review system already struggling under a massive flood of papers. And it is also not perfect: most of the time it is only ensured that the author-supplied code produces the presented results, but I still think more such initiatives are healthy. When you really want to ensure the rigor of a presented method, you have to replicate it, i.e., using a different programming language or so, which is really its own research endeavor. And there is also a place to publish such results in CS already [2]! (although I haven't tried this one). I imagine this may be especially interesting for PhD students just starting out in a new field, as it gives them the opportunity to learn while satisfying the expectation of producing papers.
[1] https://www.acm.org/publications/policies/artifact-review-an... [2] https://rescience.github.io
Even better is when the paper says code will be released after publication, but they cannot be bothered to post it anywhere.
I think I heard this idea from Freakonomics, but a fix is to propose research to a journal before conducting it and being committed to publication regardless of outcome.
Imagine the guy who got a FAANG job and made it nine weeks in before washing out, informing you how the entire industry doesn’t know how to write code. Maybe they’re right and the industry doesn’t know how to write code! But I want to hear it from the person who actually made a career, not the intern who made it through part of a summer.
Not to mention that we know a lot of overhyped results did fail replication, and then powerful figures in academia did their best to pretend their thrones were not perched on sandcastles.
Their findings are often irrelevant to industry at best and contradictory at worst.
Of course I'm talking almost solely about SE.
It's a lack of industry experience. I complained about this in a recent comment here: https://news.ycombinator.com/item?id=43769856
Basically, in SE anyway, the largest number of publications are authored by new graduates.
Think about how clueless the new MSc or PhD graduate is when they join your team: these are the majority of authors.
The system is set up to incentivise the wrong thing.
The most common crime they commit is fraud, the second most common is sexual harassment, and the third would be plagiarism, although that one might not necessarily be punishable depending on the jurisdiction.
(IMO. I can't provide data on that, and I'm not willing to prosecute them personally; if that breaks the deal for you, that's OK with me.)
I know academia like the back of my hand and have been everywhere around the world; it's the same thing all over. I can speak loudly about it because I'm Catholic and have money, so those lowlifes can't touch me :D.
Every single time this topic comes up, there's a lot of resistance from "the public" who is willing to go to great lengths to defend "the academics" even though they know absolutely nothing about academic life and their only grasp of it was created through TV and movies.
Anyone who has been involved in academia for more than about two years can tell you the exact same thing. That doesn't mean they're also rotten; I'm just saying they've seen all these things taking place around them.
We should really move the Overton window around this topic so that scientists are held to the same public scrutiny as everybody else, like public officials, because nine times out of ten they are funded by public money. They should be held accountable; there should be jail for the offenders.
Reproducing ML robotics papers requires the exact robot/environment/objects/etc. -> people fudge their numbers and write strawman implementations of benchmarks.
LLMs are so expensive to train + the datasets are non-public -> Meta trained on the test set for Llama4 (and we wouldn't have known if not for some forum leak).
In some way it's no different than startups or salesmen overpromising - it's just lying for personal gain. The truth usually wins in the end though.
A lot of things, in fact, do work. Hence, modern science producing so much despite this reproducibility crisis being even worse in decades past.
For the classical central limit theorem to hold, the random variables must be independently and identically distributed (i.i.d.). How do we know our samples are i.i.d.? We can only show when they are not.
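A quick sketch of why that matters (pure illustration, with an assumed AR(1) correlation of 0.9): draws can be identically distributed but not independent, and then the usual sigma/sqrt(n) error bar is far too optimistic:

```python
import numpy as np

rng = np.random.default_rng(0)
N, TRIALS, RHO = 1_000, 1_000, 0.9

def std_of_sample_mean(correlated: bool) -> float:
    means = []
    for _ in range(TRIALS):
        if correlated:
            # AR(1) draws: identically distributed (unit variance) but NOT independent.
            x = np.empty(N)
            x[0] = rng.normal()
            for t in range(1, N):
                x[t] = RHO * x[t - 1] + rng.normal() * np.sqrt(1 - RHO**2)
        else:
            x = rng.normal(size=N)  # genuinely i.i.d.
        means.append(x.mean())
    return float(np.std(means))

print(f"CLT error bar, sigma/sqrt(n): {1 / np.sqrt(N):.4f}")
print(f"observed, i.i.d. samples:     {std_of_sample_mean(False):.4f}")
print(f"observed, correlated samples: {std_of_sample_mean(True):.4f}")  # ~4x larger
```

Every test statistic built on the naive sigma/sqrt(n) is wildly overconfident on data like this, and nothing in the marginal distribution of the samples warns you.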
Add to that https://en.m.wikipedia.org/wiki/Why_Most_Published_Research_...
We've got to do better or science will stagnate
> The teams were able to replicate the results of less than half of the tested experiments. That rate is in keeping with that found by other large-scale attempts to reproduce scientific findings. But the latest work is unique in focusing on papers that use specific methods and in examining the research output of a specific country, according to the research teams.