The entities enabling scientific fraud at scale are large, resilient and growing

https://doi.org/10.1073/pnas.2420092122

91•peyton•2h ago

Comments

pixl97•1h ago

This is Goodhart's law at scale. Number of released papers/number of citations is a target. Correctness of those papers/citations is much more difficult so is not being used as a measure.

With that said, due to the apparent sizes of the fraud networks I'm not sure this will be easy to address. Having some kind of kill flag for individuals found to have committed fraud will be needed, but with nation state backing and the size of the groups this may quickly turn into a tit for tat where fraud accusations may not end up being an accurate signal.

May you live in interesting times.

armchairhacker•1h ago

There’s an accurate way to confirm fraud: look for inconsistencies and replicate experiments.

If the fraudsters “fail to replicate” legitimate experiments, ask them for details/proof, and replicate the experiment yourself while providing more details/proof. Either they’re running a different experiment, their details have inconsistencies, or they have unreasonable omissions.

wswope•49m ago

Yeah, but this happens all the time.

>>95% of the time, the fraudsters get off scot-free. Look at Dan Ariely: Caught red-handed faking data in Excel using the stupidest approach imaginable, and outed as a sex pest in the Epstein files. Duke is still giving him their full backing.

It’s easy to find fraud, but what’s the point if our institutions have rotten all the way through and don’t care, even when there’s a smoking gun?

pixl97•38m ago

Of course this is slightly messy too. Fraudsters are probably always incorrect, though of course they could have stolen the data. But being incorrect doesn't mean you're intentionally committing fraud.

john_strinlai•13m ago

that approach is accurate, but not scalable.

the effort to publish a fraudulent study is less (sometimes much less) than the effort to replicate a study.

bwfan123•14m ago

> This is Goodhart's law at scale.

Also, Brandolini's law. And Adam Smith's law of supply and demand. When the ability to produce overwhelms the ability to review or refute, it cheapens the product.

gjsman-1000•1h ago

The future of science, the Internet, and all things: The Library of Babel by Jorge Luis Borges.

Some things should not have been democratized. Silicon Valley assumes that removing restrictions on information brings freedom, but reality shows that was naïve.

honeycrispy•1h ago

You shouldn't just assume that the inverse would be free from fraud. The incentives for fraud still apply even when the system is not democratized.

gjsman-1000•1h ago

Except with AI, a fraudulent gatekept world would still be a smaller percentage of fraud than what is coming. Infinite scale fraud.

The soviets may have rigged a few studies; but the democratized world now faces almost all studies being rigged.

honeycrispy•1h ago

I think it'd be a different form of fraud that would be much harder to discredit. Think sugar industry blaming fat for health issues. More of that.

rdevilla•1h ago

Tearing down gatekeeping (i.e. "high standards") in pursuit of maximal inclusivity is just another way of saying "regression to the mean."

The gate has been removed from the signal chain, and now the noise floor is at infinity.

qsera•48m ago

There is a saying in my native language that goes something like "If you mix poison and milk, the milk will turn poisonous, instead of poison becoming milk (aka beneficial)".

I guess, to convert it into this context, we can say that if you mix the high minded and infantile (which I think is what Internet and social media did), the high minded becomes infantile, instead of the other way around.

leoc•46m ago

In what way was it democratised? We're not talking about Substacks and YouTube channels here, we're not even talking about arXiv preprints and the like, we're talking about peer-reviewed journal publications, and that system remains gated in much the same way that it was in the 1980s when it comes to trying to publish in it. If anything this system is the poster child for top-down gatekeeping by the recognised authorities, and it's precisely the value of that official recognition that makes people so desperate to break into it. The major changes seem to have been the easy availability of author publication lists and the advent of publication metrics, not things which have been or were ever meant to be particularly democratising for would-be authors; and an increase in the number of people playing the game, driven to a large extent by increasing participation from developing countries, and hopefully not many people would have the gall to argue for a ban on developing-country participation.

niam•37m ago

The Library of Babel comparison is too fatalistic imo, even granting that it's maybe just an extreme example. The real world doesn't quite resemble a closed system with no metadata. We can still establish chains of trust.

Whether or not people will build resilient chains is another story, contingent on whether the strength of that chain actually matters to people. It probably doesn't for a lot of people. Boo. But inasmuch as I care, I feel I ought to be free to try and derive a strong signal through the noise.

RobotToaster•1h ago

It kinda skips over how large mainstream journals, with their restrictive and often arbitrary standards, have contributed to this. Most will refuse to publish replications, negative studies, or anything they deem unimportant, even if the study was conducted correctly.

tppiotrowski•34m ago

Maybe we need a journal completely dedicated to replication studies? It would attract a lot of attention I think.

pfdietz•14m ago

And funding dedicated to replication studies.

MichaelDickens•8m ago

Economics has the Journal of Comments and Replications in Economics: https://jcr-econ.org/

CGMthrowaway•32m ago

So much of this started with the rise of the peer-review journal cartel, beginning with Pergamon Press in 1951 (coincidentally founded by Ghislaine Maxwell's father). "Peer review" didn't exist before then, science papers and discussion were published openly, and scientists focused on quality not quantity.

leoc•20m ago

Right, it seems that many of the weaknesses in the system exist because they serve the interests of journal publishers or of normal, legitimate-ish researchers, but in the process open the door to full-time system-hackers and pure fraudsters.

ramraj07•5m ago

Do you want issues of Nature and Cell to be replication studies? As a reader even from within the field, I'm not interested in browsing through negative studies. It'll be great if I can look them up when needed but I'm not looking forward to email ToC alerts filled with them.

Also who's funding you for replication work? Do you know the pressure you have in tenure track to have a consistent thesis on what you work on?

Literally every single knob that designs academia is tuned to not incentivize what you complain about. It's not just journals being picky.

Also the people committing fraud aren't ones who will say "gosh I will replicate things now!" Replicating work is far more difficult than a lot of original work.

temporallobe•1h ago

My wife completed her PhD two years ago and she put a LOT of work into it. Many sleepless nights, and it almost destroyed our marriage. It took her about 6 years of non-stop madness and she didn’t even work during that time. She said that many of her colleagues engaged in fraudulent data generation and sometimes just complete forgery of anything and everything. It was obvious some people were barely capable of putting together coherent sentences in posts, but somehow they generated a perfect dissertation in the end. It was common knowledge that candidates often hired writers and even experts like statisticians to do most of the heavy lifting. I don’t know if this is the norm now, but I simultaneously have more respect and less respect for those doctoral degrees, knowing that some poured their heart and soul into it, while others essentially cheated their way through. OTOH, I also understand that there may be a lot of grey area.

My eyes have been opened!

titzer•53m ago

I found the article and your third-hand anecdotes troubling. The good news is that it does not match any of the years of experience in my field. Fraud is just not that rampant. At PhD-granting institutions, the level of fraud you describe here is very seriously punished. It's career-ending. The violations that you describe are serious enough that any institution would expel said students (or harshly punish faculty--probably firing them). She did no one any favors by not reporting them.

Unfortunately I don't think a dialogue around vague anecdotes is going to be particularly enlightening. What matters is culture, but also process--mechanisms and checks--plus consequences. Consequences don't happen if everyone is hush-hush about it and no one wants to be a "rat".

qsera•41m ago

>It's career-ending..

That is where being good at politics comes into play. And if you are good at it, instead of being career-ending, fraud will put you in the highest of the positions!

No one wants a "plant" who cannot navigate scrutiny!

mistrial9•40m ago

yeah - skeptical here. Among certain departments, at large schools, under certain leaders.. The combination of "my marriage almost crumbled" for motivated reasoning, and "I have never seen any of this before" total inexperience with actual process.. the post shows itself to be biased and unreliable.

However, among certain departments, at large schools, under certain leaders.. yes, and growing

$0.02

fastaguy88•51m ago

It is useful to distinguish between "effective" scientific fraud, where some set of fraudulent papers are published that drive a discipline in an unproductive direction, and "administrative" scientific fraud, where individuals use pseudo-scientific measures (H-index, rankings, etc) to make allocation decisions (grants, tenure, etc). This article suggests that administrative scientific fraud has become more accessible, but it is very unclear whether this is having a major impact on science as it is practiced.

Non-scientists often seem to think that if a paper is published, it is likely to be true. Most practicing scientists are much more skeptical. When I read a paper that sounds interesting in a high impact journal, I am constantly trying to figure out whether I should believe it. If it goes against a vast amount of science (e.g. bacteria that use arsenic rather than phosphorus in their DNA), I don't believe it (and can think of lots of ways to show that it is wrong). In lower impact journals, papers make claims that are not very surprising, so if they are fraudulent in some way, I don't care.

Science has to be reproducible, but more importantly, it must be possible to build on a set of results to extend them. Some results are hard to reproduce because the methods are technically challenging. But if results cannot be extended, they have little effect. Science really is self-correcting, and correction happens faster for results that matter. Not all fraud has the same impact. Most fraud is unfortunate, and should be reduced, but has a short lived impact.

qsera•37m ago

>methods are technically challenging.

And financially too..

>Science really is self-correcting..

When economy allows it....

pfdietz•3m ago

One approach is more integration of researchers with businesses. Fraud (or simple incompetence) by researchers negatively affects businesses, as they expend effort on things that aren't real. I understand this is a constant problem in the pharmaceutical industry.
