2% of ICML papers desk rejected because the authors used LLM in their reviews

https://blog.icml.cc/2026/03/18/on-violations-of-llm-review-policies/

77•sergdigon•1h ago

Comments

michaelbuckbee•1h ago

Worth reading for the discussion of the LLM watermark technique alone.

mijoharas•1h ago

One thing to note.

They were quite conservative in their approach, so the only things that were rejected were from people who had agreed not to use an LLM and almost definitely did use an LLM (since they fed hidden watermarked instructions to the llm's).

This means the true number of people that used LLM's in their review (even in group A that had agreed not to) is likely higher.

Also worth noting, 10% of these authors used them in more than half of their reviews.

grey-area•51m ago

Yes for those in group B I'd suspect many were doing exactly what these cheaters in group A were doing - submitting the unaltered output of an LLM as their review.

hodgehog11•1h ago

I'm amazed that such a simple method of detection worked so flawlessly for so many people. This would not work for those who merely used LLMs to help pinpoint strengths and weaknesses in the paper; there are separate techniques to judge that. Instead, it only detects those who quite literally copied and pasted the LLM output as a review.

It's incredible how so many people thought it was fair that their paper should be assessed by human reviewers alone, and yet would not extend the same courtesy to others.

everdrive•1h ago

Generally speaking people have worse impulse control than they believe they do. Once you give a tool that does most of the work for you, very very few people will actually be able to use that tool in truly enriching ways. The majority of people (even the smart ones) will weaken over time and take shortcuts.

hodgehog11•1h ago

That's an excellent point. It seems likely they thought they could operate as a proper reviewer, but when the deadline came, they took the shortcut they knew they were not supposed to take.

It really does sound like an addiction when you put it this way.

jacquesm•55m ago

I have a very simple solution to this but it is a bit expensive. I run two laptops, one that I talk to an LLM on and another where I do all my work and which is my main machine. The LLM is strictly there in a consulting role, I've done some coding experiments as well (see previous comments) but nothing that stood out to me as a major improvement.

The trick is: I can't cut-and-paste between the two machines. So there is never even a temptation to do so and I can guarantee that my writing or other professional output will never be polluted. Because like you I'm well aware of that poor impulse control factor and I figured the only way to really solve this is to make sure it can not happen.

jjgreen•40m ago

You could ssh in to the "dirty" machine ... just sayin'

jacquesm•36m ago

Yes, I could. But I've purposefully made linking the two quite hard.

manbash•27m ago

This somewhat of the equivalent of "quitting cold turkey", in the sense that you remove the temptation from your reach.

The problem is that it's just much easier to un-quit and run the LLM in the same laptop you work on.

It's just so very tempting.

jacquesm•14m ago

I think that's the only way to deal with such temptations. Kidding yourself that you are strong enough to do it 'just once' or that you can handle the temptation is foolish and will only lead to predictable outcomes. I have a similar policy to smoking, drugs, alcohol and so on, I just don't want the temptation. It helps to have seen lots of people who thought they were smart enough eventually go under (but the price is pretty high).

Oh, and LLMs are of course geared to pull you in further, they are on a continuous upsell salespitch. Drug pushers could learn a thing or two from them.

retsibsi•39m ago

I think you're framing this behaviour too generously. Laziness is one thing, lack of integrity is another, and this seems to be a straightforward case of cheating and lying.

bonoboTP•1h ago

I'm not surprised at all. The ML research community isn't a community any more, it's turned into a dog-eat-dog low-trust fierce competition. So much more people, papers, churn, that everyone is just fending for themselves. Any moment that you charitably spend on community service can be felt as a moment you take away from the next project, jeopardizing the next paper, getting scooped, delaying your graduation, your contract, your funding, your visa, your residence permit, your industry plans etc. It's a machine. I don't think people outside the phd system really understand the incentives involved. People are offered very little slack in this system. It's sink or swim, with very little instruction or scientific culture or integrity getting passed on. The PhD students see their supervisors cut corners all the time too, authorship bullshit jockeying even in big name labs etc. People I talked to are quite disillusioned, expect their work to have little impact and get superseded by a new better model in a few months so it's all about who can grind faster, who can twist the benchmarks into showing a minimal improvement etc. And the starry eyed novices get slapped by reality into thinking this way fairly early.

To be clear this is not an excuse but an explanation why I am not surprised.

jacquesm•58m ago

This is 'spam' all over again. Before spam every email was valuable and required some attention. It was a better version of paper mail in that it was faster and cheaper. But then the spam thing happened and suddenly being 'faster and cheaper' was no longer an advantage, it was a massive drawback. But by then there was no way back. I think LLMs will do the same with text in general. By making the production of text faster and cheaper the value of all text will diminish, quite probably to something very close to the energy value of the bits that carry the data.

bonoboTP•1h ago

To be clear, as the article says, these authors were offered a choice and agreed to be on the "no LLMs allowed" policy.

And detection was not done with some snake oil "AI detector" but by invisible prompt injection in the paper pdf, instructing LLMs to put TWO long phrases into the review. They then detected LLM use through checking if both phrases appear in the review.

This did not detect grammar checks and touchups of an independently written review. The phrases would only get included if the reviewer fed the pdf to the LLM in clear violation to their chosen policy.

> After a selection process, in which reviewers got to choose which policy they would like to operate under, they were assigned to either Policy A or Policy B. In the end, based on author demands and reviewer signups, the only reviewers who were assigned to Policy A (no LLMs) were those who explicitly selected “Policy A” or “I am okay with either [Policy] A or B.” To be clear, no reviewer who strongly preferred Policy B was assigned to Policy A.

mikkupikku•1h ago

In that case, I hope these frauds have been banned for life.

hodgehog11•1h ago

I was thinking this too, but I don't believe this is the case, and I feel like it would not be a good idea either.

Most of these people are likely students; this should be a learning moment, but I don't think it is yet grounds for their entire academic career to be crippled by being unable to publish in a top-tier ML venue.

mikkupikku•1h ago

If this is tolerated, it sends exactly the wrong kind of message. The students, if they are, should be banned for life. Let them serve as an example for myriads of future students, this will be a better outcome in the long run.

This didn't trip for people who were merely bouncing ideas off a LLM, they caught people who copy and pasted straight from their LLM.

linkregister•58m ago

It's not a fully consensus view, but a majority of sociologists agree that high severity deterrence has limited effectiveness against crime. Instead, certainty of enforcement is the most salient factor.

jona-f•49m ago

But the mob wants their kick.

crimsoneer•48m ago

Yup, precisely this. Doing something bad is rarely a rational commitment and cost of benefits. Likelihood and celerity of getting caught seem to be the driving factors.

_flux•43m ago

But this method is now spent, as if someone is determined on keep using LLM, this should be pretty easy to overcome.

I suppose though new methods could be devised, but it's not "certainty" that they will catch them.

jacquesm•39m ago

That's not true. People still pick up USB sticks from the street, people still fall for scam phone calls and people still click on links in mail.

Just because a method was successful once does not mean it was 'burned', none of these people will be checking each and every future pdf or passing it through a cleaner before they will do the same thing all over again and others are going to be 'virgin' and won't even be warned because this is not going to be widely distributed in spite of us discussing it here.

If anything you can take this as proof that this method is more or less guaranteed to work.

harmf•58m ago

agree

CoastalCoder•48m ago

FYI we tend to use up votes rather than "I agree" comments, partly because it keeps the overall signal-to-noise ratio for comments higher.

CoastalCoder•50m ago

This line of reasoning interests me because it seems to arise in other contexts as well.

Do very harsh punishments significantly reduce future occurrences of the offense in question?

I've heard opponents of the death penalty argue that it's generallynot the case. E.g., because often the criminals aren't reasoning in terms that factor in the death penalty.

On the other hand (and perhaps I'm misinformed), I've heard that some countries with death penalties for drug dealers have genuinely fewer problems with drug addiction. Lower, I assume, than the numbers you'd get from simply executing every user.

So I'm curious where the truth lies.

armchairhacker•46m ago

Is the death penalty scarier than life in prison?

CoastalCoder•29m ago

I assume that depends on the individual.

But FWIW, my point was about very harsh punishments in general, not specifically the death penalty.

Tade0•42m ago

My understanding is that something among those lines happened:

> All Policy A (no LLMs) reviews that were detected to be LLM generated were removed from the system. If more than half of the reviews submitted by a Policy A reviewer were detected to be LLM generated, then all of their reviews were deleted, and the reviewer themselves was removed from the reviewer pool.

Half is a bit lenient in my view, but I suppose they wanted to avoid even a single false positive.

wiseowise•25m ago

Why not put them on a chain and let village stone them? Or better yet shoot them on the spot! That would send a message for sure.

withinboredom•23m ago

> The students, if they are, should be banned for life.

I'm all for repurcussions ... but a life is a long time and students are usually only at the beginning of it.

laughingcurve•6m ago

Thank goodness we have you passing judgment on the internet; otherwise who else would be around for us to do it? I'm glad you're willing to destroy someone for a mistake rather than letting them learn and change. We all know that arbitrary and harsh punishments solve everything.

noduerme•5m ago

2% would be on the very low end of the number of people who lie, get caught, and become repeat offenders anyway.

nurettin•36m ago

What terrible deeds have you done to outburst so harshly?

quinndupont•30m ago

It’s an unethical, false choice. The reviewers are not perfectly rational agents that do free work, they have real needs and desires. Shame on ICML for exploiting their desperation.

jojomodding•24m ago

Is it? The reviewers could simply have chosen a different option in a form field. While I understand that they were "forced" to review under reciprocal review, they still had other choices where I don't see coercion happening and that could have avoided the outcome for them.

qbit42•23m ago

Banned for life is a stretch but the actual response is completely fine. They can just resubmit to the next conference.

Words mean something, if you promise to uphold a contract and break it, there are consequences. The reviewers were free to select the policy which allows LLM use.

notrealyme123•12m ago

In many cases authors and reviewers are not the same. In your first two publications to such venues you are not allowed to review yourself and need someone else.

I think consequences are well deserved, but hopefully not on the authors cost (if innocent).

coldtea•1h ago

Another 30-40% just didn't get caught because the reviewers also used LLM in their "reviews"

jsnell•1h ago

I think you've misunderstood something. This is not about rejecting LLM-written articles. It is about rejecting the articles of people who used LLMs for their reviews.

So your quip is just nonsensical.

coldtea•59m ago

Those second-level reviewers, checking whether the first-level authors used LLMs in their reviews, also used LLMs to do their screening, and the latter missed it in many cases.

My original point (loosely based on the subject, not TFA) is that it's LLMs all the way down, way more than it's "measured" to be.

jacquesm•1h ago

I keep spotting clear LLM 'tells' in text where I know the people on the other side believe they're 'getting away with it'. It is incredible at what levels of commerce people do this, and how they're prepared to risk their reputation by saving a few characters typed. It makes me wonder what they think they are getting paid for.

grey-area•52m ago

Interesting, so someone submitting a paper for review could also submit one with hidden instructions for LLMs to summarise or review it in a very positive light.

Given this detection method works so well in the use case of feeding reviewing LLMs instructions, it should also work for the original submitted paper itself, as long as it was passed along with its watermark intact. Even those just using LLMs to summarise could easily be affected if LLMs were instructed to generate very positive summaries.

So the 2% cheaters on policy A, AND 100% of policy B reviewers could fall for this and be subtly guided by the LLMs overly-positive summaries or even complete very positive reviews (based on hidden instructions).

That this sort of adversarial attack works is really quite troubling for those using LLMs to help them understand texts, because it would work even if asked to summarise something.

wood_spirit•45m ago

Then these papers with these instructions get included in the training corpus for the next frontier models and those models learn to put these kinds of instructions into what they generate and …?

Tade0•38m ago

> Interesting, so someone submitting a paper for review could also submit one with hidden instructions for LLMs to summarise or review it in a very positive light.

I may or may not know a guy who added several hidden sentences in Finnish to his CV that might have helped him in landing an interview.

duskdozer•13m ago

>several hidden sentences in Finnish

Is this a reference to something?

mika-el•47m ago

The irony here is that the detection method is literally prompt injection — the same technique that's a security vulnerability everywhere else. ICML embedded hidden instructions in PDFs that manipulate LLM output. In a different context that's an attack, here it's enforcement.

From my perspective this says something important about where we are with LLMs. The fact that you can reliably manipulate model output by hiding instructions in the input means the model has no real separation between data and commands. That's the fundamental problem whether you're catching lazy reviewers or defending against actual attacks.

geremiiah•39m ago

If you need an LLM to understand a paper you should not be a reviewer for said paper.

quinndupont•35m ago

How is nobody considering the broader political economy of scholarly publications and reviews? These are UNPAID reviews! Sure, maybe ICML isn’t Elsevier, but they are cousins to the socially parasitic and exploitative companies, at the very least.

Hiding behind a false “choice” to not use AI or basically not use AI isn’t an appropriate proposal. This is crooked and shameful. We should boycott ICML except we can’t because they are already the gatekeepers!

qbit42•18m ago

What? Why is that a false choice? The only way you got caught here is if you literally gave an LLM the PDF and used its response verbatim.

And they didn't give a permanent ban or anything, these authors can just resubmit to another conference, of which there are many.

merelysounds•28m ago

Related discussion elsewhere and from a different point of view:

> ICML: every paper in my review batch contains prompt-injection text embedded in the PDF

source: https://old.reddit.com/r/MachineLearning/comments/1r3oekq/d_...

There are recent comments there as well:

> Desk Reject Comments: The paper is desk rejected, because the reciprocal reviewer nominated for this paper ([OpenReview ID redacted]) has violated the LLM reviewing policy. The reviewer was required to follow Policy A (no LLMs), but we have found a strong evidence that LLM was used in the preparation of at least one of their reviews. This is a breach of peer-review ethics and grounds for desk rejection. (...)

source: https://old.reddit.com/r/MachineLearning/comments/1r3oekq/d_...

aledevv•20m ago

Great experiment!

Correct me if I'm wrong, but this means that many people are using LLMs despite claiming not to.

It's the first symptom of a dependency mechanism.

If this happens in this context, who knows what happens in normal work or school environments?

(P.S.: The use of watermarks in PDFs to detect LLM usage is very interesting, even though the LLM might ignore hidden instructions.)

Lerc•15m ago

I have heard people say that they find that people who broadcast their distaste for LLMs secretly use it. I was fairly sceptical of the claim, but this seems to suggest that it happens more than I would have thought.

One wonders what leads them to the AI rejecting option in the first place.

IshKebab•12m ago

I bet plenty of people that leave voicemails don't like listening to them.

Data Science Weekly – Issue 643

MOVA – contract runtime for AI agents that require human approval (Rust/WASM)

Simulating watercolor paint with generative art

Show HN: High Output Software Engineering (Book)

Free scanner for AI agent readiness – 13 signals and 3 bonus signals

Enormous (`Mega') Satellite Constellations

'Your Frustration Is the Product'

CSS Isn't Real

I found a Vue 3 Gantt Chart with Resource Scheduling

Rolls-Royce scraps goal to go all-electric by 2030

Enforcing Prohibition with poorly trained agents didn't go so well in the 1920s

SereneDB's C++ search engine is the fastest on search benchmarks

Another Vibe Coding Story

I've optimized the well known "Superpowers" plugin, the result is mind-blowing

Show HN: Ripl – A unified 2D/3D engine for Canvas, SVG, WebGPU, and the Terminal

Ask HN: How do you handle wide-to-long without Pandas/R?

Ask HN: Should a tl;dr be mandatory for long submissions?

Show HN: A Nascent Analytical Engine

A Personal Appeal from Cofounder of Session – Chris McCabe

Pitchwise – we built a pitch deck tracker, now it's a document Intel platform

Molecule in Python blood could pave way for new obesity drugs

Build, run and debug iOS and Mac apps in Zed instead of Xcode

The U.S. attacked Iran to show its power but the war is lost

How I Work with Parallel Agents

Iran war energy shock sparks global push to reduce fossil fuel dependence

Show HN: Showslide – Live Presentations in 60 seconds

How Ghostty is extracting a reusable terminal library from a 237KB Zig file

An Atlas of DRAGNs

Aura confirms data breach exposing 900k marketing contacts

Ask HN: Would you use a tool that turns Slack threads into decisions?

Data Science Weekly – Issue 643

MOVA – contract runtime for AI agents that require human approval (Rust/WASM)

Simulating watercolor paint with generative art

Show HN: High Output Software Engineering (Book)

Free scanner for AI agent readiness – 13 signals and 3 bonus signals

Enormous (`Mega') Satellite Constellations

'Your Frustration Is the Product'

CSS Isn't Real

I found a Vue 3 Gantt Chart with Resource Scheduling

Rolls-Royce scraps goal to go all-electric by 2030

Enforcing Prohibition with poorly trained agents didn't go so well in the 1920s

SereneDB's C++ search engine is the fastest on search benchmarks

Another Vibe Coding Story

I've optimized the well known "Superpowers" plugin, the result is mind-blowing

Show HN: Ripl – A unified 2D/3D engine for Canvas, SVG, WebGPU, and the Terminal

Ask HN: How do you handle wide-to-long without Pandas/R?

Ask HN: Should a tl;dr be mandatory for long submissions?

Show HN: A Nascent Analytical Engine

A Personal Appeal from Cofounder of Session – Chris McCabe

Pitchwise – we built a pitch deck tracker, now it's a document Intel platform

Molecule in Python blood could pave way for new obesity drugs

Build, run and debug iOS and Mac apps in Zed instead of Xcode

The U.S. attacked Iran to show its power but the war is lost

How I Work with Parallel Agents

Iran war energy shock sparks global push to reduce fossil fuel dependence

Show HN: Showslide – Live Presentations in 60 seconds

How Ghostty is extracting a reusable terminal library from a 237KB Zig file

An Atlas of DRAGNs

Aura confirms data breach exposing 900k marketing contacts

Ask HN: Would you use a tool that turns Slack threads into decisions?

2% of ICML papers desk rejected because the authors used LLM in their reviews

Comments