They were quite conservative in their approach, so the only things that were rejected were from people who had agreed not to use an LLM and almost definitely did use an LLM (since they fed hidden watermarked instructions to the llm's).
This means the true number of people that used LLM's in their review (even in group A that had agreed not to) is likely higher.
Also worth noting, 10% of these authors used them in more than half of their reviews.
It's incredible how so many people thought it was fair that their paper should be assessed by human reviewers alone, and yet would not extend the same courtesy to others.
It really does sound like an addiction when you put it this way.
The trick is: I can't cut-and-paste between the two machines. So there is never even a temptation to do so and I can guarantee that my writing or other professional output will never be polluted. Because like you I'm well aware of that poor impulse control factor and I figured the only way to really solve this is to make sure it can not happen.
The problem is that it's just much easier to un-quit and run the LLM in the same laptop you work on.
It's just so very tempting.
Oh, and LLMs are of course geared to pull you in further, they are on a continuous upsell salespitch. Drug pushers could learn a thing or two from them.
To be clear this is not an excuse but an explanation why I am not surprised.
And detection was not done with some snake oil "AI detector" but by invisible prompt injection in the paper pdf, instructing LLMs to put TWO long phrases into the review. They then detected LLM use through checking if both phrases appear in the review.
This did not detect grammar checks and touchups of an independently written review. The phrases would only get included if the reviewer fed the pdf to the LLM in clear violation to their chosen policy.
> After a selection process, in which reviewers got to choose which policy they would like to operate under, they were assigned to either Policy A or Policy B. In the end, based on author demands and reviewer signups, the only reviewers who were assigned to Policy A (no LLMs) were those who explicitly selected “Policy A” or “I am okay with either [Policy] A or B.” To be clear, no reviewer who strongly preferred Policy B was assigned to Policy A.
Most of these people are likely students; this should be a learning moment, but I don't think it is yet grounds for their entire academic career to be crippled by being unable to publish in a top-tier ML venue.
This didn't trip for people who were merely bouncing ideas off a LLM, they caught people who copy and pasted straight from their LLM.
I suppose though new methods could be devised, but it's not "certainty" that they will catch them.
Just because a method was successful once does not mean it was 'burned', none of these people will be checking each and every future pdf or passing it through a cleaner before they will do the same thing all over again and others are going to be 'virgin' and won't even be warned because this is not going to be widely distributed in spite of us discussing it here.
If anything you can take this as proof that this method is more or less guaranteed to work.
Do very harsh punishments significantly reduce future occurrences of the offense in question?
I've heard opponents of the death penalty argue that it's generallynot the case. E.g., because often the criminals aren't reasoning in terms that factor in the death penalty.
On the other hand (and perhaps I'm misinformed), I've heard that some countries with death penalties for drug dealers have genuinely fewer problems with drug addiction. Lower, I assume, than the numbers you'd get from simply executing every user.
So I'm curious where the truth lies.
But FWIW, my point was about very harsh punishments in general, not specifically the death penalty.
> All Policy A (no LLMs) reviews that were detected to be LLM generated were removed from the system. If more than half of the reviews submitted by a Policy A reviewer were detected to be LLM generated, then all of their reviews were deleted, and the reviewer themselves was removed from the reviewer pool.
Half is a bit lenient in my view, but I suppose they wanted to avoid even a single false positive.
I'm all for repurcussions ... but a life is a long time and students are usually only at the beginning of it.
Words mean something, if you promise to uphold a contract and break it, there are consequences. The reviewers were free to select the policy which allows LLM use.
I think consequences are well deserved, but hopefully not on the authors cost (if innocent).
So your quip is just nonsensical.
My original point (loosely based on the subject, not TFA) is that it's LLMs all the way down, way more than it's "measured" to be.
Given this detection method works so well in the use case of feeding reviewing LLMs instructions, it should also work for the original submitted paper itself, as long as it was passed along with its watermark intact. Even those just using LLMs to summarise could easily be affected if LLMs were instructed to generate very positive summaries.
So the 2% cheaters on policy A, AND 100% of policy B reviewers could fall for this and be subtly guided by the LLMs overly-positive summaries or even complete very positive reviews (based on hidden instructions).
That this sort of adversarial attack works is really quite troubling for those using LLMs to help them understand texts, because it would work even if asked to summarise something.
I may or may not know a guy who added several hidden sentences in Finnish to his CV that might have helped him in landing an interview.
Is this a reference to something?
From my perspective this says something important about where we are with LLMs. The fact that you can reliably manipulate model output by hiding instructions in the input means the model has no real separation between data and commands. That's the fundamental problem whether you're catching lazy reviewers or defending against actual attacks.
Hiding behind a false “choice” to not use AI or basically not use AI isn’t an appropriate proposal. This is crooked and shameful. We should boycott ICML except we can’t because they are already the gatekeepers!
And they didn't give a permanent ban or anything, these authors can just resubmit to another conference, of which there are many.
> ICML: every paper in my review batch contains prompt-injection text embedded in the PDF
source: https://old.reddit.com/r/MachineLearning/comments/1r3oekq/d_...
There are recent comments there as well:
> Desk Reject Comments: The paper is desk rejected, because the reciprocal reviewer nominated for this paper ([OpenReview ID redacted]) has violated the LLM reviewing policy. The reviewer was required to follow Policy A (no LLMs), but we have found a strong evidence that LLM was used in the preparation of at least one of their reviews. This is a breach of peer-review ethics and grounds for desk rejection. (...)
source: https://old.reddit.com/r/MachineLearning/comments/1r3oekq/d_...
Correct me if I'm wrong, but this means that many people are using LLMs despite claiming not to.
It's the first symptom of a dependency mechanism.
If this happens in this context, who knows what happens in normal work or school environments?
(P.S.: The use of watermarks in PDFs to detect LLM usage is very interesting, even though the LLM might ignore hidden instructions.)
One wonders what leads them to the AI rejecting option in the first place.
michaelbuckbee•1h ago