The only reason this is helpful is that humans have natural biases, and/or the inverse of AI biases, which let them spot patterns, such as recognizing that a result might just be the same graph scaled up 5 to 10 times.
Having seen up close how these reviews go, I unfortunately get why people use tools like this. It doesn't make me very hopeful for the near future of reviewing.
It is a tool, and there always needs to be a user that can validate the output.
Seeing the high percentage of AI use in composing reviews is concerning, but peer review is also an unpaid racket that seems basically random anyway (https://academia.stackexchange.com/q/115231), and it probably needs to die given alternatives like arXiv, OpenPeerReview, etc. I'm not sure how much I care about AI slop contaminating an area that might already be mostly human slop in the first place.
But of course, you are often not allowed to do that. Review copies are confidential documents, and you are not allowed to upload them to random third-party services.
Peer review has random elements, but that's true of any situation (such as job interviews) where the final decision is made using subjective judgment. There is nothing wrong with that.
I get where you are coming from here, but, in my opinion, no, this is not part of peer review (where expertise implies preconceptions), nor really of anything humans do. If you ignore your preconceptions and/or priors (which are formed from your accumulated knowledge and experience), you aren't thinking.
A good example in peer review (which I have done) would be: I see a paper where I have some expertise in the technical / statistical methods used, but not in the very particular subject domain. I can use AI search to help me find papers in that subject domain faster than I can on my own, and then more quickly see whether my usual preconceptions about the statistical methods are relevant to the paper I have to review. I still have to check things, but previously this took a lot more time and clever crafting of search queries.
Failing to use AI for search in this way harms peer review, because, in practice, you do less searching and checking than AI does (since you simply don't have the time, peer review being essentially free slave labor).
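To give a concrete (and entirely hypothetical) illustration of the kind of search assist I mean: even a small script against the public arXiv API cuts down the query-crafting time. The search terms below are placeholders, not my actual workflow.

    # Hypothetical example: pull recent papers matching a domain-specific query
    # from the public arXiv API (which returns an Atom feed). Query is a placeholder.
    import urllib.parse
    import urllib.request
    import xml.etree.ElementTree as ET

    query = 'all:"mixed effects model" AND cat:q-bio.NC'  # placeholder query
    url = ("http://export.arxiv.org/api/query?"
           + urllib.parse.urlencode({"search_query": query, "start": 0, "max_results": 10}))

    with urllib.request.urlopen(url) as resp:
        feed = ET.fromstring(resp.read())

    ns = {"atom": "http://www.w3.org/2005/Atom"}
    for entry in feed.findall("atom:entry", ns):
        title = entry.find("atom:title", ns).text.strip()
        link = entry.find("atom:id", ns).text.strip()
        print(f"{title}\n  {link}")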
You are also supposed to review the paper and not just check it for correctness. If the presentation is unclear, or if earlier sections mislead the reader before later sections clarify the situation, you are supposed to point that out. But if you have seen an AI summary of the paper before reading it, you can no longer do that part. (And if a summary helps to interpret the paper correctly, that summary should be a part of the paper.)
If you don't have sufficient expertise to review every aspect of the paper, you can always point that out in the review. Reading papers in unfamiliar fields is risky, because it's easy to misinterpret them. Each field has its own way of thinking that can only be learned by exposure. If you are not familiar with the way of thinking, you can read the words but fail to understand the message. If you work in a multidisciplinary field (such as bioinformatics), you often get daily reminders of that.
(Now that I think about it, I haven't seen much battery hype lately. The battery hype people may have pivoted to AI. Lots of stuff is going on in batteries, but mostly by billion-dollar companies in China quietly building plants and mostly shutting up about what's going on inside.)
There are also reasons for discouraging the use of LLMs in peer review at all: it defeats the purpose of the "peer" in peer review; hallucinations; criticism not relevant to the community; and so on.
However, I think it's high time to reconsider what scientific review is supposed to be. Is it really important to have so-called peers as gatekeepers? Are there automated checks we can introduce to verify claims or ensure quality (like CI/CD for scientific articles), leaving content interpretation to humans?
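To sketch what I mean by "CI/CD for articles" (purely illustrative; the check names and structure are invented, not an existing tool): mechanical checks run automatically on submission, and only interpretation is left to people.

    # Hypothetical sketch of "CI for papers": mechanical checks run on submission.
    # All check names are invented; the bodies are stubs.
    from dataclasses import dataclass
    from typing import Callable

    @dataclass
    class CheckResult:
        name: str
        passed: bool
        details: str = ""

    def check_references_resolve(paper_dir: str) -> CheckResult:
        # e.g., confirm every cited DOI resolves (stub)
        return CheckResult("references-resolve", True)

    def check_data_and_code_linked(paper_dir: str) -> CheckResult:
        # e.g., confirm the data/code availability statement points at a live archive (stub)
        return CheckResult("data-code-linked", True)

    def check_figures_have_source(paper_dir: str) -> CheckResult:
        # e.g., confirm each figure ships with the script that generated it (stub)
        return CheckResult("figures-have-source", True)

    CHECKS: list[Callable[[str], CheckResult]] = [
        check_references_resolve,
        check_data_and_code_linked,
        check_figures_have_source,
    ]

    def run_pipeline(paper_dir: str) -> bool:
        results = [check(paper_dir) for check in CHECKS]
        for r in results:
            print(f"[{'PASS' if r.passed else 'FAIL'}] {r.name} {r.details}")
        return all(r.passed for r in results)

    if __name__ == "__main__":
        run_pipeline("submission/")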
Let's make the benefits and costs explicit: what would we be gaining or losing if we just switched to LLM-based review, and left the interpretation of content to the community? The journal and conference organizers certainly have the data to do that study; and if not, tool providers like EasyChair do.
LLMs simply don't do science. They have no integrity.
There is no "bullshit me" step in the scientific process
There are also strong reasons why the peers-as-gatekeepers model is detrimental to the pursuit of knowledge, such as researchers forming semi-closed communities that bestow local political power on senior people in the field, creating social barriers to entry or critique. This is especially pernicious given the financial incentives (competition for a limited pool of grant money; award of grant money based on publication output) that researchers are exposed to.
Of course that makes it harder for people outside to break in, but it also depends on the culture of the specific domain, and there are usually people writing summaries and surveys. Great task for grad students, tbh (you read a ton of papers, summarize them, and by that point you should have a good understanding of what needs to be worked on in the field, rather than just being dragged through by your advisor).
I also don't think the categories are exclusive.
However, there are still big issues with how these peers perform reviews today [1].
For example, if there's a scientifically arbitrary cutoff (e.g., the 25% acceptance rate at top conferences), reviewers will be mildly incentivized to reject (what they consider to be) "borderline-accept" submissions. If the scores are still "too high", the associate editors will overrule the decision of the reviewers, sometimes for completely arbitrary reasons [2].
There's also a whole host of things reviewers should look out for, but for which they have neither the time, space, tools, nor incentives. For example, reviewers are meant to check whether the claims fit what is cited, but I can't know how many actually take the time to look at the cited content. There's also checking for plagiarism, GenAI and hallucinated content, whether the evidence supports the claims, how the charts were generated, "novelty", etc. There are also things that reviewers shouldn't check, but that pop up occasionally [3].
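(As an aside, some of these checks are already cheap to automate. For example, verifying that cited DOIs actually exist takes a few lines against the public Crossref API. The sketch below is illustrative only; the DOIs are placeholders, and extracting DOIs from a manuscript is the part I'm glossing over.)

    # Illustrative only: check whether cited DOIs resolve in Crossref.
    # The DOI list is a placeholder; a real tool would extract DOIs from the manuscript.
    import urllib.error
    import urllib.parse
    import urllib.request

    def doi_exists(doi: str) -> bool:
        url = "https://api.crossref.org/works/" + urllib.parse.quote(doi)
        try:
            with urllib.request.urlopen(url, timeout=10) as resp:
                return resp.status == 200
        except urllib.error.HTTPError:
            return False

    for doi in ["10.1000/example.doi", "10.0000/another.placeholder"]:  # placeholders
        status = "found" if doi_exists(doi) else "not found (possibly hallucinated)"
        print(doi, "->", status)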
However, you would be right to point out that none of this has to do with peers doing the gatekeeping, but with how the process is structured. But I'd argue that this structure is so common that it's basically synonymous with peer review. If it results in bad experiences often enough, we really need to push for the introduction of more tools and honesty into the process [4].
[1] This is based on my experience as a submitter and a reviewer. From what I see/hear online and in my community, it's not an uncommon experience, but it could be a skewed sample.
[2] See, for example: https://forum.cspaper.org/topic/140/when-acceptance-isn-t-en...
[3] Example things reviewers shouldn't check for or use as arguments: did you cite my work; did you cite a paper from the conference; can I read the diagram without glasses if I print out the PDF; do you have room to appeal if I say I can't access publicly available supplementary material; etc.
[4] Admittedly, I also don't know what would be the solution. Still, some mechanisms come to mind: open but guaranteed double-blind anonymous review; removal of arbitrary cutoffs for digital publications; (responsible, gradual) introduction of tools like LLMs and replication checks before it gets to the review stage; actually monitoring reviewers and acting on bad behavior.
> However, I think it's high time to reconsider what scientific review is supposed to be
I've been arguing for years that we should publish to platforms like OpenReview, and that we basically check for plagiarism and obvious errors but otherwise publish. In the old days the bottleneck was physically sending out papers. Now that's cheap. So make comments public. We're all on the same side. The people who do leave reviews are more likely to actually be invested in the topic rather than doing review purely as a service. It's not perfect, but no system will be, and we currently waste lots of time chasing reviewers.
The arXiv and the derivative preprint repositories (e.g., bioRxiv) are other good initiatives.
However, I don't think it's enough to leave content review completely to the community. There are known issues with researchers using arXiv, for example, to stake claims on novel things, or readers jumping on claims made in preprints by well-known institutions, which may turn out to be overconfident or bogus.
I believe that a number of checks (beyond plagiarism) need to happen before the paper is endorsed by a journal or a conference. Some of these can and should be done in a peer review-like format, but it needs to be heavily redesigned to support review quality without sacrificing speed. Also, there are things that we have good tools for (e.g., checking citation formatting), so this part should be integrated.
Plus, time may be one of the bottlenecks, but that's partly because publishers take money from academic institutions, yet expect voluntary service. There's no reason for this asymmetry, IMO.
croes•9h ago
Faster isn’t the metric here
D-Machine•9h ago
I get that HN has a policy to allow duplicates so that duplicates that were missed for arbitrary timing reasons can still gain traction at later times. I've seen plenty of "[Duplicate]" tagged posts, and have just seen this as a sort of useful thing for readers (duplicates may have interesting info, or seeing that the dupe did or did not gain traction also gives me info). But maybe I am missing something here, particularly etiquette-wise?
layer8•8h ago
The fact that a previous submission didn’t gain traction isn’t usually interesting, because it can be pretty random whether something gains traction or not, depending on time of day and audience that happens to be online.
D-Machine•8h ago
I also think, on reflection, that you are right in this particular case (given there are no comments on the previous duplicate), so thank you also for clarifying.
I suppose in the future a "[Previous discussion]" tag would be more appropriate, provided comments were made; otherwise, just say nothing and leave it to HN.