Publishers are deploying AI-based tools to detect suspicious images, but generative AI threatens their efforts.
出版商们正在使用AI为基础的工具来检测可疑图像,但生成式AI对该方法的效果构成威胁。
Originally published: 12 February 2024 (corrected 20 February 2024)
作者/Author:
It seems that every month brings a fresh slew of high-profile allegations against researchers whose papers, some of them years old, contain signs of possible image manipulation.
每个月都会涌现出大量针对学者而备受关注的指控,这些学者的论文——有些论文甚至有点年头了——其中的图像存在可疑伪造的痕迹。
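(Translator's note: "slew" is the past tense of "slay", whose agent noun is "slayer", as in ゴブリンスレイヤー (Goblin Slayer); here, however, "slew" is a noun meaning "a large number". "High-profile" means attracting a great deal of public attention.)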
Scientific sleuths are using their own trained eyes, along with commercial software based on artificial intelligence (AI), to spot image duplication and other issues that might hint at sloppy record-keeping or worse. They are bringing these concerns to light in places like PubPeer, an online forum featuring many new posts every day flagging image concerns.
学术侦探们以他们训练有素的眼力配以搭载AI的商业软件,检查图像重复以及其他可能指向数据记录混乱或更严重情况的问题。他们把这些问题通过PubPeer之类的平台公布出来:在线上论坛PubPeer上,每天都会有许多标记图像问题的新帖文出现。
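(Translator's note: "sleuth" refers to someone who gathers information about crimes and can be rendered as "detective"; the Chinese rendering "学术侦探" ("academic detective") follows numerous news reports and related Zhihu answers. "Flag" means to mark; it can also mean to grow tired or lose interest. "Sloppy" means careless or slapdash.)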
Some of these efforts have led to action. Last month, for example, the Dana-Farber Cancer Institute (DFCI) in Boston, Massachusetts, said that it would ask journals to retract or correct a slew of papers authored by its staff members. The disclosure came after an observer raised concerns about images in the papers. The institute says it is continuing to investigate the concerns.
侦探们的一些努力已经促成了影响。例如上月,位于马萨诸塞波士顿的丹娜·法伯癌症研究院发布声明称,会要求期刊对其研究人员署名的大量论文进行撤稿或打回修稿的处理。这条信息的公开回应了早前一位学术“侦探”指出这些论文中图片的问题。该机构称其会继续调查这些问题。
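(Translator's note: "disclosure" means a fact that has been made public, or the act of making it public; "retract" means to withdraw or take back.)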
That incident was just one of many. In the face of public scrutiny, academic journals are increasingly adopting tools and techniques, including commercial AI-based systems, to spot problematic imagery before, rather than after, publication. Here, Nature reviews the problem and how publishers are attempting to tackle it.
这只是众多案例其一。面对群众监督,学术期刊正在逐步推进诸如AI辅助系统的工具与方法的应用,以在出版前而非事后发现存在问题的图像。接下来《自然》将带您了解这其中存在的问题,以及出版业者们为解决问题做出的尝试。(Translator's note: "scrutiny" means close, careful examination.)
Questionable image practices include the use of the same data across several graphs, the replication of photos or portions of photos, and the deletion or splicing of images. Such issues can indicate an intent to mislead, but can also result from an innocent attempt to improve a figure’s aesthetics, for example. Nonetheless, even innocent mistakes can be damaging to the integrity of science, experts say.
图像存在问题的相关行为包括在多张图表内使用相同的数据,复制照片或者照片的局部,以及删除或拼接图片。这些问题可能指向作者的刻意误导,也可能是诸如美化图像时的无意之举。尽管如此,学者们认为,就算是无意产生的错误也会损害科研诚信。(Translator's note: "splice" means to join pieces together.)
The precise number of such incidents is unknown. A database maintained by the website Retraction Watch lists more than 51,000 documented retractions, corrections or expressions of concern. Of those, about 4% flag a concern about images.
这类案例具体有多少我们无从得知。由网站“Retraction Watch”(译者注:建议翻译为“撤稿监控”)维护的数据库列出了超过5.1万条撤稿、修正或问题质询的记录。其中,有4%标记为图像相关问题。
One of the largest efforts to quantify the problem was carried out by Elisabeth Bik, a scientific image sleuth and consultant in San Francisco, California, and her colleagues. They examined images in more than 20,000 papers that were published between 1995 and 2014. Overall, they found that nearly 4% of the papers contained problematic images. The study also revealed an increase in inappropriate image duplications starting around 2003, probably because digital photography made it easier to alter photos, Bik says.
加利福尼亚旧金山的咨询师与学术侦探伊丽莎白·比克和她的同事们为确定这样问题的多少做出了重要贡献。他们检查了1995年到2014年间出版的2万余篇论文。他们发现,整体上,近4%的论文中的图像存在问题。他们的研究还发现,2003年前后,不端的图像重复问题增加,比克解释道,这应该是由于图像数字化后,修改图像更容易了。
(Translator's note: "quantify" means to determine the size or amount of something.)
Modern papers also contain more images than do those from decades ago, notes Bik. “Combine all of this with many more papers being published per day compared to ten years ago, and the increased pressure put on scientists to publish, and there will just be many more problems that can be found.”
比克还补充道,现在的论文比几十年以前包含更多的图像,“现在每天出版的论文比十年前更多,科研人员们刊登论文的压力也更大,这些叠加在一起,造成的问题必然比我们能发现的还多得多。”
The high rate of reports of image issues might also be driven by “a rise in whistle-blowing because of the global community’s increased awareness of integrity issues”, says Renee Hoch, who works for the PLOS Publication Ethics team in San Francisco, California.
于加利福尼亚旧金山的公共科学图书馆出版伦理团队供职的蕾妮·霍赫还提到,图像问题反馈频率的增加也可能是在“由于全球社群对诚信问题意识的增加而让更多人敢于‘吹哨’”这样的情况下推动的。
In January, biologist and investigator Sholto David, based in Pontypridd, UK, blogged about possible image manipulation in more than 50 biology papers published by scientists at the DFCI, which is affiliated with Harvard University in Cambridge, Massachusetts. Among the authors were DFCI president Laurie Glimcher and her deputy, William Hahn; a DFCI spokesperson said they are not speaking to reporters. David’s blog highlighted what seemed to be duplications or other image anomalies in papers spanning almost 20 years. The post was first reported by The Harvard Crimson.
今年1月,英国庞特浦里德的生物学家与学术“侦探”肖尔托·大卫在博客上揭露了丹娜·法伯癌症研究院(以下简称DFCI)的研究人员出版的超过50篇生物学论文中可疑的图像伪造问题;DFCI隶属于位于马萨诸塞州剑桥的哈佛大学。DFCI主席劳莉·葛礼谦以及她的副手威廉·哈恩也位于这些论文的作者之列;DFCI的发言人称他们不接受记者采访。大卫的博客中特别标注了论文中疑似图片重复或其他图片异常的问题,这些论文的时间跨度接近20年;这篇博文由《哈佛深红报》首先报道。
(Translator's note: "crimson" is a deep red colour; The Harvard Crimson is the daily newspaper on Harvard's campus.)
The DFCI, which had already been investigating some of these issues, is seeking retractions for several papers and corrections for many others. Barrett Rollins, the DFCI’s research-integrity officer, says that “moving as quickly as possible to correct the scientific record is important and a common practice of institutions with strong research integrity”.
此前已在调查其中一些问题的DFCI正在寻求将其中的数篇论文撤稿,并要求对其他许多论文进行修正。DFCI主管学术诚信的巴雷特·罗林斯称,“尽快修正科研记录很重要,这也是重视学术诚信的机构的常见做法”。
“It bears repeating that the presence of image duplications or discrepancies in a paper is not evidence of an author’s intent to deceive,” he adds.
他还补充道,“得反复强调的一点是,一篇论文中存在图像重复或者出入不能作为作者意图欺诈的证据。”
(Translator's note: "discrepancy" means an inconsistency or difference.)
In an effort to reduce publication of mishandled images, some journals, including the Journal of Cell Science, PLOS Biology and PLOS ONE, either require or ask that authors submit raw images in addition to the cropped or processed images in their figures.
为了减少错误图像的出版,包括《Journal of Cell Science》、《PLOS Biology》、《PLOS ONE》等一些期刊均要求作者额外提交图表中裁剪或处理过的图像其原始图片。
(Translator's note: "crop" means to trim or cut.)
Many publishers are also incorporating AI-based tools including ImageTwin, ImaCheck and Proofig into consistent or spot pre-publication checks. The Science family of journals announced in January that it is now using Proofig to screen all its submissions. Holden Thorp, editor in chief of the Science family of journals, says Proofig has spotted things that led editors to decide against publishing papers. He says authors are usually grateful to have their errors identified.
还有很多出版人士将诸如ImageTwin、ImaCheck和Proofig搭载AI的工具用于持续或定点的版前检查工作。《科学》系列期刊在1月宣布,其已经启用Proofig来对所有投稿论文进行扫描。《科学》系列期刊总编辑霍顿·索普称,Proofig已经检测到相关线索,这些线索会让编辑们决定中止某些论文的出版。他还补充道,作者们一般也会感谢论文中的错误被指出。
(Translator's note: "spot" is used here as an adjective, meaning carried out at selected points, as in "spot checks".)
All these systems can, for example, quickly detect duplicates of images in the same paper, even if those images have been rotated, stretched or cropped or had their colour altered.
这些系统可以快速检测诸如同一论文中图片复制的问题,就算图片经过旋转、拉伸、裁剪或者调色操作,也能一并检出。
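To make the idea of transformation-tolerant duplicate detection concrete, here is a minimal sketch of one common, generic technique: a coarse "average hash" fingerprint compared across rotations and mirror flips. It is an illustration only and is not a description of how Proofig, ImageTwin or ImaCheck work. The function names are hypothetical, and the sketch assumes grayscale images held as NumPy arrays. It tolerates only 90-degree rotations, mirror flips and uniform brightness or contrast changes, not cropping or arbitrary stretching.

```python
import numpy as np

def average_hash(img, grid=8):
    """Return a grid x grid boolean fingerprint of a 2-D grayscale image."""
    h, w = img.shape
    # Crop so that both dimensions divide evenly into grid cells.
    img = img[: h - h % grid, : w - w % grid].astype(float)
    # Average the pixels inside each cell of the grid.
    cells = img.reshape(
        grid, img.shape[0] // grid, grid, img.shape[1] // grid
    ).mean(axis=(1, 3))
    # A cell is "on" if it is brighter than the average cell, so adding a
    # constant or scaling by a positive factor leaves the fingerprint unchanged.
    return cells > cells.mean()

def dihedral_variants(img):
    """Yield the 8 rotated and mirrored versions of an image."""
    for k in range(4):
        rotated = np.rot90(img, k)
        yield rotated
        yield np.fliplr(rotated)

def min_hash_distance(a, b):
    """Smallest number of differing fingerprint bits over all variants of b."""
    ref = average_hash(a)
    return min(
        int(np.count_nonzero(ref != average_hash(variant)))
        for variant in dihedral_variants(b)
    )
```

A small distance (out of 64 bits at the default grid size) suggests that two panels may be the same image after rotation, flipping or a brightness change. A real screening tool would still need to handle cropping, stretching and partial overlaps, and any hit would need human review.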
Different systems have different merits. Proofig, for example, can spot splices created by chopping out or stitching together portions of images. ImageTwin, says Bik, has the advantage of allowing users to cross-check an image against a large data set of other papers. Some publishers, including Springer Nature, are developing their own AI image-integrity software. (Nature’s news team is editorially independent of its publisher, Springer Nature.)
不同的系统各有其长。例如Proofig能够检出由图像局部拼合或裁掉形成的拼接图片。比克则说,ImageTwin的优势在于允许用户将某一张图片与其他论文构成的大型数据集进行交叉比对。包括斯普林格·自然(Springer Nature)在内的一些出版商正在研发他们自己的AI图像诚信检测软件。(《自然》的新闻团队在编辑上独立于其出版商斯普林格·自然。)
Many of the errors flagged by AI tools seem to be innocent. In a study of more than 1,300 papers submitted to 9 American Association for Cancer Research journals in 2021 and early 2022, Proofig flagged 15% as having possible image duplications that required follow-up with authors. Author responses indicated that 28% of the 207 duplications were intentional, driven, for example, by authors using the same image to illustrate multiple points. Sixty-three per cent were unintentional mistakes.
很多由AI标注出的错误可能是无意为之的。一项研究以2021年到2022年初提交给“美国癌症研究协会”下属的9份期刊、共超过一千三百篇论文为对象,发现Proofig标注了15%的论文存在图像复制问题且需要与作者后续沟通。而从作者们的反馈来看,207个图像重复问题中有28%是有意而为之,比如说作者在说明多个要点时都用到了同一张图片。63%的则是无意间造成的错误。
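As a rough check on these figures, the reported percentages can be converted back into approximate counts. The sketch below assumes the rounded percentages quoted above (28% and 63% of 207 duplications), so the counts are approximate. The "other" category is simply whatever those two shares leave over, which the article does not characterise, and the function name is illustrative.

```python
def approximate_counts(total, percentages):
    """Convert rounded percentages of `total` into approximate whole counts."""
    return {label: round(total * pct / 100) for label, pct in percentages.items()}

shares = {"intentional": 28, "unintentional": 63}
# Whatever the two reported categories leave over; not described in the article.
shares["other"] = 100 - sum(shares.values())

counts = approximate_counts(207, shares)
print(counts)
```

Under these assumptions, roughly 58 of the 207 flagged duplications were intentional and roughly 130 were unintentional mistakes.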
Users report that AI-based systems definitely make it faster and easier to spot some kinds of image problems. The Journal of Clinical Investigation trialled Proofig from 2021 to 2022 and found that it tripled the proportion of manuscripts with potentially problematic images, from 1% to 3%.
从用户的反馈来看,AI系统在检出一些种类的图像问题上肯定是更快更方便的。《The Journal of Clinical Investigation》在2021年到2022年间试用了Proofig,发现其检出原稿存在潜在问题图像的论文比例从1%提高到了3%,增至原来的3倍。
But they are less adept at spotting more complex manipulations, says Bik, or AI-generated fakery. The tools are “useful to detect mistakes and low-level integrity breaches, but that is but one small aspect of the bigger issue”, agrees Bernd Pulverer, chief editor of EMBO Reports. “The existing tools are at best showing the tip of an iceberg that may grow dramatically, and current approaches will soon be largely obsolete.”
但是比克说,这些系统在检出更复杂的图像伪造问题或者AI生成的虚假图像上并不专业。这些工具“有助于检测出错误与低级造假行为,但是这些只是众多问题之一隅,”《EMBO Reports》的总编辑贝恩德·普尔维勒也赞同道,“现有的工具最擅长发现的问题只是冰山一角,而整座“冰山”可能会以夸张的速度发展,现有的方法很快就会被远远抛下。”
(Translator's note: "adept" means skilled.)
A combination of expert teams, technology tools and increased vigilance seems to be working — for the time being. “We have applied systematic screening now for over a decade and for the first time see detection rates decline,” says Pulverer.
目前看来,专家团队、技术工具与更强警惕意识的结合似乎有所成效。普尔维勒说,“我们采用系统性筛查的方法已经十多年了,而现在,我们第一次注意到,检测到问题的比率正在下降。”
(Translator's note: "vigilance" means watchfulness.)
But as image manipulation gets more sophisticated, catching it will become ever harder, he says. “In a couple of years all of our current image-integrity screening will still be useful for filtering out mistakes, but certainly not for detecting fraud,” Pulverer says.
而他补充道,随着图片伪造技术逐渐成熟,发现问题将会越来越困难,“接下来几年,我们所有检测图像真实性的方法依然会有效的剔出错误,但是在发现造假上就不尽然了。”
Ultimately, stamping out image manipulation will involve complex changes to how science is done, says Bik, with more focus on rigour and reproducibility, and repercussions for bad behaviour. “There are too many stories of bullying and highly demanding PIs spending too little time in their labs, and that just creates a culture where cheating is ok,” she says. “This needs to change.”
比克说,要最终杜绝图像伪造,需要对科研的运作方式做出复杂的改变,更加注重严谨性和可复现性,并让不端行为承担后果。她说,“有太多关于首席研究员霸凌下属、要求苛刻却很少待在实验室里的故事了,这只会形成一种认为造假无妨的风气。这必须改变。”
(Translator's note: "stamp something out" means to eliminate it; "rigour" means strictness or thoroughness; "repercussion" means a bad consequence; "PI" stands for "principal investigator".)
Correction 20 February 2024: An earlier version of this article used an incorrect pronoun to refer to Barrett Rollins. This has now been corrected.
2024.02.20修正:巴雷特·罗林斯的代词用错了,现在改了
References (from the original article)
Bik, E. M., Casadevall, A. & Fang, F. C. mBio 7, e00809-16 (2016).
https://www.nature.com/articles/d41586-024-00372-6#ref-CR1
Jackson, S., Williams, C. L., Collins, K. L. & McNally, E. M. J. Clin. Invest. 132, e162884 (2022).
Cover image source: https://tmp.nulla.top/ba-logo/