Shady scientists trying to publish poor-quality research may want to think twice: academic publishers are increasingly using AI software to automatically detect signs of data manipulation.
Duplicated images, in which the same picture of a group of cells is copied, flipped, rotated, shifted, or cropped, are unfortunately very common. When the duplication is not accidental, the doctored image is created to make it appear as if the researchers had more data and ran more experiments than they actually did.
According to Daniel Ivanko, the association’s director of journal operations and systems, image duplication was the number one reason papers were retracted by the American Association for Cancer Research (AACR) from 2016 to 2020. Having to retract a paper damages the reputation of both the authors and the publisher. It indicates that the quality of the researchers’ work was poor and that the publisher’s peer review process missed the errors.
To avoid embarrassment for both parties, academic publishers such as AACR have turned to artificial intelligence software to detect duplicated images before articles are published in their journals. AACR began trialing Proofig, an image-checking program developed by an Israel-based startup of the same name. At the International Congress on Peer Review and Scientific Publication in Chicago last week, Ivanko presented the results of a pilot study showing how Proofig affected AACR’s processes.
The AACR publishes 10 research journals and reviews more than 13,000 submissions each year. From January 2021 to May 2022, staff used Proofig to screen 1,367 manuscripts provisionally accepted for publication, and contacted the authors of 208 of them after reviewing the image duplications flagged by the program. In most cases, the duplications are sloppy errors that are easy to fix. Scientists may have mistakenly mixed up their results, and resubmitting new data often solves the problem.
In rare cases, however, the dodgy images highlighted by the program are a sign of foul play. Of the 208 papers, four were withdrawn and one was subsequently rejected. Academic fraud is rare and is often associated with paper mills or institutions with a poor reputation. Cases of cheating have nonetheless been uncovered in top laboratories at prestigious universities. A recent investigation by Science found that decades of Alzheimer’s research, which led to fruitless searches for new treatments and failed clinical trials, was reportedly based on a highly cited paper riddled with duplicated images.
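To put the pilot figures in proportion, the short calculation below works out the rates implied by the numbers reported above. It is only arithmetic on those figures; the variable names are ours, and the percentages are derived here rather than quoted from AACR.

```python
# Figures as reported for the AACR pilot (January 2021 to May 2022).
screened = 1367           # provisionally accepted manuscripts run through Proofig
authors_contacted = 208   # papers where authors were contacted about flagged images
withdrawn = 4
rejected = 1

# Share of screened papers that led to a query to the authors.
contact_rate = authors_contacted / screened

# Share of queried papers that ended in withdrawal or rejection.
serious_rate = (withdrawn + rejected) / authors_contacted

# The same serious outcomes as a share of everything screened.
serious_overall = (withdrawn + rejected) / screened

print(f"Authors contacted:      {contact_rate:.1%} of screened papers")   # 15.2%
print(f"Withdrawn or rejected:  {serious_rate:.1%} of contacted papers")  # 2.4%
print(f"Withdrawn or rejected:  {serious_overall:.1%} of screened papers")  # 0.4%
```

In other words, roughly one screened paper in seven prompted a query, but only about one queried paper in forty ended in a withdrawal or rejection.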
The results in question were a series of faint lines produced using a technique known as western blotting, which were allegedly copied, edited, and pasted into data from mice. Such copy-and-paste alterations are very difficult to detect with the untrained eye. Looking for subtle changes like this is a daunting task for most humans, but it is well suited to computers, Dror Kolodkin-Gal, co-founder of Proofig, told The Register.
Proofig’s first job is to find all the images relevant to the analysis in an uploaded paper, ignoring bar charts and line graphs. It then has to check whether each image matches any of the other sub-images in the paper. Sub-images may be shifted, flipped, or rotated, and parts of them may be cropped, copied, and repeated. “There are a lot of possibilities,” Kolodkin-Gal said.
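To give a feel for the kind of check described above, here is a minimal sketch in Python with NumPy. It is not Proofig's algorithm, which has not been published in this article; it assumes the sub-images have already been extracted as 2-D grayscale arrays, and it only handles whole-image flips and 90-degree rotations, not shifts, crops, or compression artifacts. The function names are hypothetical.

```python
import numpy as np

def orientations(img):
    """Yield the 8 rotations/reflections of a 2-D array."""
    for k in range(4):
        rotated = np.rot90(img, k)
        yield rotated
        yield np.fliplr(rotated)

def is_duplicate(a, b, tol=1e-6):
    """True if b equals a under some 90-degree rotation or flip (within tol)."""
    b = b.astype(float)
    for candidate in orientations(a.astype(float)):
        if candidate.shape == b.shape and np.max(np.abs(candidate - b)) <= tol:
            return True
    return False

def find_duplicates(sub_images):
    """Compare every pair of sub-images; return flagged index pairs (i, j), i < j."""
    flagged = []
    for i in range(len(sub_images)):
        for j in range(i + 1, len(sub_images)):
            if is_duplicate(sub_images[i], sub_images[j]):
                flagged.append((i, j))
    return flagged

# Toy data: panel_c is panel_a rotated and then flipped.
rng = np.random.default_rng(0)
panel_a = rng.random((32, 48))
panel_b = rng.random((32, 48))
panel_c = np.flipud(np.rot90(panel_a))

print(find_duplicates([panel_a, panel_b, panel_c]))  # [(0, 2)]
```

Even this toy version has to try eight orientations for every pair of panels, which hints at why a tool that also copes with crops, shifts, and partial overlaps is far more demanding.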
Proofig uses a combination of computer vision and artificial intelligence algorithms to extract and classify images. Kolodkin-Gal said the software is computationally complex and would not have been possible without recent advances in machine learning. “Before AI, extracting sub-images from a paper would have required ten times the investment in research and development, and God only knows how you would compute it. What has changed is the improvement in technology, both in the algorithms and in the ability to run GPUs in the cloud,” he added.
Human in the loop required
AI programs like Proofig cannot catch cheats on their own. “It takes people with a certain amount of knowledge and experience to interpret the results,” Elizabeth Beck, an image forensics expert and independent scientific advisor, told The Register. “You can’t just let the program run automatically. It can flag quite a lot of things that are perfectly fine.” In some cases, the human eye is better than a computer.
Beck uses another AI-based program called ImageTwin for her work. It sometimes struggles with western blots, she said. “A western blot is basically just black stripes on a plain background. There are subtle details in the appearance that I can see as a human, but the software somehow can’t see them. I think it has to do with how very complex our brains are. The program only sees relative distances, so a black stripe always looks like a black stripe to it. It’s not good at finding the shapes of the blots.”
Kolodkin-Gal agreed that western blots are particularly difficult to check by machine. He said: “It took a lot of investment to finally find a good algorithm for finding western blots. It was very difficult for the AI because they are so small.”
Academic publishers use image-checking tools like Proofig at various stages of the publishing process. AACR scans provisionally accepted manuscripts, while others such as Taylor & Francis use the tools only to check papers about which editors or reviewers have raised concerns. “If the software detects potential duplication or manipulation of images and that is supported by our team of experts, we then launch an investigation following our established procedures and those set out by the Committee on Publication Ethics for such cases,” a company spokesperson told us.
When and where to use these tools in the publishing pipeline is a matter of cost. Image processing is computationally intensive, and publishers have to cover the cloud computing costs of startups like Proofig. Screening every paper at the submission stage would be very expensive. For example, analyzing 120 sub-images with Proofig costs an individual $99. That is not cheap, given the number of possible combinations Proofig has to work through in a single paper. Organizations such as AACR and Taylor & Francis negotiate specific, lower-priced packages tailored to their individual operations.
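A rough back-of-the-envelope calculation shows why the workload grows quickly. The sketch below assumes every sub-image is compared with every other one, and that each pair is tried in 8 orientations (four rotations, each with and without a flip). Both are our simplifying assumptions, not details of how Proofig works, and a real tool that also handles crops and shifts would face far more cases.

```python
from math import comb

def pairwise_comparisons(n_sub_images, orientations=8):
    """Return (unordered pairs, pair-orientation checks) for n sub-images."""
    pairs = comb(n_sub_images, 2)  # n * (n - 1) / 2
    return pairs, pairs * orientations

price_usd = 99        # individual price quoted in the article
n_sub_images = 120    # sub-images covered by that price

pairs, checks = pairwise_comparisons(n_sub_images)

print(f"Price per sub-image: ${price_usd / n_sub_images:.3f}")  # $0.825
print(f"Pairs to compare:    {pairs}")                           # 7140
print(f"Orientation checks:  {checks}")                          # 57120
```

Because the number of pairs grows with the square of the number of sub-images, doubling the figures in a paper roughly quadruples the comparisons.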
Helen King, head of transformation and product innovation at SAGE Publishing, told us: “So far, about a third of the research papers we have run through the program have been flagged for problems, and more subject matter expertise is needed to understand and interpret the results.”
AI can’t yet detect stolen photos across different papers
The American Society for Clinical Investigation has also adopted Proofig, while other publishers such as Frontiers have built their own tools. Wiley also uses some form of software, and PLOS, Elsevier, and Nature are either open to or actively testing such programs, Nature first reported.
AI programs are getting better at spotting suspicious data, but they cannot catch all the different ways scientists cheat. Proofig can tell whether an image appears to be duplicated within the same paper, but it cannot yet cross-check for copies across different papers, so it has not been able to detect cases where images may have been lifted from another publication. To do that, the company needs to build a database of images clipped from published papers to compare against.
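One common way to make such a cross-paper database searchable is to store a compact fingerprint of each image rather than the image itself. The sketch below uses a simple average hash as an illustration; it is not a description of what Proofig plans to build. The class name, paper IDs, and figure labels are hypothetical, and this particular hash would not survive rotation or cropping.

```python
import numpy as np

def average_hash(img, size=8):
    """Fingerprint a 2-D grayscale array: block-average to size x size,
    then mark each block as above or below the overall mean."""
    h, w = img.shape
    img = img[: h - h % size, : w - w % size].astype(float)
    blocks = img.reshape(
        size, img.shape[0] // size, size, img.shape[1] // size
    ).mean(axis=(1, 3))
    return (blocks > blocks.mean()).flatten()

class ImageIndex:
    """Toy in-memory store of fingerprints from published figures."""

    def __init__(self):
        self.entries = []  # list of (paper_id, figure_label, fingerprint)

    def add(self, paper_id, figure_label, img):
        self.entries.append((paper_id, figure_label, average_hash(img)))

    def query(self, img, max_distance=5):
        """Return entries whose fingerprint differs in at most max_distance bits."""
        target = average_hash(img)
        return [
            (paper_id, label)
            for paper_id, label, stored in self.entries
            if np.count_nonzero(stored != target) <= max_distance
        ]

rng = np.random.default_rng(1)
published = rng.random((64, 64))
unrelated = rng.random((64, 64))

index = ImageIndex()
index.add("paper-001", "Fig. 2B", published)
index.add("paper-002", "Fig. 1A", unrelated)

# A copy with brightness and contrast adjusted still matches the original.
suspect = published * 0.9 + 0.05
print(index.query(suspect))
```

The design trade-off is that fingerprints are cheap to store and compare, but the database is only as useful as the number of published figures publishers are willing to pool into it.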
“The main challenge facing the community today is big data,” Kolodkin-Gal said. “If publishers don’t start working together to create a database of problematic images, [image plagiarism] will remain a problem. Developing artificial intelligence requires big data.”
However, programs like Proofig are a good starting point for cracking down on fraud and improving scientific integrity. “I think having the software as a start is a good development for publishers because it provides a little bit of quality control in the publishing process,” Beck says. “It acts as a deterrent. It tells the authors that we are going to screen your paper for these types of duplications. I don’t think it will prevent cheating, but it does make cheating a little more difficult.” ®