Analysis of scientific paper retractions due to data problems: Revealing challenges and countermeasures in data management

Authors : Wanfei Hu, Guiliang Yan, Jingyu Zhang, Zhenli Chen, Qing Qian, Sizhu Wu

Background

Scientific data, the cornerstone of scientific endeavors, face management challenges amid technological advances. While retractions are analyzed, a rigorous focus on data problems leading to them is missing.

Methods

This study collected 49,979 retraction records up to 17 December 2023. After screening, 16,842 records were related to data problems and 19,656 were due to other reasons. Descriptive statistics and hypothesis testing were applied to the records, and BERTopic (Bidirectional Encoder Representations from Transformers Topic Modelling) was used to conduct a topic analysis of article titles.
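The screening counts above can be cross-checked with a few lines of arithmetic. This is only a sketch: the counts come from the abstract, while the function name `screening_summary` and the assumption that the two retained groups are disjoint (with the remainder excluded during screening) are illustrative and not stated by the authors.

```python
# Counts quoted in the Methods paragraph of the abstract.
TOTAL_COLLECTED = 49_979
DATA_RELATED = 16_842
OTHER_REASONS = 19_656


def screening_summary(total, data_related, other):
    """Return (retained, excluded, data-related share of retained records).

    Assumption (not stated in the abstract): the two groups do not overlap,
    and every record outside them was dropped during screening.
    """
    retained = data_related + other
    excluded = total - retained
    share = data_related / retained
    return retained, excluded, share


retained, excluded, share = screening_summary(
    TOTAL_COLLECTED, DATA_RELATED, OTHER_REASONS
)
print(f"retained: {retained}, excluded: {excluded}, "
      f"data-related share of retained: {share:.1%}")
```

Under that assumption, a little under half of the retained records are data-related, which differs from the yearly percentage reported for 2023 because it pools all years.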

Results

The results show that since 2000, retractions due to data problems have increased significantly (p < 0.001), with the percentage in 2023 exceeding 75%. Among 16,842 data-related retractions, 59.0% were in Basic Life Sciences and 40.2% in Health Sciences. Data problems involve accuracy, reliability, validity, and integrity. There are significant differences (p < 0.001) in subjects, journal quartiles, retraction intervals, and other characteristics between data-related and other retractions. Data-related retractions are more concentrated in high-impact journals (Q1 37.6% and Q2 43.0%).

Conclusions

Institutions, publishers, and journals should adopt image-screening tools, enforce data deposition, standardize retraction notices, provide ethics training, and strengthen peer review to address these data problems, guiding better data management and healthier scientific development.

DOI : https://doi.org/10.1080/08989621.2025.2531987

TheConversation.com: relaying the expertise of academics and researchers within the media space

Author : Edern Appéré

The health crisis caused by COVID-19 exposed the sometimes difficult relationship between science and the media. The media drew heavy criticism for their coverage of the disease, notably for journalists' lack of scientific knowledge, their difficulty in understanding and explaining science in the making, and their questionable choice of experts. The importance for the general public of having reliable, verified information from people with expertise in the relevant field was highlighted more than ever.

Founded in 2011 in Australia, the media outlet TheConversation.com aims to renew the way scientists and journalists work together. It takes the form of a general-interest news website covering every topic in the news. The main distinguishing feature of its editorial model is that all articles are written by researchers, assisted by journalists, with the aim of popularizing science. This original model is summed up by the motto « L’expertise universitaire, l’exigence journalistique » (academic expertise, journalistic rigor).

This thesis, focused on the French edition of The Conversation, seeks to answer the legitimate questions that this atypical outlet inevitably raises: is the stated reading contract honored? Does TheConversation.com fulfill its promise of giving researchers a voice to inform and enlighten public debate?

URL : https://dumas.ccsd.cnrs.fr/dumas-04965431v1

Research Data: A Public Good or a Private Asset?

Authors : Tadeu Fernando Nogueira, Trude Eikebrokk, Laila Økdal Aksetøy

This article is concerned with the issue of how Research Performing Organizations can balance the market and non-market values of the research data they hold. To address this issue, we adopt the lenses of the Resource Based View and Open Science and explore the interplay between them.

In doing so, this article addresses the question of whether it is possible to achieve a balance between research data as a public good and as a private asset and if so, how. Of particular interest are Research Performing Organizations in the institute sector that operate under both market and non-market logics, which have implications for how they govern their research data.

From the discussions undertaken in the article, one of the main conclusions is that Research Performing Organizations may benefit from adopting a research data governance model that captures both the economic and societal values of research data.

They could do so, for instance, by developing an integrative institutional policy and by actively using data management plans to evaluate the value of the data produced in research projects.

DOI : https://doi.org/10.53377/lq.22604

Greenwashing at Elsevier: A political ecology of corporate publishing

Authors : Angus Lyall, Mark Ortiz, Emily Billo

The largest science publishing corporations, including Elsevier, Wiley, Taylor & Francis, Springer, and Sage, are key partners for the oil, gas, and coal industries insofar as they distribute scientific research and data that facilitate fossil fuel exploration, production, and distribution.

Critical researchers seldom trace fossil fuels and, in turn, the climate crisis to the publishing corporations that they generally rely upon to distribute their own research. We argue that corporate publishers produce the invisibility of their connections to fossil fuels through changing practices of greenwashing both in the public sphere and within firms.

We detail marketing and management practices in the case of the largest science publisher in the world: Elsevier. On the one hand, we examine evolving forms of green marketing. On the other hand, building on recent calls for political ecologies of labor, we highlight the proliferation of ‘greenwashing rituals’ within the firm – i.e., performative, management-sponsored dialogues and actions regarding climate change.

We suggest that researchers continue to expand frameworks for critiquing the fossil fuel industry to include auxiliary industries such as corporate publishing.

DOI : https://doi.org/10.2458/jpe.6276

Evaluation of Faculty Knowledge of Predatory Journals in the United States: A Cross-Institutional Survey

Authors : Nicole R. Webber, Stephanie Wiegand, Jason A. Cohen, John M. Reynolds, Lisa Ancelet, Arlene V. Salazar

Predatory journals are a known hazard in modern academic research publishing, with research and anecdotal accounts indicating that they exploit inexperienced researchers. Most literature on the topic centres on specific disciplines and/or countries deemed ‘more vulnerable’ to publishing scams.

At the time of publication, no studies have examined a full range of disciplines at institutions across the United States. Our research collected responses from 1098 faculty at 17 US doctoral universities using a multi-disciplinary survey to assess self-reported knowledge and awareness of predatory publishing.

In this analysis, we investigated participants’ reported knowledge levels of predatory journals in relation to four aspects: academic discipline, years employed in academic research, number of articles published, and early career researcher status.

We conclude that the relationship between experience and knowledge of predatory publishing depends on the definition of experience employed, and that the number of recent articles published by a faculty member is a more reliable indicator of knowledge about predatory publishing than the other measures of experience investigated.

DOI : https://doi.org/10.1002/leap.2020

Does ChatGPT Ignore Article Retractions and Other Reliability Concerns?

Authors : Mike Thelwall, Marianna Lehtisaari, Irini Katsirea, Kim Holmberg, Er-Te Zheng

Large language models (LLMs) like ChatGPT seem to be increasingly used for information seeking and analysis, including to support academic literature reviews. To test whether the results might sometimes include retracted research, we identified 217 retracted or otherwise concerning academic studies with high altmetric scores and asked ChatGPT 4o-mini to evaluate their quality 30 times each.

Surprisingly, none of its 6510 reports mentioned that the articles were retracted or had relevant errors, and it gave 190 relatively high scores (world leading, internationally excellent, or close). The 27 articles with the lowest scores were mostly accused of being weak, although the topic (but not the article) was described as controversial in five cases (e.g., about hydroxychloroquine for COVID-19).

In a follow-up investigation, 61 claims were extracted from retracted articles from the set, and ChatGPT 4o-mini was asked 10 times whether each was true. It gave a definitive yes or a positive response two-thirds of the time, including for at least one statement that had been shown to be false over a decade ago.
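The figures quoted in the two paragraphs above fit together, as the following minimal sketch shows. All numbers are taken from the abstract; the helper name `total_queries` and the derived values (share of high-scoring articles, number of follow-up queries) are illustrative calculations, not results reported by the authors.

```python
# Figures quoted in the abstract.
ARTICLES = 217          # retracted or otherwise concerning studies
RUNS_PER_ARTICLE = 30   # quality evaluations requested per study
HIGH_SCORING = 190      # articles given relatively high scores
LOW_SCORING = 27        # articles with the lowest scores
CLAIMS = 61             # claims extracted for the follow-up
RUNS_PER_CLAIM = 10     # times each claim was checked


def total_queries(items, repeats):
    """Number of model responses when each item is queried `repeats` times."""
    return items * repeats


reports = total_queries(ARTICLES, RUNS_PER_ARTICLE)
claim_queries = total_queries(CLAIMS, RUNS_PER_CLAIM)
high_share = HIGH_SCORING / ARTICLES

print(f"evaluation reports: {reports}")
print(f"high-scoring share of articles: {high_share:.1%}")
print(f"follow-up claim queries: {claim_queries}")
```

The report count matches the 6510 stated in the abstract, and the high- and low-scoring groups together account for all 217 articles.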

The results therefore emphasise, from an academic knowledge perspective, the importance of verifying information from LLMs when using them for information seeking or analysis.

DOI : https://doi.org/10.1002/leap.2018

Research data management services in academic libraries to support the research data life cycle: A systematic review

Authors : Richard Cheng Yong Ho, Suei Nee Wong, Patsy Chia, Chris Tang, Magdeline Tao Tao Ng

Academic libraries play an increasingly crucial role in providing services, information, education, and infrastructure support related to research data management (RDM). This systematic review aims to provide a comprehensive and critical analysis of the state of RDM services offered by academic libraries worldwide.

Using systematic review methodology, the paper examines 89 empirical studies to answer four research questions: (1) what types of RDM services academic libraries have implemented; (2) what infrastructure, workflows, and resources support these services; (3) why these RDM services were implemented; and (4) how effective these services are, if at all, in supporting the research data life cycle.

This review highlights the critical reasons academic libraries provide RDM services and how they have implemented these services through partnerships, infrastructure, and systems, and by adapting to new workflows within the library.

The findings also address the balance between institutional contexts, researchers’ needs, and the library resources required to provide these RDM services. By investigating these questions, the review offers recommendations and guidance for academic libraries interested in implementing RDM services in their own library and institutional contexts.

DOI : https://doi.org/10.1002/asi.70008