Publication rate and citation counts for preprints released during the COVID-19 pandemic: the good, the bad and the ugly

Authors : Diego Añazco, Bryan Nicolalde, Isabel Espinosa, Jose Camacho, Mariam Mushtaq, Jimena Gimenez, Enrique Teran

Background

Preprints are preliminary reports that have not been peer-reviewed. In December 2019, a novel coronavirus appeared in China, and since then, scientific production, including preprints, has drastically increased. In this study, we intend to evaluate how often preprints about COVID-19 were published in scholarly journals and cited.

Methods

We searched the iSearch COVID-19 portfolio to identify all preprints related to COVID-19 posted on bioRxiv, medRxiv, and Research Square from January 1, 2020, to May 31, 2020. We used a custom-designed program to obtain metadata using the Crossref public API.
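As a rough illustration of the metadata step, the sketch below queries the Crossref public REST API for one DOI and pulls out the citation count and any linked journal version. It is not the authors' program. The helper names are hypothetical, the DOIs in the sample record are placeholders, and the fields read here (`is-referenced-by-count` and `relation.is-preprint-of`) are assumed to be the relevant ones for this kind of analysis.

```python
import json
import urllib.parse
import urllib.request

CROSSREF_WORKS = "https://api.crossref.org/works/"  # Crossref public REST API


def fetch_work(doi):
    """Fetch the Crossref record for one DOI (needs network access)."""
    url = CROSSREF_WORKS + urllib.parse.quote(doi)
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.load(resp)["message"]


def summarize_work(message):
    """Extract the citation count and linked journal DOIs from a works record."""
    # Preprint records may carry an "is-preprint-of" relation to the journal article.
    relations = message.get("relation", {}).get("is-preprint-of", [])
    published = [r["id"] for r in relations if r.get("id-type") == "doi"]
    return {
        "doi": message.get("DOI"),
        "citations": message.get("is-referenced-by-count", 0),
        "published_as": published,
    }


# Hand-written record in the shape Crossref returns, so the parsing step can be
# run offline. The DOIs and the count are placeholders, not real data.
sample = {
    "DOI": "10.0000/preprint.example",
    "is-referenced-by-count": 12,
    "relation": {
        "is-preprint-of": [
            {"id-type": "doi", "id": "10.0000/journal.example", "asserted-by": "subject"}
        ]
    },
}

summary = summarize_work(sample)
print(summary)
```

With network access, `summarize_work(fetch_work(doi))` would apply the same extraction to a live record; a preprint with no linked journal article simply yields an empty `published_as` list.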

After that, we determined the publication rate and made comparisons based on citation counts using non-parametric methods. Also, we compared the publication rate, citation counts, and time interval from posting on a preprint server to publication in a scholarly journal among the three different preprint servers.

Results

Our sample included 5,061 preprints, out of which 288 were published in scholarly journals and 4,773 remained unpublished (publication rate of 5.7%). We found that articles published in scholarly journals had a significantly higher total citation count than unpublished preprints within our sample (p < 0.001), and that preprints that were eventually published had a higher citation count as preprints when compared to unpublished preprints (p < 0.001).
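The reported publication rate can be checked directly from the counts given above. The short sketch below only redoes that arithmetic; the variable names are illustrative.

```python
# Counts as reported in the abstract.
total_preprints = 5061
published = 288
unpublished = 4773

# The two groups should account for the whole sample.
assert published + unpublished == total_preprints

# Publication rate as a percentage, rounded to one decimal place.
publication_rate_pct = round(100 * published / total_preprints, 1)
print(f"Publication rate: {publication_rate_pct}%")
```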

We also found that published preprints had a significantly higher citation count after publication in a scholarly journal than they had as preprints (p < 0.001). Our results also show that medRxiv had the highest publication rate, while bioRxiv had the highest citation count and the shortest time interval from posting on a preprint server to publication in a scholarly journal.

Conclusions

We found a remarkably low publication rate for preprints within our sample, despite accelerated time to publication by multiple scholarly journals. These findings could be partially attributed to the unprecedented surge in scientific production observed during the COVID-19 pandemic, which might saturate reviewing and editing processes in scholarly journals.

However, our findings show that preprints had a significantly lower scientific impact, which might suggest that some preprints have lower quality and will not be able to endure peer-reviewing processes to be published in a peer-reviewed journal.

DOI : https://doi.org/10.7717/peerj.10927

What Constitutes Authorship in the Social Sciences?

Author : Gernot Pruschak

Authorship is a widely discussed topic in today's academia. The share of co-authored papers has increased substantially in recent years, allowing scientists to specialize and focus on specific tasks.

As a result, the social scientific literature has discussed in depth the ordering of authors and the distribution of publication and citation credit among co-authors. Yet only a small fraction of the authorship literature has addressed the underlying question of what actually constitutes authorship.

To identify social scientists’ motives for assigning authorship, we conduct an empirical study surveying researchers around the globe. We find that social scientists tend to distribute research tasks among (individual) research team members. Nevertheless, they generally adhere to the universally applicable Vancouver criteria when distributing authorship.

More specifically, participation in every research task with the exceptions of data work as well as reviewing and remarking increases scholars’ chances to receive authorship. Based on our results, we advise journal editors to introduce authorship guidelines that incorporate the Vancouver criteria as they seem applicable to the social sciences.

We further call upon research institutions to emphasize data skills in hiring and promotion processes as publication counts might not always depict these characteristics.

DOI : https://doi.org/10.3389/frma.2021.655350

A survey of researchers’ needs and priorities for data sharing

Authors : Iain Hrynaszkiewicz, James Harney, Lauren Cadwallader

PLOS has long supported Open Science. One of the ways in which we do so is via our stringent data availability policy established in 2014. Despite this policy, and more data sharing policies being introduced by other organizations, best practices for data sharing are adopted by a minority of researchers in their publications. Problems with effective research data sharing persist and these problems have been quantified by previous research as a lack of time, resources, incentives, and/or skills to share data.

In this study we built on this research by investigating the importance of tasks associated with data sharing, and researchers’ satisfaction with their ability to complete these tasks. By investigating these factors we aimed to better understand opportunities for new or improved solutions for sharing data.

In May-June 2020 we surveyed researchers from Europe and North America, asking them to rate tasks associated with data sharing on (i) their importance and (ii) their satisfaction with their ability to complete them. We received 728 completed and 667 partial responses. We calculated mean importance and satisfaction scores to highlight potential opportunities for new solutions and to compare different cohorts.

Tasks relating to research impact, funder compliance, and credit had the highest importance scores. 52% of respondents reuse research data but the average satisfaction score for obtaining data for reuse was relatively low. Tasks associated with sharing data were rated somewhat important and respondents were reasonably well satisfied in their ability to accomplish them. Notably, this included tasks associated with best data sharing practice, such as use of data repositories. However, the most common method for sharing data was in fact via supplemental files with articles, which is not considered to be best practice.

We presume that researchers are unlikely to seek new solutions to a problem or task that they are satisfied in their ability to accomplish, even if many do not attempt this task. This implies there are few opportunities for new solutions or tools to meet these researcher needs. Publishers can likely meet these needs for data sharing by working to seamlessly integrate existing solutions that reduce the effort or behaviour change involved in some tasks, and focusing on advocacy and education around the benefits of sharing data.

There may however be opportunities – unmet researcher needs – in relation to better supporting data reuse, which could be met in part by strengthening data sharing policies of journals and publishers, and improving the discoverability of data associated with published articles.

DOI : https://doi.org/10.31219/osf.io/njr5u

Is preprint the future of science? A thirty year journey of online preprint services

Authors : Boya Xie, Zhihong Shen, Kuansan Wang

A preprint is a version of a scientific paper that is publicly distributed before formal peer review. Since the launch of arXiv in 1991, preprints have been increasingly distributed over the Internet as opposed to paper copies.

Preprints allow original research to be disseminated through open online access within a few days, often at a very low operating cost. This work gives an overview of how preprints have evolved and influenced the research community over the past thirty years alongside the growth of the Web.

In this work, we first report that the number of preprints has grown exponentially, increasing 63-fold in 30 years, although preprints still account for only 4% of research articles. Second, we quantify the benefits that preprints bring to authors: preprints reach an audience 14 months earlier on average and are associated with five times more citations than their non-preprint counterparts. Last, to address concerns about the quality of preprints, we find that 41% of preprints are ultimately published at a peer-reviewed destination, and that the venues in which they are published are as influential as those of papers without a preprint version.

Additionally, we discuss the unprecedented role of preprints in communicating the latest research data during recent public health emergencies. In conclusion, we provide quantitative evidence to unveil the positive impact of preprints on individual researchers and the community.

Preprints make scholarly communication more efficient by disseminating scientific discoveries more rapidly and widely with the aid of Web technologies. The measurements we present in this study can help researchers and policymakers make informed decisions about how to effectively use and responsibly embrace a preprint culture.

URL : https://arxiv.org/abs/2102.09066

Linguistic Analysis of the bioRxiv Preprint Landscape

Authors : David N. Nicholson, Vincent Rubinetti, Dongbo Hu, Marvin Thielk, Lawrence E. Hunter, Casey S. Greene

Preprints allow researchers to make their findings available to the scientific community before they have undergone peer review. Studies on preprints within bioRxiv have been largely focused on article metadata and how often these preprints are downloaded, cited, published, and discussed online.

A missing element that has yet to be examined is the language contained within the bioRxiv preprint repository. We sought to compare linguistic features of bioRxiv preprints with those of published biomedical text as a whole, since this offers an excellent opportunity to examine how peer review changes these documents.

The most prevalent features that changed appear to be associated with typesetting and mentions of supplementary sections or additional files. In addition to text comparison, we created document embeddings derived from a preprint-trained word2vec model.

We found that these embeddings are able to parse out different scientific approaches and concepts, link unannotated preprint-peer reviewed article pairs, and identify journals that publish linguistically similar papers to a given preprint.
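To make the embedding idea concrete, here is a minimal sketch, assuming that a document embedding is the average of its word vectors and that similarity is measured by cosine similarity; the abstract does not spell out either choice. The tiny three-dimensional vectors and the document names are made-up stand-ins for a trained word2vec model and real papers.

```python
import math

# Toy word vectors standing in for a trained word2vec model (made-up values).
toy_vectors = {
    "gene": [0.9, 0.1, 0.0],
    "expression": [0.8, 0.2, 0.1],
    "neuron": [0.1, 0.9, 0.2],
    "imaging": [0.0, 0.8, 0.3],
}


def document_embedding(tokens, vectors):
    """Average the vectors of all tokens that the model knows."""
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        raise ValueError("no token has a vector")
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# One hypothetical preprint and two hypothetical published papers.
preprint = document_embedding(["gene", "expression", "unseenword"], toy_vectors)
candidates = {
    "genomics_paper": document_embedding(["gene", "expression"], toy_vectors),
    "neuroscience_paper": document_embedding(["neuron", "imaging"], toy_vectors),
}

# Pick the candidate whose embedding is closest to the preprint.
closest = max(candidates, key=lambda name: cosine_similarity(preprint, candidates[name]))
print(closest)
```

Ranking candidate papers or journals by this similarity is one simple way such embeddings could be used to link a preprint to its published version or to suggest linguistically similar venues.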

We also used these embeddings to examine factors associated with the time elapsed between the posting of a first preprint and the appearance of a peer reviewed publication. We found that preprints with more versions posted and more textual changes took longer to publish.

Lastly, we constructed a web application (https://greenelab.github.io/preprint-similarity-search/) that allows users to identify which journals and articles are most linguistically similar to a bioRxiv or medRxiv preprint, as well as to observe where the preprint would be positioned within a published article landscape.

DOI : https://doi.org/10.1101/2021.03.04.433874

Publication practices during the COVID-19 pandemic: Biomedical preprints and peer-reviewed literature

Authors : Yulia V. Sevryugina, Andrew J. Dicks

The coronavirus pandemic introduced many changes to our society and deeply affected established publication practices in the biomedical sciences. In this article, we present a comprehensive study of the changes in the scholarly publication landscape for biomedical sciences during the COVID-19 pandemic, with special emphasis on preprints posted on the bioRxiv and medRxiv servers.

We observe the emergence of a new category of preprint authors working in the fields of immunology, microbiology, infectious diseases, and epidemiology, who extensively used preprint platforms during the pandemic for sharing their immediate findings. The majority of these findings were works in progress that were unfit for prompt acceptance by refereed journals.

The COVID-19 preprints that became peer-reviewed journal articles were often submitted to journals concurrently with their posting on a preprint server, and the entire publication cycle, from preprint to online journal article, took on average 63 days. This included an expedited peer-review process of 43 days and a journal production stage of 15 days; however, there was wide variation in publication delays between journals. Only one third of COVID-19 preprints posted during the first nine months of the pandemic appeared as peer-reviewed journal articles.

These journal articles display high Altmetric Attention Scores, further emphasizing the significance of COVID-19 research during 2020. This article will be relevant to editors, publishers, open science enthusiasts, and anyone interested in the changes that the 2020 crisis brought to publication practices and to the culture of preprints in the life sciences.

DOI : https://doi.org/10.1101/2021.01.21.427563

Communicating Scientific Uncertainty in an Age of COVID-19: An Investigation into the Use of Preprints by Digital Media Outlets

Authors : Alice Fleerackers, Michelle Riedlinger, Laura Moorhead, Rukhsana Ahmed, Juan Pablo Alperin

In this article, we investigate the surge in use of COVID-19-related preprints by media outlets. Journalists are a main source of reliable public health information during crises and, until recently, journalists have been reluctant to cover preprints because of the associated scientific uncertainty.

Yet, uploads of COVID-19 preprints and their uptake by online media have outstripped that of preprints about any other topic. Using an innovative approach combining altmetrics methods with content analysis, we identified a diversity of outlets covering COVID-19-related preprints during the early months of the pandemic, including specialist medical news outlets, traditional news media outlets, and aggregators.

We found a ubiquity of hyperlinks as citations and a multiplicity of framing devices for highlighting the scientific uncertainty associated with COVID-19 preprints. These devices were rarely used consistently (e.g., mentioning that the study was a preprint, unreviewed, preliminary, and/or in need of verification).

About half of the stories we analyzed contained framing devices emphasizing uncertainty. Outlets in our sample were much less likely to identify the research they mentioned as preprint research, compared to identifying it as simply “research.” This work has significant implications for public health communication within the changing media landscape.

While current best practices in public health risk communication promote identifying and promoting trustworthy sources of information, the uptake of preprint research by online media presents new challenges.

At the same time, it provides new opportunities for fostering greater awareness of the scientific uncertainty associated with health research findings.

DOI : https://doi.org/10.1080/10410236.2020.1864892