Assessment of transparency indicators across the biomedical literature: How open is open?

Authors : Stylianos Serghiou, Despina G. Contopoulos-Ioannidis, Kevin W. Boyack, Nico Riedel, Joshua D. Wallach, John P. A. Ioannidis

Recent concerns about the reproducibility of science have led to several calls for more open and transparent research practices and for the monitoring of potential improvements over time. However, with tens of thousands of new biomedical articles published per week, manually mapping and monitoring changes in transparency is unrealistic.

We present an open-source, automated approach to identify 5 indicators of transparency (data sharing, code sharing, conflicts of interest disclosures, funding disclosures, and protocol registration) and apply it across the entire open access biomedical literature of 2.75 million articles on PubMed Central (PMC).
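
For readers curious about how such automated screening can work in practice, the sketch below shows a simple keyword/regular-expression pass over article full text. It is a minimal illustration only; the pattern names and rules here are assumptions for this example and are far cruder than the validated, open-source pipeline the authors describe.

import re

# Illustrative regular-expression heuristics for flagging transparency
# indicators in article full text. These simplified patterns are assumptions
# for this sketch, not the study's validated detection rules.
INDICATOR_PATTERNS = {
    "data_sharing": r"data (are|is) (freely )?available|data availability statement",
    "code_sharing": r"(source )?code (is|are) available|github\.com",
    "coi_disclosure": r"conflicts? of interest|competing interests?",
    "funding_disclosure": r"(supported|funded) by|funding statement",
    "registration": r"clinicaltrials\.gov|prospero|pre-?registered",
}

def flag_indicators(full_text):
    """Return a True/False flag for each transparency indicator."""
    text = full_text.lower()
    return {name: bool(re.search(pattern, text))
            for name, pattern in INDICATOR_PATTERNS.items()}

# Example: screen one made-up article snippet.
snippet = "Data are freely available at ... The authors declare no conflicts of interest."
print(flag_indicators(snippet))

Applying such rules to 2.75 million PMC articles is then largely a matter of batch text retrieval and bookkeeping.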

Our results indicate remarkable improvements in some (e.g., conflict of interest [COI] disclosures and funding disclosures), but not other (e.g., protocol registration and code sharing) areas of transparency over time, and map transparency across fields of science, countries, journals, and publishers.

This work has enabled the creation of a large, integrated, and openly available database to expedite further efforts to monitor, understand, and promote transparency and reproducibility in science.

DOI : https://doi.org/10.1371/journal.pbio.3001107

Attracting new users or business as usual? A case study of converting academic subscription-based journals to open access

Author : Lars Wenaas

This paper studies a selection of 11 Norwegian journals in the humanities and social sciences and their conversion from subscription to open access, a move heavily incentivized by governmental mandates and open access policies.

By investigating the journals’ visiting logs in the period 2014–2019, the study finds that conversion to open access leads to higher visit numbers; all journals in the study had a significant increase, which can be attributed to the conversion.

Converting a journal produced no spillover of increased visits to previously published articles that remained behind the paywall in the same journals. Visits from previously subscribing Norwegian higher education institutions did not account for the increase, indicating that it must come from visitors in other sectors.

The results could be relevant for policymakers concerning the effects of strict policies targeting economically vulnerable national journals, and could further inform journal owners and editors on the effects of converting to open access.

DOI : https://doi.org/10.1162/qss_a_00126

Publication rate and citation counts for preprints released during the COVID-19 pandemic: the good, the bad and the ugly

Authors : Diego Añazco, Bryan Nicolalde, Isabel Espinosa, Jose Camacho, Mariam Mushtaq, Jimena Gimenez, Enrique Teran

Background

Preprints are preliminary reports that have not been peer-reviewed. In December 2019, a novel coronavirus appeared in China, and since then, scientific production, including preprints, has drastically increased. In this study, we intend to evaluate how often preprints about COVID-19 were published in scholarly journals and cited.

Methods

We searched the iSearch COVID-19 portfolio to identify all preprints related to COVID-19 posted on bioRxiv, medRxiv, and Research Square from January 1, 2020, to May 31, 2020. We used a custom-designed program to obtain metadata using the Crossref public API.
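
As a minimal sketch of the metadata step (not the authors’ custom program, and with the selected fields being assumptions about what they collected), a DOI can be looked up against the Crossref public REST API like this:

import requests

def crossref_metadata(doi):
    """Fetch basic Crossref metadata for a DOI (title, venue, citation count)."""
    resp = requests.get(f"https://api.crossref.org/works/{doi}", timeout=30)
    resp.raise_for_status()
    msg = resp.json()["message"]
    return {
        "title": (msg.get("title") or [""])[0],
        "venue": (msg.get("container-title") or [""])[0],   # journal, if published
        "citations": msg.get("is-referenced-by-count", 0),  # Crossref citation count
        "created": msg["created"]["date-time"],             # record creation date
    }

# Example lookup for this article's own DOI.
print(crossref_metadata("10.7717/peerj.10927"))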

After that, we determined the publication rate and made comparisons based on citation counts using non-parametric methods. Also, we compared the publication rate, citation counts, and time interval from posting on a preprint server to publication in a scholarly journal among the three different preprint servers.
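
A minimal sketch of this comparison step, assuming (purely for illustration) a two-sided Mann-Whitney U test on hypothetical citation counts; the abstract states only that non-parametric methods were used:

from scipy.stats import mannwhitneyu

# Hypothetical citation counts for published vs. unpublished preprints.
published = [12, 30, 7, 55, 21, 9]
unpublished = [1, 0, 4, 2, 6, 3]

stat, p_value = mannwhitneyu(published, unpublished, alternative="two-sided")
print(f"Mann-Whitney U = {stat:.1f}, p = {p_value:.4f}")

# Publication rate as reported below: 288 published out of 5,061 preprints.
print(f"Publication rate: {288 / 5061:.1%}")  # ~5.7%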

Results

Our sample included 5,061 preprints, out of which 288 were published in scholarly journals and 4,773 remained unpublished (publication rate of 5.7%). We found that articles published in scholarly journals had a significantly higher total citation count than unpublished preprints within our sample (p < 0.001), and that preprints that were eventually published had a higher citation count as preprints when compared to unpublished preprints (p < 0.001).

We also found that published preprints received significantly more citations after publication in a scholarly journal than they did as preprints (p < 0.001). Our results also show that medRxiv had the highest publication rate, while bioRxiv had the highest citation count and the shortest time interval from posting on a preprint server to publication in a scholarly journal.

Conclusions

We found a remarkably low publication rate for preprints within our sample, despite accelerated time to publication by multiple scholarly journals. These findings could be partially attributed to the unprecedented surge in scientific production observed during the COVID-19 pandemic, which might saturate reviewing and editing processes in scholarly journals.

However, our findings show that preprints had a significantly lower scientific impact than published articles, which might suggest that some preprints are of lower quality and will not survive peer review to reach journal publication.

DOI : https://doi.org/10.7717/peerj.10927

What Constitutes Authorship in the Social Sciences?

Author : Gernot Pruschak

Authorship is a much-discussed topic in today’s academia. The share of co-authored papers has increased substantially in recent years, allowing scientists to specialize and focus on specific tasks.

Building on this, the social science literature has discussed author order and the distribution of publication and citation credit among co-authors in depth. Yet only a small fraction of the authorship literature has addressed the underlying question of what actually constitutes authorship.

To identify social scientists’ motives for assigning authorship, we conduct an empirical study surveying researchers around the globe. We find that social scientists tend to distribute research tasks among (individual) research team members. Nevertheless, they generally adhere to the universally applicable Vancouver criteria when distributing authorship.

More specifically, participation in every research task, with the exceptions of data work and reviewing and remarking, increases scholars’ chances of receiving authorship. Based on our results, we advise journal editors to introduce authorship guidelines that incorporate the Vancouver criteria, as they seem applicable to the social sciences.

We further call upon research institutions to emphasize data skills in hiring and promotion processes, as publication counts might not always reflect these skills.

DOI : https://doi.org/10.3389/frma.2021.655350

A survey of researchers’ needs and priorities for data sharing

Authors : Iain Hrynaszkiewicz, James Harney, Lauren Cadwallader

PLOS has long supported Open Science. One of the ways in which we do so is via our stringent data availability policy established in 2014. Despite this policy, and the introduction of more data sharing policies by other organizations, best practices for data sharing are adopted by only a minority of researchers in their publications. Problems with effective research data sharing persist; previous research has attributed them to a lack of time, resources, incentives, and/or skills to share data.

In this study we built on this research by investigating the importance of tasks associated with data sharing, and researchers’ satisfaction with their ability to complete these tasks. By investigating these factors we aimed to better understand opportunities for new or improved solutions for sharing data.

In May-June 2020 we surveyed researchers from Europe and North America, asking them to rate tasks associated with data sharing on (i) their importance and (ii) their satisfaction with their ability to complete them. We received 728 completed and 667 partial responses. We calculated mean importance and satisfaction scores to highlight potential opportunities for new solutions and to compare different cohorts.
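
As an illustration of the scoring step (a sketch with made-up column names and ratings, not the authors’ analysis code), mean importance and satisfaction per task and cohort can be computed as follows:

import pandas as pd

# Hypothetical long-format survey data: one row per respondent per task,
# with importance and satisfaction rated on a Likert-type scale.
responses = pd.DataFrame({
    "task":         ["share data", "share data", "reuse data", "reuse data"],
    "cohort":       ["Europe", "North America", "Europe", "North America"],
    "importance":   [4, 5, 5, 3],
    "satisfaction": [4, 4, 2, 3],
})

# High importance combined with low satisfaction flags a potential
# opportunity for new or improved solutions.
print(responses.groupby("task")[["importance", "satisfaction"]].mean())
print(responses.groupby(["task", "cohort"])[["importance", "satisfaction"]].mean())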

Tasks relating to research impact, funder compliance, and credit had the highest importance scores. 52% of respondents reuse research data, but the average satisfaction score for obtaining data for reuse was relatively low. Tasks associated with sharing data were rated somewhat important, and respondents were reasonably satisfied with their ability to accomplish them. Notably, this included tasks associated with best data sharing practice, such as use of data repositories. However, the most common method for sharing data was in fact via supplemental files with articles, which is not considered best practice.

We presume that researchers are unlikely to seek new solutions to a problem or task that they are satisfied with their ability to accomplish, even if many do not attempt it. This implies there are few opportunities for new solutions or tools to meet these researcher needs. Publishers can likely meet these needs for data sharing by working to seamlessly integrate existing solutions that reduce the effort or behaviour change involved in some tasks, and by focusing on advocacy and education around the benefits of sharing data.

There may however be opportunities – unmet researcher needs – in relation to better supporting data reuse, which could be met in part by strengthening data sharing policies of journals and publishers, and improving the discoverability of data associated with published articles.

DOI : https://doi.org/10.31219/osf.io/njr5u

Is preprint the future of science? A thirty year journey of online preprint services

Authors : Boya Xie, Zhihong Shen, Kuansan Wang

A preprint is a version of a scientific paper that is publicly distributed before formal peer review. Since the launch of arXiv in 1991, preprints have increasingly been distributed over the Internet rather than as paper copies.

Preprints provide open online access to original research within a few days, often at very low operating cost. This work reviews how preprints have evolved and influenced the research community over the past thirty years alongside the growth of the Web.

In this work, we first report that the number of preprints has increased 63-fold in 30 years, although preprints still account for only 4% of research articles. Second, we quantify the benefits that preprints bring to authors: preprints reach an audience 14 months earlier on average and are associated with five times more citations than their non-preprint counterparts. Last, to address quality concerns about preprints, we find that 41% of preprints are ultimately published in a peer-reviewed venue, and that these venues are as influential as those of papers without a preprint version.
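
As a back-of-the-envelope illustration (a calculation added here, not a figure reported in the paper), a 63-fold increase over 30 years corresponds to an average compound growth rate of roughly

63^{1/30} = e^{(\ln 63)/30} \approx e^{0.138} \approx 1.15,

that is, about 15% more preprints each year.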

Additionally, we discuss the unprecedented role of preprints in communicating the latest research data during recent public health emergencies. In conclusion, we provide quantitative evidence to unveil the positive impact of preprints on individual researchers and the community.

Preprints make scholarly communication more efficient by disseminating scientific discoveries more rapidly and widely with the aid of Web technologies. The measurements we present in this study can help researchers and policymakers make informed decisions about how to effectively use and responsibly embrace a preprint culture.

URL : https://arxiv.org/abs/2102.09066

Linguistic Analysis of the bioRxiv Preprint Landscape

Authors : David N. Nicholson, Vincent Rubinetti, Dongbo Hu, Marvin Thielk, Lawrence E. Hunter, Casey S. Greene

Preprints allow researchers to make their findings available to the scientific community before they have undergone peer review. Studies on preprints within bioRxiv have been largely focused on article metadata and how often these preprints are downloaded, cited, published, and discussed online.

An element that has yet to be examined is the language contained within the bioRxiv preprint repository. We sought to compare and contrast linguistic features of bioRxiv preprints with published biomedical text as a whole, as this offers an excellent opportunity to examine how peer review changes these documents.

The most prevalent features that changed appear to be associated with typesetting and mentions of supplementary sections or additional files. In addition to text comparison, we created document embeddings derived from a preprint-trained word2vec model.

We found that these embeddings are able to parse out different scientific approaches and concepts, link unannotated preprint and peer-reviewed article pairs, and identify journals that publish papers linguistically similar to a given preprint.
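
A minimal sketch of the general technique (averaging word vectors into document embeddings and ranking journals by cosine similarity; the toy corpus, parameters, and journal names are assumptions, not the authors’ preprint-trained model):

import numpy as np
from gensim.models import Word2Vec

# Toy corpus of tokenized documents (in practice: preprint full texts).
corpus = [
    "we sequenced the genome of the novel virus".split(),
    "neural network models of cortical activity".split(),
    "randomized trial of a new vaccine candidate".split(),
]

# Train a small word2vec model; the study used a model trained on preprints.
model = Word2Vec(corpus, vector_size=50, min_count=1, epochs=50, seed=1)

def doc_embedding(tokens):
    """Average the word vectors of in-vocabulary tokens."""
    vectors = [model.wv[t] for t in tokens if t in model.wv]
    return np.mean(vectors, axis=0)

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical per-journal centroids built from documents they published.
journal_centroids = {
    "J. Genomics": doc_embedding(corpus[0]),
    "J. Neuroscience": doc_embedding(corpus[1]),
    "J. Vaccines": doc_embedding(corpus[2]),
}

preprint = doc_embedding("genome sequencing of a novel coronavirus".split())
ranked = sorted(journal_centroids.items(),
                key=lambda kv: cosine(preprint, kv[1]), reverse=True)
print(ranked[0][0])  # most linguistically similar journal in this toy example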

We also used these embeddings to examine factors associated with the time elapsed between the posting of a first preprint and the appearance of a peer reviewed publication. We found that preprints with more versions posted and more textual changes took longer to publish.

Lastly, we constructed a web application (https://greenelab.github.io/preprint-similarity-search/) that allows users to identify which journals and articles are most linguistically similar to a bioRxiv or medRxiv preprint, as well as to observe where the preprint would be positioned within the landscape of published articles.

DOI : https://doi.org/10.1101/2021.03.04.433874