Preprint citation practice in PLOS

Authors : Marc Bertin, Iana Atanassova

The role of preprints in scientific production and their share of citations have been growing over the past 10 years. In this paper we study several aspects of preprint citations: their progression over time, their relative frequencies in relation to the IMRaD structure of articles, and their distributions over time, per preprint database and per PLOS journal.

We have processed the PLOS corpus, which covers 7 journals and a total of about 240,000 articles up to January 2021, and produced a dataset of 8,460 preprint citation contexts citing 12 different preprint databases.

Our results show that preprint citations are found with the highest frequency in the Methods section of articles, though small variations exist between journals. The journal PLOS Computational Biology stands out, as it contains more than three times as many preprint citations as any other PLOS journal.

The relative shares of the different preprint databases are also examined. While arXiv and bioRxiv are the most frequent citation sources, bioRxiv's disciplinary nature is apparent: it is the source of more than 70% of preprint citations in PLOS Biology, PLOS Genetics and PLOS Pathogens.

We have also compared the lexical content of preprint citation contexts with that of citation contexts referring to peer-reviewed publications. A lexicometric analysis shows that the two differ significantly.

This confirms that authors use different lexical content when citing preprints compared to other citations.
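
As an illustration of what such a lexicometric comparison can look like (a minimal sketch with placeholder citation contexts, not the authors' code or data), Dunning's log-likelihood ratio is one common keyness measure for contrasting the vocabulary of two corpora:

```python
# Illustrative sketch (not the authors' method): contrast the vocabulary of two
# sets of citation contexts with Dunning's log-likelihood ratio (G2).
# The corpora below are placeholders.
import math
import re
from collections import Counter

def tokenize(texts):
    return Counter(t for txt in texts for t in re.findall(r"[a-z]+", txt.lower()))

def log_likelihood(a, b, total_a, total_b):
    """Dunning G2 for one term with counts a, b in corpora of sizes total_a, total_b."""
    e_a = total_a * (a + b) / (total_a + total_b)
    e_b = total_b * (a + b) / (total_a + total_b)
    g2 = 0.0
    if a > 0:
        g2 += 2 * a * math.log(a / e_a)
    if b > 0:
        g2 += 2 * b * math.log(b / e_b)
    return g2

# Placeholder corpora standing in for the two sets of citation contexts.
preprint_contexts = ["a recent preprint reported similar estimates",
                     "see the unreviewed preprint for details"]
reviewed_contexts = ["as shown in a peer reviewed study",
                     "results were confirmed in a published trial"]

freq_pre = tokenize(preprint_contexts)
freq_rev = tokenize(reviewed_contexts)
n_pre, n_rev = sum(freq_pre.values()), sum(freq_rev.values())

scores = {w: log_likelihood(freq_pre[w], freq_rev[w], n_pre, n_rev)
          for w in set(freq_pre) | set(freq_rev)}
for word, g2 in sorted(scores.items(), key=lambda kv: -kv[1])[:5]:
    print(f"{word}: G2 = {g2:.2f}")
```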

URL : Preprint citation practice in PLOS

DOI : https://doi.org/10.1007/s11192-022-04388-5

The influence of funding on the Open Access citation advantage

Authors : Pablo Dorta-González, María Isabel Dorta-González

Some of the citation advantage of open access is likely due to the fact that greater access allows more people to read, and hence cite, articles they otherwise would not. However, causation is difficult to establish and there are many possible biases. Several factors can affect the observed differences in citation rates.

Funder mandates can be one of them: funders are likely to have OA requirements, and well-funded studies are more likely to receive more citations than poorly funded studies. In this paper, this hypothesis is tested. We study the effect of funding on publication modality and on the citations received by more than 128,000 research articles, of which 31% were funded.

The articles come from 40 randomly selected subject categories in the year 2016, with citations counted over the period 2016–2020 in the Scopus database. We found that open articles published in hybrid journals were cited considerably more than those in open access journals.

Thus, articles under the hybrid gold modality are cited on average twice as much as those in the gold modality, regardless of funding, so this evidence is strong. Moreover, within the same publication modality, we found that funded articles generally obtain 50% more citations than unfunded ones.

The most cited modality is hybrid gold and the least cited is gold, which falls well below even the paywalled modality. Furthermore, the use of open access repositories considerably increases the citations received, especially for articles without funding. Thus, articles in open access repositories (green) are cited 50% more than paywalled ones.

This evidence is remarkable and does not depend on funding. Excluding the gold modality, there is a citation advantage in more than 75% of cases, and it is considerably greater among unfunded articles. This result is robust both across fields and over time.

URL : https://arxiv.org/abs/2202.02082v1

Are Conference Posters Being Cited?

Authors : Nick Haupka, Cäcilia Schröer, Christian Hauschke

We present a small case study on citations of conference posters, using poster collections from both Figshare and Zenodo. The study covers the years 2016–2020, according to the dates of publication on the platforms.

Citation data were taken from DataCite, Crossref and Dimensions. Primarily, we want to know to what extent scientific posters are being cited and, thereby, what impact posters potentially have on the scholarly landscape, especially on academic publications.

Our data-driven analysis reveals that posters are rarely cited. Citations could be found for only 1% of the posters in our dataset. A limitation of this study, however, is that the impact of academic posters was assessed descriptively rather than measured empirically.

URL : Are Conference Posters Being Cited?

DOI : https://doi.org/10.3389/frma.2021.766552

Association between the Rankings of Top Bioinformatics and Medical Informatics Journals and the Scholarly Reputations of Chief Editors

Author : Salim Sazzed

Scientometric indices, such as the journal Impact Factor (IF) or the SCImago Journal Rank (SJR), often play a determining role when choosing a journal for possible publication. The Editor-in-Chief (EiC), also known as the lead editor or chief editor, usually decides the outcome (e.g., accept, reject) of submitted manuscripts, taking the reviewers' feedback into account.

This study investigates the associations between the EiC's scholarly reputation (i.e., citation-level metrics) and the rankings of top Bioinformatics and Computational Biology (BCB) and Medical Informatics (MI) journals. I consider three scholarly indices of the EiC (citation count, h-index, and i10-index) and four scientometric indices of the journals (h5-index, h5-median, impact factor, and SJR).

To study the correlation between the scientometric indices of the EiC and those of the journal, I apply the Spearman (ρ) and Kendall (τ) correlation coefficients. Moreover, I employ machine learning (ML) models to predict a journal's SJR and IF from the EiC's scholarly reputation indices.
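
As a minimal sketch of the correlation part of this setup (placeholder values, not the study's data), the Spearman and Kendall coefficients named above can be computed with SciPy:

```python
# Minimal sketch with hypothetical data: rank correlations between an EiC index
# (e.g., h-index) and a journal metric (e.g., SJR).
from scipy.stats import spearmanr, kendalltau

eic_h_index = [45, 12, 78, 33, 56, 21, 9, 64]            # hypothetical EiC h-indices
journal_sjr = [2.1, 3.4, 1.8, 4.0, 2.7, 3.1, 1.2, 2.9]    # hypothetical journal SJR values

rho, p_rho = spearmanr(eic_h_index, journal_sjr)
tau, p_tau = kendalltau(eic_h_index, journal_sjr)
print(f"Spearman rho = {rho:.2f} (p = {p_rho:.3f})")
print(f"Kendall tau  = {tau:.2f} (p = {p_tau:.3f})")
```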

The analysis reveals no correlation between the EiC's scholarly achievement and the journal's quantitative metrics. The ML models yield high prediction errors for SJR and IF, which suggests that the EiC's scholarly indices are not good predictors of journal rankings.

URL : Association between the Rankings of Top Bioinformatics and Medical Informatics Journals and the Scholarly Reputations of Chief Editors

DOI : https://doi.org/10.3390/publications9030042

A qualitative and quantitative analysis of open citations to retracted articles: the Wakefield 1998 et al.’s case

Authors : Ivan Heibi, Silvio Peroni

In this article, we show the results of a quantitative and qualitative analysis of open citations to a popular and highly cited retracted paper: “Ileal-lymphoid-nodular hyperplasia, non-specific colitis and pervasive developmental disorder in children” by Wakefield et al., published in 1998.

The main purpose of our study is to understand the behavior of the publications citing a retracted article and the characteristics of the citations the retracted article accumulated over time. Our analysis is based on a methodology that describes how we gathered the data, extracted the topics of the citing articles and visualized the results.

The data and services used are all open and free, to foster the reproducibility of the analysis. The outcomes concern the analysis of the entities citing Wakefield et al.'s article and their related in-text citations. We observed a constantly increasing number of citations over the last 20 years, accompanied by a steady increase in the percentage of citations acknowledging the retraction.
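
For readers who want to retrieve such open citation data themselves, here is a hedged sketch using the OpenCitations COCI REST API, one open and free citation service; the exact services and processing steps used by the authors may differ:

```python
# Hedged sketch: fetch open citation data for the Wakefield et al. 1998 article
# from the OpenCitations COCI API and count citing works per year.
import requests
from collections import Counter

WAKEFIELD_DOI = "10.1016/S0140-6736(97)11096-0"  # Wakefield et al., The Lancet, 1998
url = f"https://opencitations.net/index/coci/api/v1/citations/{WAKEFIELD_DOI}"

citations = requests.get(url, timeout=60).json()
print(f"Open citations found: {len(citations)}")

# Each record carries a 'creation' date for the citing entity.
per_year = Counter(c["creation"][:4] for c in citations if c.get("creation"))
for year in sorted(per_year):
    print(year, per_year[year])
```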

Citing articles began discussing or dealing with the retraction of Wakefield et al.'s article even before the full retraction in 2010. Articles in the social sciences domain were among those that most often discussed the retraction.

In addition, when observing the in-text citations, we noticed that a large number of the citations received by Wakefield et al.'s article focus on general discussions without recalling strictly medical details, especially after the full retraction.

Medical studies did not hesitate to acknowledge the retraction of Wakefield et al.'s article and often made strongly negative statements about it.

URL : A qualitative and quantitative analysis of open citations to retracted articles: the Wakefield 1998 et al.’s case

DOI : https://doi.org/10.1007/s11192-021-04097-5

Article Processing Charges based publications: to which extent the price explains scientific impact?

Authors : Abdelghani Maddi, David Sapinho

The present study aims to analyze the relationship between the Citations Normalized Score (NCS) of scientific publications and the Article Processing Charges (APC) amounts of Gold Open Access publications.

To do so, we use the APC information provided by the OpenAPC database and the citation scores of publications in the Web of Science (WoS) database. The dataset covers the period from 2006 to 2019, with 83,752 articles published in 4,751 journals belonging to 267 distinct publishers.

Results show that, contrary to common belief, paying dearly does not necessarily increase the impact of publications. First, the large publishers with high impact are not the most expensive.

Second, the publishers with the highest APCs are not necessarily the best in terms of impact; the correlation between APCs and impact is moderate. Moreover, the econometric analysis shows that a publication's impact is strongly determined by the quality of the journal in which it is published. International collaboration also plays an important role in citation scores.
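
As a purely schematic illustration of the kind of econometric model described (synthetic data, not the authors' specification), an OLS regression of the citation score on APC amount, journal quality and international collaboration might look like this:

```python
# Schematic sketch only: OLS regression of a normalized citation score on APC,
# journal quality and international collaboration, with synthetic data.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "apc": rng.uniform(500, 4000, n),            # APC amount (synthetic, EUR)
    "journal_quality": rng.normal(1.0, 0.5, n),  # e.g., a journal impact indicator
    "intl_collab": rng.integers(0, 2, n),        # 1 if international co-authorship
})
# Synthetic outcome loosely mimicking the reported pattern: journal quality and
# collaboration matter, APC adds little.
df["ncs"] = (0.8 * df["journal_quality"] + 0.3 * df["intl_collab"]
             + 0.00005 * df["apc"] + rng.normal(0, 0.5, n))

model = smf.ols("ncs ~ apc + journal_quality + intl_collab", data=df).fit()
print(model.summary().tables[1])
```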

URL : https://arxiv.org/abs/2107.07348

Day-to-day discovery of preprint–publication links

Authors : Guillaume Cabanac, Theodora Oikonomidi, Isabelle Boutron

Preprints promote the open and fast communication of non-peer-reviewed work. Once a preprint is published in a peer-reviewed venue, the preprint server updates its web page: a prominent hyperlink leading to the newly published work is added.

Linking preprints to publications is of utmost importance as it provides readers with the latest version of a now certified work. Yet leading preprint servers fail to identify all existing preprint–publication links.

This limitation calls for a more thorough approach to this critical information retrieval task: overlooking published evidence translates into partial and even inaccurate systematic reviews on health-related issues, for instance.

We designed an algorithm leveraging Crossref, a public and free source of bibliographic metadata, to comb the literature for preprint–publication links. We tested it on a reference preprint set identified and curated for a living systematic review on interventions for preventing and treating COVID-19 performed by an international collaboration: the COVID-NMA initiative (covid-nma.com).
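
As a hedged sketch of the underlying idea only (the published linker's matching rules are more elaborate and are described in the paper), one can query the free Crossref REST API with a preprint's title and rank journal-article candidates by title similarity:

```python
# Illustrative sketch of the idea, not the published algorithm: search Crossref
# for journal articles whose titles closely match a given preprint title.
import requests
from difflib import SequenceMatcher

def find_publication_candidates(preprint_title, rows=5):
    resp = requests.get(
        "https://api.crossref.org/works",
        params={
            "query.bibliographic": preprint_title,
            "filter": "type:journal-article",
            "rows": rows,
        },
        timeout=60,
    )
    resp.raise_for_status()
    items = resp.json()["message"]["items"]
    candidates = []
    for item in items:
        title = (item.get("title") or [""])[0]
        similarity = SequenceMatcher(None, preprint_title.lower(), title.lower()).ratio()
        candidates.append((item.get("DOI"), title, round(similarity, 2)))
    return sorted(candidates, key=lambda c: -c[2])

# Hypothetical usage with a made-up preprint title:
for doi, title, sim in find_publication_candidates("Example preprint title about COVID-19 interventions"):
    print(sim, doi, title)
```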

The reference set comprised 343 preprints, 121 of which appeared as a publication in a peer-reviewed journal. While the preprint servers identified 39.7% of the preprint–publication links, our linker identified 90.9% of the expected links with no clues taken from the preprint servers.

The accuracy of the proposed linker is 91.5% on this reference set, with 90.9% sensitivity and 91.9% specificity. This is a 16.26% increase in accuracy compared to that of preprint servers. We release this software as supplementary material to foster its integration into preprint servers’ workflows and enhance a daily preprint–publication chase that is useful to all readers, including systematic reviewers.

This preprint–publication linker currently provides day-to-day updates to the biomedical experts of the COVID-NMA initiative.

URL : Day-to-day discovery of preprint–publication links

DOI : https://doi.org/10.1007/s11192-021-03900-7