Why are these publications missing? Uncovering the reasons behind the exclusion of documents in free-access scholarly databases

Authors : Lorena Delgado-Quirós, Isidro F. Aguillo, Alberto Martín-Martín, Emilio Delgado López-Cózar, Enrique Orduña-Malea, José Luis Ortega

This study analyses the coverage of seven free-access bibliographic databases (Crossref, Dimensions—non-subscription version, Google Scholar, Lens, Microsoft Academic, Scilit, and Semantic Scholar) to identify the potential reasons that might cause the exclusion of scholarly documents and how they could influence coverage.

To do this, 116,000 randomly selected bibliographic records from Crossref were used as a baseline, and each database was queried through its API endpoints or by web scraping. The results show that coverage differences are mainly caused by the way each service builds its database.
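The coverage check described above can be sketched as follows. This is an illustrative example, not the authors' actual pipeline: it looks up a Crossref DOI in Semantic Scholar's public Academic Graph API, treating an HTTP 404 as "not indexed". The timeout value and error handling are assumptions.

```python
# Illustrative sketch: test whether a DOI indexed in Crossref is also
# covered by Semantic Scholar, using its public Academic Graph API.
import urllib.error
import urllib.request

SEMANTIC_SCHOLAR = "https://api.semanticscholar.org/graph/v1/paper/DOI:{}"

def lookup_url(doi: str) -> str:
    """Build the Semantic Scholar lookup URL for a given DOI."""
    return SEMANTIC_SCHOLAR.format(doi)

def is_covered(doi: str) -> bool:
    """Return True if the database has a record for this DOI,
    False if it answers 404 (document not indexed)."""
    try:
        with urllib.request.urlopen(lookup_url(doi), timeout=10):
            return True
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return False
        raise  # other errors (rate limits, outages) need separate handling
```

In practice a study at this scale would also need rate limiting, retries, and record matching beyond a simple hit/miss, since a 404 can reflect a temporary indexing lag rather than a deliberate exclusion.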

While the classic bibliographic databases ingest almost exactly the same content as Crossref (Lens and Scilit miss only 0.1% and 0.2% of the records, respectively), the academic search engines show lower coverage (Google Scholar misses 9.8% of the records, Semantic Scholar 10%, and Microsoft Academic 12%). These gaps are mainly attributed to external factors, such as web accessibility and robot exclusion policies (39.2%–46% of missing records), and to internal requirements that exclude secondary content (6.5%–11.6%).

In the case of Dimensions, the classic bibliographic database with the lowest coverage (7.6% of records missing), internal selection criteria are the main reasons publications are missing: the indexation of full books instead of individual book chapters (65%) and the exclusion of secondary content (15%).

DOI : https://doi.org/10.1002/asi.24839

How do journals deal with problematic articles? Editorial response of journals to articles commented in PubPeer

Authors : José-Luis Ortega, Lorena Delgado-Quirós

The aim of this article is to explore the editorial response of journals to research articles that may contain methodological errors or misconduct. A total of 17,244 articles commented on in PubPeer, a post-publication peer review site, were processed and classified according to several error and fraud categories.

Then, the editorial responses (i.e., editorial notices) to these papers were retrieved from PubPeer, Retraction Watch, and PubMed to obtain the most comprehensive picture possible. The results show that only 21.5% of the articles that deserved an editorial notice (i.e., those with honest errors, methodological flaws, publishing fraud, or manipulation) were corrected by the journal. This percentage climbs to 34% for 2019 publications.

This response varies between journals but is consistent across disciplines. Another interesting result is that high-impact journals suffer more from image manipulation, while plagiarism is more frequent in low-impact journals.

The study concludes that journals need to improve their response to problematic articles.

DOI : https://doi.org/10.3145/epi.2023.ene.18

Classification and analysis of PubPeer comments: How a web journal club is used

Author : José Luis Ortega

This study explores the use of PubPeer by the scholarly community, to understand the issues discussed in an online journal club, the disciplines most commented on, and the characteristics of the most prolific users.

A sample of 39,985 posts about 24,779 publications was extracted from PubPeer in 2019 and 2020. The comments were divided into seven categories according to their degree of seriousness: Positive review, Critical review, Lack of information, Honest errors, Methodological flaws, Publishing fraud, and Manipulation.

The results show that more than two-thirds of the comments report some type of misconduct, mainly image manipulation. These comments generate the most discussion and take the longest to be posted. By discipline, Health Sciences and Life Sciences are the most discussed research areas.

The results also reveal “super commenters,” users who access the platform to systematically review publications. The study ends by discussing how various disciplines use the site for different purposes.

DOI : https://doi.org/10.1002/asi.24568

Altmetrics data providers: A meta-analysis review of the coverage of metrics and publication

Author : José-Luis Ortega

The aim of this paper is to review the current and most relevant literature on the use of altmetric providers since 2012. This review is supported by a meta-analysis of the coverage and metric counts obtained by more than 100 publications that have used these bibliographic platforms for altmetric studies.

The article is the most comprehensive analysis to date of altmetric data providers (Lagotto, Altmetric.com, ImpactStory, Mendeley, PlumX, Crossref Event Data), exploring the coverage of publications, social media mentions, and events from a longitudinal perspective. Disciplinary differences were also analysed.

The results show that most studies are based on Altmetric.com data, the service that captures the most mentions from social media sites, blogs, and news outlets. PlumX has better coverage of publications and counts more Mendeley readers, but captures fewer events.

Crossref Event Data (CED) stands out for its coverage of Wikipedia mentions, while Lagotto and ImpactStory are falling into disuse because of their limited reach.

Original location : https://recyt.fecyt.es/index.php/EPI/article/view/epi.2020.ene.07