A quantitative and qualitative open citation analysis of retracted articles in the humanities

Authors : Ivan Heibi, Silvio Peroni

In this article, we show and discuss the results of a quantitative and qualitative analysis of open citations to retracted publications in the humanities domain. Our study was conducted by selecting retracted papers in the humanities domain and marking their main characteristics (e.g., retraction reason).

Then, we gathered the citing entities and annotated their basic metadata (e.g., title, venue, etc.) and the characteristics of their in-text citations (e.g., intent, sentiment, etc.). Using these data, we performed a quantitative and qualitative study of retractions in the humanities, presenting descriptive statistics and a topic modeling analysis of the citing entities’ abstracts and the in-text citation contexts.

As part of our main findings, we noticed that there was no drop in the overall number of citations after the year of retraction, with few entities which have either mentioned the retraction or expressed a negative sentiment toward the cited publication.

In addition, on several occasions, we noticed a higher concern/awareness when it was about citing a retracted publication, by the citing entities belonging to the health sciences domain, if compared to the humanities and the social science domains. Philosophy, arts, and history are the humanities areas that showed the higher concerns toward the retraction.

URL : A quantitative and qualitative open citation analysis of retracted articles in the humanities

DOI : https://doi.org/10.1162/qss_a_00222

A qualitative and quantitative analysis of open citations to retracted articles: the Wakefield 1998 et al.’s case

Authors : Ivan Heibi, Silvio Peroni

In this article, we show the results of a quantitative and qualitative analysis of open citations on a popular and highly cited retracted paper: “Ileal-lymphoid-nodular hyperplasia, non-specific colitis and pervasive developmental disorder in children” by Wakefield et al., published in 1998.

The main purpose of our study is to understand the behavior of the publications citing one retracted article and the characteristics of the citations the retracted article accumulated over time. Our analysis is based on a methodology which illustrates how we gathered the data, extracted the topics of the citing articles and visualized the results.

The data and services used are all open and free to foster the reproducibility of the analysis. The outcomes concerned the analysis of the entities citing Wakefield et al.’s article and their related in-text citations. We observed a constant increasing number of citations in the last 20 years, accompanied with a constant increment in the percentage of those acknowledging its retraction.

Citing articles have started either discussing or dealing with the retraction of Wakefield et al.’s article even before its full retraction happened in 2010. Articles in the social sciences domain citing the Wakefield et al.’s one were among those that have mostly discussed its retraction.

In addition, when observing the in-text citations, we noticed that a large number of the citations received by Wakefield et al.’s article has focused on general discussions without recalling strictly medical details, especially after the full retraction.

Medical studies did not hesitate in acknowledging the retraction of the Wakefield et al.’s article and often provided strong negative statements on it.

URL : A qualitative and quantitative analysis of open citations to retracted articles: the Wakefield 1998 et al.’s case

DOI : https://doi.org/10.1007/s11192-021-04097-5

OpenCitations, an infrastructure organization for open scholarship

Authors : Silvio Peroni, David Shotton

OpenCitations is an infrastructure organization for open scholarship dedicated to the publication of open citation data as Linked Open Data using Semantic Web technologies, thereby providing a disruptive alternative to traditional proprietary citation indexes.

Open citation data are valuable for bibliometric analysis, increasing the reproducibility of large-scale analyses by enabling publication of the source data. Following brief introductions to the development and benefits of open scholarship and to Semantic Web technologies, this paper describes OpenCitations and its data sets, tools, services, and activities.

These include the OpenCitations Data Model; the SPAR (Semantic Publishing and Referencing) Ontologies; OpenCitations’ open software of generic applicability for searching, browsing, and providing REST APIs over resource description framework (RDF) triplestores; Open Citation Identifiers (OCIs) and the OpenCitations OCI Resolution Service; the OpenCitations Corpus (OCC), a database of open downloadable bibliographic and citation data made available in RDF under a Creative Commons public domain dedication; and the OpenCitations Indexes of open citation data, of which the first and largest is COCI, the OpenCitations Index of Crossref Open DOI-to-DOI Citations, which currently contains over 624 million bibliographic citations and is receiving considerable usage by the scholarly community.

URL : OpenCitations, an infrastructure organization for open scholarship

DOI : https://doi.org/10.1162/qss_a_00023

COCI, the OpenCitations Index of Crossref open DOI-to-DOI citations

Authors : Ivan Heibi, Silvio Peroni, David Shotton

In this paper, we present COCI, the OpenCitations Index of Crossref open DOI-to-DOI citations (this http URL). COCI is the first open citation index created by OpenCitations, in which we have applied the concept of citations as first-class data entities, and it contains more than 445 million DOI-to-DOI citation links derived from the data available in Crossref.

These citations are described in RDF by means of the newly extended version of the OpenCitations Data Model (OCDM).

We introduce the workflow we have developed for creating these data, and also show the additional services that facilitate the access to and querying of these data via different access points: a SPARQL endpoint, a REST API, bulk downloads, Web interfaces, and direct access to the citations via HTTP content negotiation.

Finally, we present statistics regarding the use of COCI citation data, and we introduce several projects that have already started to use COCI data for different purposes.

URL : https://arxiv.org/abs/1904.06052

Open data to evaluate academic researchers: an experiment with the Italian Scientific Habilitation

Authors : Angelo Di Iorio, Silvio Peroni, Francesco Poggi

The need for scholarly open data is ever increasing. While there are large repositories of open access articles and free publication indexes, there are still a few examples of free citation networks and their coverage is partial.

One of the results is that most of the evaluation processes based on citation counts rely on commercial citation databases. Things are changing under the pressure of the Initiative for Open Citations (I4OC), whose goal is to campaign for scholarly publishers to make their citations as totally open.

This paper investigates the growth of open citations with an experiment on the Italian Scientific Habilitation, the National process for University Professor qualification which instead uses data from commercial indexes.

We simulated the procedure by only using open data and explored similarities and differences with the official results. The outcomes of the experiment show that the amount of open citation data currently available is not yet enough for obtaining similar results.

URL : https://arxiv.org/abs/1902.03287

Do altmetrics work for assessing research quality?

Authors : Andrea Giovanni Nuzzolese, Paolo Ciancarini, Aldo Gangemi, Silvio Peroni, Francesco Poggi, Valentina Presutti

Alternative metrics (aka altmetrics) are gaining increasing interest in the scientometrics community as they can capture both the volume and quality of attention that a research work receives online.

Nevertheless, there is limited knowledge about their effectiveness as a mean for measuring the impact of research if compared to traditional citation-based indicators.

This work aims at rigorously investigating if any correlation exists among indicators, either traditional (i.e. citation count and h-index) or alternative (i.e. altmetrics) and which of them may be effective for evaluating scholars.

The study is based on the analysis of real data coming from the National Scientific Qualification procedure held in Italy by committees of peers on behalf of the Italian Ministry of Education, Universities and Research.

URL : https://arxiv.org/abs/1812.11813

Automating semantic publishing

Author : Silvio Peroni

Semantic Publishing involves the use of Web and Semantic Web technologies and standards for the semantic enhancement of a scholarly work so as to improve its discoverability, interactivity, openness and (re-)usability for both humans and machines.

Recently, people have suggested that the semantic enhancements of a scholarly work should be undertaken by the authors of that scholarly work, and should be considered as integral parts of the contribution subjected to peer review. However, this requires that the authors should spend additional time and effort adding such semantic annotations, time that they usually do not have available.

Thus, the most pragmatic way to facilitate this additional task is to use automated services that create the semantic annotation of authors’ scholarly articles by parsing the content that they have already written, thus reducing the additional time required of the authors to that for checking and validating these semantic annotations.

In this article, I propose a generic approach called compositional and iterative semantic enhancement (CISE) that enables the automatic enhancement of scholarly papers with additional semantic annotations in a way that is independent of the markup used for storing scholarly articles and the natural language used for writing their content.

URL : Automating semantic publishing

Alternative location : https://content.iospress.com/articles/data-science/ds012