COCI, the OpenCitations Index of Crossref open DOI-to-DOI citations

Authors : Ivan Heibi, Silvio Peroni, David Shotton

In this paper, we present COCI, the OpenCitations Index of Crossref open DOI-to-DOI citations. COCI is the first open citation index created by OpenCitations, in which we have applied the concept of citations as first-class data entities, and it contains more than 445 million DOI-to-DOI citation links derived from the data available in Crossref.

These citations are described in RDF by means of the newly extended version of the OpenCitations Data Model (OCDM).
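As an illustration of what treating citations as first-class data entities can mean in practice, here is a minimal Python sketch that stores each DOI-to-DOI link as its own record instead of as a property of the citing paper. The class name, field names, helper function and DOIs are hypothetical; they are not taken from OCDM or from the COCI data.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Citation:
    """One DOI-to-DOI citation link, modelled as a standalone record.

    Hypothetical structure for illustration only; not the OCDM schema.
    """
    citing_doi: str
    cited_doi: str


def incoming_counts(citations):
    """Count how many citation records point at each cited DOI."""
    return Counter(c.cited_doi for c in citations)


# Made-up DOIs, used only to exercise the sketch.
citations = [
    Citation("10.1234/example.a", "10.1234/example.c"),
    Citation("10.1234/example.b", "10.1234/example.c"),
    Citation("10.1234/example.a", "10.1234/example.b"),
]

counts = incoming_counts(citations)
print(counts["10.1234/example.c"])
```

Because each link is a separate record, it can be counted, filtered or annotated without touching the description of either paper.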

We introduce the workflow we have developed for creating these data, and also show the additional services that facilitate the access to and querying of these data via different access points: a SPARQL endpoint, a REST API, bulk downloads, Web interfaces, and direct access to the citations via HTTP content negotiation.

Finally, we present statistics regarding the use of COCI citation data, and we introduce several projects that have already started to use COCI data for different purposes.

URL : https://arxiv.org/abs/1904.06052

Open data to evaluate academic researchers: an experiment with the Italian Scientific Habilitation

Authors : Angelo Di Iorio, Silvio Peroni, Francesco Poggi

The need for scholarly open data is ever increasing. While there are large repositories of open access articles and free publication indexes, there are still only a few examples of free citation networks, and their coverage is partial.

One consequence is that most evaluation processes based on citation counts rely on commercial citation databases. Things are changing under the pressure of the Initiative for Open Citations (I4OC), whose goal is to campaign for scholarly publishers to make their citations fully open.

This paper investigates the growth of open citations with an experiment on the Italian Scientific Habilitation, the national process for qualifying university professors, which instead uses data from commercial indexes.

We simulated the procedure by only using open data and explored similarities and differences with the official results. The outcomes of the experiment show that the amount of open citation data currently available is not yet enough for obtaining similar results.

URL : https://arxiv.org/abs/1902.03287

Do altmetrics work for assessing research quality?

Authors : Andrea Giovanni Nuzzolese, Paolo Ciancarini, Aldo Gangemi, Silvio Peroni, Francesco Poggi, Valentina Presutti

Alternative metrics (aka altmetrics) are gaining increasing interest in the scientometrics community as they can capture both the volume and quality of attention that a research work receives online.

Nevertheless, there is limited knowledge about their effectiveness as a means of measuring the impact of research when compared with traditional citation-based indicators.

This work aims at rigorously investigating whether any correlation exists among indicators, either traditional (i.e. citation count and h-index) or alternative (i.e. altmetrics), and which of them may be effective for evaluating scholars.

The study is based on the analysis of real data coming from the National Scientific Qualification procedure held in Italy by committees of peers on behalf of the Italian Ministry of Education, Universities and Research.
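For readers unfamiliar with the two traditional indicators named above, the following minimal Python sketch computes a citation count and an h-index from a list of per-paper citation counts. The function names and the sample numbers are hypothetical and are not drawn from the study's data.

```python
def citation_count(citations):
    """Total number of citations received across all papers."""
    return sum(citations)


def h_index(citations):
    """Largest h such that at least h papers have h or more citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


# Made-up citation counts for five papers by one scholar.
papers = [10, 8, 5, 4, 3]
print(citation_count(papers))  # 30
print(h_index(papers))         # 4
```

Here four papers have at least four citations each, but there are not five papers with at least five, so the h-index is 4.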

URL : https://arxiv.org/abs/1812.11813

Automating semantic publishing

Author : Silvio Peroni

Semantic Publishing involves the use of Web and Semantic Web technologies and standards for the semantic enhancement of a scholarly work so as to improve its discoverability, interactivity, openness and (re-)usability for both humans and machines.

Recently, people have suggested that the semantic enhancements of a scholarly work should be undertaken by the authors of that scholarly work, and should be considered as integral parts of the contribution subjected to peer review. However, this requires that the authors should spend additional time and effort adding such semantic annotations, time that they usually do not have available.

Thus, the most pragmatic way to facilitate this additional task is to use automated services that create the semantic annotation of authors’ scholarly articles by parsing the content that they have already written, thus reducing the additional time required of the authors to that for checking and validating these semantic annotations.

In this article, I propose a generic approach called compositional and iterative semantic enhancement (CISE) that enables the automatic enhancement of scholarly papers with additional semantic annotations in a way that is independent of the markup used for storing scholarly articles and the natural language used for writing their content.

URL : Automating semantic publishing

Alternative location : https://content.iospress.com/articles/data-science/ds012

Research Articles in Simplified HTML: a Web-first format for HTML-based scholarly articles

Authors : Silvio Peroni, Francesco Osborne, Angelo Di Iorio, Andrea Giovanni Nuzzolese, Francesco Poggi, Fabio Vitali, Enrico Motta

Purpose

This paper introduces the Research Articles in Simplified HTML (or RASH), which is a Web-first format for writing HTML-based scholarly papers; it is accompanied by the RASH Framework, i.e. a set of tools for interacting with RASH-based articles. The paper also presents an evaluation that involved authors and reviewers of RASH articles submitted to the SAVE-SD 2015 and SAVE-SD 2016 workshops.

Design

RASH has been developed in order to: be easy to learn and use; share scholarly documents (and embedded semantic annotations) through the Web; support its adoption within the existing publishing workflow.

Findings

The evaluation study confirmed that RASH can already be adopted in workshops, conferences and journals and can be quickly learnt by researchers who are familiar with HTML.

Research limitations

The evaluation study also highlighted some issues in the adoption of RASH, and in general of HTML formats, especially by less technically savvy users. Moreover, additional tools are needed, e.g. for enabling additional conversion from/to existing formats such as OpenXML.

Practical implications

RASH (and its Framework) is another step towards enabling the definition of formal representations of the meaning of the content of an article, facilitating its automatic discovery, enabling its linking to semantically related articles, providing access to data within the article in actionable form, and allowing integration of data between papers.

Social implications

RASH addresses the intrinsic needs related to the various users of a scholarly article: researchers (focussing on its content), readers (experiencing new ways for browsing it), citizen scientists (reusing available data formally defined within it through semantic annotations), publishers (using the advantages of new technologies as envisioned by the Semantic Publishing movement).

Value

RASH focuses strictly on writing the content of the paper (i.e., organisation of text + semantic annotations) and leaves all the issues about its validation, visualisation, conversion, and semantic data extraction to the various tools developed within its Framework.

URL : https://essepuntato.github.io/papers/rash-peerj2016.html