Deep Impact: A Study on the Impact of Data Papers and Datasets in the Humanities and Social Sciences

Authors : Barbara McGillivray, Paola Marongiu, Nilo Pedrazzini, Marton Ribary, Mandy Wigdorowitz, Eleonora Zordan

The humanities and social sciences (HSS) have recently witnessed an exponential growth in data-driven research. In response, attention has been afforded to datasets and accompanying data papers as outputs of the research and dissemination ecosystem.

In 2015, two data journals dedicated to HSS disciplines appeared in this landscape: Journal of Open Humanities Data (JOHD) and Research Data Journal for the Humanities and Social Sciences (RDJ).

In this paper, we analyse the state of the art in the landscape of data journals in HSS, using JOHD and RDJ as exemplars. We measure the performance and deep impact of data-driven projects, including the metrics (citation count, Altmetrics, views, downloads, tweets) of data papers in relation to associated research papers and the reuse of associated datasets.

Our findings indicate that data papers are published following the deposit of datasets in a repository and usually following research articles; that data papers have a positive impact both on the metrics of the research papers associated with them and on data reuse; and that Twitter hashtags targeted at specific research campaigns can lead to increases in data papers’ views and downloads.

HSS data papers improve the visibility of datasets they describe, support accompanying research articles, and add to transparency and the open research agenda.

DOI : https://doi.org/10.3390/publications10040039

The citation advantage of linking publications to research data

Authors : Giovanni Colavizza, Iain Hrynaszkiewicz, Isla Staden, Kirstie Whitaker, Barbara McGillivray

Efforts to make research results open and reproducible are increasingly reflected by journal policies encouraging or mandating authors to provide data availability statements.

As a consequence, there has been a strong uptake of data availability statements in recent literature. Nevertheless, it is still unclear what proportion of these statements actually contain well-formed links to data, for example via a URL or permanent identifier, and whether there is added value in providing them.

We consider 531,889 journal articles published by PLOS and BMC that are part of the PubMed Open Access collection, categorize their data availability statements according to their content, and analyze the citation advantage of different statement categories via regression.

We find that, following mandated publisher policies, data availability statements have become common by now, yet statements containing a link to a repository are still just a fraction of the total.

We also find that articles whose statements contain a link to a repository, in particular, can have up to 25.36% higher citation impact on average: an encouraging result for all publishers and authors who make the effort of sharing their data. All our data and code are made available in order to reproduce and extend our results.

URL : https://arxiv.org/abs/1907.02565

The relationship between usage and citations in an open access mega journal

Authors : Barbara McGillivray, Mathias Astell

How does usage of an article relate to the number of citations it accrues? Does the timeframe in which an article is used (and how much that article is used) have an effect on when and how much that article is cited?

What role does an article’s subject area play in the relationship between usage and citations? This paper aims to answer these questions through an observational study of usage and citation data collected about a multidisciplinary, open access mega journal, Scientific Reports.

We find that while the direct correlation between usage and citations is only moderate at best, the relationship between how early and how much an article is used and how early it is cited is much clearer. What is more, we find that when an article is cited earlier it is also cited more often, leading to the assertion that if an article is more highly accessed early on, it is more likely to be cited earlier and more often.

As Scientific Reports is a multidisciplinary journal covering all natural and clinical sciences, this study was also able to look at the differences across subject areas and found some interesting variations when comparing the major subject areas covered by the journal (i.e. biological, Earth, physical and health sciences).

URL : https://arxiv.org/abs/1902.01333