The Time Efficiency Gain in Sharing and Reuse of Research Data

Author: Tessa E. Pronk

Among the frequently stated benefits of sharing research data is time efficiency, or increased productivity. The assumption is that reuse, or secondary use, of research data saves researchers time because they do not have to produce the data for a publication themselves.

This can make science more efficient and productive. However, if the data are never reused, the time invested in making them available yields no return.

In this paper a mathematical model is used to calculate the break-even point between time spent sharing in a scientific community and time gained by reuse. This is done for several scenarios, ranging from simple to complex datasets to share and reuse, and at different sharing rates.
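The break-even logic can be sketched with a toy model. Note that the variable names and the linear cost assumptions below are illustrative only; the paper's actual model may be formulated differently.

```python
import math

def break_even_reuses(n_sharers, t_share, t_reuse, t_produce):
    """Minimum number of reuses for a community to gain time overall.

    n_sharers : researchers who each prepare a dataset for sharing
    t_share   : hours each sharer spends preparing data for reuse
    t_reuse   : hours a reuser spends finding and understanding the data
    t_produce : hours it would take to produce equivalent data from scratch
    """
    cost = n_sharers * t_share            # total time invested in sharing
    gain_per_reuse = t_produce - t_reuse  # net time saved by each reuse
    if gain_per_reuse <= 0:
        return None  # reuse never pays off in this scenario
    # smallest integer number of reuses whose total gain covers the cost
    return math.ceil(cost / gain_per_reuse)

# Example: 100 researchers each spend 8 hours preparing data; each reuse
# saves 40 hours of production at the price of 10 hours of reuse effort.
print(break_even_reuses(100, 8, 10, 40))  # → 27
```

The sketch reflects the abstract's qualitative finding: when sharing and reuse are cheap relative to producing the data, few reuses are needed to break even, while complex datasets (high `t_share`, high `t_reuse`) push the break-even point up or out of reach.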

The results indicate that sharing research data can indeed yield an efficiency gain for the scientific community. However, this is not the case in all modeled scenarios.

The scientific community that needs the least reuse to reach a break-even point is one with few sharing researchers and low time investments for sharing and reuse.

This suggests it would be beneficial to make a critical selection of datasets that are worth the effort of preparing for reuse in other scientific studies. In addition, stimulating reuse of datasets would in itself help increase efficiency in scientific communities.

URL : The Time Efficiency Gain in Sharing and Reuse of Research Data


A cross sectional study of retraction notices of scholarly journals of science

Authors : Manorama Tripathi, Sharad Kumar Sonkar, Sunil Kumar

Retraction is the withdrawal of a published article after it is found that the authors did not ensure integrity in conducting and reporting their research. The bibliometric information of 4716 documents categorised as retractions in the Science Citation Index (Web of Science) was downloaded and analysed to understand the trends, patterns, and reasons for retraction.

The results showed that retractions had increased during the ten-year period 2008-2017. The main reasons for retractions were plagiarism, falsified data, and manipulation of images and figures. It was also found that just 40 out of 4716 retraction notices had explicitly stated reasons for retracting the published articles.

Open access journals had a higher number of retractions than subscription-based journals. The study will guide library professionals and research scholars towards a better comprehension of the reasons behind retractions in the sciences over the ten-year period.

They would be better equipped to steer clear of inauthentic publications in their citations and references.

URL : A cross sectional study of retraction notices of scholarly journals of science


Ten myths around open scholarly publishing

Authors : Jonathan P Tennant, Harry Crane, Tom Crick, Jacinto Davila, Asura Enkhbayar, Johanna Havemann, Bianca Kramer, Ryan Martin, Paola Masuzzo, Andy Nobes, Curt Rice, Bárbara R López, Tony Ross-Hellauer, Susanne Sattler, Paul Thacker, Marc Vanholsbeeck

The changing world of scholarly communication and the emergence of ‘Open Science’ or ‘Open Research’ has brought to light a number of controversial and hotly-debated topics.

Yet, evidence-based rational debate is regularly drowned out by misinformed or exaggerated rhetoric, which does not benefit the evolving system of scholarly communication.

The aim of this article is to provide a baseline evidence framework for ten of the most contested topics, in order to help frame and move forward discussions, practices and policies. We address preprints and scooping, the practice of copyright transfer, the function of peer review, and the legitimacy of ‘global’ databases.

The presented facts and data will be a powerful tool against misinformation across wider academic research, policy and practice, and may be used to inform changes within the rapidly evolving scholarly publishing system.

URL : Ten myths around open scholarly publishing


Searching Data: A Review of Observational Data Retrieval Practices in Selected Disciplines

Authors : Kathleen Gregory, Paul Groth, Helena Cousijn, Andrea Scharnhorst, Sally Wyatt

A cross‐disciplinary examination of the user behaviors involved in seeking and evaluating data is surprisingly absent from the research data discussion. This review explores the data retrieval literature to identify commonalities in how users search for and evaluate observational research data in selected disciplines.

Two analytical frameworks, rooted in information retrieval and science and technology studies, are used to identify key similarities in practices as a first step toward developing a model describing data retrieval.

URL : Searching Data: A Review of Observational Data Retrieval Practices in Selected Disciplines


The impact of the open-access status on journal indices: a review of medical journals

Authors : Saif Aldeen AlRyalat, Mohammad Saleh, Mohammad Alaqraa, Alaa Alfukaha, Yara Alkayed, Maryann Abaza, Hadeel Abu Saa, Mohamed Alshamiry


Over the past few decades, there has been an increase in the number of open access (OA) journals in almost all disciplines. This increase in OA journals was accompanied by an increase in funding to support such movements.

Medical fields are among the highest funded fields, which has further encouraged their journals to move toward OA publishing. Here, we aim to compare OA and non-OA journals in terms of citation metrics and other indices.


We collected data on the included journals from the Scopus Source List on 1st November 2018. We filtered the list for medical journals only. For each journal, we extracted data regarding citation metrics, scholarly output, and whether the journal is OA or non-OA.


On the 2017 Scopus list of journals, there were 5835 medical journals. Upon analyzing the difference between medical OA and non-OA journals, we found that OA journals had a significantly higher CiteScore (p < 0.001), percent cited (p < 0.001), and source normalized impact per paper (SNIP) (p < 0.001), whereas non-OA journals had higher scholarly output (p < 0.001).

Among the five largest journal publishers, Springer Nature published the highest frequency of OA articles (31.5%), while Wiley-Blackwell had the lowest frequency among its medical journals (4.4%).


Among medical journals, although non-OA journals still have higher output in terms of articles per year, OA journals have higher citation metrics.

URL : The impact of the open-access status on journal indices: a review of medical journals

Opening Up Open Access Institutional Repositories to Demonstrate Value: Two Universities’ Pilots on Including Metadata-Only Records

Authors: Karen Bjork, Rebel Cummings-Sauls, Ryan Otto


Institutional repository managers are continuously looking for new ways to demonstrate the value of their repositories. One way to do this is to create a more inclusive repository that provides reliable information about the research output produced by faculty affiliated with the institution.


This article details two pilot projects that evaluated how their repositories could track faculty research output through the inclusion of metadata-only (no full-text) records.

The purpose of each pilot project was to determine the feasibility and provide an assessment of the long-term impact on the repository’s mission statement, staffing, and collection development policies.


This article shares the results of the pilot project and explores the impact for faculty and end users as well as the implications for repositories.

URL : Opening Up Open Access Institutional Repositories to Demonstrate Value: Two Universities’ Pilots on Including Metadata-Only Records


Blockchain and OECD data repositories: opportunities and policymaking implications

Authors : Miguel-Angel Sicilia, Anna Visvizi


The purpose of this paper is to employ the case of Organization for Economic Cooperation and Development (OECD) data repositories to examine the potential of blockchain technology in the context of addressing basic contemporary societal concerns, such as transparency, accountability and trust in the policymaking process. Current approaches to sharing data employ standardized metadata, in which the provider of the service is assumed to be a trusted party.

However, derived data, analytic processes, or links from policies are in many cases not shared in the same form, thus breaking the provenance trace and making it difficult to repeat analyses conducted in the past. Similarly, it becomes difficult to test whether the conditions that justified an implemented policy still apply.

A higher level of reuse would require a decentralized approach to sharing both data and analytic scripts and software. This could be supported by a combination of blockchain and decentralized file system technology.
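The core idea, that a blockchain keeps an append-only, tamper-evident record of which analytic artifacts were used, can be illustrated with a minimal simulation. The class below is an illustrative sketch, not the architecture from the paper: it mimics what an Ethereum smart contract plus a decentralized file system (such as IPFS) would enforce, identifying each artifact by its content hash and chaining records so that any tampering with history is detectable.

```python
import hashlib
import json

class ArtifactRegistry:
    """Append-only, hash-chained registry of analytic artifacts.

    Simulates the tamper-evidence a smart contract would provide:
    each record commits to the previous one, so altering any past
    entry invalidates the chain.
    """

    def __init__(self):
        self.records = []

    def register(self, artifact_bytes, metadata):
        # Content hash identifies the artifact (as a decentralized
        # file system would) regardless of where it is stored.
        content_hash = hashlib.sha256(artifact_bytes).hexdigest()
        prev = self.records[-1]["record_hash"] if self.records else "0" * 64
        record = {"content_hash": content_hash,
                  "metadata": metadata,
                  "prev": prev}
        # Hash the record body canonically and chain it to its predecessor.
        record["record_hash"] = hashlib.sha256(
            json.dumps(record, sort_keys=True).encode()).hexdigest()
        self.records.append(record)
        return content_hash

    def verify_chain(self):
        """Recompute every link; False if any record was altered."""
        prev = "0" * 64
        for rec in self.records:
            body = {k: rec[k] for k in ("content_hash", "metadata", "prev")}
            expected = hashlib.sha256(
                json.dumps(body, sort_keys=True).encode()).hexdigest()
            if rec["prev"] != prev or rec["record_hash"] != expected:
                return False
            prev = rec["record_hash"]
        return True

# Register a dataset and an analysis script derived from it, then verify.
reg = ArtifactRegistry()
data_hash = reg.register(b"gdp_2017.csv contents", {"source": "OECD"})
reg.register(b"analysis.R contents", {"derived_from": data_hash})
print(reg.verify_chain())  # → True
```

On a real blockchain the chaining and verification would be enforced by consensus rather than by a local class, but the provenance property is the same: later artifacts commit to the exact versions of the data they were derived from.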


The findings presented in this paper have been derived from an analysis of a case study, i.e., analytics using data made available by the OECD. The set of data the OECD provides is vast and is used broadly.

The argument is structured as follows. First, current issues and topics shaping the debate on blockchain are outlined. Then, the main artifacts on which analytic results, simple or complex, are based are redefined for some concrete purposes.

The requirements on provenance, trust and repeatability are discussed with regard to the proposed architecture, and a proof of concept using smart contracts is used to reason about relevant scenarios.


A combination of decentralized file systems and an open blockchain such as Ethereum, supporting smart contracts, can ensure that the set of artifacts used for the analytics is shared. This enables the sequence underlying the successive stages of research and/or policymaking to be preserved.

This suggests that, in turn, and ex post, it becomes possible to test whether the evidence supporting certain findings and/or policy decisions still holds. Moreover, unlike traditional databases, blockchain technology makes it possible to store immutable records.

This means that the artifacts can be used for further exploitation or repetition of results. In practical terms, the use of blockchain technology creates the opportunity to enhance the evidence-based approach to policy design and policy recommendations that the OECD fosters.

That is, it might enable the stakeholders not only to use the data available in the OECD repositories but also to assess corrections to a given policy strategy or modify its scope.

Research limitations/implications

Blockchains and related technologies are still maturing, and several questions related to their use and potential remain underexplored. Several issues require particular consideration in future research, including anonymity, scalability and stability of the data repository.

This research took OECD data repositories as an example, precisely to make the point that more research and more dialogue between the research and policymaking communities are needed to embrace the challenges and opportunities that blockchain technology generates.

Several questions that this research prompts have not been addressed. For instance, the question of how the sharing economy concept for the specifics of the case could be employed in the context of blockchain has not been dealt with.

Practical implications

The practical implications of the research presented here can be summarized in two ways. On the one hand, by suggesting how a combination of decentralized file systems and an open blockchain such as Ethereum, supporting smart contracts, can ensure that artifacts are shared, this paper paves the way toward a discussion on how to make this approach and solution a reality.

The approach and architecture proposed in this paper would provide a way to increase the scope of the reuse of statistical data and results and thus would improve the effectiveness of decision making as well as the transparency of the evidence supporting policy.

Social implications

Decentralizing analytic artifacts will add to existing open data practices an additional layer of benefits for different actors, including but not limited to policymakers, journalists, analysts and/or researchers without the need to establish centrally managed institutions.

Moreover, due to the degree of decentralization and the absence of a single entry point, the vulnerability of data repositories to cyberthreats might be reduced. Simultaneously, by ensuring that artifacts derived from data stored in those distributed repositories are made immutable therein, full reproducibility of conclusions concerning the data is possible.

In the field of data-driven policymaking processes, it might allow policymakers to devise more accurate ways of addressing pressing issues and challenges.


This paper offers the first blueprint of a form of sharing that complements open data practices with the decentralized approach of blockchain and decentralized file systems.

The case of OECD data repositories is used to highlight that while data storing is important, the real added value of blockchain technology rests in the possible change on how we use the data and data sets in the repositories. It would eventually enable a more transparent and actionable approach to linking policy up with the supporting evidence.

From a different angle, throughout the paper the case is made that rather than simply data, artifacts from conducted analyses should be made persistent in a blockchain.

What is at stake is the full reproducibility of conclusions based on a given set of data, coupled with the possibility of ex post testing the validity of the assumptions and evidence underlying those conclusions.