Reproducible research practices, transparency, and open access data in the biomedical literature, 2015–2017

Authors : Joshua D. Wallach, Kevin W. Boyack, John P. A. Ioannidis

There is growing interest in ensuring the transparency and reproducibility of the published scientific literature. According to a previous evaluation of 441 biomedical journal articles published in 2000–2014, the biomedical literature largely lacked transparency in important dimensions.

Here, we surveyed a random sample of 149 biomedical articles published between 2015 and 2017 and determined the proportion reporting sources of public and/or private funding and conflicts of interests, sharing protocols and raw data, and undergoing rigorous independent replication and reproducibility checks.

We also investigated what can be learned about reproducibility and transparency indicators from open access data provided on PubMed. The majority of the 149 studies disclosed some information regarding funding (103, 69.1% [95% confidence interval, 61.0% to 76.3%]) or conflicts of interest (97, 65.1% [56.8% to 72.6%]).

Among the 104 articles with empirical data in which protocols or data sharing would be pertinent, 19 (18.3% [11.6% to 27.3%]) discussed publicly available data; only one (1.0% [0.1% to 6.0%]) included a link to a full study protocol. Among the 97 articles in which replication in studies with different data would be pertinent, there were five replication efforts (5.2% [1.9% to 12.2%]).
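The proportions above are reported with 95% confidence intervals. As a rough illustration of how such intervals arise (the abstract does not state the authors' exact method; this sketch uses the Wilson score interval, which differs slightly from an exact binomial interval), the 19-of-104 figure can be checked as follows:

```python
from math import sqrt

def wilson_ci(successes, n, z=1.96):
    """Wilson score confidence interval for a binomial proportion
    (z = 1.96 corresponds to a 95% interval)."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = (z / denom) * sqrt(p * (1 - p) / n + z**2 / (4 * n**2))
    return center - half, center + half

# 19 of 104 articles discussed publicly available data (point estimate 18.3%)
low, high = wilson_ci(19, 104)
print(f"18.3% [{low:.1%} to {high:.1%}]")
```

The Wilson interval comes out slightly narrower than the 11.6% to 27.3% reported in the abstract; exact methods such as Clopper-Pearson give somewhat wider intervals, closer to the published values.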

Although clinical trial identification numbers and funding details were often provided on PubMed, only two of the articles without a full-text article in PubMed Central that discussed publicly available data at the full-text level also contained data-sharing information on PubMed; none had a conflicts of interest statement on PubMed.

Our evaluation suggests that although there have been improvements over the last few years in certain key indicators of reproducibility and transparency, opportunities exist to improve reproducible research practices across the biomedical literature and to make features related to reproducibility more readily visible in PubMed.

URL : Reproducible research practices, transparency, and open access data in the biomedical literature, 2015–2017


Open Science by Design

Contributors : National Academies of Sciences, Engineering, and Medicine; Policy and Global Affairs; Board on Research Data and Information; Committee on Toward an Open Science Enterprise

Openness and sharing of information are fundamental to the progress of science and to the effective functioning of the research enterprise. The advent of scientific journals in the 17th century helped power the Scientific Revolution by allowing researchers to communicate across time and space, using the technologies of that era to generate reliable knowledge more quickly and efficiently.

Harnessing today’s stunning, ongoing advances in information technologies, the global research enterprise and its stakeholders are moving toward a new open science ecosystem.

Open science aims to ensure the free availability and usability of scholarly publications, the data that result from scholarly research, and the methodologies, including code or algorithms, that were used to generate those data.

Open Science by Design is aimed at overcoming barriers and moving toward open science as the default approach across the research enterprise.

This report explores specific examples of open science and discusses a range of challenges, focusing on stakeholder perspectives. It is meant to provide guidance to the research enterprise and its stakeholders as they build strategies for achieving open science and take the next steps.


The History, Advocacy and Efficacy of Data Management Plans

Authors : Nicholas Smale, Kathryn Unsworth, Gareth Denyer, Daniel Barr

Data management plans (DMPs) have increasingly been encouraged as a key component of institutional and funding body policy. Although DMPs necessarily place an administrative burden on researchers, proponents claim that DMPs have myriad benefits, including enhanced research data quality, increased rates of data sharing, and institutional planning and compliance benefits.

In this manuscript, we explore the international history of DMPs and describe institutional and funding body DMP policy. We find that economic and societal benefits from presumed increased rates of data sharing were the original driver of funding-body DMP mandates.

Today, 86% of UK Research Councils and 63% of US funding bodies require submission of a DMP with funding applications. Given that no major Australian funding bodies require DMP submission, it is of note that 37% of Australian universities have taken the initiative to internally mandate DMPs.

Institutions both within Australia and internationally frequently promote the professional benefits of DMP use, and endorse DMPs as ‘best practice’. We analyse one such typical DMP implementation at a major Australian institution, finding that DMPs have low levels of apparent translational value.

Indeed, an extensive literature review suggests there is very limited published systematic evidence that DMP use has any tangible benefit for researchers, institutions or funding bodies.

We are therefore led to question why DMPs have become the go-to tool for research data professionals and advocates of good data practice. By delineating multiple use-cases and highlighting the need for DMPs to be fit for intended purpose, we question the view that a good DMP is necessarily that which encompasses the entire data lifecycle of a project.

Finally, we summarise recent developments in the DMP landscape, and note a positive shift towards evidence-based research management through more researcher-centric, educative, and integrated DMP services.

URL : The History, Advocacy and Efficacy of Data Management Plans


Sharing health research data – the role of funders in improving the impact

Authors : Robert F. Terry, Katherine Littler, Piero L. Olliaro

Recent public health emergencies involving outbreaks of influenza, Ebola and Zika revealed that the mechanisms for sharing research data are neither being used nor adequate for the purpose, particularly where data need to be shared rapidly.

A review of research papers, including completed clinical trials related to priority pathogens, found that only 31% (98 of 319 published papers, excluding case studies) provided access to all the data underlying the paper; 65% of these papers gave no information on how to find or access the data.

Only two clinical trials out of 58 on interventions for WHO priority pathogens provided any link in their registry entry to the background data.

Interviews with researchers revealed that reluctance to share data stemmed from a lack of confidence in the utility of the data; an absence of academic incentives for rapid dissemination, which can prevent subsequent publication; and a disconnect between those who collect the data and those who wish to use it quickly.

The role of research funders needs to change to address this. First, funders need to engage early with researchers and related stakeholders to understand their concerns, and work harder to define more explicitly the benefits to all stakeholders.

Secondly, sharing data needs to bring a direct benefit to the people who collect and curate the data.

Thirdly, more work needs to be done to realise the intent of making data-sharing resources more equitable, ethical and efficient.

Finally, a checklist of the issues that need to be addressed when designing new or revising existing data sharing resources should be created. This checklist would highlight the technical, cultural and ethical issues that need to be considered and point to examples of emerging good practice that can be used to address them.

URL : Sharing health research data – the role of funders in improving the impact


Evaluation of a novel cloud-based software platform for structured experiment design and linked data analytics

Authors : Hannes Juergens, Matthijs Niemeijer, Laura D. Jennings-Antipov, Robert Mans, Jack More, Antonius J. A. van Maris, Jack T. Pronk, Timothy S. Gardner

Open data in science requires precise definition of experimental procedures used in data generation, but traditional practices for sharing protocols and data cannot provide the required data contextualization.

Here, we explore implementation, in an academic research setting, of a novel cloud-based software system designed to address this challenge. The software supports systematic definition of experimental procedures as visual processes, acquisition and analysis of primary data, and linking of data and procedures in machine-computable form.

The software was tested on a set of quantitative microbial-physiology experiments. Though time-intensive, defining experimental procedures in the software yielded much more precise and unambiguous descriptions of experiments than conventional protocols.

Once defined, processes were easily reusable and composable into more complex experimental flows. Automatic coupling of process definitions to experimental data enables immediate identification of correlations between procedural details, intended and unintended experimental perturbations, and experimental outcomes.

Software-based experiment descriptions could ultimately replace terse and ambiguous ‘Materials and Methods’ sections in scientific journals, thus promoting reproducibility and reusability of published studies.

URL : Evaluation of a novel cloud-based software platform for structured experiment design and linked data analytics


Facilitating and Improving Environmental Research Data Repository Interoperability

Authors : Corinna Gries, Amber Budden, Christine Laney, Margaret O’Brien, Mark Servilla, Wade Sheldon, Kristin Vanderbilt, David Vieglais

Environmental research data repositories provide much-needed services for data preservation and dissemination to diverse communities with domain-specific or programmatic data needs and standards.

Because they were developed independently, these repositories serve their communities well but use different technologies, data models and ontologies. The effectiveness and efficiency of their services could therefore be vastly improved if repositories worked together, adhering to a shared community platform focused on implementing agreed-upon standards and best practices for the curation and dissemination of data.

Such a community platform drives forward the convergence of technologies and practices that will advance cross-domain interoperability. It will also facilitate contributions from investigators through standardized and streamlined workflows and provide increased visibility for the role of data managers and the curation services provided by data repositories, beyond preservation infrastructure.

Ten specific suggestions for such standardization are outlined, without ranking their priority or prescribing technical implementations. Although the recommendations are for repositories to implement, they have been chosen specifically with the data provider/data curator and synthesis scientist in mind.

URL : Facilitating and Improving Environmental Research Data Repository Interoperability


How are we Measuring Up? Evaluating Research Data Services in Academic Libraries

Authors : Heather L. Coates, Jake Carlson, Ryan Clement, Margaret Henderson, Lisa R Johnston, Yasmeen Shorish


In the years since the emergence of federal funding agency data management and sharing requirements, research data services (RDS) have expanded to dozens of academic libraries in the United States.

As these services have matured, service providers have begun to assess them. Given a lack of practical guidance in the literature, we seek to begin the discussion with several case studies and an exploration of four approaches suitable to assessing these emerging services.


This article examines five case studies that vary by staffing, drivers, and institutional context in order to begin a practice-oriented conversation about how to evaluate and assess research data services in academic libraries.

The case studies highlight some commonly discussed challenges, including insufficient training and resources, competing demands for evaluation efforts, and the tension between evidence that can be easily gathered and that which addresses our most important questions.

We explore reflective practice, formative evaluation, developmental evaluation, and evidence-based library and information practice for ideas to advance practice.


Data specialists engaged in providing research data services need strategies and tools with which to make decisions about their services. These decisions range from identifying stakeholder needs, to refining existing services, to determining when to extend or discontinue declining services.

While the landscape of research data services is broad and diverse, there are common needs that we can address as a community. To that end, we have created a community-owned space to facilitate the exchange of knowledge and existing resources.

URL : How are we Measuring Up? Evaluating Research Data Services in Academic Libraries