Open Science by Design

Contributors : National Academies of Sciences, Engineering, and Medicine; Policy and Global Affairs; Board on Research Data and Information; Committee on Toward an Open Science Enterprise

Openness and sharing of information are fundamental to the progress of science and to the effective functioning of the research enterprise. The advent of scientific journals in the 17th century helped power the Scientific Revolution by allowing researchers to communicate across time and space, using the technologies of that era to generate reliable knowledge more quickly and efficiently.

Harnessing today’s stunning, ongoing advances in information technologies, the global research enterprise and its stakeholders are moving toward a new open science ecosystem.

Open science aims to ensure the free availability and usability of scholarly publications, the data that result from scholarly research, and the methodologies, including code or algorithms, that were used to generate those data.

Open Science by Design is aimed at overcoming barriers and moving toward open science as the default approach across the research enterprise.

This report explores specific examples of open science and discusses a range of challenges, focusing on stakeholder perspectives. It is meant to provide guidance to the research enterprise and its stakeholders as they build strategies for achieving open science and take the next steps.


The History, Advocacy and Efficacy of Data Management Plans

Authors : Nicholas Smale, Kathryn Unsworth, Gareth Denyer, Daniel Barr

Data management plans (DMPs) have increasingly been encouraged as a key component of institutional and funding body policy. Although DMPs necessarily place administrative burden on researchers, proponents claim that DMPs have myriad benefits, including enhanced research data quality, increased rates of data sharing, and institutional planning and compliance benefits.

In this manuscript, we explore the international history of DMPs and describe institutional and funding body DMP policy. We find that economic and societal benefits from presumed increased rates of data sharing were the original driver of mandating DMPs by funding bodies.

Today, 86% of UK Research Councils and 63% of US funding bodies require submission of a DMP with funding applications. Given that no major Australian funding bodies require DMP submission, it is of note that 37% of Australian universities have taken the initiative to internally mandate DMPs.

Institutions both within Australia and internationally frequently promote the professional benefits of DMP use, and endorse DMPs as ‘best practice’. We analyse one such typical DMP implementation at a major Australian institution, finding that DMPs have low levels of apparent translational value.

Indeed, an extensive literature review suggests there is very limited published systematic evidence that DMP use has any tangible benefit for researchers, institutions or funding bodies.

We are therefore led to question why DMPs have become the go-to tool for research data professionals and advocates of good data practice. By delineating multiple use-cases and highlighting the need for DMPs to be fit for intended purpose, we question the view that a good DMP is necessarily that which encompasses the entire data lifecycle of a project.

Finally, we summarise recent developments in the DMP landscape, and note a positive shift towards evidence-based research management through more researcher-centric, educative, and integrated DMP services.

URL : The History, Advocacy and Efficacy of Data Management Plans


Sharing health research data – the role of funders in improving the impact

Authors : Robert F. Terry, Katherine Littler, Piero L. Olliaro

Recent public health emergencies with outbreaks of influenza, Ebola and Zika revealed that the mechanisms for sharing research data are neither being used nor adequate for the purpose, particularly where data needs to be shared rapidly.

A review of research papers, including completed clinical trials related to priority pathogens, found only 31% (98 out of 319 published papers, excluding case studies) provided access to all the data underlying the paper – 65% of these papers give no information on how to find or access the data.

Only two clinical trials out of 58 on interventions for WHO priority pathogens provided any link in their registry entry to the background data.

Interviews with researchers revealed that the reasons for a reluctance to share data included a lack of confidence in the utility of the data; an absence of academic incentives for rapid dissemination that prevents subsequent publication; and a disconnect between those who are collecting the data and those who wish to use it quickly.

The role of the funders of research needs to change to address this. Funders need to engage early with the researchers and related stakeholders to understand their concerns and work harder to define more explicitly the benefits to all stakeholders.

Secondly, there needs to be a direct benefit to sharing data that is directly relevant to those people that collect and curate the data.

Thirdly, more work needs to be done to realise the intent of making data sharing resources more equitable, ethical and efficient.

Finally, a checklist of the issues that need to be addressed when designing new or revising existing data sharing resources should be created. This checklist would highlight the technical, cultural and ethical issues that need to be considered and point to examples of emerging good practice that can be used to address them.

URL : Sharing health research data – the role of funders in improving the impact


Evaluation of a novel cloud-based software platform for structured experiment design and linked data analytics

Authors : Hannes Juergens, Matthijs Niemeijer, Laura D. Jennings-Antipov, Robert Mans, Jack More, Antonius J. A. van Maris, Jack T. Pronk, Timothy S. Gardner

Open data in science requires precise definition of experimental procedures used in data generation, but traditional practices for sharing protocols and data cannot provide the required data contextualization.

Here, we explore implementation, in an academic research setting, of a novel cloud-based software system designed to address this challenge. The software supports systematic definition of experimental procedures as visual processes, acquisition and analysis of primary data, and linking of data and procedures in machine-computable form.

The software was tested on a set of quantitative microbial-physiology experiments. Though time-intensive, definition of experimental procedures in the software enabled much more precise, unambiguous definitions of experiments than conventional protocols.

Once defined, processes were easily reusable and composable into more complex experimental flows. Automatic coupling of process definitions to experimental data enables immediate identification of correlations between procedural details, intended and unintended experimental perturbations, and experimental outcomes.

Software-based experiment descriptions could ultimately replace terse and ambiguous ‘Materials and Methods’ sections in scientific journals, thus promoting reproducibility and reusability of published studies.

URL : Evaluation of a novel cloud-based software platform for structured experiment design and linked data analytics


Facilitating and Improving Environmental Research Data Repository Interoperability

Authors : Corinna Gries, Amber Budden, Christine Laney, Margaret O’Brien, Mark Servilla, Wade Sheldon, Kristin Vanderbilt, David Vieglais

Environmental research data repositories provide much needed services for data preservation and data dissemination to diverse communities with domain specific or programmatic data needs and standards.

Because they were developed independently, these repositories serve their communities well but rely on different technologies, data models and ontologies. Hence, the effectiveness and efficiency of these services can be vastly improved if repositories work together, adhering to a shared community platform that focuses on the implementation of agreed-upon standards and best practices for curation and dissemination of data.

Such a community platform drives forward the convergence of technologies and practices that will advance cross-domain interoperability. It will also facilitate contributions from investigators through standardized and streamlined workflows and provide increased visibility for the role of data managers and the curation services provided by data repositories, beyond preservation infrastructure.

Ten specific suggestions for such standardizations are outlined without any suggestions for priority or technical implementation. Although the recommendations are for repositories to implement, they have been chosen specifically with the data provider/data curator and synthesis scientist in mind.

URL : Facilitating and Improving Environmental Research Data Repository Interoperability


How are we Measuring Up? Evaluating Research Data Services in Academic Libraries

Authors : Heather L. Coates, Jake Carlson, Ryan Clement, Margaret Henderson, Lisa R Johnston, Yasmeen Shorish


In the years since the emergence of federal funding agency data management and sharing requirements, research data services (RDS) have expanded to dozens of academic libraries in the United States.

As these services have matured, service providers have begun to assess them. Given a lack of practical guidance in the literature, we seek to begin the discussion with several case studies and an exploration of four approaches suitable to assessing these emerging services.


This article examines five case studies that vary by staffing, drivers, and institutional context in order to begin a practice-oriented conversation about how to evaluate and assess research data services in academic libraries.

The case studies highlight some commonly discussed challenges, including insufficient training and resources, competing demands for evaluation efforts, and the tension between evidence that can be easily gathered and that which addresses our most important questions.

We explore reflective practice, formative evaluation, developmental evaluation, and evidence-based library and information practice for ideas to advance practice.


Data specialists engaged in providing research data services need strategies and tools with which to make decisions about their services. These range from identifying stakeholder needs to refining existing services to determining when to extend and discontinue declining services.

While the landscape of research data services is broad and diverse, there are common needs that we can address as a community. To that end, we have created a community-owned space to facilitate the exchange of knowledge and existing resources.

URL : How are we Measuring Up? Evaluating Research Data Services in Academic Libraries


Vers une culture de la donnée en SHS : Une étude à l’Université de Lille (Towards a data culture in the social sciences and humanities: a study at the Université de Lille)

Author : Joachim Schöpfel

Open science is among the priorities of the French state. Continuing the work undertaken by the French government on the digital transformation and modernisation of the state, the second national action plan 2018-2020, “Pour une action publique transparente et collaborative” (“For transparent and collaborative public action”), specifies that France “supports the implementation of open government principles to strengthen (…) access to research materials and results”.

The national plan for open science, presented in early July 2018, confirmed this ambition. The objective is for the data produced by publicly funded research to be progressively structured in accordance with the FAIR principles, preserved and, where possible, opened.

Our study, “Vers une culture de la donnée en SHS” (“Towards a data culture in the social sciences and humanities”), aims to contribute to implementing the open science ecosystem on the ground, on a university campus.

The study was carried out as part of the D4Humanities structuring project, with funding from the MESHS and the Conseil Régional Hauts-de-France, and it follows on from research conducted since 2013 by the GERiiCO laboratory.

Conducted as interviews with 51 researchers, doctoral students, laboratory heads, project leaders and engineers in charge of data, the study pursues three objectives:

  1. Put academic staff (back) at the heart of implementing the open science ecosystem on campus, with their needs, priorities and questions.
  2. Identify opportunities and obstacles for a data policy.
  3. Recommend ten actions to put in place to develop the data culture on campus.

Conducted as an audit of one particular site, in the field of the social sciences and humanities, the study has a pragmatic aim: to identify the elements essential to a coherent policy for the production, management and reuse of research data on a social sciences and humanities campus, and thereby to contribute to the appropriation of the concept of open science through a “cultivation of data, which gives meaning to scattered and specialised uses of open data”.

The first part (“Constats préalables”, preliminary findings) draws on two studies (Rennes 2, Lille 3) to better define the concept of research data and its “long tail” character; this part summarises the practices, motivations and expectations of academic staff in this area, in the social sciences and humanities.

It also addresses, in general terms, the question of data services and infrastructures. The second part (“Observations”) describes a contrasting landscape based on the interviews conducted in 2017 and 2018 on the social sciences and humanities campus of the Université de Lille.

Researchers’ priority needs are the security of data and systems, and communication within projects. The picture that emerges is a continuum of practices that are more or less effective, formalised and adequate, with sometimes uncertain governance, at the level of projects as well as of structures.

These practices are linked to disciplinary communities but, even more so, to scientific methods, equipment and topics. The third part (“Vers une culture de la donnée”, towards a data culture) briefly lists ten recommendations that together define a reference framework for implementing a data policy on a social sciences and humanities campus:

  1. Establish scientific steering
  2. Invest in a targeted way
  3. Target projects, not laboratories
  4. Use data management plans as a lever
  5. Provide responses to security constraints
  6. Provide responses to communication needs
  7. Provide responses to curation needs
  8. Offer several solutions for data preservation
  9. Institutionalise the link with the TGIR Huma-Num
  10. Support good practices

URL : Vers une culture de la donnée en SHS : Une étude à l’Université de Lille
