FAIR Forever? Accountabilities and Responsibilities in the Preservation of Research Data

Authors : Amy Currie, William Kilbride

Digital preservation is a fast-moving and growing community of practice of ubiquitous relevance, but in which capability is unevenly distributed. Within the open science and research data communities, digital preservation has a close alignment to the FAIR principles and is delivered through a complex specialist infrastructure comprising technology, staff and policy.

However, capacity erodes quickly, establishing a need for ongoing examination and review to ensure that skills, technology, and policy remain fit for changing purpose. To address this challenge, the Digital Preservation Coalition (DPC) conducted the FAIR Forever study, commissioned by the European Open Science Cloud (EOSC) Sustainability Working Group and funded by the EOSC Secretariat Project in 2020, to assess the current strengths, weaknesses, opportunities and threats to the preservation of research data across EOSC, and the feasibility of establishing shared approaches, workflows and services that would benefit EOSC stakeholders.

This paper draws from the FAIR Forever study to document and explore its key findings on the identified strengths, weaknesses, opportunities, and threats to the preservation of FAIR data in EOSC, and to the preservation of research data more broadly.

It begins with the background to the study and an overview of the methodology employed, which involved a desk-based assessment of the emerging EOSC vision, interviews with representatives of EOSC stakeholders, and focus groups with digital preservation specialists and data managers in research organizations.

It summarizes key findings on the need for clarity on digital preservation in the EOSC vision and for elucidation of roles, responsibilities, and accountabilities to mitigate risks to data, reputation, and sustainability. It then outlines the recommendations provided in the final report presented to the EOSC Sustainability Working Group.

To better ensure that research data can be FAIRer for longer, the recommendations of the study are presented with discussion of how they can be extended and applied to various research data stakeholders in and outside of EOSC, along with suggested ways to bring together the research data curation, management, and preservation communities to better ensure FAIRness now and in the long term.

URL : FAIR Forever? Accountabilities and Responsibilities in the Preservation of Research Data

DOI : https://doi.org/10.2218/ijdc.v16i1.768

Do I-PASS for FAIR? Measuring the FAIR-ness of Research Organizations

Authors : Jacquelijn Ringersma, Margriet Miedema

Given the increasing use of the FAIR acronym as an adjective in contexts other than data or data sets, the Dutch National Coordination Point for Research Data Management initiated a Task Group to work out the concept of a FAIR research organization.

The results of this Task Group are a definition of a FAIR-enabling organization and a method to measure the FAIR-ness of a research organization (the Do-I-PASS for FAIR method). The method can also aid in developing FAIR-enabling roadmaps for individual research institutions and at a national level.

This practice paper describes the development of the method and provides a couple of use cases for the application of the method in daily research data management practices in research organizations.

URL : Do I-PASS for FAIR? Measuring the FAIR-ness of Research Organizations

DOI : https://doi.org/10.5334/dsj-2021-030

From Conceptualization to Implementation: FAIR Assessment of Research Data Objects

Authors: Anusuriya Devaraju, Mustapha Mokrane, Linas Cepinskas, Robert Huber, Patricia Herterich, Jerry de Vries, Vesa Akerman, Hervé L’Hours, Joy Davidson, Michael Diepenbroek

Funders and policy makers have strongly recommended the uptake of the FAIR principles in scientific data management. Several initiatives are working on the implementation of the principles and standardized applications to systematically evaluate data FAIRness.

This paper presents practical solutions, namely metrics and tools, developed by the FAIRsFAIR project to pilot the FAIR assessment of research data objects in trustworthy data repositories. The metrics are mainly built on the indicators developed by the RDA FAIR Data Maturity Model Working Group.

The tools’ design and evaluation followed an iterative process. We present two applications of the metrics: an awareness-raising self-assessment tool and an automated FAIR data assessment tool.

Initial results of testing the tools with researchers and data repositories are discussed, and future improvements are suggested, including next steps to enable FAIR data assessment in the broader research data ecosystem.

URL : From Conceptualization to Implementation: FAIR Assessment of Research Data Objects

DOI : https://doi.org/10.5334/dsj-2021-004

Open Science and the Hype Cycle

Author : George Strawn

The introduction of a new technology or innovation is often accompanied by “ups and downs” in its fortunes. Gartner Inc. defined a so-called hype cycle to describe a general pattern that many innovations experience: technology trigger, peak of inflated expectations, trough of disillusionment, slope of enlightenment, and plateau of productivity.

This article will compare the ongoing introduction of Open Science (OS) with the hype cycle model and speculate on the relevance of that model to OS. Lest the title of this article mislead the reader, be assured that the author believes that OS should happen and that it will happen.

However, I also believe that the path to OS will be longer than many of us had hoped. I will give a brief history of today’s “semi-open” science, define what I mean by OS, define the hype cycle and locate where OS now sits on that cycle, and finally speculate on what it will take to traverse the cycle and rise to its plateau of productivity (as described by Gartner).

URL : Open Science and the Hype Cycle

DOI : https://doi.org/10.1162/dint_a_00081

Open Data Challenges in Climate Science

Authors : Francesca Eggleton, Kate Winfield

The purpose of this paper is to explore challenges in open climate data experienced by data scientists at the Centre for Environmental Data Analysis (CEDA). This paper explores two of the five V’s of Big Data, Volume and Variety.

These challenges are explored using the Sentinel satellite data and Climate Modelling Intercomparison Project phase six (CMIP6) data held in the CEDA Archive. To address the Big Data Volume challenge, this paper describes the approach developed by CEDA to manage large volumes of data through the allocation of storage as filesets.

These filesets allow CEDA to plan and track dataset storage volumes, a flexible approach that could be adopted by any data centre. To overcome the challenge of Variety, CEDA implements the Climate and Forecast (CF) conventions and standard names within archived data wherever possible.

Collaboration from the international science community through contributions to the moderation of CF standard names ensures these data then adhere to the FAIR (Findable, Accessible, Interoperable and Reusable) data principles.

Utilising data standards such as the CF standard names is recommended because it promotes data exchange and allows data from different sources to be compared. Addressing these Open Data challenges is crucial to ensure valuable climate data are made available to the scientific community to facilitate research that addresses one of society’s most pressing issues – climate change.

URL : Open Data Challenges in Climate Science

DOI : https://doi.org/10.5334/dsj-2020-052

Towards FAIR protocols and workflows: the OpenPREDICT use case

Authors : Remzi Celebi, Joao Rebelo Moreira, Ahmed A. Hassan, Sandeep Ayyar, Lars Ridder, Tobias Kuhn, Michel Dumontier

It is essential for the advancement of science that researchers share, reuse and reproduce each other’s workflows and protocols. The FAIR principles are a set of guidelines that aim to maximize the value and usefulness of research data, and emphasize the importance of making digital objects findable and reusable by others.

The question of how to apply these principles not just to data but also to the workflows and protocols that consume and produce them is still under debate and poses a number of challenges. In this paper we describe a two-fold approach of simultaneously applying the FAIR principles to scientific workflows as well as the involved data.

We apply and evaluate our approach on the case of the PREDICT workflow, a highly cited drug repurposing workflow. This includes FAIRification of the involved datasets, as well as applying semantic technologies to represent and store data about the detailed versions of the general protocol, of the concrete workflow instructions, and of their execution traces.

We propose a semantic model to address these specific requirements, which we evaluated by answering competency questions. This semantic model consists of classes and relations from a number of existing ontologies, including Workflow4ever, PROV, EDAM, and BPMN.

This allowed us then to formulate and answer new kinds of competency questions. Our evaluation shows the high degree to which our FAIRified OpenPREDICT workflow now adheres to the FAIR principles and the practicality and usefulness of being able to answer our new competency questions.

URL : Towards FAIR protocols and workflows: the OpenPREDICT use case

DOI : https://doi.org/10.7717/peerj-cs.281

FAIRness Literacy: The Achilles’ Heel of Applying FAIR Principles

Authors : Romain David, Laurence Mabile, Alison Specht, Sarah Stryeck, Mogens Thomsen, Mohamed Yahia, Clement Jonquet, Laurent Dollé, Daniel Jacob, Daniele Bailo, Elena Bravo, Sophie Gachet, Hannah Gunderman, Jean-Eudes Hollebecq, Vassilios Ioannidis, Yvan Le Bras, Emilie Lerigoleur, Anne Cambon-Thomsen, The Research Data Alliance – SHAring Reward and Credit (SHARC) Interest Group

The SHARC Interest Group of the Research Data Alliance was established to improve research crediting and rewarding mechanisms for scientists who wish to organise their data (and material resources) for community sharing.

This requires that data are findable and accessible on the Web, and comply with shared standards making them interoperable and reusable in alignment with the FAIR principles. It takes considerable time, energy, expertise and motivation.

It is imperative to facilitate these processes to encourage scientists to share their data. To that end, supporting FAIR-compliance processes and increasing human understanding of the FAIRness criteria (i.e., promoting FAIRness literacy), and not only the machine-readability of the criteria, are critical steps in the data sharing process.

Appropriate human-understandable criteria must be identified first in the FAIRness assessment processes and roadmap. This paper reports on lessons learned by the RDA SHARC Interest Group in identifying the processes required to prepare for FAIR implementation in communities that are not specifically data-skilled, and on the procedures and training that must be deployed and adapted to each practice and level of understanding.

These are essential milestones in developing the adapted support and credit-back mechanisms not yet in place.

URL : FAIRness Literacy: The Achilles’ Heel of Applying FAIR Principles

DOI : https://doi.org/10.5334/dsj-2020-032