FAIREST: A Framework for Assessing Research Repositories

Authors : Mathieu d’Aquin, Fabian Kirstein, Daniela Oliveira, Sonja Schimmler, Sebastian Urbanek

The open science movement has gained significant momentum in recent years. This brings with it the need to store and share research artefacts, such as publications and research data. For this purpose, research repositories need to be established.

A variety of solutions exist for implementing such repositories, covering diverse features, ranging from custom depositing workflows to social media-like functions.

In this article, we introduce the FAIREST principles, a framework inspired by the well-known FAIR principles, but designed to provide a set of metrics for assessing and selecting solutions for creating digital repositories for research artefacts. The goal is to support decision makers in choosing such a solution when planning for a repository, especially at an institutional level.

The metrics included are therefore based on two pillars: (1) an analysis of established features and functionalities, drawn from existing dedicated, general-purpose, and commonly used solutions, and (2) a literature review on general requirements for digital repositories for research artefacts and related systems.

We further describe an assessment of 11 widespread solutions, with the goal of providing an overview of the current landscape of research data repository solutions and identifying gaps and research challenges to be addressed.

DOI : https://doi.org/10.1162/dint_a_00159

Reusable, FAIR Humanities Data : Creating Practical Guidance for Authors at Routledge Open Research

Author : Rebecca Grant

While stakeholders, including funding agencies and academic publishers, implement more stringent data sharing policies, challenges remain for researchers in the humanities, who are increasingly prompted to share their research data.

This paper outlines some key challenges of research data sharing in the humanities, and identifies existing work which has been undertaken to explore these challenges. It describes the current landscape regarding publishers’ research data sharing policies, and the impact which strong data policies can have, regardless of discipline.

Using Routledge Open Research as a case study, the development of a set of humanities-inclusive Open Data publisher data guidelines is then described. These include practical guidance on data sharing for humanities authors and close alignment with the FAIR Data Principles.


DOI : https://doi.org/10.2218/ijdc.v17i1.820

Increasing the Reuse of Data through FAIR-enabling the Certification of Trustworthy Digital Repositories

Authors : Benjamin Jacob Mathers, Hervé L’Hours

The long-term preservation of digital objects, and the means by which they can be reused, are addressed by both the FAIR Data Principles (Findable, Accessible, Interoperable, Reusable) and a number of standards bodies providing Trustworthy Digital Repository (TDR) certification, such as the CoreTrustSeal.

Though many of the requirements listed in the Core Trustworthy Data Repositories Requirements 2020–2022 Extended Guidance address the FAIR Data Principles indirectly, there is currently no formal ‘FAIR Certification’ offered by the CoreTrustSeal or other TDR standards bodies. To address this gap, the FAIRsFAIR project developed a number of tools and resources that facilitate the assessment of FAIR-enabling practices at the repository level as well as the FAIRness of datasets within them.

These include the CoreTrustSeal+FAIRenabling Capability Maturity model (CTS+FAIR CapMat), a FAIR-Enabling Trustworthy Digital Repositories Capability Maturity Self-Assessment template, and F-UJI, a web-based tool designed to assess the FAIRness of research data objects.

The success of such tools and resources ultimately depends upon community uptake. This requires a community-wide commitment to develop best practices to increase the reuse of data and to reach consensus on what these practices are.

One possible way of achieving community consensus would be through the creation of a network of FAIR-enabling TDRs, as proposed by FAIRsFAIR.


DOI : https://doi.org/10.2218/ijdc.v17i1.852

“Who Is the FAIRest of Them All?” Authors, Entities, and Journals Regarding FAIR Data Principles

Author : Luis Corujo

The perceived need to improve the infrastructure supporting the re-use of scholarly data since the second decade of the 21st century led to the design of a concise set of principles and metrics, named the FAIR Data Principles. This paper, part of an extended study, aims to identify the main authors, entities, and scientific journals linked to research conducted within the FAIR Data Principles.

The research was developed by means of a qualitative approach, using documentary research and a constant comparison method for codification and categorization of the sampled data.

The sample studied showed that most authors were located in the Netherlands, with Europe accounting for more than 70% of the authors considered. Most of these are researchers working in higher education institutions.

These entities can be found in most of the territorial-administrative areas under consideration, with the USA being the country with the most entities and Europe the world region where they are most numerous.

The journal with the most texts in the sample was Insights, and 2020 was the year in which the most texts were published. Two of the most prominent authors in the sampled texts were located in the Netherlands, while the other two were in France and Australia.


DOI : https://doi.org/10.3390/publications10030031

Globally Accessible Distributed Data Sharing (GADDS): a decentralized FAIR platform to facilitate data sharing in the life sciences

Authors : Pavel Vazquez, Kayoko Hirayama-Shoji, Steffen Novik, Stefan Krauss, Simon Rayner

Motivation

Technical advances have revolutionized the life sciences and researchers commonly face challenges associated with handling large amounts of heterogeneous digital data. The Findable, Accessible, Interoperable and Reusable (FAIR) principles provide a framework to support effective data management.

However, implementing this framework is beyond the means of most researchers in terms of resources and expertise, requiring awareness of metadata, policies, community agreements, and other factors such as vocabularies and ontologies.

Results

We have developed the Globally Accessible Distributed Data Sharing (GADDS) platform to facilitate FAIR-like data-sharing in cross-disciplinary research collaborations. The platform consists of (i) a blockchain-based metadata quality control system, (ii) a private cloud-like storage system and (iii) a version control system. GADDS is built with containerized technologies, providing minimal hardware standards and easing scalability, and offers decentralized trust via transparency of metadata, facilitating data exchange and collaboration.
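The blockchain-based metadata quality control described above can be pictured as a hash chain: each metadata record stores the hash of its predecessor, so any later edit to a record is detectable by every collaborator. The following Python sketch is purely illustrative; the record structure and function names are assumptions, not the actual GADDS implementation:

```python
import hashlib
import json

def record_hash(record):
    """Deterministic SHA-256 hash of a metadata record."""
    return hashlib.sha256(json.dumps(record, sort_keys=True).encode()).hexdigest()

def append_record(chain, metadata):
    """Append a metadata entry, linked to the previous entry by its hash."""
    prev = chain[-1]["hash"] if chain else "0" * 64
    entry = {"metadata": metadata, "prev_hash": prev}
    entry["hash"] = record_hash({"metadata": metadata, "prev_hash": prev})
    chain.append(entry)
    return chain

def verify(chain):
    """Recompute every link; any edited record breaks the chain."""
    prev = "0" * 64
    for entry in chain:
        if entry["prev_hash"] != prev:
            return False
        if entry["hash"] != record_hash(
            {"metadata": entry["metadata"], "prev_hash": entry["prev_hash"]}
        ):
            return False
        prev = entry["hash"]
    return True

chain = []
append_record(chain, {"dataset": "experiment-1", "format": "csv"})
append_record(chain, {"dataset": "experiment-2", "format": "hdf5"})
print(verify(chain))                      # True: chain intact
chain[0]["metadata"]["format"] = "xlsx"   # tamper with an earlier record
print(verify(chain))                      # False: tampering detected
```

Because each hash covers the previous hash, trust comes from transparency rather than a central authority: any party holding a copy of the chain can independently verify it.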

As a use case, we provide an example implementation in engineered living material technology within the Hybrid Technology Hub at the University of Oslo.


DOI : https://doi.org/10.1093/bioinformatics/btac362

Caching and Reproducibility: Making Data Science Experiments Faster and FAIRer

Authors : Moritz Schubotz, Ankit Satpute, André Greiner-Petter, Akiko Aizawa, Bela Gipp

Small to medium-scale data science experiments often rely on research software developed ad-hoc by individual scientists or small teams. Often there is no time to make the research software fast, reusable, and open access.

The consequence is twofold. First, subsequent researchers must spend significant work hours building upon the proposed hypotheses or experimental framework. In the worst case, others cannot reproduce the experiment and reuse the findings for subsequent research. Second, suppose the ad-hoc research software fails during often long-running, computationally expensive experiments.

In that case, the overall effort to iteratively improve the software and rerun the experiments creates significant time pressure on the researchers. We suggest making caching an integral part of the research software development process, even before the first line of code is written.

This article outlines caching recommendations for developing research software in data science projects. Our recommendations provide a perspective on circumventing common problems such as proprietary dependence, speed, etc. At the same time, caching contributes to the reproducibility of experiments in the open science workflow.
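As a minimal illustration of the idea (not taken from the article), a disk-backed cache in Python persists each expensive intermediate result, so a crashed long-running experiment can be rerun without recomputing finished steps; the decorator name and cache layout here are assumptions for the sketch:

```python
import hashlib
import pickle
from pathlib import Path

def disk_cache(cache_dir="cache"):
    """Cache a function's results on disk, keyed by function name and arguments."""
    Path(cache_dir).mkdir(exist_ok=True)
    def decorator(func):
        def wrapper(*args, **kwargs):
            key = hashlib.sha256(
                pickle.dumps((func.__name__, args, sorted(kwargs.items())))
            ).hexdigest()
            path = Path(cache_dir) / f"{key}.pkl"
            if path.exists():                           # result from an earlier run
                return pickle.loads(path.read_bytes())
            result = func(*args, **kwargs)
            path.write_bytes(pickle.dumps(result))      # persist for later reruns
            return result
        return wrapper
    return decorator

@disk_cache()
def expensive_step(n):
    return sum(i * i for i in range(n))  # stand-in for a costly computation

print(expensive_step(10))  # computed once, then written to the cache
print(expensive_step(10))  # served from the on-disk cache on every later call
```

Because the cache survives the process, a rerun after a crash skips every step already completed, which is the core of the time savings the authors aim for.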

Concerning the four guiding principles of Findability, Accessibility, Interoperability, and Reusability (FAIR), we foresee that including the proposed recommendations in research software development will make the data related to that software FAIRer for both machines and humans.

We demonstrate the usefulness of some of the proposed recommendations using our recently completed research software project in mathematical information retrieval.


DOI : https://doi.org/10.3389/frma.2022.861944

The Data Life Aquatic: Oceanographers’ Experience with Interoperability and Re-usability

Authors : Bradley Wade Bishop, Carolyn F Hank, Joel T Webster

This paper assesses data consumers’ perspectives on the interoperable and re-usable aspects of the FAIR Data Principles. Taking a domain-specific informatics approach, ten oceanographers were asked to think of a recent search for data and describe their process of discovery, evaluation, and use.

The interview schedule, derived from the FAIR Data Principles, included questions about the interoperability and re-usability of data. Through this critical incident technique, findings on data interoperability and re-usability give data curators valuable insights into how real-world users access, evaluate, and use data.

Results from this study show that oceanographers utilize tools that make re-use simple, with interoperability seamless within the systems used. The processes employed by oceanographers present a good baseline for other domains adopting the FAIR Data Principles.


DOI : https://doi.org/10.2218/ijdc.v16i1.635