Public Microbial Resource Centers: Key Hubs for Findable, Accessible, Interoperable, and Reusable (FAIR) Microorganisms and Genetic Materials

Authors : P. Becker, M. Bosschaerts, P. Chaerle, H.-M. Daniel, A. Hellemans, A. Olbrechts, L. Rigouts, A. Wilmotte, M. Hendrickx

In the context of open science, the availability of research materials is essential for knowledge accumulation and to maximize the impact of scientific research. In microbiology, microbial domain biological resource centers (mBRCs) have long-standing experience in preserving and distributing authenticated microbial strains and genetic materials (e.g., recombinant plasmids and DNA libraries) to support new discoveries and follow-on studies.

These culture collections play a central role in the conservation of microbial biodiversity and have expertise in cultivation, characterization, and taxonomy of microorganisms. Information associated with preserved biological resources is recorded in databases and is accessible through online catalogues.

Legal expertise developed by mBRCs guarantees end users the traceability and legality of the acquired material, notably with respect to the Nagoya Protocol. However, awareness of the advantages of depositing biological materials in professional repositories remains low, and the necessity of securing strains and genetic resources for future research must be emphasized.

This review describes the unique position of mBRCs in microbiology and molecular biology through their history, evolving roles, expertise, services, challenges, and international collaborations. It also calls for an increased deposit of strains and genetic resources, a responsibility shared by scientists, funding agencies, and publishers.

Journal policies requesting a deposit during submission of a manuscript represent one of the measures to make more biological materials available to the broader community, hence fully releasing their potential and improving openness and reproducibility in scientific research.

URL : https://orbi.uliege.be/bitstream/2268/240381/1/Applied%20and%20Environmental%20Microbiology-2019-Becker-e01444-19.full-1.pdf

Produire, analyser et partager des données ouvertes en Humanités Numériques : quelques bonnes pratiques [Producing, analysing, and sharing open data in the Digital Humanities: some good practices]

Author : Gérald Kembellec

Answering scientific questions in the humanities involves the digital processing of corpora. The digital humanities are becoming an important field that brings together knowledge and methods from various disciplines such as computer science, statistics, sociology, cartography, and linguistics.

This article, although rooted in information and communication sciences, draws on methods from neighbouring fields and is intended as a vade mecum for managing humanities data: the qualification, collection, processing, enrichment, documentation, and sharing of humanities data.

Here we put forward the concept of “FAIR data courtesy” in a scientific context: enhancing the value of corpora, in particular by sharing high-quality, documented datasets that are physically accessible and legally usable. We also stress the importance of ethics during the processing and use of research data.

URL : https://halshs.archives-ouvertes.fr/ISKOFRANCE2019/hal-02306958

Evaluating FAIR maturity through a scalable, automated, community-governed framework

Authors : Mark D. Wilkinson, Michel Dumontier, Susanna-Assunta Sansone, Luiz Olavo Bonino da Silva Santos, Mario Prieto, Dominique Batista, Peter McQuilton, Tobias Kuhn, Philippe Rocca-Serra, Mercè Crosas, Erik Schultes

Transparent evaluations of FAIRness are increasingly required by a wide range of stakeholders, from scientists to publishers, funding agencies and policy makers. We propose a scalable, automatable framework to evaluate digital resources that encompasses measurable indicators, open source tools, and participation guidelines, which come together to accommodate domain relevant community-defined FAIR assessments.

The components of the framework are: (1) Maturity Indicators – community-authored specifications that delimit a specific automatically-measurable FAIR behavior; (2) Compliance Tests – small Web apps that test digital resources against individual Maturity Indicators; and (3) the Evaluator, a Web application that registers, assembles, and applies community-relevant sets of Compliance Tests against a digital resource, and provides a detailed report about what a machine “sees” when it visits that resource.

We discuss the technical and social considerations of FAIR assessments, and how this translates to our community-driven infrastructure. We then illustrate how the output of the Evaluator tool can serve as a roadmap to assist data stewards to incrementally and realistically improve the FAIRness of their resources.
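
As a rough illustration of how the three components described above relate to one another, the following minimal sketch models a Compliance Test as a function tied to one Maturity Indicator and an evaluator as a loop that applies a chosen set of tests to a resource. Every name in it (`ComplianceTest`, `evaluate`, `COMMUNITY_TESTS`, the metadata fields, and the two pass/fail rules) is a hypothetical assumption made for this example; it is not the authors' Evaluator, whose tests are Web applications rather than local functions.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# A resource is modelled here as a flat metadata record. The field names
# ("identifier", "license") are illustrative assumptions.
Resource = Dict[str, str]


@dataclass
class ComplianceTest:
    """One automated check tied to a single Maturity Indicator."""
    indicator: str
    check: Callable[[Resource], bool]


def has_persistent_identifier(resource: Resource) -> bool:
    # Hypothetical rule: treat any DOI URL as a persistent identifier.
    return resource.get("identifier", "").startswith("https://doi.org/")


def has_license(resource: Resource) -> bool:
    # Hypothetical rule: a non-empty "license" field counts as a pass.
    return bool(resource.get("license", "").strip())


def evaluate(resource: Resource, tests: List[ComplianceTest]) -> Dict[str, bool]:
    """Apply every test to the resource and report pass/fail per indicator."""
    return {test.indicator: test.check(resource) for test in tests}


# A community-relevant set of tests; a different community could register others.
COMMUNITY_TESTS = [
    ComplianceTest("persistent identifier", has_persistent_identifier),
    ComplianceTest("explicit license", has_license),
]

if __name__ == "__main__":
    record = {"identifier": "https://doi.org/10.1038/s41597-019-0184-5"}
    for indicator, passed in evaluate(record, COMMUNITY_TESTS).items():
        print(f"{indicator}: {'pass' if passed else 'fail'}")
```

In this toy form, the per-indicator report plays the role of the roadmap mentioned above: each failed indicator points a data steward to one concrete, incremental improvement.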

DOI : https://doi.org/10.1038/s41597-019-0184-5

Toward collaborative open data science in metabolomics using Jupyter Notebooks and cloud computing

Authors : Kevin M. Mendez, Leighton Pritchard, Stacey N. Reinke, David I. Broadhurst

Background

A lack of transparency and reporting standards in the scientific community has led to increasing and widespread concerns relating to reproduction and integrity of results. As an omics science, which generates vast amounts of data and relies heavily on data science for deriving biological meaning, metabolomics is highly vulnerable to irreproducibility.

The metabolomics community has made substantial efforts to align with FAIR data standards by promoting open data formats, data repositories, online spectral libraries, and metabolite databases.

Open data analysis platforms also exist; however, they tend to be inflexible and rely on the user to adequately report their methods and results. To enable FAIR data science in metabolomics, methods and results need to be transparently disseminated in a manner that is rapid, reusable, and fully integrated with the published work.

To ensure broad use within the community such a framework also needs to be inclusive and intuitive for both computational novices and experts alike.

Aim of Review

To encourage metabolomics researchers from all backgrounds to take control of their own data science, mould it to their personal requirements, and enthusiastically share resources through open science.

Key Scientific Concepts of Review

This tutorial introduces the concept of interactive web-based computational laboratory notebooks. The reader is guided through a set of experiential tutorials specifically targeted at metabolomics researchers, based around the Jupyter Notebook web application, GitHub data repository, and Binder cloud computing platform.

DOI : https://doi.org/10.1007/s11306-019-1588-0

Workflows Allowing Creation of Journal Article Supporting Information and Findable, Accessible, Interoperable, and Reusable (FAIR)-Enabled Publication of Spectroscopic Data

Authors : Agustin Barba, Santiago Dominguez, Carlos Cobas, David P. Martinsen, Charles Romain, Henry S. Rzepa, Felipe Seoane

There is an increasing focus on the part of academic institutions, funding agencies, and publishers, if not researchers themselves, on preservation and sharing of research data. Motivations for sharing include research integrity, replicability, and reuse.

One of the barriers to publishing data is the extra work involved in preparing data for publication once a journal article and its supporting information have been completed.

In this work, a method is described to generate both human and machine-readable supporting information directly from the primary instrumental data files and to generate the metadata to ensure it is published in accordance with findable, accessible, interoperable, and reusable (FAIR) guidelines.

Using this approach, both the human readable supporting information and the primary (raw) data can be submitted simultaneously with little extra effort.

Although traditionally the data package would be sent to a journal publisher for publication alongside the article, the data package could also be published independently in an institutional FAIR data repository.

Workflows are described that store the data packages and generate metadata appropriate for such a repository. The methods both to generate and to publish the data packages have been implemented for NMR data, but the concept is extensible to other types of spectroscopic data as well.
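
To make the idea of generating repository metadata directly from primary data files more concrete, here is a minimal sketch that builds a machine-readable manifest (file name, size, and SHA-256 checksum) for a data package. It is not the workflow described in the article: the function name `build_manifest`, the manifest fields, and the example file names are all assumptions chosen for illustration, and a real repository would expect its own metadata schema.

```python
import hashlib
import json
from typing import Dict, List


def build_manifest(title: str, files: Dict[str, bytes]) -> Dict[str, object]:
    """Describe a data package: one entry per file, sorted by file name.

    `files` maps a file name to its raw content. The contents are passed
    in memory so the sketch stays self-contained; a real workflow would
    read the instrument files from disk.
    """
    entries: List[Dict[str, object]] = []
    for name in sorted(files):
        data = files[name]
        entries.append({
            "name": name,
            "bytes": len(data),
            # The checksum lets a reuser verify the file is unaltered.
            "sha256": hashlib.sha256(data).hexdigest(),
        })
    return {"title": title, "files": entries}


if __name__ == "__main__":
    # Hypothetical package content: placeholder bytes, not real NMR data.
    package = {
        "spectrum_raw.bin": b"\x00\x01\x02\x03",
        "summary.txt": b"human-readable supporting information",
    }
    print(json.dumps(build_manifest("Example data package", package), indent=2))
```

Keeping the human-readable summary and the raw file in the same manifest mirrors the point made above: both can be deposited together, with the metadata derived from the files rather than typed by hand.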

DOI : https://doi.org/10.1021/acsomega.8b03005

On a Quest for Cultural Change – Surveying Research Data Management Practices at Delft University of Technology

Authors : Heather Andrews Mancilla, Marta Teperek, Jasper van Dijck, Kees den Heijer, Robbert Eggermont, Esther Plomp, Yasemin Turkyilmaz-van der Velden, Shalini Kurapati

The Data Stewardship project is a new initiative from the Delft University of Technology (TU Delft) in the Netherlands. Its aim is to create mature working practices and policies regarding research data management across all TU Delft faculties.

The novelty of this project lies in having a dedicated person, the so-called ‘Data Steward’, embedded in each faculty to approach research data management from a more discipline-specific perspective. It is within this framework that a research data management survey was carried out at the faculties that had a Data Steward in place by July 2018.

The goal was to get an overview of general data management practices and to use the results as a benchmark for the project. The response rate ranged from 11% to 37%, depending on the faculty.

Overall, the results show similar trends in all faculties and indicate a lack of awareness of several data management topics, such as automatic data backups, data ownership, the relevance of data management plans, the FAIR data principles, and the use of research data repositories.

The results also show great interest towards data management, as more than ~80% of the respondents in each faculty claimed to be interested in data management training and wished to see the summary of survey results.

Thus, the survey helped identify the topics the Data Stewardship project is currently focusing on, by carrying out awareness campaigns and providing training at both university and faculty levels.

Implementing the FAIR Data Principles in precision oncology: review of supporting initiatives

Authors : Charles Vesteghem, Rasmus Froberg Brøndum, Mads Sønderkær, Mia Sommer, Alexander Schmitz, Julie Støve Bødker, Karen Dybkær, Tarec Christoffer El-Galaly, Martin Bøgsted

Compelling research has recently shown that cancer is so heterogeneous that single research centres cannot produce enough data to fit prognostic and predictive models of sufficient accuracy. Data sharing in precision oncology is therefore of utmost importance.

The Findable, Accessible, Interoperable and Reusable (FAIR) Data Principles have been developed to define good practices in data sharing. Motivated by the ambition of applying the FAIR Data Principles to our own clinical precision oncology implementations and research, we have performed a systematic literature review of potentially relevant initiatives.

For clinical data, we suggest using the Genomic Data Commons model as a reference as it provides a field-tested and well-documented solution. To classify diagnoses, morphology and topography, and drugs, we chose to follow the World Health Organization standards, i.e. the ICD10, ICD-O-3, and Anatomical Therapeutic Chemical classifications, respectively.

For the bioinformatics pipeline, the Genome Analysis ToolKit Best Practices using Docker containers offer a coherent solution and have therefore been selected. Regarding the naming of variants, we follow the Human Genome Variation Society’s standard.

For the IT infrastructure, we have built a centralized solution to participate in data sharing through federated solutions such as the Beacon Networks.

DOI : https://doi.org/10.1093/bib/bbz044