Workflows Allowing Creation of Journal Article Supporting Information and Findable, Accessible, Interoperable, and Reusable (FAIR)-Enabled Publication of Spectroscopic Data

Authors : Agustin Barba, Santiago Dominguez, Carlos Cobas, David P. Martinsen, Charles Romain, Henry S. Rzepa, Felipe Seoane

There is an increasing focus on the part of academic institutions, funding agencies, and publishers, if not researchers themselves, on preservation and sharing of research data. Motivations for sharing include research integrity, replicability, and reuse.

One of the barriers to publishing data is the extra work involved in preparing data for publication once a journal article and its supporting information have been completed.

In this work, a method is described to generate both human- and machine-readable supporting information directly from the primary instrumental data files and to generate the metadata needed to ensure it is published in accordance with findable, accessible, interoperable, and reusable (FAIR) guidelines.

Using this approach, both the human-readable supporting information and the primary (raw) data can be submitted simultaneously with little extra effort.

Although traditionally the data package would be sent to a journal publisher for publication alongside the article, the data package could also be published independently in an institutional FAIR data repository.

Workflows are described that store the data packages and generate metadata appropriate for such a repository. The methods both to generate and to publish the data packages have been implemented for NMR data, but the concept is extensible to other types of spectroscopic data as well.

DOI : https://doi.org/10.1021/acsomega.8b03005

On a Quest for Cultural Change – Surveying Research Data Management Practices at Delft University of Technology

Authors : Heather Andrews Mancilla, Marta Teperek, Jasper van Dijck, Kees den Heijer, Robbert Eggermont, Esther Plomp, Yasemin Turkyilmaz-van der Velden, Shalini Kurapati

The Data Stewardship project is a new initiative from the Delft University of Technology (TU Delft) in the Netherlands. Its aim is to create mature working practices and policies regarding research data management across all TU Delft faculties.

The novelty of this project lies in having a dedicated person, the so-called ‘Data Steward’, embedded in each faculty to approach research data management from a more discipline-specific perspective. It is within this framework that a research data management survey was carried out at the faculties that had a Data Steward in place by July 2018.

The goal was to get an overview of general data management practices and to use the results as a benchmark for the project. The total response rate was 11 to 37% depending on the faculty.

Overall, the results show similar trends in all faculties and indicate a lack of awareness of data management topics such as automatic data backups, data ownership, the relevance of data management plans, the FAIR data principles and the use of research data repositories.

The results also show great interest in data management, as roughly 80% or more of the respondents in each faculty said they were interested in data management training and wished to see a summary of the survey results.

Thus, the survey helped identify the topics the Data Stewardship project is currently focusing on, through awareness campaigns and training at both university and faculty levels.

URL : On a Quest for Cultural Change – Surveying Research Data Management Practices at Delft University of Technology

Implementing the FAIR Data Principles in precision oncology: review of supporting initiatives

Authors : Charles Vesteghem, Rasmus Froberg Brøndum, Mads Sønderkær, Mia Sommer, Alexander Schmitz, Julie Støve Bødker, Karen Dybkær, Tarec Christoffer El-Galaly, Martin Bøgsted

Compelling research has recently shown that cancer is so heterogeneous that single research centres cannot produce enough data to fit prognostic and predictive models of sufficient accuracy. Data sharing in precision oncology is therefore of utmost importance.

The Findable, Accessible, Interoperable and Reusable (FAIR) Data Principles have been developed to define good practices in data sharing. Motivated by the ambition of applying the FAIR Data Principles to our own clinical precision oncology implementations and research, we have performed a systematic literature review of potentially relevant initiatives.

For clinical data, we suggest using the Genomic Data Commons model as a reference, as it provides a field-tested and well-documented solution. For the classification of diagnosis, of morphology and topography, and of drugs, we chose to follow the World Health Organization standards, i.e. the ICD-10, ICD-O-3 and Anatomical Therapeutic Chemical classifications, respectively.

For the bioinformatics pipeline, the Genome Analysis ToolKit Best Practices using Docker containers offer a coherent solution and have therefore been selected. Regarding the naming of variants, we follow the Human Genome Variation Society’s standard.

For the IT infrastructure, we have built a centralized solution to participate in data sharing through federated solutions such as the Beacon Networks.

DOI : https://doi.org/10.1093/bib/bbz044

Are Research Datasets FAIR in the Long Run?

Authors : Dennis Wehrle, Klaus Rechert

Currently, initiatives in Germany are developing infrastructure to accept and preserve dissertation data together with the dissertation texts (at the state level, bwDATA Diss; at the federal level, eDissPlus).

In contrast to specialized data repositories, these services will accept data from all kinds of research disciplines. To comply with the FAIR data principles (Wilkinson et al., 2016), preservation plans are required, because ensuring accessibility, interoperability and re-usability even for a minimum ten-year data retention period can become a major challenge.

Both for longevity and re-usability, file formats matter. In order to ensure access to data, the data’s encoding, i.e. their technical and structural representation in the form of file formats, needs to be understood. Hence, because of fast technical lifecycles, interoperability, re-use and in some cases even accessibility depend on the data’s format and on our future ability to parse or render it.

This leads to several practical questions regarding quality assurance, potential access options and necessary future preservation steps. In this paper, we analyze datasets from public repositories and apply a file format based long-term preservation risk model to support workflows and services for non-domain specific data repositories.
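
As a rough illustration of what a file-format-based risk assessment could look like, the sketch below assigns each file in a dataset a risk level based on its extension and counts the files per level. It is not the model used in the paper: the `FORMAT_RISK` table, its risk levels, and the function names `format_risk` and `dataset_risk_profile` are hypothetical and chosen only for this example.

```python
from collections import Counter
from pathlib import PurePosixPath

# Hypothetical risk levels per file extension. These values are
# illustrative assumptions, not figures taken from the paper.
FORMAT_RISK = {
    ".csv": "low",
    ".txt": "low",
    ".pdf": "low",
    ".xlsx": "medium",
    ".sav": "high",
}


def format_risk(filename: str) -> str:
    """Return the assumed risk level for a file, or 'unknown' if unlisted."""
    suffix = PurePosixPath(filename).suffix.lower()
    return FORMAT_RISK.get(suffix, "unknown")


def dataset_risk_profile(filenames):
    """Count the files of a dataset per assumed risk level."""
    return Counter(format_risk(name) for name in filenames)


# Example dataset with one format missing from the table.
profile = dataset_risk_profile(
    ["results.csv", "survey.SAV", "model.bin", "notes.txt"]
)
```

A repository could use such a profile to flag datasets with many "high" or "unknown" entries for closer inspection. A real risk model would likely rely on format identification rather than file extensions alone.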

DOI : https://doi.org/10.2218/ijdc.v13i1.659

Hors norme ? Une approche normative des données de la recherche (Outside the norm? A normative approach to research data)

Author : Joachim Schöpfel

We reflect on the role of norms and standards in research data management, in the context of open science policy.

Starting from a general definition of research data, we analyse the place and function of norms and standards in the different dimensions of the concept of data. In particular, we focus on three aspects that link the scientific process, the regulatory environment and research data: ethical protocols, research information systems and data management plans.

At the international level, we describe the normative effect of the FAIR principles which, by drawing on other norms and standards, create a kind of "cascade of standards" around platforms and repositories, with a direct impact on scientific practices.

URL : https://revue-cossi.info/numeros/n-5-2018-processus-normalisation-durabilite-information/730-5-2018-schopfel

Research data management in the French National Research Center (CNRS)

Authors : Joachim Schöpfel, Coline Ferrant, Francis Andre, Renaud Fabre

Purpose

The purpose of this paper is to present empirical evidence on the opinion and behaviour of French scientists (senior management level) regarding research data management (RDM).

Design/methodology/approach

The results are part of a nationwide survey on scientific information and documentation, conducted in 2014 by the French Research Center CNRS among 432 directors of French public research laboratories.

Findings

The paper presents empirical results about data production (types), management (human resources, IT, funding, and standards), data sharing and related needs, and highlights significant disciplinary differences.

Also, it appears that RDM and data sharing are not directly correlated with the commitment to open access. Regarding the FAIR data principles, the paper reveals that 68 per cent of all laboratory directors affirm that their data production and management are compliant with at least one of the FAIR principles.

But only 26 per cent are compliant with at least three principles, and fewer than 7 per cent are compliant with all four FAIR criteria. Laboratories in nuclear physics, the social sciences and humanities (SSH), and earth sciences and astronomy are ahead of other disciplines, especially concerning the findability and the availability of their data output.

The paper concludes with comments about research data service development and recommendations for an institutional RDM policy.

Originality/value

For the first time, a nationwide survey was conducted with the senior research management level from all scientific disciplines. Surveys on RDM usually assess individual data behaviours, skills and needs. This survey is different insofar as it addresses institutional and collective data practice.

The respondents did not report on their own data behaviours and attitudes but were asked to provide information about their laboratory. The response rate was high (>30 per cent), and the results provide good insight into the real support and uptake of RDM by senior research managers who provide both models (examples for good practice) and opinion leadership.

URL : https://hal.univ-lille3.fr/hal-01728541/

From Open Access to Open Data: collaborative work in the university libraries of Catalonia

Authors : Mireia Alcalá Ponce de León, Lluís Anglada i de Ferrer

In recent years, the scientific community and funding bodies have paid increasing attention to the data collected, generated or used in research activities. The dissemination of these data has become one of the constituent elements of Open Science.

For this reason, many funders are requiring or promoting the development of Data Management Plans and the deposit of open data following the FAIR principles (Findable, Accessible, Interoperable and Reusable).

The libraries and research offices of Catalan universities, which work in coordination within the Open Science Area of CSUC, offer support services for research data management. The work carried out at the Consortium level will be presented, as well as the implementation of the service in each university.

DOI : http://doi.org/10.18352/lq.10253