Building a Disciplinary, World‐Wide Data Infrastructure

Authors: Françoise Genova, Christophe Arviset, Bridget M. Almas, Laura Bartolo, Daan Broeder, Emily Law, Brian McMahon

Sharing scientific data with the objective of making it discoverable, accessible, reusable, and interoperable requires work and presents challenges that are being addressed at the disciplinary level, in particular to define how the data should be formatted and described.

This paper represents the Proceedings of a session held at SciDataCon 2016 (Denver, 12–13 September 2016). It explores the way a range of disciplines, namely materials science, crystallography, astronomy, earth sciences, humanities and linguistics, get organized at the international level to address those challenges.

The disciplinary culture with respect to data sharing, science drivers, organization, lessons learnt and the elements of the data infrastructure which are or could be shared with others are briefly described. Commonalities and differences are assessed.

Common key elements for success are identified: data sharing should be science driven; defining the disciplinary part of the interdisciplinary standards is mandatory but challenging; sharing of applications should accompany data sharing. Incentives such as journal and funding agency requirements are also similar.

For all, social aspects are more challenging than technological ones. Governance is more diverse, often specific to the discipline organization. Being problem‐driven is also a key factor of success for building bridges to enable interdisciplinary research.

Several international data organizations such as CODATA, RDA and WDS can facilitate the establishment of disciplinary interoperability frameworks. As a spin‐off of the session, a RDA Disciplinary Interoperability Interest Group is proposed to bring together representatives across disciplines to better organize and drive the discussion for prioritizing, harmonizing and efficiently articulating disciplinary needs.

URL : Building a Disciplinary, World‐Wide Data Infrastructure

DOI : http://doi.org/10.5334/dsj-2017-016


Pursuing Best Performance in Research Data Management by Using the Capability Maturity Model and Rubrics

Authors : Jian Qin, Kevin Crowston, Arden Kirkland

Objective

To support the assessment and improvement of research data management (RDM) practices and increase their reliability, this paper describes the development of a capability maturity model (CMM) for RDM. Improved RDM is now a critical need, but low awareness of, or a lack of, data management is still common among research projects.

Methods

A CMM includes four key elements: key practices, key process areas, maturity levels, and generic processes. These elements were determined for RDM by a review and synthesis of the published literature on and best practices for RDM.

Results

The RDM CMM includes five chapters describing five key process areas for research data management: 1) data management in general; 2) data acquisition, processing, and quality assurance; 3) data description and representation; 4) data dissemination; and 5) repository services and preservation.

In each chapter, key data management practices are organized into four groups according to the CMM’s generic processes: commitment to perform, ability to perform, tasks performed, and process assessment (combining the original measurement and verification).

For each area of practice, the document provides a rubric to help projects or organizations assess their level of maturity in RDM.
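The abstract describes the CMM's overall shape (five key process areas, each organized by four generic processes, with a rubric per area for self-assessment) but not its implementation. The following minimal Python sketch shows one way such a rubric-based self-assessment could be represented; the practice texts, the 1–5 maturity scale, and all names are illustrative assumptions, not taken from the paper.

```python
from dataclasses import dataclass

# Illustrative sketch only: the area and process names follow the abstract,
# but the practices and the 1-5 maturity scale are assumptions for demonstration.
PROCESS_AREAS = [
    "Data management in general",
    "Data acquisition, processing, and quality assurance",
    "Data description and representation",
    "Data dissemination",
    "Repository services and preservation",
]

GENERIC_PROCESSES = [
    "Commitment to perform",
    "Ability to perform",
    "Tasks performed",
    "Process assessment",
]

@dataclass
class RubricItem:
    area: str            # one of PROCESS_AREAS
    process_group: str   # one of GENERIC_PROCESSES
    practice: str        # the key practice being assessed
    maturity_level: int  # self-assessed level, e.g. 1 (ad hoc) to 5 (optimizing)

def area_maturity(items: list[RubricItem], area: str) -> float:
    """Average self-assessed maturity for one key process area."""
    scores = [i.maturity_level for i in items if i.area == area]
    return sum(scores) / len(scores) if scores else 0.0

# Example self-assessment for a single project (hypothetical values)
assessment = [
    RubricItem(PROCESS_AREAS[2], GENERIC_PROCESSES[0],
               "Metadata standard adopted for all datasets", 3),
    RubricItem(PROCESS_AREAS[2], GENERIC_PROCESSES[2],
               "Datasets documented at time of deposit", 2),
]
print(area_maturity(assessment, PROCESS_AREAS[2]))  # -> 2.5
```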

Conclusions

By helping organizations identify areas of strength and weakness, the RDM CMM provides guidance on where effort is needed to improve the practice of RDM.

URL : Pursuing Best Performance in Research Data Management by Using the Capability Maturity Model and Rubrics

DOI : https://doi.org/10.7191/jeslib.2017.1113

How to responsibly acknowledge research work in the era of big data and biobanks: ethical aspects of the Bioresource Research Impact Factor (BRIF)

Authors : Heidi Carmen Howard, Deborah Mascalzoni, Laurence Mabile, Gry Houeland, Emmanuelle Rial-Sebbag, Anne Cambon-Thomsen

Currently, a great deal of biomedical research in fields such as epidemiology, clinical trials and genetics is reliant on vast amounts of biological and phenotypic information collected and assembled in biobanks.

While many resources are being invested to ensure that comprehensive and well-organised biobanks are able to provide increased access to, and sharing of, biomedical samples and information, many barriers and challenges to such responsible and extensive sharing remain.

Germane to the discussion herein is the barrier to collecting and sharing bioresources that stems from the lack of proper recognition of the researchers and clinicians who developed the bioresource. Indeed, the efforts and resources invested to set up and sustain a bioresource can be enormous, and such work should be easily traced and properly recognised.

However, there is currently no such system that systematically and accurately traces and attributes recognition to those doing this work or the bioresource institution itself. As a beginning of a solution to the “recognition problem”, the Bioresource Research Impact Factor/Framework (BRIF) initiative was proposed almost a decade and a half ago and is currently under further development.

With the ultimate aim of increasing awareness and understanding of the BRIF, in this article, we contribute the following: (1) a review of the objectives and functions of the BRIF including the description of two tools that will help in the deployment of the BRIF, the CoBRA (Citation of BioResources in journal Articles) guideline, and the Open Journal of Bioresources (OJB); (2) the results of a small empirical study on stakeholder awareness of the BRIF and (3) a brief analysis of the ethical dimensions of the BRIF which allow it to be a positive contribution to responsible biobanking.

URL : How to responsibly acknowledge research work in the era of big data and biobanks: ethical aspects of the Bioresource Research Impact Factor (BRIF)

Alternative location : https://link.springer.com/article/10.1007/s12687-017-0332-6

De l’open data à l’open science : retour réflexif sur les méthodes et pratiques d’une recherche sur les données géographiques

Authors : Nathalie Pinède, Matthieu Noucher, Françoise Gourmelon, Karel Soumagnac-Colin

We draw on the experience of an ongoing research project to analyse how new fields of experimentation on the web are changing the conditions of scientific practice, from objects to methods, from open data to open science.

The massive growth of geographic data available on the web is reconfiguring research dynamics along three axes of transformation: research objects, methods, and practices. First, we highlight how the power issues surrounding cartography have shifted with the advent of the web and open data.

We then discuss the impacts on research methodology in the context of an interdisciplinary approach. Finally, we show how this research project fits into an open science approach.

URL : https://rfsic.revues.org/3200

The development of a research data policy at Wageningen University & Research: best practices as a framework

Authors: Hilde van Zeeland, Jacquelijn Ringersma

This case study describes the development of a Research Data Management (RDM) policy at Wageningen University & Research, the Netherlands. To develop this policy, an analysis was carried out of existing frameworks and principles on data management (such as the FAIR principles), as well as of the data management practices in the organisation.

These practices were defined through interviews with research groups. Using criteria drawn from the existing frameworks and principles, certain research groups were identified as 'best practices': cases where data management met the most important criteria.

These best practices were then used to inform the RDM policy. This approach shows how engagement with researchers can not only provide insight into their data management practices and needs, but also directly inform new policy guidelines.
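As a rough illustration of the kind of screening described above, the sketch below scores each research group's interview-derived practices against criteria drawn from existing frameworks (such as the FAIR principles) and flags best-practice cases. The criteria names, scoring, and threshold are hypothetical, not the study's actual rubric.

```python
# Hypothetical criteria loosely inspired by the FAIR principles; not the
# actual criteria used at Wageningen University & Research.
CRITERIA = ["findable", "accessible", "interoperable", "reusable", "documented"]

# Invented example of practices reported by two research groups.
group_practices = {
    "Group A": {"findable", "accessible", "interoperable", "reusable", "documented"},
    "Group B": {"findable", "documented"},
}

def is_best_practice(met: set[str], threshold: float = 0.8) -> bool:
    """Flag a group as a best-practice case if it meets most criteria."""
    return len(met & set(CRITERIA)) / len(CRITERIA) >= threshold

for group, met in group_practices.items():
    print(group, "best practice" if is_best_practice(met) else "needs support")
```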

URL : The development of a research data policy at Wageningen University & Research: best practices as a framework

DOI : http://doi.org/10.18352/lq.10215

Ethics approval in applications for open-access clinical trial data: An analysis of researcher statements to clinicalstudydatarequest.com

Authors : Derek So, Bartha M. Knoppers

Although there are a number of online platforms for sharing patient-level clinical trial data from industry sponsors, their policies are not harmonized regarding the role of local ethics approval in the research proposal review process.

The first and largest of these platforms is ClinicalStudyDataRequest.com (CSDR), which includes over three thousand trials from thirteen sponsors including GlaxoSmithKline, Novartis, Roche, Sanofi, and Bayer. CSDR asks applicants to state whether they have received ethics approval for their research proposal, but in most cases does not require that they submit evidence of approval.

However, the website does require that applicants without ethical approval state the reason it was not required. In order to examine the perspectives of researchers on this topic, we coded every response to that question received by CSDR between June 2014 and February 2017.

Of 111 applicants who stated they were exempt from ethics approval, 63% mentioned de-identification, 57% mentioned the use of existing data, 33% referred to local or jurisdictional regulations, and 20% referred to the approvals obtained by the original study.
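These percentages sum to more than 100% because a single statement could receive several codes. A minimal sketch of this kind of frequency count over coded responses is shown below; the codes and response sets are invented placeholders, not the actual CSDR data.

```python
from collections import Counter

# Invented examples: each applicant statement may carry several codes,
# so the resulting percentages can overlap.
coded_responses = [
    {"de-identification", "use of existing data"},
    {"local or jurisdictional regulations"},
    {"de-identification", "original study approvals"},
]

counts = Counter(code for codes in coded_responses for code in codes)
n = len(coded_responses)
for code, k in counts.most_common():
    print(f"{code}: {k}/{n} ({100 * k / n:.0f}%)")
```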

We conclude by examining the experience of CSDR within the broader context of the access mechanisms and policies currently being used by other data sharing platforms, and discuss how our findings might be used to help clinical trial data providers design clear and informative access documents.

URL : Ethics approval in applications for open-access clinical trial data: An analysis of researcher statements to clinicalstudydatarequest.com

DOI : https://doi.org/10.1371/journal.pone.0184491

Using Peer Review to Support Development of Community Resources for Research Data Management

Authors : Heather Soyka, Amber Budden, Viv Hutchison, David Bloom, Jonah Duckles, Amy Hodge, Matthew S. Mayernik, Timothée Poisot, Shannon Rauch, Gail Steinhart, Leah Wasser, Amanda L. Whitmire, Stephanie Wright

Objective

To ensure that resources designed to teach skills and best practices for scientific research data sharing and management are useful, the maintainers of those materials need to evaluate and update them to ensure their accuracy, currency, and quality.

This paper advances the use and process of outside peer review for community resources in addressing ongoing accuracy, quality, and currency issues. It further describes the next step of moving the updated materials to an online collaborative community platform for future iterative review in order to build upon mechanisms for open science, ongoing iteration, participation, and transparent community engagement.

Setting

Research data management resources were developed in support of the DataONE (Data Observation Network for Earth) project, which has deployed a sustainable, long-term network to ensure the preservation of, and access to, multi-scale, multi-discipline, and multi-national environmental and biological science data (Michener et al. 2012).

Created by members of the Community Engagement and Education (CEE) Working Group in 2011-2012, the freely available Educational Modules included three complementary components (slides, handouts, and exercises) that were designed to be adaptable for use in classrooms as well as for research data management training.

Methods

Because the modules were initially created and launched in 2011-2012, the current members of the (renamed) Community Engagement and Outreach (CEO) Working Group were concerned that the materials might already be, or could quickly become, outdated and should be reviewed for accuracy, currency, and quality.

In November 2015, the Working Group developed an evaluation rubric for use by outside reviewers. Review criteria were developed based on surveys and usage scenarios from previous DataONE projects.

Peer reviewers were selected from the DataONE community network for their expertise in the areas covered by one of the 11 educational modules. Reviewers were contacted in March 2016, and were asked to volunteer to complete their evaluations online within one month of the request, by using a customized Google form.

Results

For the 11 modules, 22 completed reviews were received by April 2016 from outside experts. Comments on all three components of each module (slides, handouts, and exercises) were compiled and evaluated by the postdoctoral fellow attached to the CEO Working Group.

These reviews contributed to the full evaluation and revision by members of the Working Group of all educational modules in September 2016. This review process, as well as the potential lack of funding for ongoing maintenance by Working Group members or paid staff, prompted the group to convert the modules to a more stable, non-proprietary format and move them to an online open repository hosting platform, GitHub.

These decisions were made to foster sustainability, community engagement, version control, and transparency.

Conclusion

Outside peer review of the modules by experts in the field was beneficial for highlighting areas of weakness or overlap in the education modules. The modules were initially created in 2011-2012 by an earlier iteration of the Working Group, and updates were needed because practices in the field are constantly evolving.

Because the review process was lengthy (approximately one year) relative to the rate of innovation in data management practices, the Working Group discussed other options that would allow community members to make updates available more quickly.

The intent of migrating the modules to an online collaborative platform (GitHub) is to allow for iterative updates and ongoing outside review, and to provide further transparency about accuracy, currency, and quality in the spirit of open science and collaboration.

Documentation about this project may be useful for others trying to develop and maintain educational resources for engagement and outreach, particularly in communities and spaces where information changes quickly and open platforms are already in common use.

URL : Using Peer Review to Support Development of Community Resources for Research Data Management

DOI : https://doi.org/10.7191/jeslib.2017.1114