Pursuing Best Performance in Research Data Management by Using the Capability Maturity Model and Rubrics

Authors : Jian Qin, Kevin Crowston, Arden Kirkland

Objective

To support the assessment and improvement of research data management (RDM) practices and increase their reliability, this paper describes the development of a capability maturity model (CMM) for RDM. Improved RDM is now a critical need, but low awareness of, or lack of, data management is still common among research projects.

Methods

A CMM includes four key elements: key practices, key process areas, maturity levels, and generic processes. These elements were determined for RDM by a review and synthesis of the published literature on and best practices for RDM.

Results

The RDM CMM includes five chapters describing five key process areas for research data management: 1) data management in general; 2) data acquisition, processing, and quality assurance; 3) data description and representation; 4) data dissemination; and 5) repository services and preservation.

In each chapter, key data management practices are organized into four groups according to the CMM’s generic processes: commitment to perform, ability to perform, tasks performed, and process assessment (combining the original measurement and verification).

For each area of practice, the document provides a rubric to help projects or organizations assess their level of maturity in RDM.

Conclusions

By helping organizations identify areas of strength and weakness, the RDM CMM provides guidance on where effort is needed to improve the practice of RDM.

URL : Pursuing Best Performance in Research Data Management by Using the Capability Maturity Model and Rubrics

DOI : https://doi.org/10.7191/jeslib.2017.1113

The development of a research data policy at Wageningen University & Research: best practices as a framework

Authors : Hilde van Zeeland, Jacquelijn Ringersma

The current case study describes the development of a Research Data Management policy at Wageningen University & Research, the Netherlands. To develop this policy, an analysis was carried out of existing frameworks and principles on data management (such as the FAIR principles), as well as of the data management practices in the organisation.

These practices were defined through interviews with research groups. Using criteria drawn from the existing frameworks and principles, certain research groups were identified as ‘best-practices’: cases where data management was meeting the most important data management criteria.

These best-practices were then used to inform the RDM policy. This approach shows how engagement with researchers can not only provide insight into their data management practices and needs, but also directly inform new policy guidelines.

URL : The development of a research data policy at Wageningen University & Research: best practices as a framework

DOI : http://doi.org/10.18352/lq.10215

Using Peer Review to Support Development of Community Resources for Research Data Management

Authors : Heather Soyka, Amber Budden, Viv Hutchison, David Bloom, Jonah Duckles, Amy Hodge, Matthew S. Mayernik, Timothée Poisot, Shannon Rauch, Gail Steinhart, Leah Wasser, Amanda L. Whitmire, Stephanie Wright

Objective

To ensure that resources designed to teach skills and best practices for scientific research data sharing and management are useful, the maintainers of those materials need to evaluate and update them to ensure their accuracy, currency, and quality.

This paper advances the use and process of outside peer review for community resources in addressing ongoing accuracy, quality, and currency issues. It further describes the next step of moving the updated materials to an online collaborative community platform for future iterative review in order to build upon mechanisms for open science, ongoing iteration, participation, and transparent community engagement.

Setting

Research data management resources were developed in support of the DataONE (Data Observation Network for Earth) project, which has deployed a sustainable, long-term network to ensure the preservation of and access to multi-scale, multi-discipline, and multi-national environmental and biological science data (Michener et al. 2012).

Created by members of the Community Engagement and Education (CEE) Working Group in 2011-2012, the freely available Educational Modules included three complementary components (slides, handouts, and exercises) that were designed to be adaptable for use in classrooms as well as for research data management training.

Methods

Because the modules were initially created and launched in 2011-2012, the current members of the (renamed) Community Engagement and Outreach (CEO) Working Group were concerned that the materials could be, or could quickly become, outdated and should be reviewed for accuracy, currency, and quality.

In November 2015, the Working Group developed an evaluation rubric for use by outside reviewers. Review criteria were developed based on surveys and usage scenarios from previous DataONE projects.

Peer reviewers were selected from the DataONE community network for their expertise in the areas covered by one of the 11 educational modules. Reviewers were contacted in March 2016, and were asked to volunteer to complete their evaluations online within one month of the request, by using a customized Google form.

Results

For the 11 modules, 22 completed reviews were received by April 2016 from outside experts. Comments on all three components of each module (slides, handouts, and exercises) were compiled and evaluated by the postdoctoral fellow attached to the CEO Working Group.

These reviews contributed to the full evaluation and revision by members of the Working Group of all educational modules in September 2016. This review process, as well as the potential lack of funding for ongoing maintenance by Working Group members or paid staff, prompted the group to convert the modules to a more stable, non-proprietary format, and move them to an online open repository hosting platform, GitHub.

These decisions were made to foster sustainability, community engagement, version control, and transparency.

Conclusion

Outside peer review of the modules by experts in the field was beneficial for highlighting areas of weakness or overlap in the education modules. The modules were initially created in 2011-2012 by an earlier iteration of the Working Group, and updates were needed due to the constantly evolving practices in the field.

Because the review process was lengthy (approximately one year) relative to the rate of innovation in data management practices, the Working Group discussed other options that would allow community members to make updates available more quickly.

The intent of migrating the modules to an online collaborative platform (GitHub) is to allow for iterative updates and ongoing outside review, and to provide further transparency about accuracy, currency, and quality in the spirit of open science and collaboration.

Documentation about this project may be useful for others trying to develop and maintain educational resources for engagement and outreach, particularly in communities and spaces where information changes quickly, and open platforms are already in common use.

URL : Using Peer Review to Support Development of Community Resources for Research Data Management

DOI : https://doi.org/10.7191/jeslib.2017.1114

Recommended versus Certified Repositories: Mind the Gap

Authors : Sean Edward Husen, Zoë G. de Wilde, Anita de Waard, Helena Cousijn

Researchers are increasingly required to make research data publicly available in data repositories. Although several organisations propose criteria to recommend and evaluate the quality of data repositories, there is no consensus on what constitutes a good data repository.

In this paper, we investigate, first, which data repositories are recommended by various stakeholders (publishers, funders, and community organizations) and second, which repositories are certified by a number of organisations.

We then compare these two lists of repositories, and the criteria for recommendation and certification. We find that criteria used by organisations recommending and certifying repositories are similar, although the certification criteria are generally more detailed.

We distil the lists of criteria into seven main categories: “Mission”, “Community/Recognition”, “Legal and Contractual Compliance”, “Access/Accessibility”, “Technical Structure/Interface”, “Retrievability” and “Preservation”.

Although the criteria are similar, the lists of repositories that are recommended by the various agencies are very different. Out of all of the recommended repositories, less than 6% obtained certification.

As certification is becoming more important, steps should be taken to decrease this gap between recommended and certified repositories, and ensure that certification standards become applicable, and applied, to the repositories which researchers are currently using.

URL : Recommended versus Certified Repositories: Mind the Gap

DOI : https://doi.org/10.5334/dsj-2017-042

What do data curators care about? Data quality, user trust, and the data reuse plan

Author : Frank Andreas Sposito

Data curation is often defined as the practice of maintaining, preserving, and enhancing research data for long-term value and reusability. The role of data reuse in the data curation lifecycle is critical: increased reuse is the core justification for the often sizable expenditures necessary to build data management infrastructures and user services.

Yet recent studies have shown that data are being shared and reused through open data repositories at much lower levels than expected. These studies underscore a fundamental and often overlooked challenge in research data management that invites deeper examination of the roles and responsibilities of data curators.

This presentation will identify key barriers to data reuse, namely data quality and user trust, and propose a framework for implementing reuser-centric strategies to increase data reuse.

Using the concept of a “data reuse plan” it will highlight repository-based approaches to improve data quality and user trust, and address critical areas for innovation for data curators working in the absence of repository support.

URL : What do data curators care about? Data quality, user trust, and the data reuse plan

Alternative location : http://library.ifla.org/id/eprint/1797

Scientific data from and for the citizen

Authors : Sven Schade, Chrisa Tsinaraki, Elena Roglia

Powered by advances in technology, today’s Citizen Science projects cover a wide range of thematic areas and are carried out from local to global levels. This wealth of activities creates an abundance of data, for example, in the forms of observations submitted by mobile phones; readings of low-cost sensors; or more general information about people’s activities.

The management and possible sharing of this data has become a research topic in its own right. We conducted a survey in the summer of 2015 in order to collectively analyze the state of play in Citizen Science.

This paper summarizes our main findings related to data access, standardization and data preservation. We provide examples of good practices in each of these areas and outline actions to address identified challenges.

URL : http://firstmonday.org/ojs/index.php/fm/article/view/7842

Understanding Perspectives on Sharing Neutron Data at Oak Ridge National Laboratory

Authors : Devan Ray Donaldson, Shawn Martin, Thomas Proffen

Even though the importance of sharing data is frequently discussed, data sharing appears to be limited to a few fields, and practices within those fields are not well understood. This study examines perspectives on sharing neutron data collected at Oak Ridge National Laboratory’s neutron sources.

Operation at user facilities has traditionally focused on making data accessible to those who create them. The recent emphasis on open data is shifting the focus to ensure that the data produced are reusable by others.

This mixed methods research study included a series of surveys and focus group interviews in which 13 data consumers, data managers, and data producers answered questions about their perspectives on sharing neutron data.

Data consumers reported interest in reusing neutron data for comparison/verification of results against their own measurements and testing new theories using existing data. They also stressed the importance of establishing context for data, including how data are produced, how samples are prepared, units of measurement, and how temperatures are determined.

Data managers expressed reservations about reusing others’ data because they could not always be sure that the people responsible for interpreting the data had done so correctly.

Data producers described concerns about misuse of their data, competition with other users, and over-reliance on data producers to understand the data. We present the Consumers Managers Producers (CMP) Model for understanding the interplay of each group regarding data sharing.

We conclude with policy and system recommendations and discuss directions for future research.

URL : Understanding Perspectives on Sharing Neutron Data at Oak Ridge National Laboratory

DOI : http://doi.org/10.5334/dsj-2017-035