A Conceptual Enterprise Framework for Managing Scientific Data Stewardship

Authors : Ge Peng, Jeffrey L. Privette, Curt Tilmes, Sky Bristol, Tom Maycock, John J. Bates, Scott Hausman, Otis Brown, Edward J. Kearns

Scientific data stewardship is an important part of long-term preservation and the use/reuse of digital research data. It is critical for ensuring trustworthiness of data, products, and services, which is important for decision-making.

Recent U.S. federal government directives and scientific organization guidelines have levied specific requirements, increasing the need for a more formal approach to ensuring that stewardship activities support compliance verification and reporting.

However, many science data centers lack an integrated, systematic, and holistic framework to support such efforts. The current business- and process-oriented stewardship frameworks are too costly and lengthy for most data centers to implement.

They often do not explicitly address the federal stewardship requirements and/or the uniqueness of geospatial data. This work proposes a data-centric conceptual enterprise framework for managing stewardship activities, based on the philosophy behind the Plan-Do-Check-Act (PDCA) cycle, a proven industrial concept.

This framework, which includes the application of maturity assessment models, allows for quantitative evaluation of how organizations manage their stewardship activities and supports informed decision-making for continual improvement towards full compliance with federal, agency, and user requirements.

DOI : http://doi.org/10.5334/dsj-2018-015

Communicating data: interactive infographics, scientific data and credibility

Authors : Nan Li, Dominique Brossard, Dietram A. Scheufele, Paul H. Wilson, Kathleen M. Rose

Information visualization could be used to leverage the credibility of displayed scientific data. However, little was known about how display characteristics interact with individuals’ predispositions to affect perception of data credibility.

Using an experiment with 517 participants, we tested perceptions of data credibility by manipulating data visualizations related to the issue of nuclear fuel cycle based on three characteristics: graph format, graph interactivity, and source attribution.

Results showed that viewers tend to rely on preexisting levels of trust and peripheral cues, such as source attribution, to judge the credibility of shown data, whereas their comprehension level did not relate to perception of data credibility. We discussed the implications for science communicators and design professionals.

DOI : https://doi.org/10.22323/2.17020206

Conceptualizing Data Curation Activities Within Two Academic Libraries

Authors : Sophia Lafferty-Hess, Julie Rudder, Moira Downey, Susan Ivey, Jennifer Darragh

A growing focus on sharing research data that meet certain standards, such as the FAIR guiding principles, has resulted in libraries increasingly developing and scaling up support for research data.

As libraries consider what new data curation services they would like to provide as part of their repository programs, there are various questions that arise surrounding scalability, resource allocation, requisite expertise, and how to communicate these services to the research community.

Data curation can involve a variety of tasks and activities. Some of these activities can be managed by systems, some require human intervention, and some require highly specialized domain or data type expertise.

At the 2017 Triangle Research Libraries Network Institute, staff from the University of North Carolina at Chapel Hill and Duke University used the 47 data curation activities identified by the Data Curation Network project to create conceptual groupings of data curation activities.

The results of this “thought-exercise” are discussed in this white paper. The purpose of this exercise was to provide more specificity around data curation within our individual contexts as a method to consistently discuss our current service models, identify gaps we would like to fill, and determine what is currently out of scope.

We hope to foster an open and productive discussion throughout the larger academic library community about how we prioritize data curation activities as we face growing demand and limited resources.

DOI : https://dx.doi.org/10.17605/OSF.IO/ZJ5PQ

Developing research data management services and support for researchers

Authors : Laure Perrier, Leslie Barnes

This mixed method study determined the essential tools and services required for research data management to aid academic researchers in fulfilling emerging funding agency and journal requirements. Focus groups were conducted and a rating exercise was designed to rank potential services.

Faculty conducting research at the University of Toronto were recruited; 28 researchers participated in four focus groups from June to August 2016. Two investigators independently coded the transcripts from the focus groups and identified four themes: 1) seamless infrastructure, 2) data security, 3) developing skills and knowledge, and 4) anxiety about releasing data.

Researchers require assistance with the secure storage of data and favour tools that are easy to use. Increasing knowledge of best practices in research data management is necessary and can be supported by the library using multiple strategies.

These findings help our library identify and prioritize tools and services in order to allocate resources in support of research data management on campus.

DOI : https://doi.org/10.21083/partnership.v13i1.4115

Modelling the Research Data Lifecycle

Author: Stacy T Kowalczyk

This paper develops and tests a lifecycle model for the preservation of research data by investigating the research practices of scientists. This research is based on a mixed-method approach.

An initial study was conducted using case study analytical techniques; insights from these case studies were combined with grounded theory in order to develop a novel model of the Digital Research Data Lifecycle.

A broad-based quantitative survey was then constructed to test and extend the components of the model. The major contribution of these research initiatives is the creation of the Digital Research Data Lifecycle, a data lifecycle that provides a generalized model of the research process to better describe and explain both the antecedents and barriers to preservation.

The antecedents and barriers to preservation are data management, contextual metadata, file formats, and preservation technologies. The availability of data management support and preservation technologies, the ability to create and manage contextual metadata, and the choices of file formats all significantly affect the preservability of research data.

DOI : https://doi.org/10.2218/ijdc.v12i2.429

A Data-Driven Approach to Appraisal and Selection at a Domain Data Repository

Authors : Amy M Pienta, Dharma Akmon, Justin Noble, Lynette Hoelter, Susan Jekielek

Social scientists are producing an ever-expanding volume of data, leading to questions about appraisal and selection of content given finite resources to process data for reuse. We analyze users’ search activity in an established social science data repository to better understand demand for data and more effectively guide collection development.

By applying a data-driven approach, we aim to ensure curation resources are applied to make the most valuable data findable, understandable, accessible, and usable. We analyze data from a domain repository for the social sciences that includes over 500,000 annual searches in 2014 and 2015 to better understand trends in user search behavior.

Using a newly created search-to-study ratio technique, we identified gaps in the domain data repository’s holdings and leveraged this analysis to inform our collection and curation practices and policies.
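
The abstract does not define the search-to-study ratio, so the following is only a minimal sketch of one plausible reading: for each search term, divide the number of user searches by the number of studies in the holdings that match the term. The function name, the term list, and the study counts are all hypothetical and are not taken from the paper.

```python
from collections import Counter

def search_to_study_ratio(search_terms, study_counts):
    """Return searches-per-study for each searched term.

    Assumed definition (not from the paper): ratio = searches / studies.
    Terms with no matching studies get infinity, flagging a possible gap.
    """
    searches = Counter(search_terms)
    ratios = {}
    for term, n_searches in searches.items():
        n_studies = study_counts.get(term, 0)
        ratios[term] = n_searches / n_studies if n_studies else float("inf")
    return ratios

# Hypothetical search log and holdings counts, for illustration only.
search_terms = ["health", "health", "education", "crime"]
study_counts = {"health": 4, "education": 1}

ratios = search_to_study_ratio(search_terms, study_counts)
for term, ratio in sorted(ratios.items(), key=lambda item: item[1], reverse=True):
    print(term, ratio)
```

Under this reading, a high ratio suggests demand that outstrips the holdings for that term, which is the kind of gap the abstract describes.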

The evaluative technique we propose in this paper will serve as a baseline for future studies looking at trends in user demand over time at the domain data repository being studied with broader implications for other data repositories.

DOI : https://doi.org/10.2218/ijdc.v12i2.500

The Changing Influence of Journal Data Sharing Policies on Local RDM Practices

Authors : Dylanne Dearborn, Steve Marks, Leanne Trimble

The purpose of this study was to examine changes in research data deposit policies of highly ranked journals in the physical and applied sciences between 2014 and 2016, as well as to develop an approach to examining the institutional impact of deposit requirements.

Policies from the top ten journals (ranked by impact factor from the Journal Citation Reports) were examined in 2014 and again in 2016 in order to determine if data deposits were required or recommended, and which methods of deposit were listed as options.

For all 2016 journals with a required data deposit policy, publication information (2009-2015) for the University of Toronto was pulled from Scopus and departmental affiliation was determined for each article.

The results showed that the number of high-impact journals in the physical and applied sciences requiring data deposit is growing. In 2014, 71.2% of journals had no policy, 14.7% had a recommended policy, and 13.9% had a required policy (n=836).

In contrast, in 2016, there were 58.5% with no policy, 19.4% with a recommended policy, and 22.0% with a required policy (n=880). It was also evident that U of T chemistry researchers are by far the most heavily affected by these journal data deposit requirements, having published 543 publications, representing 32.7% of all publications in the titles requiring data deposit in 2016.
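
The shift between the two years is easier to compare as percentage-point changes. The short sketch below uses only the percentages quoted above; the variable and function names are illustrative.

```python
# Policy shares (percent of journals) as reported in the abstract.
shares_2014 = {"none": 71.2, "recommended": 14.7, "required": 13.9}
shares_2016 = {"none": 58.5, "recommended": 19.4, "required": 22.0}

def point_change(before, after):
    """Return the percentage-point change per category, rounded to one decimal."""
    return {key: round(after[key] - before[key], 1) for key in before}

changes = point_change(shares_2014, shares_2016)
for category, delta in changes.items():
    print(f"{category}: {delta:+.1f} percentage points")
```

On these figures, the share of journals with no policy fell by 12.7 points, while recommended and required policies rose by 4.7 and 8.1 points respectively.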

The Python scripts used to retrieve institutional publications based on a list of ISSNs have been released on GitHub so that other institutions can conduct similar research.

DOI : https://doi.org/10.2218/ijdc.v12i2.583