A study of the impact of data sharing on article citations using journal policies as a natural experiment

Authors : Garret Christensen, Allan Dafoe, Edward Miguel, Don A. Moore, Andrew K. Rose

This study estimates the effect of data sharing on the citations of academic articles, using journal policies as a natural experiment. We begin by examining 17 high-impact journals that have adopted the requirement that data from published articles be publicly posted.

We match these 17 journals to 13 journals without policy changes and find that empirical articles published just before their change in editorial policy have citation rates with no statistically significant difference from those published shortly after the shift.

We then ask whether this null result stems from poor compliance with data sharing policies, and use the data sharing policy changes as instrumental variables to examine more closely two leading journals in economics and political science with relatively strong enforcement of new data policies.

We find that articles that make their data available receive 97 additional citations (standard error of the estimate: 34).

We conclude that: a) authors who share data may be rewarded eventually with additional scholarly citations, and b) data-posting policies alone do not increase the impact of articles published in a journal unless those policies are enforced.
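The instrumental-variable logic described above can be illustrated with a minimal sketch. It assumes the simplest possible setting: one binary instrument (whether an article appeared after the policy change), one binary treatment (whether its data were shared), and no covariates, in which case the IV estimate reduces to the Wald ratio. The `Article` type, the `wald_iv_estimate` function, and the eight example records are all hypothetical and invented for illustration; they are not the authors' data, and the paper's actual specification is not given in the abstract and may well differ.

```python
from statistics import mean
from typing import NamedTuple, Sequence


class Article(NamedTuple):
    """One article in a toy data set (all field names are hypothetical)."""
    after_policy: int   # instrument Z: 1 if published after the policy change
    shared_data: int    # treatment D: 1 if the article's data were posted
    citations: float    # outcome Y: citation count


def wald_iv_estimate(articles: Sequence[Article]) -> float:
    """Wald (ratio) IV estimate of the effect of sharing on citations.

    Reduced form: change in mean citations across the policy change.
    First stage:  change in the share of articles posting data.
    The estimate is reduced form divided by first stage.
    """
    before = [a for a in articles if a.after_policy == 0]
    after = [a for a in articles if a.after_policy == 1]
    if not before or not after:
        raise ValueError("need articles from both before and after the policy")

    reduced_form = (mean(a.citations for a in after)
                    - mean(a.citations for a in before))
    first_stage = (mean(a.shared_data for a in after)
                   - mean(a.shared_data for a in before))
    if first_stage == 0:
        # The policy did not change sharing behaviour, so it cannot
        # identify the effect of sharing (a policy with no compliance).
        raise ValueError("policy did not shift data sharing; no IV estimate")
    return reduced_form / first_stage


# Invented toy data: sharing rises from 1 in 4 to 3 in 4 after the policy.
example_articles = [
    Article(0, 1, 30.0), Article(0, 0, 10.0),
    Article(0, 0, 10.0), Article(0, 0, 10.0),
    Article(1, 1, 30.0), Article(1, 1, 30.0),
    Article(1, 1, 30.0), Article(1, 0, 10.0),
]

if __name__ == "__main__":
    print(wald_iv_estimate(example_articles))
```

In this toy data, mean citations rise by 10 while the sharing rate rises by 0.5, so the ratio attributes 20 additional citations to sharing. The zero first-stage check mirrors conclusion b): a policy that is not enforced does not shift sharing and therefore cannot raise citations through it.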

URL : A study of the impact of data sharing on article citations using journal policies as a natural experiment

DOI : https://doi.org/10.1371/journal.pone.0225883

Resurfacing Historical Scientific Data: A Case Study Involving Fruit Breeding Data

Authors : Shannon L. Farrell, Lois G. Hendrickson, Kristen L. Mastel, Katherine Adina Allen, Julia A. Kelly

Objective

The objective of this paper is to illustrate the importance and complexities of working with historical analog data that exists on university campuses. Using a case study of fruit breeding data, we highlight issues and opportunities for librarians to help preserve and increase access to potentially valuable data sets.

Methods

We worked in conjunction with researchers to inventory, describe, and increase access to a large, 100-year-old data set of analog fruit breeding data. This involved creating a spreadsheet to capture metadata about each data set, identifying data sets at risk for loss, and digitizing select items for deposit in our institutional repository.

Results/Discussion

We illustrate that large amounts of data exist within biological and agricultural sciences departments and labs, and how past practices of data collection, record keeping, storage, and management have hindered data reuse.

We demonstrate that librarians have a role in collaborating with researchers and providing direction in how to preserve analog data and make it available for reuse. This work may provide guidance for other science librarians pursuing similar projects.

Conclusions

This case study demonstrates how science librarians can build or strengthen their role in managing and providing access to analog data by combining their data management skills with researchers’ needs to recover and reuse data.

URL : Resurfacing Historical Scientific Data: A Case Study Involving Fruit Breeding Data

DOI : https://doi.org/10.7191/jeslib.2019.1171

“Data Stewardship Wizard”: A Tool Bringing Together Researchers, Data Stewards, and Data Experts around Data Management Planning

Authors : Robert Pergl, Rob Hooft, Marek Suchánek, Vojtěch Knaisl, Jan Slifka

The Data Stewardship Wizard is a tool for data management planning that is focused on getting the most value out of data management planning for the project itself rather than on fulfilling obligations.

It is based on FAIR Data Stewardship, in which each data-related decision in a project acts to optimize the Findability, Accessibility, Interoperability and/or Reusability of the data.

The background to this philosophy is that the first reuser of the data is the researcher themselves. The tool encourages the consulting of expertise and experts, can help researchers avoid risks they did not know they would encounter by confronting them with practical experience from others, and can help them discover helpful technologies they did not know existed.

In this paper, we discuss the context and motivation for the tool, explain its architecture, and present its key functions: knowledge model evolvability and migrations, assembly of data management plans, and metrics for evaluating data management plans.

URL : “Data Stewardship Wizard”: A Tool Bringing Together Researchers, Data Stewards, and Data Experts around Data Management Planning

DOI : http://doi.org/10.5334/dsj-2019-059

Data Curation for Big Interdisciplinary Science: The Pulley Ridge Experience

Authors : Timothy B. Norris, Christopher C. Mader

The curation and preservation of scientific data has long been recognized as an essential activity for the reproducibility of science and the advancement of knowledge. While investment into data curation for specific disciplines and at individual research institutions has advanced the ability to preserve research data products, data curation for big interdisciplinary science remains relatively unexplored terrain.

To fill this lacuna, this article presents a case study of the data curation for the National Centers for Coastal Ocean Science (NCCOS) funded project “Understanding Coral Ecosystem Connectivity in the Gulf of Mexico-Pulley Ridge to the Florida Keys” undertaken from 2011 to 2018 by more than 30 researchers at several research institutions.

The data curation process is described and a discussion of strengths, weaknesses and lessons learned is presented. Major conclusions from this case study include: the reimplementation of data repository infrastructure builds valuable institutional data curation knowledge but may not meet data curation standards and best practices; data from big interdisciplinary science can be considered as a special collection with the implication that metadata takes the form of a finding aid or catalog of datasets within the larger project context; and there are opportunities for data curators and librarians to synthesize and integrate results across disciplines and to create exhibits as stories that emerge from interdisciplinary big science.

URL : Data Curation for Big Interdisciplinary Science: The Pulley Ridge Experience

Alternative location : https://escholarship.umassmed.edu/jeslib/vol8/iss2/8/

Peer Review of Research Data Submissions to ScholarsArchive@OSU: How can we improve the curation of research datasets to enhance reusability?

Authors : Clara Llebot, Steven Van Tuyl

Objective

Best practices such as the FAIR Principles (Findability, Accessibility, Interoperability, Reusability) were developed to ensure that published datasets are reusable. While we employ best practices in the curation of datasets, we want to learn how domain experts view the reusability of datasets in our institutional repository, ScholarsArchive@OSU.

Curation workflows are designed by data curators based on their own recommendations, but research data is extremely specialized, and such workflows are rarely evaluated by researchers.

In this project we used peer review by domain experts to evaluate the reusability of the datasets in our institutional repository, with the goal of informing our curation methods and ensuring that the limited resources of our library are maximizing the reusability of research data.

Methods

We asked all researchers who have datasets submitted in Oregon State University’s repository to refer us to domain experts who could review the reusability of their datasets. Two data curators who are non-experts also reviewed the same datasets.

We gave both groups review guidelines based on the guidelines of several journals. Eleven domain experts and two data curators reviewed eight datasets.

The review included the quality of the repository record, the quality of the documentation, and the quality of the data. We then compared the comments given by the two groups.

Results

Domain experts and non-expert data curators largely converged on similar scores for reviewed datasets, but the focus of critique by domain experts was somewhat divergent.

A few broad issues common across reviews were: insufficient documentation, the use of links to journal articles in the place of documentation, and concerns about duplication of effort in creating documentation and metadata. Reviews also reflected the background and skills of the reviewer.

Domain experts expressed a lack of expertise in data curation practices and data curators expressed their lack of expertise in the research domain.

Conclusions

The results of this investigation could help guide future research data curation activities and align domain expert and data curator expectations for reusability of datasets.

We recommend further exploration of these common issues and additional domain expert peer-review projects to further refine and align expectations for research data reusability.

URL : Peer Review of Research Data Submissions to ScholarsArchive@OSU: How can we improve the curation of research datasets to enhance reusability?

DOI : https://doi.org/10.7191/jeslib.2019.1166

Digging into data management in public‐funded, international research in digital humanities

Authors : Alex H. Poole, Deborah A. Garwood

Path‐breaking in theory and practice alike, digital humanities (DH) not only secures a larger public audience for humanities and social sciences research, but also permits researchers to ask novel questions and to revisit familiar ones. Public‐funded, international, and collaborative research in DH furthers institutional research missions and enriches networked knowledge.

The Digging into Data 3 challenge (DID3) (2014–2016), an international and interdisciplinary grant initiative embracing big data, included 14 teams sponsored by 10 funders from four nations.

A qualitative case study that relies on purposive sampling and grounded analysis, this article centers on the information practices of DID3 participants. Semistructured interviews were conducted with 53 participants on 11 of the 14 DID3 projects.

The study explores how Data Management Plan requirements affect work practices in public‐funded DH, how scholars grapple with key data management challenges, and how they plan to reuse and share their data. It concludes with three recommendations and three directions for future research.

DOI : https://doi.org/10.1002/asi.24213

Data Management Planning: How Requirements and Solutions are Beginning to Converge

Authors : Sarah Jones, Robert Pergl, Rob Hooft, Tomasz Miksa, Robert Samors, Judit Ungvari, Rowena I. Davis, Tina Lee

Effective stewardship of data is a critical precursor to making data FAIR. The goal of this paper is to provide an overview of the current state of the art in data management and data stewardship planning (DMP) solutions.

We begin by arguing why data management is an important vehicle for supporting adoption and implementation of the FAIR principles. We then describe the background, context and historical development of DMP, as well as its major driving forces, namely research initiatives and funders. Next, we provide an overview of the current leading DMP tools in the form of a table presenting their key characteristics.

We go on to elaborate on emerging common standards for DMPs, especially the topic of machine-actionable DMPs. Because a sound DMP is not only a precursor of FAIR data stewardship but also an integral part of it, we discuss its positioning in the emerging FAIR tools ecosystem. Capacity building and training activities are an important ingredient in the whole effort.

Although it is not the primary goal of this paper, we also touch on the topic of research workforce support, since tools can only be as effective as their users are competent to use them properly.

We conclude by discussing the relations between DMP and the FAIR principles, as the two are connected in more ways than DMP simply being a precursor.

URL : Data Management Planning: How Requirements and Solutions are Beginning to Converge