Data Sustainability and Reuse Pathways of Natural Resources and Environmental Scientists

Author : Yi Shen

This paper presents a multifarious examination of natural resources and environmental scientists’ adventures navigating the policy change towards open access and cultural shift in data management, sharing, and reuse.

Set in the institutional context of Virginia Tech, the study used a focus group and multiple individual interviews to explore the domain scientists’ all-around experiences, performances, and perspectives on their collection, adoption, integration, preservation, and management of data.

The results reveal the scientists’ struggles, concerns, and barriers encountered, as well as their shared values, beliefs, passions, and aspirations when working with data. Based on these findings, this study provides suggestions on data modeling and knowledge representation strategies to support the long-term viability, stewardship, accessibility, and sustainability of scientific data.

It also discusses the art of curation as creative scholarship and new opportunities for data librarians and information professionals to mobilize the data revolution.

URL : https://arxiv.org/abs/1803.01788

Data sharing and reanalysis of randomized controlled trials in leading biomedical journals with a full data sharing policy: survey of studies published in The BMJ and PLOS Medicine

Authors : Florian Naudet, Charlotte Sakarovitch, Perrine Janiaud, Ioana Cristea, Daniele Fanelli, David Moher, John P A Ioannidis

Objectives

To explore the effectiveness of data sharing by randomized controlled trials (RCTs) in journals with a full data sharing policy and to describe potential difficulties encountered in the process of performing reanalyses of the primary outcomes.

Design

Survey of published RCTs.

Setting

PubMed/Medline.

Eligibility criteria

RCTs that had been submitted and published by The BMJ and PLOS Medicine subsequent to the adoption of data sharing policies by these journals.

Main outcome measure

The primary outcome was data availability, defined as the eventual receipt of complete data with clear labelling. Primary outcomes were reanalyzed to assess to what extent studies were reproduced. Difficulties encountered were described.

Results

37 RCTs (21 from The BMJ and 16 from PLOS Medicine) published between 2013 and 2016 met the eligibility criteria. 17/37 (46%, 95% confidence interval 30% to 62%) satisfied the definition of data availability and 14 of the 17 (82%, 59% to 94%) were fully reproduced on all their primary outcomes. Of the remaining three RCTs, errors were identified in two, although the reanalyses reached similar conclusions, and one paper did not provide enough information in the Methods section to reproduce the analyses. Difficulties identified included problems in contacting corresponding authors and the authors’ lack of resources for preparing the datasets. In addition, there was a range of different data sharing practices across study groups.

Conclusions

Data availability was not optimal in two journals with a strong policy for data sharing. When investigators shared data, most reanalyses largely reproduced the original results. Data sharing practices need to become more widespread and streamlined to allow meaningful reanalyses and reuse of data.
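
As a rough check on the proportions reported in the Results above, the sketch below recomputes the two percentages and a 95% interval for each from the stated counts (17 of 37, and 14 of 17). The abstract does not say which interval method the authors used; the Wilson score interval and the helper name `wilson_interval` are assumptions made here for illustration, so the bounds may differ from the published ones by a rounding step.

```python
import math


def wilson_interval(successes, n, z=1.959964):
    """Wilson score interval for a binomial proportion.

    Assumption: the abstract does not name its interval method; Wilson is
    used here only as a common choice for small samples.
    z defaults to the two-sided 95% normal quantile.
    """
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


# Counts taken from the Results paragraph of the abstract.
for label, k, n in [("data available", 17, 37), ("fully reproduced", 14, 17)]:
    low, high = wilson_interval(k, n)
    print(f"{label}: {k}/{n} = {k / n:.0%}, 95% interval {low:.0%} to {high:.0%}")
```

With only 17 datasets available for reanalysis, the interval around the reproduction rate is wide, which is worth keeping in mind when reading the 82% figure.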

 

Scaling Research Data Management Services Along the Maturity Spectrum: Three Institutional Perspectives

Authors : Cinthya Ippoliti, Amy Koshoffer, Renaine Julian, Micah Vandegrift, Devin Soper, Sophie Meridien

Research data services promise to advance many academic libraries’ strategic goals of becoming partners in the research process and integrating library services with modern research workflows. Academic librarians are well positioned to make an impact in this space due to their expertise in managing, curating, and preserving digital information, and a history of engaging with scholarly communications writ large.

Some academic libraries have quickly developed infrastructure and support for every activity ranging from data storage and curation to project management and collaboration, while others are just beginning to think about addressing the data needs of their researchers.

Regardless of which end of the spectrum they identify with, libraries are still seeking to understand the research landscape and define their role in the process.

This article seeks to blend a general perspective on these issues with actual case studies derived from three institutions (University of Cincinnati, Oklahoma State University, and Florida State University), all of which are at different levels of implementation, maturity, and campus involvement.

DOI : https://dx.doi.org/10.17605/OSF.IO/WZ8FN

 

Open Data Protection : Study on legal barriers to open data sharing – Data Protection and PSI

Authors : Andreas Wiebe, Nils Dietrich

This study analyses legal barriers to data sharing in the context of the Open Research Data Pilot, which the European Commission is running within its research framework programme Horizon 2020.

In the first part of the study, data protection issues are analysed. After a brief overview of the international basis for data protection, the European legal framework is described in detail.

The main focus is thus on the Data Protection Directive (95/46/EC), which has been in force since 1995. Not only is the Data Protection Directive itself described, but also its implementation in selected EU Member States.

Additionally, the upcoming General Data Protection Regulation (2016/679/EU) and relevant changes are described. Special focus is placed on leading data protection principles. Next, the study describes the use of research data in the Open Research Data Pilot and how data protection principles influence such use.

The experiences of the European Commission in running the Open Research Data Pilot so far, as well as basic examples of repository use forms, are considered. The second part of the study analyses the extent to which legislation on public sector information (PSI) influences access to and re-use of research data.

The Public Sector Information Directive (2003/98/EC) and the impact of its revision in 2013 (2013/37/EU) are described. There is a special focus on the application of PSI legislation to public libraries, including university and research libraries, and its practical implications.

In the final part of the study the results are critically evaluated and core recommendations are made to improve the legal situation in relation to research data.

URL : Open Data Protection : Study on legal barriers to open data sharing – Data Protection and PSI

Recommendations to Improve Downloads of Large Earth Observation Data

Authors : Rahul Ramachandran, Christopher Lynnes, Kathleen Baynes, Kevin Murphy, Jamie Baker, Jamie Kinney, Ariel Gold, Jed Sundwall, Mark Korver, Allison Lieber, William Vambenepe, Matthew Hancher, Rebecca Moore, Tyler Erickson, Josh Henretig, Brant Zwiefel, Heather Patrick-Ahlstrom, Matthew J. Smith

With the volume of Earth observation data expanding rapidly, cloud computing is quickly changing the way these data are processed, analyzed, and visualized. Collocating freely available Earth observation data on a cloud computing infrastructure may create opportunities unforeseen by the original data provider for innovation and value-added data re-use, but existing systems at data centers are not designed for supporting requests for large data transfers.

Without a common methodology, each data center has to handle such requests from different cloud vendors differently. Guidelines are needed so that all cloud vendors can use a common methodology for bulk-downloading data from data centers, sparing the providers from building custom capabilities to meet the needs of individual vendors.

This paper presents recommendations that are distilled from use cases provided by three cloud vendors (Amazon, Google, and Microsoft) and based on the vendors’ interactions with data systems at different Federal agencies and organizations.

These specific recommendations range from obvious steps for improving data usability (such as ensuring the use of standard data formats and commonly supported projections) to non-obvious undertakings important for enabling bulk data downloads at scale.

These recommendations can be used to evaluate and improve existing data systems for high-volume data transfers, and their adoption can lead to cloud vendors utilizing a common methodology.

DOI : http://doi.org/10.5334/dsj-2018-002

 

Understanding Data Retrieval Practices: A Social Informatics Perspective

Authors : Kathleen Gregory, Helena Cousijn, Paul Groth, Andrea Scharnhorst, Sally Wyatt

Open research data are heralded as having the potential to increase effectiveness, productivity, and reproducibility in science, but little is known about the actual practices involved in data search and retrieval.

The socio-technical problem of locating data for (re)use is often reduced to the technological dimension of designing data search systems. In this article, we explore how a social informatics perspective can help to better analyze the current academic discourse about data retrieval as well as to study user practices and behaviors.

We employ two methods in our analysis – bibliometrics and interviews with data seekers – and conclude with a discussion of the implications of our findings for designing data discovery systems.

URL : https://arxiv.org/abs/1801.04971