Authors : Corinna Gries, Amber Budden, Christine Laney, Margaret O’Brien, Mark Servilla, Wade Sheldon, Kristin Vanderbilt, David Vieglais
Environmental research data repositories provide much needed services for data preservation and data dissemination to diverse communities with domain specific or programmatic data needs and standards.
Due to independent development these repositories serve their communities well, but were developed with different technologies, data models and using different ontologies. Hence, the effectiveness and efficiency of these services can be vastly improved if repositories work together adhering to a shared community platform that focuses on the implementation of agreed upon standards and best practices for curation and dissemination of data.
Such a community platform drives forward the convergence of technologies and practices that will advance cross-domain interoperability. It will also facilitate contributions from investigators through standardized and streamlined workflows and provide increased visibility for the role of data managers and the curation services provided by data repositories, beyond preservation infrastructure.
Ten specific suggestions for such standardizations are outlined without any suggestions for priority or technical implementation. Although the recommendations are for repositories to implement, they have been chosen specifically with the data provider/data curator and synthesis scientist in mind.
URL : Facilitating and Improving Environmental Research Data Repository Interoperability
DOI : http://doi.org/10.5334/dsj-2018-022
Authors : Amy M Pienta, Dharma Akmon, Justin Noble, Lynette Hoelter, Susan Jekielek
Social scientists are producing an ever-expanding volume of data, leading to questions about appraisal and selection of content given finite resources to process data for reuse. We analyze users’ search activity in an established social science data repository to better understand demand for data and more effectively guide collection development.
By applying a data-driven approach, we aim to ensure curation resources are applied to make the most valuable data findable, understandable, accessible, and usable. We analyze data from a domain repository for the social sciences that includes over 500,000 annual searches in 2014 and 2015 to better understand trends in user search behavior.
Using a newly created search-to-study ratio technique, we identified gaps in the domain data repository’s holdings and leveraged this analysis to inform our collection and curation practices and policies.
The evaluative technique we propose in this paper will serve as a baseline for future studies looking at trends in user demand over time at the domain data repository being studied with broader implications for other data repositories.
URL : A Data-Driven Approach to Appraisal and Selection at a Domain Data Repository
DOI : https://doi.org/10.2218/ijdc.v12i2.500
Authors : Dylanne Dearborn, Steve Marks, Leanne Trimble
The purpose of this study was to examine changes in research data deposit policies of highly ranked journals in the physical and applied sciences between 2014 and 2016, as well as to develop an approach to examining the institutional impact of deposit requirements.
Policies from the top ten journals (ranked by impact factor from the Journal Citation Reports) were examined in 2014 and again in 2016 in order to determine if data deposits were required or recommended, and which methods of deposit were listed as options.
For all 2016 journals with a required data deposit policy, publication information (2009-2015) for the University of Toronto was pulled from Scopus and departmental affiliation was determined for each article.
The results showed that the number of high-impact journals in the physical and applied sciences requiring data deposit is growing. In 2014, 71.2% of journals had no policy, 14.7% had a recommended policy, and 13.9% had a required policy (n=836).
In contrast, in 2016, there were 58.5% with no policy, 19.4% with a recommended policy, and 22.0% with a required policy (n=880). It was also evident that U of T chemistry researchers are by far the most heavily affected by these journal data deposit requirements, having published 543 publications, representing 32.7% of all publications in the titles requiring data deposit in 2016.
The Python scripts used to retrieve institutional publications based on a list of ISSNs have been released on GitHub so that other institutions can conduct similar research.
URL : The Changing Influence of Journal Data Sharing Policies on Local RDM Practices
DOI : https://doi.org/10.2218/ijdc.v12i2.583
Authors : Amir Aryani, Marta Poblet, Kathryn Unsworth, Jingbo Wang, Ben Evans, Anusuriya Devaraju, Brigitte Hausstein, Claus-Peter Klas, Benjamin Zapilko, Samuele Kaplun
This paper describes the open access graph dataset that shows the connections between Dryad, CERN, ANDS and other international data repositories to publications and grants across multiple research data infrastructures.
The graph dataset was created using the Research Graph data model and the Research Data Switchboard (RD-Switchboard), a collaborative project by the Research Data Alliance DDRI Working Group (DDRI WG) with the aim to discover and connect the related research datasets based on publication co-authorship or jointly funded grants.
The graph dataset allows researchers to trace and follow the paths to understanding a body of work. By mapping the links between research datasets and related resources, the graph dataset improves both their discovery and visibility, while avoiding duplicate efforts in data creation.
Ultimately, the linked datasets may spur novel ideas, facilitate reproducibility and re-use in new applications, stimulate combinatorial creativity, and foster collaborations across institutions.
URL : A Research Graph dataset for connecting research data repositories using RD-Switchboard
Alternative location : https://www.nature.com/articles/sdata201899
Authors : Simone Ivan Conte, Federica Fina, Michalis Psalios, Shyam Ryal, Tomas Lebl, Anna Clements
Research funders have introduced requirements that expect researchers to properly manage and publicly share their research data, and expect institutions to put in place services to support researchers in meeting these requirements.
So far the general focus of these services and systems has been on addressing the final stages of the research data lifecycle (archive, share and re-use), rather than stages related to the active phase of the cycle (collect/create and analyse).
As a result, full integration of active data management systems with data repositories is not yet the norm, making the streamlined transition of data from an active to a published and archived status an important challenge.
In this paper we present the integration between an active data management system developed in-house (NOMAD) and Elsevier’s Pure data repository used at our institution, with the aim of offering a simple workflow to facilitate and promote the data deposit process.
The integration results in a new data management and publication workflow that helps researchers to save time, minimize human errors related to manually handling files, and further promote data deposit together with collaboration across the institution.
URL : Integration of an Active Research Data System with a Data Repository to Streamline the Research Data Lifecyle: Pure-NOMAD Case Study
Authors : Martin Fenner, Mercè Crosas, Jeffrey S. Grethe, David Kennedy, Henning Hermjakob, Phillippe Rocca-Serra, Gustavo Durand, Robin Berjon, Sebastian Karcher, Maryann Martone, Tim Clark
This article presents a practical roadmap for scholarly data repositories to implement data citation in accordance with the Joint Declaration of Data Citation Principles, a synopsis and harmonization of the recommendations of major science policy bodies.
The roadmap was developed by the Repositories Expert Group, as part of the Data Citation Implementation Pilot (DCIP) project, an initiative of FORCE11.org and the NIH BioCADDIE (https://biocaddie.org) program.
The roadmap makes 11 specific recommendations, grouped into three phases of implementation: a) required steps needed to support the Joint Declaration of Data Citation Principles, b) recommended steps that facilitate article/data publication workflows, and c) optional steps that further improve data citation support provided by data repositories.
URL : A Data Citation Roadmap for Scholarly Data Repositories
DOI : https://doi.org/10.1101/097196
Author : OECD
There is a large variety of repositories that are responsible for providing long term access to data that is used for research. As data volumes and the demands for more open access to this data increase, these repositories are coming under increasing financial pressures that can undermine their long-term sustainability.
This report explores the income streams, costs, value propositions, and business models for 48 research data repositories. It includes a set of recommendations designed to provide a framework for developing sustainable business models and to assist policy makers and funders in supporting repositories with a balance of policy regulation and incentives.
DOI : http://dx.doi.org/10.1787/302b12bb-en