Business models for sustainable research data repositories

Author : OECD

There is a large variety of repositories that are responsible for providing long term access to data that is used for research. As data volumes and the demands for more open access to this data increase, these repositories are coming under increasing financial pressures that can undermine their long-term sustainability.

This report explores the income streams, costs, value propositions, and business models for 48 research data repositories. It includes a set of recommendations designed to provide a framework for developing sustainable business models and to assist policy makers and funders in supporting repositories with a balance of policy regulation and incentives.

DOI : http://dx.doi.org/10.1787/302b12bb-en

Recommended versus Certified Repositories: Mind the Gap

Authors : Sean Edward Husen, Zoë G. de Wilde, Anita de Waard, Helena Cousijn

Researchers are increasingly required to make research data publicly available in data repositories. Although several organisations propose criteria to recommend and evaluate the quality of data repositories, there is no consensus on what constitutes a good data repository.

In this paper, we investigate, first, which data repositories are recommended by various stakeholders (publishers, funders, and community organizations) and second, which repositories are certified by a number of organisations.

We then compare these two lists of repositories, and the criteria for recommendation and certification. We find that criteria used by organisations recommending and certifying repositories are similar, although the certification criteria are generally more detailed.

We distil the lists of criteria into seven main categories: “Mission”, “Community/Recognition”, “Legal and Contractual Compliance”, “Access/Accessibility”, “Technical Structure/Interface”, “Retrievability” and “Preservation”.

Although the criteria are similar, the lists of repositories that are recommended by the various agencies are very different. Out of all of the recommended repositories, fewer than 6% obtained certification.

As certification is becoming more important, steps should be taken to decrease this gap between recommended and certified repositories, and ensure that certification standards become applicable, and applied, to the repositories which researchers are currently using.

DOI : https://doi.org/10.5334/dsj-2017-042

Social Science Data Repositories in Data Deluge: A Case Study at ICPSR Workflow and Practices

Authors : Wei Jeng, Daqing He, Yu Chi

Design/methodology/approach

We conducted two focus group sessions and one individual interview with eight employees at the world’s largest social science data repository, the Inter-university Consortium for Political and Social Research (ICPSR).

By examining their current actions (activities regarding their work responsibilities) and IT practices, we studied the barriers and challenges of archiving and curating qualitative data at ICPSR.

Purpose

Due to the recent surge of interest in the age of the data deluge, the importance of researching data infrastructures is increasing. The Open Archival Information System (OAIS) model has been widely adopted as a framework for creating and maintaining digital repositories.

Considering that OAIS is a reference model that requires customization for actual practice, this study examines how the current practices in a data repository map to the OAIS environment and functional components.

Findings

We observed that the OAIS model is robust and reliable in actual service processes for data curation and data archives. In addition, a data repository’s workflow resembles that of digital archives or even digital libraries.

On the other hand, we find that 1) the cost of preventing disclosure risk and 2) a lack of agreement on standards for text data files are the most apparent obstacles for data curation professionals handling qualitative data, and that 3) the maturation of data metrics seems to be a promising solution to several challenges in social science data sharing.

Original value

We evaluated the gap between a research data repository’s current practices and the adoption of the OAIS model. We also identified answers to questions such as how current technological infrastructure in a leading data repository such as ICPSR supports their daily operations, what the ideal technologies in those data repositories would be, and the associated challenges that accompany these ideal technologies.

Most importantly, we helped to prioritize challenges and barriers from the data curator’s perspective, and contribute implications of data sharing and reuse in social sciences.

URL : http://d-scholarship.pitt.edu/31876/

Strengthening institutional data management and promoting data sharing in the social and economic sciences

Authors : Monika Linne, Wolfgang Zenk-Möltgen

In the German social and economic sciences there is a growing awareness of flexible data distribution and research data reuse, especially as increasing numbers of research funders recommend publishing research data as the basis for scientific insight.

However, a data-sharing mentality has not yet been established in Germany, which is attributable to researchers’ strong reservations about publishing their data.

This attitude is exacerbated by the fact that, at present, there is no trusted national data sharing repository that covers the particular requirements of institutions regarding research data.

This article discusses how this objective can be achieved with the project initiative SowiDataNet.

The development of a community-driven data repository is a logically consistent and important step towards an attitude shift concerning data sharing in the social and economic sciences.

DOI : http://doi.org/10.18352/lq.10195

The Landscape of Research Data Repositories in 2015: A re3data Analysis

Authors : Maxi Kindling, Heinz Pampel, Stephanie van de Sandt, Jessika Rücknagel, Paul Vierkant, Gabriele Kloska, Michael Witt, Peter Schirmbacher, Roland Bertelmann, Frank Scholze

This article provides a comprehensive descriptive and statistical analysis of metadata information on 1,381 research data repositories worldwide and across all research disciplines.

The analyzed metadata is derived from the re3data database, which enables search and browse functionality for the global registry of research data repositories. The analysis focuses mainly on institutions that operate research data repositories, types and subjects of research data repositories (RDR), access conditions, and services provided by the research data repositories.

RDR differ in terms of the service levels they offer, the languages they support, and the standards they comply with. These differences are commonly summarized by describing the RDR landscape as heterogeneous.

As expected, we found a heterogeneous RDR landscape that is mostly influenced by the repositories’ disciplinary background for which they offer services.

URL : http://www.dlib.org/dlib/march17/kindling/03kindling.html

Discovery and Reuse of Open Datasets: An Exploratory Study

Authors : Sara Mannheimer, Leila Belle Sterman, Susan Borda

Objective

This article analyzes twenty cited or downloaded datasets and the repositories that house them, in order to produce insights that can be used by academic libraries to encourage discovery and reuse of research data in institutional repositories.

Methods

Using Thomson Reuters’ Data Citation Index and repository download statistics, we identified twenty cited/downloaded datasets. We documented the characteristics of the cited/downloaded datasets and their corresponding repositories in a self-designed rubric.

The rubric includes six major categories: basic information; funding agency and journal information; linking and sharing; factors to encourage reuse; repository characteristics; and data description.

Results

Our small-scale study suggests that cited/downloaded datasets generally comply with basic recommendations for facilitating reuse: data are documented well; formatted for use with a variety of software; and shared in established, open access repositories.

Three significant factors also appear to contribute to dataset discovery: publishing in discipline-specific repositories; indexing in more than one location on the web; and using persistent identifiers.

The cited/downloaded datasets in our analysis came from a few specific disciplines, and tended to be funded by agencies with data publication mandates.

Conclusions

The results of this exploratory research provide insights that can inform academic librarians as they work to encourage discovery and reuse of institutional datasets.

Our analysis also suggests areas in which academic librarians can target open data advocacy in their communities in order to begin to build open data success stories that will fuel future advocacy efforts.

DOI : http://dx.doi.org/10.7191/jeslib.2016.1091