Searching Data: A Review of Observational Data Retrieval Practices in Selected Disciplines

Authors : Kathleen Gregory, Paul Groth, Helena Cousijn, Andrea Scharnhorst, Sally Wyatt

A cross‐disciplinary examination of the user behaviors involved in seeking and evaluating data is surprisingly absent from the research data discussion. This review explores the data retrieval literature to identify commonalities in how users search for and evaluate observational research data in selected disciplines.

Two analytical frameworks, rooted in information retrieval and science and technology studies, are used to identify key similarities in practices as a first step toward developing a model describing data retrieval.

DOI : https://doi.org/10.1002/asi.24165

Understanding Data Retrieval Practices: A Social Informatics Perspective

Authors : Kathleen Gregory, Helena Cousijn, Paul Groth, Andrea Scharnhorst, Sally Wyatt

Open research data are heralded as having the potential to increase effectiveness, productivity, and reproducibility in science, but little is known about the actual practices involved in data search and retrieval.

The socio-technical problem of locating data for (re)use is often reduced to the technological dimension of designing data search systems. In this article, we explore how a social informatics perspective can help to better analyze the current academic discourse about data retrieval as well as to study user practices and behaviors.

We employ two methods in our analysis – bibliometrics and interviews with data seekers – and conclude with a discussion of the implications of our findings for designing data discovery systems.

URL : https://arxiv.org/abs/1801.04971

Recommended versus Certified Repositories: Mind the Gap

Authors : Sean Edward Husen, Zoë G. de Wilde, Anita de Waard, Helena Cousijn

Researchers are increasingly required to make research data publicly available in data repositories. Although several organisations propose criteria to recommend and evaluate the quality of data repositories, there is no consensus on what constitutes a good data repository.

In this paper, we investigate, first, which data repositories are recommended by various stakeholders (publishers, funders, and community organisations) and, second, which repositories are certified by a number of organisations.

We then compare these two lists of repositories, and the criteria for recommendation and certification. We find that criteria used by organisations recommending and certifying repositories are similar, although the certification criteria are generally more detailed.

We distil the lists of criteria into seven main categories: “Mission”, “Community/Recognition”, “Legal and Contractual Compliance”, “Access/Accessibility”, “Technical Structure/Interface”, “Retrievability” and “Preservation”.

Although the criteria are similar, the lists of repositories that are recommended by the various agencies are very different. Out of all of the recommended repositories, less than 6% obtained certification.

As certification is becoming more important, steps should be taken to decrease this gap between recommended and certified repositories, and ensure that certification standards become applicable, and applied, to the repositories which researchers are currently using.

DOI : https://doi.org/10.5334/dsj-2017-042

Searching Data: A Review of Observational Data Retrieval Practices

Authors : Kathleen Gregory, Paul Groth, Helena Cousijn, Andrea Scharnhorst, Sally Wyatt

A cross-disciplinary examination of the user behaviours involved in seeking and evaluating data is surprisingly absent from the research data discussion. This review explores the data retrieval literature to identify commonalities in how users search for and evaluate observational research data.

Two analytical frameworks, rooted in information retrieval and science and technology studies, are used to identify key similarities in practices as a first step toward developing a model describing data retrieval.

URL : https://arxiv.org/abs/1707.06937