Data Discovery Paradigms: User Requirements and Recommendations for Data Repositories

Authors: Mingfang Wu, Fotis Psomopoulos, Siri Jodha Khalsa, Anita de Waard

As data repositories make more data openly available, it becomes challenging for researchers to find what they need, whether within a repository or through web search engines.

This study attempts to investigate data users’ requirements and the role that data repositories can play in supporting data discoverability by meeting those requirements.

We collected 79 data discovery use cases (or data search scenarios), from which we derived nine functional requirements for data repositories through qualitative analysis.

We then applied usability heuristic evaluation and expert review methods to identify best practices that data repositories can implement to meet each functional requirement.

We propose the following ten recommendations for data repository operators to consider for improving data discoverability and users’ data search experience:

1. Provide a range of query interfaces to accommodate various data search behaviours.

2. Provide multiple access points to find data.

3. Make it easier for researchers to judge relevance, accessibility and reusability of a data collection from a search summary.

4. Make individual metadata records readable and analysable.

5. Enable sharing and downloading of bibliographic references.

6. Expose data usage statistics.

7. Strive for consistency with other repositories.

8. Identify and aggregate metadata records that describe the same data object.

9. Make metadata records easily indexed and searchable by major web search engines.

10. Follow API search standards and community adopted vocabularies for interoperability.
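As an illustration of recommendation 9, one common way to make a metadata record indexable by web search engines is to embed a machine-readable description in the dataset's landing page. The sketch below assumes the schema.org `Dataset` vocabulary serialized as JSON-LD, which the abstract does not name. The record fields (`title`, `description`, `doi`, `keywords`, `license`) and the function names are hypothetical and not taken from the paper.

```python
import json


def dataset_jsonld(record: dict) -> str:
    """Serialize a repository metadata record as schema.org Dataset JSON-LD.

    The input field names are hypothetical; a real repository would map
    its own metadata schema onto these properties.
    """
    doc = {
        "@context": "https://schema.org",
        "@type": "Dataset",
        "name": record["title"],
        "description": record["description"],
        "identifier": record["doi"],
        "keywords": record.get("keywords", []),
        "license": record.get("license"),
    }
    # Drop optional properties that were not supplied.
    doc = {key: value for key, value in doc.items() if value not in (None, [], "")}
    return json.dumps(doc, indent=2, ensure_ascii=False)


def landing_page_snippet(record: dict) -> str:
    """Wrap the JSON-LD in a script element for a dataset landing page."""
    # Escape "</" so text inside the record cannot close the script element;
    # "\/" is a valid JSON escape for "/".
    body = dataset_jsonld(record).replace("</", "<\\/")
    return '<script type="application/ld+json">\n' + body + "\n</script>"


if __name__ == "__main__":
    example = {
        "title": "Example sea surface temperature series",
        "description": "Placeholder record used only to illustrate the markup.",
        "doi": "https://doi.org/10.0000/example",
        "keywords": ["oceanography", "temperature"],
    }
    print(landing_page_snippet(example))
```

Using a shared, community-adopted vocabulary for this markup is also in the spirit of recommendation 10, since harvesters and other repositories can read the same description.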


Incorporating Software Curation into Research Data Management Services

Author : Fernando Rios

Many large research universities provide research data management (RDM) support services for researchers. These may include support for data management planning, best practices (e.g., organization, support, and storage), archiving, sharing, and publication.

However, these data-focused services may under-emphasize the importance of the software that is created to analyse said data. This is problematic for several reasons.

First, because software is an integral part of research across all disciplines, under-emphasizing it undermines the ability of that research to be understood, verified, and reused by others (and perhaps even by the researchers themselves).

Second, it may result in less visibility and credit for those involved in creating the software.

A third reason is related to stewardship: if there is no clear process for how, when, and where the software associated with research can be accessed and who will be responsible for maintaining such access, important details of the research may be lost over time.

This article presents the process by which the RDM services unit of a large research university addressed the lack of emphasis on software and source code in their existing service offerings.

The greatest challenges were related to the need to incorporate software into existing data-oriented service workflows while minimizing additional resources required, and the nascent state of software curation and archiving in a data management context.

The problem was addressed from four directions: building an understanding of software curation and preservation from various viewpoints (e.g., video games, software engineering), building a conceptual model of software preservation to guide service decisions, implementing software-related services, and documenting and evaluating the work to build expertise and establish a standard service level.


Research data management in the French National Research Center (CNRS)

Authors : Joachim Schöpfel, Coline Ferrant, Francis Andre, Renaud Fabre


The purpose of this paper is to present empirical evidence on the opinion and behaviour of French scientists (senior management level) regarding research data management (RDM).


The results are part of a nationwide survey on scientific information and documentation with 432 directors of French public research laboratories conducted by the French Research Center CNRS in 2014.


The paper presents empirical results about data production (types), management (human resources, IT, funding, and standards), data sharing and related needs, and highlights significant disciplinary differences.

Also, it appears that RDM and data sharing are not directly correlated with the commitment to open access. Regarding the FAIR data principles, the paper reveals that 68 per cent of all laboratory directors affirm that their data production and management is compliant with at least one of the FAIR principles.

But only 26 per cent are compliant with at least three principles, and less than 7 per cent are compliant with all four FAIR criteria. Laboratories in nuclear physics, SSH, and earth sciences and astronomy are ahead of other disciplines, especially concerning the findability and the availability of their data output.

The paper concludes with comments about research data service development and recommendations for an institutional RDM policy.


For the first time, a nationwide survey was conducted with the senior research management level from all scientific disciplines. Surveys on RDM usually assess individual data behaviours, skills and needs. This survey is different insofar as it addresses institutional and collective data practice.

The respondents did not report on their own data behaviours and attitudes but were asked to provide information about their laboratory. The response rate was high (>30 per cent), and the results provide good insight into the real support and uptake of RDM by senior research managers who provide both models (examples for good practice) and opinion leadership.


Integrating Data Science Tools into a Graduate Level Data Management Course

Authors: Pete E. Pascuzzi, Megan R. Sapp Nelson


This paper describes a project to revise an existing research data management (RDM) course to include instruction in computer skills with robust data science tools.


Setting: A Carnegie R1 university.

Brief Description

Graduate student researchers need training in the basic concepts of RDM. However, they generally lack experience with robust data science tools to implement these concepts holistically. Two library instructors fundamentally redesigned an existing RDM course to include instruction with such tools.

The course was divided into lecture and lab sections to accommodate the increased instructional burden. Learning objectives and assessments were designed at a higher order to allow students to demonstrate that they not only understood course concepts but could use their computer skills to implement these concepts.


Twelve students completed the first iteration of the course. Feedback from these students was very positive, and they appreciated the combination of theoretical concepts, computer skills and hands-on activities. Based on student feedback, future iterations of the course will include more “flipped” content including video lectures and interactive computer tutorials to maximize active learning time in both lecture and lab.

The substance of this article is based upon poster presentations at RDAP Summit 2018.


Toward a Better Data Management Plan: The Impact of DMPs on Grant Funded Research Practices

Author : Sara Mannheimer

Data Management Plans (DMPs) are often required for grant applications. But do strong DMPs lead to better data management and sharing practices? Several recent research projects in the Library and Information Science field have investigated data management planning and practice through DMP content analysis and data-management-related interviews.

However, research hasn’t yet shown how DMPs ultimately affect data management and data sharing practices during grant-funded research. The research described in this article contributes to the existing literature by examining the impact of DMPs on grant awards and on Principal Investigators’ (PIs) data management and sharing practices.

The results of this research suggest the following key takeaways:

(1) Most PIs practice internal data management in order to prevent data loss, to facilitate sharing within the research team, and to seamlessly continue their research during personnel turnover;

(2) PIs still have room to grow in understanding specialized concepts such as metadata and policies for use and reuse;

(3) PIs may need guidance on practices that facilitate FAIR data, such as using metadata standards, assigning licenses to their data, and publishing in data repositories.

Ultimately, the results of this research can inform academic library services and support stronger, more actionable DMPs. The substance of this article is based upon a lightning talk presentation at RDAP Summit 2018.


Les enjeux de l’interopérabilité dans la diffusion et la valorisation des données archéologiques [The challenges of interoperability in the dissemination and promotion of archaeological data]

Author : Pauline Vignaud

As a historical and scientific discipline, archaeology has seen its practices evolve since the arrival of digital technology. Since then, archaeologists have faced several issues, particularly in how they disseminate and promote their data.

In this context, questions about interoperability have emerged, notably concerning the tools (platforms, applications, projects) to be developed and put in place to enable the sharing and promotion of archaeological data.

This thesis explores all the areas (datasets, reuse, etc.) in which interoperability plays a part in this scientific environment, either as an enabling factor or as an obstacle to dissemination and promotion.


From Open Access to Open Data: collaborative work in the university libraries of Catalonia

Authors: Mireia Alcalá Ponce de León, Lluís Anglada i de Ferrer

In recent years, the scientific community and funding bodies have paid attention to the data collected, generated or used across different research activities. The dissemination of these data has become one of the constituent elements of Open Science.

For this reason, many funders are requiring or promoting the development of Data Management Plans, and depositing open data following the FAIR principles (Findable, Accessible, Interoperable and Reusable).

Libraries and research offices of Catalan universities, which work in coordination within the Open Science Area of CSUC, offer support services for research data management. The different works carried out at the Consortium level will be presented, as well as the implementation of the service in each university.
