Research data repositories: measuring the impact of Open Science through the consultation of deposited datasets

Author : Violaine Rebouillat

The 2000s and 2010s saw the development of a growing number of research e-infrastructures, making it easier to share and access scientific data. This trend was reinforced by the rise of open data policies, which led to a proliferation of data stores, also called “data repositories”. Quantifying and qualifying the use of data made public is essential for assessing the impact of open data policies.

In this article, we examine the use of data deposited in repositories. To what extent are these data viewed and downloaded?

The article presents the first results of a quantitative survey of 20 repositories. It outlines two trends, which at this stage remain specific to the sample studied: (1) an overall increase in the number of views, downloads, and datasets available in the repositories over the period studied (2015-2020), and (2) the concentration of downloads on a relatively small proportion of the repository’s data (on the order of 10% to 30%).

URL : https://hal.archives-ouvertes.fr/hal-02928817/

Data librarians and services for researchers in university libraries: emerging new forms of mediation

Author : Florence Thiault

Services for researchers are expanding in French university libraries. The growing quantity of research data produced and reused by researchers poses major challenges for university libraries.

New skills, associated with a specific professional profile, that of the data librarian, are needed to carry out these research support missions. This data specialist is intended to support researchers throughout the research life cycle by collaborating actively with a range of internal and external stakeholders.

This paper presents three emblematic case studies of mediation aimed at researchers: analysis of scientific output, support for research and publication, and research data management.

URL : https://hal.archives-ouvertes.fr/hal-02972705

Do researchers use open research data? Exploring the relationships between usage trends and metadata quality across scientific disciplines from the Figshare case

Authors : Alfonso Quarati, Juliana E Raffaghelli

Open research data (ORD) have been considered a driver of scientific transparency. However, data friction, the phenomenon of data being underused for various reasons, has also been pointed out.

A factor often called into question for the low usage of ORD is the quality of the ORD and associated metadata. This work aims to illustrate the use of ORD published by the Figshare scientific repository, in relation to their scientific discipline and their type, and compared with the quality of their metadata.

Considering all the Figshare resources and carrying out a programmatic quality assessment of their metadata, our analysis highlighted two aspects. First, irrespective of the scientific domain considered, most ORD are under-used, with a few exceptional cases attracting most of researchers’ attention.

Second, there was no evidence that the use of ORD is associated with good metadata publishing practices. These two findings prompt reflection on the potential causes of such data friction.

DOI : https://doi.org/10.1177/0165551520961048

Who Does What? – Research Data Management at ETH Zurich

Authors: Matthias Töwe, Caterina Barillari

We present the approach to Research Data Management (RDM) support for researchers taken at ETH Zurich. Overall requirements are governed by institutional guidelines for Research Integrity, funders’ regulations, and legal obligations. The ETH approach is based on the distinction of three phases along the research data life-cycle: 1. Data Management Planning; 2. Active RDM; 3. Data Publication and Preservation. Two ETH units, namely the Scientific IT Services and the ETH Library, provide support for different aspects of these phases, building on their respective competencies. They jointly offer training, consulting, information, and materials for the first phase.

The second phase deals with data which is in current use in active research projects. Scientific IT Services provide their own platform, openBIS, for keeping track of raw, processed and analysed data, in addition to organising samples, materials, and scientific procedures.

ETH Library operates solutions for the third phase within the infrastructure of ETH Zurich’s central IT Services. The Research Collection is the institutional repository for research output including Research Data, Open Access publications, and ETH Zurich’s bibliography.

DOI : http://doi.org/10.5334/dsj-2020-036

Towards FAIR protocols and workflows: the OpenPREDICT use case

Authors : Remzi Celebi, Joao Rebelo Moreira, Ahmed A. Hassan, Sandeep Ayyar, Lars Ridder, Tobias Kuhn, Michel Dumontier

It is essential for the advancement of science that researchers share, reuse and reproduce each other’s workflows and protocols. The FAIR principles are a set of guidelines that aim to maximize the value and usefulness of research data, and emphasize the importance of making digital objects findable and reusable by others.

The question of how to apply these principles not just to data but also to the workflows and protocols that consume and produce them is still under debate and poses a number of challenges. In this paper we describe a two-fold approach of simultaneously applying the FAIR principles to scientific workflows as well as the involved data.

We apply and evaluate our approach on the case of the PREDICT workflow, a highly cited drug repurposing workflow. This includes FAIRification of the involved datasets, as well as applying semantic technologies to represent and store data about the detailed versions of the general protocol, of the concrete workflow instructions, and of their execution traces.

We propose a semantic model to address these specific requirements, which was evaluated by answering competency questions. This semantic model consists of classes and relations from a number of existing ontologies, including Workflow4ever, PROV, EDAM, and BPMN.

This allowed us then to formulate and answer new kinds of competency questions. Our evaluation shows the high degree to which our FAIRified OpenPREDICT workflow now adheres to the FAIR principles and the practicality and usefulness of being able to answer our new competency questions.

DOI : https://doi.org/10.7717/peerj-cs.281

What drives and inhibits researchers to share and use open research data? A systematic literature review to analyze factors influencing open research data adoption

Authors : Anneke Zuiderwijk, Rhythima Shinde, Wei Jeng

Both sharing and using open research data have revolutionary potential to advance science. Although previous research gives insight into researchers’ drivers and inhibitors for sharing and using open research data, these drivers and inhibitors have not yet been integrated via a thematic analysis, and a theoretical argument is lacking.

This study’s purpose is to systematically review the literature on individual researchers’ drivers and inhibitors for sharing and using open research data. This study systematically analyzed 32 open data studies (published between 2004 and 2019 inclusively) and elicited drivers and inhibitors for both open research data sharing and use, in eleven categories in total: ‘the researcher’s background’, ‘requirements and formal obligations’, ‘personal drivers and intrinsic motivations’, ‘facilitating conditions’, ‘trust’, ‘expected performance’, ‘social influence and affiliation’, ‘effort’, ‘the researcher’s experience and skills’, ‘legislation and regulation’, and ‘data characteristics’.

This study extensively discusses these categories and argues how such categories and factors are connected, using a thematic analysis. It also discusses several opportunities for applying, extending, using, and testing theories in open research data studies.

With such discussions, the overview of identified categories and factors can be further applied to examine researchers’ drivers and inhibitors in different research disciplines, such as those with low rates of data sharing and use versus those with high rates. Moreover, this study serves as a first vital step towards developing effective incentives for both open data sharing and use behavior.

DOI : https://doi.org/10.1371/journal.pone.0239283

Research data management policy and practice in Chinese university libraries

Authors : Yingshen Huang, Andrew M. Cox, Laura Sbaffi

On April 2, 2018, the State Council of China formally released a national Research Data Management (RDM) policy “Measures for Managing Scientific Data”. In this context and given that university libraries have played an important role in supporting RDM at an institutional level in North America, Europe, and Australasia, the aim of this article is to explore the current status of RDM in Chinese universities, in particular how university libraries have been involved in taking the agenda forward.

This article uses a mixed‐methods data collection approach and draws on a website analysis of university policies and services; a questionnaire for university librarians; and semi‐structured interviews. Findings indicate that Research Data Services at a local level in Chinese universities are in their infancy.

There is more evidence of activity in developing data repositories than support services. There is little development of local policy. Among the explanations of this may be the existence of a national‐level infrastructure for some subject disciplines, the lack of professionalization of librarianship, and the relatively weak resonance of openness as an idea in the Chinese context.

DOI : https://doi.org/10.1002/asi.24413