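Entrepôts de données de recherche : mesurer l’impact de l’Open Science à l’aune de la consultation des jeux de données déposés [Research data repositories: measuring the impact of Open Science through the consultation of deposited datasets]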

Author : Violaine Rebouillat

The 2000s and 2010s saw a growing number of research e-infrastructures emerge, making it easier to share and access scientific data. This trend was reinforced by the rise of open data policies, which led to a proliferation of data stores, also known as “data repositories”. Quantifying and characterizing the use of publicly released data is essential for assessing the impact of open data policies.

In this article, we examine how the data deposited in repositories are used. To what extent are these data viewed and downloaded?

The article presents the first results of a quantitative survey of 20 repositories. It outlines two trends, which at this stage remain specific to the sample studied: (1) an overall increase in the number of views, downloads, and datasets available in the repositories over the period studied (2015-2020), and (2) a concentration of downloads on a relatively small proportion of each repository’s data (on the order of 10% to 30%).
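The abstract does not say how the concentration of downloads was computed. As a rough illustration only, the sketch below shows two simple ways such a figure could be derived from per-dataset download counts: the fraction of datasets downloaded at least once, and the fraction of top-ranked datasets needed to cover a given share of all downloads. The function names, the `target_share` parameter, and the sample counts are hypothetical and do not come from the study.

```python
def fraction_downloaded(downloads):
    """Fraction of datasets with at least one download (hypothetical metric)."""
    if not downloads:
        return 0.0
    return sum(1 for d in downloads if d > 0) / len(downloads)


def fraction_covering_share(downloads, target_share=0.5):
    """Smallest fraction of datasets, ranked by downloads, that accounts
    for at least `target_share` of all downloads (hypothetical metric)."""
    total = sum(downloads)
    if total == 0:
        return 0.0
    running = 0
    # Walk the datasets from most to least downloaded.
    for rank, count in enumerate(sorted(downloads, reverse=True), start=1):
        running += count
        if running >= target_share * total:
            return rank / len(downloads)
    return 1.0


# Made-up download counts for ten datasets in one repository.
sample = [0, 0, 0, 0, 0, 0, 0, 50, 30, 20]
print(fraction_downloaded(sample))
print(fraction_covering_share(sample, target_share=0.5))
```

Either measure could be tracked per repository and per year to follow the trend over a period such as 2015-2020, assuming the repository exposes download counts at the dataset level.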

URL : https://hal.archives-ouvertes.fr/hal-02928817/

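Data librarian et services aux chercheurs en bibliothèque universitaire : de nouvelles médiations en émergence [Data librarians and services for researchers in university libraries: new forms of mediation emerging]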

Author : Florence Thiault

Services for researchers are expanding in French university libraries. The growing volume of research data produced and reused by researchers poses significant challenges for these libraries.

Supporting research in this way requires new skills, associated with a specific professional profile: the data librarian. This data specialist is meant to support researchers throughout the research life cycle, working actively with a range of internal and external stakeholders.

This paper presents three emblematic case studies of mediation aimed at researchers: analysis of scientific output, support for research and publication, and research data management.

URL : https://hal.archives-ouvertes.fr/hal-02972705

‘I Updated the <ref>’: The Evolution of References in the English Wikipedia and the Implications for Altmetrics

Authors : Olga Zagovora, Roberto Ulloa, Katrin Weller, Fabian Flöck

With this work, we present a publicly available dataset of the history of all the references (more than 55 million) ever used in the English Wikipedia until June 2019. We have applied a new method for identifying and monitoring references in Wikipedia, so that for each reference we can provide data about associated actions: creation, modifications, deletions, and reinsertions.

The high accuracy of this method and the resulting dataset was confirmed via a comprehensive crowdworker labelling campaign. We use the dataset to study the temporal evolution of Wikipedia references as well as users’ editing behaviour.

We find evidence of a mostly productive and continuous effort to improve the quality of references: (1) there is a persistent increase of reference and document identifiers (DOI, PubMedID, PMC, ISBN, ISSN, ArXiv ID), and (2) most of the reference curation work is done by registered humans (not bots or anonymous editors).

We conclude that the evolution of Wikipedia references, including the dynamics of the community processes that tend to them, should be leveraged in the design of relevance indexes for altmetrics, and our dataset can be pivotal for such effort.

URL : https://arxiv.org/abs/2010.03083

The measurement of “interdisciplinarity” and “synergy” in scientific and extra‐scientific collaborations

Authors : Loet Leydesdorff, Inga Ivanova

Problem solving often requires crossing boundaries, such as those between disciplines. When policy‐makers call for “interdisciplinarity,” however, they often mean “synergy.” Synergy is generated when the whole offers more possibilities than the sum of its parts. An increase in the number of options above the sum of the options in subsets can be measured as redundancy; that is, the number of not‐yet‐realized options.

The number of options available to an innovation system for realization can be as decisive for the system’s survival as the historically already‐realized innovations. Unlike “interdisciplinarity,” “synergy” can also be generated in sectorial or geographical collaborations. The measurement of “synergy,” however, requires a methodology different from the measurement of “interdisciplinarity.”

In this study, we discuss recent advances in the operationalization and measurement of “interdisciplinarity,” and propose a methodology for measuring “synergy” based on information theory.

The sharing of meanings attributed to information from different perspectives can increase redundancy. Increasing redundancy reduces the relative uncertainty, for example, in niches.
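The abstract does not spell out the indicator itself. One information-theoretic quantity often used for this kind of question is mutual information in three dimensions, T(x,y,z) = H(x) + H(y) + H(z) − H(xy) − H(xz) − H(yz) + H(xyz), where a negative value is commonly read as redundancy generated by the interaction of the three dimensions. The sketch below computes it from observed co-occurrences; treating it as the paper's measure of “synergy” is an assumption, and the function names and toy data are hypothetical.

```python
import math
from collections import Counter


def entropy(counts):
    """Shannon entropy in bits of a distribution given as raw counts."""
    counts = list(counts)
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def mutual_information_3d(observations):
    """Three-way mutual information for a list of (x, y, z) tuples.

    Assumption: negative values are interpreted as redundancy (synergy),
    positive values as shared, already-realized information.
    """
    def h(*dims):
        # Entropy of the observations projected onto the chosen dimensions.
        projected = Counter(tuple(obs[d] for d in dims) for obs in observations)
        return entropy(projected.values())

    return (h(0) + h(1) + h(2)
            - h(0, 1) - h(0, 2) - h(1, 2)
            + h(0, 1, 2))


# Toy data: z is the exclusive-or of x and y, so no pair of dimensions
# is informative on its own, but the three together are fully determined.
xor_case = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]
print(mutual_information_3d(xor_case))
```

In an applied setting the three dimensions could be, for example, disciplinary, sectorial, and geographical classifications of the same set of collaborations; that mapping is an illustration, not a description of the authors' data.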

The operationalization of the two concepts—“interdisciplinarity” and “synergy”—as different and partly overlapping indicators allows for distinguishing between the effects and the effectiveness of science‐policy interventions in research priorities.

DOI : https://doi.org/10.1002/asi.24416

Do researchers use open research data? Exploring the relationships between usage trends and metadata quality across scientific disciplines from the Figshare case

Authors : Alfonso Quarati, Juliana E Raffaghelli

Open research data (ORD) have been considered a driver of scientific transparency. However, data friction, the phenomenon of data being underused for a variety of reasons, has also been pointed out.

A factor often blamed for the low usage of ORD is the quality of the data and their associated metadata. This work aims to illustrate the use of ORD published by the Figshare scientific repository, in relation to their scientific discipline and type, and in comparison with the quality of their metadata.

Considering all the Figshare resources and carrying out a programmatic quality assessment of their metadata, our analysis highlighted two aspects. First, irrespective of the scientific domain considered, most ORD are under-used, but with exceptional cases which concentrate most researchers’ attention.

Second, there was no evidence that the use of ORD is associated with good metadata publishing practices. These two findings prompt a reflection on the potential causes of such data friction.

DOI : https://doi.org/10.1177/0165551520961048

Evaluating the impact of open access policies on research institutions

Authors : Chun-Kai (Karl) Huang, Cameron Neylon, Richard Hosking, Lucy Montgomery, Katie S Wilson, Alkim Ozaygen, Chloe Brookes-Kenworthy

The proportion of research outputs published in open access journals or made available on other freely-accessible platforms has increased over the past two decades, driven largely by funder mandates, institutional policies, grass-roots advocacy, and changing attitudes in the research community.

However, the relative effectiveness of these different interventions has remained largely unexplored. Here we present a robust, transparent and updateable method for analysing how these interventions affect the open access performance of individual institutes.

We studied 1,207 institutions from across the world, and found that, in 2017, the top-performing universities published around 80–90% of their research open access.

The analysis also showed that publisher-mediated (gold) open access was popular in Latin American and African universities, whereas the growth of open access in Europe and North America has mostly been driven by repositories.

DOI : https://doi.org/10.7554/eLife.57067

Global electronic thesis and dissertation repositories – collection diversity and management issues

Authors : Fayaz Ahmad Loan, Ufaira Yaseen Shah

This article examines the collection diversity of electronic thesis and dissertation (ETD) repositories based on key parameters such as regional distribution, subject classification and language diversity, and identifies the critical management issues of the ETD repositories related to collection management, software management, content management and metadata policies.

The ETD repositories were identified in the Directory of Open Access Repositories (OpenDOAR). The required data were manually collected from the OpenDOAR and websites of repositories to achieve the prescribed objectives of the study. The data were later tabulated, analysed and interpreted using simple arithmetic techniques.

The study was limited to the ETD repositories available in the OpenDOAR, and findings cannot be generalized across repositories and directories. It provides insights about ETD repositories worldwide, highlights their critical management issues and suggests mechanisms for their sustainable growth and development.

This article is purely based on research and its findings are valid for scholars, faculty members, institutions – as well as administrators and managers of the ETD repositories.

DOI : http://doi.org/10.1629/uksg.524