Le partage des données vu par les chercheurs : une approche par la valeur

Auteur/Author : Violaine Rebouillat

Le propos de cet article porte sur la compréhension des logiques qui interviennent dans la définition de la valeur des données de la recherche, celles-ci pouvant avoir une influence sur les critères déterminant leur motivation au partage.

L’approche méthodologique repose sur une enquête qualitative, menée dans le cadre d’une recherche doctorale, qui a déployé 57 entretiens semi-directifs. Alors que les travaux menés autour des données sont focalisés sur les freins et motivations du partage, l’originalité de cette recherche consiste à identifier les différents prismes par lesquels la question de la valeur des données impacte la motivation et la décision de leur partage.

L’analyse des résultats montre que, tous domaines confondus, la valeur des données reste encore cristallisée autour de la publication et de la reconnaissance symbolique du travail du chercheur.

Les résultats permettent de comprendre que la question du partage est confrontée à un impensé : celui du cadre actuel de l’évaluation de la recherche, qui met l’article scientifique au cœur de son dispositif.

Ce travail contribue donc à montrer que l’avenir du partage des données dépend des systèmes alternatifs futurs d’évaluation de la recherche, associés à la science ouverte.

URL : https://lesenjeux.univ-grenoble-alpes.fr/2021/varia/03-le-partage-des-donnees-vu-par-les-chercheurs-une-approche-par-la-valeur/

Modes d’évaluation ouverte par les pairs : de la revue à la plateforme

Auteurs/Authors : Evelyne Broudoux, Madjid Ihadjadene

Cet article a pour but de proposer un état de l’art des différentes formes de l’évaluation d’articles ou de communications par les pairs. De l’évaluation « aveugle» à l’évaluation « ouverte », de multiples possibilités existent et sont expérimentées.

C’est dans le champ des sciences que l’on trouve le plus d’innovations sociotechniques s’appuyant sur des plateformes de publication modélisant des workflows éditoriaux originaux.

L’ouverture de l’évaluation peut se produire entre pairs, en rendant publiques les identités et/ou les rapports des évaluateurs, à différents stades de l’article scientifique : préprint, en cours de rédaction, ou encore après publication.

Cet état de l’art est basé sur un ensemble de publications essentiellement produites par les acteurs de l’évaluation ouverte, issus principalement des disciplines STM.

URL : Modes d’évaluation ouverte par les pairs : de la revue à la plateforme

URL : https://revue-cossi.numerev.com/articles/revue-9/2496-modes-d-evaluation-ouverte-par-les-pairs-de-la-revue-a-la-plateforme

Prevalence of nonsensical algorithmically generated papers in the scientific literature

Authors : Guillaume Cabanac, Cyril Labbé

In 2014 leading publishers withdrew more than 120 nonsensical publications automatically generated with the SCIgen program. Casual observations suggested that similar problematic papers are still published and sold, without follow-up retractions.

No systematic screening has been performed and the prevalence of such nonsensical publications in the scientific literature is unknown. Our contribution is 2-fold.

First, we designed a detector that combs the scientific literature for grammar-based computer-generated papers. Applied to SCIgen, it has a 83.6% precision. Second, we performed a scientometric study of the 243 detected SCIgen-papers from 19 publishers.

We estimate the prevalence of SCIgen-papers to be 75 per million papers in Information and Computing Sciences. Only 19% of the 243 problematic papers were dealt with: formal retraction (12) or silent removal (34).

Publishers still serve and sometimes sell the remaining 197 papers without any caveat. We found evidence of citation manipulation via edited SCIgen bibliographies. This work reveals metric gaming up to the point of absurdity: fraudsters publish nonsensical algorithmically generated papers featuring genuine references.

It stresses the need to screen papers for nonsense before peer-review and chase citation manipulation in published papers. Overall, this is yet another illustration of the harmful effects of the pressure to publish or perish.

URL : Prevalence of nonsensical algorithmically generated papers in the scientific literature

DOI : https://doi.org/10.1002/asi.24495

Day-to-day discovery of preprint–publication links

Authors : Guillaume Cabanac, Theodora Oikonomidi, Isabelle Boutron

Preprints promote the open and fast communication of non-peer reviewed work. Once a preprint is published in a peer-reviewed venue, the preprint server updates its web page: a prominent hyperlink leading to the newly published work is added.

Linking preprints to publications is of utmost importance as it provides readers with the latest version of a now certified work. Yet leading preprint servers fail to identify all existing preprint–publication links.

This limitation calls for a more thorough approach to this critical information retrieval task: overlooking published evidence translates into partial and even inaccurate systematic reviews on health-related issues, for instance.

We designed an algorithm leveraging the Crossref public and free source of bibliographic metadata to comb the literature for preprint–publication links. We tested it on a reference preprint set identified and curated for a living systematic review on interventions for preventing and treating COVID-19 performed by international collaboration: the COVID-NMA initiative (covid-nma.com).

The reference set comprised 343 preprints, 121 of which appeared as a publication in a peer-reviewed journal. While the preprint servers identified 39.7% of the preprint–publication links, our linker identified 90.9% of the expected links with no clues taken from the preprint servers.

The accuracy of the proposed linker is 91.5% on this reference set, with 90.9% sensitivity and 91.9% specificity. This is a 16.26% increase in accuracy compared to that of preprint servers. We release this software as supplementary material to foster its integration into preprint servers’ workflows and enhance a daily preprint–publication chase that is useful to all readers, including systematic reviewers.

This preprint–publication linker currently provides day-to-day updates to the biomedical experts of the COVID-NMA initiative.

URL : Day-to-day discovery of preprint–publication links

DOI : https://doi.org/10.1007/s11192-021-03900-7

Publication by association: how the COVID-19 pandemic has shown relationships between authors and editorial board members in the field of infectious diseases

Authors : Clara Locher, David Moher, Ioana Alina Cristea, Florian Naudet

During the COVID-19 pandemic, the rush to scientific and political judgements on the merits of hydroxychloroquine was fuelled by dubious papers which may have been published because the authors were not independent from the practices of the journals in which they appeared.

This example leads us to consider a new type of illegitimate publishing entity, ‘self-promotion journals’ which could be deployed to serve the instrumentalisation of productivity-based metrics, with a ripple effect on decisions about promotion, tenure and grant funding, but also on the quality of manuscripts that are disseminated to the medical community and form the foundation of evidence-based medicine.

DOI : http://dx.doi.org/10.1136/bmjebm-2021-111670

Semantic micro-contributions with decentralized nanopublication services

Authors : Tobias Kuhn, Ruben Taelman, Vincent Emonet, Haris Antonatos, Stian Soiland-Reyes, Michel Dumontier

While the publication of Linked Data has become increasingly common, the process tends to be a relatively complicated and heavy-weight one. Linked Data is typically published by centralized entities in the form of larger dataset releases, which has the downside that there is a central bottleneck in the form of the organization or individual responsible for the releases.

Moreover, certain kinds of data entries, in particular those with subjective or original content, currently do not fit into any existing dataset and are therefore more difficult to publish.

To address these problems, we present here an approach to use nanopublications and a decentralized network of services to allow users to directly publish small Linked Data statements through a simple and user-friendly interface, called Nanobench, powered by semantic templates that are themselves published as nanopublications.

The published nanopublications are cryptographically verifiable and can be queried through a redundant and decentralized network of services, based on the grlc API generator and a new quad extension of Triple Pattern Fragments.

We show here that these two kinds of services are complementary and together allow us to query nanopublications in a reliable and efficient manner. We also show that Nanobench makes it indeed very easy for users to publish Linked Data statements, even for those who have no prior experience in Linked Data publishing.

URL : Semantic micro-contributions with decentralized nanopublication services

DOI : https://doi.org/10.7717/peerj-cs.387