Les données de la recherche et leurs entrepôts, de la documentation à la réutilisation : étude de cas pour l’archive HAL

Author : Marilou Pain

The national, multidisciplinary open archive HAL now hosts research data as well as supplementary data in the form of annexes.

In an attempt to define directions for this infrastructure, this thesis presents a state of the art of the various actors and issues surrounding research data. It then describes the services implemented by research data repositories and the challenges they must address.

Finally, it offers an exploratory study of the supplementary data hosted by HAL, seeking to identify which scientific communities use this service and in what forms.

Alternative location : https://memsic.ccsd.cnrs.fr/mem_01374509v1

How Do Scientists Define Openness? Exploring the Relationship Between Open Science Policies and Research Practice

Authors : Nadine Levin, Sabina Leonelli, Dagmara Weckowska, David Castle, John Dupré

This article documents how biomedical researchers in the United Kingdom understand and enact the idea of “openness.”

This is of particular interest to researchers and science policy worldwide in view of the recent adoption of pioneering policies on Open Science and Open Access by the U.K. government—policies whose impact on and implications for research practice are in need of urgent evaluation, so as to decide on their eventual implementation elsewhere.

This study is based on 22 in-depth interviews with U.K. researchers in systems biology, synthetic biology, and bioinformatics, which were conducted between September 2013 and February 2014.

Through an analysis of the interview transcripts, we identify seven core themes that characterize researchers’ understanding of openness in science and nine factors that shape the practice of openness in research.

Our findings highlight the implications that Open Science policies can have for research processes and outcomes and provide recommendations for enhancing their content, effectiveness, and implementation.

Alternative location : http://bst.sagepub.com/content/early/2016/09/30/0270467616668760.abstract

Data trajectories: tracking reuse of published data for transitive credit attribution

Author : Paolo Missier

The ability to measure the use and impact of published data sets is key to the success of the open data/open science paradigm. A direct measure of impact would require tracking data (re)use in the wild, which is difficult to achieve.

This is therefore commonly replaced by simpler metrics based on data download and citation counts. In this paper we describe a scenario where it is possible to track the trajectory of a dataset after its publication, and show how this enables the design of accurate models for ascribing credit to data originators.

A Data Trajectory (DT) is a graph that encodes knowledge of how, by whom, and in which context data has been re-used, possibly after several generations. We provide a theoretical model of DTs that is grounded in the W3C PROV data model for provenance, and we show how DTs can be used to automatically propagate a fraction of the credit associated with transitively derived datasets, back to original data contributors.

We also show this model of transitive credit in action by means of a Data Reuse Simulator. In the longer term, our ultimate hope is that credit models based on direct measures of data reuse will provide further incentives to data publication.

We conclude by outlining a research agenda to address the hard questions of creating, collecting, and using DTs systematically across a large number of data reuse instances in the wild.
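
As a rough illustration of the transitive credit idea described above, the following minimal sketch passes a fraction of each dataset's credit back to the datasets it was derived from. It is not the paper's model or its Data Reuse Simulator: the function name `propagate_credit`, the `fraction` parameter, the equal split among sources, and the choice to add the upstream share on top of (rather than deduct it from) the derived dataset's credit are all assumptions made for this example. The `derived_from` mapping loosely mirrors the `wasDerivedFrom` relation of W3C PROV.

```python
from collections import defaultdict
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def propagate_credit(derived_from, direct_credit, fraction=0.5):
    """Propagate credit upstream along a derivation graph (hypothetical sketch).

    derived_from: dict mapping a dataset to the list of datasets it was derived from.
    direct_credit: dict mapping a dataset to the credit it earned directly.
    fraction: assumed share of a dataset's total credit passed on to its sources,
              split equally among them.
    Returns a dict of total credit per dataset.
    """
    total = defaultdict(float)
    for dataset, credit in direct_credit.items():
        total[dataset] += credit

    # Order the datasets so that every derived dataset is visited before its
    # sources; its credit is then final by the time it is passed upstream.
    sorter = TopologicalSorter()
    for dataset in direct_credit:
        sorter.add(dataset)
    for derived, sources in derived_from.items():
        sorter.add(derived)
        for source in sources:
            sorter.add(source, derived)  # visit `source` after `derived`

    # static_order() raises graphlib.CycleError if the graph is not acyclic.
    for dataset in sorter.static_order():
        sources = derived_from.get(dataset, [])
        if sources:
            share = fraction * total[dataset] / len(sources)
            for source in sources:
                total[source] += share
    return dict(total)


# Dataset C was derived from B, which was derived from the original dataset A.
credit = propagate_credit(
    derived_from={"B": ["A"], "C": ["B"]},
    direct_credit={"A": 0.0, "B": 0.0, "C": 8.0},
    fraction=0.5,
)
print(credit)
```

In this toy chain, only C earns credit directly, yet A and B both end up with a nonzero total after two generations of reuse. A credit-conserving variant would instead deduct the propagated share from the derived dataset.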

DOI : http://dx.doi.org/10.2218/ijdc.v11i1.425

Recognizing the Diversity of Contributions: A Case Study for Framing Attribution and Acknowledgement for Scientific Data

Authors : Chung-Yi Hou, Matthew Mayernik

As scientific data volumes, format types, and sources increase rapidly with the invention and improvement of scientific capabilities, the resulting datasets are becoming more complex to manage as well.

One of the significant management challenges is pulling apart the individual contributions of specific people and organizations within large, complex projects.

This is important for two reasons: 1) assigning responsibility and accountability for scientific work, and 2) giving professional credit (e.g. for hiring, promotion, and tenure) to individuals who work within such large projects.

This paper aims to review the extant practice of data attribution and how it may be improved. Through a case study of creating a detailed attribution record for a climate model dataset, the paper evaluates the strengths and weaknesses of the current data attribution method and proposes an alternative attribution framework accordingly.

The paper concludes by demonstrating that, analogous to acknowledging the different roles and responsibilities shown in movie credits, the methodology developed in the study could be used in general to identify and map out the relationships among the organizations and individuals who had contributed to a dataset.

As a result, the framework could be applied to create data attribution for other dataset types beyond climate model datasets.

DOI : http://dx.doi.org/10.2218/ijdc.v11i1.357

Factors Influencing Research Data Reuse in the Social Sciences : An Exploratory Study

Author : Renata Gonçalves Curty

The development of e-Research infrastructure has enabled data to be shared and accessed more openly. Policy mandates for data sharing have contributed to the increasing availability of research data through data repositories, which create favourable conditions for the re-use of data for purposes not always anticipated by original collectors.

Despite the current efforts to promote transparency and reproducibility in science, data re-use cannot be assumed, nor merely considered a ‘thrifting’ activity where scientists shop around in data repositories considering only the ease of access to data.

The lack of an integrated view of individual, social and technological influential factors to intentional and actual data re-use behaviour was the key motivator for this study. Interviews with 13 social scientists produced 25 factors that were found to influence their perceptions and experiences, including both their unsuccessful and successful attempts to re-use data.

These factors were grouped into six theoretical variables: perceived benefits, perceived risks, perceived effort, social influence, facilitating conditions, and perceived re-usability.

These research findings provide an in-depth understanding about the re-use of research data in the context of open science, which can be valuable in terms of theory and practice to help leverage data re-use and make publicly available data more actionable.

DOI : http://dx.doi.org/10.2218/ijdc.v11i1.401

Où sont les données de la recherche ? : Essai de cartographie

Author : Cécile Delay-Artous

The emerging question of research data in France sits within an institutional framework that is abundant yet rigid, and difficult to pin down. Research is also funded and evaluated at the European level.

This national and European organization is compounded by the international dimension inherent to research and to the rapid, repeated exchanges of information accelerated by the development of the Internet.

The Franco-European institutional labyrinth is thus overlaid on the international and disciplinary layer cake of the research world. Finally, the proximity of two movements that are nonetheless not synonymous, Open Access and Open Data, further blurs the picture.

It is therefore not easy to understand the role of each actor with respect to research data. We propose to help clarify this landscape by initiating a map of the initiatives and actors visible in France concerning data in the humanities and social sciences.

URL : https://halshs.archives-ouvertes.fr/halshs-01369745

Badges to Acknowledge Open Practices: A Simple, Low-Cost, Effective Method for Increasing Transparency

Authors : Mallory C. Kidwell, Ljiljana B. Lazarević, Erica Baranski, Tom E. Hardwicke, Sarah Piechowski, Lina-Sophia Falkenberg, Curtis Kennett, Agnieszka Slowik, Carina Sonnleitner, Chelsey Hess-Holden, Timothy M. Errington, Susann Fiedler, Brian A. Nosek

Beginning January 2014, Psychological Science gave authors the opportunity to signal open data and materials if they qualified for badges that accompanied published articles. Before badges, less than 3% of Psychological Science articles reported open data.

After badges, 23% reported open data, with an accelerating trend; 39% reported open data in the first half of 2015, an increase of more than an order of magnitude from baseline. There was no change over time in the low rates of data sharing among comparison journals.

Moreover, reporting openness does not guarantee openness. When badges were earned, reportedly available data were more likely to be actually available, correct, usable, and complete than when badges were not earned.

Open materials also increased to a weaker degree, and there was more variability among comparison journals. Badges are simple, effective signals to promote open practices and improve preservation of data and materials by using independent repositories.

DOI : http://dx.doi.org/10.1371/journal.pbio.1002456