Tag : research data

"To begin with, could you define 'research data'?" An attempt at an answer

Article author : Hans Dillaerts
Article date : 3 June 2017

Authors : Joachim Schöpfel, Eric Kergosien, Hélène Prost

The D4Humanities project belongs to the field of Digital Humanities: how can research data in the social sciences and humanities (SSH), such as textual or oral corpora, raw data and images, be explored with digital techniques (text and data mining, mapping, visualisation…) in order to construct new meaning?

It continues the work of the GERiiCO laboratory and its partners at the Université de Lille Sciences Humaines et Sociales (SCD, ED SHS, ANRT…), with the aim of accelerating the research data effort, particularly with regard to doctoral students and early-career researchers, and of facilitating the setting up of an international research project.

In particular, the project has three strands: (1) practices and needs in the field of research data (a qualitative survey of behaviours, attitudes, motivations and needs regarding the management and sharing of research data); (2) a workflow for depositing the data of SSH doctoral students (deposit, preservation and dissemination of data via the NAKALA service of the TGIR Huma-Num); (3) research on data and theses (concept and typology of data in SSH; evolution of the content, formats, structures and requirements of theses in the Open Science environment).

The project will be carried out with ISN Oldenburg and other foreign partners; it will facilitate the creation of a consortium and the setting up of a Digital Humanities research project on the doctoral theses of the future, with European (H2020) or Franco-German (ANR/DFG) funding.

This paper presents the main lines of the study on data under strand 3, that is, the analysis of the concept of research data, in order to better delimit identification (granularity), to better understand the distinction and relationships between primary and secondary data, and to refine the categorisation of data in SSH. The emphasis is on a threefold approach: conceptual, typological and functional.

URL : http://hal.univ-lille3.fr/hal-01530937

Tags : Eric Kergosien, Hélène Prost, Joachim Schöpfel, research data

A Trust Framework for Online Research Data Services

Article author : Hans Dillaerts
Article date : 3 June 2017

Authors : Malcolm Wolski, Louise Howard, Joanna Richardson

There is worldwide interest in the potential of open science to increase the quality, impact, and benefits of science and research. More recently, attention has been focused on aspects such as transparency, quality, and provenance, particularly in regard to data.

For industry, citizens, and other researchers to participate in the open science agenda, further work needs to be undertaken to establish trust in research environments.

Based on a critical review of the literature, this paper examines the issue of trust in an open science environment, using virtual laboratories as the focus for discussion. A trust framework, which has been developed from an end-user perspective, is proposed as a model for addressing relevant issues within online research data services and tools.

URL : A Trust Framework for Online Research Data Services

DOI : http://dx.doi.org/10.3390/publications5020014

Tags : Joanna Richardson, Louise Howard, Malcolm Wolski, open science, research data

Social Science Data Repositories in Data Deluge: A Case Study at ICPSR Workflow and Practices

Article author : Hans Dillaerts
Article date : 19 May 2017

Authors : Wei Jeng, Daqing He, Yu Chi

Design/methodology/approach

We conducted two focus group sessions and one individual interview with eight employees at the world’s largest social science data repository, the Interuniversity Consortium for Political and Social Research (ICPSR).

By examining their current actions (activities regarding their work responsibilities) and IT practices, we studied the barriers and challenges of archiving and curating qualitative data at ICPSR.

Purpose

Due to the recent surge of interest in the age of the data deluge, the importance of researching data infrastructures is increasing. The Open Archival Information System (OAIS) model has been widely adopted as a framework for creating and maintaining digital repositories.

Considering that OAIS is a reference model that requires customization for actual practice, this study examines how the current practices in a data repository map to the OAIS environment and functional components.

Findings

We observed that the OAIS model is robust and reliable in actual service processes for data curation and data archives. In addition, a data repository’s workflow resembles that of digital archives or even digital libraries.

On the other hand, we find that: 1) the cost of preventing disclosure risk and 2) a lack of agreement on the standards of text data files are the most apparent obstacles for data curation professionals to handle qualitative data; 3) the maturation of data metrics seems to be a promising solution to several challenges in social science data sharing.

Original value

We evaluated the gap between a research data repository’s current practices and the adoption of the OAIS model. We also identified answers to questions such as how current technological infrastructure in a leading data repository such as ICPSR supports their daily operations, what the ideal technologies in those data repositories would be, and the associated challenges that accompany these ideal technologies.

Most importantly, we helped to prioritize challenges and barriers from the data curator’s perspective, and contribute implications of data sharing and reuse in social sciences.

URL : http://d-scholarship.pitt.edu/31876/

Tags : case study, Daqing He, data repositories, OAIS model, research data, Wei Jeng, Yu Chi

Semantic representation and enrichment of information retrieval experimental data

Article author : Hans Dillaerts
Article date : 15 May 2017

Authors : Gianmaria Silvello, Georgeta Bordea, Nicola Ferro, Paul Buitelaar, Toine Bogers

Experimental evaluation carried out in international large-scale campaigns is a fundamental pillar of the scientific and technological advancement of information retrieval (IR) systems.

Such evaluation activities produce a large quantity of scientific and experimental data, which are the foundation for all the subsequent scientific production and development of new systems.

In this work, we discuss how to semantically annotate and interlink this data, with the goal of enhancing their interpretation, sharing, and reuse. We discuss the underlying evaluation workflow and propose a resource description framework model for those workflow parts.

We use expertise retrieval as a case study to demonstrate the benefits of our semantic representation approach. We employ this model as a means for exposing experimental data as linked open data (LOD) on the Web and as a basis for enriching and automatically connecting this data with expertise topics and expert profiles.

In this context, a topic-centric approach for expert search is proposed, addressing the extraction of expertise topics, their semantic grounding with the LOD cloud, and their connection to IR experimental data.

Several methods for expert profiling and expert finding are analysed and evaluated. Our results show that it is possible to construct expert profiles starting from automatically extracted expertise topics and that topic-centric approaches outperform state-of-the-art language modelling approaches for expert finding.

URL : https://aran.library.nuigalway.ie/handle/10379/5862

Tags : experimental data, Georgeta Bordea, Gianmaria Silvello, Linked Open Data, Nicola Ferro, Paul Buitelaar, research data, semantic enrichment, Toine Bogers

Advancing research data publishing practices for the social sciences: from archive activity to empowering researchers

Article author : Hans Dillaerts
Article date : 14 May 2017

Authors : Veerle Van den Eynden, Louise Corti

Sharing and publishing social science research data have a long history in the UK, through long-standing agreements with government agencies for sharing survey data and the data policy, infrastructure, and data services supported by the Economic and Social Research Council.

The UK Data Service and its predecessors developed data management, documentation, and publishing procedures and protocols that stand today as robust templates for data publishing.

As the ESRC research data policy requires grant holders to submit their research data to the UK Data Service after a grant ends, setting standards and promoting them has been essential in raising the quality of the resulting research data being published. In the past, received data were all processed, documented, and published for reuse in-house.

Recent investments have focused on guiding and training researchers in good data management practices and skills for creating shareable data, as well as a self-publishing repository system, ReShare. ReShare also receives data sets described in published data papers and achieves scientific quality assurance through peer review of submitted data sets before publication.

Social science data are reused for research, to inform policy, in teaching and for methods learning. Over a 10-year period, responsive developments in system workflows, access control options, persistent identifiers, templates, and checks, together with targeted guidance for researchers, have helped raise the standard of self-publishing social science data.

Lessons learned and developments in shifting publishing social science data from an archivist responsibility to a researcher process are showcased, as inspiration for institutions setting up a data repository.

URL : Advancing research data publishing practices for the social sciences: from archive activity to empowering researchers

DOI : 10.1007/s00799-016-0177-3

Tags : data sharing, Louise Corti, research data, research data management, social sciences, UK, United Kingdom, Veerle Van den Eynden

What incentives increase data sharing in health and medical research? A systematic review

Article author : Hans Dillaerts
Article date : 6 May 2017

Authors : Anisa Rowhani-Farid, Michelle Allen, Adrian G. Barnett

Background

The foundation of health and medical research is data. Data sharing facilitates the progress of research and strengthens science. Data sharing in research is widely discussed in the literature; however, there are seemingly no evidence-based incentives that promote data sharing.

Methods

A systematic review (registration: doi.org/10.17605/OSF.IO/6PZ5E) of the health and medical research literature was used to uncover any evidence-based incentives, with pre- and post-empirical data that examined data sharing rates.

We were also interested in quantifying and classifying the number of opinion pieces on the importance of incentives, the number of observational studies that analysed data sharing rates and practices, and strategies aimed at increasing data sharing rates.

Results

Only one incentive (using open data badges) has been tested in health and medical research that examined data sharing rates. The number of opinion pieces (n = 85) outweighed the number of articles testing strategies (n = 76), and the number of observational studies exceeded them both (n = 106).

Conclusions

Given that data is the foundation of evidence-based health and medical research, it is paradoxical that there is only one evidence-based incentive to promote data sharing. More well-designed studies are needed in order to increase the currently low rates of data sharing.

URL : What incentives increase data sharing in health and medical research? A systematic review

Alternative location : http://researchintegrityjournal.biomedcentral.com/articles/10.1186/s41073-017-0028-9

Tags : Adrian G. Barnett, Anisa Rowhani-Farid, biomedical research, data sharing, Michelle Allen, research data
