Building a Disciplinary, World‐Wide Data Infrastructure

Authors: Françoise Genova, Christophe Arviset, Bridget M. Almas, Laura Bartolo, Daan Broeder, Emily Law, Brian McMahon

Making scientific data discoverable, accessible, reusable, and interoperable requires work and raises challenges at the disciplinary level, in particular to define how the data should be formatted and described.

This paper represents the Proceedings of a session held at SciDataCon 2016 (Denver, 12–13 September 2016). It explores the way a range of disciplines, namely materials science, crystallography, astronomy, earth sciences, humanities and linguistics, get organized at the international level to address those challenges.

The disciplinary culture with respect to data sharing, science drivers, organization, lessons learnt and the elements of the data infrastructure which are or could be shared with others are briefly described. Commonalities and differences are assessed.

Common key elements for success are identified: data sharing should be science driven; defining the disciplinary part of the interdisciplinary standards is mandatory but challenging; sharing of applications should accompany data sharing. Incentives such as journal and funding agency requirements are also similar.

For all, social aspects are more challenging than technological ones. Governance is more diverse, often specific to the discipline organization. Being problem‐driven is also a key factor of success for building bridges to enable interdisciplinary research.

Several international data organizations such as CODATA, RDA and WDS can facilitate the establishment of disciplinary interoperability frameworks. As a spin‐off of the session, an RDA Disciplinary Interoperability Interest Group is proposed to bring together representatives across disciplines to better organize and drive the discussion for prioritizing, harmonizing and efficiently articulating disciplinary needs.

URL : Building a Disciplinary, World‐Wide Data Infrastructure

DOI : http://doi.org/10.5334/dsj-2017-016

 

Scientific data from and for the citizen

Authors : Sven Schade, Chrisa Tsinaraki, Elena Roglia

Powered by advances in technology, today’s Citizen Science projects cover a wide range of thematic areas and are carried out from local to global levels. This wealth of activities creates an abundance of data, for example, in the form of observations submitted by mobile phones; readings of low-cost sensors; or more general information about people’s activities.

The management and possible sharing of this data has become a research topic in its own right. We conducted a survey in the summer of 2015 in order to collectively analyze the state of play in Citizen Science.

This paper summarizes our main findings related to data access, standardization and data preservation. We provide examples of good practices in each of these areas and outline actions to address identified challenges.

URL : http://firstmonday.org/ojs/index.php/fm/article/view/7842

Interoperability and organizational logics: what opening one’s data means [Interopérabilité et logiques organisationnelles. Ce qu’ouvrir ses données veut dire]

Authors : Marie Després-Lonnet, Béatrice Micheau, Marie Destandau

With a view to opening up public data, we are supporting three institutions that manage music-related collections through this threefold technical, organizational and political change.

The aim is to design an "ontology" that will underpin the description of music. Our collaboration with the experts allowed us to grasp the tensions this project generates, despite the collective will to arrive at a shared model.

We were thus able to show that each institution takes a situated view of music as a social practice and of the objects and documents it holds. The search for a common model that could be applied globally requires each institution to consider the data and the associated concepts more broadly and to partly call into question its own ways of working.

Our study shows that, to avoid ending up with a totally abstract model, modelling should be seen as a form of discourse that continues the writing practices of our cultural heritage: living writings, made of constant negotiation between norms and makeshift solutions (bricolages), between organizational necessities and adaptation to one-off constraints. We keep finding their many traces, which provide material for our research on the anthropology of knowledge.

The research presented in this article was partially funded by the project ANR-2014-CE24-0020 "DOREMUS".

URL : http://www.revue-cossi.info/numeros/n-2-2017-bricolages-improvisations-et-resilience-organisationnelle-face-aux-risques-informationnels-et-communicationnels/663-2-2017-revue-despres-lonnet-micheau-destandau

Towards certified open data in digital service ecosystems

Authors : Anne Immonen, Eila Ovaska, Tuomas Paaso

The opportunities of open data have been recently recognized among companies in different domains. Digital service providers have increasingly been interested in the possibilities of innovating new ideas and services around open data.

Digital service ecosystems provide several advantages for service developers, enabling the service co-innovation and co-creation among ecosystem members utilizing and sharing common assets and knowledge.

The utilization of open data in digital services requires new innovation practices, service development models, and a collaboration environment. These can be provided by the ecosystem. However, since open data can be almost anything and originate from different kinds of data sources, the quality of data becomes the key issue.

The new challenge for service providers is how to guarantee the quality of open data. In the ecosystems, uncertain data quality poses major challenges. The main contribution of this paper is the concept of the Evolvable Open Data based digital service Ecosystem (EODE), which defines the kinds of knowledge and services that are required for validating open data in digital service ecosystems.

Thus, the EODE provides business potential for open data and digital service providers, as well as other actors around open data. The ecosystem capability model, knowledge management models, and the taxonomy of services to support the open data quality certification are described.

Data quality certification confirms that the open data is trustworthy and its quality is good enough to be accepted for the usage of the ecosystem’s services. The five-phase open data quality certification process, according to which open data is brought to the ecosystem and certified for the usage of the digital service ecosystem members using the knowledge models and support services of the ecosystem, is also described.

The initial experiences of the still ongoing validation steps are summarized, and the concept limitations and future development targets are identified.

URL : Towards certified open data in digital service ecosystems

DOI : 10.1007/s11219-017-9378-2

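The challenge of interoperability between platforms for building augmented knowledge in the humanities and social sciences [Le défi de l’interopérabilité entre plates-formes pour la construction de savoirs augmentés en sciences humaines et sociales]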

Authors : Camille Prime-Claverie, Annaïg Mahé

In the digital age, the research sector generates a proliferation of digital content, and ensuring better access to research results is a goal that might seem easily achievable.

Yet for the past decade, scholarly communication has been undergoing profound changes, which make it difficult for all stakeholders to position themselves in this new context.

Information is scattered across several platforms created at the initiative of different types of stakeholders, whose positions and interests sometimes diverge.

In this widely distributed environment, achieving interoperability becomes a major issue for better access to scientific and technical information (IST), while also enabling data to circulate and be enriched.

This contribution addresses the circulation and sharing of scholarly literature in the humanities and social sciences (SHS) in France, based on data that can be harvested via the OAI-PMH protocol.

It attempts to highlight what constitutes opportunities or obstacles for reuse, editorialization and the construction of augmented knowledge in this field.

The study focuses on five French platforms that provide scholarly documents in the humanities and social sciences, and on one service provider offering enrichment features.

URL : https://archivesic.ccsd.cnrs.fr/sic_01511618

Perseids: Experimenting with Infrastructure for Creating and Sharing Research Data in the Digital Humanities

Author : Bridget Almas

The Perseids project provides a platform for creating, publishing, and sharing research data, in the form of textual transcriptions, annotations and analyses. An offshoot and collaborator of the Perseus Digital Library (PDL), Perseids is also an experiment in reusing and extending existing infrastructure, tools, and services.

This paper discusses infrastructure in the domain of digital humanities (DH). It outlines some general approaches to facilitating data sharing in this domain, and the specific choices we made in developing Perseids to serve that goal.

It concludes by identifying lessons we have learned about sustainability in the process of building Perseids, noting some critical gaps in infrastructure for the digital humanities, and suggesting some implications for the wider community.

URL : Perseids: Experimenting with Infrastructure for Creating and Sharing Research Data in the Digital Humanities

DOI : http://doi.org/10.5334/dsj-2017-019

COAR Roadmap Future Directions for Repository Interoperability

« In the past few years, Open Access repositories and their associated services have become an important component of the global e-research infrastructure. Increasingly, repositories are also being integrated with other systems, such as research administrative systems and with research data repositories, with the aim of providing a more integrated and seamless suite of services to various communities. Repositories can also be connected into networks (e.g. at the national or regional level) to support unified access to an open, aggregated collection of scholarship and related materials that machines can mine enabling researchers to work with content in new ways and allowing funders and institutions to track research outputs.
Scholarly communication is undergoing fundamental changes, in particular with new requirements for open access to research outputs, new forms of peer-review, and alternative methods for measuring impact. In parallel, technical developments, especially in communication and interface technologies facilitate bi-directional data exchange across related applications and systems. The aim of this roadmap is to identify important trends and their associated action points in order for the repository community to determine priorities for further investments in interoperability. »

URL : COAR Roadmap Future Directions for Repository Interoperability

Alternative URL : https://www.coar-repositories.org/files/Roadmap_final_formatted_20150203.pdf