Réflexions sur le fragment dans les pratiques scientifiques en ligne : entre matérialité documentaire et péricope

Authors : Gérald Kembellec, Thomas Bottini

This paper offers a multidisciplinary reflection (information and communication sciences, document engineering and digital document theory, computer science, "digital humanities", history of scholarly practices) on the uses of the fragment in online scientific documentary practices.

Building on these theoretical elements, it proposes a theoretical model for segmenting content into units of meaning (pericopes), along with directions for implementation.

URL : https://hal-univ-paris10.archives-ouvertes.fr/hal-01700064

Automating semantic publishing

Author : Silvio Peroni

Semantic Publishing involves the use of Web and Semantic Web technologies and standards for the semantic enhancement of a scholarly work so as to improve its discoverability, interactivity, openness and (re-)usability for both humans and machines.

Recently, people have suggested that the semantic enhancements of a scholarly work should be undertaken by the authors of that scholarly work, and should be considered as integral parts of the contribution subjected to peer review. However, this requires that the authors should spend additional time and effort adding such semantic annotations, time that they usually do not have available.

Thus, the most pragmatic way to facilitate this additional task is to use automated services that create the semantic annotation of authors’ scholarly articles by parsing the content that they have already written, thus reducing the additional time required of the authors to that for checking and validating these semantic annotations.

In this article, I propose a generic approach called compositional and iterative semantic enhancement (CISE) that enables the automatic enhancement of scholarly papers with additional semantic annotations in a way that is independent of the markup used for storing scholarly articles and the natural language used for writing their content.

URL : Automating semantic publishing

Alternative location : https://content.iospress.com/articles/data-science/ds012

Biotea: semantics for Pubmed Central

Authors : Alexander Garcia, Federico Lopez, Leyla Garcia, Olga Giraldo, Victor Bucheli, Michel Dumontier

A significant portion of biomedical literature is represented in a manner that makes it difficult for consumers to find or aggregate content through a computational query. One approach to facilitate reuse of the scientific literature is to structure this information as linked data using standardized web technologies.

In this paper we present the second version of Biotea, a semantic, linked data version of the open-access subset of PubMed Central that has been enhanced with specialized annotation pipelines that use existing infrastructure from the National Center for Biomedical Ontology.

We expose our models, services, software and datasets. Our infrastructure enables manual and semi-automatic annotation; the resulting data are represented as RDF-based linked data and can be readily queried using the SPARQL query language.

We illustrate the utility of our system with several use cases. Our datasets, methods and techniques are available at http://biotea.github.io.

URL : Biotea: semantics for Pubmed Central

DOI : https://doi.org/10.7717/peerj.4201

Research Articles in Simplified HTML: a Web-first format for HTML-based scholarly articles

Authors : Silvio Peroni, Francesco Osborne, Angelo Di Iorio, Andrea Giovanni Nuzzolese, Francesco Poggi, Fabio Vitali, Enrico Motta


This paper introduces the Research Articles in Simplified HTML (or RASH), which is a Web-first format for writing HTML-based scholarly papers; it is accompanied by the RASH Framework, i.e. a set of tools for interacting with RASH-based articles. The paper also presents an evaluation that involved authors and reviewers of RASH articles submitted to the SAVE-SD 2015 and SAVE-SD 2016 workshops.


RASH has been developed in order to: be easy to learn and use; share scholarly documents (and embedded semantic annotations) through the Web; support its adoption within the existing publishing workflow.


The evaluation study confirmed that RASH can already be adopted in workshops, conferences and journals and can be quickly learnt by researchers who are familiar with HTML.

Research limitations

The evaluation study also highlighted some issues in the adoption of RASH, and of HTML formats in general, especially by less technically savvy users. Moreover, additional tools are needed, e.g. for enabling additional conversion from/to existing formats such as OpenXML.

Practical implications

RASH (and its Framework) is another step towards enabling the definition of formal representations of the meaning of the content of an article, facilitating its automatic discovery, enabling its linking to semantically related articles, providing access to data within the article in actionable form, and allowing integration of data between papers.

Social implications

RASH addresses the intrinsic needs related to the various users of a scholarly article: researchers (focussing on its content), readers (experiencing new ways for browsing it), citizen scientists (reusing available data formally defined within it through semantic annotations), publishers (using the advantages of new technologies as envisioned by the Semantic Publishing movement).


RASH focuses strictly on writing the content of the paper (i.e., organisation of text + semantic annotations) and leaves all the issues about its validation, visualisation, conversion, and semantic data extraction to the various tools developed within its Framework.
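To make "semantic data extraction" from an HTML article more concrete, here is a minimal sketch in Python. It is not part of the RASH Framework: the class `PropertyCollector`, the function `extract_properties`, and the sample markup are all hypothetical, and the sketch simply assumes that annotations are carried by RDFa-style `property` attributes. It handles neither nested annotated elements nor full RDFa processing.

```python
from html.parser import HTMLParser
from typing import List, Optional, Tuple


class PropertyCollector(HTMLParser):
    """Collect (property, value) pairs from elements carrying a `property` attribute."""

    def __init__(self) -> None:
        super().__init__()
        self.found: List[Tuple[str, str]] = []
        self._open: Optional[Tuple[str, str]] = None  # (property, tag) being read
        self._text: List[str] = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        prop = attributes.get("property")
        # Ignore unannotated elements and anything nested in an open annotation.
        if prop is None or self._open is not None:
            return
        if attributes.get("content") is not None:
            # The value is given directly, e.g. on a <meta> element.
            self.found.append((prop, attributes["content"]))
        else:
            self._open = (prop, tag)
            self._text = []

    def handle_data(self, data):
        if self._open is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if self._open is not None and tag == self._open[1]:
            self.found.append((self._open[0], "".join(self._text).strip()))
            self._open = None


def extract_properties(html: str) -> List[Tuple[str, str]]:
    collector = PropertyCollector()
    collector.feed(html)
    collector.close()
    return collector.found


# Hypothetical fragment; the vocabulary prefixes are illustrative only.
SAMPLE = """
<section>
  <h1 property="dcterms:title">A sample article</h1>
  <p>Written by <span property="schema:name">Jane Doe</span>.</p>
  <meta property="dcterms:date" content="2016-01-01">
</section>
"""

if __name__ == "__main__":
    for name, value in extract_properties(SAMPLE):
        print(name, "=", value)
```

A real pipeline would more likely rely on a dedicated RDFa processor; the point here is only that annotations embedded in the HTML can be read back out without touching the prose.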

URL : https://essepuntato.github.io/papers/rash-peerj2016.html

Enjeux des « revues hypermédiatisées » pour l’édition scientifique

Authors : Lise Verlaet, Hans Dillaerts

In this article, we look at new forms of digital scientific journals. The changes brought about by digital technology have a fundamental impact on the scholarly communication sector (Dillaerts, 2012).

As we show in the first part through a review of the state of the art, these journals are very often limited, at first, to a simple transposition of the print version. New dissemination models have nevertheless emerged, notably Open Access (free access with the possibility of reusing and redistributing the article) and open science, which advocates an open scientific approach and innovative peer review models (open peer review and mega-journals).

Building on these observations, we then develop the concept of the "hypermediated journal", which we present with reference to the development of the journal COSSI (Communication, Organisation, Société du Savoir et Information). Inspired by the idea of the "mediating site" (Davallon & Jeanneret, 2004), a hypermediated journal offers a redocumentarisation (Pédauque, 2006 ; Salaün, 2007) of its corpus in order to bring out new meaning.

URL : https://archivesic.ccsd.cnrs.fr/sic_01476924


Using the Semantic Web for Rapid Integration of WikiPathways with Other Biological Online Data Resources

Authors : Andra Waagmeester, Martina Kutmon, Anders Riutta, Ryan Miller, Egon L. Willighagen, Chris T. Evelo, Alexander R. Pico

The diversity of online resources storing biological data in different formats provides a challenge for bioinformaticians to integrate and analyse their biological data.

The semantic web provides a standard to facilitate knowledge integration using statements built as triples describing a relation between two objects. WikiPathways, an online collaborative pathway resource, is now available in the semantic web through a SPARQL endpoint at http://sparql.wikipathways.org.
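As a rough illustration of what "statements built as triples" means, the following Python sketch stores a few (subject, predicate, object) statements and matches them against a pattern. The identifiers (`ex:pathway1`, `ex:isPartOf`, and so on) are invented for the example and are not taken from the WikiPathways vocabularies; `match` and `members_of` are hypothetical helper names.

```python
from typing import Iterable, Iterator, List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Hypothetical statements; none of these identifiers come from WikiPathways.
TRIPLES: List[Triple] = [
    ("ex:pathway1", "ex:hasTitle", "Glycolysis"),
    ("ex:geneA", "ex:isPartOf", "ex:pathway1"),
    ("ex:geneB", "ex:isPartOf", "ex:pathway1"),
    ("ex:geneB", "ex:isPartOf", "ex:pathway2"),
]


def match(
    triples: Iterable[Triple],
    subject: Optional[str] = None,
    predicate: Optional[str] = None,
    obj: Optional[str] = None,
) -> Iterator[Triple]:
    """Yield every triple that agrees with the given fields; None is a wildcard."""
    for s, p, o in triples:
        if subject is not None and s != subject:
            continue
        if predicate is not None and p != predicate:
            continue
        if obj is not None and o != obj:
            continue
        yield (s, p, o)


def members_of(pathway: str) -> List[str]:
    """Return the subjects stated to be part of the given pathway, sorted."""
    return sorted(s for s, _, _ in match(TRIPLES, predicate="ex:isPartOf", obj=pathway))


if __name__ == "__main__":
    print(members_of("ex:pathway1"))
```

A SPARQL engine generalises this idea: a query is a set of such patterns with shared variables, evaluated over a much larger graph.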

Having biological pathways in the semantic web allows rapid integration with data from other resources that contain information about elements present in pathways using SPARQL queries.
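For readers unfamiliar with how a SPARQL endpoint is addressed, here is a small sketch that builds a request URL in the "query via GET" form of the SPARQL 1.1 Protocol. The endpoint address is the one quoted in the abstract and may have moved since; the query is a deliberately generic one that uses no WikiPathways vocabulary, and `build_query_url` is a hypothetical helper name. The sketch only constructs the URL and does not contact the server.

```python
from urllib.parse import urlencode

# Endpoint as quoted in the abstract; assumed here, it may no longer be current.
ENDPOINT = "http://sparql.wikipathways.org"

# Generic query: return any five statements from the store.
QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 5"


def build_query_url(endpoint: str, query: str) -> str:
    """Return a GET URL carrying the query in the `query` parameter."""
    return endpoint + "?" + urlencode({"query": query})


if __name__ == "__main__":
    print(build_query_url(ENDPOINT, QUERY))
```

Sending the request (for example with `urllib.request.urlopen`) and asking for a JSON result format would be the next step, but the available formats depend on the server and are not described in the abstract.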

In order to convert WikiPathways content into meaningful triples we developed two new vocabularies that capture the graphical representation and the pathway logic, respectively. Each gene, protein, and metabolite in a given pathway is defined with a standard set of identifiers to support linking to several other biological resources in the semantic web.

WikiPathways triples were loaded into the Open PHACTS discovery platform and are available through its Web API (https://dev.openphacts.org/docs) to be used in various tools for drug development.

We combined various semantic web resources with the newly converted WikiPathways content using a variety of SPARQL query types and third-party resources, such as the Open PHACTS API. The ability to use pathway information to form new links across diverse biological data highlights the utility of integrating WikiPathways in the semantic web.

URL : Using the Semantic Web for Rapid Integration of WikiPathways with Other Biological Online Data Resources

DOI : http://dx.doi.org/10.1371/journal.pcbi.1004989

A Vision for Open Cyber-Scholarly Infrastructures

Author : Costantino Thanos

The characteristics of modern science, i.e., data-intensive, multidisciplinary, open, and heavily dependent on Internet technologies, entail the creation of a linked scholarly record that is online and open.

Instrumental in making this vision happen is the development of the next generation of Open Cyber-Scholarly Infrastructures (OCIs), i.e., enablers of an open, evolvable, and extensible scholarly ecosystem.

The paper delineates the evolving scenario of the modern scholarly record and describes the functionality of future OCIs as well as the radical changes in scholarly practices including new reading, learning, and information-seeking practices enabled by OCIs.

URL : A Vision for Open Cyber-Scholarly Infrastructures

Alternative location : http://www.mdpi.com/2304-6775/4/2/13