Research Articles in Simplified HTML: a Web-first format for HTML-based scholarly articles

Authors : Silvio Peroni, Francesco Osborne, Angelo Di Iorio, Andrea Giovanni Nuzzolese, Francesco Poggi, Fabio Vitali, Enrico Motta


This paper introduces the Research Articles in Simplified HTML (or RASH), which is a Web-first format for writing HTML-based scholarly papers; it is accompanied by the RASH Framework, i.e. a set of tools for interacting with RASH-based articles. The paper also presents an evaluation that involved authors and reviewers of RASH articles submitted to the SAVE-SD 2015 and SAVE-SD 2016 workshops.


RASH has been developed in order to: be easy to learn and use; share scholarly documents (and embedded semantic annotations) through the Web; support its adoption within the existing publishing workflow.


The evaluation study confirmed that RASH can already be adopted in workshops, conferences and journals and can be quickly learnt by researchers who are familiar with HTML.

Research limitations

The evaluation study also highlighted some issues in the adoption of RASH, and of HTML formats in general, especially by less technically savvy users. Moreover, additional tools are needed, e.g. for enabling additional conversion from/to existing formats such as OpenXML.

Practical implications

RASH (and its Framework) is another step towards enabling the definition of formal representations of the meaning of the content of an article, facilitating its automatic discovery, enabling its linking to semantically related articles, providing access to data within the article in actionable form, and allowing integration of data between papers.

Social implications

RASH addresses the intrinsic needs related to the various users of a scholarly article: researchers (focussing on its content), readers (experiencing new ways for browsing it), citizen scientists (reusing available data formally defined within it through semantic annotations), publishers (using the advantages of new technologies as envisioned by the Semantic Publishing movement).


RASH focuses strictly on writing the content of the paper (i.e., organisation of text + semantic annotations) and leaves all the issues about its validation, visualisation, conversion, and semantic data extraction to the various tools developed within its Framework.
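As a rough illustration of what a validation tool for a simplified HTML format might do, the sketch below flags elements that fall outside an allow-list. It assumes that "simplified HTML" means a restricted set of elements; the `ALLOWED` set, the `SubsetChecker` class and the `disallowed_tags` helper are hypothetical and are not taken from the RASH specification or Framework.

```python
from html.parser import HTMLParser

# Illustrative allow-list only; this is NOT the actual RASH element set.
ALLOWED = {"html", "head", "title", "body", "section", "h1", "p", "a", "em", "strong"}


class SubsetChecker(HTMLParser):
    """Collects the names of start tags that are not in ALLOWED."""

    def __init__(self):
        super().__init__()
        self.disallowed = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names already lowercased.
        if tag not in ALLOWED:
            self.disallowed.append(tag)


def disallowed_tags(html_text):
    """Return, in document order, every tag that falls outside the allow-list."""
    checker = SubsetChecker()
    checker.feed(html_text)
    checker.close()
    return checker.disallowed


sample = (
    "<html><body><section><h1>Intro</h1>"
    "<p>Text <font>old</font></p><marquee>x</marquee>"
    "</section></body></html>"
)
print(disallowed_tags(sample))  # ['font', 'marquee']
```

Keeping the check separate from the document itself mirrors the division of labour described above: the article only carries content, and tooling decides whether it conforms.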


The stakes of "hypermediated journals" for scholarly publishing (Enjeux des « revues hypermédiatisées » pour l’édition scientifique)

Authors : Lise Verlaet, Hans Dillaerts

In this article we look at new forms of digital scholarly journals. The changes brought about by digital technology have had a fundamental impact on the scholarly communication sector (Dillaerts, 2012).

As we show in the first part through a review of the state of the art, these journals are often limited, at first, to a simple transposition of the print version. New dissemination models have nevertheless emerged, notably Open Access (free access with the possibility of reusing and redistributing the article) and open science, which advocates an open scientific approach, along with innovative peer review models (open peer review and mega-journals).

Building on these observations, we then develop the concept of the "hypermediated journal", which we present in light of the development of the journal COSSI (Communication, Organisation, Société du Savoir et Information). Inspired by the idea of the "mediating site" (Davallon & Jeanneret, 2004), a hypermediated journal offers a redocumentarisation (Pédauque, 2006; Salaün, 2007) of its corpus in order to bring out new meaning.



Using the Semantic Web for Rapid Integration of WikiPathways with Other Biological Online Data Resources

Authors : Andra Waagmeester, Martina Kutmon, Anders Riutta, Ryan Miller, Egon L. Willighagen, Chris T. Evelo, Alexander R. Pico

The diversity of online resources storing biological data in different formats poses a challenge for bioinformaticians who need to integrate and analyse their biological data.

The semantic web provides a standard to facilitate knowledge integration using statements built as triples describing a relation between two objects. WikiPathways, an online collaborative pathway resource, is now available in the semantic web through a SPARQL endpoint.

Having biological pathways in the semantic web allows rapid integration with data from other resources that contain information about elements present in pathways using SPARQL queries.

In order to convert WikiPathways content into meaningful triples we developed two new vocabularies that capture the graphical representation and the pathway logic, respectively. Each gene, protein, and metabolite in a given pathway is defined with a standard set of identifiers to support linking to several other biological resources in the semantic web.
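To make the idea of linking through shared identifiers concrete, here is a minimal sketch in plain Python rather than SPARQL. Two toy triple sets are joined on the gene identifiers they have in common. Every name in it (the `ex:` and `id:` prefixes, the predicates, the `match` and `diseases_for_pathway` helpers) is a made-up placeholder, not part of the WikiPathways vocabularies or of any real dataset.

```python
# Toy triples (subject, predicate, object); all identifiers are hypothetical.
pathway_triples = [
    ("ex:pathway1", "ex:hasGene", "id:GENE_A"),
    ("ex:pathway1", "ex:hasGene", "id:GENE_B"),
    ("ex:pathway2", "ex:hasGene", "id:GENE_C"),
]
other_triples = [
    ("id:GENE_A", "ex:associatedWith", "ex:disease1"),
    ("id:GENE_C", "ex:associatedWith", "ex:disease2"),
    ("id:GENE_D", "ex:associatedWith", "ex:disease3"),
]


def match(triples, s=None, p=None, o=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def diseases_for_pathway(pathway):
    """Join the two triple sets on the gene identifier they share."""
    genes = {o for _, _, o in match(pathway_triples, s=pathway, p="ex:hasGene")}
    return sorted(
        o for s, _, o in match(other_triples, p="ex:associatedWith") if s in genes
    )


print(diseases_for_pathway("ex:pathway1"))  # ['ex:disease1']
```

The join only works because both sets name the gene with the same identifier, which is the role the standard identifier set plays in the text above.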

WikiPathways triples were loaded into the Open PHACTS discovery platform and are available through its Web API to be used in various tools for drug development.

We combined various semantic web resources with the newly converted WikiPathways content using a variety of SPARQL query types and third-party resources, such as the Open PHACTS API. The ability to use pathway information to form new links across diverse biological data highlights the utility of integrating WikiPathways in the semantic web.



A Vision for Open Cyber-Scholarly Infrastructures

Author : Costantino Thanos

The characteristics of modern science, i.e., data-intensive, multidisciplinary, open, and heavily dependent on Internet technologies, entail the creation of a linked scholarly record that is online and open.

Instrumental in making this vision happen is the development of the next generation of Open Cyber-Scholarly Infrastructures (OCIs), i.e., enablers of an open, evolvable, and extensible scholarly ecosystem.

The paper delineates the evolving scenario of the modern scholarly record and describes the functionality of future OCIs as well as the radical changes in scholarly practices including new reading, learning, and information-seeking practices enabled by OCIs.


Tackling complexity in an interdisciplinary scholarly network: Requirements for semantic publishing

Scholarly communication is complex. The clarification of concepts like “academic publication”, “document”, “semantics” and “ontology” facilitates tracking the limitations and benefits of the media of the current publishing system, as well as of a possible alternative medium.

In this paper, requirements for such a new medium of scholarly communication, labeled Scholarly Network, have been collected and a basic model has been developed. An interdisciplinary network of concepts and assertions, created with the help of Semantic Web technologies by scholars and reviewed by peers and information professionals, can provide a quick overview of the state of research.

The model picks up the concept of Nanopublications, but maps information in a more granular way. Theories of Radical Constructivism are of great help in understanding which problems, such as inconsistency, have to be solved when developing such a publication medium.


Semantic representation of scientific literature: bringing claims, contributions and named entities onto the Linked Open Data cloud


Finding relevant scientific literature is one of the essential tasks researchers are facing on a daily basis. Digital libraries and web information retrieval techniques provide rapid access to a vast amount of scientific literature. However, no further automated support is available that would enable fine-grained access to the knowledge ‘stored’ in these documents. The emerging domain of Semantic Publishing aims at making scientific knowledge accessible to both humans and machines, by adding semantic annotations to content, such as a publication’s contributions, methods, or application domains.

However, despite the promises of better knowledge access, the manual annotation of existing research literature is prohibitively expensive for wide-spread adoption. We argue that a novel combination of three distinct methods can significantly advance this vision in a fully-automated way: (i) Natural Language Processing (NLP) for Rhetorical Entity (RE) detection; (ii) Named Entity (NE) recognition based on the Linked Open Data (LOD) cloud; and (iii) automatic knowledge base construction for both NEs and REs using semantic web ontologies that interconnect entities in documents with the machine-readable LOD cloud.


We present a complete workflow to transform scientific literature into a semantic knowledge base, based on the W3C standards RDF and RDFS. A text mining pipeline, implemented based on the GATE framework, automatically extracts rhetorical entities of type Claims and Contributions from full-text scientific literature. These REs are further enriched with named entities, represented as URIs to the linked open data cloud, by integrating the DBpedia Spotlight tool into our workflow.
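The last step of such a workflow, turning annotations into RDF, can be sketched as follows. This is only an illustration under stated assumptions: the `http://example.org/` namespace, the `mentions` property, the `TYPE_TO_CLASS` mapping and the `to_ntriples` helper are all hypothetical, and the input annotations are hand-written rather than produced by GATE or DBpedia Spotlight. Only the `rdf:type` URI and the N-Triples line shape come from the W3C standards.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EX = "http://example.org/"  # hypothetical namespace used for illustration

# Hypothetical mapping from annotation type to a class URI.
TYPE_TO_CLASS = {
    "Claim": EX + "Claim",
    "Contribution": EX + "Contribution",
}


def to_ntriples(doc_id, annotations):
    """Serialise rhetorical-entity annotations as N-Triples lines."""
    lines = []
    for i, ann in enumerate(annotations):
        subject = f"{EX}{doc_id}/re/{i}"
        # One triple typing the rhetorical entity.
        lines.append(f"<{subject}> <{RDF_TYPE}> <{TYPE_TO_CLASS[ann['type']]}> .")
        # One triple per named entity linked from inside the entity.
        for entity_uri in ann["entities"]:
            lines.append(f"<{subject}> <{EX}mentions> <{entity_uri}> .")
    return lines


annotations = [
    {"type": "Claim", "entities": ["http://dbpedia.org/resource/Semantic_Web"]},
    {"type": "Contribution", "entities": []},
]
for line in to_ntriples("paper1", annotations):
    print(line)
```

Keeping the type-to-class mapping in a lookup table rather than in the code path means the target vocabulary can be swapped without touching the export logic.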

Text mining results are stored in a knowledge base through a flexible export process that provides for a dynamic mapping of semantic annotations to LOD vocabularies through rules stored in the knowledge base. We created a gold standard corpus from computer science conference proceedings and journal articles, where Claim and Contribution sentences are manually annotated with their respective types using LOD URIs. The performance of the RE detection phase is evaluated against this corpus, where it achieves an average F-measure of 0.73. We further demonstrate a number of semantic queries that show how the generated knowledge base can provide support for numerous use cases in managing scientific literature.
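For readers unfamiliar with the metric, the sketch below shows how an F-measure can be computed for one document from a set of gold-standard sentence indices and a set of predicted ones. It assumes the balanced F1 variant and uses invented toy data; the abstract does not state which F-measure variant or averaging scheme lies behind the reported figure.

```python
def precision_recall_f1(gold, predicted):
    """Compute precision, recall and balanced F1 over two sets of item ids."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy data: indices of sentences marked as Claims (invented for illustration).
gold = {1, 4, 7, 9}
predicted = {1, 4, 8}

p, r, f = precision_recall_f1(gold, predicted)
print(f"precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```

An average F-measure over a corpus would then be obtained by combining such per-document or per-type scores, with the exact scheme depending on the evaluation design.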


Improving The Future of Research Communications and e-Scholarship

The dissemination of knowledge derived from research and scholarship has a fundamental impact on the ways in which society develops and progresses, and at the same time it feeds back to improve subsequent research and scholarship.

Here, as in so many other areas of human activity, the internet is changing the way things work; two decades of emergent and increasingly pervasive information technology have demonstrated the potential for far more effective scholarly communication. But the use of this technology remains limited.

Force11 is a community of scholars, librarians, archivists, publishers and research funders that has arisen organically to help facilitate the change toward improved knowledge creation and sharing.

This document highlights the findings of the Force11 workshop on the Future of Research Communication held at Schloss Dagstuhl, Germany, in August 2011: it summarizes a number of key problems facing scholarly publishing today, and presents a vision that addresses these problems, proposing concrete steps that key stakeholders can take to improve the state of scholarly publishing.