Curating Scientific Information in Knowledge Infrastructures

Authors : Markus Stocker, Pauli Paasonen, Markus Fiebig, Martha A. Zaidan, Alex Hardisty

Interpreting observational data is a fundamental task in the sciences, specifically in earth and environmental science where observational data are increasingly acquired, curated, and published systematically by environmental research infrastructures.

Typically subject to substantial processing, observational data are used by research communities, their research groups and individual scientists, who interpret such primary data for their meaning in the context of research investigations.

The result of interpretation is information (meaningful secondary or derived data) about the observed environment. Research infrastructures and research communities are thus essential to evolving uninterpreted observational data into information. In digital form, the classical bearers of information are the commonly known “(elaborated) data products,” for instance maps.

In such form, meaning is generally implicit, e.g. in map colour coding, and thus largely inaccessible to machines. The systematic acquisition, curation, possible publishing and further processing of information gained in observational data interpretation, as machine-readable data together with their machine-readable meaning, is not common practice among environmental research infrastructures.

For a use case in aerosol science, we elucidate these problems and present a Jupyter-based prototype infrastructure that exploits a machine learning approach to interpretation and could support a research community in interpreting observational data and, more importantly, in curating and further using the resulting information about a studied natural phenomenon.

DOI : http://doi.org/10.5334/dsj-2018-021

Facilitating and Improving Environmental Research Data Repository Interoperability

Authors : Corinna Gries, Amber Budden, Christine Laney, Margaret O’Brien, Mark Servilla, Wade Sheldon, Kristin Vanderbilt, David Vieglais

Environmental research data repositories provide much needed services for data preservation and data dissemination to diverse communities with domain specific or programmatic data needs and standards.

Because they were developed independently, these repositories serve their communities well but rely on different technologies, data models and ontologies. Hence, the effectiveness and efficiency of these services can be vastly improved if repositories work together, adhering to a shared community platform that focuses on the implementation of agreed-upon standards and best practices for curation and dissemination of data.

Such a community platform drives forward the convergence of technologies and practices that will advance cross-domain interoperability. It will also facilitate contributions from investigators through standardized and streamlined workflows and provide increased visibility for the role of data managers and the curation services provided by data repositories, beyond preservation infrastructure.

Ten specific suggestions for such standardizations are outlined without any suggestions for priority or technical implementation. Although the recommendations are for repositories to implement, they have been chosen specifically with the data provider/data curator and synthesis scientist in mind.

DOI : http://doi.org/10.5334/dsj-2018-022

APCs – Mirroring the impact factor or legacy of the subscription-based model?

Author : Nina Schönfelder

With the ongoing open-access transformation, article processing charges (APCs) are gaining importance as the dominant business model for scientific open-access journals. This paper analyzes which factors determine the level of an APC by means of multivariate linear regression.

With data from OpenAPC, APCs actually paid are explained by the following variables: (1) the “source normalized impact per paper” (SNIP), (2) whether the journal is open access or hybrid, (3) the publisher of the journal, (4) the subject area of the journal, and (5) the year.

The results show that the journal’s impact and the hybrid status are the most important factors for the level of APCs. However, the relationship between APC and SNIP is different for open-access journals and hybrid journals.

The journal’s impact is crucial for the level of APCs in open-access journals, whereas it has little effect on APCs for publications in hybrid journals. This paper contributes to the emerging literature initiated by the “Pay It Forward” study conducted at the University of California Libraries.

It lays the foundations for assessing whether the large-scale open-access transformation of scientific journals is financially viable for research institutions in general and universities in particular.
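The finding that the APC-SNIP relationship differs between open-access and hybrid journals corresponds, in regression terms, to an interaction between SNIP and hybrid status. The following is a minimal sketch of that idea only, not the paper's actual model: the data are synthetic and made up for illustration (they are not OpenAPC values), the function name `fit_apc_model` is hypothetical, and the publisher, subject area and year variables named in the abstract are omitted for brevity.

```python
import numpy as np

def fit_apc_model(snip, hybrid, apc):
    """Ordinary least squares fit of: apc ~ 1 + snip + hybrid + snip*hybrid."""
    snip = np.asarray(snip, dtype=float)
    hybrid = np.asarray(hybrid, dtype=float)
    apc = np.asarray(apc, dtype=float)
    # Design matrix: intercept, SNIP, hybrid dummy, and their interaction.
    X = np.column_stack([np.ones_like(snip), snip, hybrid, snip * hybrid])
    coef, *_ = np.linalg.lstsq(X, apc, rcond=None)
    names = ["intercept", "snip", "hybrid", "snip_x_hybrid"]
    return dict(zip(names, coef))

# Synthetic data (assumption): APC rises steeply with SNIP for open-access
# journals and only weakly for hybrid journals.
rng = np.random.default_rng(0)
snip = rng.uniform(0.5, 3.0, size=200)
hybrid = rng.integers(0, 2, size=200)
noise = rng.normal(0.0, 50.0, size=200)
apc = 1000 + 500 * snip + 1200 * hybrid - 450 * snip * hybrid + noise

coefs = fit_apc_model(snip, hybrid, apc)
slope_oa = coefs["snip"]                               # SNIP slope, open access
slope_hybrid = coefs["snip"] + coefs["snip_x_hybrid"]  # SNIP slope, hybrid
print(f"SNIP slope (open access): {slope_oa:.1f}")
print(f"SNIP slope (hybrid):      {slope_hybrid:.1f}")
```

In such a model, a negative interaction coefficient means the SNIP slope is flatter for hybrid journals, which is the pattern the abstract describes.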

DOI : http://doi.org/10.4119/unibi/2931061

Research cafés: how libraries can build communities through research and engagement

Author : Katherine Stephan

The Research Support Team at Liverpool John Moores University (LJMU) runs events called research cafés throughout the academic year.

During these cafés, we bring together PhD students, early career researchers and more established academics over lunch to give them an opportunity to talk about their work to a lay audience of their peers and the public.

Since its inception in 2013 we have maintained the overall format of the research café, based as it is on promoting interdisciplinary dialogue in an informal setting, while also making a few small but significant changes.

These changes have in turn increased the visibility and reach of research promotion within the Library. Against that backdrop, this article – which is based on a lightning talk and poster session presented at the 41st UKSG Annual Conference, Glasgow, in April 2018 – will outline why the library is ideally placed to facilitate this type of scholarship sharing and why research and community engagement should be viewed as an integral part of a university library’s agenda.

It will also discuss how its success has allowed our Team to work in partnership with colleagues from across the University in new and exciting ways. Finally, it will address what further developments we can make to continue to improve and help the research community at LJMU and beyond.

DOI : http://doi.org/10.1629/uksg.436

What increases (social) media attention: Research impact, author prominence or title attractiveness?

Authors : Olga Zagovora, Katrin Weller, Milan Janosov, Claudia Wagner, Isabella Peters

Do only major scientific breakthroughs hit the news and social media, or does a ‘catchy’ title help to attract public attention? How strong is the connection between the importance of a scientific paper and the (social) media attention it receives?

In this study we investigate these questions by analysing the relationship between the observed attention and certain characteristics of scientific papers from two major multidisciplinary journals: Nature Communications (NC) and Proceedings of the National Academy of Sciences (PNAS).

We describe papers by features based on the linguistic properties of their titles and centrality measures of their authors in their co-authorship network.

We identify linguistic features and collaboration patterns that might be indicators of future attention and are characteristic of different journals, research disciplines, and media sources.

URL : https://arxiv.org/abs/1809.06299

Do all citations value the same? Valuing citations by the value of the citing items

Authors : Cristiano Giuffrida, Giovanni Abramo, Ciriaco Andrea D’Angelo

Bibliometricians have long relied on citation counts to measure the impact of publications on the advancement of science. However, since the earliest days of the field, some scholars have questioned whether all citations should be valued the same, and have gone on to weight them by a variety of factors.

However sophisticated the operationalization of the measures, the methodologies used in weighting citations still have limitations in their underlying assumptions. This work takes an alternative approach to resolving the underlying problem: the proposal is to value citations by the impact of the citing articles.

As well as conceptualizing a new indicator of impact, the work illustrates its application to the 2004-2012 Italian scientific production indexed in the WoS.

The new indicator appears highly correlated with traditional field-normalized citations; however, the shifts observed between the two measures are frequent and the number of outliers is not at all negligible. Moreover, the new indicator seems to show greater “sensitivity” when used to identify the top-cited papers.
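The core proposal, valuing each citation by the impact of the citing article, can be illustrated with a toy calculation. This is a sketch of one possible reading and not the authors' indicator: the function name `citing_weighted_value`, the choice to simply sum the citing articles' impact scores, and all numbers are assumptions made for illustration.

```python
from typing import Dict, List

def citing_weighted_value(
    cited_by: Dict[str, List[str]],
    impact: Dict[str, float],
) -> Dict[str, float]:
    """For each paper, sum the impact scores of the articles that cite it.

    `impact` is assumed to hold some pre-computed impact score per citing
    article (for example a field-normalized citation score); citing
    articles without a score contribute 0.
    """
    return {
        paper: sum(impact.get(citer, 0.0) for citer in citers)
        for paper, citers in cited_by.items()
    }

# Toy data (made up): paper B has more citations than paper A,
# but A is cited by articles with higher impact.
cited_by = {"A": ["C", "D"], "B": ["E", "F", "G"]}
impact = {"C": 2.5, "D": 1.5, "E": 0.5, "F": 0.25, "G": 0.25}

raw_counts = {paper: len(citers) for paper, citers in cited_by.items()}
values = citing_weighted_value(cited_by, impact)
print(raw_counts)
print(values)
```

In this toy case the ranking by raw counts and the ranking by weighted value disagree, which is the kind of shift between measures the abstract refers to.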

URL : https://arxiv.org/abs/1809.06088

Nanopublications: A Growing Resource of Provenance-Centric Scientific Linked Data

Authors : Tobias Kuhn, Albert Meroño-Peñuela, Alexander Malic, Jorrit H. Poelen, Allen H. Hurlbert, Emilio Centeno Ortiz, Laura I. Furlong, Núria Queralt-Rosinach, Christine Chichester, Juan M. Banda, Egon Willighagen, Friederike Ehrhart, Chris Evelo, Tareq B. Malas, Michel Dumontier

Nanopublications are a Linked Data format for scholarly data publishing that has received considerable uptake in the last few years. In contrast to the common Linked Data publishing practice, nanopublications work at the granular level of atomic information snippets and provide a consistent container format to attach provenance and metadata at this atomic level.

While the nanopublications format is domain-independent, the datasets that have become available in this format are mostly from Life Science domains, including data about diseases, genes, proteins, drugs, biological pathways, and biotic interactions.

More than 10 million such nanopublications have been published, which now form a valuable resource for studies on the domain level of the given Life Science domains as well as on the more technical levels of provenance modeling and heterogeneous Linked Data.

We provide here an overview of this combined nanopublication dataset, show the results of some overarching analyses, and describe how it can be accessed and queried.
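The container idea described above, an atomic assertion with provenance and metadata attached at the same level, can be sketched as a small data structure. This is an illustration only: it is not the nanopublication serialization or any library's API, the class and field names are hypothetical, and the example.org identifiers are placeholders. The split into assertion, provenance and publication information reflects the commonly described layout of a nanopublication.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# A triple is (subject, predicate, object), all given as plain strings here.
Triple = Tuple[str, str, str]

@dataclass
class Nanopublication:
    """Toy container: one atomic assertion plus its provenance and metadata."""
    uri: str
    assertion: List[Triple]
    provenance: List[Triple] = field(default_factory=list)
    pubinfo: List[Triple] = field(default_factory=list)

    def is_complete(self) -> bool:
        # Treat a nanopublication as complete only if all three parts are present.
        return bool(self.assertion and self.provenance and self.pubinfo)

# Placeholder identifiers (assumptions), not real published data.
np1 = Nanopublication(
    uri="http://example.org/np1",
    assertion=[
        ("http://example.org/gene/X", "http://example.org/associatedWith",
         "http://example.org/disease/Y"),
    ],
    provenance=[
        ("http://example.org/np1#assertion", "prov:wasDerivedFrom",
         "http://example.org/dataset/Z"),
    ],
    pubinfo=[
        ("http://example.org/np1", "dcterms:created", "2018-09-18"),
    ],
)
print(np1.is_complete())
```

Keeping provenance next to each small assertion, rather than only at dataset level, is what allows the provenance-centric analyses the abstract mentions.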

URL : https://arxiv.org/abs/1809.06532