Reproducible data citations for computational research

Author : Christian Schulz

The general purpose of a scientific publication is the exchange and spread of knowledge. A publication usually reports a scientific result and tries to convince the reader that it is valid.

With an ever-growing number of papers relying on computational methods that make use of large quantities of data and sophisticated statistical modeling techniques, a textual description of the result is often not enough for a publication to be transparent and reproducible.

While there are efforts to encourage sharing of code and data, we currently lack conventions for linking data sources to a computational result that is stated in the main publication text or used to generate a figure or table.

Thus, here I propose a data citation format that allows for an automatic reproduction of all computations. A data citation consists of a descriptor that refers to the functional program code and the input that generated the result.

The input itself may be a set of other data citations, such that all data transformations, from the original data sources to the final result, are transparently expressed by a directed graph.

Functions can be implemented in a variety of programming languages since data sources are expected to be stored in open and standardized text-based file formats.

A publication is then an online file repository consisting of a Hypertext Markup Language (HTML) document and additional data and code source files, together with a summarization of all data sources, similar to a list of references in a bibliography.

URL : https://arxiv.org/abs/1808.07541

Reflections on the Future of Research Curation and Research Reproducibility

Authors : John Baillieul, Gerry Grenier, Gianluca Setti

In the years since the launch of the World Wide Web in 1993, there have been profoundly transformative changes to the entire concept of publishing—exceeding all the previous combined technical advances of the centuries following the introduction of movable type in medieval Asia around the year 10001 and the subsequent large-scale commercialization of printing several centuries later by J. Gutenberg (circa 1440).

Periodicals in print—from daily newspapers to scholarly journals—are now quickly disappearing, never to return, and while no publishing sector has been unaffected, many scholarly journals are almost unrecognizable in comparison with their counterparts of two decades ago.

To say that digital delivery of the written word is fundamentally different is a huge understatement. Online publishing permits inclusion of multimedia and interactive content that add new dimensions to what had been available in print-only renderings.

As of this writing, the IEEE portfolio of journal titles comprises 59 online only2 (31%) and 132 that are published in both print and online. The migration from print to online is more stark than these numbers indicate because of the 132 periodicals that are both print and online, the print runs are now quite small and continue to decline.

In short, most readers prefer to have their subscriptions fulfilled by digital renderings only.

DOI : https://doi.org/10.1109/JPROC.2018.2816618

Reproducible research and GIScience: an evaluation using AGILE conference papers

Authors : Daniel Nüst​, Carlos Granell, Barbara Hofer, Markus Konkol, Frank O Ostermann, Rusne Sileryte, Valentina Cerutti

The demand for reproducibility of research is on the rise in disciplines concerned with data analysis and computational methods. In this work existing recommendations for reproducible research are reviewed and translated into criteria for assessing reproducibility of articles in the field of geographic information science (GIScience).

Using a sample of GIScience research from the Association of Geographic Information Laboratories in Europe (AGILE) conference series, we assess the current state of reproducibility of publications in this field. Feedback on the assessment was collected by surveying the authors of the sample papers.

The results show the reproducibility levels are low. Although authors support the ideals, the incentives are too small. Therefore we propose concrete actions for individual researchers and the AGILE conference series to improve transparency and reproducibility, such as imparting data and software skills, an award, paper badges, author guidelines for computational research, and Open Access publications.

URL : Reproducible research and GIScience: an evaluation using AGILE conference papers

DOI : https://doi.org/10.7287/peerj.preprints.26561v1