Publishing computational research — A review of infrastructures for reproducible and transparent scholarly communication

Authors : Markus Konkol, Daniel Nüst, Laura Goulier

Funding agencies increasingly ask applicants to include data and software management plans in their proposals. In addition, the author guidelines of scientific journals and conferences more often include a statement on data availability, and some reviewers reject unreproducible submissions.

This trend towards open science increases the pressure on authors to provide access to the source code and data underlying the computational results in their scientific papers.

Still, publishing reproducible articles is a demanding task and not achieved simply by providing access to code scripts and data files. Consequently, several projects develop solutions to support the publication of executable analyses alongside articles considering the needs of the aforementioned stakeholders.

The key contribution of this paper is a review of applications addressing the issue of publishing executable computational research results. We compare the approaches across properties relevant for the involved stakeholders, e.g., provided features and deployment options, and also critically discuss trends and limitations.

The review can help publishers decide which system to integrate into their submission process, help editors recommend tools to researchers, and help authors of scientific papers adhere to reproducibility principles.

URL : https://arxiv.org/abs/2001.00484

Improving the Reproducibility of LaTeX Documents by Enriching Figures with Embedded Scripts and Data

Author : Christian Thomas Jacobs

The introduction of open-access data policies by research councils, the enforcement of best practices, and the deployment of persistent online repositories have enabled datasets which support results in scientific papers to become more widely accessible.

Unfortunately, despite this advancement in the curation/publishing workflow, the data-driven figures within a paper often remain difficult to reproduce. Plotting or analysis scripts rarely accompany the manuscript or any associated software release; and even if they do, it may be unclear exactly which version was used.

Furthermore, the precise commands and parameters used to execute the scripts are often not included in a README file or in the paper itself. This paper introduces a new open-source digital curation tool, Pynea, for improving the reproducibility of LaTeX documents.

Each figure within a document is enriched by automatically embedding the plotting script and data files required to generate it, such that it can be regenerated by readers of the paper in the future.

The command used to execute the plotting script is also added to the figure’s metadata, along with details of the specific version of the script used (if the script is tracked with the Git version control system).

If the document is to be recompiled with a figure that has since changed, or had its plotting script or data files modified, the figure is regenerated such that the author can be confident that the latest version of the figure and its dependencies are included.
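The regeneration check described above can be illustrated with a short sketch. This is not Pynea's actual implementation: the function names (`file_digest`, `dependency_fingerprint`, `needs_regeneration`) and the choice of SHA-256 fingerprints are assumptions made only to show one way a tool could detect that a plotting script or its data files have changed since a figure was last generated.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import List, Optional, Tuple


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def dependency_fingerprint(script: Path, data_files: List[Path]) -> str:
    """Combine the digests of a plotting script and its data files.

    Hypothetical helper: the data files are sorted so that the
    fingerprint does not depend on the order in which they are listed.
    """
    combined = hashlib.sha256()
    for path in [script, *sorted(data_files)]:
        combined.update(file_digest(path).encode("ascii"))
    return combined.hexdigest()


def needs_regeneration(stored: Optional[str], script: Path,
                       data_files: List[Path]) -> bool:
    """True if no fingerprint was stored or any dependency has changed."""
    return stored != dependency_fingerprint(script, data_files)


def demo() -> Tuple[bool, bool]:
    """Check a figure's dependencies before and after editing the data."""
    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp) / "plot.py"
        data = Path(tmp) / "results.csv"
        script.write_text("print('plot')\n")
        data.write_text("x,y\n1,2\n")

        # Fingerprint recorded when the figure was last generated.
        stored = dependency_fingerprint(script, [data])
        unchanged = needs_regeneration(stored, script, [data])

        # Modify the data file; the figure is now out of date.
        data.write_text("x,y\n1,3\n")
        changed = needs_regeneration(stored, script, [data])
        return unchanged, changed
```

In this sketch the stored fingerprint stands in for the metadata that would travel with the figure; a real tool would also need to record the command used to run the script and, where available, the Git revision.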

DOI : https://doi.org/10.2218/ijdc.v14i1.656

Research Data Management in a Cultural Heritage Organisation

Author : Tom Drysdale

Research is a core function of cultural heritage organisations. Inevitably, the undertaking of research by galleries, libraries, archives and museums (the GLAM sector) leads to the creation of vast quantities of research data.

Yet despite growing recognition that research data must be managed if it is to be exploited effectively, and in spite of increasing understanding of research data management practices and needs, particularly in the higher education sector, knowledge of research data management in cultural heritage organisations remains extremely limited.

This paper represents an attempt to address the limited awareness of research data management in the cultural heritage sector. It presents the results of a data management audit conducted at Historic Royal Palaces (HRP) in 2018.

The study reveals that research data management at HRP is underdeveloped, while highlighting some causes for optimism.

The results of the study are compared to the results of similar studies conducted in UK higher education institutions (HEIs), highlighting the many discrepancies in the ways that research data is managed at HRP and in the HE sector.

Recognition of these differences and similarities, it is argued, is necessary for the development of better research data management practices and tools for the heritage sector.

DOI : https://doi.org/10.2218/ijdc.v14i1.647

Business as Usual with Article Processing Charges in the Transition towards OA Publishing: A Case Study Based on Elsevier

Author : Sergio Copiello

This paper addresses the topic of the article processing charges (APCs) that are paid when publishing articles using the open access (OA) option. Building on the Elsevier OA price list, company balance sheet figures, and ScienceDirect data, tentative answers to three questions are outlined using a Monte Carlo approach to deal with the uncertainty inherent in the inputs.

The first question refers to the level of APCs from the market perspective, under the hypothesis that all the articles published in Elsevier journals exploit the OA model so that the subscription to ScienceDirect becomes worthless.

The second question is how much Elsevier should charge for publishing all the articles under the OA model, assuming the profit margin reduces and adheres to the market benchmark.

The third issue is how many articles would have to be accepted, in an OA-only publishing landscape, so that the publisher benefits from the same revenue and profit margin as in the recent past.

The results point to high APCs, nearly twice the current level, being required to preserve the publisher’s profit margin. If that constraint is relaxed, APCs can be expected to shift downward towards current values. Accordingly, the article acceptance rate would likely grow from 26–27% to about 35–55%.
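As a rough illustration of the kind of Monte Carlo reasoning the abstract mentions, the sketch below samples uncertain inputs and derives the APC needed to cover costs at a given profit margin. It is not the author's model: the function `simulate_apc`, the use of uniform distributions, and the relation revenue = costs / (1 - margin) are simplifying assumptions, and no input values from the paper are reproduced here.

```python
import random
from typing import List, Tuple

Range = Tuple[float, float]


def simulate_apc(cost_range: Range, margin_range: Range,
                 article_range: Range, n_draws: int = 10_000,
                 seed: int = 0) -> List[float]:
    """Sample the per-article charge needed to reach a target margin.

    All three ranges are caller-supplied assumptions: total publishing
    costs, target profit margin (as a fraction of revenue), and the
    number of articles published.
    """
    rng = random.Random(seed)
    apcs = []
    for _ in range(n_draws):
        costs = rng.uniform(*cost_range)
        margin = rng.uniform(*margin_range)
        articles = rng.uniform(*article_range)
        # If profit is a fraction `margin` of revenue, then
        # revenue = costs + margin * revenue, so revenue = costs / (1 - margin).
        revenue = costs / (1.0 - margin)
        apcs.append(revenue / articles)
    return apcs
```

Calling it with toy inputs, for example `simulate_apc((100.0, 200.0), (0.1, 0.2), (10.0, 20.0))`, yields a distribution of per-article charges whose spread reflects the uncertainty in the inputs; summarising that distribution (for instance by its median or quantiles) gives the kind of tentative answer the abstract describes.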

DOI : https://doi.org/10.3390/publications8010003

Les données scientifiques face aux enjeux de la recherche en Sciences, Technologie et Médecine : enquête exploratoire à l’Université de Strasbourg [Research data and the challenges of research in Science, Technology and Medicine: an exploratory survey at the Université de Strasbourg]

Author : Violaine Rebouillat

We study the place of research data in research practices through the analysis of six projects in the field of Science, Technology and Medicine.

The aim is to examine how research strategies influence the management and opening of data. We describe the role played by the quest for peer recognition in basic and applied research.

We show that basic research projects tend to follow a logic in which the publication of articles dictates priorities, whereas applied research projects devote greater attention to data because of the underlying economic stakes.

URL : https://hal-cnam.archives-ouvertes.fr/hal-02321077

Open forensic science

Authors : Jason M Chin, Gianni Ribeiro, Alicia Rairden

The mainstream sciences are experiencing a revolution of methodology. This revolution was inspired, in part, by the realization that a surprising number of findings in the bioscientific literature could not be replicated or reproduced by independent laboratories.

In response, scientific norms and practices are rapidly moving towards openness. These reforms promise many enhancements to the scientific process, notably improved efficiency and reliability of findings. Changes are also underway in the forensic sciences.

After years of legal-scientific criticism and several reports from peak scientific bodies, efforts are underway to establish the validity of several forensic practices and ensure forensic scientists perform and present their work in a scientifically valid way.

In this article, the authors suggest that open science reforms are distinctively suited to addressing the problems faced by forensic science. Openness comports with legal and criminal justice values, helping ensure expert forensic evidence is more reliable and susceptible to rational evaluation by the trier of fact.

In short, open forensic science allows parties in legal proceedings to understand and assess the strength of the case against them, resulting in fairer outcomes. Moreover, several emerging open science initiatives allow for speedier and more collaborative research.

DOI : https://doi.org/10.1093/jlb/lsz009

Copyright and the Progress of Science: Why Text and Data Mining Is Lawful

Author : Michael W. Carroll

This Article argues that U.S. copyright law provides a competitive advantage in the global race for innovation policy because it permits researchers to conduct computational analysis — text and data mining — on any materials to which they have access.

Amendments to copyright law in Japan, and the European Union’s recent addition of limitations on copyright to legalize some TDM research, implicitly acknowledge the competitive benefits provided by the fair use provision of U.S. copyright law.

Focusing only on U.S. law, this Article makes two general contributions to the literature on fair use: (1) in cases involving archiving, the user’s security precautions are relevant under the first fair use factor and should not be treated as an unenumerated factor or as part of the market harm analysis; and (2) good faith should not be a factor in fair use analysis, but even if courts do consider good faith, TDM research conducted on infringing sources, such as Sci-Hub, is still lawful because the research provides transformative benefits without causing harm to the markets that matter.

This Article also revisits the issue of temporary copies to argue that certain steps in TDM research do not make copies that “count” under U.S. law and that it is possible to design cloud-based TDM research that does not implicate U.S. copyright law at all.

This Article addresses the needs of many audiences including policymakers, courts, university counsel, research libraries, and legal scholars who seek a thorough legal analysis to support this argument.

URL : https://lawreview.law.ucdavis.edu/issues/53/2/articles/53-2_carroll.html