Publishing computational research — A review of infrastructures for reproducible and transparent scholarly communication

Authors : Markus Konkol, Daniel Nüst, Laura Goulier

Funding agencies increasingly ask applicants to include data and software management plans in their proposals. In addition, the author guidelines of scientific journals and conferences increasingly include a statement on data availability, and some reviewers reject irreproducible submissions.

This trend towards open science increases the pressure on authors to provide access to the source code and data underlying the computational results in their scientific papers.

Still, publishing reproducible articles is a demanding task and is not achieved simply by providing access to code scripts and data files. Consequently, several projects are developing solutions to support the publication of executable analyses alongside articles, taking into account the needs of the stakeholders involved.

The key contribution of this paper is a review of applications that address the issue of publishing executable computational research results. We compare the approaches across properties relevant to the stakeholders involved, e.g., the features provided and the deployment options, and critically discuss trends and limitations.

The review can help publishers decide which system to integrate into their submission process, editors recommend tools to researchers, and authors of scientific papers adhere to reproducibility principles.

URL : https://arxiv.org/abs/2001.00484

Transparent, Reproducible, and Open Science Practices of Published Literature in Dermatology Journals: Cross-Sectional Analysis

Authors : J Michael Anderson, Andrew Niemann, Austin L Johnson, Courtney Cook, Daniel Tritz, Matt Vassar

Background

Reproducible research is a foundational component for scientific advancements, yet little is known regarding the extent of reproducible research within the dermatology literature.

Objective

This study aimed to determine the quality and transparency of the literature in dermatology journals by evaluating publications for the presence of 8 indicators of reproducible and transparent research practices.

Methods

By implementing a cross-sectional study design, we conducted an advanced search of publications in dermatology journals from the National Library of Medicine catalog. Our search included articles published between January 1, 2014, and December 31, 2018.

After generating a list of eligible dermatology publications, we then searched for full-text PDF versions using Open Access Button, Google Scholar, and PubMed. Publications were analyzed for 8 indicators of reproducibility and transparency (availability of materials, data, analysis scripts, protocol, preregistration, conflict of interest statement, funding statement, and open access) using a pilot-tested Google Form.

Results

After exclusion, 127 studies with empirical data were included in our analysis. Certain indicators were more poorly reported than others. We found that most publications (113, 88.9%) did not provide unmodified, raw data used to make computations, 124 (97.6%) failed to make the complete protocol available, and 126 (99.2%) did not include step-by-step analysis scripts.

Conclusions

Our sample of studies published in dermatology journals does not appear to include sufficient detail for the studies to be accurately and successfully reproduced in their entirety. Solutions to increase the quality, reproducibility, and transparency of dermatology research are warranted.

More robust reporting of key methodological details, open data sharing, and stricter journal standards for the disclosure of study materials might help improve the climate of reproducible research in dermatology.

DOI : https://doi.org/10.2196/16078

Publishers’ Responsibilities in Promoting Data Quality and Reproducibility

Author : Iain Hrynaszkiewicz

Scholarly publishers can help to increase data quality and reproducible research by promoting transparency and openness.

Increasing transparency can be achieved by publishers in six key areas:

(1) understanding researchers’ problems and motivations, by conducting and responding to the findings of surveys;
(2) raising awareness of issues and encouraging behavioural and cultural change, by introducing consistent journal policies on sharing research data, code and materials;
(3) improving the quality and objectivity of the peer-review process by implementing reporting guidelines and checklists and using technology to identify misconduct;
(4) improving scholarly communication infrastructure with journals that publish all scientifically sound research, promoting study registration, partnering with data repositories and providing services that improve data sharing and data curation;
(5) increasing incentives for practising open research with data journals and software journals and implementing data citation and badges for transparency; and
(6) making research communication more open and accessible, with open-access publishing options, permitting text and data mining and sharing publisher data and metadata and through industry and community collaboration.

This chapter describes the practical approaches being taken by publishers in these six areas, their progress and effectiveness, and the implications for researchers publishing their work.

URL : https://link.springer.com/chapter/10.1007%2F164_2019_290

Toward collaborative open data science in metabolomics using Jupyter Notebooks and cloud computing

Authors : Kevin M. Mendez, Leighton Pritchard, Stacey N. Reinke, David I. Broadhurst

Background

A lack of transparency and reporting standards in the scientific community has led to increasing and widespread concerns about the reproducibility and integrity of results. As an omics science that generates vast amounts of data and relies heavily on data science for deriving biological meaning, metabolomics is highly vulnerable to irreproducibility.

The metabolomics community has made substantial efforts to align with FAIR data standards by promoting open data formats, data repositories, online spectral libraries, and metabolite databases.

Open data analysis platforms also exist; however, they tend to be inflexible and rely on the user to adequately report their methods and results. To enable FAIR data science in metabolomics, methods and results need to be transparently disseminated in a manner that is rapid, reusable, and fully integrated with the published work.

To ensure broad use within the community, such a framework also needs to be inclusive and intuitive for computational novices and experts alike.

Aim of Review

To encourage metabolomics researchers from all backgrounds to take control of their own data science, mould it to their personal requirements, and enthusiastically share resources through open science.

Key Scientific Concepts of Review

This tutorial introduces the concept of interactive web-based computational laboratory notebooks. The reader is guided through a set of experiential tutorials specifically targeted at metabolomics researchers, based around the Jupyter Notebook web application, GitHub data repository, and Binder cloud computing platform.
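
As a rough illustration of the notebook-based workflow the tutorial is built around, the sketch below shows the kind of cell a reader might run in a Binder-hosted Jupyter Notebook: loading a metabolomics peak table and projecting the samples onto principal components. The file name and the use of pandas and scikit-learn are assumptions for illustration, not details taken from the tutorial itself.

    import pandas as pd
    from sklearn.decomposition import PCA
    from sklearn.preprocessing import StandardScaler

    # "peak_table.csv" is a placeholder: rows = samples, columns = metabolite intensities
    data = pd.read_csv("peak_table.csv", index_col=0)

    # Standardise each metabolite, then project the samples onto two principal components
    scaled = StandardScaler().fit_transform(data)
    scores = PCA(n_components=2).fit_transform(scaled)

    print(scores[:5])  # first five samples in PCA score space

Binder's role in such a setup is to rebuild the notebook's software environment in the cloud from configuration files committed to the GitHub repository (e.g., a requirements.txt pinning pandas and scikit-learn), so readers can rerun the analysis without installing anything locally.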

DOI : https://doi.org/10.1007/s11306-019-1588-0

Survey on Scientific Shared Resource Rigor and Reproducibility

Authors : Kevin L. Knudtson, Robert H. Carnahan, Rebecca L. Hegstad-Davies, Nancy C. Fisher, Belynda Hicks, Peter A. Lopez, Susan M. Meyn, Sheenah M. Mische, Frances Weis-Garcia, Lisa D. White, Katia Sol-Church

Shared scientific resources, also known as core facilities, support a significant portion of the research conducted at biomolecular research institutions.

The Association of Biomolecular Resource Facilities (ABRF) established the Committee on Core Rigor and Reproducibility (CCoRRe) to further its mission of integrating advanced technologies, education, and communication in the operations of shared scientific resources in support of reproducible research.

To first assess the needs of the scientific shared resource community, the CCoRRe solicited feedback from ABRF members via a survey. The purpose of the survey was to gain information on how U.S. National Institutes of Health (NIH) initiatives on advancing scientific rigor and reproducibility influenced current services and new technology development.

In addition, the survey aimed to identify the challenges and opportunities related to implementation of new reporting requirements and to identify new practices and resources needed to ensure rigorous research.

The results revealed a surprising unfamiliarity with the NIH guidelines. Many of the perceived challenges to the effective implementation of best practices (i.e., those designed to ensure rigor and reproducibility) were similarly noted as challenges to the effective provision of support services in a core setting. Further, most cores routinely use best practices and offer services that support rigor and reproducibility.

These services include access to well-maintained instrumentation and training on experimental design and data analysis as well as data management. Feedback from this survey will enable the ABRF to build better educational resources and share critical best-practice guidelines.

These resources will become important tools for the core community and the researchers it serves, helping to improve rigor and transparency across the range of science and technology.

DOI : https://dx.doi.org/10.7171%2Fjbt.19-3003-001

Open science and modified funding lotteries can impede the natural selection of bad science

Authors : Paul E. Smaldino, Matthew A. Turner, Pablo A. Contreras Kallens

Assessing scientists using exploitable metrics can lead to the degradation of research methods even without any strategic behaviour on the part of individuals, via ‘the natural selection of bad science.’

Institutional incentives to maximize metrics like publication quantity and impact drive this dynamic. Removing these incentives is necessary, but institutional change is slow.

However, recent developments suggest possible solutions with more rapid onsets. These include what we call open science improvements, which can reduce publication bias and improve the efficacy of peer review. In addition, there have been increasing calls for funders to move away from prestige- or innovation-based approaches in favour of lotteries.

Using computational modelling, we investigated whether such changes are likely to improve the reproducibility of science even in the presence of persistent incentives for publication quantity.

We found that modified lotteries, which allocate funding randomly among proposals that pass a threshold for methodological rigour, effectively reduce the rate of false discoveries, particularly when paired with open science improvements that increase the publication of negative results and improve the quality of peer review.
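
The allocation rule at the heart of the modified lottery is simple enough to sketch in a few lines of Python; the rigour threshold, the number of grants, and the proposal fields below are illustrative assumptions, not parameters from the paper's model.

    import random

    def modified_lottery(proposals, rigour_threshold, n_grants, rng=random):
        """Fund a random subset of the proposals that clear the rigour bar."""
        # Only proposals meeting the methodological-rigour threshold enter the draw.
        eligible = [p for p in proposals if p["rigour"] >= rigour_threshold]
        # Winners are then drawn at random, up to the number of available grants.
        return rng.sample(eligible, min(n_grants, len(eligible)))

    # Example: 100 labs with uniformly distributed rigour scores, 10 grants to award.
    proposals = [{"lab": i, "rigour": random.random()} for i in range(100)]
    funded = modified_lottery(proposals, rigour_threshold=0.7, n_grants=10)
    print(sorted(p["lab"] for p in funded))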

In the absence of funding that targets rigour, open science improvements can still reduce false discoveries in the published literature but are less likely to improve the overall culture of research practices that underlie those publications.

DOI : https://doi.org/10.1098/rsos.190194

Replicable Services for Reproducible Research: A Model for Academic Libraries

Authors : Franklin Sayre, Amy Riegelman

Over the past decade, evidence from disciplines ranging from biology to economics has suggested that many scientific studies may not be reproducible. This has led to declarations in both the scientific and lay press that science is experiencing a “reproducibility crisis” and that this crisis has consequences for the extent to which students, faculty, and the public at large can trust research.

Faculty build on these results with their own research, and students and the public use these results for everything from patient care to public policy. To build a model for how academic libraries can support reproducible research, the authors conducted a review of major guidelines from funders, publishers, and professional societies. Specific recommendations were extracted from guidelines and compared with existing academic library services and librarian expertise.

The authors believe this review shows that many of the recommendations for improving reproducibility are core areas of academic librarianship, including data management, scholarly communication, and methodological support for systematic reviews and data-intensive research.

By increasing our knowledge of disciplinary, journal, funder, and society perspectives on reproducibility, and reframing existing librarian expertise and services, academic librarians will be well positioned to be leaders in supporting reproducible research.

DOI : https://doi.org/10.5860/crl.80.2.260