Improving the Reproducibility of LaTeX Documents by Enriching Figures with Embedded Scripts and Data

Author : Christian Thomas Jacobs

The introduction of open-access data policies by research councils, the enforcement of best practices, and the deployment of persistent online repositories have made the datasets that support results in scientific papers more widely accessible.

Unfortunately, despite this advancement in the curation/publishing workflow, the data-driven figures within a paper often remain difficult to reproduce. Plotting or analysis scripts rarely accompany the manuscript or any associated software release, and even when they do, it may be unclear exactly which version was used.

Furthermore, the precise commands and parameters used to execute the scripts are often not included in a README file or in the paper itself. This paper introduces a new open-source digital curation tool, Pynea, for improving the reproducibility of LaTeX documents.

Each figure within a document is enriched by automatically embedding the plotting script and data files required to generate it, such that it can be regenerated by readers of the paper in the future.

The command used to execute the plotting script is also added to the figure’s metadata, along with details of the specific version of the script used (if the script is tracked with the Git version control system).
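As a rough illustration of the kind of metadata described above, the following Python sketch records the command used to run a plotting script together with the hash of the last Git commit that touched it. This is not Pynea's actual implementation; the function names `git_revision` and `figure_metadata` and the dictionary keys are hypothetical, chosen only for this example.

```python
import shlex
import subprocess
from pathlib import Path


def git_revision(script_path):
    """Return the hash of the last commit that touched the script, or None.

    None is returned when Git is not installed, the directory does not
    exist or is not a repository, or the script is not tracked.
    """
    script = Path(script_path).resolve()
    try:
        result = subprocess.run(
            ["git", "log", "-n", "1", "--format=%H", "--", script.name],
            cwd=script.parent,
            capture_output=True,
            text=True,
            check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return None
    return result.stdout.strip() or None


def figure_metadata(command, script_path):
    """Collect illustrative metadata for one figure (hypothetical keys)."""
    return {
        "command": shlex.join(command),  # the exact command line, shell-quoted
        "script": Path(script_path).name,
        "script_revision": git_revision(script_path),
    }
```

A call such as `figure_metadata(["python", "plot.py", "data.csv"], "plot.py")` would yield a dictionary that could then be stored alongside the figure, for instance in the figure file's metadata fields.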

When the document is recompiled, any figure that has since changed, or whose plotting script or data files have been modified, is regenerated, so that the author can be confident that the latest version of the figure and its dependencies is included.
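The abstract does not say how changes are detected. One plausible approach, sketched here in Python purely as an assumption, is to compare a checksum of each dependency (figure, script, data files) against a value recorded at the previous compilation. The names `checksum` and `needs_regeneration` are hypothetical.

```python
import hashlib
from pathlib import Path


def checksum(path):
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def needs_regeneration(dependencies, recorded_checksums):
    """Return True if any dependency is new or differs from its recorded checksum.

    `recorded_checksums` maps a path string to the digest stored when the
    figure was last generated (an assumed bookkeeping format).
    """
    for path in dependencies:
        if recorded_checksums.get(str(path)) != checksum(path):
            return True
    return False
```

If `needs_regeneration` returns True, the plotting command would be rerun and the recorded checksums refreshed before the document is compiled.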



An Efficiency Comparison of Document Preparation Systems Used in Academic Research and Development

The choice of an efficient document preparation system is an important decision for any academic researcher. To assist the research community, we report a software usability study in which 40 researchers across different disciplines prepared scholarly texts with either Microsoft Word or LaTeX. The probe texts included simple continuous text, text with tables and subheadings, and complex text with several mathematical equations. We show that LaTeX users were slower than Word users, wrote less text in the same amount of time, and produced more typesetting, orthographical, grammatical, and formatting errors. On most measures, expert LaTeX users performed even worse than novice Word users. LaTeX users, however, more often report enjoying using their respective software. We conclude that even experienced LaTeX users may suffer a loss in productivity when LaTeX is used, relative to other document preparation systems. Individuals, institutions, and journals should carefully consider the ramifications of this finding when choosing document preparation strategies, or requiring them of authors.


DOI : 10.1371/journal.pone.0115069