Curated Archiving of Research Software Artifacts: Lessons Learned from the French Open Archive (HAL)

Authors : Roberto Di Cosmo, Morane Gruenpeter, Bruno Marmol, Alain Monteil, Laurent Romary, Jozefina Sadowska

Software has become an indissociable support of technical and scientific knowledge. The preservation of this universal body of knowledge is as essential as preserving research articles and data sets.

In the quest to make scientific results reproducible, and pass knowledge to future generations, we must preserve these three main pillars: research articles that describe the results, the data sets used or produced, and the software that embodies the logic of the data transformation.

The collaboration between Software Heritage (SWH), the Center for Direct Scientific Communication (CCSD) and the scientific and technical information services (IES) of the French Institute for Research in Computer Science and Automation (Inria) has resulted in a specified moderation and curation workflow for research software artifacts deposited in HAL, the French global open access repository.

The curation workflow was developed to help digital librarians and archivists handle this new and peculiar artifact – software source code. While implementing the workflow, a set of guidelines has emerged from the challenges and the solutions put in place to help all actors involved in the process.

DOI : https://doi.org/10.2218/ijdc.v15i1.698

Identifiers for Digital Objects: the Case of Software Source Code Preservation

Authors : Roberto Di Cosmo, Morane Gruenpeter, Stefano Zacchiroli

In the very broad scope addressed by digital preservation initiatives, a special place belongs to the scientific and technical artifacts that we need to properly archive to enable scientific reproducibility.

For these artifacts we need identifiers that are not only unique and persistent, but also support integrity in an intrinsic way. They must provide strong guarantees that the object denoted by a given identifier will always be the same, without relying on third parties and external administrative processes.

In this article, we report on our quest for these identifiers for digital objects (IDOs), whose properties are different from, and complementary to, those of the various digital identifiers of objects (DIOs) that are in widespread use today.

We argue that both kinds of identifiers are needed and present the framework for intrinsic persistent identifiers that we have adopted in Software Heritage for preserving billions of software artifacts.
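
As a rough illustration of what "intrinsic" can mean here, the sketch below derives an identifier from nothing but the bytes of a file, so anyone holding the object can recompute and check it without asking a registry. The abstract does not spell out the scheme. The Git-style blob hashing and the "swh:1:cnt:" prefix are assumptions modeled on the publicly documented Software Heritage identifiers for file contents, and the function names intrinsic_id and verify are hypothetical.

```python
import hashlib


def intrinsic_id(content: bytes) -> str:
    """Return an identifier computed only from the object's own bytes."""
    # Assumption: Git-style blob hashing, i.e. SHA-1 over a
    # "blob <length>\0" header followed by the raw content.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    digest = hashlib.sha1(header + content).hexdigest()
    # Assumption: prefix used for file contents in the Software Heritage scheme.
    return "swh:1:cnt:" + digest


def verify(identifier: str, content: bytes) -> bool:
    """Check integrity by recomputing the identifier; no third party is involved."""
    return intrinsic_id(content) == identifier


if __name__ == "__main__":
    source = b"print('hello')\n"
    ident = intrinsic_id(source)
    print(ident)
    print(verify(ident, source))         # unchanged object: True
    print(verify(ident, source + b"#"))  # any modification: False
```

Because the identifier is a function of the content, changing a single byte yields a different identifier. That is the integrity guarantee a registry-assigned identifier cannot offer by itself.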

URL : https://hal.archives-ouvertes.fr/hal-01865790