Identifiers for Digital Objects: the Case of Software Source Code Preservation

Authors : Roberto Di Cosmo, Morane Gruenpeter, Stefano Zacchiroli

In the very broad scope addressed by digital preservation initiatives, a special place belongs to the scientific and technical artifacts that we need to properly archive to enable scientific reproducibility.

For these artifacts we need identifiers that are not only unique and persistent, but also support integrity in an intrinsic way. They must provide strong guarantees that the object denoted by a given identifier will always be the same, without relying on third parties and external administrative processes.

In this article, we report on our quest for these identifiers for digital objects (IDOs), whose properties are different from, and complementary to, those of the various digital identifiers of objects (DIOs) that are in widespread use today.

We argue that both kinds of identifiers are needed and present the framework for intrinsic persistent identifiers that we have adopted in Software Heritage for preserving billions of software artifacts.
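To make the idea of an intrinsic identifier concrete, here is a minimal sketch in which the identifier is computed from the object's bytes, so anyone holding the object can check it without consulting a registry. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify a scheme, the Git-style blob hash is chosen here only as a familiar example, and the `swh:1:cnt:` prefix and the helper names `intrinsic_id` and `verify` are assumptions made for this sketch.

```python
import hashlib


def git_blob_sha1(content: bytes) -> str:
    """Hash content the way Git hashes a blob: SHA-1 over a short header plus the bytes."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


def intrinsic_id(content: bytes) -> str:
    """Build an identifier derived only from the content itself.

    The "swh:1:cnt:" prefix is an assumption for illustration; it is not
    stated in the abstract above.
    """
    return "swh:1:cnt:" + git_blob_sha1(content)


def verify(identifier: str, content: bytes) -> bool:
    """Check integrity by recomputing the identifier; no third party is involved."""
    return intrinsic_id(content) == identifier


source = b"print('hello')\n"
ident = intrinsic_id(source)
print(ident)
print(verify(ident, source))                    # same bytes, same identifier
print(verify(ident, source + b"# tampered\n"))  # any change breaks the match
```

A registry-based identifier such as a DOI, by contrast, can be resolved to a location but cannot be recomputed from the object, which is why the two kinds are described as complementary.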

URL : https://hal.archives-ouvertes.fr/hal-01865790

Challenges and opportunities in the evolving digital preservation landscape: reflections from Portico

Authors : Kate Wittenberg, Sarah Glasser, Amy Kirchhoff, Sheila Morrissey, Stephanie Orphan

There has been tremendous growth in the amount of digital content created by libraries, publishers, cultural institutions and the general public. While there are great benefits to having content available in digital form, digital objects can be extremely short-lived unless proper attention is paid to preservation.

Reflecting on our experience with the digital preservation service Portico, we provide background on Portico’s history and evolving practice of sustainable preservation of the digital artifacts of scholarly communications.

We also provide an overview of the digital preservation landscape as we see it now, with some thoughts on current requirements for preservation, and thoughts on the opportunities and challenges that lie ahead.

DOI : http://doi.org/10.1629/uksg.421

Institutional repositories as infrastructures for long-term preservation

Authors : Helena Francke, Jonas Gamalielsson, Björn Lundell

Introduction

The study describes the conditions for long-term preservation of the content of the institutional repositories of Swedish higher education institutions. It is based on an investigation of how deposited files are managed with regard to file format and of how representatives of the repositories describe the repositories' functions.

Method

The findings are based on answers to a questionnaire completed by thirty-four institutional repository representatives (97% response rate).

Analysis

Questionnaire answers were analysed through descriptive statistics and qualitative coding. The concept of information infrastructures was used to analytically discuss repository work.

Results

Visibility and access to content were considered to be the most important functions of the repositories, but long-term preservation was also considered important for publications and student theses.

Whereas a majority of repositories had some form of guidelines for which file formats were accepted, very few considered whether or not file formats constitute open standards. This can have consequences for the long-term sustainability of, and access to, the content deposited in the repositories.

Conclusion

The study contributes to the discussion about the sustainability of research publications and data in the repositories by pointing to the potential difficulties involved for long-term preservation and access when there is little focus on and awareness of open file formats.

URL : http://www.informationr.net/ir/22-2/paper757.html

Strengthening institutional data management and promoting data sharing in the social and economic sciences

Authors : Monika Linne, Wolfgang Zenk-Möltgen

In the German social and economic sciences there is a growing awareness of flexible data distribution and research data reuse, especially as increasing numbers of research funders recommend publishing research data as the basis for scientific insight.

However, a data-sharing mentality has not yet been established in Germany, which is attributable to researchers’ strong reservations about publishing their data.

This attitude is exacerbated by the fact that, at present, there is no trusted national data sharing repository that covers the particular requirements of institutions regarding research data.

This article discusses how this objective can be achieved with the project initiative SowiDataNet.

The development of a community-driven data repository is a logically consistent and important step towards an attitude shift concerning data sharing in the social and economic sciences.

DOI : http://doi.org/10.18352/lq.10195

Data Management and Preservation Planning for Big Science

‘Big Science’ – that is, science which involves large collaborations with dedicated facilities, large data volumes and multinational investments – is often seen as different when it comes to data management and preservation planning.

Big Science handles its data differently from other disciplines and has data management problems that are qualitatively different from other disciplines. In part, these differences arise from the quantities of data involved, but possibly more importantly from the cultural, organisational and technical distinctiveness of these academic cultures.

Consequently, the data management systems are typically and rationally bespoke, but this means that the planning for data management and preservation (DMP) must also be bespoke.

These differences are such that ‘just read and implement the OAIS specification’ is reasonable Data Management and Preservation (DMP) advice, but this bald prescription can and should be usefully supported by a methodological ‘toolkit’, including overviews, case-studies and costing models to provide guidance on developing best practice in DMP policy and infrastructure for these projects, as well as considering OAIS validation, audit and cost modelling.

In this paper, we build on previous work with the LIGO collaboration to consider the role of DMP planning within these big science scenarios, and discuss how to apply current best practice.

We discuss the result of the MaRDI-Gross project (Managing Research Data Infrastructures – Big Science), which has been developing a toolkit to provide guidelines on the application of best practice in DMP planning within big science projects.

This is targeted primarily at projects’ engineering managers, but is also intended to help funders collaborate on DMP plans which satisfy the requirements imposed on them.

URL : http://www.ijdc.net/index.php/ijdc/article/view/8.1.29

An activity-based costing model for long-term preservation and dissemination of digital research data: the case of DANS :

« Financial sustainability is an important attribute of a trusted, reliable digital repository. The authors of this paper use the case study approach to develop an activity-based costing (ABC) model. This is used for estimating the costs of preserving digital research data and identifying options for improving and sustaining relevant activities. The model is designed in the environment of the Data Archiving and Networked Services (DANS) institute, a well-known trusted repository. The DANS–ABC model has been tested on empirical cost data from activities performed by 51 employees in frames of over 40 different national and international projects. Costs of resources are being assigned to cost objects through activities and cost drivers. The ‘euros per dataset’ unit of costs measurement is introduced to analyse the outputs of the model. Funders, managers and other decision-making stakeholders are being provided with understandable information connected to the strategic goals of the organisation. The latter is being achieved by linking the DANS–ABC model to another widely used managerial tool—the Balanced Scorecard (BSC). The DANS–ABC model supports costing of services provided by a data archive, while the combination of the DANS–ABC with a BSC identifies areas in the digital preservation process where efficiency improvements are possible. »
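The mechanism the abstract describes, in which resource costs flow through activities and cost drivers to cost objects and are reported in euros per dataset, can be sketched as follows. This is a generic activity-based costing illustration, not the DANS–ABC model itself: the activity names, the cost drivers and every figure are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from dataclasses import dataclass


@dataclass
class Activity:
    name: str
    cost_eur: float      # resource cost assigned to this activity (hypothetical)
    driver_total: float  # total volume of the activity's cost driver (hypothetical)


def driver_rate(activity: Activity) -> float:
    """Cost per unit of the activity's cost driver."""
    return activity.cost_eur / activity.driver_total


def cost_per_object(activities: list[Activity], usage: dict[str, float]) -> float:
    """Cost of one cost object: sum over activities of driver rate times driver units used."""
    return sum(driver_rate(a) * usage.get(a.name, 0.0) for a in activities)


# Hypothetical figures, not taken from the paper.
activities = [
    Activity("ingest", cost_eur=10_000.0, driver_total=200.0),    # driver: datasets ingested
    Activity("storage", cost_eur=6_000.0, driver_total=3_000.0),  # driver: gigabytes stored
]

# One dataset that is ingested once and occupies 10 GB.
dataset_usage = {"ingest": 1.0, "storage": 10.0}
print(cost_per_object(activities, dataset_usage))  # euros per dataset: 70.0
```

Here ingest contributes 10 000 / 200 = 50 euros and storage contributes (6 000 / 3 000) × 10 = 20 euros, for 70 euros per dataset.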

URL : http://link.springer.com/article/10.1007/s00799-012-0092-1

Step by step installation guide of a digital preservation infrastructure :

« The Ceris-CNR project of digital preservation infrastructure has been committed by Bess (Social Science Electronic Library of Piemonte) for years 2011-2012 sponsored by Compagnia di San Paolo of Turin. Ceris-CNR role is to handle all the post-scan of the digitalization, for this purpose it has deployed the software and server platforms of the repository and also the web portal for the presentation, research and consulting. This report is a guide of step by step followed to build the digital archive infrastructure. »

URL : http://hdl.handle.net/10760/16911