Authors: Kate Wittenberg, Sarah Glasser, Amy Kirchhoff, Sheila Morrissey, Stephanie Orphan
There has been tremendous growth in the amount of digital content created by libraries, publishers, cultural institutions and the general public. While there are great benefits to having content available in digital form, digital objects can be extremely short-lived unless proper attention is paid to preservation.
Reflecting on our experience with the digital preservation service Portico, we provide background on Portico’s history and evolving practice of sustainable preservation of the digital artifacts of scholarly communications.
We also provide an overview of the digital preservation landscape as we see it now, with some thoughts on current requirements for preservation, and thoughts on the opportunities and challenges that lie ahead.
Authors: Helena Francke, Jonas Gamalielsson, Björn Lundell
The study describes the conditions for long-term preservation of the content of the institutional repositories of Swedish higher education institutions. It is based on an investigation of how deposited files are managed with regard to file format and of how representatives of the repositories describe the functions of the repositories.
The findings are based on answers to a questionnaire completed by thirty-four institutional repository representatives (97% response rate).
Questionnaire answers were analysed through descriptive statistics and qualitative coding. The concept of information infrastructures was used to analytically discuss repository work.
Visibility and access to content were considered to be the most important functions of the repositories, but long-term preservation was also considered important for publications and student theses.
Whereas a majority of repositories had some form of guidelines for which file formats were accepted, very few considered whether or not file formats constitute open standards. This can have consequences for the long-term sustainability of, and access to, the content deposited in the repositories.
The study contributes to the discussion about the sustainability of research publications and data in the repositories by pointing to the potential difficulties involved for long-term preservation and access when there is little focus on and awareness of open file formats.
In the German social and economic sciences there is a growing awareness of flexible data distribution and research data reuse, especially as increasing numbers of research funders recommend publishing research data as the basis for scientific insight.
However, a data-sharing mentality has not yet been established in Germany, which is attributable to researchers’ strong reservations about publishing their data.
This attitude is exacerbated by the fact that, at present, there is no trusted national data sharing repository that covers the particular requirements of institutions regarding research data.
This article discusses how this objective can be achieved with the project initiative SowiDataNet.
The development of a community-driven data repository is a logically consistent and important step towards an attitude shift concerning data sharing in the social and economic sciences.
‘Big Science’ – that is, science which involves large collaborations with dedicated facilities, large data volumes and multinational investments – is often seen as different when it comes to data management and preservation planning.
Big Science handles its data differently from other disciplines and has data management problems that are qualitatively different from those of other disciplines. In part, these differences arise from the quantities of data involved, but possibly more importantly from the cultural, organisational and technical distinctiveness of these academic cultures.
Consequently, the data management systems are typically and rationally bespoke, but this means that the planning for data management and preservation (DMP) must also be bespoke.
These differences are such that ‘just read and implement the OAIS specification’ is reasonable DMP advice. However, this bald prescription can and should be usefully supported by a methodological ‘toolkit’ of overviews, case studies and costing models. Such a toolkit provides guidance on developing best practice in DMP policy and infrastructure for these projects, and also considers OAIS validation, audit and cost modelling.
In this paper, we build on previous work with the LIGO collaboration to consider the role of DMP planning within these big science scenarios, and discuss how to apply current best practice.
We discuss the result of the MaRDI-Gross project (Managing Research Data Infrastructures – Big Science), which has been developing a toolkit to provide guidelines on the application of best practice in DMP planning within big science projects.
This is targeted primarily at projects’ engineering managers, but is also intended to help funders collaborate on DMP plans which satisfy the requirements imposed on them.
An activity-based costing model for long-term preservation and dissemination of digital research data: the case of DANS :
« Financial sustainability is an important attribute of a trusted, reliable digital repository. The authors of this paper use the case study approach to develop an activity-based costing (ABC) model. This is used for estimating the costs of preserving digital research data and identifying options for improving and sustaining relevant activities. The model is designed in the environment of the Data Archiving and Networked Services (DANS) institute, a well-known trusted repository. The DANS–ABC model has been tested on empirical cost data from activities performed by 51 employees within the framework of over 40 different national and international projects. Costs of resources are assigned to cost objects through activities and cost drivers. The ‘euros per dataset’ unit of cost measurement is introduced to analyse the outputs of the model. Funders, managers and other decision-making stakeholders are provided with understandable information connected to the strategic goals of the organisation. This is achieved by linking the DANS–ABC model to another widely used managerial tool, the Balanced Scorecard (BSC). The DANS–ABC model supports costing of services provided by a data archive, while the combination of the DANS–ABC model with a BSC identifies areas in the digital preservation process where efficiency improvements are possible. »
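The general idea of activity-based costing described in this abstract (resource costs flow to activities, and from activities to cost objects via cost drivers) can be illustrated with a minimal Python sketch. This is not the DANS–ABC model itself: the activity names, cost figures, driver units and the `cost_per_dataset` helper are all hypothetical and chosen only to make the arithmetic easy to follow.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Activity:
    """One activity in a generic ABC model (hypothetical structure)."""
    name: str
    resource_cost_eur: float  # total resource cost assigned to the activity
    driver_volume: float      # total cost-driver units consumed (e.g. hours)

    @property
    def rate(self) -> float:
        # Cost per cost-driver unit for this activity.
        return self.resource_cost_eur / self.driver_volume


def cost_per_dataset(activities, driver_use, datasets):
    """Assign activity costs to one cost object and express the
    result in euros per dataset.

    driver_use maps an activity name to the number of driver units
    that the cost object consumes from that activity.
    """
    total = sum(a.rate * driver_use.get(a.name, 0.0) for a in activities)
    return total / datasets


# Illustrative figures only; none of these come from the paper.
activities = [
    Activity("ingest", resource_cost_eur=10_000, driver_volume=500),   # staff hours
    Activity("storage", resource_cost_eur=6_000, driver_volume=300),   # TB-months
]
archiving_service_use = {"ingest": 250, "storage": 150}

euros = cost_per_dataset(activities, archiving_service_use, datasets=400)
print(f"{euros:.2f} euros per dataset")
```

With these made-up inputs, each activity costs 20 euros per driver unit, the service consumes 8,000 euros in total, and the result is 20 euros per dataset. A real model would need many more activities and empirically measured driver volumes.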
Step by step installation guide of a digital preservation infrastructure :
« The Ceris-CNR digital preservation infrastructure project was commissioned by Bess (Social Science Electronic Library of Piemonte) for the years 2011-2012 and sponsored by Compagnia di San Paolo of Turin. The role of Ceris-CNR is to handle all post-scan stages of the digitisation; for this purpose it has deployed the software and server platforms of the repository, as well as the web portal for presentation, search and consultation. This report is a step-by-step guide to the process followed to build the digital archive infrastructure. »
Preservation Status Of E-Resources: A Potential Crisis In Electronic Journal Preservation :
« E-journals have replaced the majority of titles formerly produced in paper format. Academic libraries are increasingly dependent on commercially produced, born-digital content that is purchased or licensed. The purpose of this presentation is to share the findings of a 2CUL study that assesses the role of LOCKSS and PORTICO in preserving each institution’s e-journal collections. The 2CUL initiative is a collaboration between Columbia University Library (CUL) and Cornell University Library (CUL) to join forces in providing content, expertise, and services that are impossible to accomplish acting alone.
Although LOCKSS is considered a successful digital preservation initiative, neither of the CULs felt that they fully understood the potential of the system for their own settings and collections. In support of this goal, a joint team was established in November 2010 to investigate various questions to assess how LOCKSS is being deployed and the implications of local practices for both CUL’s preservation frameworks. This study was seen as a high-level investigation to characterize the general landscape and identify further research questions. One of the practical outcomes was a comparative analysis of Portico and LOCKSS preservation coverage for Columbia and Cornell’s serial holdings. A key finding was that only 15-20% of the e-journal titles in the libraries’ collections are currently preserved by these two initiatives. Further analysis suggests the remaining titles fall into roughly 10 categories, with a variety of strategies needed to ensure their preservation. »
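A comparative coverage analysis of the kind mentioned in this abstract can be approximated with simple set operations. The sketch below is only an illustration under stated assumptions: it assumes titles can be matched by a shared identifier such as an ISSN, the `preservation_coverage` function is hypothetical, and the identifiers are placeholders rather than real holdings or real Portico or LOCKSS data.

```python
def preservation_coverage(holdings, portico, lockss):
    """Compare a library's e-journal holdings with the titles
    preserved by two initiatives, matching on a shared identifier."""
    held = set(holdings)
    preserved = held & (set(portico) | set(lockss))
    return {
        "preserved": preserved,
        "unpreserved": held - preserved,
        # Share of held titles preserved by at least one initiative.
        "share": len(preserved) / len(held) if held else 0.0,
    }


# Placeholder identifiers, not real ISSNs or real coverage data.
holdings = ["0000-0001", "0000-0002", "0000-0003", "0000-0004", "0000-0005"]
portico_titles = ["0000-0001", "9999-0001"]
lockss_titles = ["0000-0002"]

report = preservation_coverage(holdings, portico_titles, lockss_titles)
print(f"{report['share']:.0%} of held titles are preserved")
print("Not preserved:", sorted(report["unpreserved"]))
```

In practice, matching is harder than this sketch suggests, because titles may lack identifiers or be preserved only for some volumes.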