Revisiting the Data Lifecycle with Big Data Curation

Author : Line Pouchard

As science becomes more data-intensive and collaborative, researchers increasingly use larger and more complex data to answer research questions.

The capacity of storage infrastructure, the increased sophistication and deployment of sensors, the ubiquitous availability of computer clusters, the development of new analysis techniques, and larger collaborations allow researchers to address grand societal challenges in a way that is unprecedented.

In parallel, research data repositories have been built to host research data in response to the requirements of sponsors that research data be publicly available. Libraries are re-inventing themselves to respond to a growing demand to manage, store, curate and preserve the data produced in the course of publicly funded research.

As librarians and data managers are developing the tools and knowledge they need to meet these new expectations, they inevitably encounter conversations around Big Data. This paper explores definitions of Big Data that have coalesced in the last decade around four commonly mentioned characteristics: volume, variety, velocity, and veracity.

We highlight the issues associated with each characteristic, particularly their impact on data management and curation. Using the data life cycle model as a methodological framework, we assess two models developed in the context of Big Data projects and find them lacking.

We propose a Big Data life cycle model that includes activities focused on Big Data and more closely integrates curation with the research life cycle. These activities include planning, acquiring, preparing, analyzing, preserving, and discovering, with data description and quality assurance as integral parts of each activity.

We discuss the relationship between institutional data curation repositories and new long-term data resources associated with high performance computing centers, and reproducibility in computational science.

We apply this model by mapping the four characteristics of Big Data outlined above to each of the activities in the model. This mapping produces a set of questions that practitioners should be asking in a Big Data project.

URL : Revisiting the Data Lifecycle with Big Data Curation

The role of « open » in strategic library planning

Academic libraries are undergoing evolutionary change as emerging technologies and new philosophies about how information is created, distributed, and shared have disrupted traditional operations and services.

Additionally, the population that the academic library serves is increasingly distributed due to distance learning opportunities and new models of teaching and learning.  

This article, the first in this special issue, suggests that in today’s increasingly networked and distributed information environment, the strategic integration of open curation and collection development practices can serve as a useful means for organizing and providing structure to the diverse mass of available digital information, so that individual users of the library have access to coherent contexts for meaningful engagement with that information.

Building on insights from extant research and practice, this article proposes that colleges and universities recognize a more inclusive open access environment, including the integration of resources outside of those owned or created by the institution, and a shift toward policies that consider open access research and open educational resources as part of the library’s formal curatorial workflow and collection building.

At the conclusion of this article, authors Lisa Petrides and Cynthia Jimes offer a commentary on the six remaining articles that make up this special issue on Models of Open Education in Higher Education, discussing the significant role that “open” policy and practice play in shaping teaching, learning, and scholarship in the global context of higher education.

URL : The role of « open » in strategic library planning


The Library as Publishing House

The academic library has taken on the new role of institutional publishing house, using institutional repository (IR) services to enable journal publishing and manage conference planning. Librarians taking on this new role as publisher must know the journal publishing workflow, including online article submission, peer review, publishing, marketing, and assessment.

They must understand international identifiers such as the electronic International Standard Serial Number (eISSN) and Digital Object Identifier (DOI). To manage conference planning functions, librarians need to understand event functions such as presentation submission, program scheduling, registration and third-party payment systems, proceedings publishing, and archiving.

In general, they need to be technologically savvy enough to configure and manage a specialized content management system, the institutional repository.


Big data et bibliothèques : traitement et analyse informatiques des collections numériques

This study sets out to show in what respects libraries’ digital collections raise issues specific to big data, and why text and data mining techniques have now become a necessity if researchers are to make use of the findings of the scientific literature.

This work, which places data mining techniques at the center of its argument as a means of mastering the mass of documents, identifies three distinct issues concerning digital libraries and these algorithmic reading methods. It addresses in turn the steps needed to help researchers use these new reading methods, then the use of data mining techniques on collections to build new kinds of finding aids, and finally the use of mining to assist document processing.

The study concludes with a detailed account of the legal questions currently raised by data mining in relation to intellectual property law.

URL : Big data et bibliothèques : traitement et analyse informatiques des collections numériques

Democratic Potential of New Models of Scholarship and the Crisis of Control

This paper frames the serials crisis as a loss of control over libraries’ collections and development budgets. While libraries have always had to contend with budget constraints, for many the rising cost of serials has become prohibitive, impeding scholarship itself as librarians are forced to cut journal subscriptions.

Open Access (OA) journals hold the potential to partially alleviate the crisis, but a lasting solution might lie in altering expectations of scholars. Our critique of the dissemination of scholarly research looks to both Marxian economic theory and later critical theory, but finds both inadequate for a pragmatic solution to the crisis; instead, we adopt Deweyan democratic theory to argue in favour of public scholarship aided by librarians and vetted by scholarly societies.

URL : Democratic Potential of New Models of Scholarship and the Crisis of Control

Transforming Roles: Canadian Academic Librarians Embedded in Faculty Research Projects

Academic librarians have always played an important role in providing research services and research-skills development to faculty in higher education. But that role is evolving to include the academic librarian as a unique and necessary research partner, practitioner and participant in collaborative, grant-funded research projects.

This article describes how a selected sample of Canadian academic librarians became embedded in faculty research projects and describes their experiences of participating in research teams.

Conducted as a series of semi-structured interviews, this qualitative study illustrates the emerging opportunities and challenges of the librarian-researcher role and how it is transforming the Canadian university library.


Archivage pérenne en bibliothèque universitaire : bilan et perspectives

Given the risks created by technological obsolescence, the long-term archiving of digital content has become unavoidable and is a key issue for higher education and research. Very long-term preservation, however, requires specific skills and techniques and entails both human and financial costs. The university library must therefore ask what role it should play in the long-term archiving of the digital content it stores, disseminates, or produces.

The aim of this thesis is to give an overview of long-term archiving in higher education and research, to present and analyze a set of case reports from university libraries that have completed, are currently implementing, or are considering a long-term archiving project, and to infer from them the difficulties and obstacles that stand in the way of real progress in long-term archiving in university libraries.

The thesis considers the pooling of resources, at various levels, as a response to these difficulties.

URL : Archivage pérenne en bibliothèque universitaire : bilan et perspectives
