Revisiting the Data Lifecycle with Big Data Curation

Author : Line Pouchard

As science becomes more data-intensive and collaborative, researchers increasingly use larger and more complex data to answer research questions.

The capacity of storage infrastructure, the increased sophistication and deployment of sensors, the ubiquitous availability of computer clusters, the development of new analysis techniques, and larger collaborations allow researchers to address grand societal challenges in a way that is unprecedented.

In parallel, research data repositories have been built to host research data in response to the requirements of sponsors that research data be publicly available. Libraries are re-inventing themselves to respond to a growing demand to manage, store, curate and preserve the data produced in the course of publicly funded research.

As librarians and data managers are developing the tools and knowledge they need to meet these new expectations, they inevitably encounter conversations around Big Data. This paper explores definitions of Big Data that have coalesced in the last decade around four commonly mentioned characteristics: volume, variety, velocity, and veracity.

We highlight the issues associated with each characteristic, particularly their impact on data management and curation. We use the methodological framework of the data life cycle model, assess two models developed in the context of Big Data projects, and find them lacking.

We propose a Big Data life cycle model that includes activities focused on Big Data and more closely integrates curation with the research life cycle. These activities include planning, acquiring, preparing, analyzing, preserving, and discovering, with describing the data and assuring quality being an integral part of each activity.

We discuss the relationship between institutional data curation repositories and new long-term data resources associated with high performance computing centers, and reproducibility in computational science.

We apply this model by mapping the four characteristics of Big Data outlined above to each of the activities in the model. This mapping produces a set of questions that practitioners should be asking in a Big Data project.

URL : Revisiting the Data Lifecycle with Big Data Curation

Alternative location :

The effects of an editor serving as one of the reviewers during the peer-review process

Authors : Marco Giordan, Attila Csikasz-Nagy, Andrew M. Collings


Publishing in scientific journals is one of the most important ways in which scientists disseminate research to their peers and to the wider public.

Pre-publication peer review underpins this process, but peer review is subject to various criticisms and is under pressure from growth in the number of scientific publications.


Here we examine an element of the editorial process at eLife, in which the Reviewing Editor usually serves as one of the referees, to see what effect this has on decision times, decision type, and the number of citations.

We analysed a dataset of 8,905 research submissions to eLife since June 2012, of which 2,750 were sent for peer review, using R and Python to perform the statistical analysis.


The Reviewing Editor serving as one of the peer reviewers results in faster decision times on average, with the time to final decision ten days faster for accepted submissions (n=1,405) and five days faster for papers that were rejected after peer review (n=1,099).

There was no effect on whether submissions were accepted or rejected, and a very small (but significant) effect on citation rates for published articles where the Reviewing Editor served as one of the peer reviewers.


An important aspect of eLife’s peer-review process is shown to be effective, given that decision times are faster when the Reviewing Editor serves as a reviewer. Other journals hoping to improve decision times could consider adopting a similar approach.
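The kind of comparison described above — median decision times for submissions where the Reviewing Editor did versus did not serve as a reviewer — can be sketched in Python. This is a minimal illustration only: the decision-time values below are invented, and the actual eLife dataset is not reproduced here.

```python
from statistics import median

# Hypothetical decision times (in days) for two groups of submissions:
# those where the Reviewing Editor also served as a reviewer, and those
# where only external reviewers were used. Values are invented for
# illustration and chosen to mirror the ten-day gap reported above.
editor_reviewed = [28, 31, 35, 40, 42, 45, 50]
external_only = [38, 41, 45, 50, 52, 55, 60]

def median_gap(group_a, group_b):
    """Difference in median decision time (days) between two groups."""
    return median(group_b) - median(group_a)

gap = median_gap(editor_reviewed, external_only)
print(f"Median decision time gap: {gap} days")
```

In the study itself, such group comparisons would be accompanied by significance tests and sample sizes, as reported in the abstract.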

URL : The effects of an editor serving as one of the reviewers during the peer-review process


Dealing with Big Data

Authors : Tobias Blanke, Andrew Prescott

This book chapter attempts to counter anxieties in the humanities and social sciences about the role of big data in research by focusing on approaches which, by being firmly grounded in the traditional values of disciplines, enhance existing methods to produce fruitful research.

Big data poses many methodological challenges, but these pressures should prompt scholars to pay much closer attention to methodological issues than they have in the past.


Produire un rapport d’activité : Pourquoi ? Comment ?

Auteur/Author : Thomas Violet

The changing cultural, institutional, economic, and technological environment encourages libraries to produce an annual activity report, both to have a reference document and to promote their activities.

While libraries have long had a tradition of collecting activity data, based on the various ministerial surveys, not all of them produce a standalone summary document.

At the crossroads of evaluation and communication concerns, the varied practices of French libraries shed light on the possibilities and stakes of such a document, its objectives, the difficulties of producing it, and its distribution channels.

URL : Produire un rapport d’activité : Pourquoi ? Comment ?

Alternative location :

Quality Assessment of Studies Published in Open Access and Subscription Journals: Results of a Systematic Evaluation

Authors : Sonja Milovanovic, Jovana Stojanovic, Ljupcho Efremov, Rosarita Amore, Stefania Boccia


Along with the proliferation of Open Access (OA) publishing, the interest for comparing the scientific quality of studies published in OA journals versus subscription journals has also increased.

With our study we aimed to compare the methodological quality and the quality of reporting of primary epidemiological studies and systematic reviews and meta-analyses published in OA and non-OA journals.


In order to identify the studies to appraise, we listed all OA and non-OA journals which published in 2013 at least one primary epidemiologic study (case-control or cohort study design), and at least one systematic review or meta-analysis in the field of oncology.

For the appraisal, we selected the first studies published in 2013 with case-control or cohort study design from OA journals (Group A; n = 12), and in the same time period from non-OA journals (Group B; n = 26); the first systematic reviews and meta-analyses published in 2013 from OA journals (Group C; n = 15), and in the same time period from non-OA journals (Group D; n = 32).

We evaluated the methodological quality of studies by assessing the compliance of case-control and cohort studies with the Newcastle-Ottawa Scale (NOS), and the compliance of systematic reviews and meta-analyses with the Assessment of Multiple Systematic Reviews (AMSTAR) scale.

The quality of reporting was assessed considering the adherence of case-control and cohort studies to STrengthening the Reporting of OBservational studies in Epidemiology (STROBE) checklist, and the adherence of systematic reviews and meta-analyses to Preferred Reporting Items for Systematic reviews and Meta-Analysis (PRISMA) checklist.


Among case-control and cohort studies published in OA and non-OA journals, we did not observe significant differences in the median value of NOS score (Group A: 7 (IQR 7–8) versus Group B: 8 (7–9); p = 0.5) and in the adherence to STROBE checklist (Group A, 75% versus Group B, 80%; p = 0.1).

The results did not change after adjustment for impact factor. The compliance with AMSTAR and the adherence to the PRISMA checklist were comparable between systematic reviews and meta-analyses published in OA and non-OA journals (Group C, 46.0% versus Group D, 55.0%; p = 0.06, and Group C, 72.0% versus Group D, 76.0%; p = 0.1, respectively).


The epidemiological studies published in OA journals in the field of oncology approach the same methodological quality and quality of reporting as studies published in non-OA journals.

URL : Quality Assessment of Studies Published in Open Access and Subscription Journals: Results of a Systematic Evaluation


Sciences de gestion : comment la quête d’excellence freine la libre circulation des savoirs

Auteurs/Authors : Marie-France Lebouc, Anne Chartier

As professors in a business school, we have been engaged for years in a reflective dialogue on our research and publication practices. Management is an increasingly popular field, and competition between schools is intensifying.

Managers and researchers in business schools feel a strategic need for excellence and a good reputation, which, of course, depends on publication. How do current publication constraints weigh on us, the researchers? What knowledge do we now produce, and for whom?

We attempt to answer these questions by examining our own practices. In particular, we show that barriers prevent access to this knowledge. How can these problems be overcome?

Again drawing on our personal understanding of how publication constraints are internalized and institutionalized, we offer a rather pessimistic opinion on the chances that open access will free the circulation of knowledge in management science.


A longitudinal study of independent scholar-published open access journals

Authors : Bo-Christer Björk, Cenyu Shen, Mikael Laakso

Open Access (OA) is increasingly being used as a business model for the publishing of scholarly peer-reviewed journals, both by specialized OA publishing companies and by major, predominantly subscription-based publishers.

However, in the early days of the web OA journals were mainly founded by independent academics, who were dissatisfied with the predominant print and subscription paradigm and wanted to test the opportunities offered by the new medium.

There is still an on-going debate about how OA journals should be operated, and the volunteer model used by many such ‘indie’ journals has been proposed as a viable alternative to the model adopted by big professional publishers where publishing activities are funded by authors paying expensive article processing charges (APCs).

Our longitudinal quantitative study of 250 ‘indie’ OA journals founded prior to 2002 showed that 51% of these journals were still in operation in 2014 and that the median number of articles published per year had risen from 11 to 18 among the survivors.

Of these surviving journals, only 8% had started collecting APCs. A more detailed qualitative case study of five such journals provided insights into how such journals have tried to ensure the continuity and longevity of operations.

URL : A longitudinal study of independent scholar-published open access journals