The way science and research are done is rapidly becoming more open and collaborative. The traditional model of publishing new findings in journals is increasingly outdated and no longer serves the needs of much of science.
Whilst preprints bring the significant benefits of removing delay and editorial selection, they do not go far enough if simply implemented alongside the existing journal system. We propose a new approach, an Open Science Platform, that retains the benefits of preprints but adds formal, invited, and transparent post-publication peer review.
This bypasses the problems of the current journal system and, in doing so, moves the evaluation of research and researchers away from the journal-based Impact Factor and towards a fairer system of article-based qualitative and quantitative indicators.
In the long term, it should be irrelevant where a researcher publishes their findings. What is important is that research is shared and made available without delay within a framework that encourages quality standards and requires all players in the research community to work as collaborators.
“Open access” has become a central theme of journal reform in academic publishing. In this article, I examine the consequences of an important technological loophole whereby publishers can claim to adhere to the principles of open access while releasing articles in proprietary or “locked” formats that cannot be processed by automated tools and in which even simple copying and pasting of text is disabled.
These restrictions will prevent the development of an important infrastructural element of a modern research enterprise, namely, scientific data science, or the use of data analytic techniques to conduct meta-analyses and investigations into the scientific corpus.
I give a brief history of the open access movement, discuss novel journalistic practices, and provide an overview of data-driven investigation of the scientific corpus. I argue that, particularly in an era when the veracity of many research studies has been called into question, scientific data science should be one of the key motivations for open access publishing.
The enormous benefits of unrestricted access to the research literature should prompt scholars from all disciplines to reject publishing models whereby articles are released in proprietary formats or are otherwise restricted from being processed by automated tools as part of a data science pipeline.
HAL, the French national, multidisciplinary open archive, today hosts research data as well as supplementary data in the form of annexes.
In an attempt to define directions for this infrastructure, this dissertation presents a state of the art of the various actors and issues surrounding the topic of research data. It then describes the various services implemented by research data repositories and the challenges they must address.
Finally, it proposes an exploratory study of the supplementary data hosted by HAL, seeking to identify which scientific communities use this service and in what forms.
Authors: Nadine Levin, Sabina Leonelli, Dagmara Weckowska, David Castle, John Dupré
This article documents how biomedical researchers in the United Kingdom understand and enact the idea of “openness.”
This is of particular interest to researchers and science policy worldwide in view of the recent adoption of pioneering policies on Open Science and Open Access by the U.K. government—policies whose impact on and implications for research practice are in need of urgent evaluation, so as to decide on their eventual implementation elsewhere.
This study is based on 22 in-depth interviews with U.K. researchers in systems biology, synthetic biology, and bioinformatics, which were conducted between September 2013 and February 2014.
Through an analysis of the interview transcripts, we identify seven core themes that characterize researchers’ understanding of openness in science and nine factors that shape the practice of openness in research.
Our findings highlight the implications that Open Science policies can have for research processes and outcomes and provide recommendations for enhancing their content, effectiveness, and implementation.
The ability to measure the use and impact of published data sets is key to the success of the open data/open science paradigm. A direct measure of impact would require tracking data (re)use in the wild, which is difficult to achieve.
This is therefore commonly replaced by simpler metrics based on data download and citation counts. In this paper we describe a scenario where it is possible to track the trajectory of a dataset after its publication, and show how this enables the design of accurate models for ascribing credit to data originators.
A Data Trajectory (DT) is a graph that encodes knowledge of how, by whom, and in which context data has been re-used, possibly after several generations. We provide a theoretical model of DTs that is grounded in the W3C PROV data model for provenance, and we show how DTs can be used to automatically propagate a fraction of the credit associated with transitively derived datasets, back to original data contributors.
We also show this model of transitive credit in action by means of a Data Reuse Simulator. Our ultimate hope is that, in the longer term, credit models based on direct measures of data reuse will provide further incentives for data publication.
We conclude by outlining a research agenda to address the hard questions of creating, collecting, and using DTs systematically across a large number of data reuse instances in the wild.
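The transitive-credit idea described above can be sketched in a few lines: treat derivations as edges from each dataset to the dataset it was derived from, and pass an attenuated share of credit back through every generation of ancestors. This is only an illustrative sketch under assumed names and an assumed per-generation decay `fraction`; it is not the paper's PROV-based implementation or its actual credit formula.

```python
# Hypothetical sketch of transitive credit propagation over a
# derivation graph. The decay parameter `fraction` and all dataset
# names are illustrative assumptions, not part of the published model.

def propagate_credit(derived_from, credit, fraction=0.5):
    """Pass `fraction` of each dataset's credit back to its ancestors,
    attenuated once more per generation of derivation."""
    totals = dict(credit)  # each dataset keeps its own credit
    for ds, amount in credit.items():
        parent = derived_from.get(ds)
        share = amount * fraction
        while parent is not None:
            totals[parent] = totals.get(parent, 0.0) + share
            share *= fraction          # attenuate per generation
            parent = derived_from.get(parent)
    return totals

# d0 is the original dataset; d1 derives from d0; d2 derives from d1.
derived_from = {"d1": "d0", "d2": "d1"}
credit = {"d0": 1.0, "d1": 1.0, "d2": 1.0}
print(propagate_credit(derived_from, credit))
# → {'d0': 1.75, 'd1': 1.5, 'd2': 1.0}
```

With a decay of 0.5, the original contributor d0 accumulates credit from both downstream re-uses, which is the incentive effect the abstract argues for. A full Data Trajectory would additionally record how, by whom, and in what context each derivation occurred, grounded in the W3C PROV data model.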
As scientific data volumes, format types, and sources increase rapidly with the invention and improvement of scientific capabilities, the resulting datasets are becoming more complex to manage as well.
One of the significant management challenges is pulling apart the individual contributions of specific people and organizations within large, complex projects.
This is important for two reasons: 1) assigning responsibility and accountability for scientific work, and 2) giving professional credit to the individuals (e.g., for hiring, promotion, and tenure) who work within such large projects.
This paper aims to review the extant practice of data attribution and how it may be improved. Through a case study of creating a detailed attribution record for a climate model dataset, the paper evaluates the strengths and weaknesses of the current data attribution method and proposes an alternative attribution framework accordingly.
The paper concludes by demonstrating that, analogous to acknowledging the different roles and responsibilities shown in movie credits, the methodology developed in the study could be used in general to identify and map out the relationships among the organizations and individuals who had contributed to a dataset.
As a result, the framework could be applied to create data attribution for other dataset types beyond climate model datasets.
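The movie-credits analogy above suggests a simple data structure: an attribution record that groups named contributors under the roles they played, section by section, like film credits. The sketch below is a hypothetical illustration of that idea; the field names, roles, and dataset identifier are assumptions, not the paper's actual schema.

```python
# Illustrative role-based attribution record, following the
# movie-credits analogy. All names and roles here are hypothetical.
from dataclasses import dataclass, field

@dataclass
class Contribution:
    name: str               # person or organization
    role: str               # e.g. "model development", "data curation"
    organization: str = ""

@dataclass
class AttributionRecord:
    dataset_id: str
    contributions: list = field(default_factory=list)

    def credits_by_role(self):
        """Group contributor names by role, like sections of film credits."""
        by_role = {}
        for c in self.contributions:
            by_role.setdefault(c.role, []).append(c.name)
        return by_role

record = AttributionRecord("climate-model-run-42")
record.contributions += [
    Contribution("A. Scientist", "model development", "Met Office"),
    Contribution("B. Curator", "data curation", "BADC"),
]
print(record.credits_by_role())
# → {'model development': ['A. Scientist'], 'data curation': ['B. Curator']}
```

Because the record is just structured data rather than free-text acknowledgements, the same schema could in principle be reused for dataset types beyond climate models, as the abstract proposes.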
The development of e-Research infrastructure has enabled data to be shared and accessed more openly. Policy mandates for data sharing have contributed to the increasing availability of research data through data repositories, which create favourable conditions for the re-use of data for purposes not always anticipated by original collectors.
Despite the current efforts to promote transparency and reproducibility in science, data re-use cannot be assumed, nor merely considered a ‘thrifting’ activity where scientists shop around in data repositories considering only the ease of access to data.
The lack of an integrated view of the individual, social and technological factors influencing intentional and actual data re-use behaviour was the key motivator for this study. Interviews with 13 social scientists produced 25 factors that were found to influence their perceptions and experiences, including both their unsuccessful and successful attempts to re-use data.
These factors were grouped into six theoretical variables: perceived benefits, perceived risks, perceived effort, social influence, facilitating conditions, and perceived re-usability.
These research findings provide an in-depth understanding of the re-use of research data in the context of open science, which can be valuable in terms of theory and practice to help leverage data re-use and make publicly available data more actionable.