Toward Reproducible Computational Research: An Empirical Analysis of Data and Code Policy Adoption by Journals

Journal policy on research data and code availability is an important part of the ongoing shift toward publishing reproducible computational science. This article extends the literature by studying journal data sharing policies by year (for both 2011 and 2012) for a referent set of 170 journals.

We make a further contribution by evaluating code sharing policies, supplemental materials policies, and open access status for these 170 journals for each of 2011 and 2012.

We build a predictive model of open data and code policy adoption as a function of impact factor and publisher and find higher impact journals more likely to have open data and code policies and scientific societies more likely to have open data and code policies than commercial publishers.

We also find open data policies tend to lead open code policies, and we find no relationship between open data and code policies and either supplemental material policies or open access journal status.

Of the journals in this study, 38% had a data policy, 22% had a code policy, and 66% had a supplemental materials policy as of June 2012. This reflects a striking one-year increase of 16% in the number of data policies, a 30% increase in code policies, and a 7% increase in the number of supplemental materials policies.

We introduce a new dataset to the community that categorizes data and code sharing, supplemental materials, and open access policies in 2011 and 2012 for these 170 journals.

URL : http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0067111

Making research data repositories visible: the re3data.org registry :

“Researchers require infrastructures that ensure a maximum of accessibility, stability and reliability to facilitate working with and sharing of research data. Such infrastructures are being increasingly summarized under the term Research Data Repositories (RDR). The project re3data.org – Registry of Research Data Repositories has begun to index research data repositories in 2012 and offers researchers, funding organizations, libraries and publishers an overview of the heterogeneous research data repository landscape. Information icons help researchers to easily identify an adequate repository for the storage and reuse of their data. This article describes the RDR landscape, outlines the practicality of re3data.org as a service, and shows how this service helps to find research data.”

URL : https://peerj.com/preprints/21v1/

The Role of the Library in the Research Enterprise

Libraries have provided services to researchers for many years. Changes in technology and new publishing models provide opportunities for libraries to be more involved in the research enterprise.

Within this article, the author reviews traditional library services, briefly describes the eScience and publishing landscape as it relates to libraries, and explores possible library programs in support of research. Many of the new opportunities require new partnerships, both within the institution and externally.

URL : http://dx.doi.org/10.7191/jeslib.2013.1043

Common Errors in Ecological Data Sharing :

“Objectives: (1) to identify common errors in data organization and metadata completeness that would preclude a “reader” from being able to interpret and re-use the data for a new purpose; and (2) to develop a set of best practices derived from these common errors that would guide researchers in creating more usable data products that could be readily shared, interpreted, and used.
Methods: We used directed qualitative content analysis to assess and categorize data and metadata errors identified by peer reviewers of data papers published in the Ecological Society of America’s (ESA) Ecological Archives. Descriptive statistics provided the relative frequency of the errors identified during the peer review process.
Results: There were seven overarching error categories: Collection & Organization, Assure, Description, Preserve, Discover, Integrate, and Analyze/Visualize. These categories represent errors researchers regularly make at each stage of the Data Life Cycle. Collection & Organization and Description errors were some of the most common errors, both of which occurred in over 90% of the papers.
Conclusions: Publishing data for sharing and reuse is error prone, and each stage of the Data Life Cycle presents opportunities for mistakes. The most common errors occurred when the researcher did not provide adequate metadata to enable others to interpret and potentially re-use the data. Fortunately, there are ways to minimize these mistakes through carefully recording all details about study context, data collection, QA/QC, and analytical procedures from the beginning of a research project and then including this descriptive information in the metadata.”

URL : http://escholarship.umassmed.edu/jeslib/vol2/iss2/1/

Open access to scientific literature and research data: a window of opportunity for Latin America :

“The advance that the international open access movement has had in the last decade may seem to suggest that we are witnessing an important change in the model of scientific communication. This paper introduces the fundamental concepts of this movement, and in turn tries to measure the impact it has had in Latin America based on the development of different strategies.”

URL : http://sedici.unlp.edu.ar/handle/10915/23865

The data paper: a mechanism to incentivize data publishing in biodiversity science

Background

Free and open access to primary biodiversity data is essential for informed decision-making to achieve conservation of biodiversity and sustainable development. However, primary biodiversity data are neither easily accessible nor discoverable.

Among several impediments, one is a lack of incentives to data publishers for publishing of their data resources. One such mechanism currently lacking is recognition through conventional scholarly publication of enriched metadata, which should ensure rapid discovery of ‘fit-for-use’ biodiversity data resources.

Discussion

We review the state of the art of data discovery options and the mechanisms in place for incentivizing data publishers' efforts towards easy, efficient and enhanced publishing, dissemination, sharing and re-use of biodiversity data.

We propose the establishment of the ‘biodiversity data paper’ as one possible mechanism to offer scholarly recognition for efforts and investment by data publishers in authoring rich metadata and publishing them as citable academic papers.

While detailing the benefits to data publishers, we describe the objectives, work flow and outcomes of the pilot project commissioned by the Global Biodiversity Information Facility in collaboration with scholarly publishers and pioneered by Pensoft Publishers through its journals ZooKeys, PhytoKeys, MycoKeys, BioRisk, NeoBiota, Nature Conservation and the forthcoming Biodiversity Data Journal.

We then debate further enhancements of the data paper beyond the pilot project and attempt to forecast the future uptake of data papers as an incentivization mechanism by the stakeholder communities.

Conclusions

We believe that in addition to recognition for those involved in the data publishing enterprise, data papers will also expedite publishing of fit-for-use biodiversity data resources.

However, uptake and establishment of the data paper as a potential mechanism of scholarly recognition requires a high degree of commitment and investment by the cross-sectional stakeholder communities.

URL : http://www.biomedcentral.com/1471-2105/12/S15/S2