Report on Integration of Data and Publications :

“Scholarly communication is the foundation of modern research where empirical evidence is interpreted and communicated as published hypothesis driven research. Many current and recent reports highlight the impact of advancing technology on modern research and consequences this has on scholarly communication. As part of the ODE project this report sought to coalesce current thought and opinions from numerous and diverse sources to reveal opportunities for supporting a more connected and integrated scholarly record. Four perspectives were considered, those of the Researcher who generates or reuses primary data, Publishers who provide the mechanisms to communicate research activities and Libraries & Data Centers who maintain and preserve the evidence that underpins scholarly communication and the published record. This report finds the landscape fragmented and complex where competing interests can sometimes confuse and confound requirements, needs and expectations. Equally the report identifies clear opportunity for all stakeholders to directly enable a more joined up and vital scholarly record of modern research.”

URL : http://www.libereurope.eu/sites/default/files/ODE-ReportOnIntegrationOfDataAndPublication.pdf

Who Shares? Who Doesn’t? Factors Associated with Openly Archiving Raw Research Data :

“Many initiatives encourage investigators to share their raw datasets in hopes of increasing research efficiency and quality. Despite these investments of time and money, we do not have a firm grasp of who openly shares raw research data, who doesn’t, and which initiatives are correlated with high rates of data sharing. In this analysis I use bibliometric methods to identify patterns in the frequency with which investigators openly archive their raw gene expression microarray datasets after study publication.

Automated methods identified 11,603 articles published between 2000 and 2009 that describe the creation of gene expression microarray data. Associated datasets in best-practice repositories were found for 25% of these articles, increasing from less than 5% in 2001 to 30%–35% in 2007–2009. Accounting for sensitivity of the automated methods, approximately 45% of recent gene expression studies made their data publicly available.

First-order factor analysis on 124 diverse bibliometric attributes of the data creation articles revealed 15 factors describing authorship, funding, institution, publication, and domain environments. In multivariate regression, authors were most likely to share data if they had prior experience sharing or reusing data, if their study was published in an open access journal or a journal with a relatively strong data sharing policy, or if the study was funded by a large number of NIH grants. Authors of studies on cancer and human subjects were least likely to make their datasets available.

These results suggest research data sharing levels are still low and increasing only slowly, and data is least available in areas where it could make the biggest impact. Let’s learn from those with high rates of sharing to embrace the full potential of our research output.”

URL : http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0018657

DataStaR: A Data Sharing and Publication Infrastructure to Support Research :

“DataStaR, a Data Staging Repository (http://datastar.mannlib.cornell.edu/) in development at Cornell University’s Albert R. Mann Library (Ithaca, New York USA), is intended to support collaboration and data sharing among researchers during the research process, and to promote publishing or archiving data and high-quality metadata to discipline-specific data centers and/or institutional repositories. Researchers may store and share data with selected colleagues, select a repository for data publication, create high quality metadata in the formats required by external repositories and Cornell’s institutional repository, and obtain help from data librarians with any of these tasks. To facilitate cross-domain interoperability and flexibility in metadata management, we employ semantic web technologies as part of DataStaR’s metadata infrastructure. This paper describes the overall design of the system, the work to date with Cornell researchers and their data sets, and possibilities for extending DataStaR for use in international agriculture research.”

URL : http://journals.sfu.ca/iaald/index.php/aginfo/article/view/199

Beyond the Data Deluge: A Research Agenda for Large-Scale Data Sharing and Reuse :

“There is almost universal agreement that scientific data should be shared for use beyond the purposes for which they were initially collected. Access to data enables system-level science, expands the instruments and products of research to new communities, and advances solutions to complex human problems. While demands for data are not new, the vision of open access to data is increasingly ambitious. The aim is to make data accessible and usable to anyone, anytime, anywhere, and for any purpose. Until recently, scholarly investigations related to data sharing and reuse were sparse. They have become more common as technology and instrumentation have advanced, policies that mandate sharing have been implemented, and research has become more interdisciplinary. Each of these factors has contributed to what is commonly referred to as the “data deluge”. Most discussions about increases in the scale of sharing and reuse have focused on growing amounts of data. There are other issues related to open access to data that also concern scale which have not been as widely discussed: broader participation in data sharing and reuse, increases in the number and types of intermediaries, and more digital data products. The purpose of this paper is to develop a research agenda for scientific data sharing and reuse that considers these three areas.”

URL : http://www.ijdc.net/index.php/ijdc/article/view/163

Open Science in Practice: Researcher Perspectives and Participation :

“We report on an exploratory study consisting of brief case studies in selected disciplines, examining what motivates researchers to work (or want to work) in an open manner with regard to their data, results and protocols, and whether advantages are delivered by working in this way. We review the policy background to open science, and literature on the benefits attributed to open data, considering how these relate to curation and to questions of who participates in science.

The case studies investigate the perceived benefits to researchers, research institutions and funding bodies of utilising open scientific methods, the disincentives and barriers, and the degree to which there is evidence to support these perceptions. Six case study groups were selected in astronomy, bioinformatics, chemistry, epidemiology, language technology and neuroimaging.

The studies identify relevant examples and issues through qualitative analysis of interview transcripts. We provide a typology of degrees of open working across the research lifecycle, and conclude that better support for open working, through guidelines to assist research groups in identifying the value and costs of working more openly, and further research to assess the risks, incentives and shifts in responsibility entailed by opening up the research process are needed.”

URL : http://www.ijdc.net/index.php/ijdc/article/view/173