Willingness to Share Research Data Is Related to the Strength of the Evidence and the Quality of Reporting of Statistical Results :
“Background : The widespread reluctance to share published research data is often hypothesized to be due to the authors’ fear that reanalysis may expose errors in their work or may produce conclusions that contradict their own. However, these hypotheses have not previously been studied systematically.
Methods and Findings : We related the reluctance to share research data for reanalysis to 1148 statistically significant results reported in 49 papers published in two major psychology journals. We found the reluctance to share data to be associated with weaker evidence (against the null hypothesis of no effect) and a higher prevalence of apparent errors in the reporting of statistical results. The unwillingness to share data was particularly clear when reporting errors had a bearing on statistical significance.
Conclusions : Our findings on the basis of psychological papers suggest that statistical results are particularly hard to verify when reanalysis is more likely to lead to contrasting conclusions. This highlights the importance of establishing mandatory data archiving policies.”
URL : http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0026828
A Surfboard for Riding the Wave. Towards a four country action programme on research data :
“The Riding the Wave report by the high level expert group on research data called for a collaborative data infrastructure that will enable researchers and other stakeholders from education, society and business to use, re-use and exploit research data to the maximum benefit of science and society. The Knowledge Exchange partners have embraced this vision and commissioned a report that translates Riding the Wave into actions for the four partner countries and beyond.
This paper builds on this report and presents an overview of the present situation with regard to research data in Denmark, Germany, the Netherlands and the United Kingdom and offers broad outlines for a possible action programme for the four countries in realising the envisaged collaborative data infrastructure. An action programme at the level of four countries will require the involvement of all stakeholders from the scientific community.”
URL : http://knowledge-exchange.info/Default.aspx?ID=469
This article outlines the work that the University of Oxford is undertaking to implement a coordinated data management infrastructure. The rationale for the approach being taken by Oxford is presented, with particular attention paid to the role of each service division. This is followed by a consideration of the relative advantages and disadvantages of institutional data repositories, as opposed to national or international data centres. The article then focuses on two ongoing JISC-funded projects, ‘Embedding Institutional Data Curation Services in Research’ (Eidcsr) and ‘Supporting Data Management Infrastructure for the Humanities’ (Sudamih).
Both projects are intra-institutional collaborations and involve working with researchers to develop particular aspects of infrastructure, including: University policy, systems for the preservation and documentation of research data, training and support, software tools for the visualisation of large images, and creating and sharing databases via the Web (Database as a Service).
URL : http://www.ijdc.net/index.php/ijdc/article/view/198
This report compares the legal status of research data in the four KE partner countries. The report also addresses where European copyright and database law poses flaws and obstacles to the access to research data and singles out pre-conditions for openly available data.
URL : http://repository.jisc.ac.uk/id/eprint/6280
Public Availability of Published Research Data in High-Impact Journals :
“Background : There is increasing interest to make primary data from published research publicly available. We aimed to assess the current status of making research data available in highly-cited journals across the scientific literature.
Methods and Results : We reviewed the first 10 original research papers of 2009 published in the 50 original research journals with the highest impact factor. For each journal we documented the policies related to public availability and sharing of data. Of the 50 journals, 44 (88%) had a statement in their instructions to authors related to public availability and sharing of data. However, there was wide variation in journal requirements, ranging from requiring the sharing of all primary data related to the research to just including a statement in the published manuscript that data can be available on request. Of the 500 assessed papers, 149 (30%) were not subject to any data availability policy. Of the remaining 351 papers that were covered by some data availability policy, 208 papers (59%) did not fully adhere to the data availability instructions of the journals they were published in, most commonly (73%) by not publicly depositing microarray data. The other 143 papers that adhered to the data availability instructions did so by publicly depositing only the specific data type as required, making a statement of willingness to share, or actually sharing all the primary data. Overall, only 47 papers (9%) deposited full primary raw data online. None of the 149 papers not subject to data availability policies made their full primary data publicly available.
Conclusion : A substantial proportion of original research papers published in high-impact journals are either not subject to any data availability policies, or do not adhere to the data availability instructions in their respective journals. This empiric evaluation highlights opportunities for improvement.”
URL : http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0024357
Who Shares? Who Doesn’t? Factors Associated with Openly Archiving Raw Research Data :
“Many initiatives encourage investigators to share their raw datasets in hopes of increasing research efficiency and quality. Despite these investments of time and money, we do not have a firm grasp of who openly shares raw research data, who doesn’t, and which initiatives are correlated with high rates of data sharing. In this analysis I use bibliometric methods to identify patterns in the frequency with which investigators openly archive their raw gene expression microarray datasets after study publication.
Automated methods identified 11,603 articles published between 2000 and 2009 that describe the creation of gene expression microarray data. Associated datasets in best-practice repositories were found for 25% of these articles, increasing from less than 5% in 2001 to 30%–35% in 2007–2009. Accounting for sensitivity of the automated methods, approximately 45% of recent gene expression studies made their data publicly available.
First-order factor analysis on 124 diverse bibliometric attributes of the data creation articles revealed 15 factors describing authorship, funding, institution, publication, and domain environments. In multivariate regression, authors were most likely to share data if they had prior experience sharing or reusing data, if their study was published in an open access journal or a journal with a relatively strong data sharing policy, or if the study was funded by a large number of NIH grants. Authors of studies on cancer and human subjects were least likely to make their datasets available.
These results suggest research data sharing levels are still low and increasing only slowly, and data is least available in areas where it could make the biggest impact. Let’s learn from those with high rates of sharing to embrace the full potential of our research output.”
URL : http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0018657
The Dataverse Network®: An Open-Source Application for Sharing, Discovering and Preserving Data :
“The Dataverse Network is an open-source application for publishing, referencing, extracting and analyzing research data. The main goal of the Dataverse Network is to solve the problems of data sharing through building technologies that enable institutions to reduce the burden for researchers and data publishers, and incentivize them to share their data. By installing Dataverse Network software, an institution is able to host multiple individual virtual archives, called “dataverses” for scholars, research groups, or journals, providing a data publication framework that supports author recognition, persistent citation, data discovery and preservation. Dataverses require no hardware or software costs, nor maintenance or backups by the data owner, but still enable all web visibility and credit to devolve to the data owner.”
URL : http://www.dlib.org/dlib/january11/crosas/01crosas.html