IISH Guidelines for preserving research data: a framework for preserving collaborative data collections for future research :
« Our guidelines highlight the iterative process of data collection, data processing, data analysis and publication of (interim) research results. The iterative process is best analyzed and illustrated by following the dynamics of data collection in online collaboratories. The production of data sets in such large scale data collection projects, typically takes a lot of time, whilst in the meantime research may already be performed on data sub-sets. If this leads to a publication a proper citation is required. Publishers and readers need to know exactly in what stage of the data collection process specific conclusions on these data were drawn. During this iterative process, research data need to be maintained, managed and disseminated in different forms and versions during the successive stages of the work carried out, in order to validate the outcomes and research results. These practices drive the requirements for data archiving and show that data archiving is not a once off data transfer transaction or even a linear process. Therefore from the perspective of the research process, we recommend the interconnection and interfacing between data collection and data archiving, in order to ensure the most effective and loss-less preservation of the research data. »
URL : http://www.surffoundation.nl/nl/themas/openonderzoek/cris/Documents/SURFshare_Collectioneren_Guidelines_IISH_DEF.pdf
« To advance the pace of scientific discovery we propose a conceptual format that forms the basis of a truly new way of publishing science. In our proposal, all scientific communication objects (including experimental workflows, direct results, email conversations, and all drafted and published information artifacts) are labeled and stored in a great, big, distributed data store (or many distributed data stores that are all connected).
Each item has a set of metadata attached to it, which includes (at least) the person and time it was created, the type of object it is, and the status of the object including intellectual property rights and ownership. Every researcher can (and must) deposit every knowledge item that is produced in the lab into this repository.
With this deposition goes an essential metadata component that states who has the rights to see, use, distribute, buy or sell this item. Into this grand (and system-wise distributed, cloud-based) architecture, all items produced by a single lab, or several labs, are stored, labeled and connected. »
URL : http://precedings.nature.com/documents/4742/version/1
Research Data: Who will share what, with whom, when, and why? :
« The deluge of scientific research data has excited the general public, as well as the scientific community, with the possibilities for better understanding of scientific problems, from climate to culture. For data to be available, researchers must be willing and able to share them. The policies of governments, funding agencies, journals, and university tenure and promotion committees also influence how, when, and whether research data are shared. Data are complex objects. Their purposes and the methods by which they are produced vary widely across scientific fields, as do the criteria for sharing them. To address these challenges, it is necessary to examine the arguments for sharing data and how those arguments match the motivations and interests of the scientific community and the public. Four arguments are examined: to make the results of publicly funded data available to the public, to enable others to ask new questions of extant data, to advance the state of science, and to reproduce research. Libraries need to consider their role in the face of each of these arguments, and what expertise and systems they require for data curation. »
URL : http://works.bepress.com/borgman/238/
Utopia documents: linking scholarly literature with research data :
« Motivation: In recent years, the gulf between the mass of accumulating-research data and the massive literature describing and analyzing those data has widened. The need for intelligent tools to bridge this gap, to rescue the knowledge being systematically isolated in literature and data silos, is now widely acknowledged.
Results: To this end, we have developed Utopia Documents, a novel PDF reader that semantically integrates visualization and data-analysis tools with published research articles. In a successful pilot with editors of the Biochemical Journal (BJ), the system has been used to transform static document features into objects that can be linked, annotated, visualized and analyzed interactively (http://www.biochemj.org/bj/424/3/). Utopia Documents is now used routinely by BJ editors to mark up article content prior to publication. Recent additions include integration of various text-mining and biodatabase plugins, demonstrating the system’s ability to seamlessly integrate on-line content with PDF articles. »
URL : http://www.bioinformatics.oxfordjournals.org/content/26/18/i568.full
Retooling Libraries for the Data Challenge :
« Eager to prove their relevance among scholars leaving print behind, libraries have participated vocally in the last half-decade’s conversation about digital research data. On the surface, libraries would seem to have much human and technological infrastructure ready-constructed to repurpose for data: digital library platforms and institutional repositories may appear fit for purpose. However, unless libraries understand the salient characteristics of research data, and how they do and do not fit with library processes and infrastructure, they run the risk of embarrassing missteps as they come to grips with the data challenge.
Whether managing research data is ‘the new special collections,’ a new form of regular academic-library collection development, or a brand-new library specialty, the possibilities have excited a great deal of talk, planning, and educational opportunity in a profession seeking to expand its boundaries.
Faced with shrinking budgets and staffs, library administrators may well be tempted to repurpose existing technology infrastructure and staff to address the data curation challenge. Existing digital libraries and institutional repositories seem on the surface to be a natural fit for housing digital research data. Unfortunately, significant mismatches exist between research data and library digital warehouses, as well as the processes and procedures librarians typically use to fill those warehouses. Repurposing warehouses and staff for research data is therefore neither straightforward nor simple; in some cases, it may even prove impossible. »
URL : http://www.ariadne.ac.uk/issue64/salo/