Connecting Data Publication to the Research Workflow: A Preliminary Analysis

Authors : Sünje Dallmeier-Tiessen, Varsha Khodiyar, Fiona Murphy, Amy Nurnberger, Lisa Raymond, Angus Whyte

The data curation community has long encouraged researchers to document collected research data during active stages of the research workflow, to provide robust metadata earlier, and to support research data publication and preservation.

Data documentation with robust metadata is one of a number of steps in effective data publication. Data publication is the process of making digital research objects ‘FAIR’, i.e. findable, accessible, interoperable, and reusable; attributes increasingly expected by research communities, funders and society.

Research data publishing workflows are the means to that end. Currently, however, much published research data remains inconsistently and inadequately documented by researchers.

Documentation of data closer in time to data collection would help mitigate the high cost that repositories associate with the ingest process. More effective data publication and sharing should in principle result from early interactions between researchers and their selected data repository.

This paper describes a short study undertaken by members of the Research Data Alliance (RDA) and World Data System (WDS) working group on Publishing Data Workflows. We present a collection of recent examples of data publication workflows that connect data repositories and publishing platforms with research activity ‘upstream’ of the ingest process.

We re-articulate previous recommendations of the working group, to account for the varied upstream service components and platforms that support the flow of contextual and provenance information downstream.

These workflows should be open and loosely coupled to support interoperability, including with preservation and publication environments. Our recommendations aim to stimulate further work on researchers’ views of data publishing and the extent to which available services and infrastructure facilitate the publication of FAIR data.

We also aim to stimulate further dialogue about, and definition of, the roles and responsibilities of research data services and platform providers for the ‘FAIRness’ of research data publication workflows themselves.

DOI : https://doi.org/10.2218/ijdc.v12i1.533

Data Sharing and Cardiology : Platforms and Possibilities

Authors : Pranammya Dey, Joseph S. Ross, Jessica D. Ritchie, Nihar R. Desai, Sanjeev P. Bhavnani, Harlan M. Krumholz

Sharing deidentified patient-level research data presents immense opportunities to all stakeholders involved in cardiology research and practice. Sharing data encourages the use of existing data for knowledge generation to improve practice, while also allowing for validation of disseminated research.

In this review, we discuss key initiatives and platforms that have helped to accelerate progress toward greater sharing of data. These efforts are being prompted by government, universities, philanthropic sponsors of research, major industry players, and collaborations among some of these entities.

As data sharing becomes a more common expectation, policy changes will be required to encourage and assist data generators with the process of sharing the data they create.

Patients also will need access to their own data and to be empowered to share those data with researchers. Although medicine still lags behind other fields in achieving data sharing’s full potential, cardiology research has the potential to lead the way.

URL : http://www.onlinejacc.org/content/70/24/3018

Research Data Management Instruction for Digital Humanities

Author : Willow Dressel

eScience-related library services at Princeton University started in response to the National Science Foundation’s (NSF) data management plan requirements, and grew to encompass a range of services including data management plan consultation, assistance with depositing into a disciplinary or institutional repository, and research data management instruction.

These services were initially directed at science and engineering disciplines on campus, but the eScience Librarian soon realized the relevance of research data management instruction for humanities disciplines with digital approaches.

Applicability to the digital humanities was initially recognized by discovery of related efforts from the history department’s Information Technology (IT) manager in the form of a graduate-student workshop on file and digital-asset management concepts.

Seeing the common ground these activities shared with research data management, a collaboration was formed between the history department’s IT Manager and the eScience Librarian to provide a research data management overview to the entire campus community.

The eScience Librarian was then invited to participate in the history department’s graduate student file and digital asset management workshop to provide an overview of other research data management concepts. Based on the success of the collaboration with the history department IT, the eScience Librarian offered to develop a workshop for the newly formed Center for Digital Humanities at Princeton.

To develop the workshop, background research on digital humanities curation was performed, revealing similarities and differences between digital humanities curation and research data management in the sciences. These similarities and differences, workshop results, and areas of further study are discussed.

DOI : https://doi.org/10.7191/jeslib.2017.1115

Business models for sustainable research data repositories

Author : OECD

There is a large variety of repositories that are responsible for providing long term access to data that is used for research. As data volumes and the demands for more open access to this data increase, these repositories are coming under increasing financial pressures that can undermine their long-term sustainability.

This report explores the income streams, costs, value propositions, and business models for 48 research data repositories. It includes a set of recommendations designed to provide a framework for developing sustainable business models and to assist policy makers and funders in supporting repositories with a balance of policy regulation and incentives.

DOI : http://dx.doi.org/10.1787/302b12bb-en

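Données de la recherche en SHS. Pratiques, représentations et attentes des chercheurs : une enquête à l’Université Rennes 2 [Research data in the humanities and social sciences: practices, perceptions, and expectations of researchers. A survey at Université Rennes 2]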

Authors : Alexandre Serres, Marie-Laure Malingre, Morgane Mignon, Cécile Pierre, Didier Collet

What types of research data are collected, processed, and produced at a university of humanities and social sciences (SHS)? What are the practices of SHS researchers with regard to storing, archiving, disseminating, and sharing their research data?

How do they perceive and define research data, and what is their position on open access? What are their priority needs for managing or sharing research data?

What do they see as the right level for a data policy? To answer all these questions, a two-part survey, statistical and qualitative, was conducted at Université Rennes 2 in spring 2017. The survey was led by the URFIST (Unité Régionale de Formation à l’Information Scientifique et Technique) of Rennes, the Maison des Sciences de l’Homme en Bretagne, and the Service Commun de Documentation Rennes 2, with the support of the university’s governing bodies.

The report and its appendices present all the results, along with a number of proposals for a research data policy.

URL : Données de la recherche en SHS. Pratiques, représentations et attentes des chercheurs : une enquête à l’Université Rennes 2

Alternative location : https://hal.archives-ouvertes.fr/hal-01635186

A review of data sharing statements in observational studies published in the BMJ: A cross-sectional study

Authors : Laura McDonald, Anna Schultze, Alex Simpson, Sophie Graham, Radek Wasiak, Sreeram V. Ramagopalan

In order to understand the current state of data sharing in observational research studies, we reviewed data sharing statements of observational studies published in a general medical journal, the British Medical Journal.

We found that the majority (63%) of observational studies published between 2015 and 2017 included a statement that implied that data used in the study could not be shared. If the findings of our exploratory study are confirmed, room for improvement in the sharing of real-world or observational research data exists.

DOI : http://dx.doi.org/10.12688/f1000research.12673.2

Versioned data: why it is needed and how it can be achieved (easily and cheaply)

Authors : Daniel S. Falster, Richard G. FitzJohn, Matthew W. Pennell, William K. Cornwell

The sharing and re-use of data has become a cornerstone of modern science. Multiple platforms now allow quick and easy data sharing. So far, however, data publishing models have not accommodated on-going scientific improvements in data: for many problems, datasets continue to grow with time — more records are added, errors fixed, and new data structures are created. In other words, datasets, like scientific knowledge, advance with time.

We therefore suggest that many datasets would be usefully published as a series of versions, with a simple naming system to allow users to perceive the type of change between versions. In this article, we argue for adopting the paradigm and processes for versioned data, analogous to software versioning.

We also introduce a system called Versioned Data Delivery and present tools for creating, archiving, and distributing versioned data easily, quickly, and cheaply. These new tools allow for individual research groups to shift from a static model of data curation to a dynamic and versioned model that more naturally matches the scientific process.
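
To make the idea of a "simple naming system" for data versions concrete, here is a minimal Python sketch modeled on software-style major.minor.patch numbering. It does not use or reproduce the authors' Versioned Data Delivery tools. The `DataVersion` type, the `bump` function, and the mapping of change types ("structure", "records", "fix") to version components are all assumptions made for illustration, loosely following the kinds of change the abstract lists (new data structures, added records, fixed errors).

```python
from typing import NamedTuple


class DataVersion(NamedTuple):
    """Hypothetical major.minor.patch label for a published dataset."""
    major: int
    minor: int
    patch: int

    def __str__(self) -> str:
        return f"{self.major}.{self.minor}.{self.patch}"


def bump(version: DataVersion, change: str) -> DataVersion:
    """Return the next version label for a given kind of change.

    The mapping below is an illustrative assumption:
      "structure" -> new data structure, may break existing analyses (major)
      "records"   -> more records added, structure unchanged (minor)
      "fix"       -> errors corrected in existing records (patch)
    """
    if change == "structure":
        return DataVersion(version.major + 1, 0, 0)
    if change == "records":
        return DataVersion(version.major, version.minor + 1, 0)
    if change == "fix":
        return DataVersion(version.major, version.minor, version.patch + 1)
    raise ValueError(f"unknown change type: {change!r}")


if __name__ == "__main__":
    v = DataVersion(1, 2, 3)
    for change in ("fix", "records", "structure"):
        v = bump(v, change)
        print(change, "->", v)
```

Under a scheme like this, a reader comparing two version labels can tell from the first component that differs whether a re-analysis is likely to need code changes, new results, or only a rerun.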

DOI : https://doi.org/10.7287/peerj.preprints.3401v1