Facilitating and Improving Environmental Research Data Repository Interoperability

Authors : Corinna Gries, Amber Budden, Christine Laney, Margaret O’Brien, Mark Servilla, Wade Sheldon, Kristin Vanderbilt, David Vieglais

Environmental research data repositories provide much-needed services for data preservation and data dissemination to diverse communities with domain-specific or programmatic data needs and standards.

Because they were developed independently, these repositories serve their communities well but rely on different technologies, data models and ontologies. Hence, the effectiveness and efficiency of these services can be vastly improved if repositories work together within a shared community platform that focuses on implementing agreed-upon standards and best practices for the curation and dissemination of data.

Such a community platform drives forward the convergence of technologies and practices that will advance cross-domain interoperability. It will also facilitate contributions from investigators through standardized and streamlined workflows and provide increased visibility for the role of data managers and the curation services provided by data repositories, beyond preservation infrastructure.

Ten specific suggestions for such standardizations are outlined, without ranking them by priority or prescribing a technical implementation. Although the recommendations are for repositories to implement, they have been chosen specifically with the data provider/data curator and synthesis scientist in mind.

URL : Facilitating and Improving Environmental Research Data Repository Interoperability

DOI : http://doi.org/10.5334/dsj-2018-022

How are we Measuring Up? Evaluating Research Data Services in Academic Libraries

Authors : Heather L. Coates, Jake Carlson, Ryan Clement, Margaret Henderson, Lisa R Johnston, Yasmeen Shorish

INTRODUCTION

In the years since the emergence of federal funding agency data management and sharing requirements (http://datasharing.sparcopen.org/data), research data services (RDS) have expanded to dozens of academic libraries in the United States.

As these services have matured, service providers have begun to assess them. Given a lack of practical guidance in the literature, we seek to begin the discussion with several case studies and an exploration of four approaches suitable to assessing these emerging services.

DESCRIPTION OF PROGRAM

This article examines five case studies that vary by staffing, drivers, and institutional context in order to begin a practice-oriented conversation about how to evaluate and assess research data services in academic libraries.

The case studies highlight some commonly discussed challenges, including insufficient training and resources, competing demands for evaluation efforts, and the tension between evidence that can be easily gathered and that which addresses our most important questions.

We explore reflective practice, formative evaluation, developmental evaluation, and evidence-based library and information practice for ideas to advance practice.

NEXT STEPS

Data specialists engaged in providing research data services need strategies and tools with which to make decisions about their services. These decisions range from identifying stakeholder needs, to refining existing services, to determining when to extend services or discontinue declining ones.

While the landscape of research data services is broad and diverse, there are common needs that we can address as a community. To that end, we have created a community-owned space to facilitate the exchange of knowledge and existing resources.

URL : How are we Measuring Up? Evaluating Research Data Services in Academic Libraries

DOI : http://doi.org/10.7710/2162-3309.2226

Vers une culture de la donnée en SHS : Une étude à l’Université de Lille

Author : Joachim Schöpfel

Open science is among the priorities of the French state. Continuing the work undertaken by the French government on the digital transformation and modernization of the state, the second national action plan for 2018-2020, “Pour une action publique transparente et collaborative” (“For transparent and collaborative public action”), states that France “supports the implementation of open government principles in order to strengthen (…) access to research materials and results.”

The national plan for open science, presented in early July 2018, confirmed this ambition. The aim is for data produced by publicly funded research to be progressively structured in accordance with the FAIR principles, preserved and, where possible, opened.

Our study, “Vers une culture de la donnée en SHS” (“Toward a data culture in the social sciences and humanities”), aims to contribute to implementing the open science ecosystem on the ground, on a university campus.

The study was carried out as part of the D4Humanities structuring project, with funding from the MESHS and the Conseil Régional Hauts-de-France, and it follows on from research conducted since 2013 by the GERiiCO laboratory.

Conducted as a series of interviews with 51 researchers, doctoral students, laboratory heads, project leaders and engineers in charge of data, the study pursues three objectives:

  1. Put academic researchers (back) at the heart of implementing the open science ecosystem on campus, with their needs, priorities and questions.
  2. Identify opportunities and obstacles for a data policy.
  3. Recommend ten actions to put in place to develop a data culture on campus.

Conducted as an audit of one particular setting, in the field of the social sciences and humanities (SHS), the study has a pragmatic aim: to identify the elements essential to a coherent policy for producing, managing and reusing research data on a social sciences and humanities campus, and thereby to help the concept of open science take hold through a “cultivation of data that gives meaning to scattered and specialized uses of open data.”

The first part (“Preliminary findings”) draws on two studies (Rennes 2, Lille 3) to better define the concept of research data and its “long tail” character; this part summarizes the practices, motivations and expectations of academic researchers in this area, in SHS.

It also addresses, in general terms, the question of data services and infrastructure. The second part (“Observations”) describes a contrasting landscape based on the interviews conducted in 2017 and 2018 on the SHS campus of the Université de Lille.

Researchers' priority needs are the security of data and systems, and communication within projects. The picture that emerges is a continuum of practices that are more or less effective, formalized and adequate, with governance that is sometimes uncertain, at the project level as well as at the level of organizational structures.

These practices are tied to disciplinary communities but, even more so, to scientific methods, equipment and topics. The third part (“Toward a data culture”) briefly lists ten recommendations that, together, define a reference framework for implementing a data policy on an SHS campus:

  1. Establish scientific steering
  2. Invest in a targeted way
  3. Target projects, not laboratories
  4. Use data management plans as a lever
  5. Respond to security constraints
  6. Respond to communication needs
  7. Respond to curation needs
  8. Offer several solutions for data preservation
  9. Institutionalize the link with the TGIR Huma-Num
  10. Support good practices

URL : Vers une culture de la donnée en SHS : Une étude à l’Université de Lille

Alternative location : https://hal.archives-ouvertes.fr/GERIICO/hal-01846849v1

Health Sciences Libraries Advancing Collaborative Clinical Research Data Management in Universities

Authors : Tania P. Bardyn, Emily F. Patridge, Michael T. Moore, Jane J. Koh

Purpose

Medical libraries need to actively review their service models and explore partnerships with other campus entities to provide better-coordinated clinical research management services to faculty and researchers. TRAIL (Translational Research and Information Lab), a five-partner initiative at the University of Washington (UW), explores how best to leverage existing expertise and space to deliver clinical research data management (CRDM) services and emerging technology support to clinical researchers at UW and collaborating institutions in the Pacific Northwest.

Methods

The initiative offers 14 services and a technology-enhanced innovation lab located in the Health Sciences Library (HSL) to support the University of Washington clinical and research enterprise.

Sharing of staff and resources merges library and non-library workflows, better coordinating data and innovation services to clinical researchers. Librarians have adopted new roles in CRDM, such as providing user support and training for UW’s Research Electronic Data Capture (REDCap) instance.

Results

TRAIL staff are quickly adapting to changing workflows and shared services, including teaching classes on tools used to manage clinical research data. Researcher interest in TRAIL has sparked new collaborative initiatives and service offerings. Marketing and promotion will be important for raising researchers’ awareness of available services.

Conclusions

Medical librarians are developing new skills by supporting and teaching CRDM. Clinical and data librarians better understand the information needs of clinical and translational researchers by being involved in the earlier stages of the research cycle and identifying technologies that can improve healthcare outcomes.

At health sciences libraries, leveraging existing resources and bringing services together is central to how university medical librarians will operate in the future.

DOI : https://doi.org/10.7191/jeslib.2018.1130

Clinical Trial Participants’ Views of the Risks and Benefits of Data Sharing

Authors : Michelle M. Mello, Van Lieou, Steven N. Goodman

Background

Sharing of participant-level clinical trial data has potential benefits, but concerns about potential harms to research participants have led some pharmaceutical sponsors and investigators to urge caution. Little is known about clinical trial participants’ perceptions of the risks of data sharing.

Methods

We conducted a structured survey of 771 current and recent participants from a diverse sample of clinical trials at three academic medical centers in the United States. Surveys were distributed by mail (350 completed surveys) and in clinic waiting rooms (421 completed surveys) (overall response rate, 79%).

Results

Fewer than 8% of respondents felt that the potential negative consequences of data sharing outweighed the benefits. A total of 93% were very or somewhat likely to allow their own data to be shared with university scientists, and 82% were very or somewhat likely to share with scientists in for-profit companies.

Willingness to share data did not vary appreciably with the purpose for which the data would be used, with the exception that fewer participants were willing to share their data for use in litigation.

The respondents’ greatest concerns were that data sharing might make others less willing to enroll in clinical trials (37% very or somewhat concerned), that data would be used for marketing purposes (34%), or that data could be stolen (30%). Less concern was expressed about discrimination (22%) and exploitation of data for profit (20%).

Conclusions

In our study, few clinical trial participants had strong concerns about the risks of data sharing. Provided that adequate security safeguards were in place, most participants were willing to share their data for a wide range of uses. (Funded by the Greenwall Foundation.)

URL : https://www.nejm.org/doi/full/10.1056/NEJMsa1713258

Data management and sharing in neuroimaging: Practices and perceptions of MRI researchers

Authors : John A. Borghi, Ana E. Van Gulick

Neuroimaging methods such as magnetic resonance imaging (MRI) involve complex data collection and analysis protocols, which necessitate the establishment of good research data management (RDM). Despite efforts within the field to address issues related to rigor and reproducibility, information about the RDM-related practices and perceptions of neuroimaging researchers remains largely anecdotal.

To inform such efforts, we conducted an online survey of active MRI researchers that covered a range of RDM-related topics. Survey questions addressed the type(s) of data collected, tools used for data storage, organization, and analysis, and the degree to which practices are defined and standardized within a research group.

Our results demonstrate that neuroimaging data is acquired in multifarious forms and is transformed and analyzed using a wide variety of software tools, and that RDM practices and perceptions vary considerably both within and between research groups, with trainees reporting less consistency than faculty.

Ratings of the maturity of RDM practices from ad-hoc to refined were relatively high during the data collection and analysis phases of a project and significantly lower during the data sharing phase.

Perceptions of emerging practices, including open access publishing and preregistration, were largely positive, but the practices themselves had seen little adoption.

URL : Data management and sharing in neuroimaging: Practices and perceptions of MRI researchers

DOI : https://doi.org/10.1371/journal.pone.0200562

The Modern Research Data Portal: a design pattern for networked, data-intensive science

Authors : Kyle Chard, Eli Dart, Ian Foster, David Shifflett, Steven Tuecke, Jason Williams

We describe best practices for providing convenient, high-speed, secure access to large data via research data portals. We capture these best practices in a new design pattern, the Modern Research Data Portal, that disaggregates the traditional monolithic web-based data portal to achieve orders-of-magnitude increases in data transfer performance, support new deployment architectures that decouple control logic from data storage, and reduce development and operations costs.

We introduce the design pattern; explain how it leverages high-performance data enclaves and cloud-based data management services; review representative examples at research laboratories and universities, including both experimental facilities and supercomputer sites; describe how to leverage Python APIs for authentication, authorization, data transfer, and data sharing; and use coding examples to demonstrate how these APIs can be used to implement a range of research data portal capabilities.

Sample code at a companion web site, https://docs.globus.org/mrdp, provides application skeletons that readers can adapt to realize their own research data portals.
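The separation of control logic from data storage described in this abstract can be illustrated with a minimal sketch. It is not the authors' code and does not use the APIs from the companion site: `TransferService`, `Portal`, `request_download`, and the endpoint names are all hypothetical stand-ins chosen for illustration. The point it tries to show is that the portal only checks permissions and submits a transfer request, while the data itself would move directly between storage endpoints.

```python
import uuid
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class TransferTask:
    """Record of one requested transfer between two storage endpoints."""
    task_id: str
    source_endpoint: str
    dest_endpoint: str
    paths: List[str]
    status: str = "SUBMITTED"


class TransferService:
    """Hypothetical stand-in for an external, managed data-transfer service."""

    def __init__(self) -> None:
        self._tasks: Dict[str, TransferTask] = {}

    def submit(self, source_endpoint: str, dest_endpoint: str,
               paths: List[str]) -> TransferTask:
        # A real service would move the files endpoint to endpoint;
        # this sketch only records the request.
        task = TransferTask(uuid.uuid4().hex, source_endpoint,
                            dest_endpoint, list(paths))
        self._tasks[task.task_id] = task
        return task

    def status(self, task_id: str) -> str:
        return self._tasks[task_id].status


class Portal:
    """Control logic only: catalog lookup, authorization, task submission.

    The portal never reads or streams file contents itself.
    """

    def __init__(self, transfer: TransferService,
                 catalog: Dict[str, dict],
                 acl: Dict[str, Set[str]]) -> None:
        self.transfer = transfer
        self.catalog = catalog  # dataset name -> {"endpoint": ..., "paths": [...]}
        self.acl = acl          # user name -> set of dataset names they may access

    def request_download(self, user: str, dataset: str,
                         dest_endpoint: str) -> TransferTask:
        if dataset not in self.acl.get(user, set()):
            raise PermissionError(f"{user} may not access {dataset}")
        entry = self.catalog[dataset]
        return self.transfer.submit(entry["endpoint"], dest_endpoint,
                                    entry["paths"])


if __name__ == "__main__":
    service = TransferService()
    portal = Portal(
        service,
        catalog={"climate-2018": {"endpoint": "dtn-a",
                                  "paths": ["/data/climate/2018.nc"]}},
        acl={"alice": {"climate-2018"}},
    )
    task = portal.request_download("alice", "climate-2018", "alice-laptop")
    print(task.source_endpoint, "->", task.dest_endpoint, service.status(task.task_id))
```

Because the web application handles only small control messages in this arrangement, it could run on modest infrastructure while the storage endpoints are tuned separately for throughput, which is one reading of the decoupling the abstract describes.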

URL : The Modern Research Data Portal: a design pattern for networked, data-intensive science

DOI : https://doi.org/10.7717/peerj-cs.144