Integrating Data Science Tools into a Graduate Level Data Management Course

Authors: Pete E. Pascuzzi, Megan R. Sapp Nelson

Objective

This paper describes a project to revise an existing research data management (RDM) course to include instruction in computer skills with robust data science tools.

Setting

A Carnegie R1 university.

Brief Description

Graduate student researchers need training in the basic concepts of RDM. However, they generally lack experience with the robust data science tools needed to implement these concepts holistically. Two library instructors fundamentally redesigned an existing RDM course to include instruction with such tools.

The course was divided into lecture and lab sections to accommodate the increased instructional load. Learning objectives and assessments were designed at a higher order so that students could demonstrate that they not only understood course concepts but could also use their computer skills to implement them.

Results

Twelve students completed the first iteration of the course. Feedback from these students was very positive, and they appreciated the combination of theoretical concepts, computer skills and hands-on activities. Based on student feedback, future iterations of the course will include more “flipped” content including video lectures and interactive computer tutorials to maximize active learning time in both lecture and lab.

The substance of this article is based upon poster presentations at RDAP Summit 2018.

DOI : https://doi.org/10.7191/jeslib.2018.1152

Toward a Better Data Management Plan: The Impact of DMPs on Grant Funded Research Practices

Author : Sara Mannheimer

Data Management Plans (DMPs) are often required for grant applications. But do strong DMPs lead to better data management and sharing practices? Several recent research projects in the Library and Information Science field have investigated data management planning and practice through DMP content analysis and data-management-related interviews.

However, research hasn’t yet shown how DMPs ultimately affect data management and data sharing practices during grant-funded research. The research described in this article contributes to the existing literature by examining the impact of DMPs on grant awards and on Principal Investigators’ (PIs) data management and sharing practices.

The results of this research suggest the following key takeaways:

(1) Most PIs practice internal data management in order to prevent data loss, to facilitate sharing within the research team, and to seamlessly continue their research during personnel turnover;

(2) PIs still have room to grow in understanding specialized concepts such as metadata and policies for use and reuse;

(3) PIs may need guidance on practices that facilitate FAIR data, such as using metadata standards, assigning licenses to their data, and publishing in data repositories.

Ultimately, the results of this research can inform academic library services and support stronger, more actionable DMPs. The substance of this article is based upon a lightning talk presentation at RDAP Summit 2018.

DOI : https://doi.org/10.7191/jeslib.2018.1155

Text data mining and data quality management for research information systems in the context of open data and open science

Authors : Otmane Azeroual, Gunter Saake, Mohammad Abuosba, Joachim Schöpfel

In the implementation and use of research information systems (RIS) in scientific institutions, text data mining and semantic technologies are key technologies for the meaningful use of large amounts of data.

The difficulty lies not in collecting data but in processing and integrating it in the RIS. Data is usually not uniformly formatted or structured; texts and tables, for example, often cannot be linked.

The data comes from various source systems with different data formats, such as project and publication databases and the CERIF and RCD data models. Internal and external data sources also continue to develop.

On the one hand, these sources must be constantly synchronized and the results of the data links checked. On the other hand, the texts must undergo natural language processing so that specific information can be extracted.

Text data mining is used to analyze the quality of the metadata and to identify entities and general keywords, so that the user is supported in the search for interesting research information.

The information age makes it easier to store huge amounts of data, and the growth in the number of documents on the internet, in institutions’ intranets, in newswires and in blogs is overwhelming.

Search engines should help to open up these sources of information and make them usable for administrative and research purposes. Against this backdrop, the aim of this paper is to provide an overview of text data mining techniques and of successful data quality management for RIS in the context of open data and open science in scientific institutions and libraries, as well as to provide ideas for their application. In particular, solutions for the RIS will be presented.
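As a rough illustration of the two ideas in this abstract (checking metadata quality and extracting general keywords from text), the following minimal Python sketch may help. It is not the authors' method: the required-field list, the stopword list, the sample record and both function names are assumptions made up for this example, and simple term frequency stands in for the text data mining techniques the paper surveys.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (assumption for illustration).
STOPWORDS = {"the", "of", "and", "in", "a", "for", "to", "is", "on", "with"}

# Hypothetical set of fields a RIS record is expected to carry (assumption).
REQUIRED_FIELDS = ("title", "authors", "year")


def missing_fields(record):
    """Return the required metadata fields that are absent or empty."""
    return [f for f in REQUIRED_FIELDS if not str(record.get(f) or "").strip()]


def extract_keywords(text, top_n=3):
    """Return the top_n most frequent non-stopword terms in the text."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(t for t in tokens if t not in STOPWORDS and len(t) > 2)
    return [word for word, _ in counts.most_common(top_n)]


# Made-up sample record, not taken from any real system.
record = {
    "title": "Data quality in research information systems",
    "authors": "",
    "abstract": "Data quality matters. Poor data quality limits "
                "research information systems.",
}

print(missing_fields(record))
print(extract_keywords(record["abstract"], top_n=2))
```

A real RIS would more likely rely on a named-entity recognizer and a controlled vocabulary than on raw term counts; the sketch only shows where a completeness check and a keyword step could sit relative to each other.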

URL : https://arxiv.org/abs/1812.04298

From Open Access to Open Data: collaborative work in the university libraries of Catalonia

Authors: Mireia Alcalá Ponce de León, Lluís Anglada i de Ferrer

In recent years, the scientific community and funding bodies have paid attention to the data collected, generated or used throughout different research activities. The dissemination of these data has become one of the constituent elements of Open Science.

For this reason, many funders are requiring or promoting the development of Data Management Plans and the deposit of open data following the FAIR principles (Findable, Accessible, Interoperable and Reusable).

Libraries and research offices of Catalan universities, which work in coordination within the Open Science Area of CSUC, offer support services for research data management. The different works carried out at the Consortium level will be presented, as well as the implementation of the service in each university.

DOI : http://doi.org/10.18352/lq.10253

The History, Advocacy and Efficacy of Data Management Plans

Authors : Nicholas Smale, Kathryn Unsworth, Gareth Denyer, Daniel Barr

Data management plans (DMPs) have increasingly been encouraged as a key component of institutional and funding body policy. Although DMPs necessarily place an administrative burden on researchers, proponents claim that DMPs have myriad benefits, including enhanced research data quality, increased rates of data sharing, and institutional planning and compliance benefits.

In this manuscript, we explore the international history of DMPs and describe institutional and funding body DMP policy. We find that the economic and societal benefits expected from presumed increased rates of data sharing were the original driver for funding bodies to mandate DMPs.

Today, 86% of UK Research Councils and 63% of US funding bodies require submission of a DMP with funding applications. Given that no major Australian funding bodies require DMP submission, it is of note that 37% of Australian universities have taken the initiative to internally mandate DMPs.

Institutions both within Australia and internationally frequently promote the professional benefits of DMP use, and endorse DMPs as ‘best practice’. We analyse one such typical DMP implementation at a major Australian institution, finding that DMPs have low levels of apparent translational value.

Indeed, an extensive literature review suggests there is very limited published systematic evidence that DMP use has any tangible benefit for researchers, institutions or funding bodies.

We are therefore led to question why DMPs have become the go-to tool for research data professionals and advocates of good data practice. By delineating multiple use-cases and highlighting the need for DMPs to be fit for intended purpose, we question the view that a good DMP is necessarily that which encompasses the entire data lifecycle of a project.

Finally, we summarise recent developments in the DMP landscape, and note a positive shift towards evidence-based research management through more researcher-centric, educative, and integrated DMP services.

DOI : https://doi.org/10.1101/443499

Facilitating and Improving Environmental Research Data Repository Interoperability

Authors : Corinna Gries, Amber Budden, Christine Laney, Margaret O’Brien, Mark Servilla, Wade Sheldon, Kristin Vanderbilt, David Vieglais

Environmental research data repositories provide much needed services for data preservation and data dissemination to diverse communities with domain specific or programmatic data needs and standards.

Because they were developed independently, these repositories serve their communities well but use different technologies, data models and ontologies. Hence, the effectiveness and efficiency of these services can be vastly improved if repositories work together, adhering to a shared community platform that focuses on the implementation of agreed-upon standards and best practices for curation and dissemination of data.

Such a community platform drives forward the convergence of technologies and practices that will advance cross-domain interoperability. It will also facilitate contributions from investigators through standardized and streamlined workflows and provide increased visibility for the role of data managers and the curation services provided by data repositories, beyond preservation infrastructure.

Ten specific suggestions for such standardizations are outlined without any suggestions for priority or technical implementation. Although the recommendations are for repositories to implement, they have been chosen specifically with the data provider/data curator and synthesis scientist in mind.

DOI : http://doi.org/10.5334/dsj-2018-022

Health Sciences Libraries Advancing Collaborative Clinical Research Data Management in Universities

Authors : Tania P. Bardyn, Emily F. Patridge, Michael T. Moore, Jane J. Koh

Purpose

Medical libraries need to actively review their service models and explore partnerships with other campus entities to provide better-coordinated clinical research management services to faculty and researchers. TRAIL (Translational Research and Information Lab), a five-partner initiative at the University of Washington (UW), explores how best to leverage existing expertise and space to deliver clinical research data management (CRDM) services and emerging technology support to clinical researchers at UW and collaborating institutions in the Pacific Northwest.

Methods

The initiative offers 14 services and a technology-enhanced innovation lab located in the Health Sciences Library (HSL) to support the University of Washington clinical and research enterprise.

Sharing of staff and resources merges library and non-library workflows, better coordinating data and innovation services to clinical researchers. Librarians have adopted new roles in CRDM, such as providing user support and training for UW’s Research Electronic Data Capture (REDCap) instance.

Results

TRAIL staff are quickly adapting to changing workflows and shared services, including teaching classes on tools used to manage clinical research data. Researcher interest in TRAIL has sparked new collaborative initiatives and service offerings. Marketing and promotion will be important for raising researchers’ awareness of available services.

Conclusions

Medical librarians are developing new skills by supporting and teaching CRDM. Clinical and data librarians better understand the information needs of clinical and translational researchers by being involved in the earlier stages of the research cycle and identifying technologies that can improve healthcare outcomes.

At health sciences libraries, leveraging existing resources and bringing services together is central to how university medical librarians will operate in the future.

DOI : https://doi.org/10.7191/jeslib.2018.1130