Integrating Data Science Tools into a Graduate Level Data Management Course

Authors: Pete E. Pascuzzi, Megan R. Sapp Nelson

Objective

This paper describes a project to revise an existing research data management (RDM) course to include instruction in computer skills with robust data science tools.

Setting

A Carnegie R1 university.

Brief Description

Graduate student researchers need training in the basic concepts of RDM. However, they generally lack experience with robust data science tools to implement these concepts holistically. Two library instructors fundamentally redesigned an existing RDM course to include instruction with such tools.

The course was divided into lecture and lab sections to accommodate the increased instructional burden. Learning objectives and assessments were designed at a higher order so that students could demonstrate not only that they understood course concepts but also that they could use their computer skills to implement them.

Results

Twelve students completed the first iteration of the course. Feedback from these students was very positive, and they appreciated the combination of theoretical concepts, computer skills and hands-on activities. Based on student feedback, future iterations of the course will include more “flipped” content including video lectures and interactive computer tutorials to maximize active learning time in both lecture and lab.

The substance of this article is based upon poster presentations at RDAP Summit 2018.

URL : Integrating Data Science Tools into a Graduate Level Data Management Course

DOI : https://doi.org/10.7191/jeslib.2018.1152

Toward a Better Data Management Plan: The Impact of DMPs on Grant Funded Research Practices

Author : Sara Mannheimer

Data Management Plans (DMPs) are often required for grant applications. But do strong DMPs lead to better data management and sharing practices? Several recent research projects in the Library and Information Science field have investigated data management planning and practice through DMP content analysis and data-management-related interviews.

However, research hasn’t yet shown how DMPs ultimately affect data management and data sharing practices during grant-funded research. The research described in this article contributes to the existing literature by examining the impact of DMPs on grant awards and on Principal Investigators’ (PIs) data management and sharing practices.

The results of this research suggest the following key takeaways:

(1) Most PIs practice internal data management in order to prevent data loss, to facilitate sharing within the research team, and to seamlessly continue their research during personnel turnover;

(2) PIs still have room to grow in understanding specialized concepts such as metadata and policies for use and reuse;

(3) PIs may need guidance on practices that facilitate FAIR data, such as using metadata standards, assigning licenses to their data, and publishing in data repositories.

Ultimately, the results of this research can inform academic library services and support stronger, more actionable DMPs. The substance of this article is based upon a lightning talk presentation at RDAP Summit 2018.

URL : Toward a Better Data Management Plan: The Impact of DMPs on Grant Funded Research Practices

DOI : https://doi.org/10.7191/jeslib.2018.1155

Les enjeux de l’interopérabilité dans la diffusion et la valorisation des données archéologiques

Auteur/Author : Pauline Vignaud

A historical and scientific discipline, archaeology has seen its practices evolve since the arrival of digital technology. Since then, archaeologists have faced several issues, particularly in how they disseminate and promote their data.

In this context, questions about interoperability have emerged, notably concerning the tools (platforms, applications, projects) to develop and put in place to enable the sharing and promotion of archaeological data.

This thesis explores all the themes (datasets, reuse…) where interoperability comes into play in this scientific environment, whether as an enabling factor or as an obstacle to dissemination and promotion.

URL : Les enjeux de l’interopérabilité dans la diffusion et la valorisation des données archéologiques

Alternative location : https://www.enssib.fr/bibliotheque-numerique/notices/68376-les-enjeux-de-l-interoperabilite-dans-la-diffusion-et-la-valorisation-des-donnees-archeologiques

From Open Access to Open Data: collaborative work in the university libraries of Catalonia

Authors: Mireia Alcalá Ponce de León, Lluís Anglada i de Ferrer

In recent years, the scientific community and funding bodies have paid attention to the data collected, generated or used throughout different research activities. The dissemination of these data has become one of the constituent elements of Open Science.

For this reason, many funders are requiring or promoting the development of Data Management Plans, and depositing open data following the FAIR principles (Findable, Accessible, Interoperable and Reusable).

Libraries and research offices of Catalan universities, which work in coordination within the Open Science Area of CSUC, offer support services for research data management. The different works carried out at the Consortium level will be presented, as well as the implementation of the service in each university.

URL : From Open Access to Open Data: collaborative work in the university libraries of Catalonia

DOI : http://doi.org/10.18352/lq.10253

Reproducible research practices, transparency, and open access data in the biomedical literature, 2015–2017

Authors : Joshua D. Wallach, Kevin W. Boyack, John P. A. Ioannidis

Currently, there is a growing interest in ensuring the transparency and reproducibility of the published scientific literature. According to a previous evaluation of 441 biomedical journal articles published in 2000–2014, the biomedical literature largely lacked transparency in important dimensions.

Here, we surveyed a random sample of 149 biomedical articles published between 2015 and 2017 and determined the proportion reporting sources of public and/or private funding and conflicts of interest, sharing protocols and raw data, and undergoing rigorous independent replication and reproducibility checks.

We also investigated what can be learned about reproducibility and transparency indicators from open access data provided on PubMed. The majority of the 149 studies disclosed some information regarding funding (103, 69.1% [95% confidence interval, 61.0% to 76.3%]) or conflicts of interest (97, 65.1% [56.8% to 72.6%]).

Among the 104 articles with empirical data in which protocols or data sharing would be pertinent, 19 (18.3% [11.6% to 27.3%]) discussed publicly available data; only one (1.0% [0.1% to 6.0%]) included a link to a full study protocol. Among the 97 articles in which replication in studies with different data would be pertinent, there were five replication efforts (5.2% [1.9% to 12.2%]).
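The percentages and 95% confidence intervals above can be approximated from the reported counts. The abstract does not say which interval method the authors used, so the sketch below assumes a Wilson score interval; the function name `wilson_interval` and the labels are illustrative only, and the bounds it produces may differ slightly from the published ones.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Return (proportion, lower, upper) using a Wilson score interval.

    Assumption: the source does not name its interval method; Wilson is
    used here only as one common choice for binomial proportions.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z ** 2 / total
    center = (p + z ** 2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return p, center - half, center + half


# Counts as reported in the abstract: (articles meeting criterion, denominator).
reported_counts = {
    "funding disclosed": (103, 149),
    "conflicts of interest disclosed": (97, 149),
    "publicly available data discussed": (19, 104),
    "link to full protocol": (1, 104),
    "replication efforts": (5, 97),
}

for label, (k, n) in reported_counts.items():
    p, low, high = wilson_interval(k, n)
    print(f"{label}: {k}/{n} = {p:.1%} (approx. 95% CI {low:.1%} to {high:.1%})")
```

The point estimates follow directly from the counts (for example, 103/149 rounds to 69.1%); only the interval bounds depend on the assumed method.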

Although clinical trial identification numbers and funding details were often provided on PubMed, only two of the articles without a full text article in PubMed Central that discussed publicly available data at the full text level also contained information related to data sharing on PubMed; none had a conflicts of interest statement on PubMed.

Our evaluation suggests that although there have been improvements over the last few years in certain key indicators of reproducibility and transparency, opportunities exist to improve reproducible research practices across the biomedical literature and to make features related to reproducibility more readily visible in PubMed.

URL : Reproducible research practices, transparency, and open access data in the biomedical literature, 2015–2017

DOI : https://doi.org/10.1371/journal.pbio.2006930

Open Science by Design

Contributors : National Academies of Sciences, Engineering, and Medicine; Policy and Global Affairs; Board on Research Data and Information; Committee on Toward an Open Science Enterprise

Openness and sharing of information are fundamental to the progress of science and to the effective functioning of the research enterprise. The advent of scientific journals in the 17th century helped power the Scientific Revolution by allowing researchers to communicate across time and space, using the technologies of that era to generate reliable knowledge more quickly and efficiently.

Harnessing today’s stunning, ongoing advances in information technologies, the global research enterprise and its stakeholders are moving toward a new open science ecosystem.

Open science aims to ensure the free availability and usability of scholarly publications, the data that result from scholarly research, and the methodologies, including code or algorithms, that were used to generate those data.

Open Science by Design is aimed at overcoming barriers and moving toward open science as the default approach across the research enterprise.

This report explores specific examples of open science and discusses a range of challenges, focusing on stakeholder perspectives. It is meant to provide guidance to the research enterprise and its stakeholders as they build strategies for achieving open science and take the next steps.

URL : https://www.nap.edu/catalog/25116/open-science-by-design-realizing-a-vision-for-21st-century

The History, Advocacy and Efficacy of Data Management Plans

Authors : Nicholas Smale, Kathryn Unsworth, Gareth Denyer, Daniel Barr

Data management plans (DMPs) have increasingly been encouraged as a key component of institutional and funding body policy. Although DMPs necessarily place administrative burden on researchers, proponents claim that DMPs have myriad benefits, including enhanced research data quality, increased rates of data sharing, and institutional planning and compliance benefits.

In this manuscript, we explore the international history of DMPs and describe institutional and funding body DMP policy. We find that the economic and societal benefits expected from increased rates of data sharing were the original driver for funding bodies to mandate DMPs.

Today, 86% of UK Research Councils and 63% of US funding bodies require submission of a DMP with funding applications. Given that no major Australian funding bodies require DMP submission, it is of note that 37% of Australian universities have taken the initiative to internally mandate DMPs.

Institutions both within Australia and internationally frequently promote the professional benefits of DMP use, and endorse DMPs as ‘best practice’. We analyse one such typical DMP implementation at a major Australian institution, finding that DMPs have low levels of apparent translational value.

Indeed, an extensive literature review suggests there is very limited published systematic evidence that DMP use has any tangible benefit for researchers, institutions or funding bodies.

We are therefore led to question why DMPs have become the go-to tool for research data professionals and advocates of good data practice. By delineating multiple use-cases and highlighting the need for DMPs to be fit for intended purpose, we question the view that a good DMP is necessarily that which encompasses the entire data lifecycle of a project.

Finally, we summarise recent developments in the DMP landscape, and note a positive shift towards evidence-based research management through more researcher-centric, educative, and integrated DMP services.

URL : The History, Advocacy and Efficacy of Data Management Plans

DOI : https://doi.org/10.1101/443499