Documentation and Visualisation of Workflows for Effective Communication, Collaboration and Publication @ Source

Authors : Cerys Willoughby, Jeremy G. Frey

Workflows processing data from research activities and driving in silico experiments are becoming an increasingly important method for conducting scientific research. Workflows have the advantage that not only can they be automated and used to process data repeatedly, but they can also be reused – in part or whole – enabling them to be evolved for use in new experiments.

A number of studies have investigated strategies for storing and sharing workflows for the benefit of reuse. These have revealed that simply storing workflows in repositories without additional context does not enable workflows to be successfully reused.

These studies have investigated what additional resources are needed to support users of workflows, in particular adding provenance traces and making workflows and their resources machine-readable.

These additions also include metadata for curation, annotations for comprehension, and data sets that provide additional context to the workflow. Ultimately though, these mechanisms still rely on researchers having access to the software to view and run the workflows.

We argue that there are situations where researchers may want to understand a workflow in ways that go beyond what provenance traces provide, and without having to run the workflow directly; there are many situations in which it can be difficult or impossible to run the original workflow.

To that end, we have investigated the creation of an interactive workflow visualisation that captures the flow chart element of the workflow with additional context including annotations, descriptions, parameters, metadata and input, intermediate, and results data. This visualisation can be added to the record of a workflow experiment to enhance curation and add value that enables reuse.

We have created interactive workflow visualisations for the popular workflow creation tool KNIME, which does not provide users with an in-built function to extract provenance information that can otherwise only be viewed through the tool itself.

Making use of the strengths of KNIME for adding documentation and user-defined metadata, we can extract and create a visualisation and curation package that encourages and enhances curation@source, facilitating effective communication, collaboration, and reuse of workflows.

DOI : https://doi.org/10.2218/ijdc.v12i1.532

Research Data Management Instruction for Digital Humanities

Author : Willow Dressel

eScience-related library services at Princeton University started in response to the National Science Foundation’s (NSF) data management plan requirements, and grew to encompass a range of services including data management plan consultation, assistance with depositing into a disciplinary or institutional repository, and research data management instruction.

These services were initially directed at science and engineering disciplines on campus, but the eScience Librarian soon realized the relevance of research data management instruction for humanities disciplines with digital approaches.

Applicability to the digital humanities was initially recognized by discovery of related efforts from the history department’s Information Technology (IT) manager in the form of a graduate-student workshop on file and digital-asset management concepts.

Seeing the common ground these activities shared with research data management, a collaboration was formed between the history department’s IT Manager and the eScience Librarian to provide a research data management overview to the entire campus community.

The eScience Librarian was then invited to participate in the history department’s graduate student file and digital asset management workshop to provide an overview of other research data management concepts. Based on the success of the collaboration with the history department IT, the eScience Librarian offered to develop a workshop for the newly formed Center for Digital Humanities at Princeton.

To develop the workshop, background research on digital humanities curation was performed, revealing similarities and differences between digital humanities curation and research data management in the sciences. These similarities and differences, workshop results, and areas of further study are discussed.

DOI : https://doi.org/10.7191/jeslib.2017.1115

Versioned data: why it is needed and how it can be achieved (easily and cheaply)

Authors : Daniel S. Falster, Richard G. FitzJohn, Matthew W. Pennell, William K. Cornwell

The sharing and re-use of data has become a cornerstone of modern science. Multiple platforms now allow quick and easy data sharing. So far, however, data publishing models have not accommodated on-going scientific improvements in data: for many problems, datasets continue to grow with time — more records are added, errors fixed, and new data structures are created. In other words, datasets, like scientific knowledge, advance with time.

We therefore suggest that many datasets would be usefully published as a series of versions, with a simple naming system to allow users to perceive the type of change between versions. In this article, we argue for adopting the paradigm and processes for versioned data, analogous to software versioning.

We also introduce a system called Versioned Data Delivery and present tools for creating, archiving, and distributing versioned data easily, quickly, and cheaply. These new tools allow for individual research groups to shift from a static model of data curation to a dynamic and versioned model that more naturally matches the scientific process.
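
The naming idea above can be illustrated with a minimal Python sketch. It assumes a three-part `major.minor.patch` scheme borrowed from software versioning; the class name `DataVersion`, the change labels, and the mapping of change types to version parts are hypothetical choices made for illustration and are not taken from the article or from the Versioned Data Delivery tools.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class DataVersion:
    """Hypothetical three-part dataset version (assumed scheme, not the authors')."""
    major: int
    minor: int
    patch: int

    @classmethod
    def parse(cls, text: str) -> "DataVersion":
        # Expects exactly three dot-separated integers, e.g. "1.2.3".
        major, minor, patch = (int(part) for part in text.split("."))
        return cls(major, minor, patch)

    def bump(self, change: str) -> "DataVersion":
        # Assumed mapping of change type to version part:
        #   "structure" -> new data structures (may break existing code)
        #   "records"   -> more records added
        #   "fix"       -> errors corrected in existing records
        if change == "structure":
            return DataVersion(self.major + 1, 0, 0)
        if change == "records":
            return DataVersion(self.major, self.minor + 1, 0)
        if change == "fix":
            return DataVersion(self.major, self.minor, self.patch + 1)
        raise ValueError(f"unknown change type: {change!r}")

    def __str__(self) -> str:
        return f"{self.major}.{self.minor}.{self.patch}"


if __name__ == "__main__":
    current = DataVersion.parse("1.2.3")
    print(current.bump("fix"))        # error corrected
    print(current.bump("records"))    # records added
    print(current.bump("structure"))  # data structure changed
```

Under this assumed scheme, a user comparing two version strings can tell from the part that changed whether an update only corrects errors, adds records, or alters the structure of the data.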

DOI : https://doi.org/10.7287/peerj.preprints.3401v1

Research Data Management Practices in University libraries: A study

Authors : Manorama Tripathi, Archana Shukla, Sharad Kumar Sonkar

This paper studies the research data management (RDM) services implemented by different university libraries for managing, organizing, curating and preserving research data generated in their universities’ departments and laboratories, for data reuse and sharing.

It surveys the central university libraries and the top 20 university libraries in the world to highlight how RDM services are extended to researchers. It also suggests a model that university libraries in the country can follow to deploy RDM services.

URL : Research Data Management Practices in University libraries: A study

Alternative location : http://publications.drdo.gov.in/ojs/index.php/djlit/article/view/11336

A Campus Partnership to Foster Compliance with Funder Mandates

Authors : Jeff R. Broadbent, Andrea Payant, Kevin Peterson, Betty Rozum, Liz Woolcott

Data from federally funded research must now be made publicly accessible and discoverable. Researchers must adhere to guidelines established by federal agencies, and universities must be prepared to demonstrate compliance with the federal mandate.

At Utah State University, the Office of Research and Graduate Studies and the Merrill-Cazier Library partnered to facilitate data sharing and create an audit trail demonstrating compliance with the terms of each researcher’s award.

This systematic approach uses existing resources such as the grant management system, the institutional repository (IR), and the Library online catalog. This paper describes our process and the first eight months of implementation.

URL : http://digitalcommons.usu.edu/lib_pubs/274/

Research Data Management Services in Academic Libraries in the US: A Content Analysis of Libraries’ Websites

Authors : Ayoung Yoon, Teresa Schultz

Examining landscapes of research data management services in academic libraries is timely and significant for both those libraries on the front line and the libraries that are already ahead.

While it provides an overall understanding of where research data management programs stand and where they are going, it also provides an understanding of current practices and of data management recommendations and tool adoption, and reveals areas for improvement and support.

This study examined the research data (management) services in academic libraries in the United States through a content analysis of 185 library websites, with four main areas of focus: service, information, education, and network.

The results from the content analysis of these webpages reveal that libraries need to advance and engage more actively to provide services, supply information online, and develop educational services.

There is also a wide variation among library data management services and programs according to their web presence.

URL : http://crl.acrl.org/index.php/crl/article/view/16788/18346

The Evolution, Approval and Implementation of the U.S. Geological Survey Science Data Lifecycle Model

Authors : John L. Faundeen, Vivian B. Hutchison

This paper details how the U.S. Geological Survey (USGS) Community for Data Integration (CDI) Data Management Working Group developed a Science Data Lifecycle Model, and the role the Model plays in shaping agency-wide policies and data management applications.

Starting with an extensive literature review of existing data lifecycle models, representatives from various backgrounds in USGS attended a two-day meeting where the basic elements for the Science Data Lifecycle Model were determined.

Refinements and reviews spanned two years, leading to finalization of the model and documentation in a formal agency publication.

The Model serves as a critical framework for data management policy, instructional resources, and tools. The Model helps the USGS address both the Office of Science and Technology Policy (OSTP) for increased public access to federally funded research, and the Office of Management and Budget (OMB) 2013 Open Data directives, as the foundation for a series of agency policies related to data management planning, metadata development, data release procedures, and the long-term preservation of data.

Additionally, the agency website devoted to data management instruction and best practices (www2.usgs.gov/datamanagement) is designed around the Model’s structure and concepts. This paper also illustrates how the Model is being used to develop tools for supporting USGS research and data management processes.

URL : http://escholarship.umassmed.edu/jeslib/vol6/iss2/4/