Measuring and Mapping Data Reuse: Findings From an Interactive Workshop on Data Citation and Metrics for Data Reuse

Author : Lisa Federer

Widely adopted standards for data citation are foundational to efforts to track and quantify data reuse. Without the means to track data reuse and metrics to measure its impact, it is difficult to reward researchers who share high-value data with meaningful credit for their contribution.

Despite initial work on developing guidelines for data citation and metrics, standards have not yet been universally adopted. This article reports on the recommendations collected from a workshop held at the Future of Research Communications and e-Scholarship (FORCE11) 2018 meeting titled Measuring and Mapping Data Reuse: An Interactive Workshop on Metrics for Data.

A range of stakeholders were represented among the participants, including publishers, researchers, funders, repository administrators, librarians, and others.

Collectively, they generated a set of 68 recommendations for specific actions that could be taken by standards and metrics creators; publishers; repositories; funders and institutions; creators of reference management software and citation styles; and researchers, students, and librarians.

These specific, concrete, and actionable recommendations would help facilitate broader adoption of standard citation mechanisms and easier measurement of data reuse.

URL : Measuring and Mapping Data Reuse: Findings From an Interactive Workshop on Data Citation and Metrics for Data Reuse

DOI : https://doi.org/10.1162/99608f92.ccd17b00

Peer Review of Research Data Submissions to ScholarsArchive@OSU: How can we improve the curation of research datasets to enhance reusability?

Authors : Clara Llebot, Steven Van Tuyl

Objective

Best practices such as the FAIR Principles (Findability, Accessibility, Interoperability, Reusability) were developed to ensure that published datasets are reusable. While we employ best practices in the curation of datasets, we want to learn how domain experts view the reusability of datasets in our institutional repository, ScholarsArchive@OSU.
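As a rough illustration (all field names and values below are hypothetical, not ScholarsArchive@OSU's actual schema), a dataset record supporting each FAIR facet might look like:

```json
{
  "findable": {
    "identifier": "https://doi.org/10.xxxx/example-dataset",
    "title": "Example stream-temperature observations, 2015-2017",
    "keywords": ["hydrology", "stream temperature", "Oregon"]
  },
  "accessible": {
    "downloadUrl": "https://example.org/datasets/example-dataset.csv",
    "accessRights": "open"
  },
  "interoperable": {
    "format": "text/csv",
    "metadataSchema": "DataCite 4.4"
  },
  "reusable": {
    "license": "CC-BY-4.0",
    "documentation": "README.txt",
    "provenance": "Collected with calibrated HOBO loggers; see methods in README"
  }
}
```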

Curation workflows are designed by data curators based on their own recommendations, but research data is extremely specialized, and such workflows are rarely evaluated by researchers.

In this project we used peer review by domain experts to evaluate the reusability of the datasets in our institutional repository, with the goal of informing our curation methods and ensuring that the limited resources of our library maximize the reusability of research data.

Methods

We asked all researchers who have submitted datasets to Oregon State University’s repository to refer us to domain experts who could review the reusability of their datasets. Two data curators who are non-experts also reviewed the same datasets.

We gave both groups review guidelines based on the guidelines of several journals. Eleven domain experts and two data curators reviewed eight datasets.

The review included the quality of the repository record, the quality of the documentation, and the quality of the data. We then compared the comments given by the two groups.

Results

Domain experts and non-expert data curators largely converged on similar scores for the reviewed datasets, but the domain experts’ critiques focused on somewhat different issues.

A few broad issues common across reviews were: insufficient documentation, the use of links to journal articles in the place of documentation, and concerns about duplication of effort in creating documentation and metadata. Reviews also reflected the background and skills of the reviewer.

Domain experts noted their lack of expertise in data curation practices, while data curators noted their lack of expertise in the research domain.

Conclusions

The results of this investigation could help guide future research data curation activities and align domain expert and data curator expectations for reusability of datasets.

We recommend further exploration of these common issues and additional domain expert peer-review projects to further refine and align expectations for research data reusability.

URL : Peer Review of Research Data Submissions to ScholarsArchive@OSU: How can we improve the curation of research datasets to enhance reusability?

DOI : https://doi.org/10.7191/jeslib.2019.1166

Disciplinary data publication guides

Authors : Zosia Beckles, Stephen Gray, Debra Hiom, Kirsty Merrett, Kellie Snow, Damian Steer

Many academic disciplines have very comprehensive standards for data publication and clear guidance from funding bodies and academic publishers. In other cases, whilst much good-quality general guidance exists, researchers lack the information needed to decide which specific data elements should be shared.

This is a particular issue for disciplines with very varied data types, such as engineering, and presents an unnecessary barrier to researchers wishing to meet funder expectations on data sharing.

This article outlines a project to provide simple, visual, discipline-specific guidance on data publication, undertaken at the University of Bristol at the request of the Faculty of Engineering.

URL : Disciplinary data publication guides

DOI : https://doi.org/10.2218/ijdc.v13i1.603

Data Management: New Tools, New Organization, and New Skills in a French Research Institute

Authors : Caroline Martin, Colette Cadiou, Emmanuelle Jannès-Ober

In the context of e-science and open access, the visibility and impact of scientific results and data have become important for spreading information to users and to society in general.

This broader economic trend aims to feed the innovation process and create economic value. In our institute, the French National Research Institute of Science and Technology for Environment and Agriculture (Irstea), the department in charge of scientific and technical information, with the help of other professionals (scientists, IT specialists, ethics advisors…), has recently developed services suited to researchers’ data management needs, in response to European recommendations on open data.

This situation has required reviewing the workflows between databases and questioning the organizational relationships between skills, occupations, and departments within the institute.

Indeed, data management requires all professionals and researchers to align their ways of working.

URL : Data Management: New Tools, New Organization, and New Skills in a French Research Institute

DOI : https://doi.org/10.18352/lq.10196

Research Data Reusability: Conceptual Foundations, Barriers and Enabling Technologies

Author : Costantino Thanos

High-throughput scientific instruments are generating massive amounts of data. Today, one of the main challenges faced by researchers is to make the best use of the world’s growing wealth of data. Data (re)usability is becoming a distinct characteristic of modern scientific practice.

By data (re)usability, we mean the ease with which data produced by one research community (the producer community) can be used for legitimate scientific research by other communities (consumer communities).

Data (re)usability allows reanalysis of evidence, reproduction and verification of results, minimization of duplicated effort, and building on the work of others. It has four main dimensions: policy, legal, economic, and technological. This paper addresses the technological dimension of data reusability.

The conceptual foundations of data reuse as well as the barriers that hamper data reuse are presented and discussed. The data publication process is proposed as a bridge between the data author and user and the relevant technologies enabling this process are presented.

URL : Research Data Reusability: Conceptual Foundations, Barriers and Enabling Technologies

DOI : https://doi.org/10.3390/publications5010002

Data publication with the structural biology data grid supports live analysis

Access to experimental X-ray diffraction image data is fundamental for validation and reproduction of macromolecular models and indispensable for development of structural biology processing methods. Here, we established a diffraction data publication and dissemination system, Structural Biology Data Grid (SBDG; data.sbgrid.org), to preserve primary experimental data sets that support scientific publications.

Data sets are accessible to researchers through a community-driven data grid, which facilitates global data access. Our analysis of a pilot collection of crystallographic data sets demonstrates that the information archived by SBDG is sufficient to reprocess data to statistics that meet or exceed the quality of the original published structures.

SBDG has extended its services to the entire community and is used to develop support for other types of biomedical data sets. It is anticipated that access to the experimental data sets will enhance the paradigm shift in the community towards a much more dynamic body of continuously improving data analysis.

URL : Data publication with the structural biology data grid supports live analysis

DOI : https://doi.org/10.1038/ncomms10882

Publishing without Publishers: a Decentralized Approach to Dissemination, Retrieval, and Archiving of Data


“Making available and archiving scientific results is for the most part still considered the task of classical publishing companies, despite the fact that classical forms of publishing centered around printed narrative articles no longer seem well-suited in the digital age. In particular, there exist currently no efficient, reliable, and agreed-upon methods for publishing scientific datasets, which have become increasingly important for science. Here we propose to design scientific data publishing as a Web-based bottom-up process, without top-down control of central authorities such as publishing companies. We present a protocol and a server network to decentrally store and archive data in the form of nanopublications, an RDF-based format to represent scientific data with formal semantics. We show how this approach allows researchers to produce, publish, retrieve, address, verify, and recombine datasets and their individual nanopublications in a reliable and trustworthy manner, and we argue that this architecture could be used for the Semantic Web in general. Our evaluation of the current small network shows that this system is efficient and reliable, and we discuss how it could grow to handle the large amounts of structured data that modern science is producing and consuming.”
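As a hedged sketch of the data structure the authors describe (all identifiers below are hypothetical, not taken from the paper's actual network), a nanopublication groups its content into named RDF graphs, linked together by a head graph, which can be written in TriG:

```trig
@prefix np:   <http://www.nanopub.org/nschema#> .
@prefix prov: <http://www.w3.org/ns/prov#> .
@prefix ex:   <http://example.org/np1#> .

# Head graph: ties the three content graphs together
ex:Head {
    ex:pub a np:Nanopublication ;
        np:hasAssertion ex:assertion ;
        np:hasProvenance ex:provenance ;
        np:hasPublicationInfo ex:pubinfo .
}

# Assertion: the scientific claim itself, as RDF triples
ex:assertion {
    <http://example.org/gene/X> <http://example.org/regulates> <http://example.org/gene/Y> .
}

# Provenance: where the assertion came from
ex:provenance {
    ex:assertion prov:wasDerivedFrom <http://example.org/experiment/42> .
}

# Publication info: metadata about the nanopublication itself
ex:pubinfo {
    ex:pub prov:generatedAtTime "2015-01-01T00:00:00Z"^^<http://www.w3.org/2001/XMLSchema#dateTime> .
}
```

Because each part is an addressable named graph, individual nanopublications can be retrieved, verified, and recombined independently, which is what enables the decentralized storage and archiving the authors propose.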

URL : http://arxiv-web3.library.cornell.edu/abs/1411.2749