Data journals: incentivizing data access and documentation within the scholarly communication system

Author : William H. Walters

Data journals provide strong incentives for data creators to verify, document and disseminate their data. They also bring data access and documentation into the mainstream of scholarly communication, rewarding data creators through existing mechanisms of peer-reviewed publication and citation tracking.

These same advantages are not generally associated with data repositories, or with conventional journals’ data-sharing mandates. This article describes the unique advantages of data journals. It also examines the data journal landscape, presenting the characteristics of 13 data journals in the fields of biology, environmental science, chemistry, medicine and health sciences.

These journals vary considerably in size, scope, publisher characteristics, length of data reports, data hosting policies, time from submission to first decision, article processing charges, bibliographic index coverage and citation impact. They are similar, however, in their peer review criteria, their open access license terms and the characteristics of their editorial boards.

DOI : http://doi.org/10.1629/uksg.510

Research Data Management as an Integral Part of the Research Process of Empirical Disciplines Using Landscape Ecology as an Example

Authors : Winfried Schröder, Stefan Nickel

Research Data Management (RDM) is regarded as an elementary component of empirical disciplines. Taking Landscape Ecology in Germany as an example, the article demonstrates how to integrate RDM into the research design as a complement to the classic quality control and assurance in empirical research, which has so far generally been limited to data production.

Sharing and reuse of empirical data by scientists, as well as thorough peer review of knowledge produced by empirical research, require that the research problem, the operationalized definitions of the objects of investigation and their representative selection be documented and archived, along with the methods of data production (including indicators of data quality) and all data collected and produced.

On this basis, the extent to which this complemented design of research processes has already been realized is demonstrated by research projects of the Chair of Landscape Ecology at the University of Vechta, Germany.

This study is part of a joint research project on Research Data Management funded by the German Federal Ministry of Education and Research.

DOI : http://doi.org/10.5334/dsj-2020-026

Formalizing Privacy Laws for License Generation and Data Repository Decision Automation

Authors : Micah Altman, Stephen Chong, Alexandra Wood

In this paper, we summarize work-in-progress on expert system support to automate some data deposit and release decisions within a data repository, and to generate custom license agreements for those data transfers.

Our approach formalizes via a logic programming language the privacy-relevant aspects of laws, regulations, and best practices, supported by legal analysis documented in legal memoranda.

This formalization enables automated reasoning about the conditions under which a repository can transfer data, through interrogation of users and the application of formal rules to the facts thus obtained.

The proposed system takes the specific conditions for a given data release and produces a custom data use agreement that accurately captures the relevant restrictions on data use.

This enables appropriate decisions and accurate licenses, while removing the bottleneck of lawyer effort per data transfer.

The operation of the system aims to be transparent, in the sense that administrators, lawyers, institutional review boards, and other interested parties can evaluate the legal reasoning and interpretation embodied in the formalization, and the specific rationale for a decision to accept or release a particular dataset.
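The paper formalizes its rules in a logic programming language; the general idea — encoding transfer conditions as declarative rules and evaluating them against facts gathered from depositors — can be sketched in Python. All rule names, conditions and clauses below are illustrative assumptions, not the authors' actual legal analysis:

```python
# Minimal rule-based sketch of automated data-release decisions.
# Each rule pairs a predicate over user-supplied facts with the
# license clause a custom agreement would embed if the rule fires.
RULES = [
    ("deidentified_public",
     lambda f: f.get("deidentified") and not f.get("hipaa_covered"),
     "May be released publicly; attribution required."),
    ("hipaa_limited",
     lambda f: f.get("hipaa_covered") and f.get("recipient_has_dua"),
     "Release only under a limited data use agreement."),
]

def decide(facts):
    """Apply the rules, in order, to facts obtained by interrogating
    the depositor; return the decision and the matching clause."""
    for name, predicate, clause in RULES:
        if predicate(facts):
            return {"release": True, "rule": name, "clause": clause}
    return {"release": False, "rule": None,
            "clause": "No applicable rule; refer to legal counsel."}
```

A declarative rule list like this also supports the transparency goal: reviewers can read the rule table directly, rather than tracing imperative branching logic.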

URL : https://arxiv.org/abs/1910.10096

A Discussion of Value Metrics for Data Repositories in Earth and Environmental Sciences

Authors : Cynthia Parr, Corinna Gries, Margaret O’Brien, Robert R. Downs, Ruth Duerr, Rebecca Koskela, Philip Tarrant, Keith E. Maull, Nancy Hoebelheinrich, Shelley Stall

Despite growing recognition of the importance of public data to the modern economy and to scientific progress, long-term investment in the repositories that manage and disseminate scientific data in easily accessible ways remains elusive. Repositories are asked to demonstrate the net value of their data and services in order to justify continued funding or attract new funding sources.

Here, representatives from a number of environmental and Earth science repositories evaluate approaches for assessing the costs and benefits of publishing scientific data in their repositories, identifying various metrics that repositories typically use to report on the impact and value of their data products and services, plus additional metrics that would be useful but are not typically measured.

We rated each metric by (a) the difficulty of implementation by our specific repositories and (b) its importance for value determination. As managers of environmental data repositories, we find that some of the most easily obtainable data-use metrics (such as data downloads and page views) may be less indicative of value than metrics that relate to discoverability and broader use.

Other intangible but equally important metrics (e.g., laws or regulations impacted, lives saved, new proposals generated) will require considerable additional research to describe and develop, plus resources to implement at scale.

As value can only be determined from the point of view of a stakeholder, it is likely that multiple sets of metrics will be needed, tailored to specific stakeholder needs. Moreover, economically based analyses or the use of specialists in the field are expensive and can happen only as resources permit.

DOI : http://doi.org/10.5334/dsj-2019-058

Workflows Allowing Creation of Journal Article Supporting Information and Findable, Accessible, Interoperable, and Reusable (FAIR)-Enabled Publication of Spectroscopic Data

Authors : Agustin Barba, Santiago Dominguez, Carlos Cobas, David P. Martinsen, Charles Romain, Henry S. Rzepa, Felipe Seoane

There is an increasing focus on the part of academic institutions, funding agencies, and publishers, if not researchers themselves, on preservation and sharing of research data. Motivations for sharing include research integrity, replicability, and reuse.

One of the barriers to publishing data is the extra work involved in preparing data for publication once a journal article and its supporting information have been completed.

In this work, a method is described to generate both human and machine-readable supporting information directly from the primary instrumental data files and to generate the metadata to ensure it is published in accordance with findable, accessible, interoperable, and reusable (FAIR) guidelines.

Using this approach, both the human readable supporting information and the primary (raw) data can be submitted simultaneously with little extra effort.

Although traditionally the data package would be sent to a journal publisher for publication alongside the article, the data package could also be published independently in an institutional FAIR data repository.

Workflows are described that store the data packages and generate metadata appropriate for such a repository. The methods both to generate and to publish the data packages have been implemented for NMR data, but the concept is extensible to other types of spectroscopic data as well.
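A workflow like the one described hinges on generating repository-ready metadata directly from the primary data files. The sketch below assembles a minimal, DataCite-flavoured record with per-file checksums; the field names and inputs are illustrative assumptions, not the schema the authors actually implemented:

```python
import hashlib
from datetime import date

def build_metadata(title, creators, files,
                   license_url="https://creativecommons.org/licenses/by/4.0/"):
    """Assemble a minimal metadata record for a data package.
    `files` maps file names to their raw bytes; checksums make the
    package verifiable after transfer to a repository."""
    return {
        "title": title,
        "creators": creators,
        "publicationYear": date.today().year,
        "resourceType": "Dataset",
        "license": license_url,
        "files": [
            {"name": name, "sha256": hashlib.sha256(content).hexdigest()}
            for name, content in files.items()
        ],
    }
```

Because the record is built from the same primary files that form the supporting information, the human-readable and machine-readable outputs stay in sync with little extra effort from the author.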

DOI : https://doi.org/10.1021/acsomega.8b03005

Are Research Datasets FAIR in the Long Run?

Authors : Dennis Wehrle, Klaus Rechert

Currently, initiatives in Germany are developing infrastructure to accept and preserve dissertation data together with the dissertation texts (bwDATA Diss at the state level, eDissPlus at the federal level).

In contrast to specialized data repositories, these services will accept data from all kinds of research disciplines. To ensure FAIR data principles (Wilkinson et al., 2016), preservation plans are required, because ensuring accessibility, interoperability and re-usability even for a minimum ten-year data retention period can become a major challenge.

Both for longevity and re-usability, file formats matter. In order to ensure access to data, the data's encoding, i.e. its technical and structural representation in the form of file formats, needs to be understood. Hence, due to fast technical lifecycles, interoperability, re-use and, in some cases, even accessibility depend on the data's format and our future ability to parse or render it.

This leads to several practical questions regarding quality assurance, potential access options and necessary future preservation steps. In this paper, we analyze datasets from public repositories and apply a file format based long-term preservation risk model to support workflows and services for non-domain specific data repositories.
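A format-based risk model of this kind can be approximated by scoring each file in a dataset by its extension. The scores and threshold below are illustrative placeholders; the paper's actual model weighs factors such as openness of the specification, tool support and migration paths:

```python
from pathlib import Path

# Illustrative risk scores per file extension (0 = low preservation
# risk, 1 = high). Unknown or proprietary formats default to high risk.
FORMAT_RISK = {
    ".csv": 0.1, ".txt": 0.1, ".pdf": 0.2, ".tif": 0.2,
    ".xlsx": 0.4, ".sav": 0.7, ".dta": 0.7,
}
DEFAULT_RISK = 0.9

def dataset_risk(filenames, threshold=0.5):
    """Score each file by extension and flag those whose format risk
    suggests a preservation action (e.g. migration to an open format)."""
    scores = {f: FORMAT_RISK.get(Path(f).suffix.lower(), DEFAULT_RISK)
              for f in filenames}
    flagged = [f for f, s in scores.items() if s >= threshold]
    return scores, flagged
```

Run over a whole repository, such a scorer supports the quality-assurance and planning workflows the paper targets: flagged files become candidates for format migration before the retention period puts them out of reach.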

DOI : https://doi.org/10.2218/ijdc.v13i1.659