Researchers and Research Data: Improving and Incentivising Sharing and Archiving

Authors : Minna Ventsel, Beth Montague-Hellen

There has been a lot of discussion within the scientific community around reproducibility in research, with questions being raised about the integrity of research due to failures to reproduce or confirm the findings of some studies. Researchers need to adhere to the FAIR (findable, accessible, interoperable, and reusable) principles to contribute to collaborative and open science, and these open data principles can also support reproducibility and help address issues around data integrity.

This article uses observations and metrics from data sharing and research integrity related activities, undertaken by a Research Integrity and Data Specialist at the Francis Crick Institute, to discuss potential reasons behind the slow uptake of FAIR data practices. We then suggest solutions undertaken at the Francis Crick Institute that other institutes and universities can follow to improve the integrity of research from a data perspective.

One major solution discussed is the implementation of a data archive system at the Francis Crick Institute to ensure the long-term integrity of data, comply with our funders’ data management requirements, and safeguard our researchers against any potential research integrity allegations in the future.

DOI : https://doi.org/10.2218/v19i1.983

Agile Research Data Management with Open Source: LinkAhead

Authors : Daniel Hornung, Florian Spreckelsen, Thomas Weiß

Research data management (RDM) in academic scientific environments is increasingly in focus as an important part of good scientific practice and as a topic with great potential for saving time and money. Nevertheless, there is a shortage of appropriate tools that fulfill the specific requirements of scientific research.

We identified where the requirements in science deviate from those in other fields and proposed a list of requirements that RDM software should meet to become a viable option. We analyzed a number of currently available technologies and tool categories against these requirements and identified areas where no tools can satisfy researchers’ needs.

Finally, we assessed the open-source RDMS (research data management system) LinkAhead for compatibility with the proposed features and found that it fulfills the requirements in the area of semantic, flexible data handling, in which other tools show weaknesses.

DOI : https://doi.org/10.48694/inggrid.3866

Emerging roles and responsibilities of libraries in support of reproducible research

Authors : Birgit Schmidt, Andrea Chiarelli, Lucia Loffreda, Jeroen Sondervan

Ensuring the reproducibility of research is a multi-stakeholder effort that comes with challenges and opportunities for individual researchers and research communities, librarians, publishers, funders and service providers. These emerge at various steps of the research process, and, in particular, at the publication stage.

Previous work by Knowledge Exchange highlighted that, while there is growing awareness among researchers, reproducible publication practices have been slow to change. Importantly, research reproducibility has not yet reached institutional agendas: this work seeks to highlight the rationale for libraries to initiate and/or step up their engagement with this topic, which we argue is well aligned with their core values and strategic priorities.

We draw on secondary analysis of data gathered by Knowledge Exchange, focusing on the literature identified as well as interviews held with librarians. We extend this through further investigation of the literature and by integrating the findings of discussions held at the 2022 LIBER conference, to provide an updated picture of how libraries engage with research reproducibility.

Libraries have a significant role in promoting responsible research practices, including transparency and reproducibility, by leveraging their connections to academic communities and collaborating with stakeholders like research funders and publishers. Our recommendations for libraries include: i) partnering with researchers to promote a research culture that values transparency and reproducibility; ii) enhancing existing research infrastructure and support; and iii) investing in raising awareness and developing skills and capacities related to these principles.

DOI : https://doi.org/10.53377/lq.14947

The Future of Data in Research Publishing: From Nice to Have to Need to Have?

Authors : Christine L. Borgman, Amy Brand

Science policy promotes open access to research data for purposes of transparency and reuse of data in the public interest. We expect demands for open data in scholarly publishing to accelerate, at least partly in response to the opacity of artificial intelligence algorithms.

Open data should be findable, accessible, interoperable, and reusable (FAIR), and also trustworthy and verifiable. The current state of open data in scholarly publishing is in transition from ‘nice to have’ to ‘need to have.’

Research data are valuable, interpretable, and verifiable only in the context of their origin, and with sufficient infrastructure to facilitate reuse. Making research data useful is expensive; benefits and costs are distributed unevenly.

Open data also poses risks for provenance, intellectual property, misuse, and misappropriation in an era of trolls and hallucinating AI algorithms. Scholars and scholarly publishers must make evidentiary data more widely available to promote public trust in research.

To make research processes more trustworthy, transparent, and verifiable, stakeholders need to make greater investments in data stewardship and knowledge infrastructures.

DOI : https://doi.org/10.1162/99608f92.b73aae77

Making Mathematical Research Data FAIR: A Technology Overview

Authors : Tim Conrad, Eloi Ferrer, Daniel Mietchen, Larissa Pusch, Johannes Stegmüller, Moritz Schubotz

The sharing and citation of research data is becoming increasingly recognized as an essential building block in scientific research across various fields and disciplines. Sharing research data allows other researchers to reproduce results, replicate findings, and build on them. Ultimately, this will foster faster cycles in knowledge generation.

Some disciplines, such as astronomy or bioinformatics, already have a long history of sharing data; many others do not. The current landscape of so-called research data repositories is diverse. This article reviews the technology behind existing data repositories and portals, with a focus on mathematical research data.

Original location: https://arxiv.org/abs/2309.11829

An iterative and interdisciplinary categorisation process towards FAIRer digital resources for sensitive life-sciences data

Authors : Romain David, Christian Ohmann, Jan‑Willem Boiten, Mónica Cano Abadía, Florence Bietrix, Steve Canham, Maria Luisa Chiusano, Walter Dastrù, Arnaud Laroquette, Dario Longo, Michaela Th. Mayrhofer, Maria Panagiotopoulou, Audrey S. Richard, Sergey Goryanin, Pablo Emilio Verde

For life science infrastructures, sensitive data generate an additional layer of complexity. Cross-domain categorisation and discovery of digital resources related to sensitive data present major interoperability challenges. To support this FAIRification process, a toolbox demonstrator has been developed that aims to support the discovery of digital objects related to sensitive data (e.g., regulations, guidelines, best practice, tools).

The toolbox is based upon a categorisation system developed and harmonised across a cluster of 6 life science research infrastructures. Three different versions were built, tested by subsequent pilot studies, finally leading to a system with 7 main categories (sensitive data type, resource type, research field, data type, stage in data sharing life cycle, geographical scope, specific topics).

The 109 resources tagged in pilot study 3 were used as the initial content for the toolbox demonstrator, a software tool that allows searching for digital objects linked to sensitive data, with filtering based upon the categorisation system.

Important next steps are a broad evaluation of the usability and user-friendliness of the toolbox, extension to more resources, broader adoption by different life-science communities, and a long-term vision for maintenance and sustainability.

DOI : https://doi.org/10.1038/s41598-022-25278-z