Between administration and research: Understanding data management practices in an institutional context

Authors : Stefan Reichmann, Thomas Klebel, Ilire Hasani-Mavriqi, Tony Ross-Hellauer

Research Data Management (RDM) promises to make research outputs more transparent, findable, and reproducible. Strategies to streamline data management across disciplines are of key importance.

This paper presents results of an institutional survey (N = 258) at a medium-sized Austrian university with a STEM focus, supplemented with interviews (N = 18), to give an overview of the state-of-play of RDM practices across faculties and disciplinary contexts.

RDM services are on the rise but remain somewhat behind leading countries like the Netherlands and UK, showing only the beginnings of a culture attuned to RDM. There is considerable variation between faculties and institutes with respect to data amounts, complexity of data sets, data collection and analysis, and data archiving.

Data sharing practices within fields tend to be inconsistent. RDM is predominantly regarded as an administrative task, to the detriment of considerations of good research practice. Problems with RDM fall in two categories: Generic problems transcend specific research interests, infrastructures, and departments while discipline-specific problems need a more targeted approach.

The paper extends the state-of-the-art on RDM practices by combining in-depth qualitative material with quantified, detailed data about RDM practices and needs. The findings should be of interest to any comparable research institution with a similar agenda.

URL : Between administration and research: Understanding data management practices in an institutional context

DOI : https://doi.org/10.1002/asi.24492

Classification and analysis of PubPeer comments: How a web journal club is used

Author : José Luis Ortega

This study explores the use of PubPeer by the scholarly community, to understand the issues discussed in an online journal club, the disciplines most commented on, and the characteristics of the most prolific users.

A sample of 39,985 posts about 24,779 publications was extracted from PubPeer in 2019 and 2020. These comments were divided into seven categories according to their degree of seriousness (Positive review, Critical review, Lack of information, Honest errors, Methodological flaws, Publishing fraud, and Manipulation).

The results show that more than two-thirds of comments are posted to report some type of misconduct, mainly about image manipulation. These comments generate most discussion and take longer to be posted. By discipline, Health Sciences and Life Sciences are the most discussed research areas.

The results also reveal “super commenters,” users who access the platform to systematically review publications. The study ends by discussing how various disciplines use the site for different purposes.

URL : Classification and analysis of PubPeer comments: How a web journal club is used

DOI : https://doi.org/10.1002/asi.24568

Structure of Research Article Abstracts in Political Science: A Genre-Based Study

Author : Hesham Suleiman Alyousef

The research article (RA) abstract is the first section researchers read to determine the article's relevance to their interests. Researchers need to possess an implicit knowledge of the rhetorical move structure and organization of this section. Unlike most scientific disciplines, political science RA abstracts are unstructured, that is, with no headings (or moves), which makes this more challenging.

To the best of our knowledge, the rhetorical move structure in high readership political science RA abstracts has not been researched. This study investigated (a) the rhetorical move structure in 120 political science RA abstracts from six high-impact journals, (b) the most common move patterns, and (c) the move(s) occupying most textual space. The findings indicated the lack of obligatory moves. A move structure model for writing a political science RA abstract is proposed, comprising four conventional moves (Introduction [I]–Purpose [P]–Methods [M]–Results [R]) and two optional steps/moves, namely, the Research Gap step and the Discussion [D] move. The results also showed that the most frequent move pattern is I-P-M-R-D, followed by I-P-M-R and I-P-R-D.

The fact that an RA abstract summarizes the whole RA results in move embedding, particularly in the four moves, I-P-M-R. The findings revealed the importance of the Results move as it occupied nearly one third of text space. The results may contribute to the fields of discourse and genre studies.

They may provide invaluable insights for novice political science researchers attempting to publish their work in high-ranking journals. The proposed move structure model can act as a guide for English for Academic Purposes (EAP)/English for Specific Purposes (ESP) tutors and political science authors.

URL : Structure of Research Article Abstracts in Political Science: A Genre-Based Study

DOI : https://doi.org/10.1177/21582440211040797

Research Data Management Challenges in Citizen Science Projects and Recommendations for Library Support Services. A Scoping Review and Case Study

Authors : Jitka Stilund Hansen, Signe Gadegaard, Karsten Kryger Hansen, Asger Væring Larsen, Søren Møller, Gertrud Stougård Thomsen, Katrine Flindt Holmstrand

Citizen science (CS) projects are part of a new era of data aggregation and harmonisation that facilitates interconnections between different datasets. Increasing the value and reuse of CS data has received growing attention with the appearance of the FAIR principles and systematic research data management (RDM) practices, which are often promoted by university libraries.

However, RDM initiatives in CS appear diversified, and it is unclear whether CS has special needs in terms of RDM. Therefore, the aim of this article is firstly to identify RDM challenges for CS projects and secondly to discuss how university libraries may help address any such challenges.

A scoping review and a case study of Danish CS projects were performed to identify RDM challenges. 48 articles were selected for data extraction. Four academic project leaders were interviewed about RDM practices in their CS projects.

Challenges and recommendations identified in the review and case study are often not specific for CS. However, finding CS data, engaging specific populations, attributing volunteers and handling sensitive data including health data are some of the challenges requiring special attention by CS project managers. Scientific requirements or national practices do not always encompass the nature of CS projects.

Based on the identified challenges, it is recommended that university libraries focus their services on 1) identifying legal and ethical issues that the project managers should be aware of in their projects, 2) elaborating these issues in a Terms of Participation that also specifies data handling and sharing to the citizen scientist, and 3) motivating the project manager to adopt good data handling practices.

Adhering to the FAIR principles and good RDM practices in CS projects will continuously secure contextualisation and data quality. High data quality increases the value and reuse of the data and, therefore, the empowerment of the citizen scientists.

URL : Research Data Management Challenges in Citizen Science Projects and Recommendations for Library Support Services. A Scoping Review and Case Study

DOI : http://doi.org/10.5334/dsj-2021-025

Visual Summary Identification From Scientific Publications via Self-Supervised Learning

Authors : Shintaro Yamamoto, Anne Lauscher, Simone Paolo Ponzetto, Goran Glavaš, Shigeo Morishima

The exponential growth of scientific literature yields the need to support users to both effectively and efficiently analyze and understand the body of research work. This exploratory process can be facilitated by providing graphical abstracts–a visual summary of a scientific publication.

Accordingly, previous work recently presented an initial study on automatic identification of a central figure in a scientific publication, to be used as the publication’s visual summary.

This study, however, has been limited to a single (biomedical) domain. This is primarily because the current state-of-the-art relies on supervised machine learning, which typically requires large amounts of labeled data: the only existing annotated data set until now covered only biomedical publications.

In this work, we build a novel benchmark data set for visual summary identification from scientific publications, which consists of papers presented at conferences from several areas of computer science. We couple this contribution with a new self-supervised learning approach to learn a heuristic matching of in-text references to figures with figure captions.

Our self-supervised pre-training, executed on a large unlabeled collection of publications, attenuates the need for large annotated data sets for visual summary identification and facilitates domain transfer for this task. We evaluate our self-supervised pretraining for visual summary identification on both the existing biomedical and our newly presented computer science data set.

The experimental results suggest that the proposed method is able to outperform the previous state-of-the-art without any task-specific annotations.

URL : Visual Summary Identification From Scientific Publications via Self-Supervised Learning

DOI : https://doi.org/10.3389/frma.2021.719004

Open science, the replication crisis, and environmental public health

Author : Daniel J. Hicks

Concerns about a crisis of mass irreplicability across scientific fields (“the replication crisis”) have stimulated a movement for open science, encouraging or even requiring researchers to publish their raw data and analysis code.

Recently, a rule at the US Environmental Protection Agency (US EPA) would have imposed a strong open data requirement. The rule prompted significant public discussion about whether open science practices are appropriate for fields of environmental public health.

The aims of this paper are to assess (1) whether the replication crisis extends to fields of environmental public health; and (2) in general whether open science requirements can address the replication crisis.

There is little empirical evidence for or against mass irreplicability in environmental public health specifically. Without such evidence, strong claims about whether the replication crisis extends to environmental public health – or not – seem premature.

By distinguishing three concepts – reproducibility, replicability, and robustness – it is clear that open data initiatives can promote reproducibility and robustness but do little to promote replicability.

I conclude by reviewing some of the other benefits of open science, and offer some suggestions for funding streams to mitigate the costs of adoption of open science practices in environmental public health.

URL : Open science, the replication crisis, and environmental public health

DOI : https://doi.org/10.1080/08989621.2021.1962713

Citizen-driven participatory research conducted through knowledge intermediary units. A thematic synthesis of the literature on “Science Shops”

Authors : Anne-Sophie Gresle, Eduardo Urias, Rosario Scandurra, Bálint Balázs, Irene Jimeno, Leonardo de la Torre Ávila, Maria Jesus Pinazo

A Science Shop acts as a mission-oriented intermediary unit between the scientific sphere and civil society organizations. It seeks to facilitate citizen-driven open science projects that respond to the needs of civil society organizations and which, typically, include students in the work process.

We performed a thematic analysis of systematically selected literature on Science Shops to understand how the scientific literature reflects the historical evolution of Science Shops in different settings and what factors the literature associates with the rise and fall of the Science Shop.

We used the PRISMA methodology to search for scientific papers in indexed journals in eight databases published in English, French and Spanish, and employed the thematic theory approach to extract and systematize our results. Twenty-six scientific articles met the inclusion criteria.

We identified three meta-categories and ten sub-topics which can serve as key pointers to guide the set-up and future work of Science Shops. Our results identify a major paradox: Science Shops incorporate public values in their scientific agendas but have difficulties sustaining themselves institutionally as they do not fit the current dominant research paradigm. Science Shops represent a persuasive complementary approach to the way science is defined, executed and produced today.

URL : Citizen-driven participatory research conducted through knowledge intermediary units. A thematic synthesis of the literature on “Science Shops”

DOI : https://doi.org/10.22323/2.20050202