An empirical examination of data reuser trust in a digital repository

Authors : Elizabeth Yakel, Ixchel M. Faniel, Lionel P. Robert Jr

Most studies of trusted digital repositories have focused on the internal factors delineated in the Open Archival Information System (OAIS) Reference Model—organizational structure, technical infrastructure, and policies, procedures, and processes.

Typically, these factors are used during an audit and certification process to demonstrate a repository can be trusted. The factors influencing a repository’s designated community of users to trust it remain largely unexplored.

This article proposes and tests a model of trust in a data repository and the influence trust has on users’ intention to continue using it. Based on analysis of 245 surveys from quantitative social scientists who published research based on the holdings of one data repository, findings show three factors are positively related to data reuser trust—integrity, identification, and structural assurance.

In turn, trust and performance expectancy are positively related to data reusers’ intentions to return to the repository for more data. As one of the first studies of its kind, it shows the conceptualization of trusted digital repositories needs to go beyond high-level definitions and simple application of the OAIS standard.

Trust needs to encompass the complex trust relationship between repositories and the designated communities of users that they are being built to serve.

URL : Asso for Info Science Tech – 2024 – Yakel – An empirical examination of data reuser trust in a digital repository

DOI : https://doi.org/10.1002/asi.24933

An open dataset of article processing charges from six large scholarly publishers

Authors : Leigh-Ann Butler, Madelaine Hare, Nina Schönfelder, Eric Schares, Juan Pablo Alperin, Stefanie Haustein

This paper introduces a dataset of article processing charges (APCs) produced from the price lists of six large scholarly publishers – Elsevier, Frontiers, PLOS, MDPI, Springer Nature and Wiley – between 2019 and 2023.

APC price lists were downloaded from publisher websites each year as well as via Wayback Machine snapshots to retrieve fees per journal per year. The dataset includes journal metadata, APC collection method, and annual APC price list information in several currencies (USD, EUR, GBP, CHF, JPY, CAD) for 8,712 unique journals and 36,618 journal-year combinations.

The dataset was generated to allow for more precise analysis of APCs and can support library collection development and scientometric analysis estimating APCs paid in gold and hybrid OA journals.

URL : An open dataset of article processing charges from six large scholarly publishers

arXiv : https://arxiv.org/abs/2406.08356

To share or not to share? Image data sharing in the social sciences and humanities

Authors : Elina Late, Mette Skov, Sanna Kumpulainen

Introduction

The paper aims to investigate image data sharing within social science and humanities. While data sharing is encouraged as a part of the open science movement, little is known about the approaches and factors influencing the sharing of image data.

This information is needed because the use of image data in these fields of research is increasing, and data sharing is context dependent.

Method

The study analyses qualitative semi-structured interviews with 14 scholars who incorporate digital images as a core component of their research data.

Analysis

Content analysis is conducted to gather information about scholars’ image data sharing and motivating and impeding factors related to it.

Results

The findings show that image data sharing is not an established research practice, and when it happens it is mostly done via informal means by sharing data through personal contacts. Supporting the scientific community, the open science agenda and fulfilling research funders’ requirements motivate scholars to share their data. Impeding factors relate to the qualities of data, ownership of data, data stewardship, and research integrity.

Conclusion

Advancing image data sharing requires the development of research infrastructures and providing support and guidelines. Better understanding of the scholars’ image data practices is also needed.

URL : To share or not to share? Image data sharing in the social sciences and humanities

DOI : https://doi.org/10.47989/ir292834

Exploring scholarly perceptions of preprint servers

Authors : Shir Aviv-Reuven, Jenny Bronstein, Ariel Rosenfeld

Introduction

Preprint servers play an important role in scholarly communication. The study investigates scholars’ engagement, experiences, and perceptions regarding the use of these servers, both as information sources and publishing venues. This qualitative study seeks to extend our understanding of how these servers operate within the academic ecosystem and influence scholarly communication.

Method

Data was collected through 32 semi-structured interviews with scholars from different disciplines, to explore their engagement, experiences and perceptions in using these platforms.

Analysis

The data collected from these interviews underwent thematic content analysis using ATLAS.ti software. This analysis facilitated the organization and thematic examination of the textual narratives derived from the interviews.

Results

In this study, scholars discussed their perceptions about the benefits of using preprint servers in scholarly work such as rapid dissemination of information and open access, but also raised concerns regarding the lack of peer review for the studies uploaded to these servers.

Conclusion

These findings emphasize the growing, yet diverse, role preprint servers play in scholarly communication and their differential impact across academic disciplines.

URL : Exploring scholarly perceptions of preprint servers

DOI : https://doi.org/10.47989/ir292820

Where there’s a will there’s a way: ChatGPT is used more for science in countries where it is prohibited

Authors : Honglin Bao, Mengyi Sun, Misha Teplitskiy

Regulating AI has emerged as a key societal challenge, but which methods of regulation are effective is unclear. Here, we measure the effectiveness of restricting AI services geographically using the case of ChatGPT and science. OpenAI prohibits access to ChatGPT from several countries including China and Russia.

If the restrictions are effective, there should be minimal use of ChatGPT in prohibited countries. We measured use by developing a classifier based on prior work showing that early versions of ChatGPT overrepresented distinctive words like “delve.”

We trained the classifier on abstracts before and after ChatGPT “polishing” and validated it on held-out abstracts and on abstracts whose authors self-declared AI use, where it substantially outperformed the off-the-shelf LLM detectors GPTZero and ZeroGPT. Applying the classifier to preprints from arXiv, bioRxiv, and medRxiv reveals that ChatGPT was used in approximately 12.6% of preprints by August 2023 and use was 7.7% higher in countries without legal access.

Crucially, these patterns appeared before the first major legal LLM became widely available in China, the largest restricted-country preprint producer. ChatGPT use was associated with higher views and downloads, but not citations or journal placement.

Overall, restricting ChatGPT geographically has proven ineffective in science and possibly other domains, likely due to widespread workarounds.

URL : https://arxiv.org/abs/2406.11583

Virtual academic conferencing: a scoping review of 1984-2021 literature. Novel modalities vs. long standing challenges in scholarly communication

Authors : Agnieszka Olechnicka, Adam Ploszaj, Ewa Zegler-Poleska

This study reviews the literature on virtual academic conferences, which have gained significant attention due to the COVID-19 pandemic. We conducted a scoping review, analyzing 147 documents available up to October 5th, 2021.

We categorized this literature, identified main themes, examined theoretical approaches, evaluated empirical findings, and synthesized the advantages and disadvantages of virtual academic conferences. We find that the existing literature on virtual academic conferences is mainly descriptive and lacks a solid theoretical framework for studying the phenomenon.

Despite the rapid growth of the literature documenting and discussing virtual conferencing induced by the pandemic, the understanding of the phenomenon is limited. We provide recommendations for future research on academic virtual conferences: their impact on research productivity, quality, and collaboration; relations to social, economic, and geopolitical inequalities in science; and their environmental aspects.

We stress the need for further research encompassing the development of a theoretical framework that will guide empirical studies.

URL : https://arxiv.org/abs/2406.11583

Open Science at the Generative AI Turn: An Exploratory Analysis of Challenges and Opportunities

Authors : Mohammad Hosseini, Serge P.J.M. Horbach, Kristi L. Holmes, Tony Ross-Hellauer

Technology influences Open Science (OS) practices, because conducting science in transparent, accessible, and participatory ways requires tools/platforms for collaborative research and sharing results. Due to this direct relationship, characteristics of employed technologies directly impact OS objectives. Generative Artificial Intelligence (GenAI) models are increasingly used by researchers for tasks such as text refining, code generation/editing, reviewing literature, and data curation/analysis.

GenAI promises substantial efficiency gains but is currently fraught with limitations that could negatively impact core OS values such as fairness, transparency and integrity, and harm various social actors. In this paper, we explore possible positive and negative impacts of GenAI on OS.

We use the taxonomy within the UNESCO Recommendation on Open Science to systematically explore the intersection of GenAI and OS. We conclude that using GenAI could advance key OS objectives by further broadening meaningful access to knowledge, enabling efficient use of infrastructure, improving engagement of societal actors, and enhancing dialogue among knowledge systems.

However, due to GenAI limitations, it could also compromise the integrity, equity, reproducibility, and reliability of research, while also having potential implications for the political economy of research and its infrastructure. Hence, sufficient checks, validation and critical assessments are essential when incorporating GenAI into research workflows.

URL : Open Science at the Generative AI Turn: An Exploratory Analysis of Challenges and Opportunities

DOI : https://doi.org/10.31235/osf.io/zns7g