To share or not to share? Image data sharing in the social sciences and humanities

Authors : Elina Late, Mette Skov, Sanna Kumpulainen

Introduction

The paper investigates image data sharing within the social sciences and humanities. While data sharing is encouraged as a part of the open science movement, little is known about the approaches to, and factors influencing, the sharing of image data.

This information is needed, as the use of image data in these fields of research is increasing and data sharing is context dependent.

Method

The study analyses qualitative semi-structured interviews with 14 scholars who incorporate digital images as a core component of their research data.

Analysis

Content analysis is conducted to gather information about scholars’ image data sharing and the factors that motivate and impede it.

Results

The findings show that image data sharing is not an established research practice, and when it happens it is mostly done via informal means by sharing data through personal contacts. Supporting the scientific community, the open science agenda and fulfilling research funders’ requirements motivate scholars to share their data. Impeding factors relate to the qualities of data, ownership of data, data stewardship, and research integrity.

Conclusion

Advancing image data sharing requires the development of research infrastructures and providing support and guidelines. Better understanding of the scholars’ image data practices is also needed.

URL : To share or not to share? Image data sharing in the social sciences and humanities

DOI : https://doi.org/10.47989/ir292834

Exploring scholarly perceptions of preprint servers

Authors : Shir Aviv-Reuven, Jenny Bronstein, Ariel Rosenfeld

Introduction

Preprint servers play an important role in scholarly communication. The study investigates scholars’ engagement, experiences, and perceptions regarding the use of these servers, both as information sources and publishing venues. This qualitative study seeks to extend our understanding of how these servers operate within the academic ecosystem and influence scholarly communication.

Method

Data was collected through 32 semi-structured interviews with scholars from different disciplines, to explore their engagement, experiences and perceptions in using these platforms.

Analysis

The data collected from these interviews underwent thematic content analysis using ATLAS.ti software. This analysis facilitated the organization and thematic examination of the textual narratives derived from the interviews.

Results

In this study, scholars discussed the perceived benefits of using preprint servers in scholarly work, such as rapid dissemination of information and open access, but also raised concerns regarding the lack of peer review for the studies uploaded to these servers.

Conclusion

These findings emphasize the growing, yet diverse, role preprint servers play in scholarly communication and their differential impact across academic disciplines.

URL : Exploring scholarly perceptions of preprint servers

DOI : https://doi.org/10.47989/ir292820

Where there’s a will there’s a way: ChatGPT is used more for science in countries where it is prohibited

Authors : Honglin Bao, Mengyi Sun, Misha Teplitskiy

Regulating AI has emerged as a key societal challenge, but which methods of regulation are effective is unclear. Here, we measure the effectiveness of restricting AI services geographically using the case of ChatGPT and science. OpenAI prohibits access to ChatGPT from several countries including China and Russia.

If the restrictions are effective, there should be minimal use of ChatGPT in prohibited countries. We measured use by developing a classifier based on prior work showing that early versions of ChatGPT overrepresented distinctive words like “delve.”

We trained the classifier on abstracts before and after ChatGPT “polishing” and validated it on held-out abstracts and on abstracts whose authors self-declared AI use, where it substantially outperformed the off-the-shelf LLM detectors GPTZero and ZeroGPT. Applying the classifier to preprints from arXiv, bioRxiv, and medRxiv reveals that ChatGPT was used in approximately 12.6% of preprints by August 2023 and that use was 7.7% higher in countries without legal access.

Crucially, these patterns appeared before the first major legal LLM became widely available in China, the largest restricted-country preprint producer. ChatGPT use was associated with higher views and downloads, but not citations or journal placement.

Overall, restricting ChatGPT geographically has proven ineffective in science and possibly other domains, likely due to widespread workarounds.

URL : https://arxiv.org/abs/2406.11583

Virtual academic conferencing: a scoping review of 1984-2021 literature. Novel modalities vs. long standing challenges in scholarly communication

Authors : Agnieszka Olechnicka, Adam Ploszaj, Ewa Zegler-Poleska

This study reviews the literature on virtual academic conferences, which have gained significant attention due to the COVID-19 pandemic. We conducted a scoping review, analyzing 147 documents available up to October 5th, 2021.

We categorized this literature, identified main themes, examined theoretical approaches, evaluated empirical findings, and synthesized the advantages and disadvantages of virtual academic conferences. We find that the existing literature on virtual academic conferences is mainly descriptive and lacks a solid theoretical framework for studying the phenomenon.

Despite the rapid growth of the literature documenting and discussing virtual conferencing induced by the pandemic, the understanding of the phenomenon is limited. We provide recommendations for future research on academic virtual conferences: their impact on research productivity, quality, and collaboration; relations to social, economic, and geopolitical inequalities in science; and their environmental aspects.

We stress the need for further research encompassing the development of a theoretical framework that will guide empirical studies.

URL : https://arxiv.org/abs/2406.11583

Open Science at the Generative AI Turn: An Exploratory Analysis of Challenges and Opportunities

Authors : Mohammad Hosseini, Serge P.J.M. Horbach, Kristi L. Holmes, Tony Ross-Hellauer

Technology influences Open Science (OS) practices, because conducting science in transparent, accessible, and participatory ways requires tools/platforms for collaborative research and sharing results. Due to this direct relationship, characteristics of employed technologies directly impact OS objectives. Generative Artificial Intelligence (GenAI) models are increasingly used by researchers for tasks such as text refining, code generation/editing, reviewing literature, and data curation/analysis.

GenAI promises substantial efficiency gains but is currently fraught with limitations that could negatively impact core OS values such as fairness, transparency and integrity, and harm various social actors. In this paper, we explore possible positive and negative impacts of GenAI on OS.

We use the taxonomy within the UNESCO Recommendation on Open Science to systematically explore the intersection of GenAI and OS. We conclude that using GenAI could advance key OS objectives by further broadening meaningful access to knowledge, enabling efficient use of infrastructure, improving engagement of societal actors, and enhancing dialogue among knowledge systems.

However, due to GenAI limitations, it could also compromise the integrity, equity, reproducibility, and reliability of research, while also having potential implications for the political economy of research and its infrastructure. Hence, sufficient checks, validation and critical assessments are essential when incorporating GenAI into research workflows.

URL : Open Science at the Generative AI Turn: An Exploratory Analysis of Challenges and Opportunities

DOI : https://doi.org/10.31235/osf.io/zns7g

Disciplinary Differences and Scholarly Literature: Discovery, Browsing, and Formats

Authors : Chad E. Buckley, Rachel E. Scott, Anne Shelley, Cassie Thayer-Styes, Julie A. Murphy

This study reports faculty experiences regarding the discovery of scholarly content, highlighting similarities and differences across a range of academic disciplines. The authors interviewed twenty-five faculty members at a public, high-research university in the Midwest to explore the intersections of discovery, browsing, and format from diverse disciplinary perspectives.

Although most participants rely on similar discovery tools, such as library catalogs, databases, and Google Scholar, their discovery techniques varied according to the discipline and type of research being done. Browsing is not a standard method for discovery, but it is still done selectively and strategically by some scholars.

Journal articles are the most important format across disciplines, but books, chapters, and conference proceedings are core for some scholars and should be considered when facilitating discovery. The findings detail several ways in which disciplinary and personal experiences shape scholars’ practices.

The authors discuss the perceived disconnect between browsability, discovery, and access of scholarly literature and explore solutions that make the library central to discovery and browsing.

URL : https://ir.library.illinoisstate.edu/fpml/196

Data Science and AI in Context: Summary and Insights

Author : Alfred Spector

This paper explores how to deploy data science and data-driven AI, focusing on the broad collection of considerations beyond those of statistics and machine learning. Building on an analysis rubric introduced in a recent textbook by the author and three others, this paper summarizes some of the book’s key points and adds reflections on AI’s extraordinary growth and societal effects. The paper also discusses how to balance inevitable trade-offs and provides further thoughts on societal implications.

DOI : https://doi.org/10.1162/99608f92.cdebd845