An open dataset of article processing charges from six large scholarly publishers

Authors : Leigh-Ann Butler, Madelaine Hare, Nina Schönfelder, Eric Schares, Juan Pablo Alperin, Stefanie Haustein

This paper introduces a dataset of article processing charges (APCs) produced from the price lists of six large scholarly publishers – Elsevier, Frontiers, PLOS, MDPI, Springer Nature and Wiley – between 2019 and 2023.

APC price lists were downloaded from publisher websites each year as well as via Wayback Machine snapshots to retrieve fees per journal per year. The dataset includes journal metadata, APC collection method, and annual APC price list information in several currencies (USD, EUR, GBP, CHF, JPY, CAD) for 8,712 unique journals and 36,618 journal-year combinations.

The dataset was generated to allow for more precise analysis of APCs and can support library collection development and scientometric analysis estimating APCs paid in gold and hybrid OA journals.
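As a rough illustration of the journal-level and journal-year counts mentioned above, the sketch below tallies both from a small table and averages fees per year. The row layout (ISSN, publisher, year, APC in USD) and every value in it are invented for illustration; they are assumptions, not the dataset's actual schema or prices.

```python
from collections import defaultdict

# Hypothetical rows: (issn, publisher, year, apc_usd).
# All identifiers and prices are invented; the real dataset's columns may differ.
rows = [
    ("0000-0001", "Publisher A", 2019, 1500.0),
    ("0000-0001", "Publisher A", 2020, 1600.0),
    ("0000-0002", "Publisher B", 2019, 2000.0),
    ("0000-0002", "Publisher B", 2020, 2000.0),
    ("0000-0003", "Publisher B", 2020, 3000.0),
]


def summarize(rows):
    """Return (unique journals, journal-year combinations, mean APC per year)."""
    journals = {issn for issn, _, _, _ in rows}
    journal_years = {(issn, year) for issn, _, year, _ in rows}

    # Group listed prices by year, then average them.
    by_year = defaultdict(list)
    for _, _, year, apc in rows:
        by_year[year].append(apc)
    mean_by_year = {year: sum(v) / len(v) for year, v in sorted(by_year.items())}

    return len(journals), len(journal_years), mean_by_year


n_journals, n_journal_years, mean_by_year = summarize(rows)
print(n_journals, n_journal_years, mean_by_year)
```

A journal that appears in several years counts once as a journal but once per year as a journal-year combination, which is why the second figure in the abstract is larger than the first.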

URL : An open dataset of article processing charges from six large scholarly publishers

To share or not to share? Image data sharing in the social sciences and humanities

Authors : Elina Late, Mette Skov, Sanna Kumpulainen


The paper aims to investigate image data sharing within social science and humanities. While data sharing is encouraged as a part of the open science movement, little is known about the approaches and factors influencing the sharing of image data.

Such knowledge is needed because the use of image data in these fields of research is increasing, and data sharing is context dependent.


The study analyses qualitative semi-structured interviews with 14 scholars who incorporate digital images as a core component of their research data.


Content analysis is conducted to gather information about scholars’ image data sharing and the factors that motivate or impede it.


The findings show that image data sharing is not an established research practice, and when it happens it is mostly done via informal means by sharing data through personal contacts. Supporting the scientific community, the open science agenda and fulfilling research funders’ requirements motivate scholars to share their data. Impeding factors relate to the qualities of data, ownership of data, data stewardship, and research integrity.


Advancing image data sharing requires the development of research infrastructures and providing support and guidelines. Better understanding of the scholars’ image data practices is also needed.

URL : To share or not to share? Image data sharing in the social sciences and humanities


Exploring scholarly perceptions of preprint servers

Authors : Shir Aviv-Reuven, Jenny Bronstein, Ariel Rosenfeld


Preprint servers play an important role in scholarly communication. The study investigates scholars’ engagement, experiences, and perceptions regarding the use of these servers, both as information sources and publishing venues. This qualitative study seeks to extend our understanding of how these servers operate within the academic ecosystem and influence scholarly communication.


Data was collected through 32 semi-structured interviews with scholars from different disciplines, to explore their engagement, experiences and perceptions in using these platforms.


The data collected from these interviews underwent thematic content analysis using ATLAS.ti software. This analysis facilitated the organization and thematic examination of the textual narratives derived from the interviews.


In this study, scholars discussed their perceptions about the benefits of using preprint servers in scholarly work such as rapid dissemination of information and open access, but also raised concerns regarding the lack of peer review for the studies uploaded to these servers.


These findings emphasize the growing, yet diverse, role preprint servers play in scholarly communication and their differential impact across academic disciplines.

URL : Exploring scholarly perceptions of preprint servers


Where there’s a will there’s a way: ChatGPT is used more for science in countries where it is prohibited

Authors : Honglin Bao, Mengyi Sun, Misha Teplitskiy

Regulating AI has emerged as a key societal challenge, but which methods of regulation are effective is unclear. Here, we measure the effectiveness of restricting AI services geographically using the case of ChatGPT and science. OpenAI prohibits access to ChatGPT from several countries including China and Russia.

If the restrictions are effective, there should be minimal use of ChatGPT in prohibited countries. We measured use by developing a classifier based on prior work showing that early versions of ChatGPT overrepresented distinctive words like “delve.”

We trained the classifier on abstracts before and after ChatGPT “polishing” and validated it on held-out abstracts and on abstracts whose authors self-declared AI use, where it substantially outperformed the off-the-shelf LLM detectors GPTZero and ZeroGPT. Applying the classifier to preprints from arXiv, bioRxiv, and medRxiv reveals that ChatGPT was used in approximately 12.6% of preprints by August 2023 and use was 7.7% higher in countries without legal access.

Crucially, these patterns appeared before the first major legal LLM became widely available in China, the largest restricted-country preprint producer. ChatGPT use was associated with higher views and downloads, but not citations or journal placement.

Overall, restricting ChatGPT geographically has proven ineffective in science and possibly other domains, likely due to widespread workarounds.
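The idea of detecting use through overrepresented words can be sketched very roughly as follows. This is not the authors' classifier, which was trained on abstracts before and after polishing; it is only a fixed-threshold heuristic. Only “delve” comes from the abstract; the other marker words, the function names, and the threshold are hypothetical.

```python
import re

# Assumed marker list: only "delve" is named in the abstract,
# the other entries are hypothetical placeholders.
MARKER_WORDS = {"delve", "intricate", "showcase"}


def marker_rate(text, markers=MARKER_WORDS):
    """Share of word tokens in `text` that exactly match a marker word."""
    tokens = re.findall(r"[a-z]+", text.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in markers)
    return hits / len(tokens)


def flag(text, threshold=0.02):
    """Flag a text whose marker rate reaches the (arbitrary) threshold."""
    return marker_rate(text) >= threshold


print(marker_rate("We delve into the data"))
print(flag("We measure the effect of policy"))
```

Matching is exact, so inflected forms such as “delves” are not counted here. A trained model would learn word weights from labelled examples rather than rely on a hand-picked list and cutoff.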


Virtual academic conferencing: a scoping review of 1984-2021 literature. Novel modalities vs. long standing challenges in scholarly communication

Authors : Agnieszka Olechnicka, Adam Ploszaj, Ewa Zegler-Poleska

This study reviews the literature on virtual academic conferences, which have gained significant attention due to the COVID-19 pandemic. We conducted a scoping review, analyzing 147 documents available up to October 5th, 2021.

We categorized this literature, identified main themes, examined theoretical approaches, evaluated empirical findings, and synthesized the advantages and disadvantages of virtual academic conferences. We find that the existing literature on virtual academic conferences is mainly descriptive and lacks a solid theoretical framework for studying the phenomenon.

Despite the rapid growth of the literature documenting and discussing virtual conferencing induced by the pandemic, the understanding of the phenomenon is limited. We provide recommendations for future research on academic virtual conferences: their impact on research productivity, quality, and collaboration; relations to social, economic, and geopolitical inequalities in science; and their environmental aspects.

We stress the need for further research encompassing the development of a theoretical framework that will guide empirical studies.


Open Science at the Generative AI Turn: An Exploratory Analysis of Challenges and Opportunities

Authors : Mohammad Hosseini, Serge P.J.M. Horbach, Kristi L. Holmes, Tony Ross-Hellauer

Technology influences Open Science (OS) practices, because conducting science in transparent, accessible, and participatory ways requires tools/platforms for collaborative research and sharing results. Due to this direct relationship, characteristics of employed technologies directly impact OS objectives. Generative Artificial Intelligence (GenAI) models are increasingly used by researchers for tasks such as text refining, code generation/editing, reviewing literature, and data curation/analysis.

GenAI promises substantial efficiency gains but is currently fraught with limitations that could negatively impact core OS values such as fairness, transparency and integrity, and harm various social actors. In this paper, we explore possible positive and negative impacts of GenAI on OS.

We use the taxonomy within the UNESCO Recommendation on Open Science to systematically explore the intersection of GenAI and OS. We conclude that using GenAI could advance key OS objectives by further broadening meaningful access to knowledge, enabling efficient use of infrastructure, improving engagement of societal actors, and enhancing dialogue among knowledge systems.

However, due to GenAI limitations, it could also compromise the integrity, equity, reproducibility, and reliability of research, while also having potential implications for the political economy of research and its infrastructure. Hence, sufficient checks, validation and critical assessments are essential when incorporating GenAI into research workflows.

URL : Open Science at the Generative AI Turn: An Exploratory Analysis of Challenges and Opportunities


Disciplinary Differences and Scholarly Literature: Discovery, Browsing, and Formats

Authors : Chad E. Buckley, Rachel E. Scott, Anne Shelley, Cassie Thayer-Styes, Julie A. Murphy

This study reports faculty experiences regarding the discovery of scholarly content, highlighting similarities and differences across a range of academic disciplines. The authors interviewed twenty-five faculty members at a public, high-research university in the Midwest to explore the intersections of discovery, browsing, and format from diverse disciplinary perspectives.

Although most participants rely on similar discovery tools such as library catalogs, databases, and Google Scholar, their discovery techniques varied according to the discipline and type of research being done. Browsing is not a standard method for discovery, but it is still done selectively and strategically by some scholars.

Journal articles are the most important format across disciplines, but books, chapters, and conference proceedings are core for some scholars and should be considered when facilitating discovery. The findings detail several ways in which disciplinary and personal experiences shape scholars’ practices.

The authors discuss the perceived disconnect between browsability, discovery, and access of scholarly literature and explore solutions that make the library central to discovery and browsing.