An open dataset of article processing charges from six large scholarly publishers

Authors : Leigh-Ann Butler, Madelaine Hare, Nina Schönfelder, Eric Schares, Juan Pablo Alperin, Stefanie Haustein

This paper introduces a dataset of article processing charges (APCs) produced from the price lists of six large scholarly publishers – Elsevier, Frontiers, PLOS, MDPI, Springer Nature and Wiley – between 2019 and 2023.

APC price lists were downloaded from publisher websites each year as well as via Wayback Machine snapshots to retrieve fees per journal per year. The dataset includes journal metadata, APC collection method, and annual APC price list information in several currencies (USD, EUR, GBP, CHF, JPY, CAD) for 8,712 unique journals and 36,618 journal-year combinations.
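As a rough illustration of how a long-format price list relates to the two counts above (unique journals versus journal-year combinations), here is a minimal sketch. The field names (`publisher`, `journal`, `year`, `apc_usd`) and the sample rows are hypothetical and are not taken from the dataset itself; the published files may be organized differently.

```python
from collections import defaultdict

# Made-up rows for illustration only; field names are assumptions,
# not the dataset's actual schema.
rows = [
    {"publisher": "Publisher X", "journal": "Journal A", "year": 2019, "apc_usd": 1500},
    {"publisher": "Publisher X", "journal": "Journal A", "year": 2020, "apc_usd": 1600},
    {"publisher": "Publisher Y", "journal": "Journal B", "year": 2020, "apc_usd": 2000},
]


def summarize(rows):
    """Return (unique journals, unique journal-year combinations)."""
    journals = {(r["publisher"], r["journal"]) for r in rows}
    journal_years = {(r["publisher"], r["journal"], r["year"]) for r in rows}
    return len(journals), len(journal_years)


def price_history(rows):
    """Map each (publisher, journal) to a {year: list price} dictionary."""
    history = defaultdict(dict)
    for r in rows:
        history[(r["publisher"], r["journal"])][r["year"]] = r["apc_usd"]
    return dict(history)


if __name__ == "__main__":
    n_journals, n_journal_years = summarize(rows)
    print(n_journals, "journals,", n_journal_years, "journal-year combinations")
```

A journal that appears in several annual price lists counts once as a journal and once per year as a journal-year combination, which is why the second count is larger than the first.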

The dataset was generated to allow for more precise analysis of APCs and can support library collection development and scientometric analysis estimating APCs paid in gold and hybrid OA journals.

Arxiv : https://arxiv.org/abs/2406.08356

An analysis of the effects of sharing research data, code, and preprints on citations

Authors : Giovanni Colavizza, Lauren Cadwallader, Marcel LaFlamme, Grégory Dozot, Stéphane Lecorney, Daniel Rappo, Iain Hrynaszkiewicz

Calls to make scientific research more open have gained traction with a range of societal stakeholders. Open Science practices include but are not limited to the early sharing of results via preprints and openly sharing outputs such as data and code to make research more reproducible and extensible. Existing evidence shows that adopting Open Science practices has effects in several domains.

In this study, we investigate whether adopting one or more Open Science practices leads to significantly higher citations for an associated publication, which is one form of academic impact. We use a novel dataset known as Open Science Indicators, produced by PLOS and DataSeer, which includes all PLOS publications from 2018 to 2023 as well as a comparison group sampled from the PMC Open Access Subset. In total, we analyze approximately 122,000 publications. We calculate publication and author-level citation indicators and use a broad set of control variables to isolate the effect of Open Science Indicators on received citations.

We show that Open Science practices are adopted to different degrees across scientific disciplines. We find that the early release of a publication as a preprint correlates with a significant positive citation advantage of about 20.2% on average. We also find that sharing data in an online repository correlates with a smaller yet still positive citation advantage of 4.3% on average.

However, we do not find a significant citation advantage for sharing code. Further research is needed on additional or alternative measures of impact beyond citations. Our results are likely to be of interest to researchers, as well as publishers, research funders, and policymakers.

Arxiv : https://arxiv.org/abs/2404.16171

Preprint citation practice in PLOS

Authors : Marc Bertin, Iana Atanassova

The role of preprints in scientific production and their share of citations have been growing over the past 10 years. In this paper we study several aspects of preprint citations: their progression over time, their relative frequencies in relation to the IMRaD structure of articles, and their distributions over time, per preprint database and per PLOS journal.

We processed the PLOS corpus, which covers 7 journals and about 240,000 articles up to January 2021, and produced a dataset of 8,460 preprint citation contexts that cite 12 different preprint databases.

Our results show that preprint citations are found with the highest frequency in the Method section of articles, though small variations exist between journals. The PLOS Computational Biology journal stands out, as it contains more than three times as many preprint citations as any other PLOS journal.

The relative shares of the different preprint databases are also examined. While arXiv and bioRxiv are the most frequent citation sources, bioRxiv’s disciplinary nature is apparent: it is the source of more than 70% of preprint citations in PLOS Biology, PLOS Genetics and PLOS Pathogens.

We have also compared the lexical content of preprint citation contexts with that of citation contexts for peer-reviewed publications. A lexicometric analysis shows that the two differ significantly.

This confirms that authors use different lexical content when citing preprints than when citing other sources.

DOI : https://doi.org/10.1007/s11192-022-04388-5

Investigating the division of scientific labor using the Contributor Roles Taxonomy (CRediT)

Authors : Vincent Larivière, David Pontille, Cassidy R. Sugimoto

Contributorship statements were introduced by scholarly journals in the late 1990s to provide more details on the specific contributions made by authors to research papers.

After more than a decade of idiosyncratic taxonomies by journals, a partnership between medical journals and standards organizations has led to the establishment, in 2015, of the Contributor Roles Taxonomy (CRediT), which provides a standardized set of 14 research contributions.

Using the data from Public Library of Science (PLOS) journals over the 2017–2018 period (N = 30,054 papers), this paper analyzes how research contributions are divided across research teams, focusing on the association between division of labor and number of authors, and authors’ position and specific contributions.

It also assesses whether some contributions are more likely to be performed in conjunction with others and examines how the new taxonomy provides greater insight into the gendered nature of labor division. The paper concludes with a discussion of results with respect to current issues in research evaluation, science policy, and responsible research practices.

DOI : https://doi.org/10.1162/qss_a_00097

A survey of researchers’ needs and priorities for data sharing

Authors : Iain Hrynaszkiewicz, James Harney, Lauren Cadwallader

PLOS has long supported Open Science. One of the ways in which we do so is via our stringent data availability policy established in 2014. Despite this policy, and more data sharing policies being introduced by other organizations, best practices for data sharing are adopted by a minority of researchers in their publications. Problems with effective research data sharing persist and these problems have been quantified by previous research as a lack of time, resources, incentives, and/or skills to share data.

In this study we built on this research by investigating the importance of tasks associated with data sharing, and researchers’ satisfaction with their ability to complete these tasks. By investigating these factors we aimed to better understand opportunities for new or improved solutions for sharing data.

In May and June 2020 we surveyed researchers from Europe and North America, asking them to rate tasks associated with data sharing on (i) their importance and (ii) their satisfaction with their ability to complete them. We received 728 completed and 667 partial responses. We calculated mean importance and satisfaction scores to highlight potential opportunities for new solutions and to compare different cohorts.
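To make the scoring step concrete, the following is a minimal sketch of computing per-task mean importance and satisfaction from survey responses. The record layout, the task names, the numeric rating scale, and the choice to skip unanswered items (as one way of handling partial responses) are all assumptions made for illustration; the abstract does not describe how the authors processed their data.

```python
from statistics import mean

# Made-up ratings for illustration only; None marks an item left
# unanswered in a partial response (an assumed representation).
responses = [
    {"task": "Share data in a repository", "importance": 4, "satisfaction": 2},
    {"task": "Share data in a repository", "importance": 5, "satisfaction": None},
    {"task": "Obtain data for reuse", "importance": 3, "satisfaction": 1},
    {"task": "Obtain data for reuse", "importance": 5, "satisfaction": 3},
]


def mean_scores(responses):
    """Return {task: (mean importance, mean satisfaction)}, skipping blanks."""
    by_task = {}
    for r in responses:
        scores = by_task.setdefault(r["task"], {"importance": [], "satisfaction": []})
        for key in ("importance", "satisfaction"):
            if r[key] is not None:
                scores[key].append(r[key])
    return {
        task: (mean(s["importance"]), mean(s["satisfaction"]))
        for task, s in by_task.items()
    }


if __name__ == "__main__":
    for task, (importance, satisfaction) in mean_scores(responses).items():
        print(f"{task}: importance={importance}, satisfaction={satisfaction}")
```

Tasks that score high on importance and low on satisfaction would be the natural candidates for new solutions; the same function could be run separately on each cohort to compare them.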

Tasks relating to research impact, funder compliance, and credit had the highest importance scores. 52% of respondents reuse research data but the average satisfaction score for obtaining data for reuse was relatively low. Tasks associated with sharing data were rated somewhat important and respondents were reasonably well satisfied in their ability to accomplish them. Notably, this included tasks associated with best data sharing practice, such as use of data repositories. However, the most common method for sharing data was in fact via supplemental files with articles, which is not considered to be best practice.

We presume that researchers are unlikely to seek new solutions to a problem or task that they are satisfied in their ability to accomplish, even if many do not attempt this task. This implies there are few opportunities for new solutions or tools to meet these researcher needs. Publishers can likely meet these needs for data sharing by working to seamlessly integrate existing solutions that reduce the effort or behaviour change involved in some tasks, and focusing on advocacy and education around the benefits of sharing data.

There may however be opportunities – unmet researcher needs – in relation to better supporting data reuse, which could be met in part by strengthening data sharing policies of journals and publishers, and improving the discoverability of data associated with published articles.

DOI : https://doi.org/10.31219/osf.io/njr5u

“No comment”?: A study of commenting on PLOS articles

Authors : Simon Wakeling, Peter Willett, Claire Creaser, Jenny Fry, Stephen Pinfield, Valerie Spezi, Marc Bonne, Christina Founti, Itzelle Medina Perea

Article commenting functionality allows users to add publicly visible comments to an article on a publisher’s website. As well as facilitating forms of post-publication peer review, for publishers of open-access mega-journals (large, broad scope, OA journals that seek to publish all technically or scientifically sound research) comments are also thought to serve as a means for the community to discuss and communicate the significance and novelty of the research, factors which are not assessed during peer review.

In this paper we present the results of an analysis of commenting on articles published by the Public Library of Science (PLOS), publisher of the first and best-known mega-journal PLOS ONE, between 2003 and 2016.

We find that while overall commenting rates are low, and have declined since 2010, there is substantial variation across different PLOS titles. Using a typology of comments developed for this research we also find that only around half of comments engage in an academic discussion of the article, and that these discussions are most likely to focus on the paper’s technical soundness.

Our results suggest that publishers have yet to encourage significant numbers of readers to leave comments, with implications for the effectiveness of commenting as a means of collecting and communicating community perceptions of an article’s importance.

DOI : https://doi.org/10.1177/0165551518819965

If funders and libraries subscribed to open access: The case of eLife, PLOS, and BioOne

Authors : John Willinsky, Matthew Rusk

Following on recent initiatives in which funders and libraries directly fund open access publishing, this study works out the economics of systematically applying this approach to three biomedical and biology publishing entities by determining the publishing costs for the funders that sponsored the research, while assigning the costs for unsponsored articles to the libraries.

The study draws its data from the nonprofit biomedical publishers eLife and PLOS, and the nonprofit journal aggregator BioOne, with this sample representing a mix of publishing revenue models, including funder sponsorship, article processing charges (APC), and subscription fees.

This funder-library open access subscription model is proposed as an alternative to both the closed-subscription model, which funders and libraries no longer favor, and the APC open access model, which has limited scalability across scholarly publishing domains.

Using PubMed filtering and manual-sampling strategies, as well as publicly available publisher revenue data, the study demonstrates that in 2015, 86 percent of the articles in eLife and PLOS acknowledged funder support, as did 76 percent of the articles in the largely subscription journals of BioOne. Twelve percent of the articles identified the NIH as a funder, and 8 percent identified other U.S. government agencies.

Approximately half of the articles were funded by non-U.S. government agencies, including 1 percent by Wellcome Trust and 0.5 percent by Howard Hughes Medical Institute. For 17 percent of the articles, which lacked a funder, the study demonstrates how a collection of research libraries, similar to the one currently subscribing to BioOne, could cover publishing costs.

The goal of the study is to inform stakeholder considerations of open access models that can work across the disciplines by (a) providing a cost breakdown for direct funder and library support for open access publishing; (b) positing the use of publishing data-management organizations (such as Crossref and ORCID) to facilitate per article open access support; and (c) proposing ways in which such a model offers a more efficient, equitable, and scalable approach to open access than the prevailing APC model, which originated with biomedical publishing.

DOI : https://doi.org/10.7287/peerj.preprints.3392v1