Enabling preprint discovery, evaluation, and analysis with Europe PMC

Authors : Mariia Levchenko, Michael Parkin, Johanna McEntyre, Melissa Harrison

Preprints provide an indispensable tool for rapid and open communication of early research findings. Preprints can also be revised and improved based on scientific commentary uncoupled from journal-organised peer review. The uptake of preprints in the life sciences has increased significantly in recent years, especially during the COVID-19 pandemic, when immediate access to research findings became crucial to address the global health emergency.

With ongoing expansion of new preprint servers, improving discoverability of preprints is a necessary step to facilitate wider sharing of the science reported in preprints. To address the challenges of preprint visibility and reuse, Europe PMC, an open database of life science literature, began indexing preprint abstracts and metadata from several platforms in July 2018. Since then, Europe PMC has continued to increase coverage through addition of new servers, and expanded its preprint initiative to include the full text of preprints related to COVID-19 in July 2020 and then the full text of preprints supported by the Europe PMC funder consortium in April 2022.

The preprint collection can be searched via the website and programmatically, with abstracts and the open access full text of COVID-19 and Europe PMC funder preprint subsets available for bulk download in a standard machine-readable JATS XML format. This enables automated information extraction for large-scale analyses of the preprint corpus, accelerating scientific research of the preprint literature itself.
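As a rough illustration of the programmatic search mentioned above, the sketch below builds a query URL restricted to preprints. The abstract does not specify the endpoint, so everything here is an assumption to check against the current Europe PMC documentation: the base URL, the `SRC:PPR` source filter, and the `query`, `format`, and `pageSize` parameter names. The helper `build_preprint_query` is hypothetical and only constructs the URL; it sends no request.

```python
from urllib.parse import urlencode

# Assumed base URL of the Europe PMC RESTful search service
# (not stated in the abstract; verify before use).
EUROPE_PMC_SEARCH = "https://www.ebi.ac.uk/europepmc/webservices/rest/search"


def build_preprint_query(terms: str, page_size: int = 25) -> str:
    """Return a search URL limited to preprint records.

    Assumes "SRC:PPR" is the source filter that selects preprints.
    """
    query = f"({terms}) AND SRC:PPR"
    params = {"query": query, "format": "json", "pageSize": page_size}
    # urlencode percent-encodes the query so it is safe to place in a URL.
    return f"{EUROPE_PMC_SEARCH}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_preprint_query("covid-19", page_size=5))
```

The resulting URL can be passed to any HTTP client. Keeping URL construction separate from the network call makes the query logic easy to inspect and test offline.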

This publication describes steps taken to build trust, improve discoverability, and support reuse of life science preprints in Europe PMC. Here we discuss the benefits of indexing preprints alongside peer-reviewed publications, and challenges associated with this process.

DOI : https://doi.org/10.1101/2024.04.19.590240

To Open or Not to Open: An Exploration of Faculty Decisions to Publish Open-Access Article

Authors : Jessica Kirschner, Hillary Miller, Preeti Kamat, Jose Alcaine, Sergio Chaparro, Nina Exner


Faculty face numerous pressures as they decide whether to publish articles open access (OA). This pilot study investigated the extent to which School of Education faculty members’ engagement with OA was influenced by promotion and tenure (P&T) and how this influence related to other intrinsic, extrinsic, and contextual factors.


This exploratory, sequential, mixed-method study adapted Social Exchange Theory to understand faculty engagement with OA article publication. The study used a quantitative survey followed by qualitative interviews and focus groups.


Participants reported that P&T had substantive influence over faculty practices regarding OA. Connected factors included beliefs about OA journal quality, colleagues’ perceptions regarding OA, and OA articles’ wider impacts.


P&T was an important driver in article publishing decisions. However, when discussing OA in P&T, faculty also discussed a range of related issues such as OA journal quality. Furthermore, OA adopters tended to be those who have even stronger beliefs about the impact of OA than about OA’s role in P&T.

DOI : https://doi.org/10.31274/jlsc.16894

An analysis of the effects of sharing research data, code, and preprints on citations

Authors : Giovanni Colavizza, Lauren Cadwallader, Marcel LaFlamme, Grégory Dozot, Stéphane Lecorney, Daniel Rappo, Iain Hrynaszkiewicz

Calls to make scientific research more open have gained traction with a range of societal stakeholders. Open Science practices include but are not limited to the early sharing of results via preprints and openly sharing outputs such as data and code to make research more reproducible and extensible. Existing evidence shows that adopting Open Science practices has effects in several domains.

In this study, we investigate whether adopting one or more Open Science practices leads to significantly higher citations for an associated publication, which is one form of academic impact. We use a novel dataset known as Open Science Indicators, produced by PLOS and DataSeer, which includes all PLOS publications from 2018 to 2023 as well as a comparison group sampled from the PMC Open Access Subset. In total, we analyze approximately 122,000 publications. We calculate publication and author-level citation indicators and use a broad set of control variables to isolate the effect of Open Science Indicators on received citations.

We show that Open Science practices are adopted to different degrees across scientific disciplines. We find that the early release of a publication as a preprint correlates with a significant positive citation advantage of about 20.2% on average. We also find that sharing data in an online repository correlates with a smaller yet still positive citation advantage of 4.3% on average.

However, we do not find a significant citation advantage for sharing code. Further research is needed on additional or alternative measures of impact beyond citations. Our results are likely to be of interest to researchers, as well as publishers, research funders, and policymakers.

Arxiv : https://arxiv.org/abs/2404.16171

Text mining arXiv: a look through quantitative finance papers

Author : Michele Leonardo Bianchi

This paper explores articles hosted on the arXiv preprint server with the aim to uncover valuable insights hidden in this vast collection of research. Employing text mining techniques and through the application of natural language processing methods, we examine the contents of quantitative finance papers posted in arXiv from 1997 to 2022.

We extract and analyze crucial information from the entire documents, including the references, to understand topic trends over time and to identify the most cited researchers and journals in this domain. Additionally, we compare numerous algorithms to perform topic modeling, including state-of-the-art approaches.

Arxiv : https://arxiv.org/abs/2401.01751

On the Fast Track to Full Gold Open Access

Author : Robert Kudelić

The world of scientific publishing is changing; the days of an old type of subscription-based earnings for publishers seem over, and we are entering a new era. It seems as if an ever-increasing number of journals from disparate publishers are going Gold, Open Access that is, yet have we rigorously ascertained the issue in its entirety, or are we touting the strengths and forgetting about constructive criticism and careful weighing of evidence?

We will therefore present the current state of the art, in a compact review/bibliometrics style, of this more relevant than ever hot topic and suggest solutions that are most likely to be acceptable to all parties–while the performed analysis also shows there seems to be a link between trends in scientific publishing and tumultuous world events, which in turn has a special significance for the publishing environment in the current world stage.

Arxiv : https://arxiv.org/abs/2311.08313

To preprint or not to preprint: A global researcher survey

Authors : Rong Ni, Ludo Waltman

Open science is receiving widespread attention globally, and preprinting offers an important way to implement open science practices in scholarly publishing. To develop a systematic understanding of researchers’ adoption of and attitudes toward preprinting, we conducted a survey of authors of research papers published in 2021 and early 2022. Our survey results show that the United States and Europe led the way in the adoption of preprinting.

The United States and European respondents reported a higher familiarity with and a stronger commitment to preprinting than their colleagues elsewhere in the world. The adoption of preprinting is much stronger in physics and astronomy as well as mathematics and computer science than in other research areas. Respondents identified free accessibility of preprints and acceleration of research communication as the most important benefits of preprinting.

Low reliability and credibility of preprints, sharing results before peer review and premature media coverage are the most significant concerns about preprinting, emphasized in particular by respondents in the life and health sciences. According to respondents, the most crucial strategies to encourage preprinting are integrating preprinting into journal submission workflows and providing recognition for posting preprints.

DOI : https://doi.org/10.1002/asi.24880