Enabling preprint discovery, evaluation, and analysis with Europe PMC

Authors : Mariia Levchenko, Michael Parkin, Johanna McEntyre, Melissa Harrison

Preprints provide an indispensable tool for rapid and open communication of early research findings. Preprints can also be revised and improved based on scientific commentary uncoupled from journal-organised peer review. The uptake of preprints in the life sciences has increased significantly in recent years, especially during the COVID-19 pandemic, when immediate access to research findings became crucial to address the global health emergency.

With ongoing expansion of new preprint servers, improving discoverability of preprints is a necessary step to facilitate wider sharing of the science reported in preprints. To address the challenges of preprint visibility and reuse, Europe PMC, an open database of life science literature, began indexing preprint abstracts and metadata from several platforms in July 2018. Since then, Europe PMC has continued to increase coverage through addition of new servers, and expanded its preprint initiative to include the full text of preprints related to COVID-19 in July 2020 and then the full text of preprints supported by the Europe PMC funder consortium in April 2022.

The preprint collection can be searched via the website and programmatically, with abstracts and the open access full text of COVID-19 and Europe PMC funder preprint subsets available for bulk download in a standard machine-readable JATS XML format. This enables automated information extraction for large-scale analyses of the preprint corpus, accelerating scientific research of the preprint literature itself.
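The programmatic search described above goes through the Europe PMC RESTful web service, where the query field `SRC:PPR` restricts results to preprint records. A minimal sketch of building such a query (the helper name and defaults are illustrative; the endpoint and parameters should be checked against the Europe PMC API documentation):

```python
# Build a Europe PMC REST search URL restricted to preprints (SRC:PPR).
from urllib.parse import urlencode

EUROPE_PMC_SEARCH = "https://www.ebi.ac.uk/europepmc/webservices/rest/search"

def build_preprint_query(terms: str, page_size: int = 25) -> str:
    """Return a search URL limited to preprint records via the SRC:PPR filter."""
    params = {
        "query": f"({terms}) AND SRC:PPR",  # SRC:PPR = preprint source records
        "format": "json",
        "pageSize": page_size,
    }
    return f"{EUROPE_PMC_SEARCH}?{urlencode(params)}"

print(build_preprint_query("covid-19"))
```

Fetching that URL (e.g. with `requests.get(url).json()`) returns a `resultList` of matching preprint records that can be paged through for bulk analyses.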

This publication describes steps taken to build trust, improve discoverability, and support reuse of life science preprints in Europe PMC. Here we discuss the benefits of indexing preprints alongside peer-reviewed publications, and challenges associated with this process.

DOI : https://doi.org/10.1101/2024.04.19.590240

PubTator 3.0: an AI-powered literature resource for unlocking biomedical knowledge

Authors : Chih-Hsuan Wei, Alexis Allot, Po-Ting Lai, Robert Leaman, Shubo Tian, Ling Luo, Qiao Jin, Zhizheng Wang, Qingyu Chen, Zhiyong Lu

PubTator 3.0 (https://www.ncbi.nlm.nih.gov/research/pubtator3/) is a biomedical literature resource using state-of-the-art AI techniques to offer semantic and relation searches for key concepts like proteins, genetic variants, diseases and chemicals. It currently provides over one billion entity and relation annotations across approximately 36 million PubMed abstracts and 6 million full-text articles from the PMC open access subset, updated weekly.

PubTator 3.0’s online interface and API utilize these precomputed entity relations and synonyms to provide advanced search capabilities and enable large-scale analyses, streamlining many complex information needs. We showcase the retrieval quality of PubTator 3.0 using a series of entity pair queries, demonstrating that PubTator 3.0 retrieves a greater number of articles than either PubMed or Google Scholar, with higher precision in the top 20 results.

We further show that integrating ChatGPT (GPT-4) with PubTator APIs dramatically improves the factuality and verifiability of its responses. In summary, PubTator 3.0 offers a comprehensive set of features and tools that allow researchers to navigate the ever-expanding wealth of biomedical literature, expediting research and unlocking valuable insights for scientific discovery.
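As a rough illustration of consuming the precomputed annotations mentioned above, the sketch below builds a batch export URL for the PubTator 3.0 API. The exact endpoint path and parameter names are assumptions to be verified against the PubTator 3.0 documentation:

```python
# Build a PubTator 3.0 BioC-JSON export URL for a batch of PubMed IDs.
# Endpoint path is an assumption; confirm against the official API docs.
from urllib.parse import urlencode

PUBTATOR3_EXPORT = (
    "https://www.ncbi.nlm.nih.gov/research/pubtator3-api/"
    "publications/export/biocjson"
)

def annotation_url(pmids: list[str]) -> str:
    """Return an export URL requesting entity annotations for the given PMIDs."""
    return f"{PUBTATOR3_EXPORT}?{urlencode({'pmids': ','.join(pmids)})}"

print(annotation_url(["29355051", "34895069"]))
```

The returned BioC-JSON documents carry the entity and relation annotations (genes, diseases, chemicals, variants) computed by the resource, which is what makes large-scale downstream analyses practical.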

DOI : https://doi.org/10.1093/nar/gkae235

A survey of how biology researchers assess credibility when serving on grant and hiring committees

Authors : Iain Hrynaszkiewicz, Beruria Novich, James Harney, Veronique Kiermer

Researchers who serve on grant review and hiring committees have to make decisions about the intrinsic value of research in short periods of time, and research impact metrics such as the Journal Impact Factor (JIF) exert undue influence on these decisions. Initiatives such as the Coalition for Advancing Research Assessment (CoARA) and the Declaration on Research Assessment (DORA) emphasize responsible use of quantitative metrics and avoidance of journal-based impact metrics for research assessment. Further, our previous qualitative research suggested that assessing credibility, or trustworthiness, of research is important to researchers not only when they seek to inform their own research but also in the context of research assessment committees.

To confirm our findings from previous interviews in quantitative terms, we surveyed 485 biology researchers who have served on committees for grant review or hiring and promotion decisions, to understand how they assess the credibility of research outputs in these contexts. We found that concepts like credibility, trustworthiness, quality, and impact lack consistent definitions and interpretations by researchers, as had already been observed in our interviews.

We also found that assessment of credibility is very important to most (81%) researchers serving on these committees, but fewer than half of respondents are satisfied with their ability to assess credibility. A substantial proportion of respondents (57%) report using journal reputation and JIF to assess credibility – proxies that research assessment reformers consider inappropriate because they do not rely on intrinsic characteristics of the research.

This gap between the importance of an assessment and satisfaction in the ability to conduct it was reflected in multiple aspects of credibility we tested, and it was greatest for researchers seeking to assess the integrity of research (such as identifying signs of fabrication, falsification, or plagiarism) and the suitability and completeness of research methods. Non-traditional research outputs associated with Open Science practices – sharing of research data, code, protocols, and preprints – are particularly hard for researchers to assess, despite the potential of Open Science practices to signal trustworthiness.

Our results suggest opportunities to develop better guidance and better signals to support the evaluation of research credibility and trustworthiness – and ultimately support research assessment reform, away from the use of inappropriate proxies for impact and towards assessing the intrinsic characteristics and values researchers see as important.

DOI : https://doi.org/10.31222/osf.io/ht836

Reporting of interventional clinical trial results in a French academic center: a survey of completed studies

Authors : Anne Sophie Alix Doucet, Constant Vinatier, Loïc Fin, Hervé Léna, Hélène Rangé, Clara Locher, Florian Naudet

Background: The dissemination of clinical trial results is an important scientific and ethical endeavour. This survey of completed interventional studies in a French academic center describes their reporting status.

Methods: We explored all interventional studies sponsored by Rennes University Hospital identified on the French Open Science Monitor, which tracks trials registered on EUCTR or clinicaltrials.gov and provides an automatic assessment of the reporting of results. For each study, we ascertained the actual reporting of results using systematic searches on the hospital internal database and bibliographic databases (Google Scholar, PubMed), and by contacting all principal investigators (PIs). We describe several features (including total budget and numbers of trial participants) of the studies that did not report any results.

Results: The French Open Science Monitor identified 93 interventional studies, among which 10 (11%) reported results. In contrast, our survey identified 36 studies (39%) reporting primary analysis results and an additional 18 (19%) reporting results for secondary analyses only (without results for their primary analysis). The overall budget for studies that did not report any results was estimated at €5,051,253, for a total of 6,735 trial participants. The reasons PIs most frequently reported for the absence of results were lack of time (18 PIs, 42%) and logistical difficulties, e.g. a delay in obtaining results or another blocking factor (12 PIs, 28%). An association was found between non-publication and negative results (adjusted odds ratio = 4.70, 95% confidence interval [1.67; 14.11]).

Conclusions: Even allowing for the fact that automatic searches underestimate the number of studies with published results, the level of reporting was disappointingly low. This amounts to a waste of trial participants’ involvement and of money. Corrective actions are needed.

DOI : https://doi.org/10.21203/rs.3.rs-3782467/v1

Does it pay to pay? A comparison of the benefits of open-access publishing across various sub-fields in biology

Authors : Amanda D. Clark, Tanner C. Myers, Todd D. Steury, Ali Krzton et al.

Authors are often faced with the decision of whether to maximize traditional impact metrics or minimize costs when choosing where to publish the results of their research. Many subscription-based journals now offer the option of paying an article processing charge (APC) to make their work open.

Though such “hybrid” journals make research more accessible to readers, their APCs often come with high price tags and can exclude authors who lack the capacity to pay to make their research accessible.

Here, we tested whether paying to publish open access in a subscription-based journal benefited authors by conferring more citations relative to closed access articles. We identified 146,415 articles published in 152 hybrid journals in the field of biology from 2013 to 2018 to compare the number of citations between various types of open access and closed access articles.

In a simple generalized linear model analysis of our full dataset, we found that publishing open access in hybrid journals that offer the option confers an average citation advantage to authors of 17.8 citations compared to closed access articles in similar journals.

After taking into account the number of authors, Journal Citation Reports 2020 Quartile, year of publication, and Web of Science category, we still found that open access generated significantly more citations than closed access (p < 0.0001).

However, the results were complex: the exact differences in citation rates among access types depended on these other variables. The citation advantage held even when comparing open and closed access articles published in the same issue of a journal (p < 0.0001).

However, by examining articles where the authors paid an article processing charge, we found that cost itself was not predictive of citation rates (p = 0.14). Based on our findings of access type and other model parameters, we suggest that, in the case of the 152 journals we analyzed, paying for open access does confer a citation advantage.

For authors with limited budgets, we recommend pursuing open access alternatives that do not require paying a fee as they still yielded more citations than closed access. For authors who are considering where to submit their next article, we offer additional suggestions on how to balance exposure via citations with publishing costs.

DOI : https://doi.org/10.7717/peerj.16824

The Nexus of Open Science and Innovation: Insights from Patent Citations

Author : Abdelghani Maddi

This paper aims to analyze the extent to which inventive activity relies on open science. In other words, it investigates whether inventors utilize Open Access (OA) publications more than subscription-based ones, especially given that some inventors may lack institutional access.

To achieve this, we utilized the Marx (2023) database, which contains citations from patents to scientific publications (Non-Patent References, NPRs). We focused on publications closely related to invention, specifically those cited solely by inventors within the body of patent texts. Our dataset was supplemented by OpenAlex data.

The final sample comprised 961,104 publications cited in patents, of which 861,720 had a DOI. Results indicate that across all disciplines, OA publications are 38% more prevalent in patent citations (NPRs) than in the overall OpenAlex database.

In biology and medicine, inventors use 73% and 27% more OA publications, respectively, compared to closed-access ones. Chemistry and computer science are also disciplines where OA publications are more frequently utilized in patent contexts than subscription-based ones.

HAL : https://cnrs.hal.science/hal-04454843

Is gold open access helpful for academic purification? A causal inference analysis based on retracted articles in biochemistry

Authors : Er-Te Zheng, Zhichao Fang, Hui-Zhen Fu

The relationship between transparency and credibility has long been a subject of theoretical and analytical exploration within the realm of social sciences, and it has recently attracted increasing attention in the context of scientific research. Retraction serves as a pivotal mechanism in addressing concerns about research integrity.

This study aims to empirically examine the relationship between open access level and the effectiveness of the current purification mechanism, specifically academic purification centered on retracted articles. We used matching and Difference-in-Differences (DiD) methods to examine whether gold open access is helpful for academic purification in the biochemistry field.

We collected gold open access (Gold OA) and non-open access (non-OA) retracted biochemistry articles from 2005 to 2021 as the treatment group, and matched them with corresponding unretracted articles as the control group, based on the Web of Science and Retraction Watch databases.

The results showed that compared to non-OA, Gold OA is advantageous in reducing the retraction time of flawed articles, but does not demonstrate a significant advantage in reducing citations after retraction. This indicates that Gold OA may help expedite the detection and retraction of flawed articles, ultimately promoting the practice of responsible research.
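The DiD logic described above compares the before/after change in an outcome (e.g. yearly citations around retraction) for treated articles against the same change for matched controls. A minimal sketch with made-up numbers (all values are illustrative, not from the study):

```python
# Minimal difference-in-differences estimator: the treated group's pre/post
# change minus the control group's pre/post change.
def did_estimate(treat_pre, treat_post, ctrl_pre, ctrl_post):
    """Return (mean treated change) - (mean control change)."""
    mean = lambda xs: sum(xs) / len(xs)
    return (mean(treat_post) - mean(treat_pre)) - (mean(ctrl_post) - mean(ctrl_pre))

# Hypothetical yearly citation counts: retracted Gold OA articles (treated)
# vs matched unretracted articles (control), before and after retraction.
effect = did_estimate(
    treat_pre=[10, 12, 11], treat_post=[4, 5, 3],
    ctrl_pre=[9, 11, 10], ctrl_post=[10, 12, 11],
)
print(effect)  # -8.0: treated citations fell 7 while controls rose 1
```

In the actual study the design is richer (matched pairs, covariates, parallel-trends checks), but the core contrast is this double difference.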

DOI : https://doi.org/10.1016/j.ipm.2023.103640