An analysis of the effects of sharing research data, code, and preprints on citations

Authors : Giovanni Colavizza, Lauren Cadwallader, Marcel LaFlamme, Grégory Dozot, Stéphane Lecorney, Daniel Rappo, Iain Hrynaszkiewicz

Calls to make scientific research more open have gained traction with a range of societal stakeholders. Open Science practices include but are not limited to the early sharing of results via preprints and openly sharing outputs such as data and code to make research more reproducible and extensible. Existing evidence shows that adopting Open Science practices has effects in several domains.

In this study, we investigate whether adopting one or more Open Science practices leads to significantly higher citations for an associated publication, which is one form of academic impact. We use a novel dataset known as Open Science Indicators, produced by PLOS and DataSeer, which includes all PLOS publications from 2018 to 2023 as well as a comparison group sampled from the PMC Open Access Subset. In total, we analyze approximately 122,000 publications. We calculate publication and author-level citation indicators and use a broad set of control variables to isolate the effect of Open Science Indicators on received citations.

We show that Open Science practices are adopted to different degrees across scientific disciplines. We find that the early release of a publication as a preprint correlates with a significant positive citation advantage of about 20.2% on average. We also find that sharing data in an online repository correlates with a smaller yet still positive citation advantage of 4.3% on average.

However, we do not find a significant citation advantage for sharing code. Further research is needed on additional or alternative measures of impact beyond citations. Our results are likely to be of interest to researchers, as well as publishers, research funders, and policymakers.
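As a rough illustration of how percentage citation advantages like those reported above can be read, the sketch below assumes a log-linear regression in which the outcome is the logarithm of citations. That model form, the helper names `pct_advantage` and `coefficient_for`, and the back-computed coefficients are assumptions for illustration, not the authors' specification. Under that assumption, a coefficient beta on a binary indicator corresponds to a relative change of exp(beta) - 1.

```python
import math

def pct_advantage(beta: float) -> float:
    """Percentage change in citations implied by a log-linear coefficient."""
    return (math.exp(beta) - 1.0) * 100.0

def coefficient_for(pct: float) -> float:
    """Inverse mapping: the coefficient that would imply a given percentage."""
    return math.log(1.0 + pct / 100.0)

# Percentages are the averages reported in the abstract; the coefficients
# are back-computed here purely for illustration (hypothetical values,
# not taken from the paper).
for name, pct in [("preprint", 20.2), ("data in repository", 4.3)]:
    beta = coefficient_for(pct)
    print(f"{name}: beta ~ {beta:.4f} -> {pct_advantage(beta):.1f}%")
```

For small coefficients the two quantities are close (a beta of 0.04 is roughly 4%), but they diverge as the effect grows, which is why the exponential conversion matters for the larger preprint estimate.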

Arxiv : https://arxiv.org/abs/2404.16171

Between Flat-Earthers and Fitness Coaches: Who is Citing Scientific Publications in YouTube Video Descriptions?

Authors : Olga Zagovora, Katrin Weller

In this study, we undertake an extensive analysis of YouTube channels that reference research publications in their video descriptions, offering a unique insight into the intersection of digital media and academia. Our investigation focuses on three principal aspects: the background of YouTube channel owners, their thematic focus, and the nature of their operational dynamics, specifically addressing whether they work individually or in groups. Our results highlight a strong emphasis on content related to science and engineering, as well as health, particularly in channels managed by individual researchers and academic institutions.

However, there is a notable variation in the popularity of these channels, with professional YouTubers and commercial media entities often outperforming others on viewer engagement metrics such as likes, comments, and views. This underscores the challenge academic channels face in attracting a wider audience. Further, we explore the role of academic actors on YouTube, scrutinizing their impact in disseminating research and the types of publications they reference.

Despite a general inclination towards professional academic topics, these channels varied in how effectively they spotlighted highly cited research. Often, they referenced a wide array of publications, indicating a diverse but not necessarily impact-focused approach to content selection.

Arxiv : https://arxiv.org/abs/2404.15083

Rapport d’Enquête Création d’une revue d’articles sur des jeux de données Data Journal SHS

Authors : Laurence Bizien, Véronique Cohoner, Fiona Edmond, Arnaud Natal, Pierre Peraldi-Mittelette

This survey was conducted as part of a project to create an interdisciplinary data journal in the Humanities and Social Sciences by 2025. The working group (GT) behind this project was formed following the study day organized by the Maison des Sciences de l’Homme Lorraine on 10 March 2023, entitled "An interdisciplinary data journal for the humanities and social sciences. Scientific issues and practical implementation".

HAL : https://hal.univ-lorraine.fr/hal-04541094

The role of non-scientific factors vis-a-vis the quality of publications in determining their scholarly impact

Authors : Giovanni Abramo, Ciriaco Andrea D’Angelo, Leonardo Grilli

In the evaluation of scientific publications’ impact, the interplay between intrinsic quality and non-scientific factors remains a subject of debate. While peer review traditionally assesses quality, bibliometric techniques gauge scholarly impact. This study investigates the role of non-scientific attributes alongside quality scores from peer review in determining scholarly impact.

Leveraging data from the first Italian Research Assessment Exercise (VTR 2001-2003) and Web of Science citations, we analyse the relationship between quality scores, non-scientific factors, and publications’ short- and long-term impact.

Our findings shed light on the significance of non-scientific elements overlooked in peer review, offering policymakers and research managers insights for choosing evaluation methodologies. Sections delve into the debate, identify non-scientific influences, detail methodologies, present results, and discuss implications.

Arxiv : https://arxiv.org/abs/2404.05345

Sentiment Analysis of Citations in Scientific Articles Using ChatGPT: Identifying Potential Biases and Conflicts of Interest

Author : Walid Hariri

Scientific articles play a crucial role in advancing knowledge and informing research directions. One key aspect of evaluating scientific articles is the analysis of citations, which provides insights into the impact and reception of the cited works. This article introduces the innovative use of large language models, particularly ChatGPT, for comprehensive sentiment analysis of citations within scientific articles.

By leveraging advanced natural language processing (NLP) techniques, ChatGPT can discern the nuanced positivity or negativity of citations, offering insights into the reception and impact of cited works. Furthermore, ChatGPT’s capabilities extend to detecting potential biases and conflicts of interest in citations, enhancing the objectivity and reliability of scientific literature evaluation.

This study showcases the transformative potential of artificial intelligence (AI)-powered tools in enhancing citation analysis and promoting integrity in scholarly research.

Arxiv : https://arxiv.org/abs/2404.01800

Text mining arXiv: a look through quantitative finance papers

Author : Michele Leonardo Bianchi

This paper explores articles hosted on the arXiv preprint server with the aim of uncovering valuable insights hidden in this vast collection of research. Employing text mining techniques and natural language processing methods, we examine the contents of quantitative finance papers posted on arXiv from 1997 to 2022.

We extract and analyze crucial information from the full documents, including the references, to understand topic trends over time and to identify the most cited researchers and journals in this domain. Additionally, we compare numerous algorithms for topic modeling, including state-of-the-art approaches.
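One simple way to picture an analysis of topic trends over time is to count, per year, how many documents mention each term of interest. The sketch below does only that: the toy corpus, the tracked terms, and the helper `term_trends` are hypothetical, and plain keyword counting stands in for the topic-modeling algorithms the paper compares.

```python
import re
from collections import Counter, defaultdict

def term_trends(docs, terms):
    """Count, per year, how many documents mention each tracked term."""
    trends = defaultdict(Counter)
    for year, text in docs:
        # Lowercase word tokens; a set so each document counts once per term.
        tokens = set(re.findall(r"[a-z]+", text.lower()))
        for term in terms:
            if term in tokens:
                trends[year][term] += 1
    return trends

# Hypothetical (year, title) pairs, invented for illustration only.
docs = [
    (1998, "Option pricing under stochastic volatility"),
    (2008, "Credit risk and default correlation"),
    (2021, "Deep learning for option pricing and hedging"),
    (2021, "Reinforcement learning for portfolio risk"),
]
trends = term_trends(docs, ["option", "risk", "learning"])
for year in sorted(trends):
    print(year, dict(trends[year]))
```

A real pipeline would work on full texts and learned topics instead of a fixed term list, but the per-year aggregation step would look much the same.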

Arxiv : https://arxiv.org/abs/2401.01751