Entre loi et modèles : variations autour des concepts Zipfiens (Between Law and Models: Variations on Zipfian Concepts)

Authors : Marc Bertin, Thierry Lafouge

Zipf's law concerns regularities observed across different fields of knowledge. The regularity highlighted here is the frequency of words in a text, a topic historically rooted in linguistic engineering. We present the historical models through a common mathematical formalization, in order to make the models proposed in the literature more intelligible and to discuss the controversy between Mandelbrot and Simon.

We examine the nature and the resilience of the law through a bibliometric and lexicographic discussion. Drawing on Kendall's position, the conclusion situates Zipf's law in relation to the social sciences and humanities (SHS).
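
As an illustrative aside that is not taken from the article, the word-frequency regularity mentioned above can be sketched in a few lines of Python. In its simplest textbook form, Zipf's law says that the frequency of a word is roughly inversely proportional to its rank, so the product rank × frequency should stay roughly constant on a large corpus. The function name `rank_frequency`, the regex tokenization, and the sample sentence are all assumptions made for this sketch; the toy sentence is far too short to exhibit the law and only shows how the table is built.

```python
import re
from collections import Counter


def rank_frequency(text):
    """Return (rank, word, count) tuples, most frequent word first.

    Hypothetical helper for illustration: tokenization is a naive
    lowercase split on word characters, not the authors' method.
    """
    words = re.findall(r"\w+", text.lower())
    counts = Counter(words).most_common()  # sorted by decreasing count
    return [
        (rank, word, count)
        for rank, (word, count) in enumerate(counts, start=1)
    ]


# Toy input (an assumption, not data from the article).
sample = "the cat and the dog and the bird"
table = rank_frequency(sample)

for rank, word, count in table:
    # Under an idealized Zipf distribution, rank * count is about constant.
    print(rank, word, count, rank * count)
```

On a real corpus, plotting the logarithm of the count against the logarithm of the rank is the usual way to check how closely the data follow a straight line.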

URL : https://intelligibilite-numerique.numerev.com/numeros/n-3-2022/2628-entre-loi-et-modeles-variations-autour-des-concepts-zipfiens

Preprint citation practice in PLOS

Authors : Marc Bertin, Iana Atanassova

The role of preprints in scientific production and their share of citations have grown over the past 10 years. In this paper we study several aspects of preprint citations: their progression over time, their relative frequencies with respect to the IMRaD structure of articles, and their distributions over time, per preprint database and per PLOS journal.

We have processed the PLOS corpus that covers 7 journals and a total of about 240,000 articles up to January 2021, and produced a dataset of 8460 preprint citation contexts that cite 12 different preprint databases.

Our results show that preprint citations are found with the highest frequency in the Method section of articles, though small variations exist between journals. The PLOS Computational Biology journal stands out, as it contains more than three times as many preprint citations as any other PLOS journal.

The relative shares of the different preprint databases are also examined. While arXiv and bioRxiv are the most frequent citation sources, bioRxiv's disciplinary nature is visible: it is the source of more than 70% of preprint citations in PLOS Biology, PLOS Genetics and PLOS Pathogens.

We have also compared the lexical content of preprint citation contexts to that of citation contexts of peer-reviewed publications. Finally, by performing a lexicometric analysis, we have shown that preprint citation contexts differ significantly from citation contexts of peer-reviewed publications.

This confirms that authors use different lexical content when citing preprints than when citing other sources.

DOI : https://doi.org/10.1007/s11192-022-04388-5