From academic to media capital: To what extent does the scientific reputation of universities translate into Wikipedia attention?

Authors : Wenceslao Arroyo-Machado, Adrián A. Díaz-Faes, Enrique Herrera-Viedma, Rodrigo Costas

Universities face increasing demands to improve their visibility, public outreach, and online presence. There is a broad consensus that scientific reputation significantly increases the attention universities receive.

However, most estimates of scientific reputation rely on composite or weighted indicators and on absolute positions in university rankings. In this study, we adopt a more granular approach to assessing universities’ scientific performance, using a multidimensional set of indicators from the Leiden Ranking and testing their individual effects on university Wikipedia page views.

We distinguish between international and local attention and find a positive association between research performance and Wikipedia attention which holds for regions and linguistic areas. Additional analysis shows that productivity, scientific impact, and international collaboration have a curvilinear effect on universities’ Wikipedia attention.

This finding suggests that factors other than scientific reputation may be driving the general public’s interest in universities. Our study adds to a growing stream of work which views altmetrics as tools to deepen science–society interactions rather than direct measures of impact and recognition of scientific outputs.



Gender and country biases in Wikipedia citations to scholarly publications

Authors : Xiang Zheng, Jiajing Chen, Erjia Yan, Chaoqun Ni

Ensuring Wikipedia cites scholarly publications based on quality and relevancy without biases is critical to credible and fair knowledge dissemination. We investigate gender- and country-based biases in Wikipedia citation practices using linked data from the Web of Science and a Wikipedia citation dataset.

Using coarsened exact matching, we show that publications by women are cited less by Wikipedia than expected, and publications by women are less likely to be cited than those by men. Scholarly publications by authors affiliated with non-Anglosphere countries are also disadvantaged in getting cited by Wikipedia, compared with those by authors affiliated with Anglosphere countries.

The level of gender- or country-based inequalities varies by research field, and the gender-country intersectional bias is prominent in math-intensive STEM fields. To ensure the credibility and equality of knowledge presentation, Wikipedia should consider strategies and guidelines to cite scholarly publications independent of the gender and country of authors.
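Coarsened exact matching, the technique named in this abstract, can be illustrated with a minimal sketch. It is not the authors' implementation: the covariate names, bin edges, and toy records below are all made up for illustration, and the estimate is a simple treated-weighted difference in mean outcomes over matched strata.

```python
from collections import defaultdict


def coarsen(value, cutpoints):
    """Return the index of the bin that `value` falls into."""
    return sum(value >= c for c in cutpoints)


def cem_effect(records, cutpoints):
    """Treated-minus-control difference in mean outcome after matching.

    records:   dicts with 'treated' (bool), 'outcome' (number), and one
               key per covariate named in `cutpoints`.
    cutpoints: dict mapping covariate name -> sorted list of bin edges.
    """
    # Group units by their coarsened covariate signature (the stratum).
    strata = defaultdict(lambda: {True: [], False: []})
    for r in records:
        key = tuple(coarsen(r[name], edges)
                    for name, edges in sorted(cutpoints.items()))
        strata[key][r["treated"]].append(r["outcome"])

    weighted_sum, n_treated = 0.0, 0
    for groups in strata.values():
        treated, control = groups[True], groups[False]
        if not treated or not control:
            continue  # stratum lacks one group: its units are dropped
        diff = sum(treated) / len(treated) - sum(control) / len(control)
        weighted_sum += diff * len(treated)
        n_treated += len(treated)
    if n_treated == 0:
        raise ValueError("no stratum contains both groups")
    return weighted_sum / n_treated


# Hypothetical toy data: outcome is 1 if the paper is cited, else 0.
toy_records = [
    {"year": 2010, "citations": 5, "treated": True, "outcome": 1},
    {"year": 2011, "citations": 7, "treated": False, "outcome": 1},
    {"year": 2011, "citations": 8, "treated": False, "outcome": 0},
    {"year": 2018, "citations": 50, "treated": True, "outcome": 0},
    {"year": 2019, "citations": 60, "treated": False, "outcome": 1},
    {"year": 2000, "citations": 500, "treated": True, "outcome": 1},
]
toy_cutpoints = {"year": [2005, 2015], "citations": [10, 100]}
toy_estimate = cem_effect(toy_records, toy_cutpoints)
```

The last toy record has no control unit in its stratum, so it is discarded before the comparison. That pruning is what lets the remaining groups be compared on like-for-like publications.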



Exploring open access coverage of Wikipedia-cited research across the White Rose Universities

Authors : Andy Tattersall, Nick Sheppard, Thom Blake, Kate O’Neill, Christopher Carroll

The popular online encyclopaedia Wikipedia is an important and influential platform that assists with the communication of science to a global audience. Using data obtained from and Unpaywall, we looked at research from the White Rose Universities (Sheffield, Leeds and York) that is cited on Wikipedia.

Of that research, we explored what percentage of citations were available open access (OA) and the location of those citations to ascertain whether they were hosted by publishers or within OA repositories.

This article explores the importance of access to OA research within such an important and leading platform as Wikipedia and how well it supports effective scientific communication across society.



Citation needed? Wikipedia and the COVID-19 pandemic

Authors : Omer Benjakob, Rona Aviram, Jonathan Sobel

With the COVID-19 pandemic’s outbreak at the beginning of 2020, millions across the world flocked to Wikipedia to read about the virus. Our study offers an in-depth analysis of the scientific backbone supporting Wikipedia’s COVID-19 articles.

Using references as a readout, we asked which sources informed Wikipedia’s growing pool of COVID-19-related articles during the pandemic’s first wave (January-May 2020). We found that coronavirus-related articles referenced trusted media sources and cited high-quality academic research.

Moreover, despite a surge in preprints, Wikipedia’s COVID-19 articles had a clear preference for open-access studies published in respected journals and made little use of non-peer-reviewed research uploaded independently to academic servers.

Building a timeline of COVID-19 articles on Wikipedia from 2001-2020 revealed a nuanced trade-off between quality and timeliness, with a growth in COVID-19 article creation and citations, from both academic research and popular media.

It further revealed how preexisting articles on key topics related to the virus created a framework on Wikipedia for integrating new knowledge. This “scientific infrastructure” helped provide context, and regulated the influx of new information into Wikipedia.

Lastly, we constructed a network of DOI-Wikipedia articles, which showed the landscape of pandemic-related knowledge on Wikipedia and revealed how citations create a web of scientific knowledge to support coverage of scientific topics like COVID-19 vaccine development.

Understanding how scientific research interacts with the digital knowledge-sphere during the pandemic provides insight into how Wikipedia can facilitate access to science. It also sheds light on how Wikipedia successfully fended off disinformation on COVID-19 and may provide insight into how its unique model may be deployed in other contexts.

‘I Updated the <ref>’: The Evolution of References in the English Wikipedia and the Implications for Altmetrics

Authors : Olga Zagovora, Roberto Ulloa, Katrin Weller, Fabian Flöck

With this work, we present a publicly available dataset of the history of all the references (more than 55 million) ever used in the English Wikipedia until June 2019. We have applied a new method for identifying and monitoring references in Wikipedia, so that for each reference we can provide data about associated actions: creation, modifications, deletions, and reinsertions.

The high accuracy of this method and the resulting dataset was confirmed via a comprehensive crowdworker labelling campaign. We use the dataset to study the temporal evolution of Wikipedia references as well as users’ editing behaviour.

We find evidence of a mostly productive and continuous effort to improve the quality of references: (1) there is a persistent increase of reference and document identifiers (DOI, PubMedID, PMC, ISBN, ISSN, ArXiv ID), and (2) most of the reference curation work is done by registered humans (not bots or anonymous editors).

We conclude that the evolution of Wikipedia references, including the dynamics of the community processes that tend to them, should be leveraged in the design of relevance indexes for altmetrics, and that our dataset can be pivotal for such an effort.
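As a rough illustration of what spotting the identifier types listed above can look like, the sketch below pulls identifier parameters out of a citation template in wikitext. It is not the authors' method; the function name, the regular expression, and the sample reference are assumptions made for this example, and real wikitext has many more variants than this pattern covers.

```python
import re

# Matches template parameters such as "|doi=10.1000/xyz" or "| pmid = 123".
# The parameter names are those used by common citation templates.
ID_PARAM = re.compile(
    r"\|\s*(doi|pmid|pmc|isbn|issn|arxiv)\s*=\s*([^|}]+)",
    re.IGNORECASE,
)


def extract_identifiers(wikitext):
    """Return a dict mapping identifier type to the values found."""
    found = {}
    for name, value in ID_PARAM.findall(wikitext):
        value = value.strip()
        if value:  # skip parameters that are present but empty
            found.setdefault(name.lower(), []).append(value)
    return found


# Hypothetical reference, written for this example only.
sample_ref = (
    "<ref>{{cite journal |title=Example |doi=10.1000/example.123 "
    "|pmid= 12345678 }}</ref>"
)
sample_ids = extract_identifiers(sample_ref)
```

Running the same extraction over successive revisions of an article and comparing the results is one simple way to see identifiers being added or removed over time.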


Collaborer sur Wikipédia pour co-construire une société de la connaissance : Opportunités, défis et enjeux pour le monde universitaire

Author : Sawsan Attallah Bidart

In this article, we study the potential of Wikipedia, as a popular online collaborative platform, for the co-construction of a knowledge society to which academics can open the way.

The concepts of knowledge and popularization are explored and linked to academia and to constructive social collaboration. The opportunities, challenges, and stakes of Wikipedia adoption by academics are examined in relation to the role of academics in research and education.

A critical analysis of the existing literature on Wikipedia is conducted, taking into account the platform’s popularity while exploring negative perceptions of the tool. In addition, a survey of 85 women in academia is conducted to understand their perceptions and use of the platform.


Wikipedia Citations: A comprehensive dataset of citations with identifiers extracted from English Wikipedia

Authors : Harshdeep Singh, Robert West, Giovanni Colavizza

Wikipedia’s contents are based on reliable and published sources. To date, little is known about what sources Wikipedia relies on, in part because extracting citations and identifying cited sources is challenging. To close this gap, we release Wikipedia Citations, a comprehensive dataset of citations extracted from Wikipedia.

A total of 29.3M citations were extracted from 6.1M English Wikipedia articles as of May 2020, and classified as being to books, journal articles or Web contents. We were thus able to extract 4.0M citations to scholarly publications with known identifiers — including DOI, PMC, PMID, and ISBN — and further labeled an extra 261K citations with DOIs from Crossref.

As a result, we find that 6.7% of Wikipedia articles cite at least one journal article with an associated DOI. Scientific articles cited from Wikipedia correspond to 3.5% of all articles with a DOI currently indexed in the Web of Science. We release all our code to allow the community to extend upon our work and update the dataset in the future.
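The headline figures in this abstract can be put in proportion with a few lines of arithmetic. The sketch below uses only the rounded numbers quoted above, so the derived values are approximate; the variable names are chosen for this example.

```python
# Rounded figures quoted in the abstract above.
total_citations = 29.3e6      # citations extracted
articles = 6.1e6              # English Wikipedia articles processed
citations_with_ids = 4.0e6    # citations with known identifiers
share_citing_doi = 0.067      # articles citing a journal article with a DOI

# Share of all extracted citations that carry a known identifier.
identifier_share = citations_with_ids / total_citations

# Approximate number of articles citing at least one DOI journal article.
articles_citing_doi = share_citing_doi * articles

# Average number of extracted citations per article.
citations_per_article = total_citations / articles

print(f"Citations with identifiers: {identifier_share:.1%}")
print(f"Articles citing a DOI journal article: about {articles_citing_doi:,.0f}")
print(f"Citations per article: about {citations_per_article:.1f}")
```

Under these rounded inputs, roughly one in seven extracted citations carries a known identifier, and the 6.7% share corresponds to a little over 400,000 articles.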