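Contributing to the dissemination of documentary heritage on Wikipedia: practices and challenges for cultural institutions [Contribuer à la diffusion du patrimoine documentaire sur Wikipédia : pratiques et enjeux pour les institutions culturelles]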

Authors : Jessica de Bideran, Romain Wenz

For heritage institutions, taking part in open initiatives such as Wikipedia represents a paradigm shift. Drawing on several years of teaching experience in partnership with cultural organizations that hold documentary heritage, the article confronts contemporary theoretical thinking on the dissemination of knowledge with the reality of administrative institutions.

The challenge for those involved is to reconcile the needs and principles of a general, worldwide encyclopedia with the expectations of institutions historically centered on presenting collections to physical visitors.

However, wanting to be present on Wikipedia requires the cultural institution to rethink its stance toward audiences that are virtual, yet no less collectively organized and engaged.


Science through Wikipedia: A novel representation of open knowledge through co-citation networks

Authors : Wenceslao Arroyo-Machado, Daniel Torres-Salinas, Enrique Herrera-Viedma, Esteban Romero-Frías

This study provides an overview of science from the Wikipedia perspective. A methodology has been established for the analysis of how Wikipedia editors regard science through their references to scientific papers.

The method of co-citation has been adapted to this context in order to generate Pathfinder networks (PFNET) that highlight the most relevant scientific journals and categories, and their interactions in order to find out how scientific literature is consumed through this open encyclopaedia.
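As a rough illustration of the co-citation idea described above, the sketch below counts how often two journals are referenced by the same Wikipedia article. The input format, the function name `cocitation_counts` and the toy data are assumptions made for this example; they are not the study's dataset or code, and the Pathfinder (PFNET) pruning step is not reproduced here.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(article_refs):
    """Count how often two journals are cited by the same Wikipedia article.

    article_refs maps an article title to the list of journals it references
    (a hypothetical input format, not the one used in the study).
    """
    counts = Counter()
    for journals in article_refs.values():
        # Each unordered pair of distinct journals co-cited by one article
        # adds 1 to the weight of the edge between them.
        for a, b in combinations(sorted(set(journals)), 2):
            counts[(a, b)] += 1
    return counts


# Toy data, invented for illustration only.
sample = {
    "Article A": ["Nature", "Science", "The Lancet"],
    "Article B": ["Nature", "Science"],
    "Article C": ["The Lancet", "Nature", "Nature"],
}
pairs = cocitation_counts(sample)
for (a, b), weight in sorted(pairs.items()):
    print(f"{a} -- {b}: {weight}")
```

The resulting pair weights can be read as a weighted edge list; a network-pruning method such as Pathfinder would then be applied on top of such weights to keep only the most salient links.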

In addition, their obsolescence has been studied through the Price index. A total of 1 433 457 references have been initially taken into account. After pre-processing and linking them to the data from Elsevier’s CiteScore Metrics, the sample was reduced to 847 512 references made by 193 802 Wikipedia articles to 598 746 scientific articles belonging to 14 149 journals indexed in Scopus.
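For readers unfamiliar with the Price index, it is commonly defined as the percentage of references that are no more than five years old at the time of citation. The sketch below follows that common definition; how the study dates the citing side (for example, the year a reference was added to a Wikipedia article) is not stated in the abstract, so `citing_year` and the sample years are assumptions made for illustration.

```python
def price_index(citing_year, cited_years, window=5):
    """Percentage of references published at most `window` years before citing_year.

    citing_year is an assumed reference point; the abstract does not say
    how the citing date was determined in the study.
    """
    if not cited_years:
        raise ValueError("at least one reference is required")
    # A reference counts as recent if its age is between 0 and `window` years.
    recent = sum(1 for year in cited_years if 0 <= citing_year - year <= window)
    return 100.0 * recent / len(cited_years)


# Invented publication years, for illustration only.
example = price_index(2019, [2018, 2015, 2010, 2001])
print(example)
```

A higher value indicates that a set of references leans toward recent literature; a lower value indicates reliance on older work.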

Among the main results, we found a significant presence of “Medicine” and “Biochemistry, Genetics and Molecular Biology” papers, and that the most important journals are multidisciplinary in nature; the results also suggest that high-impact-factor journals were more likely to be cited. Furthermore, only 13.44% of Wikipedia citations are to Open Access journals.


Editorial disagreements in Wikipedia as tensions between epistemic regimes [Les désaccords éditoriaux dans Wikipédia comme tensions entre régimes épistémiques]

Authors : Guillaume Carbou, Gilles Sahut

Despite its elaborate normative architecture, Wikipedia is the site of recurring disagreements between contributors.

Based on an argumentative analysis of a corpus of talk pages from articles that provoke heated debate (GMOs, September 11, etc.), the authors show that these disagreements are partly underpinned by the existence of competing “epistemic regimes” on Wikipedia.

These epistemic regimes (encyclopedist, scientific, scientistic, wiki, critical and doxic) correspond to as many divergent conceptions of what is “valid” and of the ways to arrive at it.


Wikipedia Text Reuse: Within and Without

Authors : Milad Alshomary, Michael Völske, Tristan Licht, Henning Wachsmuth, Benno Stein, Matthias Hagen, Martin Potthast

We study text reuse related to Wikipedia at scale by compiling the first corpus of text reuse cases within Wikipedia as well as without (i.e., reuse of Wikipedia text in a sample of the Common Crawl).

To discover reuse beyond verbatim copy and paste, we employ state-of-the-art text reuse detection technology, scaling it for the first time to process the entire Wikipedia as part of a distributed retrieval pipeline.
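To give a sense of what detecting reuse beyond verbatim copying can involve, the sketch below compares two passages by the overlap of their word n-grams (shingles) using Jaccard similarity. This is a deliberately simple, classic baseline chosen for illustration; it is not the detection technology or the distributed pipeline used by the authors, and the function names and sample sentences are invented.

```python
import re


def shingles(text, n=3):
    """Return the set of word n-grams (as tuples) in a lowercased text."""
    words = re.findall(r"\w+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard(a, b, n=3):
    """Jaccard similarity between the n-gram sets of two passages."""
    sa, sb = shingles(a, n), shingles(b, n)
    if not sa or not sb:
        # Passages shorter than n words have no shingles to compare.
        return 0.0
    return len(sa & sb) / len(sa | sb)


# Invented passages: the second is a lightly edited copy of the first.
original = "the quick brown fox jumps over the lazy dog"
edited = "the quick brown fox leaps over the lazy dog"
score = jaccard(original, edited)
print(round(score, 2))
```

A single changed word still leaves several shared n-grams, so the score stays well above zero, whereas an exact-match comparison would report no reuse at all.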

We further report on a pilot analysis of the 100 million reuse cases inside, and the 1.6 million reuse cases outside Wikipedia that we discovered. Text reuse inside Wikipedia gives rise to new tasks such as article template induction, fixing quality flaws due to inconsistencies arising from asynchronous editing of reused passages, or complementing Wikipedia’s ontology.

Text reuse outside Wikipedia yields a tangible metric for the emerging field of quantifying Wikipedia’s influence on the web. To foster future research into these tasks, and for reproducibility’s sake, the Wikipedia text reuse corpus and the retrieval pipeline are made freely available.


Wikipedia: an opportunity to rethink the links between sources’ credibility, trust and authority

Authors : Gilles Sahut, André Tricot

The Web and its main tools (Google, Wikipedia, Facebook, Twitter) raise and profoundly renew fundamental questions that everyone asks almost every day: Is this information or content true? Can I trust this author or source?

These questions are not new: they arose in the same way with books, newspapers, broadcasting and television and, more fundamentally, in every human interpersonal communication.

This paper focuses on two scientific problems related to this issue. The first one is theoretical: to address this issue, many concepts have been used in library and information sciences, communication and psychology.

The links between these concepts are not clear: sometimes two concepts are considered as synonymous, sometimes as very different. The second one is historical: sources like Wikipedia deeply challenge the epistemic evaluation of information sources, compared to previous modes of information production.

This paper proposes an integrated and simple model considering the relation between a user, a document and an author as human communication. It reduces the problem to three concepts: credibility, as a characteristic granted to information depending on its truth-value; trust, as the ability to produce credible information; and authority, when an author’s power to influence is accepted, i.e., when readers accept that the source can modify their opinions, knowledge and decisions.

The model also describes two kinds of relationships between the three concepts: an upward link and a downward link. The model is confronted with findings of empirical research, on Wikipedia in particular.


The Evolution of the Concept of Semantic Web in the Context of Wikipedia: An Exploratory Approach to Study the Collective Conceptualization in a Digital Collaborative Environment

Authors : Luís Miguel Machado, Maria Manuel Borges, Renato Rocha Souza

Wikipedia, as a “social machine”, is a privileged place to observe the collective construction of concepts without central control. Based on Dahlberg’s theory of concept, and anchored in the pragmatism of Hjørland—in which the concepts are socially negotiated meanings—the evolution of the concept of semantic web (SW) was analyzed in the English version of Wikipedia.

An exploratory, descriptive, and qualitative study was designed, and we identified 26 different definitions (between 12 July 2001 and 31 December 2017), of which eight are of particular relevance for their duration, the last two being those recorded at the end of the analyzed period.

According to them, SW: “is an extension of the web” and “is a Web of Data”; the latter, used as a complementary definition, links to Berners-Lee’s publications. In Wikipedia, the evolution of the SW concept appears to be based on the search for the use of non-technical vocabulary and the control of authority carried out by the debate.

Because Wikipedia is a space for the collective bargaining of meanings, studying it may bring relevant contributions to understanding how a community conceives a particular concept and how that concept evolves over time.


The governance of Wikipedia: rule-making and Ostrom’s theory [La gouvernance de Wikipédia : élaboration de règles et théorie d’Ostrom]

Author : Gilles Sahut

Wikipedia’s success is frequently attributed to the soundness of its governance. However, there is no scientific consensus on how to characterize it.

In this empirical study, we examine one facet of this governance within the French-language Wikipedia: how two rules relating to the citation of sources were constructed.

They are studied through Ostrom’s theory of the commons. We show that these rules are discussed and written by a minority of particularly committed contributors. Thus, there is no “political class” in Wikipedia that is cut off from the ground.

We also highlight the influence on this process of the internal communication apparatus, as well as that of the English-language Wikipedia.

