Interroger le texte scientifique (Questioning the scientific text)

Author : Guillaume Cabanac

Textual documents are familiar and indispensable carriers of information in our information society. With the rise of digital platforms and social media, text now takes the form of web pages, blog posts, comments, tweets, and tags, among others. Once passive consumers, readers are in turn becoming producers of content.

The result is interpersonal exchange that weaves digital social networks extending far beyond our own circles of acquaintance. In this context, the nature and format of texts, the intentions of their authors (to inform, reshare, criticize, complement, correct, etc.), the spatio-temporal context, and the varying veracity and freshness of information are all subtleties that information retrieval models need to integrate.

The first part of this thesis presents a synthesis of information retrieval results that aim to model these factors in order to improve the relevance of searches over textual corpora, notably those drawn from social media.

The research program I am developing also aims to "question the text" in order to reveal information about its content, its authors, and its readers. Scientific text was chosen as the target for the richness of its content and metadata. Accordingly, the second part of the thesis synthesizes results in scientometrics, a term denoting the quantitative study of science and innovation.

The work consisted in questioning scientific texts and their underlying networks (vocabulary, references, authors, institutions, etc.) to bring out high value-added knowledge and to shed light on the creation and dissemination of scientific knowledge.

The two strands brought together in this thesis jointly define an interdisciplinary research program at the crossroads of computer science, scientometrics, and the sociology of science.

Its ambition is to question scientific text in order to improve access to it (via information retrieval) while helping to elucidate the drivers behind the genesis and evolution of social worlds and of knowledge in the sciences (via scientometrics).


Bibliometrics and academic staff assessment in Polish university libraries – current trends

Authors : Danuta Ryś, Anna Chadaj

Academic staff assessment in Poland is, to a large extent, based on bibliographic indicators, such as the number of scientific publications produced, the Ministry of Science and Higher Education score pertaining to the journal rank and the publication type, as well as the number of citations and derivatives.

Relevant data is retrieved from bibliographic databases developed by libraries, international citation indexes available for Polish scientific institutions under a national licence, and from open-access international and Polish sources, which are briefly presented in the article.

The workload entailed, and in consequence the results of this citation search, vary depending on the search method applied. For this reason, university staff members and university authorities often seek assistance from university library staff. This, in turn, gives libraries an opportunity to increase their role within the academic community.

In order to investigate the matter further, the authors conducted a survey among the largest academic libraries in Poland.

The findings confirm that bibliometric processes (namely, the registration and the formal acceptance of university staff scientific publications, and compilation of citation reports) have become a vital part of modern library work. Bibliographies of university staff publications developed by libraries include various bibliometric indicators (those most frequently used being identified in the article), and have become an important source of statistical and bibliometric information.

The survey results highlight the most frequently used bibliometric sources and methods. Examples of bibliographic databases created by the libraries and bibliometric indicators used within these databases are also presented.


Characterization, description, and considerations for the use of funding acknowledgement data in Web of Science

Funding acknowledgements found in scientific publications have been used to study the impact of funding on research since the 1970s. However, no broad scale indexation of that paratextual element was done until 2008, when Thomson Reuters Web of Science started to add funding acknowledgement information to its bibliographic records.

As this new information provides a new dimension to bibliometric data that can be systematically exploited, it is important to understand the characteristics of these data and the underlying implications for their use.

This paper analyses the presence and distribution of funding acknowledgement data covered in Web of Science.

Our results show that prior to 2009 funding acknowledgements coverage is extremely low and therefore not reliable. Since 2008, funding information has been collected mainly for publications indexed in the Science Citation Index Expanded (SCIE); more recently (2015), inclusion of funding texts for publications indexed in the Social Sciences Citation Index (SSCI) has been implemented.

Arts & Humanities Citation Index (AHCI) content is not indexed for funding acknowledgement data. Moreover, English-language publications are the most reliably covered.

Finally, not all types of documents are equally covered for funding information indexation and only articles and reviews show consistent coverage.

The characterization of the funding acknowledgement information collected by Thomson Reuters can therefore help understand the possibilities offered by the data but also their limitations.
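The coverage findings above can be read as a conservative filter to apply before analysing funding acknowledgement data. The following is only a sketch under stated assumptions: the function name, its parameters, and the short index labels are hypothetical and do not come from the paper or from any Web of Science API, and treating non-English records as unreliable is a deliberately cautious reading of "most reliably covered".

```python
def funding_coverage_expected(year, index, doc_type, language="English"):
    """Return True if funding acknowledgement data can plausibly be expected.

    Hypothetical helper encoding the findings summarised above:
    - only articles and reviews show consistent coverage;
    - English-language publications are the most reliably covered;
    - SCIE coverage is treated as usable from 2009, SSCI from 2015;
    - AHCI content is not indexed for funding acknowledgements.
    """
    if doc_type.lower() not in {"article", "review"}:
        return False
    if language.lower() != "english":
        return False  # cautious assumption, not a hard rule from the paper
    if index == "SCIE":
        return year >= 2009
    if index == "SSCI":
        return year >= 2015
    return False  # AHCI and anything else


# Example: a 2012 SCIE article passes the filter, a 2012 SSCI article does not.
print(funding_coverage_expected(2012, "SCIE", "Article"))
print(funding_coverage_expected(2012, "SSCI", "Article"))
```

A record that fails this check is not necessarily unfunded; under these assumptions it simply falls outside the range where the absence of funding text is informative.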


The application of bibliometrics to research evaluation in the humanities and social sciences: an exploratory study using normalized Google Scholar data for the publications of a research institute

In the humanities and social sciences, bibliometric methods for the assessment of research performance are (so far) less common. The current study takes a concrete example in an attempt to evaluate a research institute from the area of social sciences and humanities with the help of data from Google Scholar (GS).

In order to use GS for a bibliometric study, we have developed procedures for the normalisation of citation impact, building on the procedures of classical bibliometrics. In order to test the convergent validity of the normalized citation impact scores, we have calculated normalized scores for a subset of the publications based on data from the WoS or Scopus.

Even if scores calculated with the help of GS and WoS/Scopus are not identical for the different publication types (considered here), they are so similar that they result in the same assessment of the institute investigated in this study: For example, the institute’s papers whose journals are covered in WoS are cited at about an average rate (compared with the other papers in the journals).
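To make the idea of a normalized citation impact score concrete, here is a minimal sketch of one common form of normalization: a paper's citation count divided by the mean citation count of a reference set (for instance, the other papers in the same journal and year). This is not the study's actual procedure, which the abstract does not detail; the function name, the dictionary keys, and the sample numbers are all invented for illustration.

```python
from statistics import mean


def normalized_scores(papers, reference_sets):
    """Divide each paper's citations by the mean of its reference set.

    papers: list of dicts with hypothetical keys 'id', 'citations', 'ref_key'.
    reference_sets: dict mapping ref_key to a list of citation counts.
    A score of 1.0 means the paper is cited at the reference-set average.
    """
    scores = {}
    for paper in papers:
        baseline = mean(reference_sets[paper["ref_key"]])
        # A zero baseline gives no meaningful ratio, so report None.
        scores[paper["id"]] = paper["citations"] / baseline if baseline > 0 else None
    return scores


# Invented example data: two papers compared with their journal-year sets.
papers = [
    {"id": "p1", "citations": 10, "ref_key": "journal_a_2012"},
    {"id": "p2", "citations": 4, "ref_key": "journal_b_2013"},
]
reference_sets = {
    "journal_a_2012": [5, 10, 15],
    "journal_b_2013": [1, 3],
}
print(normalized_scores(papers, reference_sets))
```

Averaging such scores over an institute's papers gives a value that can be compared with 1.0, which is the sense in which papers can be "cited at about an average rate".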


Meaningful Metrics: A 21st Century Librarian’s Guide to Bibliometrics, Altmetrics, and Research Impact

What does it mean to have meaningful metrics in today’s complex higher education landscape? With a foreword by Heather Piwowar and Jason Priem, this highly engaging and activity-laden book serves to introduce readers to the fast-paced world of research metrics from the unique perspective of academic librarians and LIS practitioners.

Starting with the essential histories of bibliometrics and altmetrics, and continuing with in-depth descriptions of the core tools and emerging issues at stake in the future of both fields, Meaningful Metrics is a convenient all-in-one resource that is designed to be used by a range of readers, from those with little to no background on the subject to those looking to become movers and shakers in the current scholarly metrics movement. Authors Borchardt and Roemer offer tips, tricks, and real-world examples that illustrate how librarians can support the successful adoption of research metrics, whether in their institutions or across academia as a whole.


Coauthorship networks: A directed network approach considering the order and number of coauthors

« In many scientific fields, the order of coauthors on a paper conveys information about each individual’s contribution to a piece of joint work. We argue that in prior network analyses of coauthorship networks, the information on ordering has been insufficiently considered because ties between authors are typically symmetrized. This is basically the same as assuming that each co-author has contributed equally to a paper. We introduce a solution to this problem by adopting a coauthorship credit allocation model proposed by Kim and Diesner (2014), which in its core conceptualizes co-authoring as a directed, weighted, and self-looped network. We test and validate our application of the adopted framework based on a sample data of 861 authors who have published in the journal Psychometrika. Results suggest that this novel sociometric approach can complement traditional measures based on undirected networks and expand insights into coauthoring patterns such as the hierarchy of collaboration among scholars. As another form of validation, we also show how our approach accurately detects prominent scholars in the Psychometric Society affiliated with the journal. »
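As an illustration of what a directed, weighted, and self-looped coauthorship network can look like, the sketch below builds one from ordered bylines. It does not reproduce the Kim and Diesner (2014) model, whose details are not given in the abstract. As a stand-in it assumes harmonic credit (the author at rank i receives a share proportional to 1/i), and it assumes that every author on a paper points to every author on that paper, including themselves, with the target's share as edge weight. All names are hypothetical.

```python
from collections import defaultdict


def harmonic_credit(n_authors):
    """Credit shares by byline rank, proportional to 1/rank and summing to 1."""
    weights = [1 / rank for rank in range(1, n_authors + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def build_network(papers):
    """Build directed weighted edges from a list of ordered bylines.

    papers: list of author-name lists, in byline order, without duplicates.
    Returns a dict mapping (source, target) to accumulated weight; entries
    with source == target are self-loops (credit an author keeps).
    """
    edges = defaultdict(float)
    for byline in papers:
        credit = harmonic_credit(len(byline))
        for source in byline:
            for target, share in zip(byline, credit):
                edges[(source, target)] += share
    return dict(edges)


# Invented example: A is first author on both papers.
network = build_network([["A", "B"], ["A", "C", "B"]])
for (source, target), weight in sorted(network.items()):
    print(f"{source} -> {target}: {weight:.3f}")
```

Unlike a symmetrized tie, the edge from B to A here is heavier than the edge from A to B, which is how byline order survives in the network under these assumptions.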


Using bibliometrics to support the facilitation of cross-disciplinary communication

Given the importance of cross-disciplinary research (CDR), facilitating CDR effectiveness is a priority for many institutions and funding agencies. There are a number of CDR types, however, and the effectiveness of facilitation efforts will require sensitivity to that diversity. This article presents a method characterizing a spectrum of CDR designed to inform facilitation efforts that relies on bibliometric techniques and citation data.

We illustrate its use by the Toolbox Project, an ongoing effort to enhance cross-disciplinary communication in CDR teams through structured, philosophical dialogue about research assumptions in a workshop setting. Toolbox Project workshops have been conducted with more than 85 research teams, but the project’s extensibility to an objectively characterized range of CDR collaborations has not been examined.

To guide wider application of the Toolbox Project, we have developed a method that uses multivariate statistical analyses of transformed citation proportions from published manuscripts to identify candidate areas of CDR, and then overlays information from previous Toolbox participant groups on these areas to determine candidate areas for future application.

The approach supplies 3 results of general interest:
1) A way to employ small data sets and familiar statistical techniques to characterize CDR spectra as a guide to scholarship on CDR patterns and trends.
2) A model for using bibliometric techniques to guide broadly applicable interventions similar to the Toolbox.
3) A method for identifying the location of collaborative CDR teams on a map of scientific activity, of use to research administrators, research teams, and other efforts to enhance CDR projects.