In which fields are citations indicators of research quality?

Authors : Mike Thelwall, Kayvan Kousha, Emma Stuart, Meiko Makita, Mahshid Abdoli, Paul Wilson, Jonathan Levitt

Citation counts are widely used as indicators of research quality to support or replace human peer review and for lists of top cited papers, researchers, and institutions. Nevertheless, the relationship between citations and research quality is poorly evidenced. We report the first large-scale science-wide academic evaluation of the relationship between research quality and citations (field normalized citation counts), correlating them for 87,739 journal articles in 34 field-based UK Units of Assessment (UoA).

The two correlate positively in all academic fields, from very weak (0.1) to strong (0.5), reflecting broadly linear relationships in all fields. We give the first evidence that the correlations are positive even across the arts and humanities. The patterns are similar for the field classification schemes of Scopus and, although varying for some individual subjects and therefore more uncertain for these.

We also show for the first time that no field has a citation threshold beyond which all articles are excellent quality, so lists of top cited articles are not pure collections of excellence, and neither is any top citation percentile indicator. Thus, while appropriately field normalized citations associate positively with research quality in all fields, they never perfectly reflect it, even at high values.

More readers in more places: the benefits of open access for scholarly books

Authors : Cameron Neylon, Alkim Ozaygen, Lucy Montgomery, Chun-Kai (Karl) Huang, Ros Pyne, Mithu Lucraft, Christina Emery

Open access to scholarly content has grown substantially in recent years, including the number of books published open access online. However, there has been little study of how usage patterns (via downloads, citations and web visibility) of these books may differ from those of their closed counterparts. Such information is important not only for book publishers, but also for researchers in disciplines where books are the norm.

This article reports on findings from comparing samples of books published by Springer Nature to shed light on differences in usage patterns across open access and closed books. The study includes a selection of 281 open access books and a sample of 3,653 closed books (drawn from 21,059 closed books using stratified random sampling).

The books are stratified by combinations of book type, discipline and year of publication to enable like-for-like comparisons within each stratum and to maximize the statistical power of the sample.
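The stratified sampling described above can be sketched as follows. This is a minimal illustration, not the study's actual procedure: the catalogue, the `stratified_sample` helper, the field names and the equal sampling fraction per stratum are all assumptions made for the example.

```python
import random
from collections import defaultdict
from itertools import product


def stratum_key(book):
    # A stratum is one combination of book type, discipline and year.
    return (book["type"], book["discipline"], book["year"])


def stratified_sample(books, key, fraction, seed=0):
    """Draw the same fraction from every stratum (at least one book each)."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for book in books:
        strata[key(book)].append(book)
    sample = []
    for members in strata.values():
        k = min(len(members), max(1, round(len(members) * fraction)))
        sample.extend(rng.sample(members, k))
    return sample


# Hypothetical catalogue: 8 strata with 10 closed books in each.
books = []
for book_type, discipline, year in product(
    ("monograph", "edited volume"), ("humanities", "medicine"), (2015, 2016)
):
    for _ in range(10):
        books.append(
            {"id": len(books), "type": book_type,
             "discipline": discipline, "year": year}
        )

sample = stratified_sample(books, stratum_key, fraction=0.2)
```

Because every stratum contributes to the sample, open access and closed books can then be compared within the same type, discipline and year rather than across mismatched groups.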

The results show higher geographic diversity of usage, higher numbers of downloads and more citations for open access books across all strata. Importantly, open access books have increased access and usage for traditionally underserved populations.

Citations are not opinions: a corpus linguistics approach to understanding how citations are made

Author : Domenic Rosati

Citation content analysis seeks to understand citations based on the language used during the making of a citation. A key issue in citation content analysis is looking for linguistic structures that characterize distinct classes of citations for the purposes of understanding the intent and function of a citation.

Previous works have focused on modeling linguistic features first and drawn conclusions on the language structures unique to each class of citation function based on the performance of a classification task or inter-annotator agreement.

In this study, we start with a large sample of a pre-classified citation corpus, 2 million citations from each class of the scite Smart Citation dataset (supporting, disputing, and mentioning citations), and analyze its corpus linguistics in order to reveal the unique and statistically significant language structures belonging to each type of citation.

By generating comparison tables for each citation type we present a number of interesting linguistic features that uniquely characterize citation type. What we find is that within citation collocates, there is very low correlation between citation type and sentiment.

Additionally, we find that the subjectivity of citation collocates across classes is very low. These findings suggest that the sentiment of collocates is not a predictor of citation function and that due to their low subjectivity, an opinion-expressing mode of understanding citations, implicit in previous citation sentiment analysis literature, is inappropriate.

Instead, we suggest that citations can be better understood as claims-making devices where the citation type can be explained by understanding how two claims are being compared. By presenting this approach, we hope to inspire similar corpus linguistic studies on citations that derive a more robust theory of citation from an empirical basis using citation corpora.


Literature practices: processes leading up to a citation

Authors : Nikolai Klitzing, Rink Hoekstra, Jan-Willem Strijbos


Literature practices represent the process leading up to the citation of a source, and consist of the selection, reading and citing of sources. The purpose of this paper is to explore possible factors that might influence researchers during this process and discover possible consequences of researchers’ citation behaviours.


In this exploratory study, various factors which could influence literature practices were explored via a questionnaire amongst 112 researchers. Participants were first authors of articles published in 2016 in one of five different journals within the disciplines of experimental psychology, educational sciences and social psychology. Academic positions of the participants ranged from PhD student to full professor.


Frequencies and percentages showed that researchers seemed to be influenced in their literature practices by various factors, such as editors suggesting articles and motivation to cite.

Additionally, a high percentage of researchers reported taking shortcuts when citing articles (e.g. using secondary citations and reading selectively). Logistic regression did not reveal a clear relationship between academic work experience and research practices.

Seeing that researchers seem to be influenced by a variety of factors in their literature practices, the scientific community might benefit from better citation practices and guidelines in order to provide more structure to the process of literature practices.


This paper provides first insights into researchers’ literature practices. Possible reasons for problems with citation accuracy and replicating research findings are highlighted. Opportunities for further research on the topic of citation behaviours are presented.


Measuring Book Impact Based on the Multi-granularity Online Review Mining

As with articles and journals, the customary methods for measuring books’ academic impact mainly involve citations, an approach that is easy but limited to interrogating traditional citation databases and scholarly book reviews. Researchers have attempted to use other metrics, such as Google Books, libcitation, and publisher prestige.

However, these approaches lack content-level information and cannot determine the citation intentions of users. Meanwhile, the abundant online review resources concerning academic books can be used to mine deeper information and content utilizing altmetric perspectives.

In this study, we measure the impacts of academic books by multi-granularity mining of online reviews, and we identify factors that affect a book’s impact. First, online reviews of a sample of academic books are crawled and processed.

Then, multi-granularity review mining is conducted to identify review sentiment polarities and aspects’ sentiment values. Lastly, the numbers of positive reviews and negative reviews, aspect sentiment values, star values, and information regarding helpfulness are integrated via the entropy method, and lead to the calculation of the final book impact scores.
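To make the integration step more concrete, here is a minimal sketch of the standard entropy weight method. The abstract does not say which variant the authors use, so the min-max normalisation, the `entropy_weights` helper and the sample data are assumptions. The sketch also assumes that larger values are better for every indicator; a count of negative reviews would need to be inverted first.

```python
import math


def entropy_weights(matrix):
    """Return (weights, scores) for rows = books, columns = indicators."""
    n, m = len(matrix), len(matrix[0])
    # Min-max normalise each indicator column to the range [0, 1].
    norm_cols = []
    for col in zip(*matrix):
        lo, hi = min(col), max(col)
        span = hi - lo
        norm_cols.append([(v - lo) / span if span else 0.0 for v in col])
    # Diversity = 1 - normalised Shannon entropy of each column.
    diversities = []
    for col in norm_cols:
        total = sum(col)
        if total == 0:
            diversities.append(0.0)  # a constant indicator carries no information
            continue
        entropy = 0.0
        for v in col:
            p = v / total
            if p > 0:
                entropy -= p * math.log(p)
        diversities.append(1.0 - entropy / math.log(n))
    total_div = sum(diversities)
    weights = ([d / total_div for d in diversities]
               if total_div else [1.0 / m] * m)
    # Each book's score is the weighted sum of its normalised indicators.
    scores = [sum(w * col[i] for w, col in zip(weights, norm_cols))
              for i in range(n)]
    return weights, scores


# Hypothetical data: positive reviews, star value, and a constant column.
rows = [[120, 4.5, 1], [30, 3.9, 1], [75, 4.8, 1], [5, 2.1, 1]]
weights, scores = entropy_weights(rows)
```

Indicators that vary more across books receive larger weights, while an indicator that is identical for every book (the third column here) receives a weight of zero.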

The results of a correlation analysis of book impact scores obtained via our method versus traditional book citations show that, although there are substantial differences between subject areas, online book reviews tend to reflect the academic impact.

Thus, we infer that online reviews represent a promising source for mining book impact within the altmetric perspective and at the multi-granularity content level. Moreover, our proposed method might also be a means by which to measure other books besides academic publications.


A review of the literature on citation impact indicators

Citation impact indicators nowadays play an important role in research evaluation, and consequently these indicators have received a lot of attention in the bibliometric and scientometric literature. This paper provides an in-depth review of the literature on citation impact indicators. First, an overview is given of the literature on bibliographic databases that can be used to calculate citation impact indicators (Web of Science, Scopus, and Google Scholar).

Next, selected topics in the literature on citation impact indicators are reviewed in detail. The first topic is the selection of publications and citations to be included in the calculation of citation impact indicators. The second topic is the normalization of citation impact indicators, in particular normalization for field differences.

Counting methods for dealing with co-authored publications are the third topic, and citation impact indicators for journals are the last topic. The paper concludes by offering some recommendations for future research.
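As a small illustration of normalization for field differences, the sketch below divides each publication's citation count by the mean citation count of publications from the same field and year. This is one common form of field normalization; the `field_normalized` helper and the sample records are hypothetical, and the reviewed literature covers other variants as well.

```python
from collections import defaultdict


def field_normalized(papers):
    """Map paper id -> citations / mean citations of its (field, year) group."""
    totals = defaultdict(lambda: [0, 0])  # (field, year) -> [citation sum, count]
    for p in papers:
        bucket = totals[(p["field"], p["year"])]
        bucket[0] += p["citations"]
        bucket[1] += 1
    scores = {}
    for p in papers:
        cite_sum, count = totals[(p["field"], p["year"])]
        mean = cite_sum / count
        # A group in which no paper is cited gets a score of 0 by convention here.
        scores[p["id"]] = p["citations"] / mean if mean else 0.0
    return scores


# Hypothetical records: biology papers attract ten times the citations of mathematics papers.
papers = [
    {"id": "m1", "field": "mathematics", "year": 2020, "citations": 2},
    {"id": "m2", "field": "mathematics", "year": 2020, "citations": 4},
    {"id": "m3", "field": "mathematics", "year": 2020, "citations": 6},
    {"id": "b1", "field": "biology", "year": 2020, "citations": 20},
    {"id": "b2", "field": "biology", "year": 2020, "citations": 40},
    {"id": "b3", "field": "biology", "year": 2020, "citations": 60},
]
scores = field_normalized(papers)
```

After normalization, a mathematics paper with 6 citations and a biology paper with 60 citations receive the same score, since each sits at 1.5 times its own field's average.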


Delayed Open Access – an overlooked high-impact category of openly available scientific literature

“Delayed open access (OA) refers to scholarly articles in subscription journals made available openly on the web directly through the publisher at the expiry of a set embargo period. Though a substantial number of journals have practiced delayed OA since they started publishing e-versions, empirical studies concerning open access have often overlooked this body of literature. This study provides comprehensive quantitative measurements by identifying delayed OA journals, collecting data concerning their publication volumes, embargo lengths, and citation rates. Altogether 492 journals were identified, publishing a combined total of 111,312 articles in 2011. 77.8% of these articles were made open access within 12 months from publication, with 85.4% becoming available within 24 months. A journal impact factor analysis revealed that delayed OA journals have on average twice as high average citation rates compared to closed subscription journals, and three times as high as immediate OA journals. Overall the results demonstrate that delayed OA journals constitute an important segment of the openly available scholarly journal literature, both by their sheer article volume as well as by including a substantial proportion of high impact journals.”