Citations are not opinions: a corpus linguistics approach to understanding how citations are made

Author : Domenic Rosati

Citation content analysis seeks to understand citations based on the language used during the making of a citation. A key issue in citation content analysis is looking for linguistic structures that characterize distinct classes of citations for the purposes of understanding the intent and function of a citation.

Previous works have focused on modeling linguistic features first and drawn conclusions on the language structures unique to each class of citation function based on the performance of a classification task or inter-annotator agreement.

In this study, we start with a large sample of a pre-classified citation corpus, 2 million citations from each class of the scite Smart Citation dataset (supporting, disputing, and mentioning citations), and analyze its corpus linguistics in order to reveal the unique and statistically significant language structures belonging to each type of citation.

By generating comparison tables for each citation type we present a number of interesting linguistic features that uniquely characterize citation type. What we find is that within citation collocates, there is very low correlation between citation type and sentiment.

Additionally, we find that the subjectivity of citation collocates across classes is very low. These findings suggest that the sentiment of collocates is not a predictor of citation function and that due to their low subjectivity, an opinion-expressing mode of understanding citations, implicit in previous citation sentiment analysis literature, is inappropriate.

Instead, we suggest that citations can be better understood as claims-making devices where the citation type can be explained by understanding how two claims are being compared. By presenting this approach, we hope to inspire similar corpus linguistic studies on citations that derive a more robust theory of citation from an empirical basis using citation corpora.


Increasing visibility and discoverability of scholarly publications with academic search engine optimization

Authors : Lisa Schilhan, Christian Kaier, Karin Lackner

With the help of academic search engine optimization (ASEO), publications can more easily be found in academic search engines and databases. Authors can improve the ranking of their publications by adjusting titles, keywords and abstracts.

Carefully considered wording makes publications easier to find and, ideally, cited more often. This article is meant to support authors in making their scholarly publications more visible. It provides basic information on ranking mechanisms as well as tips and tricks on how to improve the findability of scholarly publications while also pointing out the limits of optimization.

This article, authored by three scholarly communications librarians, draws on their experience of hosting journals, providing workshops for researchers and individual publication support, as well as on their investigations of the ranking algorithms of search engines and databases.

URL : Increasing visibility and discoverability of scholarly publications with academic search engine optimization


Scaling Small; Or How to Envision New Relationalities for Knowledge Production

Authors : Janneke Adema, Samuel A. Moore

Within the field of open access (OA) publishing, community-led publishing projects are experimenting increasingly with new forms of collaboration and organisation. They do so by focusing on setting up horizontal alliances between independent projects within a certain sector (e.g., scholar-led presses), or vertically across sectors with other not-for-profit organisations (e.g., through collaborations with libraries, universities, and funders), in order to create multi-stakeholder ecologies within scholarly publishing.

Yet at the same time, imaginaries for future modes of OA knowledge production are still controlled through demands for ‘scalability’ and ‘sustainability’, which are both seen as preconditions for scholarly communication models and practices to succeed and to be efficient. But they are also prerequisites to receive funding for publishing projects or infrastructure development.

The scalability of open models is perceived as essential to compete in a landscape dominated by a handful of major corporate players. Drawing on our work with the Radical Open Access Collective, the ScholarLed consortium, and the Community-led Open Publishing Infrastructures for Monographs (COPIM) project, this article outlines an alternative organisational principle for governing community-led publishing projects based on mutual reliance, care, and other forms of commoning.

Termed ‘scaling small’, this principle eschews standard approaches to organisational growth that tend to flatten community diversity through economies of scale. Instead, it puts forward the idea that scale can be nurtured through intentional collaborations between community-driven projects that promote a bibliodiverse ecosystem while providing resilience through resource sharing and other kinds of collaboration.

Following Anna Tsing’s recommendations to keep in mind how reimagining our knowledge practices requires we pay particular attention to articulations between the scalable and the nonscalable (Tsing, 2012), what is needed to enable this is, first and foremost, a rethinking of existing systems and infrastructures and how they currently function – systems that have historically developed and been continuously remade to encourage further scalability.

We further explore the possibilities of scaling small with particular reference to Anna Tsing’s work on the ‘latent commons’ and Massimo De Angelis’ discussion of ‘boundary commoning’, examining how these concepts are on display within the Radical Open Access Collective, ScholarLed and the COPIM project.

As we will argue, reimagining the relations within publishing beyond a mere calculative logic, i.e., one that is focused on assessing the sustainability of alternative models, is essential in not-for-profit OA publishing environments, particularly if we want new forms of collaboration to arise and to redefine the future of scholarly publishing in communal settings.

URL : Scaling Small; Or How to Envision New Relationalities for Knowledge Production


Value-added services in institutional repositories in Spanish public universities

Authors : Andrés Fernández-Ramos, Leticia Barrionuevo


The aim of the present study was to analyse the value-added services offered by institutional repositories in Spanish public universities.


Information was collected on the main characteristics of repositories in Spanish public universities and the value-added services they offered, using a checklist with twenty-five items divided into three dimensions: information on the repository; information on the records; and instructions for use and dissemination.


We determined the frequency of each value-added service in the repositories included in the study and analysed the main modalities in which these services were offered. We also analysed the similarity between repositories using multidimensional scaling methods.
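
The two analysis steps described above (service frequencies and repository similarity) can be sketched in Python. This is only an illustration under stated assumptions: the repository names and checklist values are hypothetical, only 5 of the 25 checklist items are shown, and the Jaccard distance is an assumed dissimilarity measure, since the abstract does not say which one was used. The resulting distance table is the kind of input a multidimensional scaling routine would take; the scaling step itself is not shown.

```python
from itertools import combinations

# Hypothetical checklist data: 1 = service offered, 0 = not offered.
# Only 5 illustrative items are shown instead of the study's 25.
checklist = {
    "repo_a": [1, 1, 0, 1, 0],
    "repo_b": [1, 1, 0, 0, 0],
    "repo_c": [0, 0, 1, 0, 1],
}

# Frequency of each service: number of repositories offering it.
service_frequency = [sum(column) for column in zip(*checklist.values())]


def jaccard_distance(x, y):
    """Return 1 - |intersection| / |union| of the services offered."""
    both = sum(1 for a, b in zip(x, y) if a and b)
    either = sum(1 for a, b in zip(x, y) if a or b)
    return 0.0 if either == 0 else 1.0 - both / either


# Pairwise dissimilarities between repositories (assumed measure).
distances = {
    (p, q): jaccard_distance(checklist[p], checklist[q])
    for p, q in combinations(sorted(checklist), 2)
}

print("service frequency:", service_frequency)
for pair, distance in distances.items():
    print(pair, round(distance, 2))
```

Repositories with many shared services get a distance near 0, and repositories with no services in common get a distance of 1.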


We found high variability between repositories: some value-added services were widely offered, whereas others were provided by only a few repositories.


We believe that the provision of value-added services could have a direct impact on repository use because such services are related to many of the reasons that explain repository under-utilisation, such as low perceived usefulness, difficulties depositing work and lack of knowledge about what should or can be deposited.


The evolving role of preprints in the dissemination of COVID-19 research and their impact on the science communication landscape

Authors : Nicholas Fraser, Liam Brierley, Gautam Dey, Jessica K. Polka, Máté Pálfy, Federico Nanni, Jonathon Alexis Coates

The world continues to face a life-threatening viral pandemic. The virus underlying the Coronavirus Disease 2019 (COVID-19), Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2), has caused over 98 million confirmed cases and 2.2 million deaths since January 2020.

Although the most recent respiratory viral pandemic swept the globe only a decade ago, the way science operates and responds to current events has experienced a cultural shift in the interim.

The scientific community has responded rapidly to the COVID-19 pandemic, releasing over 125,000 COVID-19–related scientific articles within 10 months of the first confirmed case, of which more than 30,000 were hosted by preprint servers.

We focused our analysis on bioRxiv and medRxiv, 2 growing preprint servers for biomedical research, investigating the attributes of COVID-19 preprints, their access and usage rates, as well as characteristics of their propagation on online platforms.

Our data provide evidence for increased scientific and public engagement with preprints related to COVID-19 (COVID-19 preprints are accessed more, cited more, and shared more on various online platforms than non-COVID-19 preprints), as well as changes in the use of preprints by journalists and policymakers.

We also find evidence for changes in preprinting and publishing behaviour: COVID-19 preprints are shorter and reviewed faster.

Our results highlight the unprecedented role of preprints and preprint servers in the dissemination of COVID-19 science and the impact of the pandemic on the scientific communication landscape.

URL : The evolving role of preprints in the dissemination of COVID-19 research and their impact on the science communication landscape


For how long and with what relevance do genetics articles retracted due to research misconduct remain active in the scientific literature

Authors : Rafael Dal-Ré, Carmen Ayuso

We aimed to quantify the number of pre- and post-retraction citations obtained by genetics articles retracted due to research misconduct. All retraction notices available in the Retraction Watch database for genetics articles published in 1970–2016 were assessed.

The reasons for retraction were fabrication/falsification and plagiarism. The endpoints were the number of citations of retracted articles, when and how journals reported the retractions, and whether the retraction notice was published on PubMed.

Four hundred and sixty retracted genetics articles were cited 34,487 times; 7,945 (23%) were post-retraction citations. Median time to retraction and time to last citation were 3.2 and 3 years, respectively. Most (96%) had a PubMed retraction notice. One percent of these were removed from journal websites altogether, and 4% had no information available on either the online or PDF versions.
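
The post-retraction share quoted above follows directly from the two citation counts. A minimal Python check, using only the figures given in the abstract (the function and variable names are illustrative and not taken from the study):

```python
# Citation counts quoted in the abstract.
TOTAL_CITATIONS = 34_487
POST_RETRACTION_CITATIONS = 7_945


def share_percent(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100.0 * part / whole


post_share = share_percent(POST_RETRACTION_CITATIONS, TOTAL_CITATIONS)
pre_retraction_citations = TOTAL_CITATIONS - POST_RETRACTION_CITATIONS

print(f"post-retraction share: {post_share:.1f}%")
print(f"pre-retraction citations: {pre_retraction_citations}")
```

Rounded to the nearest whole percent, the share matches the 23% reported in the abstract.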

Ninety percent of citations were from articles retracted due to falsification/fabrication. The percentage of post-retraction citations was significantly higher in the case of plagiarism (42%) than in the case of fabrication/falsification (21.5%) (p<0.001). Median time to retraction was shorter (1.3 years) in the case of plagiarism than for fabrication/falsification (4.8 years, p<0.001).

The retraction was more frequently reported in the PDFs (70%) for the fabrication/falsification cases than for the plagiarism cases (43%, p<0.001). The highest rate of retracted papers due to falsification/fabrication was among authors in the USA, and the highest rate for plagiarism was in China.

Although most retractions were appropriately handled by journals, the gravest issue was that median time to retraction for articles retracted for falsification/fabrication was nearly 5 years, earning close to 6,800 post-retraction citations. Journals should implement processes to speed up retraction, which will help to minimize post-retraction citations.

URL : For how long and with what relevance do genetics articles retracted due to research misconduct remain active in the scientific literature


Requiem for impact factors and high publication charges

Authors : Chris R Triggle, Ross MacDonald, David J. Triggle, Donald Grierson

Journal impact factors, publication charges and assessment of quality and accuracy of scientific research are critical for researchers, managers, funders, policy makers, and society. Editors and publishers compete for impact factor rankings, to demonstrate how important their journals are, and researchers strive to publish in perceived top journals, despite high publication and access charges.

This raises questions of how top journals are identified, whether assessments of impacts are accurate and whether high publication charges borne by the research community are justified, bearing in mind that they also collectively provide free peer-review to the publishers.

Although traditional journals accelerated peer review and publication during the COVID-19 pandemic, preprint servers made a greater impact with over 30,000 open access articles becoming available and accelerating a trend already seen in other fields of research.

We review and comment on the advantages and disadvantages of a range of assessment methods and the way in which they are used by researchers, managers, employers and publishers.

We argue that new approaches to assessment are required to provide a realistic and comprehensive measure of the value of research and journals and we support open access publishing at a modest, affordable price to benefit research producers and consumers.

URL : Requiem for impact factors and high publication charges