Co-citations in context: Disciplinary heterogeneity is relevant

Authors : James Bradley, Sitaram Devarakonda, Avon Davey, Dmitriy Korobskiy, Siyu Liu, Djamil Lakhdar-Hamina, Tandy Warnow, George Chacko

Citation analysis of the scientific literature has been used to study and define disciplinary boundaries, to trace the dissemination of knowledge, and to estimate impact. Co-citation, the frequency with which pairs of publications are cited together, provides insight into how documents relate to each other and across fields.
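As a rough illustration of the co-citation measure described above, the following sketch counts how often each pair of publications appears in the same reference list. The function name and the toy data are hypothetical and are not taken from the paper.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(reference_lists):
    """Count how often each pair of publications is cited together.

    reference_lists maps a citing paper ID to the list of IDs it cites.
    """
    counts = Counter()
    for cited in reference_lists.values():
        # Deduplicate and sort so each unordered pair is counted
        # once per citing paper, under a stable key.
        for pair in combinations(sorted(set(cited)), 2):
            counts[pair] += 1
    return counts


# Hypothetical toy data: three citing papers and their reference lists.
toy = {
    "P1": ["A", "B", "C"],
    "P2": ["A", "B"],
    "P3": ["B", "C"],
}
counts = cocitation_counts(toy)
print(counts[("A", "B")])  # A and B are cited together by P1 and P2
```

Pairs with high counts are candidates for "conventional" combinations; what counts as high depends on the reference frame, which is the point the abstract raises.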

Co-citation analysis has been used to characterize combinations of prior work as conventional or innovative and to derive features of highly cited publications. Given the organization of science into disciplines, a key question is the sensitivity of such analyses to frame of reference.

Our study examines this question using semantically themed citation networks. We observe that trends reported to be true across the scientific literature do not hold for focused citation networks, and we conclude that inferring novelty using co-citation analysis and random graph models benefits from disciplinary context.

DOI : https://doi.org/10.1162/qss_a_00007

From Academia to Software Development: Publication Citations in Source Code Comments

Authors : Akira Inokuchi, Yusuf Sulistyo Nugroho, Fumiaki Konishi, Hideaki Hata, Akito Monden, Kenichi Matsumoto

The impact of academic publications on research communities has long been evaluated based on the number of citations. The impact of academic publications on industry, on the other hand, has rarely been studied.

This paper investigates how academic publications contribute to software development by analyzing publication citations in source code comments in open source software repositories.

We propose an automated approach for detecting academic publications based on Named Entity Recognition, which achieves an F1 score of 0.90. We conduct a large-scale study of publication citations in 319,438,977 comments collected from 25,925 active repositories written in seven programming languages.

Our findings indicate that academic publications can be knowledge sources for software development, and that there are potential issues with knowledge becoming obsolete.
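For readers unfamiliar with the F1 measure quoted above, here is a minimal sketch of how it is computed from detection counts. The counts in the example are invented for illustration and are not the paper's data.

```python
def f1_score(true_positives, false_positives, false_negatives):
    """Return F1, the harmonic mean of precision and recall.

    Uses the equivalent closed form 2*TP / (2*TP + FP + FN).
    """
    denominator = 2 * true_positives + false_positives + false_negatives
    if denominator == 0:
        # No positives predicted and none expected: F1 is undefined,
        # reported here as 0.0 by convention.
        return 0.0
    return 2 * true_positives / denominator


# Hypothetical counts: 90 correct detections, 10 spurious, 10 missed.
print(f1_score(90, 10, 10))
```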

URL : https://arxiv.org/abs/1910.06932

The citation from patents to scientific output revisited: a new approach to the matching Patstat / Scopus

Authors : Vicente P. Guerrero-Bote, Rodrigo Sánchez-Jiménez, Félix De-Moya-Anegón

Patents include citations, both to other patents and to documents that are not patents (NPL, Non-patent literature). The latter include citations to articles published in scientific journals.

Just as the scientific impact is studied through the citation of articles and other scientific works, the technological impact of scientific works can also be studied through the citation they receive from patents.

The NPL references included in the patents are far from being standardized, so determining which scientific article they refer to is not trivial. This paper presents a procedure for linking the NPL references of the patents collected in the Patstat database and the scientific works indexed in the Scopus bibliographic database.

This procedure consists of two phases: a broad generation of candidate couples, followed by a validation of those couples. It has been implemented with reasonably good results and at affordable cost.
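The two-phase idea (broad candidate generation, then validation) can be sketched as follows. This is only an illustration under simple assumptions: the matching rules, the threshold, the field names, and the sample records are all hypothetical and do not reproduce the paper's actual procedure.

```python
import re


def normalize(text):
    """Lowercase and reduce punctuation to spaces so formatting matters less."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def candidate_pairs(npl_reference, articles):
    """Phase 1 (broad): keep articles whose year and first-author
    surname both appear somewhere in the NPL reference string."""
    ref = normalize(npl_reference)
    tokens = set(ref.split())
    return [
        a for a in articles
        if str(a["year"]) in tokens and normalize(a["first_author"]) in ref
    ]


def validate(npl_reference, article, threshold=0.8):
    """Phase 2 (strict): accept only if most title words occur in the reference."""
    ref_tokens = set(normalize(npl_reference).split())
    title_tokens = normalize(article["title"]).split()
    if not title_tokens:
        return False
    overlap = sum(t in ref_tokens for t in title_tokens) / len(title_tokens)
    return overlap >= threshold


def match(npl_reference, articles):
    """Return the IDs of candidate articles that pass validation."""
    return [a["id"] for a in candidate_pairs(npl_reference, articles)
            if validate(npl_reference, a)]


# Hypothetical records standing in for indexed articles.
articles = [
    {"id": "S1", "first_author": "Smith", "year": 2010,
     "title": "Graph methods for citation matching"},
    {"id": "S2", "first_author": "Smith", "year": 2010,
     "title": "Protein folding in yeast"},
    {"id": "S3", "first_author": "Jones", "year": 2012,
     "title": "Graph methods for citation matching"},
]
reference = ("SMITH J. et al., 'Graph methods for citation matching', "
             "J. Hypothetical Res., vol. 5 (2010), pp. 1-10")
print(match(reference, articles))
```

Keeping the first phase cheap and permissive limits the number of expensive comparisons, while the second phase filters out the false candidates it lets through.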

URL : https://recyt.fecyt.es/index.php/EPI/article/view/epi.2019.jul.01

The citation advantage of linking publications to research data

Authors : Giovanni Colavizza, Iain Hrynaszkiewicz, Isla Staden, Kirstie Whitaker, Barbara McGillivray

Efforts to make research results open and reproducible are increasingly reflected by journal policies encouraging or mandating authors to provide data availability statements.

As a consequence of this, there has been a strong uptake of data availability statements in recent literature. Nevertheless, it is still unclear what proportion of these statements actually contain well-formed links to data, for example via a URL or permanent identifier, and if there is an added value in providing them.

We consider 531,889 journal articles published by PLOS and BMC which are part of the PubMed Open Access collection, categorize their data availability statements according to their content and analyze the citation advantage of different statement categories via regression.

We find that, following mandated publisher policies, data availability statements have become common by now, yet statements containing a link to a repository are still just a fraction of the total.

We also find that articles with these statements, in particular, can have up to 25.36% higher citation impact on average: an encouraging result for all publishers and authors who make the effort of sharing their data. All our data and code are made available in order to reproduce and extend our results.

URL : https://arxiv.org/abs/1907.02565

Highly cited references in PLOS ONE and their in-text usage over time

Authors : Wolfgang Otto, Behnam Ghavimi, Philipp Mayr, Rajesh Piryani, Vivek Kumar Singh

In this article, we describe highly cited publications in a PLOS ONE full-text corpus. For these publications, we analyse the citation contexts concerning their position in the text and their age at the time of citing.

By selecting the perspective of highly cited papers, we can distinguish them based on the context during citation even if we do not have any other information source or metrics.

We describe the top cited references based on how, when and in which context they are cited. The focus of this study is on a time perspective to explain the nature of the reception of highly cited papers.

We have found that these references are distinguishable by the IMRaD sections in which they are cited. Furthermore, we show that the section usage of highly cited papers is time-dependent.

The longer the citation interval, the higher the probability that a reference is cited in a method section.

URL : https://arxiv.org/abs/1903.11693

Journals that Rise from the Fourth Quartile to the First Quartile in Six Years or Less: Mechanisms of Change and the Role of Journal Self-Citations

Author : Juan Miguel Campanario

Journal self-citations may be increased artificially to inflate a journal’s scientometric indicators. The aim of this study was to identify possible mechanisms of change in a cohort of journals that rose from the fourth (Q4) to the first quartile (Q1) over six years or less in Journal Citation Reports (JCR), and the role of journal self-citations in these changes.

A total of 51 different journals sampled from all JCR Science Citation Index (SCI) subject categories improved their rank position from Q4 in 2009 to Q1 in any year from 2010 to 2015. I identified changes in the numerator or denominator of the Journal Impact Factor (JIF) that were involved in each year-to-year transition.

The main mechanism of change was the increase in the number of citations used to compute the JIF. The effect of journal self-citations in the increase of the JIF was studied. The main conclusion is that there was no evidence of widespread JIF manipulation through the overuse of journal self-citations.
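To make the numerator and denominator concrete: the two-year JIF for year Y divides the citations received in Y by items published in Y-1 and Y-2 by the number of citable items published in those two years. The sketch below computes it with and without journal self-citations; the function name and the figures are hypothetical, not data from the study.

```python
def impact_factor(citations, citable_items, self_citations=0):
    """Two-year impact factor, optionally excluding journal self-citations.

    citations:      citations in year Y to items from years Y-1 and Y-2
    citable_items:  citable items published in years Y-1 and Y-2
    self_citations: portion of `citations` coming from the journal itself
    """
    if citable_items <= 0:
        raise ValueError("citable_items must be positive")
    if not 0 <= self_citations <= citations:
        raise ValueError("self_citations must lie between 0 and citations")
    return (citations - self_citations) / citable_items


# Hypothetical journal: 300 citations, 60 of them self-citations, 200 items.
with_self = impact_factor(300, 200)
without_self = impact_factor(300, 200, self_citations=60)
print(with_self, without_self)
```

Comparing the two values year over year is one simple way to see how much of a rise in JIF is attributable to self-citations rather than to external citations or a shrinking denominator.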

DOI : https://doi.org/10.3390/publications6040047

On the Heterogeneous Distributions in Paper Citations

Authors : Jinhyuk Yun, Sejung Ahn, June Young Lee

Academic papers have been the protagonists in disseminating expertise. Naturally, paper citation pattern analysis is an efficient and essential way of investigating the knowledge structure of science and technology.

For decades, it has been observed that citations of scientific literature follow a heterogeneous and heavy-tailed distribution, and many studies suggest a power-law distribution, a log-normal distribution, or related distributions.

However, many studies are limited to small-scale approaches; therefore, it is hard to generalize. To overcome this problem, we investigate 21 years of citation evolution through a systematic analysis of the entire citation history of 42,423,644 scientific publications published from 1996 to 2016 and contained in SCOPUS.

We tested six candidate distributions for the scientific literature at three distinct levels of the Scimago Journal & Country Rank (SJR) classification scheme. First, we observe that the raw number of annual citation acquisitions tends to follow the log-normal distribution for all disciplines, except for the first year of the publication.

We also find significant disparity between the yearly acquired citation number among the journals, which suggests that it is essential to remove the citation surplus inherited from the prestige of the journals.

Our simple method for separating the citation preference of an individual article from the inherited citation of the journals reveals an unexpected regularity in the normalized annual acquisitions of citations across the entire field of science.

Specifically, the normalized annual citation acquisitions follow power-law probability distributions with an exponential cut-off and exponents around 2.3, regardless of publication and citation year.
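For reference, a power law with an exponential cut-off is commonly written in the generic form below. The symbols $c$ (normalized annual citations), $\alpha$ (exponent), and $c_0$ (cut-off scale) are notation assumed here; the abstract gives only the approximate exponent, so the paper's exact parameterization may differ.

```latex
p(c) \;\propto\; c^{-\alpha} \, e^{-c / c_0}, \qquad \alpha \approx 2.3
```

For $c \ll c_0$ the distribution behaves like a pure power law, while for $c \gg c_0$ the exponential factor suppresses the tail.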

Our results imply that journal reputation has a substantial long-term impact on the citation.

URL : https://arxiv.org/abs/1810.08809