Google Scholar as a data source for research assessment

Authors : Emilio Delgado López-Cózar, Enrique Orduna-Malea, Alberto Martín-Martín

The launch of Google Scholar (GS) marked the beginning of a revolution in the scientific information market. This search engine, unlike traditional databases, automatically indexes information from the academic web. Its ease of use, together with its wide coverage and fast indexing, has made it the first tool most scientists currently turn to when they need to carry out a literature search.

Additionally, the fact that its search results were accompanied from the beginning by citation counts, as well as the later development of secondary products which leverage this citation data (such as Google Scholar Metrics and Google Scholar Citations), made many scientists wonder about its potential as a source of data for bibliometric analyses.

The goal of this chapter is to lay the foundations for the use of GS as a supplementary source (and in some disciplines, arguably the best alternative) for scientific evaluation.

First, we present a general overview of how GS works. Second, we present empirical evidence about its main characteristics (size, coverage, and growth rate). Third, we carry out a systematic analysis of the main limitations this search engine presents as a tool for the evaluation of scientific performance.

Lastly, we discuss the main differences between GS and other more traditional bibliographic databases in light of the correlations found between their citation data. We conclude that Google Scholar presents a broader view of the academic world because it has brought to light a great number of sources that were not previously visible.

URL : https://arxiv.org/abs/1806.04435

Collaboration Diversity and Scientific Impact

Authors : Yuxiao Dong, Hao Ma, Jie Tang, Kuansan Wang

The shift from individual effort to collaborative output has benefited science, with collaborative work increasingly leading to research of higher impact than work pursued individually.

However, understanding of how the diversity of a collaborative team influences the production of knowledge and innovation is sorely lacking. Here, we study this question by breaking down the collaboration process behind 32.9 million papers published over the last five decades.

We find that the probability of producing a top-cited publication increases as a function of the diversity of a team of collaborators, namely the number of distinct institutions represented by the team.
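
As an illustration of this measure, here is a minimal Python sketch that counts distinct institutions per team and tabulates the share of top-cited papers at each diversity level; the records, field layout, and top-decile threshold are illustrative assumptions, not the study's data:

# Sketch: institutional diversity of a paper's author team, and the
# share of top-cited papers at each diversity level. Toy records and
# the top-decile threshold below are illustrative assumptions.
from collections import defaultdict

papers = [
    # (paper_id, author_institutions, citation_count) -- toy data
    ("p1", ["MIT", "MIT"], 12),
    ("p2", ["MIT", "Stanford", "Tsinghua"], 310),
    ("p3", ["Oxford"], 4),
    ("p4", ["CMU", "Microsoft Research"], 95),
]

def team_diversity(institutions):
    """Diversity = number of distinct institutions on the team."""
    return len(set(institutions))

# Mark roughly the top decile of papers by citations as "top-cited".
counts = sorted(c for _, _, c in papers)
threshold = counts[int(0.9 * (len(counts) - 1))]

top_by_div = defaultdict(lambda: [0, 0])  # diversity -> [top-cited, total]
for _, insts, cites in papers:
    d = team_diversity(insts)
    top_by_div[d][1] += 1
    top_by_div[d][0] += int(cites >= threshold)

for d in sorted(top_by_div):
    top, total = top_by_div[d]
    print(f"diversity={d}: P(top-cited) = {top}/{total} = {top / total:.2f}")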

We discover a striking phenomenon: a smaller but more diverse team is more likely to generate highly innovative work than a larger team drawn from a single institution.

We demonstrate that the synergy of collaboration diversity is universal across different generations, research fields, and tiers of institutions and individual authors.

Our findings suggest that collaboration diversity correlates strongly and positively with the production of scientific innovation, with the potential to reshape the policies funding agencies and authorities use to fund research projects and, more broadly, the principles used to organize teams, organizations, and societies.

URL : https://arxiv.org/abs/1806.03694

Analysing researchers’ outreach efforts and the association with publication metrics: A case study of Kudos

Authors : Mojisola Erdt, Htet Htet Aung, Ashley Sara Aw, Charlie Rapple, Yin-Leng Theng

With the growth of scholarly collaboration networks and social communication platforms, members of the scholarly community are experimenting with their approach to disseminating research outputs, in an effort to increase their audience and outreach.

However, from a researcher’s point of view, it is difficult to determine whether efforts to make work more visible are worthwhile (in terms of their association with publication metrics) and, within that, which platform or network is most effective for sharing work and connecting with a wider audience.

We undertook a case study of Kudos (https://www.growkudos.com), a web-based service that claims to help researchers increase the outreach of their publications, to examine the most effective tools for sharing publications online, and to investigate which actions are associated with improved metrics.

We extracted a dataset from Kudos of 830,565 unique publications claimed by authors, of which 20,775 had actions taken in Kudos to explain or share them; for 4,867 of these, full-text download data from publishers were available.

Findings show that researchers are most likely to share their work on Facebook, but links shared on Twitter are more likely to be clicked on. A Mann-Whitney U test revealed that the treatment group (publications with actions in Kudos) had a significantly higher median of 149 full-text downloads per publication (23.1% more) than the control group (publications with no actions in Kudos), which had a median of 121 full-text downloads per publication.
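
A minimal sketch of the reported comparison using scipy; the download counts below are synthetic stand-ins for the Kudos treatment and control groups, not the study's data:

# Sketch of the reported comparison: full-text downloads for publications
# with Kudos actions (treatment) vs. without (control), compared with a
# Mann-Whitney U test. The numbers below are synthetic stand-ins.
import numpy as np
from scipy.stats import mannwhitneyu

rng = np.random.default_rng(0)
treatment = rng.negative_binomial(5, 5 / (5 + 149), size=4867)  # mean ~149
control = rng.negative_binomial(5, 5 / (5 + 121), size=4867)    # mean ~121

stat, p = mannwhitneyu(treatment, control, alternative="two-sided")
print(f"U = {stat:.0f}, p = {p:.3g}")
print(f"median treatment = {np.median(treatment):.0f}, "
      f"median control = {np.median(control):.0f}")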

These findings suggest that performing actions on publications, such as sharing, explaining, or enriching, could help to increase the number of full text downloads of a publication.

DOI : https://doi.org/10.1371/journal.pone.0183217

Citation Count Analysis for Papers with Preprints

Authors : Sergey Feldman, Kyle Lo, Waleed Ammar

We explore the degree to which papers prepublished on arXiv garner more citations, in an attempt to paint a sharper picture of fairness issues related to prepublishing. A paper’s citation count is estimated using a negative-binomial generalized linear model (GLM) with a binary covariate indicating whether the paper was prepublished.

We control for author influence (via the authors’ h-index at the time of paper writing), publication venue, and the overall time the paper has been available on arXiv. Our analysis only includes papers that were eventually accepted for publication at top-tier CS conferences, and were posted on arXiv either before or after the acceptance notification.

We observe that papers submitted to arXiv before acceptance have, on average, 65% more citations in the following year compared to papers submitted after acceptance. We note that this finding is not causal, and we discuss possible next steps.
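
A minimal sketch of the described model using statsmodels; the column names, synthetic data, and fixed dispersion are assumptions (with real data the dispersion would be estimated rather than fixed):

# Sketch of the described model: citation counts regressed on a binary
# "posted to arXiv before acceptance" indicator, controlling for author
# h-index, venue, and time on arXiv. Column names and data are synthetic
# stand-ins; alpha is fixed here for simplicity.
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 2000
df = pd.DataFrame({
    "prepublished": rng.integers(0, 2, n),
    "max_h_index": rng.poisson(20, n),
    "venue": rng.choice(["A", "B", "C"], n),
    "months_on_arxiv": rng.integers(1, 24, n),
})
# Poisson stand-in for a count outcome with a built-in prepublication effect.
mu = np.exp(0.5 * df.prepublished + 0.02 * df.max_h_index + 1.0)
df["citations"] = rng.poisson(mu)

model = smf.glm(
    "citations ~ prepublished + max_h_index + C(venue) + months_on_arxiv",
    data=df,
    family=sm.families.NegativeBinomial(alpha=1.0),
).fit()

# Coefficients are on the log scale: exp(beta) is the multiplicative effect
# on expected citations; exp(0.5) ~ 1.65 corresponds to a ~65% gap like the
# one the authors report.
print(np.exp(model.params["prepublished"]))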

URL : https://arxiv.org/abs/1805.05238

Measuring Scientific Broadness

Authors : Tom Price, Sabine Hossenfelder

Who has not read letters of recommendation that comment on a student’s ‘broadness’ and wondered what to make of it?

We here propose a way to quantify scientific broadness by a semantic analysis of researchers’ publications. We apply our methods to papers on the open-access server arXiv.org and report our findings.
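
The abstract does not specify the measure, so the following Python sketch shows only one plausible operationalization, assumed here purely for illustration: the Shannon entropy of a researcher’s distribution over arXiv categories. This is an assumption, not necessarily the authors’ method:

# Illustrative only: one plausible proxy for "broadness" -- the Shannon
# entropy of a researcher's distribution over arXiv primary categories.
# This is an assumption, not the authors' actual method.
from collections import Counter
from math import log2

def broadness(categories):
    """Entropy (bits) of the category distribution of one researcher's papers."""
    counts = Counter(categories)
    total = sum(counts.values())
    return -sum((c / total) * log2(c / total) for c in counts.values())

# A researcher publishing across three fields vs. a single field:
print(broadness(["hep-th", "gr-qc", "astro-ph", "hep-th"]))  # 1.5 bits
print(broadness(["cs.LG"] * 4))                              # 0.0 bits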

URL : https://arxiv.org/abs/1805.04647

The Journal Impact Factor: A brief history, critique, and discussion of adverse effects

Authors : Vincent Larivière, Cassidy R. Sugimoto

The Journal Impact Factor (JIF) is, by far, the most discussed bibliometric indicator. Since its introduction over 40 years ago, it has had enormous effects on the scientific ecosystem: transforming the publishing industry, shaping hiring practices and the allocation of resources, and, as a result, reorienting the research activities and dissemination practices of scholars.

Given both the ubiquity and impact of the indicator, the JIF has been widely dissected and debated by scholars of every disciplinary orientation. Drawing on the existing literature as well as on original research, this chapter provides a brief history of the indicator and highlights well-known limitations, such as the asymmetry between the numerator and the denominator, differences across disciplines, the insufficient citation window, and the skewness of the underlying citation distributions.
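
For reference, the standard two-year JIF for a journal in year y can be written as:

\[
\mathrm{JIF}_{y} \;=\; \frac{C_{y}(y-1) + C_{y}(y-2)}{N(y-1) + N(y-2)}
\]

where $C_{y}(t)$ is the number of citations received in year $y$ by everything the journal published in year $t$, and $N(t)$ is the number of “citable items” (articles and reviews) published in year $t$. The asymmetry mentioned above arises because the numerator counts citations to all document types, including editorials and letters, while the denominator counts only citable items.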

The inflation of the JIF and its weakening predictive power are discussed, as well as the adverse effects on the behaviors of individual actors and the research enterprise. Alternative journal-based indicators are described, and the chapter concludes with a call for responsible application and a commentary on future developments in journal indicators.

URL : https://arxiv.org/abs/1801.08992

Comparing scientific and technological impact of biomedical research

Author : Qing Ke

Traditionally, the number of citations that a scholarly paper receives from other papers is used as the proxy of its scientific impact. Yet citations can come from domains outside the scientific community, and one such example is patented technologies: papers can be cited by patents, achieving technological impact.

While the scientific impact of papers has been extensively studied, the technological aspect remains largely unknown. Here we aim to fill this gap by presenting a comparative study on how 919 thousand biomedical papers are cited by U.S. patents and by other papers over time.

We observe a positive correlation between citations from patents and citations from papers, but there is little overlap between the two domains in either the most-cited papers or the papers with the most delayed recognition.

We also find that the two types of citations exhibit distinct temporal variations, with patent citations lagging behind paper citations by a median of 6 years for the majority of papers. Our work contributes to the understanding of the technological, and more broadly societal, impact of papers.
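
A minimal sketch of the two quantities being compared, assuming hypothetical per-paper records: the rank correlation of patent vs. paper citation counts, and the median lag between a paper’s first paper citation and its first patent citation:

# Sketch of the comparison described: rank-correlate per-paper citation
# counts from patents vs. from papers, and compute each paper's lag between
# first paper citation and first patent citation. Toy records below are
# illustrative, not the study's data.
import numpy as np
from scipy.stats import spearmanr

# (paper_citations, patent_citations, year_first_paper_cite, year_first_patent_cite)
records = [
    (120, 3, 1996, 2003),
    (45, 1, 1999, 2004),
    (300, 9, 1995, 2000),
    (10, 0, 2001, None),  # never cited by a patent
]

paper_c = [r[0] for r in records]
patent_c = [r[1] for r in records]
rho, p = spearmanr(paper_c, patent_c)
print(f"Spearman rho = {rho:.2f} (p = {p:.2f})")

lags = [r[3] - r[2] for r in records if r[3] is not None]
print(f"median patent-citation lag = {np.median(lags):.0f} years")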

URL : https://arxiv.org/abs/1804.04105