Google Scholar as a data source for research assessment

Authors : Emilio Delgado López-Cózar, Enrique Orduna-Malea, Alberto Martín-Martín

The launch of Google Scholar (GS) marked the beginning of a revolution in the scientific information market. This search engine, unlike traditional databases, automatically indexes information from the academic web. Its ease of use, together with its wide coverage and fast indexing speed, has made it the first tool most scientists currently turn to when they need to carry out a literature search.

Additionally, the fact that its search results were accompanied from the beginning by citation counts, as well as the later development of secondary products which leverage this citation data (such as Google Scholar Metrics and Google Scholar Citations), made many scientists wonder about its potential as a source of data for bibliometric analyses.

The goal of this chapter is to lay the foundations for the use of GS as a supplementary source (and in some disciplines, arguably the best alternative) for scientific evaluation.

First, we present a general overview of how GS works. Second, we present empirical evidence about its main characteristics (size, coverage, and growth rate). Third, we carry out a systematic analysis of the main limitations this search engine presents as a tool for the evaluation of scientific performance.

Lastly, we discuss the main differences between GS and other more traditional bibliographic databases in light of the correlations found between their citation data. We conclude that Google Scholar presents a broader view of the academic world because it has brought to light a great number of sources that were not previously visible.

URL : https://arxiv.org/abs/1806.04435

The counting house: measuring those who count. Presence of Bibliometrics, Scientometrics, Informetrics, Webometrics and Altmetrics in the Google Scholar Citations, ResearcherID, ResearchGate, Mendeley & Twitter

Authors : Alberto Martin-Martin, Enrique Orduna-Malea, Juan M. Ayllon, Emilio Delgado Lopez-Cozar

Following in the footsteps of the model of scientific communication, which has recently gone through a metamorphosis (from the Gutenberg galaxy to the Web galaxy), a change in the model and methods of scientific evaluation is also taking place.

A set of new scientific tools is now providing a variety of indicators which measure all actions and interactions among scientists in the digital space, making new aspects of scientific communication emerge.

In this work we present a method for capturing the structure of an entire scientific community (the Bibliometrics, Scientometrics, Informetrics, Webometrics, and Altmetrics community) and the main agents that are part of it (scientists, documents, and sources) through the lens of Google Scholar Citations.

Additionally, we compare these author portraits to the ones offered by other profile or social platforms currently used by academics (ResearcherID, ResearchGate, Mendeley, and Twitter), in order to test their degree of use, completeness, reliability, and the validity of the information they provide.

A sample of 814 authors (researchers in Bibliometrics with a public profile created in Google Scholar Citations) was subsequently searched for on the other platforms, collecting the main indicators computed by each of them.

The data collection was carried out in September 2015. The Spearman correlation was applied to these indicators (a total of 31), and a Principal Component Analysis was carried out in order to reveal the relationships among metrics and platforms as well as the possible existence of metric clusters.
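The correlation step described above can be illustrated with a minimal sketch. It assumes hypothetical indicator names and values (none are data from the study) and implements Spearman's coefficient as a Pearson correlation on average ranks. The Principal Component Analysis is not shown; it would typically be run on the resulting correlation matrix.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equally long, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical indicator values for five authors (illustrative only).
indicators = {
    "gs_citations": [1200, 340, 95, 2100, 18],
    "rg_reads": [800, 410, 60, 1500, 30],
    "twitter_followers": [50, 900, 20, 10, 400],
}

# Pairwise correlation matrix keyed by (indicator, indicator).
names = list(indicators)
matrix = {
    (a, b): spearman(indicators[a], indicators[b]) for a in names for b in names
}
```

Rank-based correlation is a natural fit here because indicators such as citation or follower counts tend to be heavily skewed, and ranks are insensitive to that skew.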

URL : https://arxiv.org/abs/1602.02412

‘Just Google it’ – the scope of freely available information sources for doctoral thesis writing

Authors : Vincas Grigas, Simona Juzėnienė, Jonė Veličkaitė

Introduction

Recent developments in the field of scientific information resource provision lead us to the key research question, namely, what is the coverage of freely available information sources when writing doctoral theses, and whether the academic library can assume the leading role as a direct intermediator for information users.

Method

Citation analysis of doctoral theses was conducted in the summer of 2015. A total of thirty-nine theses (with 6,998 references) defended at Vilnius University at the end of 2014 was selected (30 per cent of all defended theses).

Theses were randomly chosen from different research fields: the humanities, social sciences, biomedical sciences, technological sciences, and physical sciences.

Analysis

The research team was tasked with identifying whether certain resources could be found in the eCatalogue of an academic library, its subscribed databases, freely available online (through Google or Google Scholar), or whether the resources from the library's subscribed databases are identical to those which are freely available.

The data gathering process included such resource categories as journal papers, printed and electronic books or book chapters, and other documents (legal reports, conference papers, newspaper articles, Websites, theses, etc.).

Conclusions

Library collections and subscribed databases could cover up to 80 per cent of all information resources used in doctoral theses. Among the most significant findings to emerge from this study is the fact that on average more than half (57 per cent) of all utilised information resources were freely available or were accessed without library support.

We may presume that the library as a direct intermediator for information users is potentially important and irreplaceable only in four out of ten attempts of PhD students to seek information.
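The kind of tally behind these percentages can be sketched as follows. The records and field names (`in_library`, `free_online`) are hypothetical and are not taken from the study; the sketch only shows how coverage shares could be computed once each reference has been checked against both channels. On this reading, the "four out of ten" figure presumably reflects the roughly 43 per cent of resources that were not freely available.

```python
# Hypothetical reference records: each notes where the cited source was found.
references = [
    {"id": 1, "in_library": True, "free_online": True},
    {"id": 2, "in_library": True, "free_online": False},
    {"id": 3, "in_library": False, "free_online": True},
    {"id": 4, "in_library": False, "free_online": False},
    {"id": 5, "in_library": True, "free_online": True},
]


def coverage_shares(refs):
    """Return the percentage of references found through each channel."""
    total = len(refs)
    if total == 0:
        raise ValueError("no references to analyse")
    counts = {
        # Found in the library catalogue or subscribed databases.
        "library": sum(r["in_library"] for r in refs),
        # Found freely online (for example through Google or Google Scholar).
        "free": sum(r["free_online"] for r in refs),
        # Found only through the library, so the library is irreplaceable.
        "library_only": sum(r["in_library"] and not r["free_online"] for r in refs),
        # Found through neither channel.
        "not_found": sum(not r["in_library"] and not r["free_online"] for r in refs),
    }
    return {channel: 100 * count / total for channel, count in counts.items()}


shares = coverage_shares(references)
```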

URL : http://www.informationr.net/ir/22-1/paper738.html

Back to the past: on the shoulders of an academic search engine giant

A study released by the Google Scholar team found an apparently increasing fraction of citations to old articles from studies published in the last 24 years (1990-2013). To verify this finding we conducted a complementary study using a different data source (Journal Citation Reports), metric (aggregate cited half-life), time span (2003-2013), and set of categories (53 Social Science subject categories and 167 Science subject categories).
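As a rough illustration of the metric named above, the sketch below computes a cited half-life from a distribution of citations by age of the cited item, and an aggregate version by summing the distributions of all journals in a category. It is a simplified reading that assumes linear interpolation within the year where the 50 per cent mark falls; the exact Journal Citation Reports procedure may differ, and all numbers are hypothetical.

```python
def cited_half_life(citations_by_age):
    """Age (in years) by which half of all citations have been accounted for.

    citations_by_age[0] holds citations to items published in the census
    year, citations_by_age[1] to items one year older, and so on.
    """
    total = sum(citations_by_age)
    if total == 0:
        raise ValueError("no citations recorded")
    half = total / 2
    cumulative = 0
    for age, count in enumerate(citations_by_age):
        if cumulative + count >= half:
            # Share of this year's citations needed to reach the halfway point.
            fraction = (half - cumulative) / count
            return age + fraction
        cumulative += count


def aggregate_distribution(journal_distributions):
    """Sum per-age citation counts across journals (equal-length lists assumed)."""
    return [sum(column) for column in zip(*journal_distributions)]


# Hypothetical citation-age distributions for two journals in one category.
journal_a = [10, 20, 30, 40]
journal_b = [40, 30, 20, 10]

category_half_life = cited_half_life(aggregate_distribution([journal_a, journal_b]))
```

A half-life that grows from one census year to the next would indicate that a larger share of citations is going to older documents.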

Although the results obtained confirm and reinforce the previous findings, the possible causes of this phenomenon remain unclear. Finally, we hypothesize that the "first page results syndrome", in conjunction with the fact that Google Scholar favours the most cited documents, suggests that the growing trend of citing old documents is partly caused by Google Scholar.

URL : http://arxiv.org/abs/1603.09111

The application of bibliometrics to research evaluation in the humanities and social sciences: an exploratory study using normalized Google Scholar data for the publications of a research institute

In the humanities and social sciences, bibliometric methods for the assessment of research performance are (so far) less common. The current study takes a concrete example in an attempt to evaluate a research institute from the area of social sciences and humanities with the help of data from Google Scholar (GS).

In order to use GS for a bibliometric study, we have developed procedures for the normalisation of citation impact, building on the procedures of classical bibliometrics. In order to test the convergent validity of the normalized citation impact scores, we have calculated normalized scores for a subset of the publications based on data from the WoS or Scopus.
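One common form of normalisation in classical bibliometrics divides a paper's citation count by the mean count of a reference set of comparable papers. The sketch below assumes that form, with the reference set taken to be the other papers in the same journal; the procedure actually developed by the authors is not spelled out here, and all figures are hypothetical.

```python
from statistics import mean


def normalized_score(citations, reference_citations):
    """Citations of one paper relative to the mean of its reference set."""
    expected = mean(reference_citations)
    if expected == 0:
        raise ValueError("reference set has no citations; score is undefined")
    return citations / expected


# Hypothetical data: (citations of the institute's paper,
# citations of the other papers in the same journal).
institute_papers = [
    (12, [4, 8, 12, 16, 20]),
    (3, [2, 2, 4, 4]),
    (30, [10, 20, 30]),
]

scores = [normalized_score(c, ref) for c, ref in institute_papers]

# Mean normalised score for the institute as a whole.
institute_score = mean(scores)
```

Under this definition a score of 1 means the paper is cited at the average rate of its reference set, a score above 1 means it is cited more often, and a score below 1 less often.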

Even if scores calculated with the help of GS and WoS/Scopus are not identical for the different publication types considered here, they are so similar that they result in the same assessment of the institute investigated in this study. For example, the institute's papers whose journals are covered in WoS are cited at about an average rate (compared with the other papers in the journals).

URL : https://figshare.com/articles/The_application_of_bibliometrics_to_research_evaluation_in_the_humanities_and_social_sciences_an_exploratory_study_using_normalized_Google_Scholar_data_for_the_publications_of_a_research_institute/1293588

The dark side of Open Access in Google and Google Scholar: the case of Latin-American repositories

Since repositories are a key tool in making scholarly knowledge open access, determining their presence and impact on the Web is essential, particularly in Google (search engine par excellence) and Google Scholar (a tool increasingly used by researchers to search for academic information).

The few studies conducted so far have been limited to very specific geographic areas (USA), which makes it necessary to find out what is happening in other regions that are not part of mainstream academia, and where repositories play a decisive role in the visibility of scholarly production. The main objective of this study is to ascertain the presence and visibility of Latin American repositories in Google and Google Scholar through the application of page count and visibility indicators.

For a sample of 137 repositories, the results indicate that the indexing ratio is low in Google, and virtually nonexistent in Google Scholar; they also indicate a complete lack of correspondence between the repository records and the data produced by these two search tools. These results are mainly attributable to limitations arising from the use of description schemas that are incompatible with Google Scholar (repository design) and the reliability of web indicators (search engines).

We conclude that neither Google nor Google Scholar accurately represents the actual size of open access content published by Latin American repositories; this may indicate a non-indexed, hidden side to open access, which could be limiting the dissemination and consumption of open access scholarly literature.
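The indexing ratio mentioned above can be read as the page count a search tool reports for a repository divided by the number of records the repository actually holds. The sketch below assumes that reading; the repository names and counts are hypothetical and are not results from the study.

```python
def indexing_ratio(page_count, repository_records):
    """Share of a repository's records that a search tool reports as indexed."""
    if repository_records <= 0:
        raise ValueError("repository must hold at least one record")
    return page_count / repository_records


# Hypothetical repositories: records held, and page counts reported
# by each search tool.
repositories = {
    "repo_a": {"records": 10000, "google": 1500, "google_scholar": 0},
    "repo_b": {"records": 4000, "google": 2000, "google_scholar": 40},
}

ratios = {
    name: {
        tool: indexing_ratio(data[tool], data["records"])
        for tool in ("google", "google_scholar")
    }
    for name, data in repositories.items()
}
```

Because page counts reported by search engines are estimates, a ratio computed this way inherits their unreliability, which is one of the limitations the abstract points to.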

URL : http://arxiv-web3.library.cornell.edu/abs/1406.4331

Google Scholar as a tool for discovering journal articles in library and information science

Purpose: The purpose of this paper is to measure the coverage of Google Scholar for the Library and Information Science (LIS) journal literature as identified by a list of core LIS journals from a study by Schlögl and Petschnig (2005).

Methods: We checked every article from 35 major LIS journals from the years 2004 to 2006 for availability in Google Scholar (GS). We also collected information on the type of availability—i.e., whether a certain article was available as a PDF for a fee, as a free PDF, or as a preprint.

Results: We found that only some journals are completely indexed by Google Scholar, that the ratio of versions available depends on the type of publisher, and that availability varies a lot from journal to journal. Google Scholar cannot substitute for abstracting and indexing services in that it does not cover the complete literature of the field. However, it can be used in many cases to easily find available full texts of articles already found using another tool.

Originality/value: This study differs from other Google Scholar coverage studies in that it takes into account not only whether an article is indexed in GS at all, but also the type of availability.

URL : http://eprints.rclis.org/handle/10760/16084