Google Scholar, Microsoft Academic, Scopus, Dimensions, Web of Science, and OpenCitations’ COCI: a multidisciplinary comparison of coverage via citations

Authors : Alberto Martín-Martín, Mike Thelwall, Enrique Orduna-Malea, Emilio Delgado López-Cózar

New sources of citation data have recently become available, such as Microsoft Academic, Dimensions, and the OpenCitations Index of CrossRef open DOI-to-DOI citations (COCI). Although these have been compared to the Web of Science Core Collection (WoS), Scopus, or Google Scholar, there is no systematic evidence of their differences across subject categories.

In response, this paper investigates 3,073,351 citations found by these six data sources to 2,515 English-language highly-cited documents published in 2006 from 252 subject categories, expanding and updating the largest previous study. Google Scholar found 88% of all citations, many of which were not found by the other sources, and nearly all citations found by the remaining sources (89–94%).

A similar pattern held within most subject categories. Microsoft Academic is the second largest overall (60% of all citations), including 82% of Scopus citations and 86% of WoS citations. In most categories, Microsoft Academic found more citations than Scopus and WoS (182 and 223 subject categories, respectively), but had coverage gaps in some areas, such as Physics and some Humanities categories. After Scopus, Dimensions is fourth largest (54% of all citations), including 84% of Scopus citations and 88% of WoS citations.

Dimensions found more citations than Scopus in 36 categories, more than WoS in 185, and displays some coverage gaps, especially in the Humanities. Following WoS, COCI is the smallest source, with 28% of all citations. Google Scholar remains the most comprehensive source. In many subject categories, Microsoft Academic and Dimensions are good alternatives to Scopus and WoS in terms of coverage.
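The pairwise figures above (e.g., Microsoft Academic including 82% of Scopus citations) reduce to set operations over citing-document identifiers: the share of one source's citations that another source also found. A minimal sketch with invented toy citation sets, not the study's data:

```python
def coverage(source_a, source_b):
    """Percentage of source_b's citations that source_a also found."""
    if not source_b:
        return 0.0
    return 100 * len(source_a & source_b) / len(source_b)

# Toy citation sets keyed by citing-document ID (illustrative only).
google_scholar = {"c1", "c2", "c3", "c4", "c5", "c6", "c7", "c8", "c9"}
scopus = {"c1", "c2", "c3", "c4", "c5"}

print(coverage(google_scholar, scopus))  # share of Scopus citations GS also finds
```

Note the measure is asymmetric: `coverage(a, b)` and `coverage(b, a)` generally differ, which is why the abstract reports both directions.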

DOI : https://doi.org/10.1007/s11192-020-03690-4

Inferring the causal effect of journals on citations

Author : Vincent Traag

Articles in high-impact journals are by definition more highly cited on average. But are they cited more often because the articles are somehow “better”? Or are they cited more often simply because they appeared in a high-impact journal? Although some evidence suggests the latter, the causal relationship is not clear.

We here compare citations of published journal articles to citations of their preprint versions to uncover the causal mechanism. We build on an earlier model to infer the causal effect of journals on citations. We find evidence for both effects.

We show that high-impact journals seem to select articles that tend to attract more citations. At the same time, we find that high-impact journals augment the citation rate of published articles.

Our results yield a deeper understanding of the role of journals in the research system. The use of journal metrics in research evaluation has been increasingly criticised in recent years and article-level citations are sometimes suggested as an alternative.

Our results show that removing impact factors from evaluation does not negate the influence of journals. This insight has important implications for changing practices of research evaluation.

URL : https://arxiv.org/abs/1912.08648

How Many More Cites is a $3,000 Open Access Fee Buying You? Empirical Evidence from a Natural Experiment

Authors : Frank Mueller-Langer, Richard Watt

This paper analyzes the effect of open access (OA) status of published journal articles on peer recognition, as measured by the number of citations. Using cross-sectional and panel data from interdisciplinary mathematics and economics journals, we perform negative binomial, Poisson, and linear regressions, together with generalized-method-of-moments/instrumental-variable (GMM/IV) regressions.
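In the simplest case the count models above estimate a citation rate ratio between OA and non-OA articles; with a single binary OA indicator, the Poisson maximum-likelihood estimate of that ratio reduces to the ratio of group mean citation counts. A toy sketch with invented counts, not the paper's data or full model:

```python
from statistics import mean

def rate_ratio(cites_oa, cites_closed):
    """MLE of exp(beta) in a Poisson regression of citations on a single
    binary OA indicator: the ratio of the group mean citation counts."""
    return mean(cites_oa) / mean(cites_closed)

# Invented toy citation counts, purely illustrative.
oa_cites = [4, 7, 3, 6, 5]
closed_cites = [3, 5, 2, 4, 6]
print(rate_ratio(oa_cites, closed_cites))
```

The paper's actual regressions add covariates and instruments precisely because this naive group comparison is biased when OA status is not exogenous, which is what the hybrid-OA pilot agreements address.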

We benefit from a natural experiment via hybrid OA pilot agreements. Under these agreements, OA status is exogenously assigned to all articles of authors affiliated with hybrid OA pilot institutions.

Our cross-sectional analysis of the full sample suggests that there is no citation benefit associated with hybrid OA. In contrast, for the subpopulation of journal articles for which neither OA pre-prints nor OA post-prints are available, we find positive hybrid OA effects for the full sample and each discipline separately.

We address the issue of selection bias by exploiting a panel of journal articles for which OA pre-prints are available. Citations to pre-prints allow us to identify the intrinsic quality of articles prior to publication in a journal.

The results from the panel analysis provide additional empirical evidence for a negligible hybrid OA citation effect.

URL : https://ssrn.com/abstract=3096572

Research impact of paywalled versus open access papers

Authors : Éric Archambault, Grégoire Côté, Brooke Struck, Matthieu Voorons

This note presents data from the 1science oaIndx on the average of relative citations (ARC) for 3.3 million papers published from 2007 to 2009 and indexed in the Web of Science (WoS).

These data show a decidedly large citation advantage for open access (OA) papers, even though OA papers suffer a lag in availability compared to paywalled papers.
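The ARC indicator normalises each paper's citation count by the average count for its field before averaging, so values above 1 indicate above-expectation impact. A minimal sketch of that normalisation with invented baselines and counts (not the 1science data):

```python
from statistics import mean

def arc(papers, field_means):
    """Average of relative citations: each paper's citation count is
    divided by the mean count for its field, then the normalised
    values are averaged over the set of papers.

    papers      -- list of (field, citation_count) pairs
    field_means -- mean citation count per field over the whole database
    """
    return mean(count / field_means[field] for field, count in papers)

# Invented field baselines and paper counts (illustrative only).
field_means = {"bio": 10.0, "math": 2.0}
oa_papers = [("bio", 15), ("math", 3), ("bio", 12)]
print(arc(oa_papers, field_means))
```

Normalising by field (and, in practice, publication year) is what makes ARC values comparable across the 3.3 million papers pooled from very differently cited disciplines.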

URL : http://www.1science.com/oanumbr.html

The data sharing advantage in astrophysics

We present here evidence for the existence of a citation advantage within astrophysics for papers that link to data. Using simple measures based on publication data from the NASA Astrophysics Data System, we find a citation advantage for papers with links to data: they receive, on average, significantly more citations per paper than papers without links to data. Furthermore, using the INSPEC and Web of Science databases, we investigate whether papers of an experimental or theoretical nature display different citation behavior.
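The "significantly more citations per paper" claim is the kind of two-group mean comparison a permutation test can assess; the sketch below is our own illustration with invented counts, not the measure used in the paper:

```python
import random
from statistics import mean

def permutation_p_value(with_data, without_data, n_perm=10_000, seed=0):
    """One-sided permutation test of mean(with_data) > mean(without_data).

    Repeatedly shuffles the pooled citation counts and measures how often
    a random split produces a mean difference at least as large as observed.
    """
    rng = random.Random(seed)
    observed = mean(with_data) - mean(without_data)
    pooled = list(with_data) + list(without_data)
    k = len(with_data)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if mean(pooled[:k]) - mean(pooled[k:]) >= observed:
            hits += 1
    return hits / n_perm

# Invented citation counts per paper (illustrative only).
linked = [12, 9, 15, 11, 14, 10]
unlinked = [7, 5, 8, 6, 9, 4]
print(permutation_p_value(linked, unlinked, n_perm=2_000))
```

A small p-value says the observed gap between data-linked and unlinked papers is unlikely under random assignment of counts to groups; it does not, of course, rule out confounders such as paper type, which is why the abstract also splits by experimental versus theoretical papers.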

URL : http://arxiv.org/abs/1511.02512

With Great Power Comes Great Responsibility: the Importance of Rejection, Power, and Editors in the Practice of Scientific Publishing

“Peer review is an important element of scientific communication but deserves quantitative examination. We used data from the manuscript-handling service Manuscript Central for ten mid-tier ecology and evolution journals to test whether the number of external reviews completed improved citation rates for all accepted manuscripts. Contrary to a previous study examining this issue using resubmission data as a proxy for reviews, we show that citation rates of manuscripts do not correlate with the number of individuals that provided reviews. Importantly, externally reviewed papers do not outperform editor-only-reviewed published papers in terms of visibility within a 5-year citation window. These findings suggest that in many instances editors can be all that is needed to review papers (or at least to conduct the critical first review to assess general suitability), if the purpose of peer review is primarily to filter, and that journals can consider reducing the number of referees associated with reviewing ecology and evolution papers.”


DOI: 10.1371/journal.pone.0085382

Data reuse and the open data citation advantage

Background: Attribution to the original contributor upon reuse of published data is important both as a reward for data creators and to document the provenance of research findings. Previous studies have found that papers with publicly available datasets receive a higher number of citations than similar studies without available data. However, few previous analyses have had the statistical power to control for the many variables known to predict citation rate, which has led to uncertain estimates of the “citation benefit”. Furthermore, little is known about patterns in data reuse over time and across datasets.

Method and Results: Here, we look at citation rates while controlling for many known citation predictors and investigate the variability of data reuse. In a multivariate regression on 10,555 studies that created gene expression microarray data, we found that studies that made data available in a public repository received 9% (95% confidence interval: 5% to 13%) more citations than similar studies for which the data was not made available. Date of publication, journal impact factor, open access status, number of authors, first and last author publication history, corresponding author country, institution citation history, and study topic were included as covariates. The citation benefit varied with date of dataset deposition: a citation benefit was most clear for papers published in 2004 and 2005, at about 30%. Authors published most papers using their own datasets within two years of their first publication on the dataset, whereas data reuse papers published by third-party investigators continued to accumulate for at least six years. To study patterns of data reuse directly, we compiled 9,724 instances of third party data reuse via mention of GEO or ArrayExpress accession numbers in the full text of papers. The level of third-party data use was high: for 100 datasets deposited in year 0, we estimated that 40 papers in PubMed reused a dataset by year 2, 100 by year 4, and more than 150 data reuse papers had been published by year 5. Data reuse was distributed across a broad base of datasets: a very conservative estimate found that 20% of the datasets deposited between 2003 and 2007 had been reused at least once by third parties.
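In log-linear count regressions like the one above, a coefficient b on the "data available" indicator translates into a percentage citation benefit via exp(b) − 1, with the confidence-interval endpoints transformed the same way. A sketch with a made-up coefficient chosen only to land near a ~9% benefit, not the paper's actual estimates:

```python
import math

def percent_benefit(b, se, z=1.96):
    """Convert a log-scale regression coefficient and its standard error
    into a percentage citation benefit with a normal-approximation CI."""
    point = (math.exp(b) - 1) * 100
    lo = (math.exp(b - z * se) - 1) * 100
    hi = (math.exp(b + z * se) - 1) * 100
    return point, lo, hi

# Made-up coefficient and standard error, for illustration only.
print(percent_benefit(0.086, 0.02))
```

Because the transformation is nonlinear, the resulting interval is asymmetric around the point estimate, which is consistent with the slightly lopsided 5%–13% interval around the 9% figure reported above.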

Conclusion: After accounting for other factors affecting citation rate, we find a robust citation benefit from open data, although a smaller one than previously reported. We conclude there is a direct effect of third-party data reuse that persists for years beyond the time when researchers have published most of the papers reusing their own data. Other factors that may also contribute to the citation benefit are considered. We further conclude that, at least for gene expression microarray data, a substantial fraction of archived datasets are reused, and that the intensity of dataset reuse has been steadily increasing since 2003.

URL : https://peerj.com/articles/175/