We present evidence for a citation advantage within astrophysics for papers that link to data. Using simple measures based on publication data from the NASA Astrophysics Data System, we find that papers with links to data receive on average significantly more citations per paper than papers without links to data. Furthermore, using the INSPEC and Web of Science databases, we investigate whether papers of an experimental or a theoretical nature display different citation behavior.
« Peer review is an important element of scientific communication but deserves quantitative examination. We used data from the manuscript handling service Manuscript Central for ten mid-tier ecology and evolution journals to test whether the number of external reviews completed improved citation rates for accepted manuscripts. Contrary to a previous study that examined this issue using resubmission data as a proxy for reviews, we show that citation rates of manuscripts do not correlate with the number of individuals who provided reviews. Importantly, externally reviewed papers do not outperform papers reviewed only by editors in terms of visibility within a 5-year citation window. These findings suggest that in many instances editors may be all that is needed to review papers (or at least to conduct the critical first review assessing general suitability) if the primary purpose of peer review is to filter, and that journals can consider reducing the number of referees involved in reviewing ecology and evolution papers. »
Data reuse and the open data citation advantage :
« Background: Attribution to the original contributor upon reuse of published data is important both as a reward for data creators and to document the provenance of research findings. Previous studies have found that papers with publicly available datasets receive a higher number of citations than similar studies without available data. However, few previous analyses have had the statistical power to control for the many variables known to predict citation rate, which has led to uncertain estimates of the “citation benefit”. Furthermore, little is known about patterns in data reuse over time and across datasets.
Method and Results: Here, we look at citation rates while controlling for many known citation predictors and investigate the variability of data reuse. In a multivariate regression on 10,555 studies that created gene expression microarray data, we found that studies that made data available in a public repository received 9% (95% confidence interval: 5% to 13%) more citations than similar studies for which the data was not made available. Date of publication, journal impact factor, open access status, number of authors, first and last author publication history, corresponding author country, institution citation history, and study topic were included as covariates. The citation benefit varied with date of dataset deposition: a citation benefit was most clear for papers published in 2004 and 2005, at about 30%. Authors published most papers using their own datasets within two years of their first publication on the dataset, whereas data reuse papers published by third-party investigators continued to accumulate for at least six years. To study patterns of data reuse directly, we compiled 9,724 instances of third party data reuse via mention of GEO or ArrayExpress accession numbers in the full text of papers. The level of third-party data use was high: for 100 datasets deposited in year 0, we estimated that 40 papers in PubMed reused a dataset by year 2, 100 by year 4, and more than 150 data reuse papers had been published by year 5. Data reuse was distributed across a broad base of datasets: a very conservative estimate found that 20% of the datasets deposited between 2003 and 2007 had been reused at least once by third parties.
Conclusion: After accounting for other factors affecting citation rate, we find a robust citation benefit from open data, although a smaller one than previously reported. We conclude there is a direct effect of third-party data reuse that persists for years beyond the time when researchers have published most of the papers reusing their own data. Other factors that may also contribute to the citation benefit are considered. We further conclude that, at least for gene expression microarray data, a substantial fraction of archived datasets are reused, and that the intensity of dataset reuse has been steadily increasing since 2003. »
URL : https://peerj.com/articles/175/
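The study above estimates the open-data citation benefit with a multivariate regression that controls for known citation predictors. The idea can be sketched as follows; this is a minimal illustration on synthetic data with hypothetical numbers (a planted 9% benefit and a single stand-in quality covariate), not the authors' actual model or data.

```python
# Sketch: estimating a citation benefit for open data while controlling
# for a covariate, via OLS on log citation counts. All data synthetic.
import math
import random

random.seed(0)

# Simulate log-citations: baseline + quality effect + a true 9% boost
# (log 1.09) for papers that made their data available.
n = 2000
rows = []
for _ in range(n):
    open_data = random.random() < 0.5
    quality = random.gauss(0.0, 1.0)  # stand-in for impact factor etc.
    log_cites = (1.0 + 0.5 * quality
                 + (math.log(1.09) if open_data else 0.0)
                 + random.gauss(0.0, 0.3))
    rows.append((1.0, float(open_data), quality, log_cites))

# Ordinary least squares via the normal equations (intercept, open-data
# indicator, quality), solved by Gauss-Jordan elimination.
k = 3
XtX = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
Xty = [sum(r[i] * r[3] for r in rows) for i in range(k)]
for i in range(k):
    piv = XtX[i][i]
    XtX[i] = [v / piv for v in XtX[i]]
    Xty[i] /= piv
    for j in range(k):
        if j != i:
            f = XtX[j][i]
            XtX[j] = [a - f * b for a, b in zip(XtX[j], XtX[i])]
            Xty[j] -= f * Xty[i]

# The coefficient on the open-data indicator, back-transformed to a
# percentage citation benefit.
benefit = math.exp(Xty[1]) - 1.0
print(f"estimated citation benefit: {benefit:+.1%}")
```

The point of the covariates is visible here: quality also drives citations, so omitting it from the regression would let any quality/open-data correlation contaminate the estimated benefit.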
The Rich Get Richer and the Poor Get Poorer: The Effect of Open Access on Cites to Science Journals Across the Quality Spectrum :
« An open-access journal allows free online access to its articles, obtaining revenue from fees charged to submitting authors. Using panel data on science journals, we are able to circumvent some problems plaguing previous studies of the impact of open access on citations. We find that moving from paid to open access increases cites by 8% on average in our sample, but the effect varies across the quality of content. Open access increases cites to the best content (top-ranked journals or articles in upper quintiles of citations within a volume) but reduces cites to lower-quality content. We construct a model to explain these findings in which being placed on a broad open-access platform can increase the competition among articles for readers’ attention. We can find structural parameters allowing the model to fit the quintile results quite closely. »
URL : http://ssrn.com/abstract=2269040
Delayed Open Access – an overlooked high-impact category of openly available scientific literature :
« Delayed open access (OA) refers to scholarly articles in subscription journals made available openly on the web directly through the publisher at the expiry of a set embargo period. Though a substantial number of journals have practiced delayed OA since they started publishing e-versions, empirical studies concerning open access have often overlooked this body of literature. This study provides comprehensive quantitative measurements by identifying delayed OA journals and collecting data concerning their publication volumes, embargo lengths, and citation rates. Altogether 492 journals were identified, publishing a combined total of 111,312 articles in 2011. 77.8% of these articles were made open access within 12 months of publication, with 85.4% becoming available within 24 months. A journal impact factor analysis revealed that delayed OA journals have, on average, citation rates twice as high as those of closed subscription journals and three times as high as those of immediate OA journals. Overall, the results demonstrate that delayed OA journals constitute an important segment of the openly available scholarly journal literature, both in their sheer article volume and in including a substantial proportion of high-impact journals. »
URL : http://hanken.halvi.helsinki.fi/portal/en/publications/delayed-open-access%28a2eb7a79-1078-4657-9d57-4f9f5a1ff228%29.html
Does Online Availability Increase Citations? Theory and Evidence from a Panel of Economics and Business Journals :
« Does online availability boost citations? The answer has implications for issues ranging from the value of a citation to the sustainability of open-access journals. Using panel data on citations to economics and business journals, we show that the enormous effects found in previous studies were an artifact of their failure to control for article quality, disappearing once we add fixed effects as controls. The absence of an aggregate effect masks heterogeneity across platforms: JSTOR stands apart from the others, boosting citations by around 10%. We examine other sources of heterogeneity, including whether JSTOR increases cites from authors in developing countries more than from those in developed countries, and whether it increases cites to “long-tail” more than to “superstar” articles. Our theoretical analysis informs the econometric specification and allows us to translate our results for citation increases into welfare terms. »
URL : http://ssrn.com/abstract=1746243
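The key methodological point in the abstract above is that article fixed effects absorb unobserved quality, which would otherwise bias the online-availability estimate upward. A minimal sketch of the within (fixed-effects) transformation on a hypothetical synthetic panel, not the authors' data or specification:

```python
# Sketch: pooled OLS vs. fixed-effects (within) estimation on a synthetic
# panel where article quality drives both citations and early online
# availability. The true online effect is +0.10 log points.
import random
from collections import defaultdict

random.seed(1)

panel = []  # (article_id, online, log_cites)
for a in range(500):
    quality = random.gauss(0.0, 1.0)
    online_year = 3 if quality > 0 else 4  # selection on quality
    for t in range(6):
        online = 1.0 if t >= online_year else 0.0
        y = quality + 0.10 * online + random.gauss(0.0, 0.2)
        panel.append((a, online, y))

def ols_slope(pairs):
    """One-regressor OLS slope on (x, y) pairs."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    num = sum((x - mx) * (y - my) for x, y in pairs)
    den = sum((x - mx) ** 2 for x, _ in pairs)
    return num / den

# Naive pooled estimate: biased upward, because higher-quality articles
# are online for more of the sample period.
naive = ols_slope([(o, y) for _, o, y in panel])

# Within estimate: demeaning per article removes the time-invariant
# quality term before regressing, recovering the true effect.
by_article = defaultdict(list)
for a, o, y in panel:
    by_article[a].append((o, y))
demeaned = []
for obs in by_article.values():
    mo = sum(o for o, _ in obs) / len(obs)
    my = sum(y for _, y in obs) / len(obs)
    demeaned.extend((o - mo, y - my) for o, y in obs)
fe = ols_slope(demeaned)

print(f"pooled estimate: {naive:.3f}, fixed-effects estimate: {fe:.3f}")
```

On this simulated panel the pooled estimate is inflated well above 0.10 by the quality selection, while the within estimate lands near the true effect, which is exactly the "enormous effects disappear once we add fixed effects" pattern the abstract describes.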
Comparing journals from different fields of Science and Social Science through a JCR Subject Categories Normalized Impact Factor :
« The journal Impact Factor (IF) is not comparable among fields of Science and Social Science because of systematic differences in publication and citation behaviour across disciplines. In this work, a decomposition of the field aggregate impact factor into five normally distributed variables is presented. Considering these factors, a Principal Component Analysis is employed to find the sources of the variance in the JCR subject categories of Science and Social Science. Although publication and citation behaviour differs widely across disciplines, principal components explain more than 78% of the total variance, and the average number of references per paper is not the primary factor explaining the variance in impact factors across categories. A Categories Normalized Impact Factor (CNIF) based on the JCR subject category list is proposed and compared with the IF. This normalization is achieved by considering all the indexing categories of each journal. An empirical application, with one hundred journals in two or more subject categories of economics and business, shows that the gap between rankings is reduced by around 32% for the journals analyzed. This gap is obtained as the maximum distance among the ranking percentiles from all categories in which each journal is included. »
URL : http://arxiv.org/abs/1304.5107
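The "gap" measure described above can be made concrete with a small sketch: a journal indexed in several JCR categories gets one ranking percentile per category, and the gap is the maximum distance among those percentiles. The journals, impact factors, and category lists below are hypothetical, chosen only to illustrate the computation.

```python
# Sketch: ranking percentile of a journal within each category it is
# indexed in, and the gap between those percentiles. All values invented.
def percentile(journal_if, category_ifs):
    """Fraction of journals in the category with an IF at or below this one."""
    below = sum(1 for x in category_ifs if x <= journal_if)
    return below / len(category_ifs)

# Hypothetical impact factors for two overlapping subject categories.
economics = [0.4, 0.8, 1.1, 1.5, 2.0, 2.6, 3.1, 4.0]
business = [0.3, 0.5, 0.9, 1.2, 1.6, 2.1]

journal_if = 1.5  # a journal indexed in both categories
pcts = [percentile(journal_if, economics), percentile(journal_if, business)]
gap = max(pcts) - min(pcts)
print(f"percentiles: {pcts}, gap: {gap:.3f}")
```

The same IF of 1.5 lands at the 50th percentile of the stronger category but higher in the weaker one, so the journal's apparent rank depends on which category a reader consults; reducing this gap is what the CNIF normalization targets.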