Data reuse and the open data citation advantage :

“Background: Attribution to the original contributor upon reuse of published data is important both as a reward for data creators and to document the provenance of research findings. Previous studies have found that papers with publicly available datasets receive a higher number of citations than similar studies without available data. However, few previous analyses have had the statistical power to control for the many variables known to predict citation rate, which has led to uncertain estimates of the “citation benefit”. Furthermore, little is known about patterns in data reuse over time and across datasets.

Method and Results: Here, we look at citation rates while controlling for many known citation predictors and investigate the variability of data reuse. In a multivariate regression on 10,555 studies that created gene expression microarray data, we found that studies that made data available in a public repository received 9% (95% confidence interval: 5% to 13%) more citations than similar studies for which the data was not made available. Date of publication, journal impact factor, open access status, number of authors, first and last author publication history, corresponding author country, institution citation history, and study topic were included as covariates. The citation benefit varied with date of dataset deposition: a citation benefit was most clear for papers published in 2004 and 2005, at about 30%. Authors published most papers using their own datasets within two years of their first publication on the dataset, whereas data reuse papers published by third-party investigators continued to accumulate for at least six years. To study patterns of data reuse directly, we compiled 9,724 instances of third-party data reuse via mention of GEO or ArrayExpress accession numbers in the full text of papers. The level of third-party data use was high: for 100 datasets deposited in year 0, we estimated that 40 papers in PubMed reused a dataset by year 2, 100 by year 4, and more than 150 data reuse papers had been published by year 5. Data reuse was distributed across a broad base of datasets: a very conservative estimate found that 20% of the datasets deposited between 2003 and 2007 had been reused at least once by third parties.

Conclusion: After accounting for other factors affecting citation rate, we find a robust citation benefit from open data, although a smaller one than previously reported. We conclude there is a direct effect of third-party data reuse that persists for years beyond the time when researchers have published most of the papers reusing their own data. Other factors that may also contribute to the citation benefit are considered. We further conclude that, at least for gene expression microarray data, a substantial fraction of archived datasets are reused, and that the intensity of dataset reuse has been steadily increasing since 2003.”

URL : https://peerj.com/articles/175/
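
A note on reading the "9% more citations" figure above: percentage effects of this kind usually arise when a regression is fitted to log-transformed counts, where a coefficient beta corresponds to a change of exp(beta) - 1. The abstract does not state the model form, so the log scale is an assumption here, and the function names are illustrative only. A minimal sketch of the conversion in both directions:

```python
import math


def percent_change_from_log_coef(beta: float) -> float:
    """Percent change in the outcome implied by a coefficient on a log scale."""
    return (math.exp(beta) - 1.0) * 100.0


def log_coef_from_percent_change(pct: float) -> float:
    """Inverse mapping: log-scale coefficient implied by a percent change."""
    return math.log(1.0 + pct / 100.0)


# Point estimate and 95% CI bounds quoted in the abstract (in percent).
# The log-scale model itself is an ASSUMPTION, not stated in the abstract.
reported = {"low": 5.0, "estimate": 9.0, "high": 13.0}
coefs = {name: log_coef_from_percent_change(pct) for name, pct in reported.items()}

for name, beta in coefs.items():
    pct = percent_change_from_log_coef(beta)
    print(f"{name}: beta = {beta:.4f} -> {pct:.1f}% more citations")
```

For effects this small the coefficient and the percentage are numerically close (a coefficient near 0.09 for a 9% effect), and the two diverge as the effect grows.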

Open Access and the Changing Landscape of Research Impact Indicators: New Roles for Repositories :

“The debate about the need to revise metrics that evaluate research excellence has been ongoing for years, and a number of studies have identified important issues that have yet to be addressed. Internet and other technological developments have enabled the collection of richer data and new approaches to research assessment exercises. Open access strongly advocates for maximizing research impact by enhancing seamless accessibility. In addition, new tools and strategies have been used by open access journals and repositories to showcase how science can benefit from free online dissemination. Latest players in the debate include initiatives based on alt-metrics, which enrich the landscape with promising indicators. To start with, the article gives a brief overview of the debate and the role of open access in advancing a new frame to assess science. Next, the work focuses on the strategy that the Spanish National Research Council’s repository DIGITAL.CSIC is implementing to collect a rich set of statistics and other metrics that are useful for repository administrators, researchers and the institution alike. A preliminary analysis of data hints at correlations between free dissemination of research through DIGITAL.CSIC and enhanced impact, reusability and sharing of CSIC science on the web.”

URL : http://www.mdpi.com/2304-6775/1/2/56

Universality of scholarly impact metrics:

“We present a method to quantify the disciplinary bias of any scholarly impact metric, and introduce a simple universal metric that allows to compare the impact of scholars across scientific disciplines.”

URL : http://arxiv.org/abs/1305.6339

Delayed Open Access – an overlooked high-impact category of openly available scientific literature:

“Delayed open access (OA) refers to scholarly articles in subscription journals made available openly on the web directly through the publisher at the expiry of a set embargo period. Though a substantial number of journals have practiced delayed OA since they started publishing e-versions, empirical studies concerning open access have often overlooked this body of literature. This study provides comprehensive quantitative measurements by identifying delayed OA journals, collecting data concerning their publication volumes, embargo lengths, and citation rates. Altogether 492 journals were identified, publishing a combined total of 111,312 articles in 2011. 77.8% of these articles were made open access within 12 months from publication, with 85.4% becoming available within 24 months. A journal impact factor analysis revealed that delayed OA journals have on average twice as high average citation rates compared to closed subscription journals, and three times as high as immediate OA journals. Overall the results demonstrate that delayed OA journals constitute an important segment of the openly available scholarly journal literature, both by their sheer article volume as well as by including a substantial proportion of high impact journals.”

URL : http://hanken.halvi.helsinki.fi/portal/en/publications/delayed-open-access%28a2eb7a79-1078-4657-9d57-4f9f5a1ff228%29.html

Does Online Availability Increase Citations? Theory and Evidence from a Panel of Economics and Business Journals :

“Does online availability boost citations? The answer has implications for issues ranging from the value of a citation to the sustainability of open-access journals. Using panel data on citations to economics and business journals, we show that the enormous effects found in previous studies were an artifact of their failure to control for article quality, disappearing once we add fixed effects as controls. The absence of an aggregate effect masks heterogeneity across platforms: JSTOR stands apart from others, boosting citations around 10%. We examine other sources of heterogeneity including whether JSTOR increases cites from authors in developing more than developed countries and increases cites to “long-tail” more than “superstar” articles. Our theoretical analysis informs the econometric specification and allows us to translate our results for citation increases into welfare terms.”

URL : http://ssrn.com/abstract=1746243

On the impact of Gold Open Access journals :

“Gold Open Access (=Open Access publishing) is for many the preferred route to achieve unrestricted and immediate access to research output. However, true Gold Open Access journals are still outnumbered by traditional journals. Moreover availability of Gold OA journals differs from discipline to discipline and often leaves scientists concerned about the impact of these existent titles. This study identified the current set of Gold Open Access journals featuring a Journal Impact Factor (JIF) by means of Ulrichsweb, Directory of Open Access Journals and Journal Citation Reports (JCR). The results were analyzed regarding disciplines, countries, quartiles of the JIF distribution in JCR and publishers. Furthermore the temporal impact evolution was studied for a Top 50 titles list (according to JIF) by means of Journal Impact Factor, SJR and SNIP in the time interval 2000–2010. The identified top Gold Open Access journals proved to be well-established and their impact is generally increasing for all the analyzed indicators. The majority of JCR-indexed OA journals can be assigned to Life Sciences and Medicine. The success-rate for JCR inclusion differs from country to country and is often inversely proportional to the number of national OA journal titles. Compiling a list of JCR-indexed OA journals is a cumbersome task that can only be achieved with non-Thomson Reuters data sources. A corresponding automated feature to produce current lists ‘on the fly’ would be desirable in JCR in order to conveniently track the impact evolution of Gold OA journals.”

URL : https://uscholar.univie.ac.at/view/o:246061