Wikipedia as a gateway to biomedical research: The relative distribution and use of citations in the English Wikipedia

Authors : Lauren A. Maggio, John M. Willinsky, Ryan M. Steinberg, Daniel Mietchen, Joseph L. Wass, Ting Dong

Wikipedia is a gateway to knowledge. However, the extent to which this gateway ends at Wikipedia or continues via supporting citations is unknown. Wikipedia’s gateway functionality has implications for information design and education, notably in medicine.

This study aims to establish benchmarks for the relative distribution and referral (click) rate of citations, as indicated by presence of a Digital Object Identifier (DOI), from Wikipedia, with a focus on medical citations.

DOIs referred from the English Wikipedia in August 2016 were obtained from Crossref.org. Next, based on a DOI presence on a WikiProject Medicine page, all DOIs in Wikipedia were categorized as medical (WP:MED) or non-medical (non-WP:MED).

Using this categorization, referred DOIs were classified as WP:MED, non-WP:MED, or BOTH, meaning the DOI may have been referred from either category. Data were analyzed using descriptive and inferential statistics.

Out of 5.2 million Wikipedia pages, 4.42% (n=229,857) included at least one DOI. A total of 68,870 pages were identified as WP:MED, of which 22.14% (n=15,250) featured one or more DOIs. WP:MED pages featured on average 8.88 DOI citations per page, whereas non-WP:MED pages had on average 4.28.
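As a quick sanity check on the figures above, the reported percentages can be recomputed from the page counts. This is only an illustrative sketch: the variable and function names are assumptions introduced here, and the total of 5.2 million pages is a rounded value taken from the abstract, so the first percentage is approximate.

```python
# Counts as reported in the abstract; names are illustrative, not from the paper.
total_pages = 5_200_000         # "5.2 million" is a rounded figure
pages_with_doi = 229_857        # pages with at least one DOI
wpmed_pages = 68_870            # pages identified as WP:MED
wpmed_pages_with_doi = 15_250   # WP:MED pages with one or more DOIs


def share(part, whole):
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100 * part / whole, 2)


overall_share = share(pages_with_doi, total_pages)
wpmed_share = share(wpmed_pages_with_doi, wpmed_pages)

print(f"All pages with a DOI: {overall_share}%")
print(f"WP:MED pages with a DOI: {wpmed_share}%")
```

Both values agree with the percentages quoted in the abstract when rounded to two decimals.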

For DOIs appearing only on WP:MED pages, a DOI was referred once every 2,283 pageviews; for DOIs appearing only on non-WP:MED pages, once every 2,467 pageviews. DOIs appearing on both types of page accounted for 12% (n=58,475) of referrals, which made it impossible to determine a referral rate for that group.

While these results cannot provide evidence of greater citation referral from WP:MED than non-WP:MED, they do provide benchmarks to assess strategies for changing referral patterns.

These changes might include editors adopting new methods for designing and presenting citations or the introduction of teaching strategies that address the value of consulting citations as a tool for extending learning.

URL : Wikipedia as a gateway to biomedical research: The relative distribution and use of citations in the English Wikipedia

DOI : https://doi.org/10.1101/165159

How to share data for collaboration

Authors : Shannon E Ellis, Jeffrey T Leek

Within the statistics community, a number of guiding principles for sharing data have emerged; however, these principles are not always made clear to collaborators generating the data. To bridge this divide, we have established a set of guidelines for sharing data.

In these, we highlight the need to provide raw data to the statistician, the importance of consistent formatting, and the necessity of giving the statistician all essential experimental information and any pre-processing steps already carried out. With these guidelines we hope to avoid errors and delays in data analysis.

URL : How to share data for collaboration

DOI : https://doi.org/10.7287/peerj.preprints.3139v1

Identifiers for the 21st century: How to design, provision, and reuse persistent identifiers to maximize utility and impact of life science data

Authors : Julie A. McMurry, Nick Juty, Niklas Blomberg, Tony Burdett, Tom Conlin, Nathalie Conte, Mélanie Courtot, John Deck, Michel Dumontier, Donal K. Fellows, Alejandra Gonzalez-Beltran, Philipp Gormanns, Jeffrey Grethe, Janna Hastings, Jean-Karim Hériché, Henning Hermjakob, Jon C. Ison, Rafael C. Jimenez, Simon Jupp, John Kunze, Camille Laibe, Nicolas Le Novère, James Malone, Maria Jesus Martin, Johanna R. McEntyre, Chris Morris, Juha Muilu, Wolfgang Müller, Philippe Rocca-Serra, Susanna-Assunta Sansone, Murat Sariyar, Jacky L. Snoep, Stian Soiland-Reyes, Natalie J. Stanford, Neil Swainston, Nicole Washington, Alan R. Williams, Sarala M. Wimalaratne, Lilly M. Winfree, Katherine Wolstencroft, Carole Goble, Christopher J. Mungall, Melissa A. Haendel, Helen Parkinson

In many disciplines, data are highly decentralized across thousands of online databases (repositories, registries, and knowledgebases). Wringing value from such databases depends on the discipline of data science and on the humble bricks and mortar that make integration possible; identifiers are a core component of this integration infrastructure.

Drawing on our experience and on work by other groups, we outline 10 lessons we have learned about the identifier qualities and best practices that facilitate large-scale data integration. Specifically, we propose actions that identifier practitioners (database providers) should take in the design, provision and reuse of identifiers.

We also outline the important considerations for those referencing identifiers in various circumstances, including by authors and data generators. While the importance and relevance of each lesson will vary by context, there is a need for increased awareness about how to avoid and manage common identifier problems, especially those related to persistence and web-accessibility/resolvability.

We focus strongly on web-based identifiers in the life sciences; however, the principles are broadly relevant to other disciplines.

URL : Identifiers for the 21st century: How to design, provision, and reuse persistent identifiers to maximize utility and impact of life science data

DOI : https://doi.org/10.1371/journal.pbio.2001414

The State of OA: A large-scale analysis of the prevalence and impact of Open Access articles

Authors : Heather Piwowar, Jason Priem, Vincent Larivière, Juan Pablo Alperin, Lisa Matthias, Bree Norlander, Ashley Farley, Jevin West, Stefanie Haustein

Despite growing interest in Open Access (OA) to scholarly literature, there is an unmet need for large-scale, up-to-date, and reproducible studies assessing the prevalence and characteristics of OA. We address this need using oaDOI, an open online service that determines OA status for 67 million articles.

We use three samples, each of 100,000 articles, to investigate OA in three populations: 1) all journal articles assigned a Crossref DOI, 2) recent journal articles indexed in Web of Science, and 3) articles viewed by users of Unpaywall, an open-source browser extension that lets users find OA articles using oaDOI.

We estimate that at least 28% of the scholarly literature is OA (19M in total) and that this proportion is growing, driven particularly by growth in Gold and Hybrid. The most recent year analyzed (2015) also has the highest percentage of OA (45%). Because of this growth, and the fact that readers disproportionately access newer articles, we find that Unpaywall users encounter OA quite frequently: 47% of articles they view are OA. Notably, the most common mechanism for OA is not Gold, Green, or Hybrid OA, but rather an under-discussed category we dub Bronze: articles made free-to-read on the publisher website, without an explicit Open license.
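The headline estimate can be roughly reproduced with simple arithmetic. The sketch below assumes that the 28% share applies to the 67 million articles covered by oaDOI; the abstract does not state this explicitly, so treat the base as an assumption. The variable names are illustrative.

```python
# Assumption: the 28% lower bound is applied to the 67 million articles
# mentioned earlier in the abstract. Names are illustrative only.
covered_articles = 67_000_000
oa_share_percent = 28  # "at least 28%"

# Integer arithmetic avoids floating-point noise in the product.
estimated_oa_articles = covered_articles * oa_share_percent // 100
estimated_oa_millions = round(estimated_oa_articles / 1_000_000)

print(f"Estimated OA articles: {estimated_oa_articles:,} (about {estimated_oa_millions}M)")
```

Under that assumption the result is about 18.8 million, which rounds to the 19M quoted above.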

We also examine the citation impact of OA articles, corroborating the so-called open-access citation advantage: accounting for age and discipline, OA articles receive 18% more citations than average, an effect driven primarily by Green and Hybrid OA. We encourage further research using the free oaDOI service, as a way to inform OA policy and practice.

URL : The State of OA: A large-scale analysis of the prevalence and impact of Open Access articles

DOI : https://doi.org/10.7287/peerj.preprints.3119v1

Improving the Measurement of Scientific Success by Reporting a Self-Citation Index

Authors : Justin W. Flatt, Alessandro Blasimme, Effy Vayena

Who among the many researchers is most likely to usher in a new era of scientific breakthroughs? This question is of critical importance to universities, funding agencies, as well as scientists who must compete under great pressure for limited amounts of research money.

Citations are the current primary means of evaluating one’s scientific productivity and impact, and while often helpful, there is growing concern over the use of excessive self-citations to help build sustainable careers in science.

Incorporating superfluous self-citations in one’s writings requires little effort, receives virtually no penalty, and can boost, albeit artificially, scholarly impact and visibility, which are both necessary for moving up the academic ladder.

Such behavior is likely to increase, given the recent explosive rise in popularity of web-based citation analysis tools (Web of Science, Google Scholar, Scopus, and Altmetric) that rank research performance.

Here, we argue for new metrics centered on transparency to help curb this form of self-promotion that, if left unchecked, can have a negative impact on the scientific workforce, the way that we publish new knowledge, and ultimately the course of scientific advance.

URL : Improving the Measurement of Scientific Success by Reporting a Self-Citation Index

DOI : http://www.mdpi.com/2304-6775/5/3/20

The Surge in New University Presses and Academic-Led Publishing: An Overview of a Changing Publishing Ecology in the UK

Authors : Janneke Adema, Graham Stone

This article outlines the rise and development of New University Presses and Academic-Led Presses in the UK or publishing for the UK market. Based on the Jisc research project, Changing publishing ecologies: a landscape study of new university presses and academic-led publishing, commonalities between these two types of presses are identified to better assess their future needs and requirements.

Based on this analysis, the article argues for the development of a publishing toolkit, for further research into the creation of a typology of presses and publishing initiatives, and for support with community building to help these initiatives grow and develop further, whilst promoting a more diverse publishing ecology.

URL : The Surge in New University Presses and Academic-Led Publishing: An Overview of a Changing Publishing Ecology in the UK

DOI : http://doi.org/10.18352/lq.10210

Practicing What You Preach: Evaluating Access of Open Access Research

Author : Teresa Schultz

The open access movement seeks to encourage all researchers to make their works openly available and free of paywalls so more people can access their knowledge. Yet some researchers who study open access (OA) continue to publish their work in paywalled journals and fail to make it open.

This project set out to study just how many published research articles about OA fall into this category, how many are being made open (whether by being published in a gold OA or hybrid journal or through open deposit), and how library and information science authors compare to other disciplines researching this field.

Because of the growth of tools available to help researchers find open versions of articles, this study also sought to assess how these new tools compare to Google Scholar in their ability to disseminate OA research.

From a sample collected from Web of Science of articles published since 2010, the study found that although a majority of research articles about OA are open in some form, a little more than a quarter are not.

A smaller share of library science researchers made their work open compared to non-library science researchers. Looking at the copyright of articles published in hybrid and open journals, authors were more likely to retain copyright ownership if they published in an open journal than if they published in a hybrid journal.

Articles were more likely to be published with a Creative Commons license if published in an open journal compared to those published in hybrid journals.

URL : Practicing What You Preach: Evaluating Access of Open Access Research

DOI : https://dx.doi.org/10.17605/OSF.IO/YBDR8