Wikiometrics: A Wikipedia Based Ranking System

We present a new concept – Wikiometrics – the derivation of metrics and indicators from Wikipedia. Wikipedia provides an accurate representation of the real world due to its size, structure, editing policy and popularity. We demonstrate an innovative mining methodology, where different elements of Wikipedia – content, structure, editorial actions and reader reviews – are used to rank items in a manner which is by no means inferior to rankings produced by experts or other methods. We test our proposed method by applying it to two real-world ranking problems: top world universities and academic journals. Our proposed ranking methods were compared to leading and widely accepted benchmarks, and were found to be highly correlated with them, with the added advantage that the underlying data are publicly available.
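To make the comparison against benchmarks concrete, here is a minimal sketch of how agreement between two rankings can be measured. The abstract does not say which correlation measure the authors used; Spearman's rank correlation is assumed here, and the item names and rank values are hypothetical.

```python
def spearman_rho(rank_a, rank_b):
    """Spearman rank correlation for two tie-free rankings of the same items.

    rank_a and rank_b map each item to its rank (1 = best).
    Uses the closed form 1 - 6 * sum(d^2) / (n * (n^2 - 1)),
    which is only valid when neither ranking contains ties.
    """
    if set(rank_a) != set(rank_b):
        raise ValueError("both rankings must cover the same items")
    n = len(rank_a)
    if n < 2:
        raise ValueError("need at least two items")
    d_squared = sum((rank_a[item] - rank_b[item]) ** 2 for item in rank_a)
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Hypothetical ranks: a Wikipedia-derived ranking and an expert benchmark.
wiki_rank = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5}
benchmark_rank = {"A": 2, "B": 1, "C": 3, "D": 5, "E": 4}

rho = spearman_rho(wiki_rank, benchmark_rank)
print(f"Spearman rho = {rho:.2f}")
```

A value near 1 means the two rankings order the items almost identically; a value near 0 means they are unrelated.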

URL : http://arxiv.org/abs/1601.01058

Reproducible Research Practices and Transparency across the Biomedical Literature

There is a growing movement to encourage reproducibility and transparency practices in the scientific community, including public access to raw data and protocols, the conduct of replication studies, systematic integration of evidence in systematic reviews, and the documentation of funding and potential conflicts of interest.

In this survey, we assessed the current status of reproducibility and transparency addressing these indicators in a random sample of 441 biomedical journal articles published in 2000–2014. Only one study provided a full protocol and none made all raw data directly available. Replication studies were rare (n = 4), and only 16 studies had their data included in a subsequent systematic review or meta-analysis. The majority of studies did not mention anything about funding or conflicts of interest.

The percentage of articles with no statement of conflict decreased substantially between 2000 and 2014 (94.4% in 2000 to 34.6% in 2014); the percentage of articles reporting statements of conflicts (0% in 2000, 15.4% in 2014) or no conflicts (5.6% in 2000, 50.0% in 2014) increased.

Articles published in journals in the clinical medicine category versus other fields were almost twice as likely to not include any information on funding and to have private funding. This study provides baseline data to compare future progress in improving these indicators in the scientific literature.

URL : Reproducible Research Practices and Transparency across the Biomedical Literature

DOI : 10.1371/journal.pbio.1002333

The Rise and Decline of an Open Collaboration System: How Wikipedia’s reaction to popularity is causing its decline

Open collaboration systems like Wikipedia need to maintain a pool of volunteer contributors in order to remain relevant. Wikipedia was created through a tremendous number of contributions by millions of contributors. However, recent research has shown that the number of active contributors in Wikipedia has been declining steadily for years, and suggests that a sharp decline in the retention of newcomers is the cause.

This paper presents data that show that several changes the Wikipedia community made to manage quality and consistency in the face of a massive growth in participation have ironically crippled the very growth they were designed to manage. Specifically, the restrictiveness of the encyclopedia’s primary quality control mechanism and the algorithmic tools used to reject contributions are implicated as key causes of decreased newcomer retention.

Further, the community’s formal mechanisms for norm articulation are shown to have calcified against changes – especially changes proposed by newer editors.

URL : https://www-users.cs.umn.edu/~halfak/publications/The_Rise_and_Decline/halfaker13rise-preprint.pdf

A Two-Step Model for Assessing Relative Interest in E-books Compared to Print

Librarians often wish to know whether readers in a particular discipline favor e-books or print books. Because print circulation and e-book usage statistics are not directly comparable, it can be hard to determine the relative interest of readers in the two types of books. This study demonstrates a two-step method by which librarians can assess the appeal of books in various formats.

First, a nominal assessment of use or nonuse is performed; this eliminates the difficulty of comparing print circulation to e-book usage statistics.

Then, actual use is compared to expected use by calculating the Percentage of Expected Use (PEU). By examining the distance between the PEU of e-books and the PEU of print books in a discipline, librarians can determine whether patrons have a strong preference for one format over another.
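The two steps can be sketched as follows. This is an illustration under stated assumptions, not the study's own procedure: PEU is taken in its conventional form (a discipline's share of used titles divided by its share of held titles, times 100), and all function names and counts are hypothetical.

```python
def count_used(uses_per_title):
    """Step 1 (nominal assessment): count titles with at least one use.

    Reducing each title to used / not used removes the need to compare
    print circulation counts with e-book usage counts directly.
    """
    return sum(1 for uses in uses_per_title if uses > 0)


def peu(used_in_subject, used_total, held_in_subject, held_total):
    """Step 2: Percentage of Expected Use for one discipline and one format.

    Assumed definition: 100 * (share of used titles) / (share of held titles).
    100 means the discipline is used exactly in proportion to its holdings.
    """
    if used_total == 0 or held_in_subject == 0:
        raise ValueError("used_total and held_in_subject must be non-zero")
    return 100 * used_in_subject * held_total / (used_total * held_in_subject)


# Step 1 on hypothetical per-title use counts for a small print shelf.
print_uses = [0, 3, 1, 0, 0, 7]
print("Print titles used:", count_used(print_uses))

# Step 2 on hypothetical collection-wide counts for one discipline.
print_peu = peu(used_in_subject=300, used_total=2000,
                held_in_subject=1000, held_total=10000)
ebook_peu = peu(used_in_subject=60, used_total=1000,
                held_in_subject=500, held_total=5000)

# A large gap suggests a format preference in this discipline.
print(f"Print PEU: {print_peu:.0f}, e-book PEU: {ebook_peu:.0f}, "
      f"gap: {print_peu - ebook_peu:.0f}")
```

With these made-up counts, print titles are used more than their share of the collection would predict and e-books less, which would point to a print preference in that discipline.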

URL : http://m.crl.acrl.org/content/77/1/20

Leading by Example? ALA Division Publications, Open Access, and Sustainability

This investigation explores scholarly communication business models in American Library Association (ALA) division peer-reviewed academic journals. Previous studies reveal the numerous issues organizations and publishers face in the academic publishing environment. Through an analysis of documented procedures, policies, and finances of five ALA division journals, we compare business and access models.

We conclude that some ALA divisions prioritize the costs associated with changing business models, including hard-to-estimate costs such as the labor of volunteers. For other divisions, the financial aspects are less important than maintaining core values, such as those defined in ALA’s Core Values in Librarianship.

URL : http://m.crl.acrl.org/content/early/2015/12/14/crl15-841.abstract

Enabling Open Science: Wikidata for Research

Wiki4R will create an innovative virtual research environment (VRE) for Open Science at scale, engaging both professional researchers and citizen data scientists in new and potentially transformative forms of collaboration. It is based on the realizations that (1) the structured parts of the Web itself can be regarded as a VRE, (2) such environments depend on communities, (3) closed environments are limited in their capacity to nurture thriving communities.

Wiki4R will therefore integrate Wikidata, the multilingual semantic backbone behind Wikipedia, into existing research processes to enable transdisciplinary research and reduce fragmentation of research in and outside Europe. By establishing a central shared information node, research data can be linked and annotated into knowledge. Despite occasional uses of Wikipedia or Wikidata in research, significant barriers to broader adoption in the sciences or digital humanities exist, including lack of integration into existing research processes and inadequate handling of provenances.

The proposed actions include providing best practices and tools for semantic mapping, adoption of citation and author identifiers, interoperability layers for integration with existing research environments, and the development of policies for information quality and interchange. The effectiveness of the actions will be tested in pilot use cases.

Unforeseen barriers will be investigated and documented. We will promote the adoption of Wiki4R by making it easy to use and integrate, demonstrate the applicability in selected research domains, and provide diverse training opportunities.

Wiki4R leverages the expertise gained in Europe through the Wikidata and DBpedia projects to further strengthen the established virtual community of 14000 people. As a result of increased interaction between professional science and citizens, it will provide an improved basis for Responsible Research and Innovation and Open Science in the European Research Area.

URL : Enabling Open Science: Wikidata for Research

Alternative location : http://rio.pensoft.net/articles.php?id=7573

Assessing Research Data Management Practices of Faculty at Carnegie Mellon University

INTRODUCTION

Recent changes to requirements for research data management by federal granting agencies and by other funding institutions have resulted in the emergence of institutional support for these requirements. At CMU, we sought to formalize assessment of research data management practices of researchers at the institution by launching a faculty survey and conducting a number of interviews with researchers.

METHODS

We submitted a survey on research data management practices to a sample of faculty including questions about data production, documentation, management, and sharing practices. The survey was coupled with in-depth interviews with a subset of faculty. We also make estimates of the amount of research data produced by faculty.

RESULTS

Survey and interview results suggest a moderate level of awareness of the regulatory environment around research data management. Results also present a clear picture of the types and quantities of data being produced at CMU and how these differ among research domains. Researchers identified a number of services that they would find valuable, including assistance with data management planning and backup/storage services. We attempt to estimate the amount of data produced and shared by researchers at CMU.

DISCUSSION

Results suggest that researchers may need and are amenable to assistance with research data management. Our estimates of the amount of data produced and shared have implications for decisions about data storage and preservation.

CONCLUSION

Our survey and interview results have offered significant guidance for building a suite of services for our institution.

URL : Assessing Research Data Management Practices of Faculty at Carnegie Mellon University

DOI : http://doi.org/10.7710/2162-3309.1258