The coverage of Microsoft Academic: Analyzing the publication output of a university

Authors : Sven E. Hug, Martin P. Brändle

This is the first in-depth study on the coverage of Microsoft Academic (MA). The coverage of a verified publication list of a university was analyzed on the level of individual publications in MA, Scopus, and Web of Science (WoS).

Citation counts were analyzed and issues related to data retrieval and data quality were examined. A Perl script was written to retrieve metadata from MA. We find that MA covers journal articles, working papers, and conference items to a substantial extent. MA surpasses Scopus and WoS clearly with respect to book-related document types and conference items but falls slightly behind Scopus with regard to journal articles.

MA shows the same biases as Scopus and WoS with regard to the coverage of the social sciences and humanities, non-English publications, and open-access publications. Rank correlations of citation counts are high between MA and the benchmark databases.

We find that the publication year is correct for 89.5% of all publications and the number of authors for 95.1% of the journal articles. Given the fast and ongoing development of MA, we conclude that MA is on the verge of becoming a bibliometric superpower. However, comprehensive studies on the quality of MA data are still lacking.
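The rank correlations of citation counts mentioned above can be illustrated with a minimal sketch. It assumes Spearman's rank correlation, which the abstract does not name, and the citation counts are invented for illustration; they are not data from the study.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rank correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5


# Hypothetical citation counts for the same five publications in two databases.
ma_counts = [10, 3, 25, 0, 7]
scopus_counts = [8, 2, 30, 1, 5]
print(spearman(ma_counts, scopus_counts))
```

Because the measure depends only on the ordering of publications, two databases can report different absolute counts and still correlate highly.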


Library and Information Sciences : Trends and Research

« This book explores the development, trends and research of library and information sciences (LIS) in the digital age. Inside, readers will find research and case studies written by LIS experts, educators and theorists, most of whom have visited China, delivered presentations there and drafted their articles based on feedback they received. As a result, readers will discover the LIS issues and concerns that China and the international community have in common.

The book first introduces the opportunities and challenges faced by the library and information literacy profession and discusses the key role of librarians in the future of information literacy education. Next, it covers trends in LIS education by examining the vision of the iSchool movement and detailing its practice at Syracuse University.

The book then covers issues in information seeking and retrieval by showing how visual data mining technology can be used to detect relationships and patterns among terms in the Q&A section of a social media site. It also includes a case study on tracing information seeking behavior and usage on a multimedia website.

Next, the book stresses the importance of building an academic accreditation framework for scientific datasets, explores the relationship between bibliometrics and university rankings, and details the birth and development of East Asian Libraries in North America.

Overall, the book offers readers insight into the changing nature of LIS, including the electronic dissemination of information, the impact of the Internet on libraries, the changing responsibilities of library professionals, the new paradigm for evaluating information, and characteristics and functions of today’s library personnel. »


Dark Research: information content in many modern research papers is not easily discoverable online

« Background: Research is published in indexed, online scholarly journals so that published knowledge can be easily found and built upon by others. Most scholars rely on relatively few online indexing service providers to search for relevant scholarly content. It is under-appreciated that the quality of indexing can vary across different journals and that this can have an adverse effect on the quality of research.

Objective: In this short paper I compare the recall of commonly used online indexers (Google Scholar, Web of Knowledge, Scopus, Microsoft Academic Search and Mendeley Search) against a selection of over 20,000 papers published in two different high-volume journals: PLOS ONE and Zootaxa.

Results: When using Google Scholar, content in Zootaxa has low recall for search terms that are known to occur in it, significantly lower than the near-perfect recall of the same terms in PLOS ONE. All other indexers tend to have lower recall than Google Scholar except Scopus which outperformed Google Scholar for recall on Zootaxa searches. I also elaborate why Dark Research is undesirable for optimal scientific progress with some recommendations for change.

Conclusion: This research is a basic proof-of-concept which demonstrates that when searching for published scholarly content, relevant studies can remain hidden as ’Dark Research’ in poorly-indexed journals, even despite expertise-informed efforts to find the content. The technological capability to do full text indexing on all modern scholarly journal content certainly exists; it is perhaps just publisher-imposed access restrictions on content that prevent this from happening. »

URL : Dark Research: information content in many modern research papers is not easily discoverable online
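The recall comparison described in the Dark Research abstract can be sketched as follows. This is a minimal illustration, not the author's method: the paper identifiers and search results are hypothetical, and `recall` is a helper name introduced here.

```python
from typing import Set


def recall(retrieved: Set[str], relevant: Set[str]) -> float:
    """Fraction of the known-relevant papers that a search actually returned."""
    if not relevant:
        raise ValueError("recall is undefined when no paper is known to be relevant")
    return len(retrieved & relevant) / len(relevant)


# Hypothetical identifiers of papers known to contain a given search term.
known_to_contain_term = {"paper-1", "paper-2", "paper-3", "paper-4"}

# Hypothetical results returned by one indexer for that term.
indexer_results = {"paper-1", "paper-3", "paper-9"}

print(recall(indexer_results, known_to_contain_term))
```

Results that are returned but not in the known-relevant set (here `paper-9`) lower precision, not recall, so they do not affect this figure.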


Doctoral Students in New Zealand Have Low Awareness of Institutional Repository Existence, but Positive Attitudes Toward Open Access Publication of Their Work

« Objective : To investigate doctoral students’ knowledge of and attitudes toward open access models of scholarly communication and institutional repositories, and to examine their willingness to comply with a mandatory institutional repository (IR) submission policy.

Design : Mixed method, sequential exploratory design.

Setting : A large, multi-campus New Zealand university that mandates IR deposit of doctoral theses.

Subjects : Two doctoral students from each of four university colleges were interviewed. All 901 doctoral students were subsequently sent a survey, with 251 responding.

Methods : Semi-structured interviews with eight subjects selected by purposive sampling, followed by a survey sent to all doctoral students. The authors used NVivo 8 to analyze the interview data, with a two-phase approach to coding. First, they analyzed transcripts from the semi-structured interviews line by line to identify themes. In the second phase, they employed focused coding to analyze the most common themes and to merge or drop peripheral themes. Themes were mapped against the constructs of Rogers’ diffusion of innovations theory and social exchange theory to aid interpretation. The results were used to develop a survey with a fixed set of response choices. The authors then analyzed the survey results using Excel and SurveyMonkey, first as a single data set and then by discipline.

Main Results : The authors found that general awareness of open access was high (62%), and overall support for open access publication was 86.3%. Awareness of IRs as a general concept was much lower at 48%. Those subject to a mandatory IR deposit policy for doctoral theses overwhelmingly indicated willingness to comply (92.6%), as did those matriculating prior to the policy (83.3%), although only 77.3% of all respondents agreed that deposit should be mandatory. Only 17.6% of respondents had deposited their own work in an IR, while 31.7% reported directly accessing a repository for research. The greatest perceived benefits of IR participation were removal of cost for readers, ease of sharing research, increased exposure and citing of one’s work, and professional networking. The greatest perceived risks were plagiarism, loss of ability to publish elsewhere, and less prestige relative to traditional publication. The most frequently given reason for selecting a specific publication outlet was the recommendation of a doctoral supervisor. Disciplinary differences in responses were not sizable. For additional interpretation, the authors applied Rogers’ diffusion of innovations theory to determine the extent to which IRs are effective innovations. The authors posit that repositories will become a more widely adopted innovation as awareness of IRs in general increases, and as awareness grows that IR content is discoverable through major search engines such as Google Scholar, thus improving usability and increasing dissemination of research. Using the social exchange theory framework, the authors found that respondents’ expressed willingness to deposit their work in IRs demonstrated altruistic motives for sharing their research freely with others, appreciation for the reciprocity of gaining access to others’ research, and awareness of the potential direct reward of having their work cited more often.

Conclusion : The authors identified lack of awareness, rather than resistance to deposit, as the main barrier to IR participation. Major perceived benefits of participating included the public good of knowledge sharing and increased exposure for one’s work. Concerns included copyright and plagiarism issues. These findings have implications for communication and marketing campaigns to promote doctoral students’ deposit of their work in institutional repositories. While respondents reported low direct use of IRs for conducting research, the vast majority reported using Google Scholar, and so may have unknowingly accessed open access repository content. This finding suggests that attention be given to enhanced metadata for optimizing discoverability of IR content through general search engines. »


Evaluation of a Web Portal for Improving Public Access to Evidence-Based Health Information and Health Literacy Skills: A Pragmatic Trial

« Background : Using the conceptual framework of shared decision-making and evidence-based practice, a web portal was developed to serve as a generic (non disease-specific) tailored intervention to improve the lay public’s health literacy skills.

Objective : To evaluate the effects of the web portal compared to no intervention in a real-life setting.

Methods: A pragmatic randomised controlled parallel trial using simple randomisation of 96 parents who had children aged

Results : Use of the web portal was found to improve attitudes towards searching for health information. This variable was identified as the most important predictor of intention to search in both samples. Participants considered the web portal to have good usability, usefulness, and credibility. The intervention group showed slight increases in the use of evidence-based information, critical appraisal skills, and participation compared to the group receiving no intervention, but these differences were not statistically significant.

Conclusion : Despite the fact that the study was underpowered, we found that the web portal may have a positive effect on attitudes towards searching for health information. Furthermore, participants considered the web portal to be a relevant tool. It is important to continue experimenting with web-based resources in order to increase user participation in health care decision-making. »


Usage and Impact of Controlled Vocabularies in a Subject Repository for Indexing and Retrieval

« Since 2009, the German National Library for Economics (ZBW) has supported both indexing and retrieval of Open Access scientific publications such as working papers, postprint articles and conference papers by means of a terminology web service. This web service is based on concepts organized as a ‘Standard Thesaurus for Economics’ (STW), which is modelled and regularly published as Linked Open Data. Moreover, it is integrated into the institution’s subject repository to automatically suggest appropriate keywords while indexing and retrieving documents, and to automatically expand search queries on demand for better search results. While this approach looks promising as a lightweight way to augment ‘off the shelf’ repository software systems with a disciplinary profile, there is still significant uncertainty about the effective usage and impact of controlled terms in the realm of these systems. To address this, we analyze the repository’s log files for evidence of search behaviour that is potentially influenced by auto suggestion and expansion of scientific terms derived from a discipline’s literature. »
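The automatic query expansion mentioned in this abstract can be pictured with a minimal sketch. It assumes a simple term-to-related-terms lookup; the thesaurus entries and the `expand_query` helper are hypothetical and do not reflect actual STW content or the ZBW web service interface.

```python
from typing import Dict, List

# Hypothetical toy thesaurus; these entries are NOT taken from the STW.
TOY_THESAURUS: Dict[str, List[str]] = {
    "unemployment": ["joblessness", "labour market"],
    "inflation": ["price increase"],
}


def expand_query(query: str, thesaurus: Dict[str, List[str]]) -> List[str]:
    """Return the query terms followed by any related thesaurus terms."""
    terms = query.lower().split()
    expanded = list(terms)
    for term in terms:
        for related in thesaurus.get(term, []):
            # Keep the original order and avoid adding a term twice.
            if related not in expanded:
                expanded.append(related)
    return expanded


print(expand_query("Inflation policy", TOY_THESAURUS))
```

A search backend could then match documents against any of the returned terms, which is one plausible way expansion can surface documents that use different wording.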