The NIH Open Citation Collection: A public access, broad coverage resource

Authors : B. Ian Hutchins, Kirk L. Baker, Matthew T. Davis, Mario A. Diwersy, Ehsanul Haque, Robert M. Harriman, Travis A. Hoppe, Stephen A. Leicht, Payam Meyer, George M. Santangelo

Citation data have remained hidden behind proprietary, restrictive licensing agreements, which raises barriers to entry for analysts wishing to use the data, increases the expense of performing large-scale analyses, and reduces the robustness and reproducibility of the conclusions.

For the past several years, the National Institutes of Health (NIH) Office of Portfolio Analysis (OPA) has been aggregating and enhancing citation data that can be shared publicly. Here, we describe the NIH Open Citation Collection (NIH-OCC), a public access database for biomedical research that is made freely available to the community.

This dataset, which has been carefully generated from unrestricted data sources such as MedLine, PubMed Central (PMC), and CrossRef, now underlies the citation statistics delivered in the NIH iCite analytic platform.

We have also included data from a machine learning pipeline that identifies, extracts, resolves, and disambiguates references from full-text articles available on the internet. Open citation links are available to the public in a major update of iCite (

The diverse niches of megajournals: Specialism within generalism

Authors: Kyle Siler, Vincent Larivière, Cassidy R. Sugimoto

Over the past decade, megajournals have expanded in popularity and established a legitimate niche in academic publishing. Leveraging advantages of digital publishing, megajournals are characterized by large publication volume, broad interdisciplinary scope, and peer‐review filters that select primarily for scientific soundness as opposed to novelty or originality.

These publishing innovations are complementary and competitive vis‐à‐vis traditional journals. We analyze how megajournals (PLOS One, Scientific Reports) are represented in different fields relative to prominent generalist journals (Nature, PNAS, Science) and “quasi‐megajournals” (Nature Communications, PeerJ).

Our results show that both megajournals and prominent traditional journals have distinctive niches, despite the similar interdisciplinary scopes of such journals.

These niches—defined by publishing volume and disciplinary diversity—are dynamic and varied over the relatively brief histories of the analyzed megajournals. Although the life sciences are the predominant contributor to megajournals, there is variation in the disciplinary composition of different megajournals.

The growth trajectories and disciplinary composition of generalist journals—including megajournals—reflect changing knowledge dissemination and reward structures in science.

Cultural obstacles to research data management and sharing at TU Delft

Authors : Esther Plomp, Nicolas Dintzner, Marta Teperek, Alastair Dunning

Research data management (RDM) is increasingly important in scholarship. Many researchers are, however, unaware of the benefits of good RDM and unsure about the practical steps they can take to improve their RDM practices. Delft University of Technology (TU Delft) addresses this cultural barrier by appointing Data Stewards at every faculty.

By providing expert advice and increasing awareness, the Data Stewardship project focuses on incremental improvements in current data and software management and sharing practices.

This cultural change is accelerated by the Data Champions who share best practices in data management with their peers. The Data Stewards and Data Champions build a community that allows a discipline-specific approach to RDM. Nevertheless, cultural change also requires appropriate rewards and incentives.

While local initiatives are important, and we discuss several examples in this paper, systemic changes to the academic rewards system are needed. This will require collaborative efforts of a broad coalition of stakeholders and we will mention several such initiatives.

This article demonstrates that community building is essential in changing the code and data management culture at TU Delft.

Echoes of scientific publications in the humanities and social sciences on social networks: The case of OpenEdition content on Twitter

Authors : Lucie Loubère, Fidelia Ibekwe-Sanjuan

Social networks, as they have spread throughout society, have also entered the world of research. These tools speed up the circulation of information and may reach an audience different from that of the academic circuit.

At the same time, open knowledge platforms are developing and making scientific knowledge accessible to everyone. Our study focuses on tweets posted between 2013 and 2017 that link to OpenEdition content. We analyzed the retweet networks as well as the textual content of the tweets using lexicometric analysis.

An interdisciplinary classification for the exchange and mediation of open research data

Authors : Marcin Trzmielewski, Claudio Gnoli

This paper evaluates the Integrative Levels Classification (ILC) as a means of organizing open research data in the humanities and social sciences. It draws on an analysis of the ILC's properties in relation to the practices, uses, and context of data sharing and mediation.

Revisiting “the 1990s debutante”: Scholar‐led publishing and the prehistory of the open access movement

Author : Samuel A. Moore

The movement for open access publishing (OA) is often said to have its roots in the scientific disciplines, having been popularized by scientific publishers and formalized through a range of top‐down policy interventions. But there is an often‐neglected prehistory of OA that can be found in the early DIY publishers of the late 1980s and early 1990s.

Managed entirely by working academics, these journals published research in the humanities and social sciences and stand out for their unique set of motivations and practices.

This article explores this separate lineage in the history of the OA movement through a critical‐theoretical analysis of the motivations and practices of the early scholar‐led publishers.

Alongside showing the involvement of the humanities and social sciences in the formation of OA, the analysis reveals the importance that these journals placed on experimental practices, critique of commercial publishing, and the desire to reach new audiences.

The Future of OA: A large-scale analysis projecting Open Access publication and readership

Authors : Heather Piwowar, Jason Priem, Richard Orr

Understanding the growth of open access (OA) is important for deciding funder policy, subscription allocation, and infrastructure planning.

This study analyses the number of papers available as OA over time. The model includes both OA embargo data and the relative growth rates of different OA types over time, based on the OA status of 70 million journal articles published between 1950 and 2019.

The study also looks at article usage data, comparing the proportion of views going to OA articles with the proportion going to closed-access articles. Signal processing techniques are used to model how these viewership patterns change over time. Viewership data is based on 2.8 million uses of the Unpaywall browser extension in July 2019.

We found that Green, Gold, and Hybrid papers receive more views than their Closed or Bronze counterparts, particularly Green papers made available within a year of publication. We also found that the proportion of Green, Gold, and Hybrid articles is growing most quickly.

In 2019:

  • 31% of all journal articles are available as OA
  • 52% of article views are to OA articles

Given existing trends, we estimate that by 2025:

  • 44% of all journal articles will be available as OA
  • 70% of article views will be to OA articles

The declining relevance of closed access articles is likely to change the landscape of scholarly communication in the years to come.

