Open access analytics with open access repository data: A Multi-level perspective

Author : Ibraheem Mohammed Sultan Al Sadi

Nearly two decades after the open access movement emerged, its community has turned its attention to understanding the movement's development, coverage, obstacles and motivations. To do so, it depends on data-centric analytics of open access publishing activities, using the Web information space as the data source for these analyses.

Open access repositories are one such data source that nurtures open access publishing activities and are a valuable source for analytics. Therefore, the open access community utilises open access repository infrastructure to develop and operate analytics, harnessing the widely adopted Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) interoperability layer to develop value-added services with an analytics agenda.
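As a rough illustration of the OAI-PMH layer mentioned above, the following Python sketch builds ListRecords request URLs. The base URL and the token value are hypothetical; the `verb`, `metadataPrefix` and `resumptionToken` arguments and the `oai_dc` prefix come from the OAI-PMH 2.0 specification, in which `resumptionToken` is an exclusive argument that replaces `metadataPrefix` on follow-up requests.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build an OAI-PMH ListRecords request URL.

    When a resumption token is supplied it must be the only argument
    besides the verb, so metadataPrefix is left out in that case.
    """
    if resumption_token is not None:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


# Hypothetical repository endpoint and token, used for illustration only.
BASE_URL = "https://repository.example.org/oai"
first_url = list_records_url(BASE_URL)
next_url = list_records_url(BASE_URL, resumption_token="token-123")
print(first_url)
print(next_url)
```

A service provider would issue the first request, then keep following resumption tokens until the repository stops returning one.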

However, this layer has limitations and poses challenges in supporting analytical value-added services. To address this, the research consolidates these practices under the notion of ‘open access analytics’, drawing attention to its significance and bridging it with the data analytics literature.

As part of this, an explanatory case study demonstrates how the OAI-PMH service provider approach supports open access analytics and also presents its limitations, using Registry of Open Access Repositories (ROAR) analytics as the case.

The case study reveals several limitations: open access registries cannot serve as a single point of discovery, owing to the quality of their records and the complexity of the open access repository taxonomy; the unit of analysis is difficult to operationalise in particular analytics, owing to the limitations of the OAI-PMH metadata schemes; the harvesting process is complex and resource-intensive, owing to the large volume of data and the low quality of OAI-PMH standards adoption; and the suitability of the service provider approach is in question because it constitutes a single point of failure.

This doctoral thesis also proposes Open Access Analytics using Open Access Repository Data with a Social Machine (OAA-OARD-SM) as a conceptual framework to deliver open access analytics by using the open access repository infrastructure in a collaborative manner with social machines.

Furthermore, it takes advantage of the web observatory infrastructure as a form of web-based mediated technology to coordinate the open access analytics process. The conceptual framework re-frames the open access analytics process into four layers: the open access repository layer, the open access registry layer, the data analytics layer and open access analytics layer.

It also conceptualises analytics practices carried out within individual repository boundaries as core practices for the realisation of open access analytics and examines how the repository management team can participate in the open access analytics process.

To understand this, expert interviews were carried out to investigate the analytics practices within the repository boundaries and the repository management teams’ interactions with analytics applications that are fed by the open access repository or used by repository management to operate open access analytics.

The interviews provide insight into the variations in the types of analytics practices and highlight the active role played by the repository management team in these practices. They thus provide an understanding of the analytics practices within open access repositories by classifying them into two main categories: distributed analytics applications and locally operated analytics.

The distributed analytics applications include cross-repository OAI-based analytics, cross-repository usage data aggregators, solo-repository content-centric analytics and solo-repository centric analytics.

On the other hand, the locally operated analytics take the form of Current Research Information Systems (CRIS), repository embedded functionalities and in-house developed analytics. The thesis also classifies the repository management interactions with analytics into four roles: data analyst, administrative, data and system management, and system development and support.

Lastly, it raises concerns associated with the application of analytics on open access repositories, including data-related, cost-related and analytical concerns.


OAI-PMH in “the age of the semantic web”: assessment and prospects

Author : Vincent de Lavenne de la Montoise

As the OAI-PMH protocol approaches its twentieth anniversary, and in a web environment that has undergone profound changes (in both technology and usage), what is the current state of data exchange? How have professionals' practices in this area developed? Are they suited to current challenges?

This work sets out to analyse the exposure and exchange of data from a historical perspective, before attempting to understand the current issues that will determine which technical solution(s) to choose.


Federated Search Service for OAI-compliant, Open-Access Repositories in India

Many research institutions and universities across the world are facilitating open access (OA) to their intellectual outputs through their own OA institutional repositories (IRs) or through centralized subject-based repositories. The Registry of Open Access Repositories (ROAR) lists more than 2850 such repositories across the world. Awareness of the benefits of OA to scholarly literature and of OA publishing is picking up in India, too. As per the ROAR statistics, to date, there are more than 90 OA repositories in the country. India is doing particularly well in publishing open-access journals (OAJs). As per the Directory of Open Access Journals (DOAJ), to date, India, with 390 OAJs, is ranked 5th in the world in terms of the number of OAJs being published.

Much of the research done in India is reported in the journals published from India. These journals have limited readership and many of them are not being indexed by Web of Science, Scopus or other leading international abstracting and indexing databases. Consequently, research done in the country gets hidden not only from the fellow countrymen, but also from the international community. This situation can be easily overcome if all the researchers facilitate OA to their publications.

One of the easiest ways to facilitate OA to scientific literature is through institutional repositories. If every research institution and university in India sets up an open-access IR and ensures that copies of the final accepted versions of all its research publications are uploaded to the IR, then the research done in India will get far better visibility. The federation of metadata from all the distributed, interoperable OA repositories in the country will serve as a window to the research done across the country.

Federation of metadata from the distributed OAI-compliant repositories can be easily achieved by setting up harvesting software like the PKP Harvester. In this paper, we share our experience in setting up a prototype metadata harvesting service using the PKP harvesting software for the OAI-compliant repositories in India.
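To make the harvesting step more concrete, here is a minimal Python sketch of what a harvester does with a single ListRecords response. It is not how the PKP Harvester is implemented; the sample XML, identifier, title and author are hypothetical, and only the three namespace URIs are taken from the OAI-PMH 2.0 and Dublin Core specifications.

```python
import xml.etree.ElementTree as ET

# Namespaces defined by OAI-PMH 2.0 and unqualified Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical, heavily trimmed ListRecords response for illustration.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:repository.example.org:1</identifier>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Hypothetical article title</dc:title>
          <dc:creator>Hypothetical Author</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
    <resumptionToken>token-123</resumptionToken>
  </ListRecords>
</OAI-PMH>"""


def parse_list_records(xml_text):
    """Return (records, resumption_token) from a ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.findall("oai:ListRecords/oai:record", NS):
        records.append({
            "identifier": record.findtext("oai:header/oai:identifier", namespaces=NS),
            "titles": [e.text for e in record.findall(".//dc:title", NS)],
            "creators": [e.text for e in record.findall(".//dc:creator", NS)],
        })
    # A missing or empty token means the list is complete.
    token = root.findtext("oai:ListRecords/oai:resumptionToken", namespaces=NS)
    return records, (token or None)


records, token = parse_list_records(SAMPLE_RESPONSE)
print(records, token)
```

A federated service would repeat this for every registered repository, following each resumption token until none is returned, and then index the merged records.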


Web Services for Bibliometrics

Institutional repositories have spread in universities, where they provide services for recording, distributing, and preserving the institution’s intellectual output. When the Lausanne “academic server”, named SERVAL, was launched at the end of 2008, the Faculty of Biology and Medicine addressed the issue of metadata quality from the outset. Accuracy is fundamental, since research funds are allocated on the basis of the statistics and indicators provided by the repository. The Head of Faculty also tasked the medical library with exploring different ways to measure and assess the research output. The first step for the Lausanne university medical library was to implement the PubMed and Web of Science web services to easily extract clean bibliographic information from the databases directly into the repository.

Now the medical library is testing other web services (from CrossRef, Web of Science, etc.), mainly to generate quantitative data on research impact. The approach is essentially based on citation linking. Although the utility of citation and bibliometric evaluation is still debated, the most prevalent output measures used for research evaluation are still those based on citation analysis. Even when a new scientific evaluation indicator is proposed, such as the h-index, we can always see its link with citation. Additionally, the results of a new indicator are often compared with citation analysis. The presentation will review the web services which might be used in institutional repositories to collect and aggregate citation information for the researchers’ publications.
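The link between the h-index and citation counts can be shown with a short Python sketch. It uses the standard definition (the largest h such that h publications each have at least h citations); the citation counts are made-up values, not data from SERVAL or any web service.

```python
def h_index(citation_counts):
    """Return the largest h such that h papers have at least h citations each."""
    ranked = sorted(citation_counts, reverse=True)
    h = 0
    for rank, citations in enumerate(ranked, start=1):
        if citations >= rank:
            h = rank
        else:
            break
    return h


# Made-up citation counts for one researcher's five publications.
example = h_index([10, 8, 5, 4, 3])
print(example)
```

In a repository setting, the input list would be the per-publication citation counts aggregated from whichever citation web service the institution uses.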