The aging effect in evolving scientific citation networks

Authors : Feng Hu, Lin Ma, Xiu-Xiu Zhan, Yinzuo Zhou, Chuang Liu, Haixing Zhao, Zi-Ke Zhang

The study of citation networks is of interest to the scientific community. However, the underlying mechanism driving individual citation behavior remains imperfectly understood, despite the recent proliferation of quantitative research methods.

Traditional network models normally use graph theory to consider articles as nodes and citations as pairwise relationships between them. In this paper, we propose an alternative evolutionary model based on hypergraph theory in which one hyperedge can have an arbitrary number of nodes, combined with an aging effect to reflect the temporal dynamics of scientific citation behavior.

Both an approximate theoretical solution and a simulation analysis of the model are developed and validated using two benchmark datasets from different disciplines, i.e., publications of the American Physical Society (APS) and the Digital Bibliography & Library Project (DBLP).

Further analysis indicates that the attraction of early publications will decay exponentially. Moreover, the experimental results show that the aging effect indeed has a significant influence on the description of collective citation patterns.

Shedding light on the complex dynamics driving these mechanisms facilitates the understanding of the laws governing scientific evolution and the quantitative evaluation of scientific outputs.
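
The aging effect described in this abstract can be illustrated with a small sketch. The abstract does not give the model's equations, so everything here is an assumption made for illustration: the attachment weight is taken to be a paper's hyperdegree multiplied by an exponential decay in its age, and the names `aging_weights`, `sample_cited_papers`, and `decay_rate` are hypothetical rather than taken from the paper.

```python
import math
import random


def aging_weights(hyperdegrees, publication_times, now, decay_rate):
    """Return one weight per paper: hyperdegree damped by exp(-decay_rate * age).

    Assumed functional form for illustration only, not the paper's exact model.
    """
    return [
        k * math.exp(-decay_rate * (now - t_pub))
        for k, t_pub in zip(hyperdegrees, publication_times)
    ]


def sample_cited_papers(hyperdegrees, publication_times, now, decay_rate, n_refs, rng):
    """Draw n_refs paper indices (with replacement) in proportion to their weights."""
    weights = aging_weights(hyperdegrees, publication_times, now, decay_rate)
    return rng.choices(range(len(weights)), weights=weights, k=n_refs)


# Three existing papers: two with equal hyperdegree but different ages, one recent.
rng = random.Random(0)
cited = sample_cited_papers([3, 3, 1], [0, 8, 9], now=10, decay_rate=0.5, n_refs=4, rng=rng)
```

Under this assumed form, two papers with the same hyperdegree are not equally attractive: the older one receives a smaller weight, which is the qualitative behavior the abstract attributes to aging.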

DOI : https://doi.org/10.1007/s11192-021-03929-8

What is the benefit from publishing a working paper in a journal in terms of citations? Evidence from economics

Authors : Klaus Wohlrabe, Constantin Bürgi

Many papers in economics that are published in peer reviewed journals are initially released in widely circulated working paper series. This raises the question about the benefit of publishing in a peer-reviewed journal in terms of citations.

Specifically, we address the question: to what extent does the stamp of approval obtained by publishing in a peer-reviewed journal lead to more subsequent citations for papers that are already available in working paper series? Our data set comprises about 28,000 working papers from four major working paper series in economics.

Using panel data methods, we show that the publication in a peer reviewed journal results in around twice the number of yearly citations relative to working papers that never get published in a journal. Our results hold in several robustness checks.

DOI : https://doi.org/10.1007/s11192-021-03942-x

Evidence for Trusted Digital Repository Reviews: An Analysis of Perspectives

Author : Jonathan David Crabtree

Building trust in our research infrastructure is important for the future of the academy. Trust in research data repositories is critical as they provide the evidence for past discoveries as well as the input for future discoveries.

Archives and repositories are examining their options for trustworthy review, audit, and certification as a means to build trust within their content creator and user communities. One option these institutions have to increase and demonstrate their trustworthiness is to apply for the CoreTrustSeal.

Applicants for the CoreTrustSeal are becoming more numerous and diverse, ranging from general-purpose repositories to preservation infrastructure providers and domain repositories. This demand for certification and the subjective nature of decisions around levels of CoreTrustSeal compliance drive this dissertation.

It is a study of the review process and its veracity and consistency in determining the trustworthiness of applicant repositories. Several assumptions underlie this work. First, audits and reviews must be based on evidence supplied by the repository under scrutiny. Second, not all reviewers will approach a piece of evidence in the same fashion or give it the same weight. Third, the value and veracity of required evidence may be subject to reviewers’ diverse perspectives and diverse repository community norms.

This research used a thematic qualitative analysis approach to identify similarities and differences in CoreTrustSeal reviewers’ responses during semi-structured interviews in order to better understand potential subjective differences among respondents. The non-probabilistic sample of participants represented a balance in perspectives across three anticipated categories: administrator, archivist, and technologist.

Themes converged around several key concepts. Nearly all participants felt they were performing a peer review process and working to help the repository community and the research enterprise.

Reviewers were questioned about the various CoreTrustSeal application requirements and which ones they felt were the most important. No clear evidence emerged to indicate that variations in perspectives affected the subjective review of application evidence. The same categories of evidence were often selected and identified as being critical across all three categories (i.e., administrator, archivist, and technologist).

Many valuable suggestions from participants were recorded and can be implemented to ensure the consistency and sustainability of this trusted repository review process.

These suggestions and concepts were also very evenly distributed across the three perspectives. The balance in perspectives is potentially due to participants’ experience levels and their years of experience in various positions, holding many responsibilities, within the organizations they represented.

DOI : https://doi.org/10.17615/npck-km73

An analysis of the factors affecting open access to research output in institutional repositories in selected universities in East Africa

Author : Miriam Kakai

INTRODUCTION

Institutional repositories (IRs) present universities with an opportunity to provide global open access (OA) to their scholarship; however, this avenue was underutilised in two of the three universities in this study.

This study aimed at proposing interventions to improve access to research output in IRs in universities in East Africa, and it adds to the depth of knowledge on IRs by pointing out the factors that limit OA in IRs, some of which include lack of government and funder support for OA and mediated content collection workflows that hardly involved seeking author permission to self-archive.

METHODS

A mixed methods approach, following a concurrent strategy, was used to investigate the low level of OA in IRs. Data was collected from three purposively selected IRs in universities in East Africa, using self-administered questionnaires from 183 researchers and face-to-face interviews with six librarians.

RESULTS

The findings revealed that content was collected on a voluntary basis, with most of the research output deposited in the IR without the authors’ knowledge. The respondents in this study were, however, supportive of the activities of the IR, and would participate in providing research output in the IR as OA if required to do so.

CONCLUSION

The low level of OA in IRs in universities in East Africa could be increased by improving the IR workflow, collection development, and marketing processes. Self-archiving could be improved by increasing the researchers’ awareness and knowledge of OA and importance of IRs, while addressing their concerns about copyright infringement.

DOI : http://doi.org/10.7710/2162-3309.2276

Versioning Data Is About More than Revisions: A Conceptual Framework and Proposed Principles

Authors : Jens Klump, Lesley Wyborn, Mingfang Wu, Julia Martin, Robert R. Downs, Ari Asmi

A dataset, small or big, is often changed to correct errors, apply new algorithms, or add new data (e.g., as part of a time series), etc.

In addition, datasets might be bundled into collections, distributed in different encodings or mirrored onto different platforms. All these differences between versions of datasets need to be understood by researchers who want to cite the exact version of the dataset that was used to underpin their research.

Failing to do so reduces the reproducibility of research results. Ambiguous identification of datasets also impacts researchers and data centres who are unable to gain recognition and credit for their contributions to the collection, creation, curation and publication of individual datasets.

Although the means to identify datasets using persistent identifiers have been in place for more than a decade, systematic data versioning practices are currently not available. In this work, we analysed 39 use cases and current practices of data versioning across 33 organisations.

We noticed that the term ‘version’ was used in a very general sense, extending beyond the more common understanding of ‘version’ to refer primarily to revisions and replacements. Using concepts developed in software versioning and the Functional Requirements for Bibliographic Records (FRBR) as a conceptual framework, we developed six foundational principles for versioning of datasets: Revision, Release, Granularity, Manifestation, Provenance and Citation.

These six principles provide a high-level framework for guiding the consistent practice of data versioning and can also serve as guidance for data centres or data providers when setting up their own data revision and version protocols and procedures.

DOI : http://doi.org/10.5334/dsj-2021-012

Inferring the causal effect of journals on citations

Author : Vincent A Traag

Articles in high-impact journals are, on average, more frequently cited. But are they cited more often because those articles are somehow more “citable”? Or are they cited more often simply because they are published in a high-impact journal? Although some evidence suggests the latter, the causal relationship is not clear.

We here compare citations of preprints to citations of the published version to uncover the causal mechanism. We build on an earlier model of citation dynamics to infer the causal effect of journals on citations. We find that high-impact journals select articles that tend to attract more citations.

At the same time, we find that high-impact journals augment the citation rate of published articles. Our results yield a deeper understanding of the role of journals in the research system.

The use of journal metrics in research evaluation has been increasingly criticized in recent years and article-level citations are sometimes suggested as an alternative. Our results show that removing impact factors from evaluation does not negate the influence of journals. This insight has important implications for changing practices of research evaluation.

DOI : https://doi.org/10.1162/qss_a_00128

Open access analytics with open access repository data: A Multi-level perspective

Author : Ibraheem Mohammed Sultan Al Sadi

In the nearly two decades since the open access movement emerged, its community has turned its attention to understanding its development, coverage, obstacles and motivations. To do so, it depends on data-centric analytics of open access publishing activities, using the Web information space as the data source for these analytical activities.

Open access repositories are one such data source that nurtures open access publishing activities and are a valuable source for analytics. Therefore, the open access community utilises open access repository infrastructure to develop and operate analytics, harnessing the widely adopted Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) interoperability layer to develop value-added services with an analytics agenda.

However, this layer presents limitations and challenges regarding the support of analytical value-added services. To address them, this research consolidates these practices into the notion of ‘open access analytics’, drawing attention to its significance and bridging it with the data analytics literature.

As part of this, an explanatory case study of Registry of Open Access Repositories (ROAR) analytics demonstrates how the OAI-PMH service provider approach supports open access analytics and also presents its limitations.

The case study reveals four limitations: open access registries cannot serve as a single point of discovery because of the quality of their records and the complexity of the open access repository taxonomy; the unit of analysis is difficult to operationalise in particular analytics because of limitations in the OAI-PMH metadata schemes; harvesting is complex and resource-intensive because of the large volume of data and the low quality of OAI-PMH standards adoption; and the suitability of the service provider approach is in question because it is a single point of failure.
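
To make the harvesting process mentioned above more concrete, here is a minimal sketch of how an OAI-PMH harvester builds its `ListRecords` request URLs. The verb and the `metadataPrefix` and `resumptionToken` arguments come from the OAI-PMH protocol; the function name `list_records_url` and the base URL `https://repository.example.org/oai` are hypothetical and are not taken from the thesis.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request URL for an OAI-PMH endpoint.

    In OAI-PMH, resumptionToken is an exclusive argument: when it is present,
    only the verb and the token are sent.
    """
    if resumption_token is not None:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return f"{base_url}?{urlencode(params)}"


# Hypothetical endpoint, used only for illustration.
first_page = list_records_url("https://repository.example.org/oai")
next_page = list_records_url("https://repository.example.org/oai", resumption_token="abc123")
```

A harvester repeats the second form until the repository stops returning a resumption token, so a large repository requires many sequential requests. This is one plausible reading of why the abstract calls harvesting resource-intensive.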

Also, this doctoral thesis proposes the use of Open Access Analytics using Open Access Repository Data with a Social Machine (OAA-OARD-SM) as a conceptual framework to deliver open access analytics by using the open access repository infrastructure in a collaborative manner with social machines.

Furthermore, it takes advantage of the web observatory infrastructure as a form of web-based mediated technology to coordinate the open access analytics process. The conceptual framework re-frames the open access analytics process into four layers: the open access repository layer, the open access registry layer, the data analytics layer and open access analytics layer.

It also conceptualises analytics practices carried out within individual repository boundaries as core practices for the realisation of open access analytics and examines how the repository management team can participate in the open access analytics process.

To understand this, expert interviews were carried out to investigate and understand the analytics practices within the repository boundaries and the repository management teams’ interactions with analytics applications that are fed by the open access repository or used by repository management to operate open access analytics.

The interviews provide insight into the variations in the types of analytic practices and highlight the active role played by the repository management team in these practices. Thus, it provides an understanding of the analytics practices within open access repositories by classifying them into two main categories: the distributed analytical applications and locally operated analytics.

The distributed analytical applications include cross-repository OAI-based analytics, cross-repository usage data aggregators, solo-repository content-centric analytics and solo-repository centric analytics.

On the other hand, the locally operated analytics take the form of a Current Research Information System (CRIS), repository embedded functionalities and in-house developed analytics. It also classifies the repository management interactions with analytics into four roles: data analyst, administrative, data and system management, and system development and support.

Lastly, it raises concerns associated with the application of analytics on open access repositories, including data-related, cost-related and analytical concerns.

URL : http://eprints.soton.ac.uk/id/eprint/447464