Assessment of gender divide in scientific communities

Authors : Antonio De Nicola, Gregorio D’Agostino

Increasing evidence of women’s under-representation in some scientific disciplines is prompting researchers to expand our understanding of this social phenomenon. Moreover, any countermeasures proposed to eliminate this under-representation should be tailored to the actual reasons for this different participation.

Here, we take a multi-dimensional approach to assessing gender differences in science by representing scientific communities as social networks, and using data analytics, complexity science methods, and semantic methods to measure gender differences in the context, the attitude and the success of scientists.

We apply this approach to four scientific communities in the two fields of computer science and information systems using the network of authors at four different conferences. For each discipline, one conference is based in Italy and attracts mostly Italians, while one conference is international in both location and participants.

The present paper provides evidence against common narratives that women’s under-representation is due to women’s limited skills and/or less social centrality.

URL : Assessment of gender divide in scientific communities

DOI : https://doi.org/10.1007/s11192-021-03885-3

Accuracy of PubMed-based author lists of publications and use of author identifiers to address author name ambiguity: a cross-sectional study

Authors : Paul Sebo, Sylvain de Lucia, Nathalie Vernaz

Objective

To assess the accuracy of PubMed-based author lists of publications and use of author identifiers to address author name ambiguity.

Methods

In this Swiss study conducted in 2019, 300 hospital-based senior physicians were asked to generate a list of their publications in PubMed and complete a questionnaire (type of query used, number of errors in their list of publications, knowledge and use of ORCID and ResearcherID).

Results

156 physicians (52%) agreed to participate, 145 of whom published at least one article (mean number of publications: 60 (SD 73)). Only 17% used the advanced search option. On average, there were 5 articles in the lists that were not co-authored by participants (advanced search: 1.0 (SD 2.6) vs. 5.9 (SD 13.9), p-value 0.02) and 3 articles co-authored by participants that did not appear in the lists (advanced search: 1.5 (SD 2.0) vs. 3.6 (SD 8.4), p-value 0.05). Although 82% were aware of ORCID, only 16% added all their articles (39% and 6% respectively for ResearcherID).

Conclusions

When used by senior physicians, the advanced search in PubMed is accurate for retrieving authors’ publications. Author identifiers are only used by a minority of physicians and are therefore not recommended in this context, as they would lead to inaccurate results.

URL : Accuracy of PubMed-based author lists of publications and use of author identifiers to address author name ambiguity: a cross-sectional study

DOI : https://doi.org/10.1007/s11192-020-03845-3

The aging effect in evolving scientific citation networks

Authors : Feng Hu, Lin Ma, Xiu-Xiu Zhan, Yinzuo Zhou, Chuang Liu, Haixing Zhao, Zi-Ke Zhang

The study of citation networks is of interest to the scientific community. However, the underlying mechanism driving individual citation behavior remains imperfectly understood, despite the recent proliferation of quantitative research methods.

Traditional network models normally use graph theory to consider articles as nodes and citations as pairwise relationships between them. In this paper, we propose an alternative evolutionary model based on hypergraph theory in which one hyperedge can have an arbitrary number of nodes, combined with an aging effect to reflect the temporal dynamics of scientific citation behavior.

Both a theoretical approximate solution and a simulation analysis of the model are developed and validated using two benchmark datasets from different disciplines, i.e., publications of the American Physical Society (APS) and the Digital Bibliography & Library Project (DBLP).

Further analysis indicates that the attraction of early publications will decay exponentially. Moreover, the experimental results show that the aging effect indeed has a significant influence on the description of collective citation patterns.
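The abstract does not give the model's equations, so the following is only a toy sketch of the general idea, not the authors' model. It assumes that each new paper forms one hyperedge containing itself and the earlier papers it cites, and that an earlier paper is chosen with weight equal to its hyperdegree multiplied by an exponential aging factor. The function names, `refs_per_paper`, and `decay_rate` are hypothetical choices made for illustration.

```python
import math
import random


def weighted_sample(rng, items, weights, k):
    """Draw k distinct items, each draw proportional to the remaining weights."""
    items = list(items)
    weights = list(weights)
    chosen = []
    for _ in range(k):
        pick = rng.choices(range(len(items)), weights=weights, k=1)[0]
        chosen.append(items.pop(pick))
        weights.pop(pick)
    return chosen


def grow_hypergraph(steps, refs_per_paper=3, decay_rate=0.1, seed=0):
    """Grow a toy citation hypergraph with exponential aging (illustrative only).

    Assumption: attractiveness of paper i at time t is
    hyperdegree[i] * exp(-decay_rate * (t - birth[i])).
    """
    rng = random.Random(seed)
    birth = [0]          # birth[i] = time step at which paper i appeared
    hyperdegree = [1]    # number of hyperedges each paper belongs to
    hyperedges = [(0,)]  # seed hyperedge containing only paper 0

    for t in range(1, steps + 1):
        candidates = list(range(len(birth)))
        weights = [
            hyperdegree[i] * math.exp(-decay_rate * (t - birth[i]))
            for i in candidates
        ]
        k = min(refs_per_paper, len(candidates))
        cited = weighted_sample(rng, candidates, weights, k)

        new_paper = len(birth)
        birth.append(t)
        hyperdegree.append(1)
        for i in cited:
            hyperdegree[i] += 1
        # One hyperedge joins the new paper and everything it cites.
        hyperedges.append(tuple(sorted(cited)) + (new_paper,))

    return hyperedges, hyperdegree


if __name__ == "__main__":
    edges, degrees = grow_hypergraph(steps=200)
    print(len(edges), max(degrees))
```

Raising `decay_rate` in this sketch shifts citations toward recent papers, while setting it to 0 reduces the rule to plain preferential attachment on hyperdegree.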

Shedding light on the complex dynamics driving these mechanisms facilitates the understanding of the laws governing scientific evolution and the quantitative evaluation of scientific outputs.

URL : The aging effect in evolving scientific citation networks

DOI : https://doi.org/10.1007/s11192-021-03929-8

What is the benefit from publishing a working paper in a journal in terms of citations? Evidence from economics

Authors : Klaus Wohlrabe, Constantin Bürgi

Many papers in economics that are published in peer-reviewed journals are initially released in widely circulated working paper series. This raises the question about the benefit of publishing in a peer-reviewed journal in terms of citations.

Specifically, we address the question: to what extent does the stamp of approval obtained by publishing in a peer-reviewed journal lead to more subsequent citations for papers that are already available in working paper series? Our data set comprises about 28,000 working papers from four major working paper series in economics.

Using panel data methods, we show that publication in a peer-reviewed journal results in around twice the number of yearly citations relative to working papers that never get published in a journal. Our results hold in several robustness checks.

URL : What is the benefit from publishing a working paper in a journal in terms of citations? Evidence from economics

DOI : https://doi.org/10.1007/s11192-021-03942-x

Evidence for Trusted Digital Repository Reviews: An Analysis of Perspectives

Author : Jonathan David Crabtree

Building trust in our research infrastructure is important for the future of the academy. Trust in research data repositories is critical as they provide the evidence for past discoveries as well as the input for future discoveries.

Archives and repositories are examining their options for trustworthy review, audit, and certification as a means to build trust within their content creator and user communities. One option these institutions have to increase and demonstrate their trustworthiness is to apply for the CoreTrustSeal.

Applicants for the CoreTrustSeal are becoming more numerous and diverse, ranging from general-purpose repositories to preservation infrastructure providers and domain repositories. This demand for certification and the subjective nature of decisions around levels of CoreTrustSeal compliance drive this dissertation.

It is a study of the review process and its veracity and consistency in determining the trustworthiness of applicant repositories. Several assumptions underlie this work. First, audits and reviews must be based on evidence supplied by the repository under scrutiny. Second, not all reviewers will approach a piece of evidence in the same fashion or give it the same weight. Third, the value and veracity of required evidence may be subject to reviewers’ diverse perspectives and diverse repository community norms.

This research used a thematic qualitative analysis approach to identify similarities and differences in CoreTrustSeal reviewers’ responses during semi-structured interviews in order to better understand potential subjective differences among respondents. The participants’ non-probabilistic sample represented a balance in perspectives across three anticipated categories: administrator, archivist, and technologist.

Themes converged around several key concepts. Nearly all participants felt they were performing a peer review process and working to help the repository community and the research enterprise.

Reviewers were questioned about the various CoreTrustSeal application requirements and which ones they felt were the most important. No clear evidence emerged to indicate that variations in perspectives affected the subjective review of application evidence. The same categories of evidence were often selected and identified as being critical across all three categories (i.e., administrator, archivist, and technologist).

Many valuable suggestions from participants were recorded and can be implemented to ensure the consistency and sustainability of this trusted repository review process.

These suggestions and concepts were also very evenly distributed across the three perspectives. The balance in perspectives is potentially due to participants’ experience levels and their years of experience in various positions, holding many responsibilities, within the organizations they represented.

DOI : https://doi.org/10.17615/npck-km73

An analysis of the factors affecting open access to research output in institutional repositories in selected universities in East Africa

Author : Miriam Kakai

INTRODUCTION

Institutional repositories (IRs) present universities with an opportunity to provide global open access (OA) to their scholarship; however, this avenue was underutilised in two of the three universities in this study.

This study aimed at proposing interventions to improve access to research output in IRs in universities in East Africa, and it adds to the depth of knowledge on IRs by pointing out the factors that limit OA in IRs, some of which include lack of government and funder support for OA and mediated content collection workflows that hardly involved seeking author permission to self-archive.

METHODS

A mixed methods approach, following a concurrent strategy, was used to investigate the low level of OA in IRs. Data was collected from three purposively selected IRs in universities in East Africa, using self-administered questionnaires from 183 researchers and face-to-face interviews with six librarians.

RESULTS

The findings revealed that content was collected on a voluntary basis, with most of the research output deposited in the IR without the authors’ knowledge. The respondents in this study were, however, supportive of the activities of the IR, and would participate in providing research output in the IR as OA if required to do so.

CONCLUSION

The low level of OA in IRs in universities in East Africa could be increased by improving the IR workflow, collection development, and marketing processes. Self-archiving could be improved by increasing the researchers’ awareness and knowledge of OA and the importance of IRs, while addressing their concerns about copyright infringement.

URL : An analysis of the factors affecting open access to research output in institutional repositories in selected universities in East Africa

DOI : http://doi.org/10.7710/2162-3309.2276

Versioning Data Is About More than Revisions: A Conceptual Framework and Proposed Principles

Authors : Jens Klump, Lesley Wyborn, Mingfang Wu, Julia Martin, Robert R. Downs, Ari Asmi

A dataset, small or big, is often changed to correct errors, apply new algorithms, or add new data (e.g., as part of a time series).

In addition, datasets might be bundled into collections, distributed in different encodings or mirrored onto different platforms. All these differences between versions of datasets need to be understood by researchers who want to cite the exact version of the dataset that was used to underpin their research.

Failing to do so reduces the reproducibility of research results. Ambiguous identification of datasets also impacts researchers and data centres who are unable to gain recognition and credit for their contributions to the collection, creation, curation and publication of individual datasets.

Although the means to identify datasets using persistent identifiers have been in place for more than a decade, systematic data versioning practices are currently not available. In this work, we analysed 39 use cases and current practices of data versioning across 33 organisations.

We noticed that the term ‘version’ was used in a very general sense, extending beyond the more common understanding of ‘version’ to refer primarily to revisions and replacements. Using concepts developed in software versioning and the Functional Requirements for Bibliographic Records (FRBR) as a conceptual framework, we developed six foundational principles for versioning of datasets: Revision, Release, Granularity, Manifestation, Provenance and Citation.

These six principles provide a high-level framework for guiding the consistent practice of data versioning and can also serve as guidance for data centres or data providers when setting up their own data revision and version protocols and procedures.

URL : Versioning Data Is About More than Revisions: A Conceptual Framework and Proposed Principles

DOI : http://doi.org/10.5334/dsj-2021-012