Over-optimization of academic publishing metrics: observing Goodhart’s Law in action

Authors : Michael Fire, Carlos Guestrin

Background

The academic publishing world is changing significantly, with ever-growing numbers of publications each year and shifting publishing patterns. However, the metrics used to measure academic success, such as the number of publications, citation number, and impact factor, have not changed for decades.

Moreover, recent studies indicate that these metrics have become targets and follow Goodhart’s Law, according to which, “when a measure becomes a target, it ceases to be a good measure.”

Results

In this study, we analyzed >120 million papers to examine how the academic publishing world has evolved over the last century, with a deeper look into the specific field of biology. Our study shows that the validity of citation-based measures is being compromised and their usefulness is lessening.

In particular, the number of publications has ceased to be a good metric as a result of longer author lists, shorter papers, and surging publication numbers. Citation-based metrics, such as citation number and h-index, are likewise affected by the flood of papers, self-citations, and lengthy reference lists.
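As a rough illustration of why a metric such as the h-index is sensitive to extra citations, the sketch below computes it from a list of per-paper citation counts using the standard definition (the largest h such that h papers have at least h citations each). The function name and the citation counts are hypothetical and are not taken from the study.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    # Rank papers from most cited to least cited.
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank  # the top `rank` papers all have at least `rank` citations
        else:
            break
    return h


# Hypothetical citation counts for one researcher (illustrative only).
baseline = [10, 8, 5, 4, 3]
# The same papers after a few extra citations to the least-cited ones,
# for example through self-citation.
boosted = [10, 8, 5, 5, 5]

print(h_index(baseline), h_index(boosted))
```

In this made-up case, three additional citations spread over the two least-cited papers are enough to raise the h-index by one, which is the kind of sensitivity the abstract points to.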

Measures such as a journal’s impact factor have also ceased to be good metrics due to the soaring numbers of papers that are published in top journals, particularly from the same pool of authors.

Moreover, by analyzing properties of >2,600 research fields, we observed that citation-based metrics are not beneficial for comparing researchers in different fields, or even in the same department.

Conclusions

Academic publishing has changed considerably; now we need to reconsider how we measure success.

URL : Over-optimization of academic publishing metrics: observing Goodhart’s Law in action

DOI : https://doi.org/10.1093/gigascience/giz053

Scientific misconduct and accountability in teams

Authors : Katrin Hussinger, Maikel Pellens

Increasing complexity and multidisciplinarity make collaboration essential for modern science. This, however, raises the question of how to assign accountability for scientific misconduct among larger teams of authors. Biomedical societies and science associations have put forward various sets of guidelines. Some state that all authors are jointly accountable for the integrity of the work.

Others stipulate that authors are only accountable for their own contribution. Alternatively, there are guarantor type models that assign accountability to a single author. We contribute to this debate by analyzing the outcomes of 80 scientific misconduct investigations of biomedical scholars conducted by the U.S. Office of Research Integrity (ORI).

We show that the position of authors on the byline of 184 publications involved in misconduct cases correlates with responsibility for the misconduct. Based on a series of binary regression models, we show that first authors are 38% more likely to be responsible for scientific misconduct than authors listed in the middle of the byline (p<0.01). Corresponding authors are 14% more likely (p<0.05).

These findings suggest that a guarantor-like model in which first authors are ex-ante accountable for misconduct is highly likely to catch the author responsible, while not afflicting too many bystanders.

URL : Scientific misconduct and accountability in teams

DOI : https://doi.org/10.1371/journal.pone.0215962

Governance of a global genetic resource commons for non-commercial research: A case-study of the DNA barcode commons

Authors : Janis Geary, Tania Bubela

Life sciences research that uses genetic resources is increasingly collaborative and global, yet collective action remains a significant barrier to the creation and management of shared research resources. These resources include sequence data and associated metadata, and biological samples, and can be understood as a type of knowledge commons.

Collective action by stakeholders to create and use knowledge commons for research has potential benefits for all involved, including minimizing costs and sharing risks, but there are gaps in our understanding of how institutional arrangements may promote such collective action in the context of global genetic resources.

We address this research gap by examining the attributes of an exemplar global knowledge commons: The DNA barcode commons. DNA barcodes are short, standardized gene regions that can be used to inexpensively identify unknown specimens, and proponents have led international efforts to make DNA barcodes a standard species identification tool.

Our research examined if and how attributes of the DNA barcode commons, including governance of DNA barcode resources and management of infrastructure, facilitate global participation in DNA barcoding efforts. Our data sources included key informant interviews, organizational documents, scientific outputs of the DNA barcoding community, and DNA barcode record submissions.

Our research suggested that the goal of creating a globally inclusive DNA barcode commons is partially impeded by the assumption that scientific norms and expectations held by researchers in high income countries are universal. We found scientific norms are informed by a complex history of resource misappropriation and mistrust between stakeholders.

DNA barcode organizations can mitigate the challenges caused by their global membership by creating more inclusive governance structures, developing community norms that are specific to the context of DNA barcoding, and increasing awareness and knowledge of pertinent legal frameworks.

URL : Governance of a global genetic resource commons for non-commercial research: A case-study of the DNA barcode commons

Alternative location : https://www.thecommonsjournal.org/articles/10.18352/ijc.859/

Crowdsourcing in medical research: concepts and applications

Authors : Joseph D. Tucker, Suzanne Day, Weiming Tang, Barry Bayus

Crowdsourcing shifts medical research from a closed environment to an open collaboration between the public and researchers. We define crowdsourcing as an approach to problem solving which involves an organization having a large group attempt to solve a problem or part of a problem, then sharing solutions.

Crowdsourcing allows large groups of individuals to participate in medical research through innovation challenges, hackathons, and related activities. The purpose of this literature review is to examine the definition, concepts, and applications of crowdsourcing in medicine.

This multi-disciplinary review defines crowdsourcing for medicine, identifies conceptual antecedents (collective intelligence and open source models), and explores implications of the approach. Several critiques of crowdsourcing are also examined.

Although several crowdsourcing definitions exist, there are two essential elements: (1) having a large group of individuals, including those with skills and those without skills, propose potential solutions; (2) sharing solutions through implementation or open access materials.

The public can be a central force in contributing to formative, pre-clinical, and clinical research. A growing evidence base suggests that crowdsourcing in medicine can result in high-quality outcomes, broad community engagement, and more open science.

URL : Crowdsourcing in medical research: concepts and applications

DOI : https://doi.org/10.7717/peerj.6762

Responsible data sharing in international health research: a systematic review of principles and norms

Authors : Shona Kalkman, Menno Mostert, Christoph Gerlinger, Johannes J. M. van Delden, Ghislaine J. M. W. van Thiel

Background

Large-scale linkage of international clinical datasets could lead to unique insights into disease aetiology and facilitate treatment evaluation and drug development.

To this end, multi-stakeholder consortia are currently designing several disease-specific translational research platforms to enable international health data sharing.

Despite the recent adoption of the EU General Data Protection Regulation (GDPR), the procedures for how to govern responsible data sharing in such projects are not at all spelled out yet. In search of a first, basic outline of an ethical governance framework, we set out to explore relevant ethical principles and norms.

Methods

We performed a systematic review of literature and ethical guidelines for principles and norms pertaining to data sharing for international health research.

Results

We observed an abundance of principles and norms with considerable convergence at the aggregate level of four overarching themes: societal benefits and value; distribution of risks, benefits and burdens; respect for individuals and groups; and public trust and engagement.

However, at the level of principles and norms we identified substantial variation in the phrasing and level of detail, the number and content of norms considered necessary to protect a principle, and the contextual approaches in which principles and norms are used.

Conclusions

While providing some helpful leads for further work on a coherent governance framework for data sharing, the current collection of principles and norms prompts important questions about how to streamline terminology regarding de-identification and how to harmonise the identified principles and norms into a coherent governance framework that promotes data sharing while securing public trust.

URL : Responsible data sharing in international health research: a systematic review of principles and norms

DOI : https://doi.org/10.1186/s12910-019-0359-9

Assessing Data Management Support Needs of Bioengineering and Biomedical Research Faculty

Authors : Christie A. Wiley, Margaret H. Burnette

Objectives

This study explores data management knowledge, attitudes, and practices of bioengineering and biomedical researchers in the context of National Institutes of Health-funded research projects. Specifically, this study seeks to answer the following questions:

  1. What is the nature of biomedical and bioengineering research on the Illinois campus and what kinds of data are being generated?
  2. To what degree are biomedical and bioengineering researchers aware of best practices for data management and what are the actual data management behaviors?
  3. What aspects of data management present the greatest challenges and frustrations?
  4. To what degree are biomedical and bioengineering researchers aware of data sharing opportunities and data repositories, and what are their attitudes towards data sharing?
  5. To what degree are researchers aware of campus services and support for data management planning, data sharing, and data deposit, and what is the level of interest in instruction in these areas?

Methods

Librarians on the University of Illinois at Urbana-Champaign campus conducted semi-structured interviews with bioengineering and biomedical researchers to explore researchers’ knowledge of data management best practices, awareness of library campus services, data management behavior, and challenges managing research data.

The topics covered during the interviews were current research projects, data types, format, description, campus repository usage, data sharing, awareness of library campus services, data reuse, the anticipated impact on public health, and challenges (interview questions are provided in the Appendix).

Results

This study revealed that the majority of researchers explore broad research topics, use various file storage solutions, generate large amounts of data, and adhere to differing discipline-specific practices. Researchers varied in their familiarity with DMP Tool.

Roughly half of the researchers interviewed reported having documented protocols for file names, file backup, and file storage. Findings also suggest that there is ambiguity about what it means to share research data and confusion about terminology such as “repository” and “data deposit”. Many researchers equate publication to data sharing.

Conclusions

The interviews reveal significant data literacy gaps that present opportunities for library instruction in the areas of file organization, project workflow and documentation, metadata standards, and data deposit options.

The interviews also provide invaluable insight into biomedical and bioengineering research in general and contribute to the authors’ understanding of the challenges facing the researchers we strive to support.

URL : Assessing Data Management Support Needs of Bioengineering and Biomedical Research Faculty

Alternative location : https://escholarship.umassmed.edu/jeslib/vol8/iss1/1/

Data objects and documenting scientific processes: An analysis of data events in biodiversity data papers

Authors : Kai Li, Jane Greenberg, Jillian Dunic

The data paper, an emerging scholarly genre, describes research datasets and is intended to bridge the gap between the publication of research data and scientific articles. Research examining how data papers report data events, such as data transactions and manipulations, is limited.

The research reported in this paper addresses this limitation by investigating how data events are inscribed in data papers. A content analysis was conducted examining the full texts of 82 data papers, drawn from the curated list of data papers connected to the Global Biodiversity Information Facility (GBIF).

Data events recorded for each paper were organized into a set of 17 categories. Many of these categories are described together in the same sentence, which indicates the messiness of data events in the laboratory space.

The findings challenge the degree to which data papers are a distinct genre compared to research papers, and the degree to which they describe data-centric research processes in a thorough way.

This paper also discusses how our results could inform a better data publication ecosystem in the future.

URL : Data objects and documenting scientific processes: An analysis of data events in biodiversity data papers

Alternative location : https://arxiv.org/abs/1903.06215