The rise of the middle author: Investigating collaboration and division of labor in biomedical research using partial alphabetical authorship

Authors : Philippe Mongeon, Elise Smith, Bruno Joyal, Vincent Larivière

Contemporary biomedical research is performed by increasingly large teams. Consequently, an increasingly large number of individuals are being listed as authors in the bylines, which complicates the proper attribution of credit and responsibility to individual authors.

Typically, more importance is given to the first and last authors, while it is assumed that the others (the middle authors) have made smaller contributions. However, this may not properly reflect the actual division of labor because some authors other than the first and last may have made major contributions.

In practice, research teams may differentiate the main contributors from the rest by using partial alphabetical authorship (i.e., by listing middle authors alphabetically, while maintaining a contribution-based order for more substantial contributions). In this paper, we use partial alphabetical authorship to divide the authors of all biomedical articles in the Web of Science published over the 1980–2015 period into three groups: primary authors, middle authors, and supervisory authors.

We operationalize the concept of middle author as those who are listed in alphabetical order in the middle of an authors’ list. Primary and supervisory authors are those listed before and after the alphabetical sequence, respectively.
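The split described above can be sketched in a few lines of Python. This is only an illustration, not the authors' actual procedure: the function name `split_byline`, the `min_run` threshold of three names, the choice of the longest contiguous alphabetical run (earliest one on ties), and the fallback of treating every author as primary when no such run exists are all assumptions made for the example.

```python
def split_byline(surnames, min_run=3):
    """Split a byline into (primary, middle, supervisory) authors.

    Assumption: the "alphabetical sequence" is the longest contiguous run
    of surnames in non-decreasing order, at least `min_run` names long.
    """
    keys = [s.casefold() for s in surnames]  # case-insensitive comparison
    best_start, best_len = 0, 0
    start = 0
    for i in range(1, len(keys) + 1):
        # A run ends at the end of the list or where alphabetical order breaks.
        if i == len(keys) or keys[i] < keys[i - 1]:
            if i - start > best_len:
                best_start, best_len = start, i - start
            start = i
    if best_len < min_run:
        # Assumption: with no alphabetical block, nobody is a middle author.
        return list(surnames), [], []
    end = best_start + best_len
    return surnames[:best_start], surnames[best_start:end], surnames[end:]


# Hypothetical byline, invented for illustration.
byline = ["Smith", "Zhang", "Adams", "Baker", "Chen", "Diaz", "Clark", "Brown"]
primary, middle, supervisory = split_byline(byline)
print(primary, middle, supervisory)
```

One limitation of this simple rule: a primary or supervisory author whose surname happens to continue the alphabetical order is absorbed into the middle group, so real bylines would need a more careful treatment.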

We show that alphabetical ordering of middle authors is frequent in biomedical research, and that the prevalence of this practice is positively correlated with the number of authors in the bylines. We also find that, for articles with 7 or more authors, the average proportion of primary, middle and supervisory authors is independent of the team size, more than half of the authors being middle authors.

This suggests that the growth in author lists is not due to an increase in secondary contributions (or middle authors) but, rather, to equivalent increases in all types of roles and contributions (including many primary authors and many supervisory authors).

Nevertheless, we show that the relative contribution of alphabetically ordered middle authors to the overall production of knowledge in the biomedical field has greatly increased over the last 35 years.

URL : The rise of the middle author: Investigating collaboration and division of labor in biomedical research using partial alphabetical authorship




Identifiers for the 21st century: How to design, provision, and reuse persistent identifiers to maximize utility and impact of life science data

Authors : Julie A. McMurry, Nick Juty, Niklas Blomberg, Tony Burdett, Tom Conlin, Nathalie Conte, Mélanie Courtot, John Deck, Michel Dumontier, Donal K. Fellows, Alejandra Gonzalez-Beltran, Philipp Gormanns, Jeffrey Grethe, Janna Hastings, Jean-Karim Hériché, Henning Hermjakob, Jon C. Ison, Rafael C. Jimenez, Simon Jupp, John Kunze, Camille Laibe, Nicolas Le Novère, James Malone, Maria Jesus Martin, Johanna R. McEntyre, Chris Morris, Juha Muilu, Wolfgang Müller, Philippe Rocca-Serra, Susanna-Assunta Sansone, Murat Sariyar, Jacky L. Snoep, Stian Soiland-Reyes, Natalie J. Stanford, Neil Swainston, Nicole Washington, Alan R. Williams, Sarala M. Wimalaratne, Lilly M. Winfree, Katherine Wolstencroft, Carole Goble, Christopher J. Mungall, Melissa A. Haendel, Helen Parkinson

In many disciplines, data are highly decentralized across thousands of online databases (repositories, registries, and knowledgebases). Wringing value from such databases depends on the discipline of data science and on the humble bricks and mortar that make integration possible; identifiers are a core component of this integration infrastructure.

Drawing on our experience and on work by other groups, we outline 10 lessons we have learned about the identifier qualities and best practices that facilitate large-scale data integration. Specifically, we propose actions that identifier practitioners (database providers) should take in the design, provision and reuse of identifiers.

We also outline the important considerations for those referencing identifiers in various circumstances, including by authors and data generators. While the importance and relevance of each lesson will vary by context, there is a need for increased awareness about how to avoid and manage common identifier problems, especially those related to persistence and web-accessibility/resolvability.

We focus strongly on web-based identifiers in the life sciences; however, the principles are broadly relevant to other disciplines.

URL : Identifiers for the 21st century: How to design, provision, and reuse persistent identifiers to maximize utility and impact of life science data


Medical Theses and Derivative Articles: Dissemination Of Contents and Publication Patterns

Authors : Mercedes Echeverria, David Stuart, Tobias Blanke

Doctoral theses are an important source of publications in universities, although little research has been carried out on the publications resulting from theses, the so-called derivative articles.

This study investigates how derivative articles can be identified through a text analysis based on the full-text of a set of medical theses and the full-text of articles, with which they shared authorship.

The text similarity analysis consisted of comparing the full-text articles section by section, following the organization of scientific discourse (IMRaD), using the TurnItIn plagiarism tool.

The study found that the text similarity rate in the Discussion section can be used to discriminate derivative articles from non-derivative articles.

Additional findings were: the thesis’s author was listed first in 85% of derivative articles; supervisors participated as coauthors in 100% of derivative articles; the authorship credit retained by the thesis’s author was 42% in derivative articles; the average number of coauthors per article was 5 in derivative articles versus 6.4 in non-derivative articles; and 87.5% of derivative articles were published before or in the same year as thesis completion.


Global Data Quality Assessment and the Situated Nature of “Best” Research Practices in Biology

Author : Sabina Leonelli

This paper reflects on the relation between international debates around data quality assessment and the diversity characterising research practices, goals and environments within the life sciences.

Since the emergence of molecular approaches, many biologists have focused their research, and related methods and instruments for data production, on the study of genes and genomes.

While this trend is now shifting, prominent institutions and companies with stakes in molecular biology continue to set standards for what counts as ‘good science’ worldwide, resulting in the use of specific data production technologies as proxy for assessing data quality.

This is problematic considering (1) the variability in research cultures, goals and the very characteristics of biological systems, which can give rise to countless different approaches to knowledge production; and (2) the existence of research environments that produce high-quality, significant datasets despite not availing themselves of the latest technologies.

Ethnographic research carried out in such environments evidences a widespread fear among researchers that providing extensive information about their experimental set-up will affect the perceived quality of their data, making their findings vulnerable to criticisms by better-resourced peers.

These fears can make scientists resistant to sharing data or describing their provenance. To counter this, debates around Open Data need to include critical reflection on how data quality is evaluated, and the extent to which that evaluation requires a localised assessment of the needs, means and goals of each research environment.

URL : Global Data Quality Assessment and the Situated Nature of “Best” Research Practices in Biology


Reproducibility2020: Progress and priorities

Authors : Leonard P. Freedman, Gautham Venugopalan, Rosann Wisman

The preclinical research process is a cycle of idea generation, experimentation, and reporting of results. The biomedical research community relies on the reproducibility of published discoveries to create new lines of research and to translate research findings into therapeutic applications.

Since 2012, when scientists from Amgen reported that they were able to reproduce only 6 of 53 “landmark” preclinical studies, the biomedical research community has been discussing the scale of the reproducibility problem and developing initiatives to address critical challenges.

Global Biological Standards Institute (GBSI) released the “Case for Standards” in 2013, one of the first comprehensive reports to address the rising concern of irreproducible biomedical research.

Further attention was drawn to issues that limit scientific self-correction, including reporting and publication bias, underpowered studies, lack of open access to methods and data, and lack of clearly defined standards and guidelines in areas such as reagent validation.

To evaluate the progress made towards reproducibility since 2013, GBSI identified and examined initiatives designed to advance quality and reproducibility. Through this process, we identified key roles for funders, journals, researchers and other stakeholders and recommended actions for future progress. This paper describes our findings and conclusions.

URL : Reproducibility2020: Progress and priorities


A Bibliometric study of Directory of Open Access Journals: Special reference to Microbiology

Author : K S Savita

The present study aims to determine the number of free e-journals in the field of Microbiology available on DOAJ.

For this study, the author adopted a bibliometric method and analyzed the journals by country-wise, language-wise and subject-heading-wise distribution.

URL : A Bibliometric study of Directory of Open Access Journals: Special reference to Microbiology


What incentives increase data sharing in health and medical research? A systematic review

Authors : Anisa Rowhani-Farid, Michelle Allen, Adrian G. Barnett


The foundation of health and medical research is data. Data sharing facilitates the progress of research and strengthens science. Data sharing in research is widely discussed in the literature; however, there are seemingly no evidence-based incentives that promote data sharing.


A systematic review (registration: of the health and medical research literature was used to uncover any evidence-based incentives, with pre- and post-empirical data that examined data sharing rates.

We were also interested in quantifying and classifying the number of opinion pieces on the importance of incentives, the number of observational studies that analysed data sharing rates and practices, and strategies aimed at increasing data sharing rates.


Only one incentive (using open data badges) has been tested in health and medical research that examined data sharing rates. The number of opinion pieces (n = 85) outweighed the number of articles testing strategies (n = 76), and the number of observational studies exceeded them both (n = 106).


Given that data is the foundation of evidence-based health and medical research, it is paradoxical that there is only one evidence-based incentive to promote data sharing. More well-designed studies are needed in order to increase the currently low rates of data sharing.

URL : What incentives increase data sharing in health and medical research? A systematic review
