GitHub Repositories with Links to Academic Papers: Open Access, Traceability, and Evolution

Authors : Supatsara Wattanakriengkrai, Bodin Chinthanet, Hideaki Hata, Raula Gaikovina Kula, Christoph Treude, Jin Guo, Kenichi Matsumoto

Traceability between published scientific breakthroughs and their implementation is essential, especially when Open Source Software implements bleeding-edge science in its code. However, aligning the link between GitHub repositories and academic papers can prove difficult, and the impact of such links remains unknown.

This paper investigates the role of academic paper references contained in these repositories. We conducted a large-scale study of 20 thousand GitHub repositories to establish the prevalence of references to academic papers. We use a mixed-methods approach to identify Open Access (OA), traceability and evolutionary aspects of the links.
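As a rough illustration of what detecting references to academic papers in repository text can involve, the sketch below scans a README-like string for arXiv and DOI links. The patterns, the function name `find_paper_links`, and the sample text are assumptions made for this example; they are not the heuristics used in the study.

```python
import re

# Hypothetical patterns, for illustration only (not the study's method).
# arXiv identifiers in the post-2007 scheme, e.g. 2004.00199 or 1707.02264v2.
ARXIV_RE = re.compile(r"arxiv\.org/(?:abs|pdf)/(\d{4}\.\d{4,5})(?:v\d+)?", re.IGNORECASE)
# DOIs written as doi.org links; stop at whitespace or closing brackets.
DOI_RE = re.compile(r"doi\.org/(10\.\d{4,9}/[^\s)\]>]+)", re.IGNORECASE)


def find_paper_links(text):
    """Return sorted, de-duplicated arXiv identifiers and DOIs found in text."""
    arxiv_ids = sorted(set(ARXIV_RE.findall(text)))
    # Strip punctuation that often trails a link at the end of a sentence.
    dois = sorted({d.rstrip(".,;") for d in DOI_RE.findall(text)})
    return {"arxiv": arxiv_ids, "doi": dois}


# Made-up README text that reuses links appearing elsewhere in this post.
readme = (
    "Implementation of the method in https://arxiv.org/abs/2004.00199.\n"
    "See also [the preprint](https://doi.org/10.31235/osf.io/2kxq8) "
    "and https://arxiv.org/pdf/1707.02264v2.pdf"
)
links = find_paper_links(readme)
print(links)
```

A real pipeline would also need to handle older arXiv identifiers, publisher landing pages, and BibTeX entries, which this sketch ignores.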

Although referencing a paper is not typical, we find that a vast majority of referenced academic papers are OA. In terms of traceability, our analysis revealed that machine learning is the most prevalent topic of repositories. These repositories tend to be affiliated with academic communities. More than half of the papers do not link back to any repository.

A case study of referenced arXiv papers shows that most of these papers are high-impact and influential and do align with academia, and that they are referenced by repositories written in different programming languages. From the evolutionary aspect, we find very few changes to the papers being referenced or to the links to them.

URL : https://arxiv.org/abs/2004.00199

A tale of two ‘opens’: intersections between Free and Open Source Software and Open Scholarship

Authors : Jonathan Tennant, Ritwik Agarwal, Ksenija Baždarić, David Brassard, Tom Crick, Daniel Dunleavy, Thomas Evans, Nicholas Gardner, Monica Gonzalez-Marquez, Daniel Graziotin, Bastian Greshake Tzovaras, Daniel Gunnarsson, Johanna Havemann, Mohammad Hosseini, Daniel Katz, Marcel Knöchelmann, Christopher Madan, Paolo Manghi, Alberto Marocchino, Paola Masuzzo, Peter Murray-Rust, Sanjay Narayanaswamy, Gustav Nilsonne, Josmel Pacheco-Mendoza, Bart Penders, Olivier Pourret, Michael Rera, John Samuel, Tobias Steiner, Jadranka Stojanovski, Alejandro Uribe-Tirado, Rutger Vos, Simon Worthington, Tal Yarkoni

There is no clear-cut boundary between Free and Open Source Software and Open Scholarship, and the histories, practices, and fundamental principles between the two remain complex.

In this study, we critically appraise the intersections and differences between the two movements. Based on our thematic comparison here, we conclude several key things.

First, there is substantial scope for new communities of practice to form within scholarly communities that place sharing and collaboration/open participation at their focus.

Second, both the principles and practices of FOSS can be more deeply ingrained within scholarship, asserting a balance between pragmatism and social ideology.

Third, at present, Open Scholarship risks being subverted and compromised by commercial players.

Fourth, the shift and acceleration towards a system of Open Scholarship will be greatly enhanced by a concurrent shift in recognising a broader range of practices and outputs beyond traditional peer review and research articles.

In order to achieve this, we propose the formulation of a new type of institutional mandate. We believe that there is substantial need for research funders to invest in sustainable open scholarly infrastructure, and the communities that support them, to avoid the capture and enclosure of key research services that would prevent optimal researcher behaviours.

Such a shift could ultimately lead to a healthier scientific culture, and a system where competition is replaced by collaboration, resources (including time and people) are shared and acknowledged more efficiently, and the research becomes inherently more rigorous, verified, and reproducible.

DOI : https://doi.org/10.31235/osf.io/2kxq8

Online division of labour: emergent structures in Open Source Software

Authors : María J. Palazzi, Jordi Cabot, Javier Luis Cánovas Izquierdo, Albert Solé-Ribalta, Javier Borge-Holthoefer

The development of Open Source Software fundamentally depends on the participation and commitment of volunteer developers to progress. Several works have presented strategies to increase the on-boarding and engagement of new contributors, but little is known about how these diverse groups of developers self-organise to work together.

To understand this, one must consider that, on one hand, platforms like GitHub provide a virtually unlimited development framework: any number of actors can potentially join to contribute in a decentralised, distributed, remote, and asynchronous manner.

On the other, however, it seems reasonable that some sort of hierarchy and division of labour must be in place to meet human biological and cognitive limits, and also to achieve some level of efficiency.

These latter features (hierarchy and division of labour) should translate into recognisable structural arrangements when projects are represented as developer-file bipartite networks.

In this paper we analyse a set of popular open source projects from GitHub, placing the accent on three key properties: nestedness, modularity and in-block nestedness, which typify the emergence of heterogeneities among contributors, the emergence of subgroups of developers working on specific subgroups of files, and a mixture of the two previous, respectively.
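To make the developer-file bipartite representation more concrete, the sketch below builds a small binary matrix from made-up (developer, file) edit records and computes NODF, a commonly used nestedness index. NODF is swapped in here for illustration and is not necessarily the nestedness measure used by the authors; the names `build_matrix` and `nodf` and the sample data are assumptions.

```python
def build_matrix(edits):
    """Build a binary developer x file matrix from (developer, file) pairs."""
    devs = sorted({d for d, _ in edits})
    files = sorted({f for _, f in edits})
    touched = set(edits)
    matrix = [[1 if (d, f) in touched else 0 for f in files] for d in devs]
    return devs, files, matrix


def _paired_overlap(rows):
    """Sum the NODF paired overlap over all row pairs; also return the pair count."""
    degrees = [sum(r) for r in rows]
    n = len(rows)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            # Pairs with equal degree contribute nothing under NODF.
            if degrees[i] == degrees[j]:
                continue
            hi, lo = (i, j) if degrees[i] > degrees[j] else (j, i)
            if degrees[lo] == 0:
                continue
            # Fraction of the sparser row's links shared with the denser row.
            shared = sum(a & b for a, b in zip(rows[hi], rows[lo]))
            total += shared / degrees[lo]
    return total, n * (n - 1) / 2


def nodf(matrix):
    """NODF nestedness on a 0-100 scale, over developer pairs and file pairs."""
    cols = [list(c) for c in zip(*matrix)]
    row_total, row_pairs = _paired_overlap(matrix)
    col_total, col_pairs = _paired_overlap(cols)
    pairs = row_pairs + col_pairs
    return 100.0 * (row_total + col_total) / pairs if pairs else 0.0


# Hypothetical edit records: each less active developer touches a subset
# of the files touched by the more active ones (a perfectly nested pattern).
edits = [
    ("ana", "core.py"), ("ana", "io.py"), ("ana", "docs.md"),
    ("ben", "core.py"), ("ben", "io.py"),
    ("cy", "core.py"),
]
devs, files, nested = build_matrix(edits)
print(nodf(nested))

# Two disjoint developer groups working on disjoint files (a modular pattern).
modular = [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 1], [0, 0, 1, 1]]
print(nodf(modular))
```

The nested example scores at the top of the scale and the purely modular one at the bottom, which is the contrast between heterogeneity among contributors and separated subgroups that the abstract describes.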

These analyses show that indeed projects evolve into internally organised blocks. Furthermore, the distribution of sizes of such blocks is bounded, connecting our results to the celebrated Dunbar number both in off- and on-line environments.

Our analyses create a link between bio-cognitive constraints, group formation and online working environments, opening up a rich scenario for future research on (online) work team assembly.

URL : https://arxiv.org/abs/1903.03375

Lessons Learned in Partnerships and Practice: Adopting Open Source Institutional Repository Software

Author : Amy Leigh Allen

INTRODUCTION

After the establishment of the University Archives at the University of Arkansas, Fayetteville, it became apparent that processes needed to be established for collecting, preserving, and providing access to born-digital materials.

The University Archivist established partnerships across multiple departments within the Libraries and with faculty and staff of colleges, schools, and administrative units across campus to test open source repository software and develop collections to fulfill this need.

DESCRIPTION OF PROGRAM

This case study examines three specific projects and workflows providing access to digital undergraduate honors theses, university serials, and music concert recordings. Lessons learned during the project include the success strategies for partnership formation along with the identification of project processes that need improvement, such as promotion and long term preservation.

NEXT STEPS AND CONCLUSIONS 

The campus has transitioned to a proprietary system for the official institutional repository. However, the pilot projects examined in this study filled intermediate needs: providing a group of files and metadata for the official institutional repository and helping the Libraries to evaluate the sustainability of open source platforms.

Staff gained experience and identified areas where improvement was needed. However, the most successful aspect of the project was establishing partnerships that will carry over to the new repository.

DOI : http://doi.org/10.7710/2162-3309.2170

Journal of Open Source Software (JOSS): design and first-year review

Authors : Arfon M Smith, Kyle E Niemeyer, Daniel S Katz, Lorena A Barba, George Githinji, Melissa Gymrek, Kathryn D Huff, Christopher R Madan, Abigail Cabunoc Mayes, Kevin M Moerman, Pjotr Prins, Karthik Ram, Ariel Rokem, Tracy K Teal, Roman Valls Guimera, Jacob T Vanderplas

This article describes the motivation, design, and progress of the Journal of Open Source Software (JOSS). JOSS is a free and open-access journal that publishes articles describing research software. It has the dual goals of improving the quality of the software submitted and providing a mechanism for research software developers to receive credit.

While designed to work within the current merit system of science, JOSS addresses the dearth of rewards for key contributions to science made in the form of software. JOSS publishes articles that encapsulate scholarship contained in the software itself, and its rigorous peer review targets the software components: functionality, documentation, tests, continuous integration, and the license.

A JOSS article contains an abstract describing the purpose and functionality of the software, references, and a link to the software archive. The article is the entry point of a JOSS submission, which encompasses the full set of software artifacts.

Submission and review proceed in the open, on GitHub. Editors, reviewers, and authors work collaboratively and openly. Unlike other journals, JOSS does not reject articles requiring major revision; while not yet accepted, articles remain visible and under review until the authors make adequate changes (or withdraw, if unable to meet requirements).

Once an article is accepted, JOSS gives it a DOI, deposits its metadata in Crossref, and the article can begin collecting citations on indexers like Google Scholar and other services. Authors retain copyright of their JOSS article, releasing it under a Creative Commons Attribution 4.0 International License.

In its first year, starting in May 2016, JOSS published 111 articles, with more than 40 additional articles currently under review. JOSS is a sponsored project of the nonprofit organization NumFOCUS and is an affiliate of the Open Source Initiative.

URL : https://arxiv.org/abs/1707.02264

Le contrôle des communs numériques à des fins commerciales : le cas des logiciels libres (The control of digital commons for commercial purposes: the case of free software)

Author : Stéphane Couture

This article examines the forms of control that commercial companies exert over the commons by studying the case of free software. Free software is software whose source code is freely accessible and can be modified and shared.

This ethic of sharing has enabled the emergence of a collaborative model often presented as the archetypal example of the "digital commons". However, more and more companies now participate in the development of free software.

While several analysts view this commercial contribution favourably, others highlight the forms of control that these companies put in place to profit from the free software commons.

By reviewing various studies on these questions and analysing more closely the cases of Symfony and Redhat, two free software projects heavily developed by commercial companies, this article examines these forms of control over the digital commons and draws out their ethical consequences.

URL : https://ethiquepublique.revues.org/2275

The Importance of Free and Open Source Software and Open Standards in Modern Scientific Publishing

“In this paper we outline the reasons why we believe a reliance on the use of proprietary computer software and proprietary file formats in scientific publication have negative implications for the conduct and reporting of science. There is increasing awareness and interest in the scientific community about the benefits offered by free and open source software. We discuss the present state of scientific publishing and the merits of advocating for a wider adoption of open standards in science, particularly where it concerns the publishing process.”

URL : http://www.mdpi.com/2304-6775/1/2/49