Data trajectories: tracking reuse of published data for transitive credit attribution

Author : Paolo Missier

The ability to measure the use and impact of published data sets is key to the success of the open data/open science paradigm. A direct measure of impact would require tracking data (re)use in the wild, which is difficult to achieve.

This is therefore commonly replaced by simpler metrics based on data download and citation counts. In this paper we describe a scenario where it is possible to track the trajectory of a dataset after its publication, and show how this enables the design of accurate models for ascribing credit to data originators.

A Data Trajectory (DT) is a graph that encodes knowledge of how, by whom, and in which context data has been re-used, possibly after several generations. We provide a theoretical model of DTs that is grounded in the W3C PROV data model for provenance, and we show how DTs can be used to automatically propagate a fraction of the credit associated with transitively derived datasets, back to original data contributors.

We also show this model of transitive credit in action by means of a Data Reuse Simulator. In the longer term, our ultimate hope is that credit models based on direct measures of data reuse will provide further incentives to data publication.

We conclude by outlining a research agenda to address the hard questions of creating, collecting, and using DTs systematically across a large number of data reuse instances in the wild.

URL : http://dx.doi.org/10.2218/ijdc.v11i1.425

Recognizing the Diversity of Contributions: A Case Study for Framing Attribution and Acknowledgement for Scientific Data

Authors : Chung-Yi Hou, Matthew Mayernik

As scientific data volumes, format types, and sources increase rapidly with the invention and improvement of scientific capabilities, the resulting datasets are becoming more complex to manage as well.

One of the significant management challenges is pulling apart the individual contributions of specific people and organizations within large, complex projects.

This is important for two reasons: 1) assigning responsibility and accountability for scientific work, and 2) giving professional credit (e.g. for hiring, promotion, and tenure) to individuals who work within such large projects.

This paper aims to review the extant practice of data attribution and how it may be improved. Through a case study of creating a detailed attribution record for a climate model dataset, the paper evaluates the strengths and weaknesses of the current data attribution method and proposes an alternative attribution framework accordingly.

The paper concludes by demonstrating that, analogous to acknowledging the different roles and responsibilities shown in movie credits, the methodology developed in the study could be used in general to identify and map out the relationships among the organizations and individuals who had contributed to a dataset.

As a result, the framework could be applied to create data attribution for other dataset types beyond climate model datasets.

DOI : http://dx.doi.org/10.2218/ijdc.v11i1.357

Factors Influencing Research Data Reuse in the Social Sciences : An Exploratory Study

Author : Renata Gonçalves Curty

The development of e-Research infrastructure has enabled data to be shared and accessed more openly. Policy mandates for data sharing have contributed to the increasing availability of research data through data repositories, which create favourable conditions for the re-use of data for purposes not always anticipated by original collectors.

Despite the current efforts to promote transparency and reproducibility in science, data re-use cannot be assumed, nor merely considered a ‘thrifting’ activity where scientists shop around in data repositories considering only the ease of access to data.

The lack of an integrated view of individual, social and technological influential factors to intentional and actual data re-use behaviour was the key motivator for this study. Interviews with 13 social scientists produced 25 factors that were found to influence their perceptions and experiences, including both their unsuccessful and successful attempts to re-use data.

These factors were grouped into six theoretical variables: perceived benefits, perceived risks, perceived effort, social influence, facilitating conditions, and perceived re-usability.

These research findings provide an in-depth understanding about the re-use of research data in the context of open science, which can be valuable in terms of theory and practice to help leverage data re-use and make publicly available data more actionable.

DOI : http://dx.doi.org/10.2218/ijdc.v11i1.401

Impact de l’Open Access sur les citations : une étude de cas (The Impact of Open Access on Citations: A Case Study)

Authors : Frédérique Bordignon, Mathieu Andro

Numerous studies in the international literature have sought to assess the impact of Open Access on the citation rate of scientific articles. The present study, written in French, is limited to the 2010 publications of the Ecole des Ponts.

It nevertheless offers a French-speaking professional readership a review of the earlier studies on the subject, and its original contribution is to measure the mean number of citations per month before and after the Open Access "release" of articles, thereby avoiding most of the biases that can arise in this kind of approach.

In addition to confirming, as many others have before, a clear Open Access advantage in citation rate in computer science, earth and universe sciences, engineering, environmental sciences, mathematics, physics and astronomy, it also shows that an early "release" can have a more favourable impact than a late one in certain disciplines, such as mathematics and physics/astronomy.

URL : Impact de l’Open Access sur les citations : une étude de cas

The Authorship Dilemma: Alphabetical or Contribution?

Authors : Margareta Ackerman, Simina Brânzei

Scientific communities have adopted different conventions for ordering authors on publications.

Are these choices inconsequential, or do they have significant influence on individual authors, the quality of the projects completed, and research communities at large? What are the trade-offs of using one convention over another?

In order to investigate these questions, we formulate a basic two-player game theoretic model, which already illustrates interesting phenomena that can occur in more realistic settings.

We find that alphabetical ordering can improve research quality, while contribution-based ordering leads to a denser collaboration network and a greater number of publications.

Contrary to the assumption that free riding is a weakness of the alphabetical ordering scheme, this phenomenon can occur under any contribution scheme, and the worst case occurs under contribution-based ordering.

Finally, we show how authors working on multiple projects can cooperate to attain optimal research quality and eliminate free riding given either contribution scheme.

URL : https://arxiv.org/abs/1208.3391

Marxism and Open Access in the Humanities: Turning Academic Labor against Itself

Author : David Golumbia

Open Access (OA) is the movement to make academic research available without charge, typically via digital networks. Like many cyberlibertarian causes, OA is roundly celebrated by advocates from across the political spectrum.

Yet like many of those causes, OA’s lack of clear grounding in an identifiable political framework means that it may well not only fail to serve the political goals of some of its supporters, but may in fact work against them.

In particular, OA is difficult to reconcile with Marxist accounts of labor, and on its face appears not to advance but to actively militate against achievement of Marxist goals for the emancipation of labor. In part this stems from a widespread misunderstanding of Marx’s own attitude toward intellectual work, which to Marx was not categorically different from other forms of labor, though it was in danger of becoming so precisely through the denial of the value of the end products of intellectual work.

This dynamic is particularly visible in the humanities, where OA advocacy routinely includes disparagement of academic labor, and of the value produced by that labor.

URL : Marxism and Open Access in the Humanities: Turning Academic Labor against Itself

Alternative location : http://ices.library.ubc.ca/index.php/workplace/article/view/186213

Measuring Scientific Impact Beyond Citation Counts

Authors : Robert M. Patton, Christopher G. Stahl, Jack C. Wells

The measurement of scientific progress remains a significant challenge, exacerbated by the use of multiple different types of metrics that are often incorrectly used, overused, or even explicitly abused.

Several metrics such as h-index or journal impact factor (JIF) are often used as a means to assess whether an author, article, or journal creates an "impact" on science. Unfortunately, external forces can be used to manipulate these metrics, thereby diluting the value of their intended, original purpose.

This work highlights these issues and the need to define "impact" more clearly, and emphasizes the need for better metrics that leverage full content analysis of publications.

URL : http://www.dlib.org/dlib/september16/patton/09patton.html