Analysis on open data as a foundation for data-driven research

Authors : Honami Numajiri, Takayuki Hayashi

Open Data, one of the key elements of Open Science, serves as a foundation for “data-driven research” and has been promoted in many countries. However, it remains unclear how publicly available Open Data is currently being used in new styles of research, and what impact that use has.

Following a comparative analysis of its coverage against the OpenAIRE Graph, we analyzed the Data Citation Index, a comprehensive collection of research datasets and repositories that includes information on citations from articles. The results reveal that different countries and disciplines tend to show different trends in Open Data.

In recent years, the number of datasets in repositories where researchers publish their data has increased dramatically across disciplines, and researchers are publishing more data. Furthermore, there are some disciplines where data citation rates are not high, but the databases used are diverse.

DOI : https://doi.org/10.1007/s11192-024-04956-x

The Future of Data in Research Publishing: From Nice to Have to Need to Have?

Authors : Christine L. Borgman, Amy Brand

Science policy promotes open access to research data for purposes of transparency and reuse of data in the public interest. We expect demands for open data in scholarly publishing to accelerate, at least partly in response to the opacity of artificial intelligence algorithms.

Open data should be findable, accessible, interoperable, and reusable (FAIR), and also trustworthy and verifiable. The current state of open data in scholarly publishing is in transition from ‘nice to have’ to ‘need to have.’

Research data are valuable, interpretable, and verifiable only in the context of their origin, and with sufficient infrastructure to facilitate reuse. Making research data useful is expensive; benefits and costs are distributed unevenly.

Open data also poses risks for provenance, intellectual property, misuse, and misappropriation in an era of trolls and hallucinating AI algorithms. Scholars and scholarly publishers must make evidentiary data more widely available to promote public trust in research.

To make research processes more trustworthy, transparent, and verifiable, stakeholders need to make greater investments in data stewardship and knowledge infrastructures.

DOI : https://doi.org/10.1162/99608f92.b73aae77

Les reconfigurations des vecteurs de la crédibilité scientifique à l’interface entre les mondes sociaux [Reconfigurations of the vectors of scientific credibility at the interface between social worlds]

Authors : Fabrizio Li Vigni, Séverine Louvel, Benjamin Raimbault

The credibility of scientists is the subject of debates about the risks of discredit that would result from researchers losing their autonomy vis-à-vis economic interests, activist logics, or political agendas.

These situations, in which scientific credibility is put to the test before society and the peer community, raise a more general question: how do researchers engaged in collectives positioned across several social worlds build their credibility among their colleagues?

Do their activities strengthen or weaken the classic vectors of scientific credibility? Concomitantly, are new vectors of credibility emerging?

The articles in this special issue examine contemporary reconfigurations of credibility along four axes of transformation in the sciences: the opening up and banking of data; science–industry relations; interdisciplinarity; and researchers’ public engagement.

In this introductory article, we revisit the history of the notion of scientific credibility in Science & Technology Studies, as proposed by Bruno Latour and Steve Woolgar, then by Steven Shapin and Thomas Gieryn, and the ways it has been taken up since; we then present the five articles in the issue and draw out their cross-cutting contributions.

We emphasize that, far more than the advent of new vectors of scientific credibility, these articles reveal transformations at the margins that are situated and contradictory.

DOI : https://doi.org/10.4000/rac.30365

“We Share All Data with Each Other”: Data-Sharing in Peer-to-Peer Relationships

Author : Eva Barlösius

Although the topic of data-sharing has boomed in the past few years, practices of data-sharing within working groups and scientific cooperation (peer-to-peer data-sharing) have attracted only scant attention.

To understand these practices, the author draws on Max Weber’s concept of social relationship, conceptualizing data-sharing as social action that takes place within a social relationship. The empirical material consists of interviews with 34 researchers representing five disciplines—linguistics, biology, psychology, computer sciences, and neurosciences.

The analysis identifies three social forms of data-sharing in peer-to-peer relationships: (a) closed communal sharing, which is based on a feeling of belonging together; (b) closed associative sharing, in which the participants act on the basis of an agreement; and (c) open associative sharing, which is oriented to “institutional imperatives” (Merton) and to formal regulations.

The study shows that far more data-sharing is occurring in scientific practice than seems to be apparent from a concept of open data alone. If the main goal of open-data policy programs is to encourage researchers to increase access to their data, it could be instructive to study the three forms of data-sharing to improve the understanding of why and how scientists make their data accessible to other researchers.

DOI : https://doi.org/10.1007/s11024-023-09487-y

Publishers, funders and institutions: who is supporting UKRI-funded researchers to share data?

Authors : Beth Montague-Hellen, Kate Montague-Hellen

Researchers are increasingly being asked by funders, publishers and their institutions to share research data alongside written publications, and to include data availability statements to support their readers in finding this data.

In the UK, UKRI (UK Research and Innovation) is one of the largest funding bodies and has had data-sharing policies for several years. This article investigates the reasons why a researcher may or may not share their data and assesses whether funders, publishers and institutions are supporting data-sharing behaviour through their policies and actions.

A survey with 166 responses gave an indicative assessment of researcher opinions around data sharing, and a corpus of 3,277 journal articles retrieved from four UK institutions was analysed using multivariate logistic regression models to provide empirical evidence as to researcher behaviour around data sharing.

The regression models provide insight into how this behaviour is affected by the funder, institution and publisher of the research. This study identifies that those publishers and funders who give clear guidance in their policies as to which data should be shared, and where this data should be shared, are most likely to encourage good practice in researchers.

DOI : https://doi.org/10.1629/uksg.602

Introducing a data availability policy for journals at IOP Publishing: Measuring the impact on authors and editorial teams

Authors : Jade Holt, Andrew Walker, Phill Jones

As the open research movement continues to gather pace, a number of publishers, funders, and institutions are mandating the sharing of underlying research data. At the same time, concerns about introducing extra quality control steps around data availability statements (DAS) are driving a discussion about the best way to make data more open without slowing down publication.

This article describes a pilot project to introduce a new Open Data policy to three IOP Publishing (IOPP) journals as part of IOPP’s commitment to increasing transparency and support for open science.

An investigation was undertaken using an automated workflow monitoring tool to understand the impact of this change on authors and the editorial staff. Changes in revised submission processing times and how often manuscripts were returned to the author were measured.

An overall increase in the time editorial staff spent processing manuscripts was found as well as an increase in the number of times manuscripts were returned to authors. Detailed analysis shows that manuscripts in which authors claim in the DAS to have included data within the manuscript were the most strongly affected. Steps to mitigate the effects through improved author communication were found to be effective.

DOI : https://doi.org/10.1002/leap.1386

Open Data Policies among Library and Information Science Journals

Author : Brian Jackson

Journal publishers play an important role in the open research data ecosystem. Through open data policies that include public data archiving mandates and data availability statements, journal publishers help promote transparency in research and wider access to a growing scholarly record.

The library and information science (LIS) discipline has a unique relationship with both open data initiatives and academic publishing and may be well-positioned to adopt rigorous open data policies.

This study examines the information provided on public-facing websites of LIS journals in order to describe the extent, and nature, of open data guidance provided to prospective authors.

Open access journals in the discipline have disproportionately adopted detailed, strict open data policies. Commercial publishers, which account for the largest share of publishing in the discipline, have largely adopted weaker policies. Rigorous policies, adopted by a minority of journals, describe the rationale, application, and expectations for open research data, while most journals that provide guidance on the matter use hesitant and vague language. Recommendations are provided for strengthening journal open data policies.

DOI : https://doi.org/10.3390/publications9020025