To share or not to share? Image data sharing in the social sciences and humanities

Authors : Elina Late, Mette Skov, Sanna Kumpulainen

Introduction

The paper aims to investigate image data sharing within the social sciences and humanities. While data sharing is encouraged as a part of the open science movement, little is known about the approaches and factors influencing the sharing of image data.

This information is essential, as the use of image data in these fields of research is increasing and data sharing is context dependent.

Method

The study analyses qualitative semi-structured interviews with 14 scholars who incorporate digital images as a core component of their research data.

Analysis

Content analysis is conducted to gather information about scholars’ image data sharing and motivating and impeding factors related to it.

Results

The findings show that image data sharing is not an established research practice, and when it happens it is mostly done informally, through personal contacts. Scholars are motivated to share their data by a wish to support the scientific community and the open science agenda, and by the need to fulfil research funders’ requirements. Impeding factors relate to the qualities of data, ownership of data, data stewardship, and research integrity.

Conclusion

Advancing image data sharing requires developing research infrastructures and providing support and guidelines. A better understanding of scholars’ image data practices is also needed.

URL : To share or not to share? Image data sharing in the social sciences and humanities

DOI : https://doi.org/10.47989/ir292834

Research Data Management in the Croatian Academic Community: A Research Study

Author : Radovan Vrana

This paper presents the results of an empirical research study of Croatian scientists’ use and management of research data. The study was carried out from 28 June 2023 until 31 August 2023 using an online questionnaire consisting of 28 questions. The answers of 584 respondents working in science were selected for further analysis. About three-quarters of the respondents used the research data of other scientists successfully. Research data were mostly acquired from colleagues from the same department or institution.

Roughly half of the respondents did not ask other scientists directly for their research data. Research data are important to the respondents mostly for raising the quality of research. Repeating someone else’s research by using their research data is still a problem. Fewer than one-third of the respondents provided full access to their research data, mostly because they feared misuse.

The benefits of research data sharing were recognized, but few of the respondents received any reward for it. Archiving research data is a significant problem for the respondents, as they predominantly use their own computers, which are prone to failure, for that activity and do not think about long-term preservation. Finally, the respondents lacked deeper knowledge of research data management.

URL : Research Data Management in the Croatian Academic Community: A Research Study

DOI : https://doi.org/10.3390/publications12020016

An analysis of the effects of sharing research data, code, and preprints on citations

Authors : Giovanni Colavizza, Lauren Cadwallader, Marcel LaFlamme, Grégory Dozot, Stéphane Lecorney, Daniel Rappo, Iain Hrynaszkiewicz

Calls to make scientific research more open have gained traction with a range of societal stakeholders. Open Science practices include but are not limited to the early sharing of results via preprints and openly sharing outputs such as data and code to make research more reproducible and extensible. Existing evidence shows that adopting Open Science practices has effects in several domains.

In this study, we investigate whether adopting one or more Open Science practices leads to significantly higher citations for an associated publication, which is one form of academic impact. We use a novel dataset known as Open Science Indicators, produced by PLOS and DataSeer, which includes all PLOS publications from 2018 to 2023 as well as a comparison group sampled from the PMC Open Access Subset. In total, we analyze approximately 122,000 publications. We calculate publication and author-level citation indicators and use a broad set of control variables to isolate the effect of Open Science Indicators on received citations.

We show that Open Science practices are adopted to different degrees across scientific disciplines. We find that the early release of a publication as a preprint correlates with a significant positive citation advantage of about 20.2% on average. We also find that sharing data in an online repository correlates with a smaller yet still positive citation advantage of 4.3% on average.

However, we do not find a significant citation advantage for sharing code. Further research is needed on additional or alternative measures of impact beyond citations. Our results are likely to be of interest to researchers, as well as publishers, research funders, and policymakers.

arXiv : https://arxiv.org/abs/2404.16171

From Data Creator to Data Reuser: Distance Matters

Authors : Christine L. Borgman, Paul T. Groth

Sharing research data is complex, labor-intensive, expensive, and requires infrastructure investments by multiple stakeholders. Open science policies focus on data release rather than on data reuse, yet reuse is also difficult, expensive, and may never occur. Investments in data management could be made more wisely by considering who might reuse data, how, why, for what purposes, and when.

Data creators cannot anticipate all possible reuses or reusers; our goal is to identify factors that may aid stakeholders in deciding how to invest in research data, how to identify potential reuses and reusers, and how to improve data exchange processes.

Drawing upon empirical studies of data sharing and reuse, we develop the theoretical construct of distance between data creator and data reuser, identifying six distance dimensions that influence the ability to transfer knowledge effectively: domain, methods, collaboration, curation, purposes, and time and temporality.

These dimensions are primarily social in character, with associated technical aspects that can decrease – or increase – distances between creators and reusers. We identify the order of expected influence on data reuse and ways in which the six dimensions are interdependent.

Our theoretical framing of the distance between data creators and prospective reusers leads to recommendations to four categories of stakeholders on how to make data sharing and reuse more effective: data creators, data reusers, data archivists, and funding agencies.

URL : From Data Creator to Data Reuser: Distance Matters

arXiv : https://arxiv.org/abs/2402.07926

Building a Trustworthy Data Repository: CoreTrustSeal Certification as a Lens for Service Improvements

Authors : Cara Key, Clara Llebot, Michael Boock

Objective

The university library aims to provide university researchers with a trustworthy institutional repository for sharing data. The library sought CoreTrustSeal certification in order to measure the quality of data services in the institutional repository, and to promote researchers’ confidence when depositing their work.

Methods

The authors served on a small team of library staff who collaborated to compose the certification application. They describe the self-assessment process, as they iterated through cycles of compiling information and responding to reviewer feedback.

Results

The application team gained understanding of data repository best practices, shared knowledge about the institutional repository, and identified areas of service improvements necessary to meet certification requirements. Based on the application and feedback, the team took measures to enhance preservation strategies, governance, and public-facing policies and documentation for the repository.

Conclusions

The university library gained a better understanding of top-notch data services and measurably improved these services by pursuing and obtaining CoreTrustSeal certification.

URL : Building a Trustworthy Data Repository: CoreTrustSeal Certification as a Lens for Service Improvements

DOI : https://doi.org/10.7191/jeslib.761

Establishing an early indicator for data sharing and reuse

Authors : Agata Piękniewska, Laurel L. Haak, Darla Henderson, Katherine McNeill, Anita Bandrowski, Yvette Seger

Funders, publishers, scholarly societies, universities, and other stakeholders need to be able to track the impact of programs and policies designed to advance data sharing and reuse. With the launch of the NIH data management and sharing policy in 2023, establishing a pre-policy baseline of sharing and reuse activity is critical for the biological and biomedical community.

Toward this goal, we tested the utility of mentions of research resources, databases, and repositories (RDRs) as a proxy measurement of data sharing and reuse. We captured and processed text from Methods sections of open access biological and biomedical research articles published in 2020 and 2021 and made available in PubMed Central.

We used natural language processing to identify text strings to measure RDR mentions. In this article, we demonstrate our methodology, provide normalized baseline data sharing and reuse activity in this community, and highlight actions authors and publishers can take to encourage data sharing and reuse practices.
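As a rough illustration of the idea of counting RDR mentions in Methods text, the sketch below uses simple pattern matching. It is not the authors' pipeline: the repository list, the function names, and the normalization (share of articles with at least one mention) are all assumptions made for this example.

```python
import re
from collections import Counter

# Hypothetical list of repository names; the vocabulary used in the paper is not given here.
RDR_NAMES = ["GenBank", "Dryad", "Zenodo", "Protein Data Bank"]

# One case-insensitive pattern with word boundaries so partial words do not match.
RDR_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(name) for name in RDR_NAMES) + r")\b",
    re.IGNORECASE,
)


def count_rdr_mentions(methods_text):
    """Count how often each listed repository name appears in a Methods section."""
    canonical = {name.lower(): name for name in RDR_NAMES}
    counts = Counter()
    for match in RDR_PATTERN.finditer(methods_text):
        counts[canonical[match.group(1).lower()]] += 1
    return counts


def share_of_articles_with_mention(methods_sections):
    """Assumed normalization: fraction of articles with at least one RDR mention."""
    if not methods_sections:
        return 0.0
    with_mention = sum(1 for text in methods_sections if count_rdr_mentions(text))
    return with_mention / len(methods_sections)
```

Calling `share_of_articles_with_mention` on a list of Methods sections gives a single baseline figure that could be compared across publication years, under the assumptions stated above.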

URL : Establishing an early indicator for data sharing and reuse

DOI : https://doi.org/10.1002/leap.1586

Making Mathematical Research Data FAIR: A Technology Overview

Authors : Tim Conrad, Eloi Ferrer, Daniel Mietchen, Larissa Pusch, Johannes Stegmüller, Moritz Schubotz

The sharing and citation of research data is becoming increasingly recognized as an essential building block in scientific research across various fields and disciplines. Sharing research data allows other researchers to reproduce results, replicate findings, and build on them. Ultimately, this will foster faster cycles in knowledge generation.

Some disciplines, such as astronomy or bioinformatics, already have a long history of sharing data; many others do not. The current landscape of so-called research data repositories is diverse. This paper aims to provide a technology review of existing data repositories and portals, with a focus on mathematical research data.

URL : Making Mathematical Research Data FAIR: A Technology Overview

arXiv : https://arxiv.org/abs/2309.11829