Authors : Agata Piękniewska, Laurel L. Haak, Darla Henderson, Katherine McNeill, Anita Bandrowski, Yvette Seger
Funders, publishers, scholarly societies, universities, and other stakeholders need to be able to track the impact of programs and policies designed to advance data sharing and reuse. With the launch of the NIH data management and sharing policy in 2023, establishing a pre-policy baseline of sharing and reuse activity is critical for the biological and biomedical community.
Toward this goal, we tested the utility of mentions of research resources, databases, and repositories (RDRs) as a proxy measurement of data sharing and reuse. We captured and processed text from Methods sections of open access biological and biomedical research articles published in 2020 and 2021 and made available in PubMed Central.
We used natural language processing to identify text strings to measure RDR mentions. In this article, we demonstrate our methodology, provide normalized baseline data sharing and reuse activity in this community, and highlight actions authors and publishers can take to encourage data sharing and reuse practices.