An analysis of the effects of sharing research data, code, and preprints on citations

Authors : Giovanni Colavizza, Lauren Cadwallader, Marcel LaFlamme, Grégory Dozot, Stéphane Lecorney, Daniel Rappo, Iain Hrynaszkiewicz

Calls to make scientific research more open have gained traction with a range of societal stakeholders. Open Science practices include, but are not limited to, the early sharing of results via preprints and the open sharing of outputs such as data and code to make research more reproducible and extensible. Existing evidence shows that adopting Open Science practices has effects in several domains.

In this study, we investigate whether adopting one or more Open Science practices leads to significantly higher citations for an associated publication, which is one form of academic impact. We use a novel dataset known as Open Science Indicators, produced by PLOS and DataSeer, which includes all PLOS publications from 2018 to 2023 as well as a comparison group sampled from the PMC Open Access Subset. In total, we analyze around 122,000 publications. We calculate publication- and author-level citation indicators and use a broad set of control variables to isolate the effect of Open Science Indicators on received citations.

We show that Open Science practices are adopted to different degrees across scientific disciplines. We find that the early release of a publication as a preprint correlates with a significant positive citation advantage of about 20.2% on average. We also find that sharing data in an online repository correlates with a smaller yet still positive citation advantage of 4.3% on average.

However, we do not find a significant citation advantage for sharing code. Further research is needed on additional or alternative measures of impact beyond citations. Our results are likely to be of interest to researchers, as well as publishers, research funders, and policymakers.
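To make the shape of such a citation analysis concrete, here is a minimal sketch in Python. It is an illustration only, not the authors' actual model: the input file, the column names (preprint, data_shared, code_shared, and the controls) and the log-linear OLS specification are our assumptions.

```python
# Minimal sketch of a citation regression with Open Science indicators.
# Assumes a table with one row per publication, binary 0/1 indicator
# columns, and a log-transformed citation count as the outcome.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("publications.csv")  # hypothetical input file

# Log-transform citations so coefficients read as relative effects.
df["log_citations"] = np.log1p(df["citations"])

# OLS with Open Science indicators plus control variables
# (publication year, discipline, number of authors, ...).
model = smf.ols(
    "log_citations ~ preprint + data_shared + code_shared"
    " + C(pub_year) + C(discipline) + n_authors",
    data=df,
).fit()
print(model.summary())

# A coefficient b on a binary indicator maps to an approximate
# percentage citation difference of 100 * (exp(b) - 1).
for var in ["preprint", "data_shared", "code_shared"]:
    b = model.params[var]
    print(f"{var}: {100 * (np.exp(b) - 1):+.1f}% vs. otherwise similar articles")
```

Note how a coefficient on a log-transformed outcome converts to a percentage: exp(0.184) - 1 is roughly 0.202, the size of the preprint advantage reported above.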

arXiv : https://arxiv.org/abs/2404.16171

A survey of how biology researchers assess credibility when serving on grant and hiring committees

Authors : Iain Hrynaszkiewicz, Beruria Novich, James Harney, Veronique Kiermer

Researchers who serve on grant review and hiring committees have to make decisions about the intrinsic value of research in short periods of time, and research impact metrics such as the Journal Impact Factor (JIF) exert undue influence on these decisions. Initiatives such as the Coalition for Advancing Research Assessment (CoARA) and the Declaration on Research Assessment (DORA) emphasize responsible use of quantitative metrics and avoidance of journal-based impact metrics for research assessment. Further, our previous qualitative research suggested that assessing the credibility, or trustworthiness, of research is important to researchers not only when they seek to inform their own research but also in the context of research assessment committees.

To confirm our findings from previous interviews in quantitative terms, we surveyed 485 biology researchers who have served on committees for grant review or hiring and promotion decisions, to understand how they assess the credibility of research outputs in these contexts. We found that concepts like credibility, trustworthiness, quality, and impact lack consistent definitions and interpretations by researchers, as had already been observed in our interviews.

We also found that assessing credibility is very important to most (81%) researchers serving on these committees, but fewer than half of respondents are satisfied with their ability to assess credibility. A substantial proportion of respondents (57%) report using journal reputation and the JIF to assess credibility – proxies that research assessment reformers consider inappropriate for this purpose because they don't rely on intrinsic characteristics of the research.

This gap between the importance of an assessment and satisfaction with the ability to conduct it was reflected in the multiple aspects of credibility we tested, and it was greatest for researchers seeking to assess the integrity of research (such as identifying signs of fabrication, falsification, or plagiarism) and the suitability and completeness of research methods. Non-traditional research outputs associated with Open Science practices – shared research data, code, protocols, and preprints – are particularly hard for researchers to assess, despite the potential of Open Science practices to signal trustworthiness.

Our results suggest opportunities to develop better guidance and better signals to support the evaluation of research credibility and trustworthiness – and ultimately support research assessment reform, away from the use of inappropriate proxies for impact and towards assessing the intrinsic characteristics and values researchers see as important.

DOI : https://doi.org/10.31222/osf.io/ht836

A survey of researchers’ needs and priorities for data sharing

Authors : Iain Hrynaszkiewicz, James Harney, Lauren Cadwallader

PLOS has long supported Open Science. One of the ways in which we do so is via our stringent data availability policy, established in 2014. Despite this policy, and more data sharing policies being introduced by other organizations, best practices for data sharing are adopted by only a minority of researchers in their publications. Problems with effective research data sharing persist, and previous research has attributed them to a lack of time, resources, incentives, and/or skills to share data.

In this study we built on this research by investigating the importance of tasks associated with data sharing, and researchers’ satisfaction with their ability to complete these tasks. By investigating these factors we aimed to better understand opportunities for new or improved solutions for sharing data.

In May-June 2020 we surveyed researchers from Europe and North America, asking them to rate tasks associated with data sharing on (i) their importance and (ii) their satisfaction with their ability to complete them. We received 728 complete and 667 partial responses. We calculated mean importance and satisfaction scores to highlight potential opportunities for new solutions and to compare different cohorts.
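As a rough illustration of this kind of scoring, here is a minimal sketch, assuming hypothetical responses on a 1-5 rating scale; the task names, columns, and scale are ours, not the survey's.

```python
# Sketch: mean importance and satisfaction per data-sharing task,
# plus the importance-satisfaction gap used to spot opportunities.
import pandas as pd

# Hypothetical long-format responses: one row per respondent per task.
responses = pd.DataFrame({
    "task": ["use a data repository", "use a data repository",
             "obtain data for reuse", "obtain data for reuse"],
    "importance":   [4, 5, 5, 4],   # 1 = not important, 5 = very important
    "satisfaction": [4, 4, 2, 3],   # 1 = unsatisfied, 5 = very satisfied
})

scores = responses.groupby("task")[["importance", "satisfaction"]].mean()
scores["gap"] = scores["importance"] - scores["satisfaction"]
# Large positive gaps flag tasks that matter to researchers but are
# poorly supported, i.e. candidate opportunities for new solutions.
print(scores.sort_values("gap", ascending=False))
```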

Tasks relating to research impact, funder compliance, and credit had the highest importance scores. 52% of respondents reuse research data, but the average satisfaction score for obtaining data for reuse was relatively low. Tasks associated with sharing data were rated somewhat important, and respondents were reasonably satisfied with their ability to accomplish them. Notably, this included tasks associated with data sharing best practice, such as the use of data repositories. However, the most common method for sharing data was in fact via supplemental files with articles, which is not considered best practice.

We presume that researchers are unlikely to seek new solutions to a problem or task that they are satisfied with their ability to accomplish, even if many do not attempt the task at all. This implies there are few opportunities for new solutions or tools to meet these researcher needs. Publishers can likely meet these needs for data sharing by seamlessly integrating existing solutions that reduce the effort or behaviour change involved in some tasks, and by focusing on advocacy and education around the benefits of sharing data.

There may, however, be opportunities – unmet researcher needs – in relation to better supporting data reuse, which could be met in part by strengthening the data sharing policies of journals and publishers and by improving the discoverability of data associated with published articles.

DOI : https://doi.org/10.31219/osf.io/njr5u

Publishers’ Responsibilities in Promoting Data Quality and Reproducibility

Author : Iain Hrynaszkiewicz

Scholarly publishers can help to increase data quality and reproducible research by promoting transparency and openness.

Increasing transparency can be achieved by publishers in six key areas:

(1) understanding researchers’ problems and motivations, by conducting and responding to the findings of surveys;

(2) raising awareness of issues and encouraging behavioural and cultural change, by introducing consistent journal policies on sharing research data, code and materials;

(3) improving the quality and objectivity of the peer-review process, by implementing reporting guidelines and checklists and using technology to identify misconduct;

(4) improving scholarly communication infrastructure, with journals that publish all scientifically sound research, promoting study registration, partnering with data repositories, and providing services that improve data sharing and data curation;

(5) increasing incentives for practising open research, with data journals and software journals and by implementing data citation and badges for transparency; and

(6) making research communication more open and accessible, with open-access publishing options, by permitting text and data mining, by sharing publisher data and metadata, and through industry and community collaboration.

This chapter describes practical approaches being taken by publishers in these six areas, their progress and effectiveness, and the implications for researchers publishing their work.

URL : https://link.springer.com/chapter/10.1007%2F164_2019_290

The citation advantage of linking publications to research data

Authors : Giovanni Colavizza, Iain Hrynaszkiewicz, Isla Staden, Kirstie Whitaker, Barbara McGillivray

Efforts to make research results open and reproducible are increasingly reflected by journal policies encouraging or mandating authors to provide data availability statements.

As a consequence, there has been strong uptake of data availability statements in the recent literature. Nevertheless, it is still unclear what proportion of these statements actually contain well-formed links to data, for example via a URL or permanent identifier, and whether there is added value in providing them.

We consider 531,889 journal articles published by PLOS and BMC, which are part of the PubMed Open Access collection, categorize their data availability statements according to their content, and analyze the citation advantage of different statement categories via regression.
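As a toy illustration of content-based statement categorization, here is a minimal sketch; the categories, patterns, and example statements below are our own simplification, not the paper's actual scheme.

```python
# Toy categorization of data availability statements (DAS) by content.
# The rules and category labels are illustrative only.
import re

def categorize_das(statement: str) -> str:
    s = statement.lower()
    # A URL or DOI suggests a well-formed link to data, e.g. a repository.
    if re.search(r"https?://|doi\.org|10\.\d{4,}", s):
        return "link to data"
    if "upon request" in s or "on request" in s:
        return "available on request"
    if "supporting information" in s or "within the paper" in s:
        return "data in paper/supplement"
    return "other"

# Invented example statements, for illustration only.
examples = [
    "All data are available at https://doi.org/10.5281/zenodo.1234567.",
    "Data are available from the corresponding author upon request.",
    "All relevant data are within the paper and its Supporting Information files.",
]
for das in examples:
    print(categorize_das(das), "<-", das)
```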

We find that, following mandated publisher policies, data availability statements have now become common, yet statements containing a link to a repository remain only a fraction of the total.

We also find that articles with these statements in particular can have up to 25.36% higher citation impact on average: an encouraging result for all publishers and authors who make the effort of sharing their data. All our data and code are made available so that our results can be reproduced and extended.

URL : https://arxiv.org/abs/1907.02565

AccessLab: Workshops to broaden access to scientific research

Authors : Amber G. F. Griffiths, Ivvet Modinou, Clio Heslop, Charlotte Brand, Aidan Weatherill, Kate Baker, Anna E. Hughes, Jen Lewis, Lee de Mora, Sara Mynott, Katherine E. Roberts, David J. Griffiths

AccessLabs are workshops with two simultaneous motivations, achieved through direct citizen-scientist pairings: (1) to decentralise research skills so that a broader range of people are able to access/use scientific research, and (2) to expose science researchers to the difficulties of using their research as an outsider, creating new open access advocates.

Five trial AccessLabs have taken place for policy makers, media/journalists, marine sector participants, community groups, and artists. The act of pairing science academics with local community members helps build understanding and trust between groups at a time when this relationship appears to be under increasing threat from different political and economic currents in society.

Here, we outline the workshop motivations, format, and evaluation, with the aim that others can build on the methods developed.

DOI : https://doi.org/10.1371/journal.pbio.3000258

Developing a research data policy framework for all journals and publishers

Authors : Iain Hrynaszkiewicz, Natasha Simons, Azhar Hussain, Simon Goudie

More journals and publishers – and funding agencies and institutions – are introducing research data policies. But as the prevalence of policies increases, there is potential to confuse researchers and support staff with numerous or conflicting policy requirements.

We define and describe 14 features of journal research data policies and arrange these into a set of six standard policy types or tiers, which can be adopted by journals and publishers to promote data sharing in a way that encourages good practice and is appropriate for their audience’s perceived needs.

Policy features include coverage of topics such as data citation, data repositories, data availability statements, data standards and formats, and peer review of research data.

These policy features and types have been created by reviewing the policies of multiple scholarly publishers, which collectively publish more than 10,000 journals, and through discussions and consensus building with multiple stakeholders in research data policy via the Data Policy Standardisation and Implementation Interest Group of the Research Data Alliance.

Implementation guidelines for the standard research data policies for journals and publishers are also provided, along with template policy texts which can be implemented by journals in their Information for Authors and publishing workflows.
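To suggest how such standard policy types could be operationalized in publishing workflows, here is a hypothetical sketch of tiers encoded as data. The feature names are taken from the abstract above, but the tier contents and the checking logic are entirely our own invention and do not reproduce the actual framework.

```python
# Hypothetical sketch: encoding journal data policy tiers as data so a
# submission system can check manuscripts against them. Tier contents
# below are invented for illustration.
POLICY_TIERS = {
    "tier_1": {"data availability statements": "encouraged"},
    "tier_2": {"data availability statements": "required"},
    "tier_3": {
        "data availability statements": "required",
        "data repositories": "encouraged",
        "data citation": "encouraged",
    },
}

def missing_requirements(tier: str, provided: set) -> list:
    """Return required policy features absent from a submission."""
    return [
        feature
        for feature, level in POLICY_TIERS[tier].items()
        if level == "required" and feature not in provided
    ]

# A tier-2 journal checking a submission that only includes data citation:
print(missing_requirements("tier_2", {"data citation"}))
# -> ['data availability statements']
```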

We conclude with a call for collaboration across the scholarly publishing and wider research community to drive further implementation and adoption of consistent research data policies.

URL : https://figshare.com/articles/Developing_a_research_data_policy_framework_for_all_journals_and_publishers/8223365/1