FAIR GPT: A virtual consultant for research data management in ChatGPT

Authors : Renat Shigapov, Irene Schumm

FAIR GPT is the first virtual consultant in ChatGPT designed to help researchers and organizations make their data and metadata compliant with the FAIR (Findable, Accessible, Interoperable, Reusable) principles. It provides guidance on metadata improvement, dataset organization, and repository selection.

To ensure accuracy, FAIR GPT uses external APIs to assess dataset FAIRness, retrieve controlled vocabularies, and recommend repositories, minimizing hallucination and improving precision. It also assists in creating documentation (data and software management plans, README files, and codebooks) and in selecting appropriate licenses. This paper describes its features, applications, and limitations.

arXiv : https://arxiv.org/abs/2410.07108

Reproducible and Attributable Materials Science Curation Practices: A Case Study

Authors : Ye Li, Sarah Laura Wilson, Micah Altman

While small labs produce much of the fundamental experimental research in Materials Science and Engineering (MSE), little is known about their data management and sharing practices and the extent to which they promote trust in, and transparency of, the published research.

In this research, we conduct a case study of a leading MSE research lab to characterize the limits of current data management and sharing practices concerning reproducibility and attribution. We systematically reconstruct the workflows underpinning four research projects by combining interviews, document review, and digital forensics. We then apply information graph analysis and computer-assisted retrospective auditing to identify where critical research information is unavailable or at risk.

We find that while data management and sharing practices in this leading lab protect against computer and disk failure, they are insufficient to ensure reproducibility or correct attribution of work — especially when a group member withdraws before project completion.

We conclude with recommendations for adjustments to MSE data management and sharing practices to promote trustworthiness and transparency by adding lightweight automated file-level auditing and automated data transfer processes.
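The abstract does not say how the recommended file-level auditing would be implemented. As a rough illustration only, one lightweight approach is to record a checksum manifest of a project directory at intervals and compare successive manifests. The function names (`sha256_of`, `build_manifest`, `diff_manifests`) and the manifest layout below are assumptions made for this sketch, not the authors' tooling.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict:
    """Map each file under root (by relative path) to its checksum, size, and mtime."""
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            stat = path.stat()
            manifest[path.relative_to(root).as_posix()] = {
                "sha256": sha256_of(path),
                "size": stat.st_size,
                "mtime": stat.st_mtime,
            }
    return manifest


def diff_manifests(old: dict, new: dict) -> dict:
    """Report files added, removed, or changed in content between two manifests."""
    return {
        "added": sorted(set(new) - set(old)),
        "removed": sorted(set(old) - set(new)),
        "changed": sorted(
            name
            for name in set(old) & set(new)
            if old[name]["sha256"] != new[name]["sha256"]
        ),
    }
```

Running `build_manifest` on a schedule and storing each result (for example as JSON) would give a simple audit trail of when files appeared, disappeared, or changed. Attributing changes to individual group members would need additional information that this sketch does not capture.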

DOI : https://doi.org/10.2218/ijdc.v18i1.940

Research Data Management in the Croatian Academic Community: A Research Study

Author : Radovan Vrana

This paper presents the results of an empirical research study of Croatian scientists’ use and management of research data. This research study was carried out from 28 June 2023 until 31 August 2023 using an online questionnaire consisting of 28 questions. The answers of 584 respondents working in science were selected for further analysis. About three-quarters of the respondents used the research data of other scientists successfully. Research data were mostly acquired from colleagues from the same department or institution.

Roughly half of the respondents did not ask other scientists directly for their research data. Research data are important to the respondents mostly for raising the quality of research. Repeating someone else’s research by using their research data is still a problem. Fewer than one-third of the respondents provided full access to their research data, mostly due to their fear of misuse.

The benefits of research data sharing were recognized, but few of the respondents received any reward for it. Archiving research data is a significant problem for the respondents, as they predominantly use their own computers, which are prone to failure, for that activity and do not think about long-term preservation. Finally, the respondents lacked deeper knowledge of research data management.

DOI : https://doi.org/10.3390/publications12020016

Research Data Management in the Humanities: Challenges and Opportunities in the Canadian Context

Authors : Stefan Higgins, Lisa Goddard, Shahira Khair

In recent years, research funders across the world have implemented mandates for research data management (RDM) that introduce new obligations for researchers seeking funding. Although data work is not new in the humanities, digital research infrastructures, best practices, and the development of highly qualified personnel to support humanist researchers are all still nascent.

Responding to these changes, this article offers four contributions to how humanists can consider the role of “data” in their research and succeed in its management. First, we define RDM and data management plans (DMP) and raise some exigent questions regarding their development and maintenance.

Second, acknowledging the unsettled status of “data” in the humanities, we offer some conceptual explanations of what data are, and gesture to some ways in which humanists are already (and have always been) engaged in data work.

Third, we argue that data work requires conscious design—attention to how data are produced—and that thinking of data work as involving design (e.g., experimental and interpretive work) can help humanists engage more fruitfully in RDM.

Fourth, we argue that RDM (and data work, generally) is labour that requires compensation in the form of funding, support, and tools, as well as accreditation and recognition that incentivize researchers to make RDM an integral part of their research.

Finally, we offer a set of concrete recommendations to support humanist RDM in the Canadian context.

DOI : https://doi.org/10.16995/dscn.9956

Assessing Quality Variations in Early Career Researchers’ Data Management Plans

Author : Jukka Rantasaari

This paper aims to better understand early career researchers’ (ECRs’) research data management (RDM) competencies by assessing the contents and quality of data management plans (DMPs) developed during a multi-stakeholder RDM course. We also aim to identify differences between DMPs in relation to several background variables (e.g., discipline, course track).

The Basics of Research Data Management (BRDM) course has been held in two multi-faculty, research-intensive universities in Finland since 2020. In this study, 223 ECRs’ DMPs created in the BRDM courses of 2020–2022 were assessed, using the recommendations and criteria of the Finnish DMP Evaluation Guide + General Finnish DMP Guidance (FDEG).

The median quality of DMPs appeared to be satisfactory. The differences in rating according to FDEG’s three-point performance criteria were not statistically significant between DMPs developed in different years, course tracks, or disciplines. However, using content analysis, differences were found between disciplines or course tracks regarding DMPs’ key characteristics such as sharing, storing, and preserving data.

DMPs that contained a data table (DtDMPs) also differed from prose DMPs, and the difference was highly significant. DtDMPs better acknowledged the data handling needs of different data types and improved the overall quality of a DMP.

The results illustrated that the ECRs had learned the basic RDM competencies and grasped their significance to the integrity, reliability, and reusability of data. However, more focused further training is needed to reach advanced competency, especially in handling and sharing personal data, legal issues, long-term preservation, and funders’ data policies.

Equally important for a cultural change in which RDM becomes an organic part of research practices is merging research support services, processes, and infrastructure into research projects’ processes. Additionally, incentives are needed for sharing and reusing data.

DOI : https://doi.org/10.2218/ijdc.v18i1.873

Agile Research Data Management with Open Source: LinkAhead

Authors : Daniel Hornung, Florian Spreckelsen, Thomas Weiß

Research data management (RDM) in academic scientific environments is increasingly in focus as an important part of good scientific practice and as a topic with great potential for saving time and money. Nevertheless, there is a shortage of appropriate tools that fulfill the specific requirements of scientific research.

We identified where the requirements in science deviate from other fields and proposed a list of requirements that RDM software should meet to become a viable option. We analyzed a number of currently available technologies and tool categories for how well they match these requirements and identified areas where no tools can satisfy researchers’ needs.

Finally, we assessed the open-source RDMS (research data management system) LinkAhead for compatibility with the proposed features and found that it fulfills the requirements in the area of semantic, flexible data handling, an area in which other tools show weaknesses.

DOI : https://doi.org/10.48694/inggrid.3866

Emerging roles and responsibilities of libraries in support of reproducible research

Authors : Birgit Schmidt, Andrea Chiarelli, Lucia Loffreda, Jeroen Sondervan

Ensuring the reproducibility of research is a multi-stakeholder effort that comes with challenges and opportunities for individual researchers and research communities, librarians, publishers, funders and service providers. These emerge at various steps of the research process, and, in particular, at the publication stage.

Previous work by Knowledge Exchange highlighted that, while there is growing awareness among researchers, reproducible publication practices have been slow to change. Importantly, research reproducibility has not yet reached institutional agendas: this work seeks to highlight the rationale for libraries to initiate and/or step up their engagement with this topic, which we argue is well aligned with their core values and strategic priorities.

We draw on secondary analysis of data gathered by Knowledge Exchange, focusing on the literature identified as well as interviews held with librarians. We extend this through further investigation of the literature and by integrating the findings of discussions held at the 2022 LIBER conference, to provide an updated picture of how libraries engage with research reproducibility.

Libraries have a significant role in promoting responsible research practices, including transparency and reproducibility, by leveraging their connections to academic communities and collaborating with stakeholders like research funders and publishers. Our recommendations for libraries include: i) partnering with researchers to promote a research culture that values transparency and reproducibility; ii) enhancing existing research infrastructure and support; and iii) investing in raising awareness and developing skills and capacities related to these principles.

DOI : https://doi.org/10.53377/lq.14947