Trends and changes in academic libraries’ data management functions: A topic modeling analysis of job advertisements

Authors : Ye Yuan, A.M.K. Yanti Idaya, A. Noorhidawati, Guan Wang

In the era of open science, academic libraries have transitioned from traditional resource providers to proactive platforms that drive data integration and knowledge innovation.

This shift has led to the continuous evolution and expansion of their data management functions. This study aims to (i) track trends in academic library data management positions, (ii) identify key themes in job advertisements related to data management, and (iii) examine how these themes have evolved. Using text mining techniques, this study applied Latent Dirichlet Allocation (LDA) and TF-IDF vectorization to systematically analyze 803 job advertisements related to data management posted on the IFLA LIBJOBS platform from 1996 to 2023.
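The TF-IDF weighting mentioned above can be illustrated with a minimal sketch in plain Python. This is not the authors' pipeline: the whitespace tokenization, the unsmoothed `log(N / df)` formula, the `tf_idf` helper, and the three sample advertisements are all assumptions made for illustration, and the LDA topic-modeling step is not shown.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document (hypothetical helper).

    Assumes term frequency = count / document length and
    inverse document frequency = log(N / df), with no smoothing.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Invented sample texts, not taken from the IFLA LIBJOBS data.
ads = [
    "metadata librarian cataloging metadata standards",
    "research data management librarian",
    "systems librarian server maintenance",
]
weights = tf_idf(ads)
```

In this toy corpus, a word that appears in every advertisement (such as "librarian") receives a weight of zero, while a word concentrated in one advertisement (such as "metadata") is weighted up, which is the property that makes TF-IDF useful for surfacing distinctive terms.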

The findings reveal that the development of these positions has undergone three phases: exploration, growth, and adjustment. Four core themes in data management functions emerged: “Cataloging and Metadata Management,” “Data Services and Support,” “Research Data Management,” and “Systems Management and Maintenance.”

Over time, these themes have evolved from distinct roles to a more balanced distribution. Technological advancements, political initiatives, and shifts in the global data environment have influenced these trends. Notably, the rising demand for “Systems Management and Maintenance” highlights its critical role in ensuring data security, while the sustained need for “Cataloging and Metadata Management” underscores its foundational place in data management strategies.

Meanwhile, the steady growth of “Data Services and Support” and “Research Data Management” reflects the adaptability and strategic adjustments of academic libraries in response to the rapidly changing information landscape.

These insights offer valuable empirical evidence for library leaders and policymakers in strategic planning and capacity development, ensuring that libraries can effectively navigate the challenges of a dynamic research environment.

DOI : https://doi.org/10.1016/j.acalib.2025.103017

FAIR GPT: A virtual consultant for research data management in ChatGPT

Authors : Renat Shigapov, Irene Schumm

FAIR GPT is the first virtual consultant in ChatGPT designed to help researchers and organizations make their data and metadata compliant with the FAIR (Findable, Accessible, Interoperable, Reusable) principles. It provides guidance on metadata improvement, dataset organization, and repository selection.

To ensure accuracy, FAIR GPT uses external APIs to assess dataset FAIRness, retrieve controlled vocabularies, and recommend repositories, minimizing hallucination and improving precision. It also assists in creating documentation (data and software management plans, README files, and codebooks), and selecting proper licenses. This paper describes its features, applications, and limitations.

Arxiv : https://arxiv.org/abs/2410.07108

Reproducible and Attributable Materials Science Curation Practices: A Case Study

Authors : Ye Li, Sarah Laura Wilson, Micah Altman

While small labs produce much of the fundamental experimental research in Materials Science and Engineering (MSE), little is known about their data management and sharing practices and the extent to which they promote trust in, and transparency of, the published research.

In this research, we conduct a case study of a leading MSE research lab to characterize the limits of current data management and sharing practices concerning reproducibility and attribution. We systematically reconstruct the workflows underpinning four research projects by combining interviews, document review, and digital forensics. We then apply information graph analysis and computer-assisted retrospective auditing to identify where critical research information is unavailable or at risk.

We find that while data management and sharing practices in this leading lab protect against computer and disk failure, they are insufficient to ensure reproducibility or correct attribution of work — especially when a group member withdraws before project completion.

We conclude with recommendations for adjustments to MSE data management and sharing practices to promote trustworthiness and transparency by adding lightweight automated file-level auditing and automated data transfer processes.
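One way to picture lightweight file-level auditing is a checksum manifest compared across two points in time. The sketch below is an assumption made for illustration, not the tooling the authors recommend: the `build_manifest` and `audit` helpers and the choice of SHA-256 are hypothetical.

```python
import hashlib
from pathlib import Path


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    root = Path(root)
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            manifest[path.relative_to(root).as_posix()] = digest
    return manifest


def audit(old, new):
    """Compare two manifests and report added, removed, and changed files."""
    return {
        "added": sorted(set(new) - set(old)),
        "removed": sorted(set(old) - set(new)),
        "changed": sorted(p for p in old.keys() & new.keys() if old[p] != new[p]),
    }
```

Running `build_manifest` on a project directory on a schedule and passing consecutive manifests to `audit` would flag files that disappeared or were silently modified, for example after a group member leaves.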

DOI : https://doi.org/10.2218/ijdc.v18i1.940

Research Data Management in the Croatian Academic Community: A Research Study

Author : Radovan Vrana

This paper presents the results of an empirical research study of Croatian scientists’ use and management of research data. The study was carried out from 28 June 2023 until 31 August 2023 using an online questionnaire consisting of 28 questions. The answers of 584 respondents working in science were selected for further analysis. About three-quarters of the respondents had successfully used the research data of other scientists. Research data were mostly acquired from colleagues from the same department or institution.

Roughly half of the respondents did not ask other scientists directly for their research data. Research data are important to the respondents mostly for raising the quality of research. Repeating someone else’s research by using their research data is still a problem. Less than one-third of the respondents provided full access to their research data, mostly out of fear of misuse.

The benefits of research data sharing were recognized, but few of the respondents received any reward for it. Archiving research data is a significant problem for the respondents, as they predominantly use their own computers, which are prone to failure, for that activity and do not think about long-term preservation. Finally, the respondents lacked deeper knowledge of research data management.

DOI : https://doi.org/10.3390/publications12020016

Research Data Management in the Humanities: Challenges and Opportunities in the Canadian Context

Authors : Stefan Higgins, Lisa Goddard, Shahira Khair

In recent years, research funders across the world have implemented mandates for research data management (RDM) that introduce new obligations for researchers seeking funding. Although data work is not new in the humanities, digital research infrastructures, best practices, and the development of highly qualified personnel to support humanist researchers are all still nascent.

Responding to these changes, this article offers four contributions to how humanists can consider the role of “data” in their research and succeed in its management. First, we define RDM and data management plans (DMP) and raise some exigent questions regarding their development and maintenance.

Second, acknowledging the unsettled status of “data” in the humanities, we offer some conceptual explanations of what data are, and gesture to some ways in which humanists are already (and have always been) engaged in data work.

Third, we argue that data work requires conscious design—attention to how data are produced—and that thinking of data work as involving design (e.g., experimental and interpretive work) can help humanists engage more fruitfully in RDM.

Fourth, we argue that RDM (and data work, generally) is labour that requires compensation in the form of funding, support, and tools, as well as accreditation and recognition that incentivizes researchers to make RDM an integral part of their research.

Finally, we offer a set of concrete recommendations to support humanist RDM in the Canadian context.

DOI : https://doi.org/10.16995/dscn.9956

Assessing Quality Variations in Early Career Researchers’ Data Management Plans

Author : Jukka Rantasaari

This paper aims to better understand early career researchers’ (ECRs’) research data management (RDM) competencies by assessing the contents and quality of data management plans (DMPs) developed during a multi-stakeholder RDM course. We also aim to identify differences between DMPs in relation to several background variables (e.g., discipline, course track).

The Basics of Research Data Management (BRDM) course has been held in two multi-faculty, research-intensive universities in Finland since 2020. In this study, 223 ECRs’ DMPs created in the BRDM courses of 2020–2022 were assessed using the recommendations and criteria of the Finnish DMP Evaluation Guide + General Finnish DMP Guidance (FDEG).

The median quality of the DMPs appeared to be satisfactory. The differences in rating according to FDEG’s three-point performance criteria were not statistically significant between DMPs developed in separate years, course tracks, or disciplines. However, content analysis revealed differences between disciplines and course tracks regarding the DMPs’ key characteristics, such as sharing, storing, and preserving data.

DMPs that contained a data table (DtDMPs) also differed highly significantly from prose DMPs. DtDMPs better acknowledged the data handling needs of different data types and improved the overall quality of a DMP.

The results illustrated that the ECRs had learned the basic RDM competencies and grasped their significance to the integrity, reliability, and reusability of data. However, more focused further training is needed to reach advanced competency, especially in the areas of handling and sharing personal data, legal issues, long-term preservation, and funders’ data policies.

Equally important to the cultural change in which RDM becomes an organic part of research practice is merging research support services, processes, and infrastructure into the research projects’ own processes. Additionally, incentives are needed for sharing and reusing data.

DOI : https://doi.org/10.2218/ijdc.v18i1.873

Agile Research Data Management with Open Source: LinkAhead

Authors : Daniel Hornung, Florian Spreckelsen, Thomas Weiß

Research data management (RDM) in academic scientific environments is increasingly coming into focus as an important part of good scientific practice and as a topic with great potential for saving time and money. Nevertheless, there is a shortage of appropriate tools that fulfill the specific requirements of scientific research.

We identified where the requirements in science deviate from those in other fields and proposed a list of requirements that RDM software should meet to become a viable option. We analyzed a number of currently available technologies and tool categories against these requirements and identified areas where no tools can satisfy researchers’ needs.

Finally, we assessed the open-source research data management system (RDMS) LinkAhead for compatibility with the proposed features and found that it fulfills the requirements in the area of semantic, flexible data handling, in which other tools show weaknesses.

DOI : https://doi.org/10.48694/inggrid.3866