Text data mining and data quality management for research information systems in the context of open data and open science

Authors : Otmane Azeroual, Gunter Saake, Mohammad Abuosba, Joachim Schöpfel

In the implementation and use of research information systems (RIS) in scientific institutions, text data mining and semantic technologies are key technologies for the meaningful use of large amounts of data.

It is not the collection of data that is difficult, but the further processing and integration of the data in RIS. Data is usually not uniformly formatted or structured; texts and tables, for example, often cannot be linked.

The data comes from various source systems with different data formats, such as project and publication databases and the CERIF and RCD data models. Internal and external data sources continue to evolve.

On the one hand, the sources must be constantly synchronized and the results of the data links checked. On the other hand, natural-language texts must be processed and specific information extracted from them.

Text data mining is used to analyze the quality of the metadata and to identify entities and general keywords, so that users are supported in their search for relevant research information.
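As a rough illustration of the two tasks just described (checking metadata quality and identifying keywords), here is a minimal Python sketch. The abstract does not say which techniques the authors use; the term-frequency approach, the field names, the stop-word list and the sample record below are all assumptions made for illustration only.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; a real system would use a fuller one.
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "for", "to", "on", "with", "is", "are", "from"}


def extract_keywords(text, top_n=3):
    """Return the top_n most frequent tokens that are not stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(t for t in tokens if t not in STOP_WORDS and len(t) > 2)
    return [word for word, _ in counts.most_common(top_n)]


def missing_fields(record, required=("title", "abstract", "authors")):
    """List required metadata fields that are absent or empty (a simple quality check)."""
    return [field for field in required if not record.get(field)]


# Made-up example record, not taken from any real RIS.
record = {
    "title": "Data quality in research information systems",
    "abstract": "Research information systems integrate data from many sources.",
    "authors": [],
}

print(extract_keywords(record["title"] + " " + record["abstract"]))
print(missing_fields(record))
```

Simple frequency counts are only a starting point; entity recognition and semantic linking would need dedicated tooling.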

The information age makes it easier to store huge amounts of data, and the growth in the number of documents on the internet, in institutions’ intranets, in newswires and in blogs is overwhelming.

Search engines should help open up these sources of information and make them usable for administrative and research purposes. Against this backdrop, this paper aims to provide an overview of text data mining techniques and of successful data quality management for RIS in the context of open data and open science in scientific institutions and libraries, and to offer ideas for their application. In particular, solutions for RIS are presented.

URL : https://arxiv.org/abs/1812.04298

Exploring Initiatives for Open Educational Practices at an Australian and a Brazilian University

Authors : Carina Bossu, Marineli Meier

This paper explores some key developments in Open Educational Practices (OEP) in higher education in Australia and in Brazil. More specifically, it focuses on the analysis of two individual universities: the University of Tasmania, in Australia; and the Federal University of Paraná, in Brazil.

They are both public and mostly face-to-face universities trying to engage with OEP to enhance their blended learning offerings, and more generally learning and teaching.

However, these institutions are distinctive in terms of their student numbers, their blended learning approaches, their role within their own communities, and their OEP strategies and initiatives.

We will present some of the key policies and strategies adopted by these universities to support OEP, as well as the impact and the opportunities at present.

The discussion in this paper will then attempt to make some recommendations for future directions of OEP adoption not only in these two countries, but also elsewhere.

DOI : http://doi.org/10.5334/jime.475

Equipping the Next Generation for Responsible Research and Innovation with Open Educational Resources, Open Courses, Open Communities and Open Schooling: An Impact Case Study in Brazil

Authors : Alexandra Okada, Tony Sherborne

There has been an increasing number of projects and institutions promoting open education at scale through Open Educational Resources (OER) and Massive Open Online Courses (MOOC) to broaden learning opportunities for all. However, there are still many challenges in relation to sustainability, effective implementation and evidence-based impact to support educational policies.

To explore this gap, this paper focuses on an integrated model that combines OER, MOOC, Communities of Practice (CoP) and Open Schooling to promote open education and foster inquiry skills for Responsible Research and Innovation (RRI), a key approach coined by the European Commission.

This study focuses on the ENGAGE Project, whose 14 partners in Europe produced more than 300 OER and 60 MOOC in ten languages and supported 27 CoP with more than 17,000 members worldwide, including more than 2,000 from Brazil.

Through a novel framework on impact assessment of OER for RRI underpinned by a mixed method approach, this study examines the influence of open education on academic and non-academic groups and the correlation between the outputs developed in the project and the outcomes reported by the Brazilian communities.

Qualitative and quantitative data from the ENGAGE platform, journal articles produced by the Brazilian participants and interviews with authors were analysed.

Findings report the different ways that the community developed open schooling projects, the changes in their practices to foster digital scientific literacy, and outcomes with implications for society.

DOI : http://doi.org/10.5334/jime.482

Being a deliberate prey of a predator: Researchers’ thoughts after having published in predatory journal

Authors : Najmeh Shaghaei, Charlotte Wien, Jakob Povl Holck, Anita L. Thiesen, Ole Ellegaard, Evgenios Vlachos, Thea Marie Drachen

A central question concerning scientific publishing is how researchers select journals to which they submit their work, since the choice of publication channel can make or break researchers.

The gold-digger mentality developed by some publishers created the so-called predatory journals that accept manuscripts for a fee with little peer review. The literature claims that mainly researchers from low-ranked universities in developing countries publish in predatory journals.

We decided to challenge this claim using the University of Southern Denmark as a case. We ran Beall’s List against our research registration database and identified 31 possibly predatory publications in a set of 6,851 publications from 2015-2016.
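The matching step can be pictured with a short Python sketch. It is not the authors' procedure: the abstract does not describe how the list was compared with the database, so the name normalization, the record fields (`journal`, `publisher`) and every name in the sample data are hypothetical.

```python
def normalize(name):
    """Lower-case a name and collapse punctuation and whitespace so variants compare equal."""
    cleaned = "".join(ch if ch.isalnum() else " " for ch in name.lower())
    return " ".join(cleaned.split())


def flag_publications(publications, watchlist):
    """Return publications whose journal or publisher name appears on the watchlist.

    Matches are only candidates ("possibly predatory") and still need manual review.
    """
    watch = {normalize(entry) for entry in watchlist}
    return [
        pub
        for pub in publications
        if normalize(pub.get("journal", "")) in watch
        or normalize(pub.get("publisher", "")) in watch
    ]


# Made-up watchlist entries and records, for illustration only.
watchlist = ["Example Journal of Everything", "Acme Open Publishing"]
publications = [
    {"id": 1, "journal": "Example journal of everything.", "publisher": "Unknown"},
    {"id": 2, "journal": "Journal of Careful Review", "publisher": "Society Press"},
    {"id": 3, "journal": "Annals of Something", "publisher": "ACME Open Publishing"},
]

for pub in flag_publications(publications, watchlist):
    print(pub["id"], pub["journal"])
```

Exact matching on normalized names keeps false positives low but will miss abbreviated or misspelled journal titles, which is one reason the flagged set should be treated as a starting point for manual checking.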

A qualitative research interview revealed that experienced researchers from the developed world publish in predatory journals mainly for the same reasons as do researchers from developing countries: lack of awareness, speed and ease of the publication process, and a chance to get work that was rejected elsewhere published.

However, our findings indicate that the Open Access potential and a larger readership outreach were also motives for publishing in open access journals with quick acceptance rates.

DOI : http://doi.org/10.18352/lq.10259

Creating Structured Linked Data to Generate Scholarly Profiles: A Pilot Project using Wikidata and Scholia

Authors : Mairelys Lemus-Rojas, Jere D. Odell


Wikidata, a knowledge base for structured linked data, provides an open platform for curating scholarly communication data. Because all elements in a Wikidata entry are linked to defining elements and metadata, other web systems can harvest and display the data in meaningful ways.

Thus, Wikidata has the capacity to serve as the data source for faculty profiles. Scholia is an example of how third-party tools can leverage the power of Wikidata to provide faculty profiles and bibliographic, data-driven visualizations.
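To make the idea of harvesting Wikidata concrete, the following Python sketch builds a SPARQL query for works linked to an author item through the "author" property (P50), plus the corresponding Scholia profile address. It is an illustration under assumptions, not the workflow described in the article: the helper names are hypothetical, the Scholia URL pattern is assumed from the public site, and Q42 is used purely as a placeholder item identifier. No request is sent; the code only constructs strings.

```python
import re
from urllib.parse import urlencode

# Public Wikidata Query Service endpoint.
WIKIDATA_SPARQL_ENDPOINT = "https://query.wikidata.org/sparql"


def _check_qid(qid):
    """Raise ValueError unless qid looks like a Wikidata item identifier (Q + digits)."""
    if not re.fullmatch(r"Q[1-9][0-9]*", qid):
        raise ValueError(f"not a Wikidata item identifier: {qid!r}")
    return qid


def works_by_author_query(author_qid, limit=10):
    """Build a SPARQL query listing works whose author (P50) is the given item."""
    _check_qid(author_qid)
    return (
        "SELECT ?work ?workLabel WHERE {\n"
        f"  ?work wdt:P50 wd:{author_qid} .\n"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }\n'
        "}\n"
        f"LIMIT {int(limit)}"
    )


def sparql_request_url(query):
    """Return a GET URL for the query; the caller decides whether to fetch it."""
    return WIKIDATA_SPARQL_ENDPOINT + "?" + urlencode({"query": query, "format": "json"})


def scholia_author_url(author_qid):
    """Return the Scholia author profile URL (pattern assumed from the public site)."""
    return f"https://scholia.toolforge.org/author/{_check_qid(author_qid)}"


query = works_by_author_query("Q42")  # Q42 is only a placeholder identifier
print(query)
print(sparql_request_url(query))
print(scholia_author_url("Q42"))
```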


In this article, we share our methods for contributing to Wikidata and displaying the data with Scholia.

We deployed these methods as part of a pilot project in which we contributed data about a small but unique school on the Indiana University-Purdue University Indianapolis (IUPUI) campus, the IU Lilly Family School of Philanthropy.


Following the completion of our pilot project, we aim to find additional methods for contributing large data collections to Wikidata. Specifically, we seek to contribute scholarly communication data that the library already maintains in other systems.

We are also facilitating Wikidata edit-a-thons to increase the library’s familiarity with the knowledge base and our capacity to contribute to the site.

DOI : https://doi.org/10.7710/2162-3309.2272

The principles of tomorrow’s university

Authors : Daniel S. Katz, Gabrielle Allen, Lorena A. Barba, Devin R. Berg, Holly Bik, Carl Boettiger, Christine L. Borgman, C. Titus Brown, Stuart Buck, Randy Burd, Anita de Waard, Martin Paul Eve, Brian E. Granger, Josh Greenberg, Adina Howe, Bill Howe, May Khanna, Timothy L. Killeen, Matthew Mayernik, Erin McKiernan, Chris Mentzel, Nirav Merchant, Kyle E. Niemeyer, Laura Noren, Sarah M. Nusser, Daniel A. Reed, Edward Seidel, MacKenzie Smith, Jeffrey R. Spies, Matt Turk, John D. Van Horn, Jay Walsh

In the 21st century, research is increasingly data- and computation-driven. Researchers, funders, and the larger community today emphasize the traits of openness and reproducibility.

In March 2017, 13 mostly early-career research leaders who are building their careers around these traits came together with ten university leaders (presidents, vice presidents, and vice provosts), representatives from four funding agencies, and eleven organizers and other stakeholders in an NIH- and NSF-funded one-day, invitation-only workshop titled “Imagining Tomorrow’s University.”

Workshop attendees were charged with launching a new dialog around open research – the current status, opportunities for advancement, and challenges that limit sharing.

The workshop examined how the internet-enabled research world has changed, and how universities need to change to adapt commensurately, aiming to understand how universities can and should make themselves competitive and attract the best students, staff, and faculty in this new world.

During the workshop, the participants re-imagined scholarship, education, and institutions for an open, networked era, to uncover new opportunities for universities to create value and serve society.

They expressed the results of these deliberations as a set of 22 principles of tomorrow’s university across six areas: credit and attribution, communities, outreach and engagement, education, preservation and reproducibility, and technologies.

Activities that follow on from workshop results take one of three forms. First, since the workshop, a number of workshop authors have further developed and published their white papers to make their reflections and recommendations more concrete.

These authors are also conducting efforts to implement these ideas, and to make changes in the university system.

Second, we plan to organise a follow-up workshop that focuses on how these principles could be implemented.

Third, we believe that the outcomes of this workshop support and are connected with recent theoretical work on the position and future of open knowledge institutions.

DOI : https://doi.org/10.12688/f1000research.17425.1

Access to academic libraries: an indicator of openness?

Authors : Chun-Kai (Karl) Huang, Lucy Montgomery, Cameron Neylon, Katie Wilson


Open access to digital research output is increasing, but academic library policies can place restrictions on public access to libraries. This paper reports on a preliminary study to investigate the correlation between academic library access policies and institutional positions of openness to knowledge.


This primarily qualitative study used document and data analysis to examine the content of library access/use policies of 12 academic institutions in eight countries. The outcomes were statistically correlated with institutional open access publication policies and practices.


We used an automated search tool together with manual searching to retrieve web-based library access policies, then categorised and counted the levels and conditions of public access. We compared scores for institutional library access features, open access features and percentages of open access publications.
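The abstract does not state which correlation statistic was used. As one possible reading of "statistically correlated", the Python sketch below computes a Pearson correlation coefficient between two score lists. The function is a generic textbook implementation, and the numbers in the usage lines are placeholder values invented for the example, not data from the study.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least two values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return cov / (spread_x * spread_y)


# Placeholder scores for four imaginary institutions (not study data).
library_access_scores = [1, 2, 3, 4]
open_access_scores = [2, 1, 4, 3]
print(round(pearson(library_access_scores, open_access_scores), 2))
```

With only 12 institutions, as in the study, any such coefficient would be sensitive to individual cases, so a rank-based measure might be an equally reasonable choice.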


Academic library policies may suggest open public access but multi-layered user categories, privileges and fees charged can inhibit access, with disparities in openness emerging between library policies and institutional open access policies.

As open access publishing options and mandates expand, physical entry and access to print and electronic resources in academic libraries are contracting. This conflicts with global library and information commitments to open access to knowledge.

URL : https://hcommons.org/deposits/item/hc:21881/