Deep Learning in Mining Biological Data

Authors : Mufti Mahmud, M. Shamim Kaiser, T. Martin McGinnity, Amir Hussain

Recent technological advancements in data acquisition tools have allowed life scientists to acquire multimodal data from different biological application domains. Categorized into three broad types (i.e. images, signals, and sequences), these data are huge in volume and complex in nature.

Mining such an enormous amount of data for pattern recognition is a big challenge and requires sophisticated data-intensive machine learning techniques. Artificial neural network-based learning systems are well known for their pattern recognition capabilities, and lately their deep architectures, known as deep learning (DL), have been successfully applied to solve many complex pattern recognition problems.

To investigate how DL, especially its different architectures, has contributed to and been utilized in the mining of biological data pertaining to those three types, a meta-analysis has been performed and the resulting resources have been critically analysed. Focusing on the use of DL to analyse patterns in data from diverse biological domains, this work investigates the application of different DL architectures to these data.

This is followed by an exploration of available open access data sources pertaining to the three data types along with popular open-source DL tools applicable to these data. Also, comparative investigations of these tools from qualitative, quantitative, and benchmarking perspectives are provided.

Finally, some open research challenges in using DL to mine biological data are outlined and a number of possible future perspectives are put forward.

URL : Deep Learning in Mining Biological Data

DOI : https://doi.org/10.1007/s12559-020-09773-x

Copyright in the Scientific Community. The Limitations and Exceptions in the European Union and Spanish Legal Frameworks

Author : Itziar Sobrino-García

The increased visibility and transfer of scholarly knowledge through digital environments has been accompanied by abuses of authors’ rights, such as plagiarism and fraud. For this reason, copyright is a topic of increasing importance, since it provides authors with a set of rights that enable them to exploit their work and to be recognized as its creators.

The new research methods linked to technological advances (such as data mining) and the emergence of systems such as Open Access (OA) are currently under debate.

These issues have generated legislative changes at the level of the European Union (EU) and its Member States. For this reason, it is important that researchers know how to protect their own work and how to make proper use of the work of others.

Consequently, this research aims to identify the limitations of copyright in the EU and, as a specific case, in Spain, within the framework of scientific research. To this end, the changes in the European and Spanish copyright regulations are analyzed.

The results confirm new exceptions and limitations for researchers related to technological evolution, such as data mining. Additionally, the article incorporates several guidelines and implications for the scientific community.

URL : Copyright in the Scientific Community. The Limitations and Exceptions in the European Union and Spanish Legal Frameworks

DOI : https://doi.org/10.3390/publications8020027

Copyright and the Progress of Science: Why Text and Data Mining Is Lawful

Author : Michael W. Carroll

This Article argues that U.S. copyright law provides a competitive advantage in the global race for innovation policy because it permits researchers to conduct computational analysis — text and data mining — on any materials to which they have access.

Amendments to copyright law in Japan, and the European Union’s recent addition of limitations on copyright to legalize some TDM research, implicitly acknowledge the competitive benefits provided by the fair use provision of U.S. copyright law.

Focusing only on U.S. law, this Article makes two general contributions to the literature on fair use: (1) in cases involving archiving, the user’s security precautions are relevant under the first fair use factor and should not be treated as an unenumerated factor or as part of the market harm analysis; and (2) good faith should not be a factor in fair use analysis, but even if courts do consider good faith, TDM research conducted on infringing sources, such as Sci-Hub, is still lawful because the research provides transformative benefits without causing harm to the markets that matter.

This Article also revisits the issue of temporary copies to argue that certain steps in TDM research do not make copies that “count” under U.S. law and that it is possible to design cloud-based TDM research that does not implicate U.S. copyright law at all.

This Article addresses the needs of many audiences including policymakers, courts, university counsel, research libraries, and legal scholars who seek a thorough legal analysis to support this argument.

URL : https://lawreview.law.ucdavis.edu/issues/53/2/articles/53-2_carroll.html

Challenges and opportunities in the evolving digital preservation landscape: reflections from Portico

Authors : Kate Wittenberg, Sarah Glasser, Amy Kirchhoff, Sheila Morrissey, Stephanie Orphan

There has been tremendous growth in the amount of digital content created by libraries, publishers, cultural institutions and the general public. While there are great benefits to having content available in digital form, digital objects can be extremely short-lived unless proper attention is paid to preservation.

Reflecting on our experience with the digital preservation service Portico, we provide background on Portico’s history and evolving practice of sustainable preservation of the digital artifacts of scholarly communications.

We also provide an overview of the digital preservation landscape as we see it now, with some thoughts on current requirements for preservation, and thoughts on the opportunities and challenges that lie ahead.

URL : Challenges and opportunities in the evolving digital preservation landscape: reflections from Portico

DOI : http://doi.org/10.1629/uksg.421

Exploring the feasibility of applying data mining for library reference service improvement : a case study of Turku Main Library

Author : Ming Zhan

Data mining, a hotly debated term, has been studied in various fields. Its potential for refining the decision-making process, revealing hidden patterns and creating valuable knowledge has won the attention of scholars and practitioners. However, fewer studies have sought to combine data mining with libraries, where data are generated all the time.

Therefore, this thesis aims to fill this gap. It also explores the opportunities that data mining creates for enhancing one of the most important elements of libraries: reference service. In order to thoroughly demonstrate the feasibility and applicability of data mining, the literature is reviewed to establish a critical understanding of data mining in libraries and to ascertain the current status of library reference service.

The literature review indicates that free online data resources, other than data generated on social media, are rarely considered in current library data mining work. This finding motivates the present study to make use of free online resources. Furthermore, the natural match between data mining and libraries is established.

The natural match is explained by emphasizing the data richness reality and considering data mining as one kind of knowledge, an easy choice for libraries, and a wise method to overcome reference service challenges. The natural match, especially the aspect that data mining could be helpful for library reference service, lays the main theoretical foundation for the empirical work in this study.

Turku Main Library was selected as the case to answer the research question: whether data mining is feasible and applicable for reference service improvement. In this case, the daily visits to Turku Main Library from 2009 to 2015 are used as the resource for data mining.

In addition, the corresponding weather conditions are collected from Weather Underground, which is freely available online. Before being analyzed, the collected dataset is cleansed and preprocessed in order to ensure the quality of the data mining.

Multiple regression analysis is employed to mine the final dataset. Hourly visits are the dependent variable, and weather conditions, the Discomfort Index and the seven days of the week are the independent variables. In the end, four models, one for each season, are established to predict the visiting situation in that season.
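The regression setup described above can be sketched as follows. This is only an illustrative sketch, not the thesis's actual model: it treats visit counts as the response, uses ordinary least squares via NumPy, and runs on synthetic data. The function name `fit_visits_model`, the choice of predictors (temperature, precipitation, day of week), and all coefficient values are assumptions made for illustration; the Discomfort Index and the per-season split are omitted for brevity.

```python
import numpy as np

def fit_visits_model(temperature, precipitation, weekday, visits):
    """Fit visits ~ intercept + temperature + precipitation + weekday dummies
    by ordinary least squares. Hypothetical helper, not from the thesis."""
    temperature = np.asarray(temperature, dtype=float)
    precipitation = np.asarray(precipitation, dtype=float)
    weekday = np.asarray(weekday, dtype=int)
    visits = np.asarray(visits, dtype=float)
    # Six dummy columns for weekdays 1..6; weekday 0 is the baseline,
    # which avoids perfect collinearity with the intercept column.
    dummies = (weekday[:, None] == np.arange(1, 7)[None, :]).astype(float)
    X = np.column_stack([np.ones(len(visits)), temperature, precipitation, dummies])
    coef, *_ = np.linalg.lstsq(X, visits, rcond=None)
    return coef

# Synthetic data standing in for the visit and weather records (assumed values).
rng = np.random.default_rng(0)
n = 500
temperature = rng.uniform(-10.0, 25.0, n)   # degrees Celsius
precipitation = rng.uniform(0.0, 10.0, n)   # millimetres
weekday = rng.integers(0, 7, n)             # 0 = baseline day
true_coef = np.array([120.0, -1.5, -4.0, 5.0, 10.0, 0.0, -5.0, -30.0, -60.0])
true_dummies = (weekday[:, None] == np.arange(1, 7)[None, :]).astype(float)
visits = (true_coef[0]
          + true_coef[1] * temperature
          + true_coef[2] * precipitation
          + true_dummies @ true_coef[3:]
          + rng.normal(0.0, 1.0, n))        # small observation noise

coef = fit_visits_model(temperature, precipitation, weekday, visits)
print(np.round(coef, 2))
```

Encoding the seven days as six dummy variables plus a baseline is one common design choice; fitting a separate model per season, as the study does, would amount to calling the same function on each season's subset of rows.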

Patterns are identified for the different seasons, and implications are drawn from the discovered patterns. In addition, library-climate points are generated by a clustering method, which simplifies the process for librarians of using weather data to forecast the library visiting situation. The data mining result is then interpreted from the perspective of improving reference service.

After this data mining work, the result of the case study is presented to librarians in order to collect professional opinions on the possibility of employing data mining to improve reference services. In the end, positive opinions are collected, which implies that it is feasible to utilize data mining as a tool to enhance library reference service.

URL : http://www.doria.fi/handle/10024/124215

Learning Analytics and the Academic Library: Professional Ethics Commitments at a Crossroads

Authors : Kyle M.L. Jones, Dorothea Salo

In this paper, the authors address learning analytics and the ways academic libraries are beginning to participate in wider institutional learning analytics initiatives. Since there are moral issues associated with learning analytics, the authors consider how data mining practices run counter to ethical principles in the American Library Association’s “Code of Ethics.”

Specifically, the authors address how learning analytics implicates professional commitments to promote intellectual freedom; protect patron privacy and confidentiality; and balance intellectual property interests between library users, their institution, and content creators and vendors.

The authors recommend that librarians should embed their ethical positions in technological designs, practices, and governance mechanisms.

URL : Learning Analytics and the Academic Library: Professional Ethics Commitments at a Crossroads

Alternative location : http://crl.acrl.org/index.php/crl/article/view/16603

The legal and policy framework for scientific data sharing, mining and reuse

Author : Mélanie Dulong de Rosnay

Text and Data Mining, the automatic processing of large amounts of scientific articles and datasets, is an essential practice for contemporary researchers. Some publishers are challenging it as a lawful activity, and the topic is being discussed during the European copyright law reform process.

In order to better understand the underlying debate and contribute to the policy discussion, this article first examines the legal status of data access and reuse and licensing policies. It then presents available options supporting the exercise of Text and Data Mining: publication under open licenses, open access legislations and a recognition of the legitimacy of the activity.

For that purpose, the paper analyses the scientific rationale for sharing and its legal and technical challenges and opportunities. In particular, it surveys existing open access and open data legislations and discusses implementation in European and Latin American jurisdictions.

Framing Text and Data Mining as an exception to copyright could be problematic, as it de facto denies that this activity is part of a positive right to read and should not require additional permission or licensing.

It is crucial in licenses and legislations to provide a correct definition of what is Open Access, and to address the question of pre-existing copyright agreements. Also, providing implementation means and technical support is key. Otherwise, legislations could remain declarations of good principles if repositories are acting as empty shells.

URL : https://books.openedition.org/editionsmsh/9082