Deep Learning in Mining Biological Data

Authors : Mufti Mahmud, M. Shamim Kaiser, T. Martin McGinnity, Amir Hussain

Recent technological advancements in data acquisition tools have allowed life scientists to acquire multimodal data from different biological application domains. Categorized into three broad types (i.e. images, signals, and sequences), these data are huge in amount and complex in nature.

Mining such an enormous amount of data for pattern recognition is a big challenge and requires sophisticated data-intensive machine learning techniques. Artificial neural network-based learning systems are well known for their pattern recognition capabilities, and lately their deep architectures, known as deep learning (DL), have been successfully applied to solve many complex pattern recognition problems.

To investigate how DL, especially its different architectures, has contributed to and been utilized in the mining of biological data pertaining to those three types, a meta-analysis has been performed and the resulting resources have been critically analysed. Focusing on the use of DL to analyse patterns in data from diverse biological domains, this work investigates different DL architectures’ applications to these data.

This is followed by an exploration of available open access data sources pertaining to the three data types along with popular open-source DL tools applicable to these data. Also, comparative investigations of these tools from qualitative, quantitative, and benchmarking perspectives are provided.

Finally, some open research challenges in using DL to mine biological data are outlined and a number of possible future perspectives are put forward.

URL : Deep Learning in Mining Biological Data

DOI : https://doi.org/10.1007/s12559-020-09773-x

Biomedical Data Sharing Among Researchers: A Study from Jordan

Authors : Lina Al-Ebbini, Omar F Khabour, Karem H Alzoubi, Almuthanna K Alkaraki

Background

Data sharing is an encouraged practice to support research in all fields. For that purpose, it is important to examine perceptions and concerns of researchers about biomedical data sharing, which was investigated in the current study.

Methods

This is a cross-sectional survey study that was distributed among biomedical researchers in Jordan, as an example of a developing country. The study survey consisted of questions about demographics and about respondents’ attitudes toward sharing of biomedical data.

Results

Among study participants, 46.9% (n=82) were positive regarding making their research data available to the public, whereas 53.1% refused the idea. The reasons for refusing to publicly share their data included “lack of regulations” (33.5%), “access to research data should be limited to the research team” (29.5%), “no place to deposit the data” (6.5%), and “lack of funding for data deposition” (6.0%).

Agreement with the idea of making data available was associated with academic rank (P=0.003). Moreover, gender (P-value=0.043) and number of publications (P-value=0.005) were associated with a time frame for data sharing (i.e., agreeing to share data before vs after publication).

Conclusion

About half of the respondents reported a positive attitude toward biomedical data sharing. Proper regulations and facilitation of data deposition can enhance data sharing in Jordan.

URL : Biomedical Data Sharing Among Researchers: A Study from Jordan

DOI : https://doi.org/10.2147/JMDH.S284294

Implementing the RDA Research Data Policy Framework in Slovenian Scientific Journals

Authors : Janez Štebe, Maja Dolinar, Sonja Bezjak, Ana Inkret

The paper aims to present the implementation of the RDA research data policy framework in Slovenian scientific journals within the project RDA Node Slovenia. The activity aimed to implement the practice of data sharing and data citation in Slovenian scientific journals and was based on internationally renowned practices and policies, particularly the Research Data Policy Framework of the RDA Data Policy Standardization and Implementation Interest Group.

Following this, the RDA Node Slovenia coordination prepared a guidance document that allowed the four pilot participating journals (from the fields of archaeology, history, linguistics and social sciences) to adjust their journal policies regarding data sharing and data citation, adapt the definitions of research data, and suggest appropriate data repositories that suit their disciplinary specifics.

The comparison of results underlines how discipline-specific the aspects of data sharing are. The pilot proved that a grassroots approach in advancing open science can be successful and well-received in the research community; however, it also pointed out several issues in scientific publishing that would benefit from a planned action on a national level.

The context of an underdeveloped data sharing culture, slow implementation of an open data strategy by the national research funder and sparse national data service infrastructure creates a unique environment for this study, the results of which can be used in similar contexts worldwide.

URL : Implementing the RDA Research Data Policy Framework in Slovenian Scientific Journals

DOI : http://doi.org/10.5334/dsj-2020-049

Investigation and Development of the Workflow to Clarify Conditions of Use for Research Data Publishing in Japan

Authors : Yasuyuki Minamiyama, Ui Ikeuchi, Kunihiko Ueshima, Nobuya Okayama, Hideaki Takeda

With the recent Open Science movement and the rise of data-intensive science, many efforts are in progress to publish research data on the web. To reuse published research data in different fields, they must be made more generalized, interoperable, and machine-readable.

Among the various issues related to data publishing, the conditions of use are directly related to their reuse potential. We show herein the types of external constraints and conditions of use in research data publishing in a Japanese context through an analysis of interview and questionnaire responses from practitioners.

Although the conditions of research data use have been discussed only in terms of their legal constraints, we organize the inclusion of the non-legal constraints and data holders’ actual requirements.

Furthermore, we develop a practical guideline for examining an effective data publishing flow with licensing scenarios. This effort can be positioned to develop an infrastructure for data-intensive science, which will contribute to the realization of Open Science.

URL : Investigation and Development of the Workflow to Clarify Conditions of Use for Research Data Publishing in Japan

DOI : http://doi.org/10.5334/dsj-2020-053

fiddle: a tool to combat publication bias by getting research out of the file drawer and into the scientific community

Authors : René Bernard, Tracey L. Weissgerber, Evgeny Bobrov, Stacey J. Winham, Ulrich Dirnagl, Nico Riedel

Statistically significant findings are more likely to be published than non-significant or null findings, leaving scientists and healthcare personnel to make decisions based on distorted scientific evidence.

Continuously expanding ‘file drawers’ of unpublished data from well-designed experiments waste resources and create problems for researchers, the scientific community and the public. There is limited awareness of the negative impact that publication bias and selective reporting have on the scientific literature.

Alternative publication formats have recently been introduced that make it easier to publish research that is difficult to publish in traditional peer reviewed journals. These include micropublications, data repositories, data journals, preprints, publishing platforms, and journals focusing on null or neutral results. While these alternative formats have the potential to reduce publication bias, many scientists are unaware that these formats exist and don’t know how to use them.

Our open source file drawer data liberation effort (fiddle) tool (RRID:SCR_017327 available at: http://s-quest.bihealth.org/fiddle/) is a match-making Shiny app designed to help biomedical researchers to identify the most appropriate publication format for their data. Users can search for a publication format that meets their needs, compare and contrast different publication formats, and find links to publishing platforms.

This tool will assist scientists in getting otherwise inaccessible, hidden data out of the file drawer into the scientific community and literature. We briefly highlight essential details that should be included to ensure reporting quality, which will allow others to use and benefit from research published in these new formats.

URL : fiddle: a tool to combat publication bias by getting research out of the file drawer and into the scientific community

DOI : https://doi.org/10.1042/CS20201125

Enforcing public data archiving policies in academic publishing: A study of ecology journals

Authors : Dan Sholler, Karthik Ram, Carl Boettiger, Daniel S Katz

To improve the quality and efficiency of research, groups within the scientific community seek to exploit the value of data sharing. Funders, institutions, and specialist organizations are developing and implementing strategies to encourage or mandate data sharing within and across disciplines, with varying degrees of success.

Academic journals in ecology and evolution have adopted several types of public data archiving policies requiring authors to make data underlying scholarly manuscripts freely available. The effort to increase data sharing in the sciences is one part of a broader “data revolution” that has prompted discussion about a paradigm shift in scientific research.

Yet anecdotes from the community and studies evaluating data availability suggest that these policies have not obtained the desired effects, both in terms of quantity and quality of available datasets.

We conducted a qualitative, interview-based study with journal editorial staff and other stakeholders in the academic publishing process to examine how journals enforce data archiving policies.

We specifically sought to establish who editors and other stakeholders perceive as responsible for ensuring data completeness and quality in the peer review process. Our analysis revealed little consensus with regard to how data archiving policies should be enforced and who should hold authors accountable for dataset submissions.

Themes in interviewee responses included hopefulness that reviewers would take the initiative to review datasets and trust in authors to ensure the completeness and quality of their datasets.

We highlight problematic aspects of these thematic responses and offer potential starting points for improvement of the public data archiving process.

URL : Enforcing public data archiving policies in academic publishing: A study of ecology journals

DOI : https://doi.org/10.1177/2053951719836258

Research Data Sharing in Spain: Exploring Determinants, Practices, and Perceptions

Authors : Rafael Aleixandre-Benavent, Antonio Vidal-Infer, Adolfo Alonso-Arroyo, Fernanda Peset, Antonia Ferrer Sapena

This work provides an overview of a Spanish survey on research data, which was carried out within the framework of the project Datasea at the beginning of 2015. It aligns with the Sustainable Development Goals (goal 9) in supporting research.

The purpose of the study was to identify the habits and current experiences of Spanish researchers in the health sciences in relation to the management and sharing of raw research data. Method: An electronic questionnaire composed of 40 questions divided into three blocks was designed.

The three sections contained questions on the following aspects: (A) personal information; (B) creation and reuse of data; and (C) preservation of data. The questionnaire was sent by email to a list of universities in Spain to be distributed among their researchers and professors. A total of 1063 researchers completed the questionnaire.

More than half of the respondents (54.9%) lacked a data management plan; nearly a quarter had storage systems for the research group; 81.5% used personal computers to store data; “Contact with colleagues” was the most frequent means used to locate and access other researchers’ data; and nearly 60% of researchers stated their data were available to the research group and collaborating colleagues.

The main fears about sharing were legal questions (47.9%), misuse or interpretation of data (42.7%), and loss of authorship (28.7%).

The results allow us to understand the state of data sharing among Spanish researchers and can serve as a basis to identify the needs of researchers to share data, optimize existing infrastructure, and promote data sharing among those who do not practice it yet.

URL : Research Data Sharing in Spain: Exploring Determinants, Practices, and Perceptions

DOI : https://doi.org/10.3390/data5020029