Openness in Big Data and Data Repositories. The Application of an Ethics Framework for Big Data in Health and Research

Authors : Vicki Xafis, Markus K. Labude

There is a growing expectation, or even requirement, for researchers to deposit a variety of research data in data repositories as a condition of funding or publication. This expectation recognizes the enormous benefits of data collected and created for research purposes being made available for secondary uses, as open science gains increasing support.

This is particularly so in the context of big data, especially where health data is involved. There are, however, also challenges relating to the collection, storage, and re-use of research data.

This paper gives a brief overview of the landscape of data sharing via data repositories and discusses some of the key ethical issues raised by the sharing of health-related research data, including expectations of privacy and confidentiality, the transparency of repository governance structures, access restrictions, as well as data ownership and the fair attribution of credit.

To consider these issues and the values that are pertinent, the paper applies the deliberative balancing approach articulated in the Ethics Framework for Big Data in Health and Research (Xafis et al. 2019) to the domain of Openness in Big Data and Data Repositories.

Please refer to that article for more information on how this framework is to be used, including a full explanation of the key values involved and the balancing approach used in the case study at the end.

URL : Openness in Big Data and Data Repositories. The Application of an Ethics Framework for Big Data in Health and Research


How ethics combine with big data: a bibliometric analysis

Authors : Marta Kuc-Czarnecka, Magdalena Olczyk

The term Big Data is becoming increasingly widespread throughout the world. Its use is no longer limited to the IT industry, quantitative scientific research, and entrepreneurship; it has also entered everyday media and conversations. The prevalence of Big Data is simply a result of its usefulness in searching, downloading, collecting and processing massive datasets.

It is therefore not surprising that the number of scientific articles devoted to this issue is increasing. However, the vast majority of research papers deal with purely technical matters. Yet, large datasets coupled with complex analytical algorithms pose the risk of non-transparency, unfairness, e.g., racial or class bias, cherry-picking of data, or even intentional misleading of public opinion, including policymakers, for example by tampering with the electoral process in the context of ‘cyberwars’.

Thus, this work implements a bibliometric analysis to investigate the development of ethical concerns in the field of Big Data. The investigation covers articles obtained from the Web of Science Core Collection Database (WoS) published between 1900 and July 2020.

A sample size of 892 research papers was evaluated using HistCite and VOSviewer software. The results of this investigation shed light on the evolution of the junction of two concepts: ethics and Big Data.

In particular, the study revealed the following findings: the topic is relatively poorly represented in the scientific literature, and interest in it is growing relatively slowly. In addition, ethical issues in Big Data are discussed mainly in the fields of health and technology.

URL : How ethics combine with big data: a bibliometric analysis



Models of Research and the Dissemination of Research Results: the Influences of E-Science, Open Access and Social Networking

Authors : Rae A. Earnshaw, Mohan de Silva, Peter S. Excell

In contrast with practice in recent times past, computational and data intensive processes are increasingly driving collaborative research in science and technology.

Large amounts of data are being generated in experiments or simulations and these require real-time, or near real-time, analysis and visualisation. The results of these evaluations need to be validated and then published quickly and openly in order to facilitate the overall progress of research on a national and international basis.

Research is increasingly undertaken in large teams and is also increasingly interdisciplinary as many of the major research challenges lie at the boundaries between existing disciplines.

The move to open access for peer reviewed publications is rapidly becoming a required option in the sector. At the same time, communication and dissemination procedures are also utilising non-traditional forms facilitated by burgeoning developments in social networking.

It is proposed that these elements, when combined, constitute a paradigm shift in the model of research and the dissemination of research results.

URL : Models of Research and the Dissemination of Research Results: the Influences of E-Science, Open Access and Social Networking


What Can a Knowledge Complexity Approach Reveal About Big Data and Archival Practice?

Author : Nicola Horsley

As one of the major technological concepts driving ICT development today, big data has been touted as offering new forms of analysis of research data. Its application has reached out across disciplines but some research sources and archival practices do not sit comfortably within the computational turn and this has sparked concerns that cultural heritage collections that cannot be structured, represented, or, indeed, digitised accordingly may be excluded and marginalised by this new paradigm.

This work-in-progress paper reports on the contribution of the KPLEX project’s knowledge complexity approach to understanding the relationship between big data and archival practice.


Understanding Big Data for Industrial Innovation and Design: The Missing Information Systems Perspective

Author : Miguel Baptista Nunes

This paper identifies a need to complement the current rich technical and mathematical research agenda on big data with a more information systems and information science strand, which focuses on the business value of big data.

An agenda of research for information systems would explore motives for using big data in real organizational contexts, and consider proposed benefits, such as increased effectiveness and efficiency, production of high-quality products/services, creation of added business value, and stimulation of innovation and design.

Impacts of such research on the academic community, the industrial and business world, and policy-makers are discussed.

URL : Understanding Big Data for Industrial Innovation and Design: The Missing Information Systems Perspective



Big Data and Data Science: Opportunities and Challenges of iSchools

Authors : Il-Yeol Song, Yongjun Zhu

Due to the recent explosion of big data, our society has been rapidly going through digital transformation and entering a new world with numerous eye-opening developments. These new trends impact society and future jobs, and thus student careers.

At the heart of this digital transformation is data science, the discipline that makes sense of big data. With many rapidly emerging digital challenges ahead of us, this article discusses perspectives on iSchools’ opportunities and suggestions in data science education.

We argue that iSchools should empower their students with “information computing” disciplines, which we define as the ability to solve problems and create values, information, and knowledge using tools in application domains.

As specific approaches to enforcing information computing disciplines in data science education, we suggest the three foci of user-based, tool-based, and application-based. These three foci will serve to differentiate the data science education of iSchools from that of computer science or business schools.

We present a layered Data Science Education Framework (DSEF) with building blocks that include the three pillars of data science (people, technology, and data), computational thinking, data-driven paradigms, and data science lifecycles.

Data science courses built on top of this framework should thus be executed with user-based, tool-based, and application-based approaches.

This framework will help our students think about data science problems from the big picture perspective and foster appropriate problem-solving skills in conjunction with broad perspectives of data science lifecycles. We hope the DSEF discussed in this article will help fellow iSchools in their design of new data science curricula.

URL : Big Data and Data Science: Opportunities and Challenges of iSchools


Digitising Cultural Complexity: Representing Rich Cultural Data in a Big Data environment

Authors : Jennifer Edmond, Georgina Nugent Folan

One of the major terminological forces driving ICT integration in research today is that of “big data.” While the phrase sounds inclusive and integrative, “big data” approaches are highly selective, excluding input that cannot be effectively structured, represented, or digitised.

Data of this complex sort is precisely the kind that human activity produces, but the technological imperative to enhance signal through the reduction of noise does not accommodate this richness.

Data and the computational approaches that facilitate “big data” have acquired a perceived objectivity that belies their curated, malleable, reactive, and performative nature. In an input environment where anything can “be data” once it is entered into the system as “data,” data cleaning and processing, together with the metadata and information architectures that structure and facilitate our cultural archives, acquire a capacity to delimit what data are.

This engenders a process of simplification that has major implications for the potential for future innovation within research environments that depend on rich material yet are increasingly mediated by digital technologies.

This paper presents the preliminary findings of the European-funded KPLEX (Knowledge Complexity) project, which investigates the delimiting effect that digital mediation and datafication have on rich, complex cultural data.

The paper presents a systematic review of existing implicit definitions of data, elaborating on the implications of these definitions and highlighting the ways in which metadata and computational technologies can restrict the interpretative potential of data.

It sheds light on the gap between analogue or augmented digital practices and fully computational ones, and the strategies researchers have developed to deal with this gap.

The paper proposes a reconceptualisation of data as it is functionally employed within digitally-mediated research so as to incorporate and acknowledge the richness and complexity of our source materials.