New Horizons for a Data-Driven Economy : A Roadmap for Usage and Exploitation of Big Data in Europe

Editors : José María Cavanillas, Edward Curry, Wolfgang Wahlster

In this book readers will find technological discussions on the existing and emerging technologies across the different stages of the big data value chain. They will learn about legal aspects of big data, the social impact, and about education needs and requirements.

And they will discover the business perspective and how big data technology can be exploited to deliver value within different sectors of the economy.

URL : New Horizons for a Data-Driven Economy : A Roadmap for Usage and Exploitation of Big Data in Europe


Research Data in Current Research Information Systems

Authors : Joachim Schöpfel, Hélène Prost, Violaine Rebouillat

The paper provides an overview of recent research and publications on the integration of research data in Current Research Information Systems (CRIS) and addresses three related issues, i.e. the object of evaluation, identifier schemes and conservation.

Our focus is on social sciences and humanities. As research data gradually become a crucial topic of scientific communication and evaluation, current research information systems must be able to consider and manage the great variety and granularity levels of data as sources and results of scientific research.

More empirical and, above all, conceptual work is needed to increase our understanding of the reality of research data and the way they can and should be used for the needs and objectives of research evaluation.

The paper contributes to the debate on the evaluation of research data, especially in the environment of open science and open data, and will be helpful in implementing CRIS and research data policies.


Using the Semantic Web for Rapid Integration of WikiPathways with Other Biological Online Data Resources

Authors : Andra Waagmeester, Martina Kutmon, Anders Riutta, Ryan Miller, Egon L. Willighagen, Chris T. Evelo, Alexander R. Pico

The diversity of online resources storing biological data in different formats provides a challenge for bioinformaticians to integrate and analyse their biological data.

The semantic web provides a standard to facilitate knowledge integration using statements built as triples describing a relation between two objects. WikiPathways, an online collaborative pathway resource, is now available in the semantic web through a SPARQL endpoint at

Having biological pathways in the semantic web allows rapid integration with data from other resources that contain information about elements present in pathways using SPARQL queries.

In order to convert WikiPathways content into meaningful triples we developed two new vocabularies that capture the graphical representation and the pathway logic, respectively. Each gene, protein, and metabolite in a given pathway is defined with a standard set of identifiers to support linking to several other biological resources in the semantic web.
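The idea of linking resources through shared identifiers can be illustrated with a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the "ex:" names, the predicates, and the helper functions are hypothetical and are not the WikiPathways vocabularies or a real SPARQL engine. It only shows how statements stored as triples can be pattern-matched and joined across two sources.

```python
# Hypothetical triples: (subject, predicate, object). The "ex:" names are
# invented for illustration and are NOT the WikiPathways vocabularies.
PATHWAY_TRIPLES = [
    ("ex:pathway1", "ex:hasGene", "ex:gene/101"),
    ("ex:pathway1", "ex:hasGene", "ex:gene/102"),
    ("ex:pathway2", "ex:hasGene", "ex:gene/103"),
]

# A second, independent (also hypothetical) resource that happens to use
# the same gene identifiers as subjects.
OTHER_TRIPLES = [
    ("ex:gene/101", "ex:associatedWith", "ex:disease/A"),
    ("ex:gene/103", "ex:associatedWith", "ex:disease/B"),
]


def match(triples, s=None, p=None, o=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def diseases_for_pathway(pathway):
    """Join both triple sets on the shared gene identifier."""
    found = set()
    for _, _, gene in match(PATHWAY_TRIPLES, s=pathway, p="ex:hasGene"):
        for _, _, disease in match(OTHER_TRIPLES, s=gene, p="ex:associatedWith"):
            found.add(disease)
    return sorted(found)


if __name__ == "__main__":
    print(diseases_for_pathway("ex:pathway1"))
```

A SPARQL query expresses the same kind of join declaratively, with variables in place of the None wildcards used here.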

WikiPathways triples were loaded into the Open PHACTS discovery platform and are available through its Web API to be used in various tools for drug development.

We combined various semantic web resources with the newly converted WikiPathways content using a variety of SPARQL query types and third-party resources, such as the Open PHACTS API. The ability to use pathway information to form new links across diverse biological data highlights the utility of integrating WikiPathways in the semantic web.

URL : Using the Semantic Web for Rapid Integration of WikiPathways with Other Biological Online Data Resources


Cloud-Based Big Data Management and Analytics for Scholarly Resources: Current Trends, Challenges and Scope for Future Research

Authors : Samiya Khan, Kashish A. Shakil, Mansaf Alam

With the shifting focus of organizations and governments towards digitization of academic and technical documents, there has been an increasing need to use this reserve of scholarly documents for developing applications that can facilitate and aid in better management of research.

In addition to this, the evolving nature of research problems has made them essentially interdisciplinary. As a result, there is a growing need for scholarly applications like collaborator discovery, expert finding and research recommendation systems.

This research paper reviews the current trends and identifies the challenges existing in the architecture, services and applications of big scholarly data platforms, with a specific focus on directions for future research.


Towards a paradigm for open and free sharing of scientific data on global change science in China

Authors : Changhui Peng, Xinzhang Song, Hong Jiang, Qiuan Zhu, Huai Chen, Jing M. Chen, Peng Gong, Chang Jie, Wenhua Xiang, Guirui Yu, Xiaolu Zhou

Despite great progress in data sharing that has been made in China in recent decades, cultural, policy, and technological challenges have prevented Chinese researchers from maximizing the availability of their data to the global change science community.

To achieve full and open exchange and sharing of scientific data, Chinese research funding agencies need to recognize that preservation of, and access to, digital data are central to their mission, and must support these tasks accordingly.

The Chinese government also needs to develop better mechanisms, incentives, and rewards, while scientists need to change their behavior and culture to recognize the need to maximize the usefulness of their data to society as well as to other researchers.

The Chinese research community and individual researchers should think globally and act personally to promote a paradigm of open, free, and timely data sharing, and to increase the effectiveness of knowledge development.

URL : Towards a paradigm for open and free sharing of scientific data on global change science in China


Revisiting the Data Lifecycle with Big Data Curation

Author : Line Pouchard

As science becomes more data-intensive and collaborative, researchers increasingly use larger and more complex data to answer research questions.

The capacity of storage infrastructure, the increased sophistication and deployment of sensors, the ubiquitous availability of computer clusters, the development of new analysis techniques, and larger collaborations allow researchers to address grand societal challenges in a way that is unprecedented.

In parallel, research data repositories have been built to host research data in response to the requirements of sponsors that research data be publicly available. Libraries are re-inventing themselves to respond to a growing demand to manage, store, curate and preserve the data produced in the course of publicly funded research.

As librarians and data managers are developing the tools and knowledge they need to meet these new expectations, they inevitably encounter conversations around Big Data. This paper explores definitions of Big Data that have coalesced in the last decade around four commonly mentioned characteristics: volume, variety, velocity, and veracity.

We highlight the issues associated with each characteristic, particularly their impact on data management and curation. We use the methodological framework of the data life cycle model to assess two models developed in the context of Big Data projects, and find them lacking.

We propose a Big Data life cycle model that includes activities focused on Big Data and more closely integrates curation with the research life cycle. These activities include planning, acquiring, preparing, analyzing, preserving, and discovering, with describing the data and assuring quality being an integral part of each activity.

We discuss the relationship between institutional data curation repositories and new long-term data resources associated with high performance computing centers, and reproducibility in computational science.

We apply this model by mapping the four characteristics of Big Data outlined above to each of the activities in the model. This mapping produces a set of questions that practitioners should be asking in a Big Data project.
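The shape of such a mapping can be sketched as a small grid in Python. The activity and characteristic names come from the abstract; the question wording and the function name are placeholders assumed for illustration and are not the questions formulated in the paper.

```python
from itertools import product

# Activities and characteristics as named in the abstract.
ACTIVITIES = [
    "planning", "acquiring", "preparing",
    "analyzing", "preserving", "discovering",
]
CHARACTERISTICS = ["volume", "variety", "velocity", "veracity"]


def question_grid():
    """Pair every activity with every characteristic.

    The question text is a generic placeholder, not the paper's wording.
    """
    return {
        (activity, trait): f"How does {trait} affect {activity} in this project?"
        for activity, trait in product(ACTIVITIES, CHARACTERISTICS)
    }


if __name__ == "__main__":
    for (activity, trait), question in question_grid().items():
        print(f"{activity:12s} {trait:9s} {question}")
```

Six activities crossed with four characteristics give 24 cells, each of which a project team would fill with its own concrete question.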

URL : Revisiting the Data Lifecycle with Big Data Curation


Research Data Sharing and Reuse Practices of Academic Faculty Researchers: A Study of the Virginia Tech Data Landscape

Author : Yi Shen

This paper presents the results of a research data assessment and landscape study in the institutional context of Virginia Tech to determine the data sharing and reuse practices of academic faculty researchers.

Through mapping the level of user engagement in “openness of data,” “openness of methodologies and workflows,” and “reuse of existing data,” this study contributes to the current knowledge in data sharing and open access, and supports the strategic development of institutional data stewardship.

By asking faculty researchers to reflect on sharing and reuse from the perspectives of both data producers and data users, the study reveals a significant gap between their rather limited sharing activities and the high value they perceive in reusing or repurposing data. This gap indicates that the potential value of data for future research is lost right after the original work is done.

The localized and sporadic data management and documentation practices of researchers also contribute to the obstacles they themselves often encounter when reusing existing data.

URL : Research Data Sharing and Reuse Practices of Academic Faculty Researchers: A Study of the Virginia Tech Data Landscape
