Open science-based framework to reveal open data publishing: an experience from using Common Crawl

Authors : Andreiwid Correa, Israel Fernandes

Publishing open data is considered a key element of civic participation, paving the way to ‘public value’, a term that underpins social contribution. One result is the popularity of data portals published around the world by governments and by public and private organizations.

However, the spread of data portals raises concerns about the discoverability and validity of these data sources, and especially about the extent to which they contribute to open data and open science.

The purpose of this work is to develop a framework that reveals open data publishing by using Common Crawl, a freely available open science project. The idea is to identify open data-related initiatives and to gather information about their availability. At its core, the framework is an iterative and differential process.
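
As an illustration only, one possible reading of an "iterative and differential process" is to compare the sets of portal hosts observed in successive crawl snapshots and record what appeared, disappeared, or persisted. The sketch below assumes that reading. The names `SnapshotDiff`, `diff_snapshots`, and `iterate_snapshots`, the snapshot labels, and the host names are all hypothetical and are not taken from the paper, and no Common Crawl API is called.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SnapshotDiff:
    """Difference between two consecutive snapshots (hypothetical structure)."""
    added: frozenset       # hosts seen now but not in the previous snapshot
    removed: frozenset     # hosts seen previously but not now
    persisting: frozenset  # hosts seen in both snapshots


def diff_snapshots(previous, current):
    """Differential step: compare two collections of portal hosts."""
    prev, curr = frozenset(previous), frozenset(current)
    return SnapshotDiff(
        added=curr - prev,
        removed=prev - curr,
        persisting=prev & curr,
    )


def iterate_snapshots(snapshots):
    """Iterative step: diff each snapshot against the one before it.

    `snapshots` is a chronologically ordered list of (label, hosts) pairs.
    Returns a list of (label, SnapshotDiff), one entry per snapshot after the first.
    """
    history = []
    for (_, prev_hosts), (label, curr_hosts) in zip(snapshots, snapshots[1:]):
        history.append((label, diff_snapshots(prev_hosts, curr_hosts)))
    return history


if __name__ == "__main__":
    # Made-up snapshot labels and hosts, for illustration only.
    example = [
        ("snapshot-1", {"data.a.example", "data.b.example"}),
        ("snapshot-2", {"data.b.example", "data.c.example"}),
    ]
    for label, diff in iterate_snapshots(example):
        print(label, sorted(diff.added), sorted(diff.removed), sorted(diff.persisting))
```

Storing only the per-snapshot differences, rather than full host lists, is one way such a process could feed a historical record of portal availability; whether the authors' framework does this is not stated here.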

The main outcome is a proposed model for a historical data repository, which involves both the use and the creation of open science and opens up new research possibilities based on the publishing of derived data.

URL : https://hal.archives-ouvertes.fr/hal-02544245