Data first, method second? Big data and determinism in the social sciences (D’abord les données, ensuite la méthode ? Big data et déterminisme en sciences sociales)

Authors : Jean-Christophe Plantin, Federica Russo

While social scientists have long relied on large quantities of data, for example through questionnaire surveys, the use of massive and heterogeneous digital data, or “big data”, is increasingly common.

By abandoning theory in favor of the search for correlations, does this multitude of data give rise to a new form of determinism?

The history of the social sciences indicates, on the contrary, that the growth of available data led to the gradual rejection of a deterministic hypothesis inherited from the natural sciences, in favor of a methodological autonomy grounded in statistical modeling.

In this context, this article shows that the emphasis on the size of big data does not so much signal a return to determinism as reveal the current mismatch between the characteristics of these massive data and the methods and infrastructures of the social sciences.

URL : https://socio.revues.org/2328

Afraid of Scooping – Case Study on Researcher Strategies against Fear of Scooping in the Context of Open Science

Author : Heidi Laine

The risk of scooping is often used as a counterargument against open science, especially open data. In this case study I have examined openness strategies, practices and attitudes in two open collaboration research projects created by Finnish researchers, in order to understand what made them resistant to the fear of scooping.

The radically open approach of the projects includes open-by-default funding proposals, co-authorship and community membership. The primary sources used are interviews with the projects’ founding members.

The analysis indicates that openness requires trust in close peers, but not necessarily in the research community or society at large. Based on the case study evidence, focusing on intrinsic goals, like new knowledge and bringing about ethical reform, instead of external goals such as publications, supports openness.

Understanding the fundamentals of science, philosophy of science and research ethics can also have a beneficial effect on willingness to share. Whether there are aspects of open sharing that make it seem riskier from the point of view of certain demographic groups within the research community, such as women, could be worth closer inspection.

DOI : http://doi.org/10.5334/dsj-2017-029

Towards an Ethical Framework for Publishing Twitter Data in Social Research: Taking into Account Users’ Views, Online Context and Algorithmic Estimation

Authors : Matthew L Williams, Pete Burnap, Luke Sloan

New and emerging forms of data, including posts harvested from social media sites such as Twitter, have become part of the sociologist’s data diet. In particular, some researchers see an advantage in the perceived ‘public’ nature of Twitter posts, representing them in publications without seeking informed consent.

While such practice may not be at odds with Twitter’s terms of service, we argue there is a need to interpret these through the lens of social science research methods that imply a more reflexive ethical approach than provided in ‘legal’ accounts of the permissible use of these data in research publications.

To challenge some existing practice in Twitter-based research, this article brings to the fore: (1) views of Twitter users through analysis of online survey data; (2) the effect of context collapse and online disinhibition on the behaviours of users; and (3) the publication of identifiable sensitive classifications derived from algorithms.

DOI : http://dx.doi.org/10.1177/0038038517708140

Advancing research data publishing practices for the social sciences: from archive activity to empowering researchers

Authors : Veerle Van den Eynden, Louise Corti

Sharing and publishing social science research data have a long history in the UK, through long-standing agreements with government agencies for sharing survey data and the data policy, infrastructure, and data services supported by the Economic and Social Research Council.

The UK Data Service and its predecessors developed data management, documentation, and publishing procedures and protocols that stand today as robust templates for data publishing.

As the ESRC research data policy requires grant holders to submit their research data to the UK Data Service after a grant ends, setting standards and promoting them has been essential in raising the quality of the resulting research data being published. In the past, received data were all processed, documented, and published for reuse in-house.

Recent investments have focused on guiding and training researchers in good data management practices and skills for creating shareable data, as well as a self-publishing repository system, ReShare. ReShare also receives data sets described in published data papers and achieves scientific quality assurance through peer review of submitted data sets before publication.

Social science data are reused for research, to inform policy, in teaching and for methods learning. Over a 10-year period, responsive developments in system workflows, access control options, persistent identifiers, templates, and checks, together with targeted guidance for researchers, have helped raise the standard of self-publishing social science data.

Lessons learned and developments in shifting publishing social science data from an archivist responsibility to a researcher process are showcased, as inspiration for institutions setting up a data repository.

DOI : doi:10.1007/s00799-016-0177-3

Opening Scholarly Communication in Social Sciences by Connecting Collaborative Authoring to Peer Review

Authors : Afshin Sadeghi, Johannes Wilm, Philipp Mayr, Christoph Lange

The objective of the OSCOSS research project on “Opening Scholarly Communication in the Social Sciences” is to build a coherent collaboration environment that facilitates scholarly communication workflows of social scientists in the roles of authors, reviewers, editors and readers. This paper presents the implementation of the core of this environment: the integration of the Fidus Writer academic word processor with the Open Journal Systems (OJS) submission and review management system.

URL : https://arxiv.org/abs/1703.04428

Patent citation data in social science research: Overview and best practices

Authors : Adam B. Jaffe, Gaétan de Rassenfosse

The last 2 decades have witnessed a dramatic increase in the use of patent citation data in social science research. Facilitated by digitization of the patent data and increasing computing power, a community of practice has grown up that has developed methods for using these data to: measure attributes of innovations such as impact and originality; trace flows of knowledge across individuals, institutions and regions; and map innovation networks.

The objective of this article is threefold. First, it takes stock of these main uses. Second, it discusses 4 pitfalls associated with patent citation data, related to office, time and technology, examiner, and strategic effects. Third, it highlights gaps in our understanding and offers directions for future research.

URL : Patent citation data in social science research: Overview and best practices

Alternative location : http://onlinelibrary.wiley.com/doi/10.1002/asi.23731/full

Big data challenges for the social sciences: from society and opinion to replications

Author : Dominique Boullier

Big data dealing with the social produce predictive correlations for the benefit of brands and web platforms. Beyond “society” and “opinion”, for which the text lays out a genealogy, there appear “traces” that the social sciences must theorize as “replications” in order to reap the benefits of the uncertain status of entities’ widespread traceability.

High-frequency replications as a collective phenomenon existed before the emergence of digital networks, but they now leave traces that can be computed. The third generation of social sciences currently emerging must assume the specific nature of the world of data created by digital networks, without reducing them to the categories of the sciences of “society” or “opinion”.

Examples from recent works on Twitter and other digital corpora show how the search for structural effects or market-style trade-offs is prevalent even though insights about propagation, virality and memetics could help build a new theoretical framework.

URL : http://arxiv.org/abs/1607.05034