Causation, Correlation, and Big Data in Social Science Research

Cowls, Josh and Schroeder, Ralph (2015) Causation, Correlation, and Big Data in Social Science Research. Policy & Internet 7 (4), 447-472.

The emergence of big data not only offers a potential boon for social scientific inquiry, but also raises distinct epistemological issues for this new area of research. Drawing on interviews conducted with researchers at the forefront of big data research, we offer insight into questions of causal versus correlational research, the use of inductive methods, and the utility of theory in the big data age. While our interviewees acknowledge challenges posed by the emergence of big data approaches, they reassert the importance of fundamental tenets of social science research such as establishing causality and drawing on existing theory. They also discussed more pragmatic issues, such as collaboration between researchers from different fields, and the utility of mixed methods. We conclude by putting the themes emerging from our interviews into the broader context of the role of data in social scientific inquiry, and draw lessons about the future role of big data in research.

Big Data – What’s New(s)?

The following is a slightly edited version of a talk I gave at the Data Power conference in Sheffield this week, presenting work by myself and Ralph Schroeder.

The question of what drives news coverage far pre-dates the Internet and the rise of social media, and over the decades – or indeed the centuries – of mass media, myriad explanations have been offered in answer.

The MPs whose Wikipedia pages have been edited from inside Parliament

Grant Shapps is in the headlines after being accused of making self-serving edits to his own entry on Wikipedia, as well as unflattering changes to rivals’ pages. But he may not be the only politician giving himself a virtual facelift. Analysis of the Twitter account @parliamentedits, which tracks edits to Wikipedia made from inside the Houses of Parliament, shows other attempts to edit the online encyclopedia, many of them controversial.

Ad-hoc encounters with big data: Engaging citizens in conversations around tabletops

Fjeld, Morten, Woźniak, Paweł, Cowls, Josh and Nardi, Bonnie (2015). Ad-hoc encounters with big data: Engaging citizens in conversations around tabletops. First Monday 20 (2).

The increasing abundance of data creates new opportunities for communities of interest and communities of practice. We believe that interactive tabletops will allow users to explore data in familiar places such as living rooms, cafés, and public spaces. We propose informal, mobile possibilities for future generations of flexible and portable tabletops. In this paper, we build upon current advances in sensing and in organic user interfaces to propose how tabletops in the future could encourage collaboration and engage users in socially relevant data-oriented activities. Our work focuses on the socio-technical challenges of future democratic deliberation. As part of our vision, we suggest switching from fixed to mobile tabletops and provide two examples of hypothetical interface types: TableTiles and Moldable Displays. We consider how tabletops could foster future civic communities, expanding modes of participation originating in the Greek Agora and in European notions of cafés as locales of political deliberation.

Big Data: the New Water or the New Oil?

In definitional terms, big data is, as we are repeatedly told, a matter of volume, velocity, variety and sometimes veracity. But perhaps as a result of a fifth v, the vagueness of this definition, those discussing the present and future impact of big data on society routinely describe big data more figuratively and evocatively. Often, this metaphorical definition takes the form of a liquid. Streams of big data flow and cascade between – and sometimes leak from – organisations.

Big Data in the Humanities: lessons from papyrus and Instagram

I’m currently in Washington DC to attend the IEEE International Conference on Big Data. The first day is set aside for workshops, and I’ve just attended a really insightful one on ‘Big Humanities Data’. The diversity of work presented was immense, covering a huge sweep of history: from fragments of ancient Greek text to Instagram photos taken during the Ukraine revolution, via the Irish Rebellion of 1641 and the Spanish flu outbreak of a century ago. Nonetheless, certain patterns stuck out from many of the talks given.

The Crowd in the Cloud? Three challenges for gauging public opinion online

Cowls, Josh (2014) The Crowd in the Cloud? Three challenges for gauging public opinion online. IPP2014: Crowdsourcing for Politics and Policy, September 2014, Oxford, UK.

Much excitement surrounds the use of social sources of big data – harvested from popular networking platforms like Twitter and Facebook, as well as other forms of socially generated data including Wikipedia edits and Google searches – in the pursuit of social scientific discovery. In this paper I assess the extent to which these newly available sources of socially-generated big data can tell us about public opinion in a society at large. I draw on data from a series of interviews conducted with researchers at the forefront of big data approaches to social science, in order to outline the opportunities and issues around this area of research. In my analysis I identify three challenges to the validity of online public opinion measurement – the reliability of the data collected, the representativeness of the ‘sample’ being analysed, and the replicability of this form of public opinion research – and suggest various ways in which these challenges can be met.

Social media and public opinion: what’s new?

I’m currently writing up a paper for submission to the Internet, Politics and Policy 2014 conference to be held by the OII in September. My paper – which draws substantially on interviews conducted as part of the Sloan Foundation-funded project of which I’m part – asks whether and to what extent the measurement of public opinion has been transformed by the new availability of socially-generated sources of big data, such as social media postings and search queries, and the tools which allow us to analyse them.

Streisandfreude: how the right to be forgotten may become an excuse to be remembered

Barbra Streisand’s house in the hills, an image which survived legal efforts at suppression to give us ‘the Streisand effect’. Copyright (C) 2002 Kenneth & Gabrielle Adelman, California Coastal Records Project, http://www.californiacoastline.org.

The past fortnight saw the first ripples of reaction to the European Court of Justice’s assertion of a citizen’s ‘right to be forgotten’ online. Following the court’s ruling, Google began the implementation of a process whereby individuals can petition for the removal of links in search results to pages deemed objectionable.


Big Data in Bellagio: who counts, what counts, and how do we count?


One of the early discussions emerging at our ‘Big Data for Social Change’ at the Rockefeller Center in Bellagio surrounds how the act of capturing big data impinges on our understanding of it. There are three strands in particular which have been flagged up. Firstly, who does the counting? As Marc Ventresca has shown, the shift from ecclesiastical to secular authority in the collection of data affected perceptions of society, for example shifting the focus to the individual from the collective. The national census is not an impassive, aloof process but rather a culturally and politically significant object, reflecting and reinforcing societal debate and conflict. This significance is reflected in the 1918 observation that “the science of statistics is the chief instrumentality through which the progress of civilization is now measured, and by which its development hereafter will be largely controlled”.