Over the past few years debates about data have frequently made headline news. To help us to better understand data, including its uses and ethical implications, various analogies have been used. Although analogies can help us get a better grasp of this complex issue, we should be wary of the limitations of these comparisons.
Beyond data ownership:
One popular analogy has been the comparison between oil and data. In 2014, the Chief Executive of Intel argued: “Data, I look at it as the new oil. It’s going to change most industries across the board.” Others have suggested that, like oil in the 18th century, data is a largely untapped asset held by companies – with organisations and individuals now jockeying to understand how best to make the most of it.
Data as a public good:
However, in her keynote speech on data stewardship at a recent Ada Lovelace Institute seminar, Professor Diane Coyle argued that data is not the new oil. Coyle says that, instead, data is a public good – an asset to be provided for and accessible to all members of society. One feature of data is that it can be used in a non-rivalrous fashion: if I’m using up oil, no one else can simultaneously use that same oil. Data, on the other hand, can be used by others to serve different purposes and functions – like air or free-to-air broadcast television.
To draw out this point, consider the statement ‘I give you my data’, and compare it to the statement ‘I give you some oil’. What are these two statements doing? Quite different things – both describe a relationship between you and me, but in one instance I have a thing which I transfer and lose, and in the other I am sharing knowledge which you may or may not find valuable and useful, at no additional cost to me. Not only is the thing we are talking about different, but the way we (and others) relate to it also differs.
Another feature of a public good, as pointed out by Peter Mills from the Nuffield Council on Bioethics, is that it is non-excludable. That is to say, it is impossible to exclude individuals from using or deploying most pure public goods. Data is not the new oil not simply because oil is a finite resource and data is not, but also because we attach certain normative associations to data that mean we conceive of at least some of it as open or free at the point of use.
However, if data is a public good, are we suggesting that privacy and data protection are not important when it comes to personal data?
Coyle made the case that we need a clearer understanding that different types of data exist, that data is used in different ways and that the creation of a ‘data taxonomy’ can add value to data. The OECD suggests considering whether data is public or private, where the data comes from (e.g., user or machine generated) and whether the data is open or closed access.
An unusual feature of data is that it relies upon other data to become valuable. Here too, the analogy with oil (and even air) breaks down. And, unlike many other public goods, it is not the quantity of it that matters quite so much as the quality and the use to which it is put. Coyle pointed out that there is a stark lack of consensus amongst businesses themselves about how best to measure the value of their data. Is data like wine (improving over time), like fish (deteriorating over time), or rather more uncertain than either of those analogies (like Schrödinger’s cat)? The value of data depends on a range of variables – who is looking at it, its purpose, its context, who else has access to it and the time at which one has access to it – and there is no clear answer about how best to measure its economic value.
The notion of stewardship may prove helpful here. The most compelling feature of stewardship is that it describes a good or an asset that is held on behalf of, or on trust for, others. Those who are stewards of homes, for instance, never really own the home in the private ownership sense. Rather, the notion of stewardship has embedded within it a set of responsibilities and obligations to others, implicitly recognising that the asset is being used and looked after to pursue the best possible outcome for others – both at an individual and societal level. We have historically used the analogy of stewardship to describe our relationship to the environment (held on trust by us for others – citizens in the here and now, as well as citizens in the future) – in much the same way, we are beginning to use the analogy of stewardship for data.
Data citizenship and data rights:
This analogy also has its limitations. Data is more than simply an asset subject to governance: it is also shaping the fundamental nature of what it means to be a citizen (through, for instance, our individuality, and our individual visibility to each other and to public bodies through our interactions online). Ivana Bartoletti, Head of Privacy and Data Protection at Gemserv, has argued that data is an essential part of citizen identity in the modern world; data isn’t just about us – it is us.
In this sense, data is creating new ‘public and private’ spaces, whilst also shaping the fundamental rights that we hold as citizens and members of a democracy. Nowhere is this more apparent than in the debate about who is included or who holds data citizenship, as well as who is excluded. So, for instance, Emma Prest, Executive Director at DataKind UK, highlighted that, in reality, people do not care very much about data, despite the impact data has on people’s lives. As digital citizens, we have a moral and political obligation to others in exchange for the rights we have over our own data.
Towards a plurality of conceptual frames:
When we engage in intellectual debate about data we are working (and indeed struggling) with several conceptual frames that are, in isolation, incomplete. But that does not mean they are not useful: data as a public good, data stewardship, data citizenship and data rights are all concepts which enable us to have a valuable conversation about the role, relevance and purpose of data and its different features and dimensions. This is a point well made by cognitive linguists George Lakoff and Mark Johnson (2003), who stated: “Our ordinary conceptual system, in terms of which we both think and act, is fundamentally metaphorical in nature.” Data is no exception to this rule. Metaphors do not simply shape the way we see data; they also shape the way we act on data and think about its role in the world.
So perhaps, having abandoned the notion of ‘data ownership’, we should not rush to adopt yet another analogy that would most perfectly explain the use of data to the exclusion of all others. We might just find that isn’t possible.
We are grateful to all those who participated in discussions at our seminar hosted by the Nuffield Foundation and the Ada Lovelace Institute on data stewardship, as well as the question of how best data could be used to serve the common good.