Realising data’s potential is complex. Data can help solve some of the world’s most pressing challenges, but it can also be used in ways which negatively impact people and society. Organisations and governments are right to turn to data to tackle major societal problems, like combatting the COVID-19 pandemic or supporting cities to become smarter and more innovative. But this must be offset by efforts to develop methods for the responsible management of data on behalf of data subjects.
Achieving both goals simultaneously – using data for good causes and protecting data subjects’ rights – is not straightforward.
Defining ‘good’ causes is a subjective, moving target that can only be achieved through delicate deliberation, and protecting data rights requires constant navigation of regulatory and ethical obligations. And even if both these hurdles are overcome, there are often technical trials involved in getting data out of siloes and opening it up for research and innovation.
New models for data governance are welcome and well-intentioned, but because they are so novel, there are as many variations on them as there are people conceptualising them. This is promising; innovation requires experimentation and broad thinking. But among these many approaches lies a risk that some proposed models are treated as a panacea for the thorny questions we should be asking about data governance.
Within this landscape, some researchers are exploring ideas of ‘data stewardship’ and drawing on a set of principles put forward by the Nobel Prize-winning economist Elinor Ostrom.
Ostrom’s work was devoted to governing the commons, and she outlined eight principles for the appropriate stewardship of common resources. Though these principles are often applied to shared natural resources like pastures, forests or fisheries, applying them to data can help us think through the thorny challenges of doing good with data.
In particular, Ostrom’s framework can help us think about two issues that go to the heart of the complexities of data governance: consent and privacy.
Consent is not control
Some novel models for data governance aim to offer data subjects more control by letting them choose how data about them is used, often in return for money.
For example, two platforms, CitizenMe and Geens, offer data subjects the ability to choose what data they share, with whom and for what purposes. This process is facilitated by the platform, which acts in a similar manner to some models described as ‘data trusts’ by managing access to data according to users’ choices and allocating reimbursement. Initiatives like this have considerable merit – they aim to offer more power to citizens and can better serve data subjects’ interests – though there are substantial critiques around receiving monetary compensation for personal data, as we recently explored in a data ownership panel at RightsCon 2020.
However, this type of model does not fully empower people. Instead, it only provides a more fine-grained version of informed consent. In effect, rather than consenting once to all uses of data, users can iteratively consent to different uses at different times. This suggests better privacy-by-design, as guided by the GDPR, but it is far from enabling data subjects to determine things like the aim or purpose of the data use, the nature of the data access agreement or the sanctions if those accessing the data break the rules.
Fine-grained consent and privacy-by-design are commendable, but they do not offer individuals genuine control. Greater authority to consent does not convey actual decision-making power or agency.
To give data subjects this agency, two of Ostrom’s principles can guide better data stewardship. Principle 2 states that when managing a common resource there should be a balance between what stakeholders must contribute and what they can use, and Principle 3 states that stakeholders must be involved in making decisions about and setting rules for the management and use of shared resources.
Principles like these, when applied to a dataset or data institution, would give all stakeholders (whether data subjects or data processors) an equal say in the rules which govern datasets, enable all stakeholders to determine the model for data access and give them each a say in the purposes for which the data is used. They would also create a balance between how much data processors can access and what they must offer in return.
If implemented, such principles could guide models of data stewardship that go beyond offering fine-grained consent and move towards giving data subjects greater control over how data about them is used. We see this in initiatives like Genomics England, which includes patients on the data access review panel, going a long way towards placing decision-making power in the hands of data subjects.
Privacy vs innovation
Another key goal of novel data governance models is to open up data for innovation or research while ensuring that the privacy and rights of data subjects are protected.
However, some efforts to promote the use of data for good while protecting data rights and privacy create an unwitting tension: they place social good and the individual right to privacy at odds with one another.
This happens in some medical research settings, where maintaining the anonymity of the data subject contends with the needs of research. A patient’s address, age, gender, ethnicity or unique medical history might be crucial to understanding the spread of disease. But from this data, their identity could be pieced together and used to reveal sensitive information, even if their name and other personal data were excluded.
In this situation, it’s easy to assume that using data to discover new medical treatments (or for any similar research and innovation) is in tension with protecting users’ data privacy. But this prompts the (wrong) question: ‘Which is more important – improving health or data rights?’
In research to engage members of the public around the fair use of health data, we’ve found this to be a false dichotomy. Many people are comfortable with, and even supportive of, the idea that data about them can be used for ‘good’ causes, whether medical, environmental or social. But they are concerned that the data won’t just benefit data subjects or wider society, and instead, it will unjustly benefit other (already powerful) actors at data subjects’ and society’s expense.
This is because people place trust not just in a technology, but in the whole system in which it’s deployed. That trust is affected by the actions and intentions of the big tech companies that regularly make controversial headlines. It’s also affected by the perception of pharmaceutical giants, insurance companies, government representatives, law enforcement institutions, the relatively unknown technology firms that sit within data pipelines and more.
People fear that some of those actors may infringe on individuals’ privacy, use data for malign means or make unfair profit instead of striving to achieve goals like reaching carbon zero or developing new medical treatments. Part of this fear stems from the fact that what data subjects might consider to be ‘good’ is different to how data processors may define it, especially if individuals have no say in that definition. And though there is no single definition of ‘good’ (and perhaps nor should there be), data subjects deserve to express what world they want to see and how they think data should help build it.
To resolve this challenge, we should remember that many people’s concerns lie in the purpose or outcome of data use. If data is used in a way that is genuinely responsible, privacy-preserving, trustworthy and aimed towards public good, most people are supportive, as long as those who still choose to opt out can do so with no fine print and no negative consequences. Several of Ostrom’s principles can apply here: Principle 1 calls for clear boundaries around the common resource; Principle 4 calls for effective monitoring; and Principle 5 states there must be sanctions and accountability.
Applied to data stewardship, these principles would ensure a clear, stated purpose for the use of data and require mechanisms to hold those accessing data to account. If rules are broken, sanctions would be enforced – like fines or rescinded access to data (as well as data deletion, of course). In addition, Ostrom’s Principle 8 calls for external oversight where necessary, such as overarching legislation about data use (e.g. the GDPR).
Principles like these would give data subjects the confidence to know that the data about them is being genuinely used for a vision of social good that they agree with, as well as offering safeguards against malign secondary uses or risks of breaches.
When data subjects know data is being used for purposes they support, and there are robust mechanisms to ensure that and to allow opt-out, the tension between data for good and data privacy is resolved. And we know this works in practice: initiatives like UK Biobank or the Consumer Data Research Centre enact principles along these lines and consequently secure the trust which underpins their use of data for good.
Understanding data stewardship
As part of Ada’s Rethinking Data programme, we aim to contribute to the landscape of research exploring innovative models of data governance.
In the coming weeks, we’ll be publishing a set of case studies that maps Ostrom’s principles against real-world examples where data is stewarded for social causes or on behalf of data subjects (we’ve drawn on some of these cases here). We’ll then follow this with a report that expands the case study analysis and engages critically with the notion of data stewardship.
We’re careful not to present these principles for data stewardship as yet another panacea for data governance. Ostrom’s principles do not fit neatly to data and must not be applied uncritically. That’s why we’ve chosen to draw from case studies and explore practices from empirical cases, rather than prescribe practices from the theory. In this, we follow an ethos that has been described as Ostrom’s Law: ‘A resource arrangement that works in practice can work in theory.’ Here, Ostrom’s work is a guide by which to orientate ourselves and to place solidarity at the core of data governance.
We are not the only organisation working on the topic of novel data governance. As well as those referenced already, GovLab is developing concrete ideas for how to embed and embody stewardship of data, Aapti Institute has put data into the context of different models of governance, and Understanding Patient Data has explored a model for learning health data governance.
All these perspectives – and more – are needed to develop models for the governance of data that work for people and society.
This article is the first in a series about data stewardship. Across the series, researchers and practitioners working in different organisations and contexts, who each have a unique perspective on data stewardship, will share practical experience and research ideas.
It’s not possible or desirable for one person or organisation to decide what a ‘good’ use of data is. That’s why we hope this series and our research will help push forward thinking on how to govern data for good and ensure diverse voices contribute to defining it.