Building the evidence base for data and AI ethics

What impact is the debate on data and AI ethics having on independent public research?

Olivia Varley-Winter

23 November 2018

This autumn, following an ICO investigation, Facebook was fined £500,000 for allowing the company Cambridge Analytica to access its users’ data. This public and legal case has brought issues around ethical data use to the fore and added impetus to debate on this complex and little-understood issue.

In scoping and developing the Ada Lovelace Institute, we seek to remedy problems running through the debate on data and artificial intelligence. Coverage of ethics tends to focus heavily on anecdotes, drawn either from the promise of cutting-edge technology or from cases where the adoption of new technologies has compromised values such as privacy and dignity. It is crucial to build the evidence base so that we can articulate more strongly how data and AI work for people and society, and what their ethical dimensions are.

Over the summer, Ada hosted a roundtable on the subject of building the evidence base. Organised in partnership with The Royal Statistical Society, the roundtable brought together experts from computer science, law and human rights, political science, philosophy, industry, data science and statistics.

The discussion was wide-ranging and raised two main challenges in particular: (1) data is held by large corporations, posing challenges for independent public research, and (2) the public interest in any given case can be difficult to establish, requiring frameworks for deliberation and scrutiny.

Access to personal data for research

At the roundtable, the UK Information Commissioner, Elizabeth Denham, discussed a recent case for her office: the use of 1.6 million patient files by DeepMind Health as part of a pilot study for the Royal Free Hospital in 2017. The ICO’s intervention formed a “teaching moment” for the organisations involved: a data-driven tool developed by DeepMind Health is proving useful for clinicians at the hospital, but the ICO has made it clear that patient data protection rights are not “a price to pay”. The ICO’s undertaking included requirements to carry out a data protection impact assessment, and to communicate much more fully with patients about the use of their data for research.

Although regulation can promote ethical use of data, it is also limited by its specific remit. For example, regarding the pilot study agreed by the Royal Free Hospital, independent researchers raised concerns not only about data protection, but also about power asymmetries in the way in which data was shared. It is important to ask who maintains the public stake in insights driven by the personal data collected by private firms, and this was of interest to others at the event. Twitter, a privately owned social media platform, was discussed as an example, as it is one of the platforms of interest for research. It was said that even though the Library of Congress in Washington arranged some time ago to receive every Tweet into a large storage system, no researcher has been able to access that quantity of data. The library cited technical constraints, and Twitter was, in the meantime, seeking private commercial benefit from the data. So even this ambitious approach did not enhance public data. In the UK, an enhanced public research base is one objective of ‘Data Trusts’, which were recommended by Dame Wendy Hall with Jérôme Pesenti in last year’s independent report on AI strategy. Pilots of data trusts are underway, and wider thinking on models for these continues to develop.

The benefit of a level playing field across the public and private sectors was espoused by Hetan Shah, Executive Director of the Royal Statistical Society. Addressing the concern for public benefit from data and AI, Shah cited the provisions on data access for statistics in the Digital Economy Act 2017 as an example of how access to data held by companies can be facilitated for statistical purposes. Shah has also proposed elsewhere that the rights of companies over data might need to be further defined and limited. Could more use be made of legal powers to address the pressing needs of researchers and of regulators?

Public scrutiny

In addition to being able to monitor the impact of data and AI in society through access to information, there is also a need for research to establish the basis for public scrutiny. At the roundtable, Professor Karen Yeung, Interdisciplinary Professorial Fellow in Law, Ethics and Informatics at the Birmingham Law School, spoke of how much interpretations differ. Even privacy, which has long been a recognised human right, might mean different things to different people today. Changes in regulations, and high-profile media reports such as the Cambridge Analytica story, have led to an uptick in public consciousness of data protection and privacy issues. It is surprising, however, that facial recognition technologies have not had a similar level of attention. Limits placed on technologies will also be a result of political commitments. For example, communities and countries that are committed to democracy, human rights and fundamental freedoms would not support the same extension of state power online as happens in China. Through China’s ‘social credit’ system, companies provide loans on preferential terms, and other benefits. However, a key difference is the state’s power to blacklist people without their consent. Acting through online systems, the state can bar purchases and restrict freedoms such as travel, for reasons such as unpaid debts or unruly behaviour.

In their final remarks, participants noted that the UK is beginning to better articulate, communicate and build upon its strengths. The General Data Protection Regulation gives industry a basis for addressing data privacy, and although some cases of data sharing have sparked public backlash, others are being implemented at pace while respecting data protection law. Open data, which is data that everybody can access, is substantially supported in the UK to develop along ethical lines. People contributing actively to data work in government, NGOs and corporations are also pursuing guidance and frameworks for ethical practice. However, important elements of the debate are being missed, such as the research community’s interaction with new data sources, its access to and understanding of AI, and the establishment of a basis for public scrutiny.

Continuing dialogue

Ada’s agenda for the evidence base is ambitious: to foster rigorous research and debate on how data and AI affect society as a whole, and different groups within it. Are public fears about data and AI based on misconceptions? How do safeguards operate in practice, and how should this inform principles and guidance? Ada hopes to facilitate wide-ranging input from across relevant disciplines and sectors, and hopes that this will afford the necessary access to information that is otherwise missing.