
As part of this work, the Ada Lovelace Institute, the University of Exeter’s Institute for Data Science and Artificial Intelligence, and the Alan Turing Institute developed six mock AI and data science research proposals that represent hypothetical submissions to a Research Ethics Committee. An expert workshop found that case studies of this kind are useful training resources for understanding common ethical challenges in AI and data science. These case studies are designed to prompt reflection on common research ethics issues and on the societal implications of different AI and data science research projects. They are intended for use by students, researchers, members of research ethics committees, funders and other actors in the research ecosystem, to help them develop their ability to spot and evaluate common ethical issues in AI and data science research.


 

Executive summary

Research in the fields of artificial intelligence (AI) and data science is often quickly turned into products and services that affect the lives of people around the world. Research in these fields shapes the provision of public services like social care, determines which information is amplified on social media and what jobs or insurance people are offered, and even informs who is deemed a risk to the public by police and security services. The volume of AI and data science research has increased significantly in the last ten years, and these methods are now being applied in other scientific domains like history, economics, health sciences and physics.

Figure 1: Number of AI publications in the world 2010-21[1]

Globally, the volume of AI research is increasing year-on-year and currently accounts for more than 4% of all published research.

Since products and services built with AI and data science research can have substantial effects on people’s lives, it is essential that this research is conducted safely and responsibly, and with due consideration for the broader societal impacts it may have. However, the traditional research governance mechanisms that are responsible for identifying and mitigating ethical and societal risks often do not address the challenges presented by AI and data science research.

As several prominent researchers have highlighted,[2] inadequately reviewed AI and data science research can create risks that are carried downstream into subsequent products,[3] services and research.[4] Studies have shown these risks can disproportionately impact people from marginalised and minoritised communities, exacerbating racial and societal inequalities.[5] If left unaddressed, unexamined assumptions and unintended consequences (paid forward into deployment as ‘ethical debt’[6]) can lead to significant harms to individuals and society. These harms can be challenging to address or mitigate after the fact.

Ethical debt also poses a risk to the longevity of the field of AI: if researchers fail to demonstrate due consideration for the broader societal implications of their work, it may reduce public trust in the field. This could lead to it becoming a domain that future researchers find undesirable to work in – a challenge that has plagued research into nuclear power and the health effects of tobacco.[7]

To address these problems, there have been increasing calls from within the AI and data science research communities for more mechanisms, processes and incentives for researchers to consider the broader societal impacts of their research.[8]

In many corporate and academic research institutions, one of the primary mechanisms for assessing and mitigating ethical risks is the use of Research Ethics Committees (RECs), also known in some regions as Institutional Review Boards (IRBs) or Ethics Review Committees (ERCs). Since the 1960s, these committees have been empowered to review research before it is undertaken and to reject proposals or require changes to the proposed research design.

RECs generally consist of members of a specific academic department or corporate institution, who are tasked with evaluating research proposals before the research begins. Their evaluations are based on a combination of normative and legal principles that have developed over time, originally in relation to biomedical human subjects research. A REC’s role is to help ensure that researchers justify their decisions about how research is conducted, thereby mitigating the potential harms the research may pose.

However, the current role, scope and function of most academic and corporate RECs are insufficient for the myriad of ethical challenges that AI and data science research can pose. For example, the scope of REC reviews is traditionally limited to research involving human subjects. This means that the many AI and data science projects that do not involve a direct intervention in the body or life of an individual human subject are exempt from many research ethics review processes.[9] In addition, a significant amount of AI and data science research involves the use of publicly available and repurposed datasets, which are considered exempt from ethics review under many current research ethics guidelines.[10]

If AI and data science research is to be done safely and responsibly, RECs must be equipped to examine the full spectrum of risks, harms and impacts that can arise in these fields.

In this report, we explore the role that academic and corporate RECs play in evaluating AI and data science research for ethical issues, and also investigate the kinds of common challenges these bodies face.

The report draws on two main sources of evidence: a review of existing literature on RECs and research ethics challenges, and a series of workshops and interviews with members of RECs and researchers who work on AI and data science ethics.

Challenges faced by RECs

Our evaluation of this evidence uncovered six challenges that RECs face when addressing AI and data science research:

Challenge 1: Many RECs lack the resources, expertise and training to appropriately address the risks that AI and data science pose.  

Many RECs in academic and corporate environments struggle with inadequate resources and training on the variety of issues that AI and data science can raise. The work of RECs is often voluntary and unpaid, meaning that members of RECs may not have the requisite time or training to appropriately review an application in its entirety. Studies suggest that RECs are often viewed by researchers as compliance bodies rather than mechanisms for improving the safety and impact of their research.

Challenge 2: Traditional research ethics principles are not well suited for AI research.

RECs review research using a set of normative and legal principles that are rooted in biomedical, human-subject research practices, which assume a researcher-subject relationship rather than a researcher-data subject relationship. This distinction has strained traditional principles of consent, privacy and autonomy in AI research, and created confusion for RECs trying to apply these principles to novel forms of research.

Challenge 3: Specific principles for AI and data science research are still emerging and are not consistently adopted by RECs.

The last few years have seen the emergence of a series of AI ethics principles aimed at the development and deployment of AI systems. However, these principles have not been well adapted for AI and data science research practices, signalling a need for institutions to translate them into actionable questions and processes for ethics reviews.

Challenge 4: Multi-site or public-private partnerships can exacerbate existing challenges of governance and consistency of decision-making.

An increasing amount of AI research involves multi-site studies and public-private partnerships. This can lead to multiple REC reviews of the same research, which can expose differing standards of ethical review across institutions and present a barrier to completing timely research.

Challenge 5: RECs struggle to review potential harms and impacts that arise throughout AI and data science research.

REC reviews of AI and data science research are ex ante assessments, done before research takes place. However, many of the harms and risks in AI research may only become evident at later stages of the research. Furthermore, many of the types of harms that can arise – such as issues of bias, or wider misuses of AI or data – are challenging for a single committee to predict. This is particularly true with the broader societal impacts of AI research, which require a kind of evaluation and review that RECs currently do not undertake.

Challenge 6: Corporate RECs lack transparency in relation to their processes.

Motivated by a concern to protect their intellectual property and trade secrets, many private-sector RECs for AI research do not make their processes or decisions publicly accessible, and use strict non-disclosure agreements to control the involvement of external experts in their decision-making. In some extreme cases, this lack of transparency has raised suspicion of corporate REC processes among external research partners, which can pose a risk to the efficacy of public-private research partnerships.

Recommendations

To address these challenges, we make the following recommendations:

For academic and corporate RECs

Recommendation 1: Incorporate broader societal impact statements from researchers.

A key issue this report identifies is the need for RECs to incentivise researchers to engage more reflexively with the broader societal impacts of their research, such as its potential environmental impacts, or how it could be used to exacerbate racial or societal inequalities.

There have been growing calls within the AI and data science research communities for researchers to incorporate these considerations in various stages of their research. Some researchers have called for changes to the peer review process to require statements of potential broader societal impacts,[11] and some AI/machine learning (ML) conferences have experimented with similar requirements in their conference submission process.[12]

RECs can support these efforts by incentivising researchers to engage in reflexive exercises to consider and document the broader societal impacts of their research. Other actors in the research ecosystem (funders, conference organisers, etc.) can also incentivise researchers to engage in these kinds of reflexive exercises.

Recommendation 2: RECs should adopt multi-stage ethics review processes for high-risk AI and data science research.

Many of the challenges that AI and data science research poses arise at different stages of a project. RECs should experiment with requiring multiple evaluations of research that raises particular ethical concern, such as one evaluation at the point of data collection and a separate evaluation at the point of publication.

Recommendation 3: Include interdisciplinary and experiential expertise in REC membership.

Many of the risks that AI and data science research pose cannot be understood without engagement with different forms of experiential and subject-matter expertise. RECs must be interdisciplinary bodies if they are to address the myriad of issues that AI and data science can pose in different domains, and should incorporate the perspectives of individuals who will ultimately be impacted by the research.

For academic/corporate research institutions

Recommendation 4: Create internal training and knowledge-sharing hubs for researchers and REC members, and enable more cross-institutional knowledge sharing.

These hubs can provide opportunities for cross-institutional knowledge-sharing and ensure institutions do not develop standards of practice in silos. They should collect and share information on the kinds of ethical issues and challenges AI and data science research might raise, including case studies of research that raises challenging ethical issues. In addition to our report, we have developed a resource consisting of six case studies that we believe highlight some of the common ethical challenges that RECs might face.[13]

Recommendation 5: Corporate labs must be more transparent about their decision-making and do more to engage with external partners.

Corporate labs face specific challenges when it comes to AI and data science reviews. While many are better resourced and have experimented with broader societal impact thinking, some of these labs have faced criticism for being opaque about their decision-making processes. Many of these labs make consequential decisions about their research without engaging with local, technical or experiential expertise that resides outside their organisation.

For funders, conference organisers and other actors in the research ecosystem

Recommendation 6: Develop standardised principles and guidance for AI and data science research.

RECs currently lack standardised principles for evaluating AI and data science research. National research governance bodies like UKRI should work to create a new set of ‘Belmont 2.0’ principles[14] that offer standardised approaches, guidance and methods for evaluating AI and data science research. The development of these principles should draw on a wide set of perspectives from different disciplines and from communities who are impacted by AI and data science research, including multinational perspectives – particularly from regions that have been historically underrepresented in the development of past research ethics principles.

Recommendation 7: Incentivise a responsible research culture.

AI and data science researchers lack incentives to reflect on and document the societal impacts of their research. Different actors in the research ecosystem can encourage ethical behaviour – funders, for example, can require researchers to produce a broader societal impact statement for their research in order to receive a grant, and conference organisers and journal editors can encourage researchers to include a broader societal impact statement when submitting research. By creating incentives throughout the research ecosystem, ethical reflection can become more desirable and rewarded.

Recommendation 8: Increase funding and resources for ethical reviews of AI and data science research.

There is an urgent need for institutions and funders to support RECs, including paying for the time of staff and funding external experts to engage in questions of research ethics.

Introduction

The academic fields of AI and data science research have witnessed explosive growth in the last two decades. According to the Stanford AI Index, between 2015 and 2020 the number of AI publications on the open-access publication database arXiv grew from 5,487 to over 34,376 (see also Figure 1). As of 2019, AI publications represented 3.8% of all peer-reviewed scientific publications, an increase from 1.3% in 2011.[15] The vast majority of research appearing in major AI conferences comes from academic and industry institutions based in the European Union, China and the United States of America.[16] AI and data science techniques are also being applied across a range of other academic disciplines such as history,[17] economics,[18] genomics[19] and biology.[20]

Compared to many other disciplines, AI and data science have a relatively fast research-to-product pipeline and relatively low barriers for use, making these techniques easily adaptable (though not necessarily well suited) to a range of different applications.[21] While these qualities have led AI and data science to be described as ‘more important than fire and electricity’ by some industry leaders,[22] there have been increased calls from members of the AI research community to require researchers to consider and address ‘failures of imagination’[23] of the potential broader societal impacts and risks of their research.

Figure 2: The research-to-product timeline

This timeline shows how short the research-to-product pipeline for AI can be. It took less than a year from the release of initial research in 2020 and 2021, exploring how to generate images from text inputs, to the first commercial products selling these services.

The sudden growth of AI and data science research has exacerbated challenges for traditional research ethics review processes, and highlighted that they are poorly set up to address questions of the broader societal impact of research. Several high-profile instances of controversial AI research have passed institutional ethics review, including image recognition applications that claim to identify homosexuality[24] or criminality[25] – claims that revive the discredited practices of physiognomy[26] and phrenology.[27] Corporate labs have also seen high-profile examples of unethical research being approved, including a Microsoft chatbot capable of spreading disinformation,[28] and a Google research paper that contributed to the surveillance of China’s Uighur population.[29]

In research institutions, the role of assessing research ethics issues tends to fall on Research Ethics Committees (RECs), also known in some regions as Institutional Review Boards (IRBs) or Ethics Review Committees (ERCs). Since the 1960s, these committees have been empowered to reject research before it is undertaken, or to require changes to the proposed research design.

These committees generally consist of members of a specific academic department or corporate institution, who are responsible for evaluating research proposals before the research begins. Their evaluations combine normative and legal principles, originally linked to biomedical human subjects research, that have developed over time.

Traditionally, RECs consider only research involving human subjects, and only questions concerning how the research will be conducted. While they are not the only ‘line of defence’ against unethical practices in research, they are the primary actor responsible for mitigating potential harms to research subjects in many forms of research.

The increasing prominence of AI and data science research poses an important question: are RECs well placed and adequately set up to address the challenges that AI and data science research pose? This report explores the challenges that public and private-sector RECs face in evaluating research ethics and broader societal impact issues in AI and data science research.[30] In doing so, it aims to help institutions that are developing AI research review processes take a holistic and robust approach for identifying and mitigating these risks. It also seeks to provide research institutions and other actors in the research ecosystem – funders, journal editors and conference organisers – with specific recommendations for how they can address these challenges.

This report seeks to address four research questions:

  1. How are RECs in academia and industry currently structured? What role do they play in the wider research ecosystem?
  2. What resources (e.g. moral principles, legal guidance, etc.) are RECs using to guide their reviews of research ethics? What is the scope of these reviews?
  3. What are the most pressing or common challenges and concerns that RECs are facing in evaluations of AI and data science research?
  4. What changes can be made so that RECs and the wider AI and data science research community can better address these challenges?

To address these questions, this report relied on a review of the literature on RECs, research ethics and broader societal impact questions in AI. The report also draws on a series of workshops with 42 members of public and private AI and data science research institutions in May 2021, along with eight interviews with experts in research ethics and AI issues. More information on our methodology can be found in ‘Methodology and limitations’.

This report begins with an introduction to the history of RECs, how they are commonly structured, and how they commonly operate in corporate and academic environments for AI and data science research. The report then discusses six challenges that RECs face – some of which are longstanding issues, others of which are exacerbated by the rise of AI and data science research. We conclude the paper with a discussion of these findings and eight recommendations for actions that RECs and other actors in the research ecosystem can take to better address the ethical risks of AI and data science research.

Context for Research Ethics Committees and AI research

This section provides a brief history of modern research ethics and Research Ethics Committees (RECs), discusses their scope and function, and highlights some differences between how they operate in corporate and academic environments. It places RECs in the context of other actors in the ‘AI research ecosystem’, such as organisers of AI and data science conferences, or editors of AI journal publications who set norms of behaviour and incentives within the research community. Three key points to take away from this chapter are:

  1. Modern research ethics questions are mostly focused on ethical challenges that arise in research methodology, and exclude consideration of the broader societal impacts of research.
  2. Current RECs and research ethics principles stem from biomedical research, which analyses questions of research ethics through a lens of patient-clinician relationships and is not well suited for the more distanced relationship in AI and data science between a researcher and data subject.
  3. Academic and corporate RECs in AI research share common aims, but with some important differences. Corporate AI labs tend to have more resources, but may also be less transparent about their processes.

What is a REC, and what is its scope and function?

Every day, RECs review applications to undertake research, assessing them for potential ethical issues that may arise. Broadly defined, RECs are institutional bodies made up of members of an institution (and, in some instances, independent members outside that institution) who are charged with evaluating applications to undertake research before it begins. They make judgements about the suitability of research, and have the power to approve researchers to go ahead with a project or request that changes are made before research is undertaken. Many academic journals and conferences will not publish or accept research that has not passed review by a Research Ethics Committee (though, as we will discuss below, not all research requires review).

RECs operate with two purposes in mind:

  1. To protect the welfare and interests of prospective and current research participants and minimise risk of harm to them.
  2. To promote ethical and societally valuable research.

In meeting these aims, RECs traditionally conduct an ex ante evaluation only once, before a research project begins. In understanding what kinds of ethical questions RECs evaluate for, it is also helpful to disentangle three distinct categories of ethical risks in research:[31]

  1. Mitigating research process harms (often confusingly called ‘research ethics’).
  2. Research integrity.
  3. Broader societal impacts of research (also referred to as Responsible Research and Innovation, or RRI).

The scope of REC evaluations is focused entirely on mitigating the ethical risks arising from research methodology, such as how the researcher intends to protect the privacy of a participant, anonymise their data or ensure they have received informed consent.[32] In their evaluations, RECs may look at whether the research poses a serious risk to the interests and safety of research subjects, or whether the researchers are operating in accordance with local laws governing data protection and intellectual property ownership of any research findings.

REC evaluations may also probe whether the researchers have assessed and minimised potential harm to research participants, and seek to balance this against the benefits of the research for society at large.[33] However, there are limitations to the aim of promoting ethical and societally valuable research: there are few frameworks for how RECs can consider the benefit of research for society at large, and the concept of mitigating methodological risks does not extend to whether the research poses risks to society at large, or to individuals beyond the subjects of that research.

 

Three different kinds of ethical risks in research

1.    Mitigating research process harms (also known as ‘research ethics’): The term research ethics refers to the principles and processes governing how to mitigate risks to research subjects. Research ethics principles are mostly concerned with the protection, safety and welfare of individual research participants, such as gaining their informed consent to participate in research or anonymising their data to protect their privacy.

 

2.    Research integrity: These are principles governing the credibility and integrity of the research, including whether it is intellectually honest, transparent, robust and replicable.[34] In most fields, research integrity is evaluated via the peer review process after research is completed.

 

3.    Broader societal impacts of research: This refers to the potential positive and negative societal and environmental implications of research, including unintended uses (such as misuse) of research. A similar concept is Responsible Research and Innovation (RRI), which refers to steps that researchers can undertake to anticipate and address the potential downstream risks and implications of their research.[35]

RECs, however, often do not evaluate for questions of research integrity, which is concerned with whether research is intellectually honest, transparent, robust and replicable.[36] These can include questions relating to whether data has been fabricated or misrepresented, whether research is reproducible, whether the limitations and assumptions of the research are stated, and whether conflicts of interest are disclosed.[37] The intellectual integrity of researchers is important for ensuring public trust in science, which can be eroded in cases of misconduct.[38]

Some RECs may consider complaints about research integrity issues that arise after research has been published, but these issues are often not considered as part of their ethics reviews. RECs may, however, assess a research applicant’s bona fides to determine if they are someone who appears to have integrity (such as if they have any conflicts of interest with the subject of their study). Usually, questions of research integrity are left to other actors in the research ecosystem, such as peer reviewers and whistleblowers who may notify a research institution or the REC of questionable research findings or dishonest behaviour. Other governance mechanisms for addressing research integrity issues include publishing the code or data of the research so that others may attempt to reproduce findings.

Another area of ethical risk that contemporary RECs do not evaluate for (but which we argue they should) is the responsibility of researchers to consider the broader effects of their research on society.[39] This is referred to as Responsible Research and Innovation (RRI), which moves beyond concerns of research integrity and is ‘an approach that anticipates and assesses potential implications and societal expectations with regard to research and innovation, with the aim to foster the design of inclusive and sustainable research and innovation’.[40]

RRI is concerned with the integration of mechanisms of reflection, anticipation and inclusive deliberation around research and innovation, and relies on individual researchers to incorporate these practices in their research. This includes analysing potential economic, societal or environmental impacts that arise from research and innovation. RRI is a more recent development that emerged separately from RECs, stemming in part from the Ethical, Legal and Societal Implications (ELSI) research programme in the 1990s, which was established to study the broader societal implications of genomics research.[41]

RECs are usually not well equipped to assess subsequent uses of research or their impacts on society. They often lack the capacity or remit to monitor the downstream uses of research, or to act as an ‘observatory’ for identifying trends in the use or misuse of research they reviewed at inception. This is compounded by the decentralised and fragmentary nature of RECs, which operate independently of each other and often do not evaluate each other’s work.

What principles do RECs rely on to make judgements about research ethics?

In their evaluations, RECs rely on a variety of tools, including laws like the General Data Protection Regulation (GDPR), which covers data protection issues, and some discipline-specific norms. At the core of all Research Ethics Committee evaluations is a series of moral principles that have evolved over time. These principles largely stem from the biomedical sciences, and have been codified, debated and edited by international bodies like the World Medical Association and the World Health Organisation. The biomedical model of research ethics is the foundation for how concepts like autonomy and consent were encoded in law,[42] and these concepts often motivate modern discussions about privacy.

Some early modern research ethics codes, like the Nuremberg Principles and the Belmont Report, were developed in response to specific atrocities and scandals involving biomedical research on human subjects. Other codes, like the Declaration of Helsinki, developed out of a field-wide concern to self-regulate before governments stepped in to regulate.[43]

Each code and declaration seeks to address specific ethical issues from a particular regional and historical context. Nonetheless, they are united by two aspects. Firstly, they frame research ethics questions in a way that assumes a clear researcher-subject relationship. Secondly, they all seek to standardise norms of evaluating and mitigating the potential risks caused by research processes, to support REC decisions becoming more consistent between different institutions.

 

Historical principles governing research ethics

 

Nuremberg Code: The Nuremberg trials occurred in 1947 and revealed horrific and inhumane medical experimentation by Nazi scientists on human subjects, primarily concentration camp prisoners. Out of concern that these atrocities might further damage public trust in medical professionals and research,[44] the judges in this trial included a set of universal principles for ‘permissible medical experiments’ in their verdict, which would later become known as the Nuremberg Code.[45] The Code lists ten principles that seek to ensure individual participant rights are protected and outweigh any societal benefit of the research.

 

Declaration of Helsinki: Established by the World Medical Association (WMA), the Helsinki Declaration seeks to articulate universal principles for human subjects research and clinical research practice. The WMA is an international organisation representing physicians from across the globe. The Helsinki Declaration has been updated repeatedly since its first iteration in 1964, with major updates occurring in 1975, 2000 and 2008. It specifies five basic principles for all human subjects research, as well as further principles specific to clinical research.

 

Belmont Report: This report was written in response to several troubling incidents in the USA, in which patients participating in clinical trials were not adequately informed about the risks involved. These include a 40-year-long experiment by the US Public Health Service and the Tuskegee Institute that sought to study untreated syphilis in Black men. Despite the study having over 600 participants (399 with syphilis, 201 without), the participants were deceived about the risks and nature of the experiment and were not provided with a cure for the disease after it had been developed in the 1940s.[46] These developments led the United States’ National Commission for the Protection of Human Subjects of Biomedical and Behavioral Research to publish the Belmont Report in 1979, which listed several principles for research to follow: justice, beneficence and respect for persons.[47]

 

Council for International Organizations of Medical Sciences (CIOMS) Guidelines: CIOMS was formed in 1949 by the World Health Organisation and the United Nations Educational, Scientific and Cultural Organisation (UNESCO), and is made up of a range of biomedical member organisations from across the world. In 2016, it published the International Ethical Guidelines for Health-Related Research Involving Humans,[48] which includes specific requirements for research involving vulnerable persons and groups, compensation for research participants, and requirements for researchers and health authorities to engage potential participants and communities in a ‘meaningful participatory process’ at various stages of research.[49]

 

Biomedical research ethics principles touch on a wide variety of issues, including autonomy and consent. The Nuremberg Code specified that, for research to proceed, a researcher must have consent given (i) voluntarily by a (ii) competent and (iii) informed subject (iv) with adequate comprehension. At the time, consent was understood as applicable only to healthy, non-patient participants, and thus excluded patients in clinical trials, access to patient information such as medical registers, and participants who are unable to give consent (such as children or people with a cognitive impairment).

Subsequent research ethics principles have adapted to these scenarios with methods such as legal guardianship, group or community consent, and broad or blanket consent.[50] The Helsinki Declaration requires that consent be given in writing, and states that research subjects can give consent only if they have been fully informed of the study’s purpose, the methods, risks and benefits involved, and their right to withdraw.[51] In all these conceptions of consent, there is a clearly identifiable research subject, who is in some kind of direct relationship with a researcher.

Another area that biomedical research principles touch on is the risk and benefit of research for research subjects. While the Nuremberg Code was unambiguous about the protection of research subjects, the Helsinki Declaration introduced the concept of benefit from research in proportion to risk.[52] The 1975 document and other subsequent revisions reaffirmed that, ‘while the primary purpose of medical research is to generate new knowledge, this goal can never take precedence over the rights and interests of individual research subjects.’[53]

However, Article 21 recommends that research can be conducted if the importance of its objective outweighs the risks to participants, and Article 18 states that a careful assessment of predictable risks to participants must be undertaken in comparison to potential benefits for individuals and communities.[54] The Helsinki Declaration lacks clarity on what constitutes an acceptable, or indeed ‘predictable’, risk and how the benefits would be assessed, and therefore leaves the challenge of resolving these questions to individual institutions.[55] The CIOMS guidance also suggests RECs should consider the ‘social value’ of health research when conducting a cost/benefit analysis.

The Belmont Report also addressed the trade-off between societal benefit and individual risk, offering specific ethics principles to guide scientific research: ‘respect for persons’, ‘beneficence’ and ‘justice’.[56] The principle of ‘respect for persons’ is broken down into respect for the autonomy of human research subjects and requirements for informed consent. The principle of ‘beneficence’ requires the use of the best possible research design to maximise benefits and minimise harms, and prohibits any research that is not backed by a favourable risk-benefit ratio (to be determined by a REC). Finally, the principle of ‘justice’ stipulates that the risks and benefits of research must be distributed fairly, that research subjects must be selected through fair procedures, and that vulnerable populations must not be exploited.

The Nuremberg Code created standardised requirements to identify who bears responsibility for identifying and addressing potential ethical risks of research. For example, the Code stipulates that the research participants have the right to withdraw (Article 9), but places responsibility on the researchers to evaluate and justify any risks in relation to human participation (Article 6), to minimise harm (Articles 4 and 7) and to stop the research if it is likely to cause injury or death to participants (Articles 5 and 10).[57] Similar requirements exist in other biomedical ethical principles like the Helsinki Declaration, which extends responsibility for assessing and mitigating ethical risks to both researchers and RECs.

A brief history of RECs in the USA and the UK

RECs are a relatively modern phenomenon in the history of academic research, with origins in the biomedical research governance initiatives of the 1960s and 1970s. The 1975 revision of the Declaration of Helsinki, an initiative by the World Medical Association (WMA) to articulate universal principles for human subjects research and clinical research practice, declared that the ultimate arbiters for assessing ethical risk and benefit were specially appointed, independent research ethics committees, which are given the responsibility to assess the risk of harm to research subjects and the management of those risks.

 

In the USA, the National Research Act of 1974 requires Institutional Review Board (IRB) approval for all human subjects research projects funded by the US Department of Health, Education, and Welfare (DHEW).[58] This was extended in 1991 under the ‘Common Rule’, so that any research involving human subjects that is funded by the federal government must undergo an ethics review by an IRB. There are certain exemptions from IRB review, including research that involves the analysis of publicly available data, privately funded research, and research that involves secondary analysis of existing data (such as the use of existing ‘benchmark’ datasets that are commonly used in AI research).[59]

 

In the UK, the first RECs began operating informally around 1966, in the context of clinical research in the National Health Service (NHS), but it was not until 1991 that RECs were formally codified. In the 1980s, the UK expanded the requirement for REC review beyond clinical health research into other disciplines. Academic RECs in the UK began to spring up around this same time, with the majority coming into force after the year 2000.

 

UK RECs in the healthcare and clinical context are coordinated and regulated by the Health Research Authority, which has issued guidance on how medical healthcare RECs should be structured and operate, including the procedure for submitting an ethics application and the process of ethics review.[60] This guidance allows for greater harmony across different health RECs and better governance for multi-site research projects, but it does not extend to RECs in other academic fields. Some funders, such as the UK’s Economic and Social Research Council, have also released research ethics guidelines that require non-health projects to undergo certain ethics review processes if the project involves human subjects research (though the definition of human subjects research is contested).[61]

RECs in academia

While RECs broadly seek to protect the welfare and interests of research participants and promote ethical and societally valuable research, there are some important distinctions to draw between the function and role of a REC in academic institutions compared to private-sector AI labs.

Where are RECs located in universities and research institutes?

Academic RECs bear a significant amount of the responsibility for assessing research involving human participants, including the scrutiny of ethics applications from staff and students. Broadly, there are two models of RECs used in academic research institutions:

  1. Centralised: A single, central REC is responsible for all research ethics applications, including the development of ethics policies and guidance.
  2. Decentralised: Schools, faculties or departments have their own RECs for reviewing applications, while a central REC maintains and develops ethics policies and guidance.[62]

RECs can be based at the institutional level (such as at universities), or at the regional and federal level. Some RECs may also be run by non-academic institutions, who are charged with reviewing academic research proposals. For example, academic health research in the UK may undergo review by RECs run by the National Health Service (NHS), sometimes in addition to review by the academic body’s own REC. In practice, this means that publicly funded health research proposals may seek ethics approval from one of the 85 RECs run by the NHS, in addition to non-NHS RECs run by various academic departments.[63]

A single, large academic institution, such as the University of Cambridge, may have multiple committees running within it, each with a different composition and potentially assessing different fields of research. Depending on the level of risk and required expertise, a research project may be reviewed by a local REC or school-level REC, or may also be reviewed by a REC at the university level.[64]

For example, Exeter University has a central REC and 11 devolved RECs at college or discipline level. The devolved RECs report to the central REC, which is accountable to the University Council (governing body). Exeter University also implements a ‘dual assurance’ scheme, with an independent member of the university’s governing body providing oversight of the implementation of its ethics policy. The University of Oxford also relies on a cascading system of RECs, which can escalate concerns up the chain if needed, and which may include department and domain-specific guidance for certain research ethics issues.

Figure 3: The cascade of RECs at the University of Oxford[65]

This figure shows how one academic institution’s RECs are structured, with a central REC and more specialised committees.

What is the scope and role of academic RECs?

According to a 2004 survey of UK academic REC members, they play four principal roles:[66]

  1. Responsibility for ethical issues relating to research involving human participants, including maintaining standards and provision of advice to researchers.
  2. Responsibility for ensuring production and maintenance of codes of practice and guidance for how research should be conducted.
  3. Ethical scrutiny of research applications from staff and, in most cases, students.
  4. Reporting and monitoring of instances of unethical behaviour to other institutions or academic departments.

Academic RECs often include a function for intaking and assessing reports of unethical research behaviour, which may lead to disciplinary action against staff or students.

When do ethics reviews take place?

RECs form a gateway through which researchers apply to obtain ethics approval as a prerequisite for further research. At most institutions, researchers will submit their work for ethics approval before conducting the study – typically at the early stages in the research lifecycle, such as at the planning stage or when applying for research grants. This means RECs conduct only an anticipatory assessment of the ethical risks that the proposed method may raise.

This assessment relies both on ‘testimony’ from research applicants, who document what they believe are the material risks, and on a review by REC members themselves, who assess the validity of that ‘testimony’ and offer their own view of the material risks the research method might pose and how those risks can be mitigated. There is limited opportunity for revising these assessments once the research is underway; this usually occurs only if a REC review identifies a risk or threat and asks for additional information. One example of an organisation that takes a different approach is the Alan Turing Institute, which has developed a continuous integration approach with reviews taking place at various stages throughout the research lifecycle.[67]

The extent of a REC’s review will vary depending on whether the project has any clearly identifiable risks to participants, and many RECs apply a triaging process to identify research that may pose particularly significant risks. RECs may use a checklist that asks a researcher whether their project involves particularly sensitive forms of data collection or risk, such as research with vulnerable population groups like children, or research that may involve deceiving research participants (such as creating a fake account to study online right-wing communities). If an application raises one of these issues, it must undergo a full research ethics review. In cases where a research application does not involve any of these initial risks, it may undergo an expedited process that involves a review of only some factors of the application such as its data governance practices.[68]
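To make the checklist-based triage described above more concrete, the short sketch below shows how such screening logic might look if expressed in code. It is a hypothetical illustration only: the trigger criteria, field names and review routes are assumptions made for the purpose of the example, not a description of any particular institution’s process.

```python
# Hypothetical sketch of a checklist-based triage step for REC applications.
# The criteria and field names are illustrative assumptions, not a real
# institution's screening rules.

FULL_REVIEW_TRIGGERS = [
    "involves_vulnerable_groups",   # e.g. research with children or other vulnerable participants
    "involves_deception",           # e.g. covert observation of online communities
    "collects_sensitive_data",      # e.g. health, biometric or political data
    "involves_human_intervention",  # direct intervention in participants' lives
]


def triage(application: dict) -> str:
    """Return the review route suggested by the checklist answers.

    `application` maps each checklist question to True/False, e.g.
    {"involves_vulnerable_groups": False, "involves_deception": True, ...}.
    """
    if any(application.get(flag, False) for flag in FULL_REVIEW_TRIGGERS):
        return "full committee review"
    # Lower-risk projects may only need a lighter-touch check, for example of
    # data governance practices, by one or two committee members.
    return "expedited review"


if __name__ == "__main__":
    example = {"involves_vulnerable_groups": False, "involves_deception": True}
    print(triage(example))  # -> "full committee review"
```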

Figure 4: Example of the triaging application intake process for a UK University REC

If projects meet certain risk criteria, they may be subject to a more extensive review by the full committee. Lower-risk projects may be approved by only one or two members of the committee.

During the review, RECs may offer researchers advice to mitigate potential ethical risks. Once approval is granted, no further checks by RECs are required. This means that there is no mechanism for ongoing assessment of emerging risks to participants, communities or society as the research progresses. As the focus is on protecting individual research participants, there is no assessment of potential long-term downstream harms of research.

Composition of academic RECs

The composition of RECs varies between and even within institutions. In the USA, RECs are required under the ‘Common Rule’ to have a minimum of five members with a variety of professional backgrounds, to be made up of people from different ethnic and cultural backgrounds, and to have at least one member who is independent of the institution. In the UK, the Health Research Authority recommends RECs have 18 members, while the Economic and Social Research Council (ESRC) recommends at least seven.[69] RECs operate on a voluntary basis, and there is currently no financial compensation for REC members, nor any other rewards or recognition.

Some RECs are composed of an interdisciplinary board of people who bring different kinds of expertise to ethical reviews. In theory, this is to provide a more holistic review of research that ensures perspectives from different disciplines and life experiences are factored into a decision. RECs in the clinical context in the UK, for example, must involve both expert members with expertise in the subject area and ‘lay members’, which refers to people ‘who are not registered healthcare professionals and whose primary professional interest is not in clinical research’.[70] Additional expertise can be sourced on an ad hoc basis.[71] The ESRC further emphasises that RECs should be multi-disciplinary and include ethnic and gender diversity.[72] According to our expert workshop participants, however, many RECs that are located within a specific department or faculty are often not multi-disciplinary and do not include lay members, although specific expertise might be requested when needed.

The Secure Anonymised Information Linkage databank (SAIL)[73] offers one example of a body that does integrate lay members in their ethics review process. Their review criteria include data governance issues and risks of disclosure, but also whether the project contributes to new knowledge, and whether it serves the public good by improving health, wellbeing and public services.

RECs within the technology industry

In the technology industry, several companies with AI and data science research divisions have launched internal ethics review processes and accompanying RECs, with notable examples being Microsoft Research, Meta Research and Google Brain. In our workshop and interviews with participants, members of corporate RECs we spoke with noted some key facets of their research review processes. It is important, however, to acknowledge that little publicly available information exists on corporate REC practices, including their processes and criteria for research ethics review. This section reflects statements made by workshop and interview participants, and some public reports of research ethics practices in private-sector labs.

Scope

According to our participants, corporate AI research RECs tend to take a broader scope of review than traditional academic RECs. Their reviews may extend beyond research ethics issues and into questions of broader societal impact. Interviews with developers of AI ethics review practices in industry suggested a view that traditional REC models can be too cumbersome and slow for the quick pace of the product development life cycle.

At the same time, ex ante review does not provide good oversight on risks that emerge during or after a project. To address this issue, some industry RECs have sought to develop processes that focus beyond protecting individual research subjects and include considerations for the broader downstream effects for population groups or society, as well as recurring review throughout the research/product lifecycle.[74]

Several companies we spoke with have specific RECs that review research involving human subjects. However, as one participant from a corporate REC noted, ‘a lot of AI research does not involve human subjects’ or their data, and may focus instead on environmental data or other types of non-personal information. This company relied on a separate ethics review process for such cases, which considers (i) the potential broader impact of the research and (ii) whether the research aligns with public commitments or ethical principles the company has made.

According to a law review article on its research ethics review process, Meta (previously known as Facebook) claims to consider the contribution of research to public knowledge and whether it may generate positive externalities and implications for society.[75] A workshop participant from another corporate REC noted that ‘the purpose of [their] research is to have societal impact, so ethical implications of their research are fundamental to them.’ These companies also tend to have more resources to undertake ethical reviews than academic labs, and can dedicate more full-time staff positions to training, broader impact mapping and research into the ethical implications of AI.

The use of AI-specific ethics principles and red lines

Many technology companies, like Meta, Google and Microsoft, have published AI ethics principles that articulate particular considerations for their AI and data science research, as well as ‘red line’ research areas they will not undertake. For example, in response to employee protests against a US Department of Defense contract, Google stated it will not pursue AI ‘weapons or other technologies whose principal purpose or implementation is to cause or directly facilitate injury to people’.[76] Similarly, DeepMind and Element AI have signed a pledge against AI research for lethal autonomous weapons alongside over 50 other companies; a pledge that only a handful of academic institutions have made.[77]

According to some participants, articulating these principles can make more salient the specific ethical concerns that researchers at corporate labs should consider with AI and data science research. However, other participants we spoke with noted that, in practice, there is a lack of internal and external transparency around how these principles are applied.

Many participants from academic institutions we spoke with noted they do not use ‘red line’ areas of research out of concern that these red lines may infringe on existing principles of academic openness.

Extent of reviews

Traditional REC reviews tend to focus on a single, one-off assessment of research risk at the early stages of a project. In contrast, one corporate REC we spoke with described their review as a continuous process in which a team may engage with the REC at different stages, such as when collecting data, prior to publication, and in post-publication reviews of whether the outcomes and impacts they were concerned about came to fruition. This kind of continuous review enables a REC to capture risks as they emerge.

We note that it was unclear whether this practice was common among industry labs or reflected one lab’s particular practices. We also note that some academic labs, like the Alan Turing Institute, are implementing similar initiatives to engage researchers at various stages of the research lifecycle.

A related point flagged by some workshop participants was that industry ethics review boards may vary in terms of their power to affect product design or launch decisions. Some may make non-binding recommendations, and others can green light or halt projects, or return a project to a previous development stage with specific recommendations.[78]

Composition of board and external engagement

The corporate REC members we spoke with all described the composition of their boards as being interdisciplinary and reflecting a broad range of teams at the company. One REC, for example, noted that members of engineering, research, legal and operations teams sit on their ethical review committee to provide advice not only on specific projects, but also for entire research programmes. Another researcher we spoke with described how their organisation’s ethics review process provides resources for researchers, including a list of ‘banned’ publicly accessible datasets that have questionable consent and privacy issues but are commonly used by researchers in academia and other parts of industry.

However, none of the corporate RECs we spoke with had lay members or external experts on their boards. This raises a serious concern that perspectives of people impacted by these technologies are not reflected in ethical reviews of their research, and that what constitutes a risk or is considered a high-priority risk is left solely to the discretion of employees of the company. The lack of engagement with external experts or people affected by this research may mean that critical or non-obvious information about what constitutes a risk to some members of society may be missed. Some participants we spoke with also mentioned that corporate labs experience challenges engaging with external stakeholders and experts to consult on critical issues. Many large companies seek to hire this expertise in-house, bringing in interdisciplinary researchers with social science, economics and other backgrounds. However, engaging external experts can be challenging, given concerns around trade secrets, sharing sensitive data and tipping off rival companies about their work.

Many companies resort to asking participants to sign non-disclosure agreements (NDAs), which are legally binding contracts with severe financial sanctions and legal risks if confidential information is disclosed. These can last in perpetuity, and for many external stakeholders (particularly those from civil society or marginalised groups), signing these agreements can be a daunting risk. However, we did hear from other corporate REC members that they had successfully engaged with external experts in some instances to understand the holistic set of concerns around a research project. In one biomedical-based research project, a corporate REC claimed to have engaged over 25 experts in a range of backgrounds to determine potential risks their work might raise and what mitigations were at their disposal.

Ongoing training

Many corporate RECs we spoke with also place an emphasis on continued skills and training, including providing basic ‘ethical training’ for staff of all levels. One corporate REC member we spoke with noted several lessons learned from their experience running ethical reviews of AI and data science research:

  1. Executive buy-in and sponsorship: It is essential to have senior leaders in the organisation backing and supporting this work. Having a senior spokesperson also helped in communicating the importance of ethical consideration throughout the organisation.
  2. Culture: It can be challenging to create a culture where researchers feel incentivised to talk and think about the ethical implications of their work, particularly in the earliest stages. A collaborative company culture in which research is shared openly within the company, and a transparent process in which researchers understand what an ethics review will involve, who is reviewing their work and what will be expected of them, can help address this concern. Training programmes for new and existing staff on the importance of ethical reviews and how to think reflexively also helped staff understand what is expected of them.
  3. Diverse perspectives: Engaging diverse perspectives can result in more robust decision-making. This means engaging with external experts who represent interdisciplinary backgrounds, and may include hiring that expertise internally. This can also include experiential diversity, which incorporates perspectives of different lived experiences. It also involves considering one’s own positionality and biases, and being reflexive as to how one’s own biases and lived experiences can influence consideration for ethical issues.
  4. Early and regular engagement leads to more successful outcomes: Ethical issues can emerge at different stages of a research project’s lifecycle, particularly given quick-paced and shifting political and social dynamics outside the lab. Engaging in ethical reviews at the point of publication can be too late, and the earlier this work is engaged with the better. Regular engagement throughout the project lifecycle is the goal, along with post-mortem reviews of the impacts of research.
  5. Continuous learning: REC processes need to be continuously updated and improved, and it is essential to seek feedback on what is and isn’t working.

Other actors in the research ethics ecosystem

While academic and corporate RECs and researchers bear the primary burden of assessing research ethics issues, other actors share this responsibility to varying degrees, including funders, publishers and conference organisers.[79] Along with RECs, these actors help establish research culture, which refers to ‘the behaviours, values, expectations, attitudes and norms of research communities’.[80] Research culture influences how research is done, who conducts research and who is rewarded for it.

Creating a healthy research culture is a responsibility shared by research institutions, conference organisers, journal editors, professional associations and other actors in the research ecosystem. This can include creating rewards and incentives for researchers to conduct their work according to a high ethical standard, and to reflect carefully on the broader societal impacts of their work. In this section, we examine in detail only three actors in this complex ecosystem.

Figure 5: Different actors in the research ecosystem

This figure shows some of the different actors that comprise the AI and data science research ecosystem. These actors interact and set incentives for each other. For example, funders can set incentives for institutions and researchers to follow (such as meeting certain criteria as part of a research application). Similarly, publishers and conferences can set incentives for researchers to follow in order to be published.

Organisers of research conferences can set particular incentives for a healthy research culture. Research conferences are venues where research is rewarded and celebrated, enabling career advancement and growth opportunities. They are also forums where junior and senior researchers from the public and private sectors create professional networks and discuss field-wide benchmarks, milestones and norms of behaviour. As Ada’s recent paper with CIFAR on AI and machine learning (ML) conference organisers explores, there are a wide variety of steps that conferences can take to incentivise consideration for research ethics and broader societal impacts.[81]

For example, in 2020, the Conference on Neural Information Processing Systems (NeurIPS) introduced a requirement that submitted papers include a broader societal impact statement covering the benefits, limitations and risks of the research.[82] These impact statements were designed to encourage researchers submitting work to the conference to consider the risks their research might raise, to consult more widely with experts from other domains, and to engage with people who may be affected by their research.[83] The introduction of this requirement was hotly contested by some researchers, who were concerned it was an overly burdensome ‘tick box’ exercise that would become pro forma over time.[84] In 2021, NeurIPS shifted to including ethical considerations in a checklist of requirements for submitted papers, rather than requiring all papers to include a standalone statement.

Editors of academic journals can set incentives for researchers to assess and mitigate the ethical implications of their work. Having work published in an academic journal is a primary goal for most academics, and a pathway for career advancement. Journals often put in place certain requirements for submissions to be accepted. For example, the Committee on Publication Ethics (COPE) has released guidelines on research integrity practices in scholarly publishing, which stipulate that journals should include policies on data sharing, reproducibility and ethical oversight.[85] This includes a requirement that studies involving human subjects disclose that a REC has approved the study.

Some organisations have suggested journal editors could go further towards encouraging researchers to consider questions of broader societal impacts. The Partnership on AI (PAI) published a range of recommendations for responsible publication practice in AI and ML research, which include calls for a change in research culture that normalises the discussion of downstream consequences of AI and ML research.[86]

Specifically for conferences and journals, PAI recommends expanding peer review criteria to include potential downstream consequences by asking submitting researchers to include a broader societal impact statement. Furthermore, PAI recommends establishing a separate review process to evaluate papers based on risk and downstream consequences, a process that may require a unique set of multidisciplinary experts to go beyond the scope of current journal review practices.[87]

Public and private funders (such as research councils) can establish incentives for researchers to engage with questions of research ethics, integrity and broader societal impacts. Funders play a critical role in determining which research proposals will move forward, and what areas of research will be prioritised over others. This presents an opportunity for funders to encourage certain practices, such as requiring that any research that receives funding meets expectations around research integrity, Responsible Research and Innovation and research ethics. For example, Gardner recommends that grant funding and public tendering of AI systems should require a ‘Trustworthy AI Statement’ from researchers that includes an ex ante assessment of how the research will comply with the European HLEG’s Trustworthy AI standards.[88]

Challenges in AI research

In this chapter, we highlight six major challenges that Research Ethics Committees (RECs) face when evaluating AI and data science research, as uncovered during workshops conducted with members of RECs and researchers in May 2021.

Challenge 1: Many RECs lack the resources, expertise and training to appropriately address the risks that AI and data science pose

Inadequate review requirements

Some workshop participants highlighted that many projects that raise severe privacy and consent issues are not required to undergo research ethics review. For example, some RECs encourage researchers to adopt data minimisation and anonymisation practices and do not require a project to undergo ethics review if the data is anonymised after collection. However, research has shown that anonymised data can still be triangulated with other datasets to enable reidentification,[89] posing a privacy risk to data subjects and raising implications for the consideration of broader impacts.[90] Expert participants noted that it is hard to determine whether data collected for a project is truly anonymous, and that RECs must have the right expertise to fully interrogate whether a research project has adequately addressed these challenges.
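To illustrate the triangulation risk described above, the following minimal sketch (in Python, using pandas, with entirely hypothetical datasets and column names) shows how an ‘anonymised’ dataset can be re-linked to named individuals through shared quasi-identifiers such as postcode, birth year and sex.

```python
# Minimal sketch of a linkage (reidentification) attack on 'anonymised' data.
# Dataset names, columns and values are hypothetical, for illustration only.
import pandas as pd

# An 'anonymised' research dataset: direct identifiers removed,
# but quasi-identifiers (postcode, birth year, sex) retained.
anonymised = pd.DataFrame({
    "postcode": ["EX4 4QJ", "SW1A 1AA", "M1 1AE"],
    "birth_year": [1984, 1991, 1975],
    "sex": ["F", "M", "F"],
    "diagnosis": ["diabetes", "asthma", "depression"],
})

# A separate public dataset (e.g. an electoral roll or a scraped profile dump)
# containing names alongside the same quasi-identifiers.
public = pd.DataFrame({
    "name": ["A. Smith", "B. Jones", "C. Patel"],
    "postcode": ["EX4 4QJ", "SW1A 1AA", "M1 1AE"],
    "birth_year": [1984, 1991, 1975],
    "sex": ["F", "M", "F"],
})

# Joining on quasi-identifiers re-attaches names to sensitive attributes.
reidentified = anonymised.merge(public, on=["postcode", "birth_year", "sex"])
print(reidentified[["name", "diagnosis"]])
```

Even this toy example suggests why post-collection anonymisation alone is not a complete answer to the privacy risks a REC is asked to assess.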

As Metcalf and Crawford have noted, data science is usually not considered a form of direct intervention in the body or life of individual human subjects and is, therefore, exempt from many research ethics review processes.[91] Similar challenges arise with AI research projects that rely on data collected from public sources, such as surveillance cameras or scraped from the public web, which are assumed to pose minimal risk to human subjects. Under most current research ethics guidelines, research projects using publicly available or pre-existing datasets collected and shared by other researchers are also not required to undergo research ethics review.[92]

Some of our workshop participants noted that researchers can view RECs as risk averse and overly concerned with procedural questions and reputation management. This reflects some findings from the literature. Samuel et al found that, while researchers perceive research ethics as procedural and centred on operational governance frameworks, societal ethics are perceived as less formal and more ‘fuzzy’, noting the absence of standards and regulations governing AI in relation to societal impact.[93]

Expertise and training

Another institutional challenge our workshop participants identified related to the training, composition and expertise of RECs. These concerns are not unique to reviews of AI and data science and reflect long-running concerns with how effectively RECs operate. In the USA, a 2011 study found that university research ethics review processes are perceived by researchers as inefficient, with review outcomes being viewed as inconsistent and often resulting in delays in the research process, particularly for multi-site trials.[94]

Other studies have found that researchers view RECs as overly bureaucratic and risk-averse bodies, and that REC practices and decisions can vary substantially across institutions.[95] These studies have found that RECs have differing approaches to determining which projects require a full rather than an expedited review, and often do not provide a justification or explanation for their assessments of the risk of certain research practices.[96] In some documented cases, researchers have gone so far as to abandon projects due to delays and inefficiencies of research ethics review processes.[97]

There is some evidence these issues are exacerbated in reviews of AI and data science research. Dove et al found systemic inefficiencies and substantive weaknesses in research ethics review processes, including:

  • a lack of expertise in understanding the novel challenges emerging from data-intensive research
  • a lack of consistency and reasoned decision-making among RECs
  • a focus on ‘tick-box exercises’
  • duplication of ethics reviews
  • a lack of communication between RECs in multiple jurisdictions.[98]

One reason for variation in ethics review process outcomes is disagreement among REC members. This can be the case even when working with shared guidelines. For example, in the context of data acquired through social media for research purposes, REC members differ substantially in their assessment of whether consent is required, as well as the risks to research participants. In part, this difference of opinion can be linked to their level of experience in dealing with these issues.[99] Some researchers suggest that reviewers may benefit from more training and support resources on emerging research ethics issues, to ensure a more consistent approach to decision-making.[100]

A significant challenge arises from the lack of training – and, therefore, lack of expertise – of REC members.[101] While this has already been identified as a persistent issue with RECs generally,[102] it is compounded by the fact that AI and data science research can be applied to many disciplines. This means that REC members evaluating AI and data science research must have expertise across many fields. However, RECs in this space frequently lack both (i) expertise in the technical methods of AI and data science, and (ii) domain expertise from other relevant disciplines.[103]

Samuel et al found that some RECs that review AI and data science research are concerned with data governance issues, such as data privacy, which are perceived as not requiring AI-specific technical skills.[104] While RECs regularly draw on specialist advice through cross-departmental collaboration, workshop participants questioned whether resources to support examination of ethical issues relating to AI and data science research are made available to RECs.[105] RECs may need to consider what expertise is required for these reviews and how it will be sourced, for instance via specialist ad-hoc advice or the institution of sub-committees.[106]

The need for reviewers with expertise across disciplines, ethical expertise and cross-departmental collaboration is clear. Participants in our workshops questioned whether interdisciplinary expertise is sufficient to review AI and data science research projects, and whether experiential expertise (expertise on the subject matter gained through first-person involvement) is also necessary to provide a more holistic assessment of potential research risks. This could take the form of changing a REC’s composition to involve a broader range of stakeholders, such as community representatives or external organisations.

Resources

A final challenge that RECs face relates to their resourcing and the value given to their work. According to our workshop participants, RECs are generally under-resourced in terms of budget, staffing and rewarding of members. Many RECs rely on voluntary ‘pro bono’ labour of professors and other staff, with members managing competing commitments and an expanding volume of applications for ethics review.[107] Inadequate resources can result in further delays and have a negative impact on the quality of the reviews. Chadwick shows that RECs rely on the dedication of their members, who prioritise the research subjects, researchers, REC members and the institution ahead of personal gain.[108]

Several of our workshop participants noted that reviewers often lack the time, or the right range of skills, to conduct a proper ethics review that evaluates the full range of potential ethical issues. According to several participants, sitting on a REC is often a ‘thankless’ task, which can make finding people willing to serve difficult. Those who are willing and have the required expertise risk being overloaded. Reviewing is ‘free labour’ with little or no recognition, which raises the question of how to incentivise REC members. Participants also discussed the need for research ethics review to be budgeted appropriately, so that stakeholders can be engaged throughout the project lifecycle.

Challenge 2: Traditional research ethics principles are not well suited for AI research

In their evaluations of AI and data science research, RECs have traditionally relied on a set of legally mandated and self-regulatory ethics principles that largely stem from the biomedical sciences. These principles have shaped the way that modern research ethics is understood at research institutions, how RECs are constructed and the traditional scope of their remit.

Contemporary RECs draw on a long list of additional resources for AI and data science research in their reviews, including data science-specific guidelines like the Association of Internet Researchers ethical guidelines,[109] provisions of the EU General Data Protection Regulation (GDPR) to govern data protection issues, and increasingly the emerging field of ‘AI ethics’ principles. However, the application of these principles raises significant challenges for RECs.

Several of our expert participants noted these guidelines and principles are often not implemented consistently across different countries, scientific disciplines, or across different departments or teams within the same institution.[110] As prominent research guidelines were originally developed in the context of biomedical research, questions have been raised about their applicability to other disciplines, such as the social sciences, data science and computer science.[111] For example, some in the research community have questioned the extension of the Belmont principles to research in non-experimental settings due to differences in methodologies, the relationships between researchers and research subjects, different models and expectations of consent and different considerations for what constitutes potential harm and to whom.[112]

We draw attention to four main challenges in the application of traditional bioethics principles to ethics reviews of AI and data science research:

Autonomy, privacy and consent

One example of how biomedical principles can be poorly applied to AI and data science research relates to how they address questions of autonomy and consent. Many of these principles emphasise that ‘voluntary consent of the human subject is absolutely essential’ and should outweigh considerations for the potential societal benefit of the research.

Workshop participants highlighted consent and privacy issues as one of the most significant challenges RECs are currently facing in reviews of AI and data science research. This included questions about how to implement ‘ongoing consent’, whereby consent is given at various stages of the research process; whether informed consent may be considered forced consent when research subjects do not really understand the implications of the future use of their data; and whether it is practical to require consent be given more than once when working with large-scale data repositories. A primary concern flagged by workshop participants was whether RECs put too much weight on questions of consent and autonomy at the expense of wider ethical concerns.

Issues of consent largely stem from the ways these fields collect and use personal data,[113] which differ substantially from the traditional clinical experiment format. Part of the issue is the relatively distanced relationship between data scientist and research subject. Here, researchers can rely on data scraped from the web (such as social media posts) or collected via consumer devices (such as fitness trackers or smart speakers).[114] Once collected, many of these datasets can be made publicly accessible as ‘benchmark datasets’ for other researchers to test and train their models. The Flickr Faces HQ dataset, for example, contains 70,000 images of faces collected from a photo-sharing website and made publicly accessible under a Creative Commons licence for other researchers to use.[115]

These collection and sharing practices pose novel risks to the privacy and identifiability of research subjects, and challenge traditional notions of informed consent from participants.[116] Once collected and shared, datasets may be re-used or re-shared for different purposes than those understood during the original consent process. It is often not feasible for researchers re-using the data to obtain informed consent in relation to the original research. In many cases, informed consent may not have been given in the first place.[117]

Not being able to obtain informed consent does not give the researcher a free pass, and datasets that are continuously used as benchmarks for technology development risk normalising the avoidance of consent-seeking practices. Some benchmark datasets, such as the longitudinal Pima Indian Diabetes Dataset (PIDD), are tied to a colonial past of oppression and exploitation of indigenous peoples, and their continued use as benchmarks perpetuates these politics in new forms.[118] The challenges to informed consent can cause significant damage to public trust in institutions and science. One notable example involved a Facebook (now Meta) study in 2014, in which researchers monitored users’ emotional states and manipulated their news feeds without their consent, showing more negative content to some users.[119] The study led to significant public concern, and raised questions about how Facebook users could give informed consent to a study they had no control over, let alone awareness of.

In some instances, AI and data science research may also pose novel privacy risks relating to the kinds of inferences that can be drawn from data. To take one example, researchers at Facebook (now Meta) developed an AI system to identify suicidal intent in user-generated content, which could be shared with law enforcement agencies to conduct wellness checks on identified users.[120] This kind of ‘emergent’ health data produced through interactions with software platforms or products is not subject to the same requirements or regulatory oversight as data from a mental health professional.[121] This highlights how an AI system can infer sensitive health information about an individual based on non-health related data in the public domain, which could pose severe risks for the privacy of vulnerable and marginalised communities.

Questions of consent and privacy point to another tension between principles of research integrity and the ethical obligation to protect research participants from harm. In the spirit of making research reproducible, there is a growing acceptance among the AI and data science research community that scientific data should be openly shared, and that open access policies for data and code should be fostered so that other researchers can easily re-use research outputs. At the same time, it is not always appropriate to make data accessible to everyone, as this can lead to harmful misuses of the data by other parties, or to uses of that data for purposes the data subject would not be comfortable with. Participants largely agreed, however, that RECs struggle to assess these types of research projects because the existing ex ante model of RECs addresses potential risks up front and may not be fit to address the potential emerging risks for data subjects.[122]

Risks to research subjects vs societal benefit

Related to consent is the challenge of weighing the societal benefit of research against the risks it poses to research subjects.

Workshop participants acknowledged how AI and data science research create a different researcher-subject relationship from traditional biomedical research. For example, participants noted that research in a clinical context involves a person who is present and with whom researchers have close and personal interaction. A researcher in these contexts is identifiable to their subject, and vice versa. This relationship often does not exist in AI and data science research, where the ‘subject’ of research may not be readily identifiable or may be someone affected by research rather than someone participating in the research. Some research argues that AI and data science research marks a shift from ‘human subjects’ research to ‘data subjects’ research, in which care and concern for the welfare of participants should be given to those whose data is used.[123]

In many cases, data science and AI research projects rely on data sourced from the web through scraping, a process that challenges traditional notions of informed consent and raises questions about whether researchers are in a position to assess the risk of research to participants.[124] Researchers may not be able to identify the people whose data they are collecting, meaning they often lack a relational dynamic that is essential for understanding the needs, interests and risks of their research subjects. In other cases, AI researchers may use publicly available datasets made available on online repositories like GitHub, which may be repurposed for uses that differ from the original purpose of collection. Finally, major differences arise in how data is analysed and assessed. Many kinds of AI and data science research rely on the curation of massive volumes of data, a process that many researchers outsource to third-party contract services such as Amazon’s MTurk. These processes create further separation between researchers and research subjects, outsourcing important value-laden decisions about the data to third-party workers who are not identifiable, accountable or known to research subjects.

Responsibility for assessing risks and benefit

Another challenge research ethics principles have sought to address is determining who is responsible for assessing and communicating the risk of research to participants.

One criticism has been that biomedical research ethics frameworks do not reflect the ‘emergent, dynamic and interactional nature’[125] of fields like the social sciences and humanities.[126] For example, ethnographic or anthropological research methods are open-ended, emergent and need to be responsive to the concerns of research participants throughout the research process. Meanwhile, traditional REC reviews have been solely concerned with an up-front risk assessment. In our expert workshops, several participants noted a similar concern within AI and data science research, where risks or benefits cannot be comprehensively assessed in the early stages of research.

Universality of principles

Some biomedical research ethics initiatives have sought to formulate universal principles for research ethics in different jurisdictions, which would help ensure a common standard of review in international research partnerships or multi-site research studies. However, many of these initiatives were created by institutions from predominantly Western countries to respond to Western biomedical research practices, and critics have pointed out that they therefore reflect a deeply Western set of ethics.[127] Other efforts have been undertaken to develop universal principles, including the Emanuel, Wendler and Grady framework, which uses eight principles with associated ‘benchmark’ questions to help RECs from different regions evaluate potential ethical issues relating to exploitation.[128] While there is some evidence that this model has worked well in REC evaluations for biomedical research in African institutions,[129] it has not yet been widely adopted by RECs in other regions.

Challenge 3: Specific principles for AI and data science research are still emerging and are not consistently adopted by RECs

A more recent phenomenon relevant to the consideration of ethical issues relating to AI and data science has been the proliferation of ethical principles, standards and frameworks for the development and use of AI systems.[130], [131], [132], [133] The development of standards for ethical AI systems has been taken up by bodies such as the Institute of Electrical and Electronics Engineers (IEEE) and the International Organization for Standardization (ISO).[134] Some of these efforts have occurred at the international level, such as those led by the OECD or the United Nations. A number of principles recur across this spectrum, including transparency, fairness, privacy and accountability. However, these common principles vary in how they are defined, understood and scoped, meaning there is no single codified approach to how they should be interpreted.[135]

In developing such frameworks, some have taken widely adopted guidelines as a point of departure. For example, Floridi and Cowls propose a framework of five overarching principles for AI. This includes the traditional bioethics principles of beneficence, non-maleficence, autonomy and justice, drawn from the Belmont principles, but adds the principle of explicability, which combines questions of intelligibility (how something works) with accountability (who is responsible for the way it works).[136] Others have argued that international human rights frameworks offer a promising basis for developing coherent and universally recognised standards for AI ethics.[137]

Several of our workshop participants mentioned that it is challenging to judge the relevance of existing principles in the context of AI and data science research. During the workshops, a variety of additional principles were mentioned, for example, ‘equality’, ‘human-centricity’, ‘transparency’ and ‘environmental sustainability’. This indicates that there is not yet clear consensus around which principles should guide AI and data science research practices, and that the question of how those principles should be developed (and by which body) is not yet answered. We address this challenge in our recommendations.

The wide range of available frameworks, principles and guidelines makes it difficult for researchers and practitioners to select suitable ones, given the current inconsistencies and the lack of a commonly accepted framework guiding ethical AI and data science research. As many of our expert participants noted, this has led to confusion among RECs about whether these frameworks or principles should supplement biomedical principles, and how they should be applied to reviews of data science and AI research projects.

Complicating this challenge is the question of whether ethical principles guiding AI and data science research would be useful in practice. In a paper comparing the fields of medical ethics with AI ethics, Mittelstadt argues that AI research and development lacks several essential features for developing coherent research ethics principles and practices. These include the lack of common aims and fiduciary duties, a history of professional norms and bodies to translate principles into practice, and robust legal and professional accountability mechanisms.[138] While medical ethics draws on its practitioners being part of a ‘moral community’ characterised by common aims, values and training, AI cannot refer to such established norms and practices, given the wide range of disciplines and commercial fields it can be applied to.

The blurring of commercial and societal motives for AI research can cause AI developers to be driven by values such as innovation and novelty, performance or efficiency, rather than ethical aims rooted in biomedicine around concern for their ‘patient’ or for societal benefit. In some regions, like Canada, professional codes of practice and law around medicine have established fiduciary-like duties between doctors and their patients, which do not exist in the fields of AI and data science.[139] AI does not have a history and professional culture around ethics comparable to the medical field, which has a strong regulating influence on practitioners. Some research has also questioned the aims of AI research, and what kinds of practices are incentivised and encouraged within the research community. A study involving interviews with 53 AI practitioners in India, East and West African countries, and the USA showed that, despite the importance of high-quality data in addressing potential harms and a proliferation of data ethics principles, practitioners find the implementation of these practices to be one of the most undervalued and ‘de-glamorised’ aspects of developing AI systems.[140]

Identifying clear principles for AI research ethics is a major challenge. This is particularly the case because so few of the emerging AI ethics principles specifically focus on AI or data science research ethics. Rather, they centre on the ethics of AI system development and use. In 2019, the IEEE published a report entitled Ethically aligned design: Prioritizing human wellbeing with autonomous and intelligent systems, which contains a chapter on ‘Methods to Guide Ethical Research and Design’.[141] This chapter includes a range of recommendations for academic and corporate research institutions, including that: labs should identify stages in their processes in which ethical considerations, or ‘ethics filters’, are in place before products are further developed and deployed; and that interdisciplinary ethics training should be a core subject for everyone working in the STEM field, and should be incentivised by funders, conferences and other actors. However, this report stops short of offering clear guidance for RECs and institutions on how they should turn AI ethics principles into clear practical guidelines for conducting and assessing AI research.

Several of our expert participants observed that many AI researchers and RECs currently draw on legal guidance and norms relating to privacy and data protection, which risks collapsing questions of AI ethics into narrower issues of data governance. The rollout of the European General Data Protection Regulation (GDPR) in 2018 created a strong incentive for European institutions, and institutions working with the personal data of Europeans, to reinforce existing ethics requirements on how research data is collected, stored and used by researchers. Expert participants noted that data protection questions are common in most REC reviews. As Samuel notes, there is some evidence that AI researchers tend to perceive research ethics as a set of data governance questions, a mindset that is reinforced by institutional RECs in some of the questions they ask.[142]

There have been some grassroots efforts to standardise research ethics principles and guidance for some forms of data science research, including social media research. The Association of Internet Researchers, for example, has published the third edition of its ethical guidelines,[143] which includes suggestions for how to deal with privacy and consent issues posed by scraping online data, how to outline and address questions across different stages of the ethics lifecycle (such as considering issues of bias in the data analysis stage), and how to consider potential downstream harms from the use of that data. However, these guidelines are voluntary and are narrowly focused on social media research, and it remains unclear whether RECs are consistently enforcing them. As Samuel notes, the lack of established norms and criteria in social media research has caused many researchers to rely on bottom-up, personal ‘ethical barometers’ that create discrepancies in how ethical research should be conducted.[144]

In summary, there are a wide range of broad AI ethics principles that seek to guide how AI technologies are developed and deployed. The iterative nature of AI research, in which a published model or dataset can be used by downstream developers to create a commercial product with unforeseen consequences, raises a significant challenge for RECs seeking to apply AI and data science research ethics principles. As many of our expert participants noted, AI ethics research principles must touch on both how research is conducted (including what methodological choices are made), and also involve consideration for the wider societal impact of that research and how it will be used by downstream developers.

Challenge 4: Multi-site or public-private partnerships can exacerbate existing challenges of governance and consistency of decision-making

RECs face governance and fragmentation challenges in their decision-making. In contrast to clinical research, which is coordinated in the UK by the Health Research Authority (HRA), RECs evaluating AI and data science research are generally not guided by an overarching governing body, and do not have structures to coordinate similar issues between different RECs. Consequently, their processes, decision-making and outcomes can vary substantially.[145]

Expert participants noted this lack of consistent guidance between RECs is exacerbated by research partnerships with international institutions and public-private research partnerships. The specific processes RECs follow can vary between committees, even within the same institution. This can result in different RECs reaching different conclusions on similar types of research. A 2011 survey of research into Institutional Review Board (IRB) decisions found numerous instances where similar research projects received significantly different decisions, with some RECs approving with no restrictions, others requiring substantial restrictions and others rejecting research outright.[146]

This lack of an overarching coordinating body for RECs is especially problematic for international projects that involve researchers working in teams across multiple jurisdictions, often with large datasets that have multiple sources across multiple sites.[147] Most biomedical research ethics guidelines recommend that multi-site research should be evaluated by RECs located in all respective jurisdictions,[148] on the basis that each institution will reflect the local regulatory requirements for REC review, which they are best prepared to respond to.

Historically, most research in the life sciences was conducted with a few participants at a local research institution.[149] In some regions, requirements for local involvement have developed to provide some accountability for research subjects. Canada, for example, requires social science research involving indigenous populations to meet specific research ethics requirements, including around community engagement and involvement with members of indigenous communities, and around requirements for indigenous communities to own any data.[150]

However, this arrangement does not fit the large-scale, international, data-intensive research of AI and data science, which often relies on the generation, scraping and repurposing of large datasets, frequently without any awareness of whose data it is or for what purpose it was collected. The fragmented landscape of different RECs and regulatory environments leads to multiple research ethics applications to different RECs with inconsistent outcomes, which can be highly resource intensive.[151] Workshop participants highlighted how ethics committees face uncertainties in dealing with data sourced and/or processed in heterogeneous jurisdictions, where legal requirements and ethical norms can be very different.

Figure 6: Public-private partnerships in AI research[152]

The graphs above show an increasing trend in public-private partnerships and multinational collaborations in AI research. As these partnerships and multi-site collaborations grow, so do the governance challenges this kind of research poses.

Public-private partnerships

Public-private partnerships (PPPs) are common in biomedical research, where partners from the public and private sector share, analyse and use data.[153] The types of collaboration can vary, from project-specific collaborations to long-term strategic alliances between different groups, or large multi-consortia. The data ecosystem is fragmented and complex, as health data is increasingly being shared, linked, re-used or re-purposed in novel ways.[154] Some regulations, such as the General Data Protection Regulation (GDPR), may apply to all research; however, standards, drivers and reputational concerns may differ between actors in the public and private sector. This means that PPPs navigate an equally complex and fragmented landscape of standards, norms and regulations.[155]

As our expert participants noted, public-private partnerships can raise concerns about who derives benefit from the research, who controls the intellectual property of findings, and how data is shared in a responsible and rights-respecting way. The issue of data sharing is particularly problematic when research is used for the purpose of commercial product or service development. For example, wearable devices or apps that track health and fitness data can produce enormous amounts of biomedical ‘big data’ when combined with other biomedical datasets.[156] While the data generated by these consumer devices can be beneficial for society, through opportunities to advance clinical research in, for instance, chronic illness, consumers of these services may not be aware of these subsequent uses, and their expectations of personal and informational privacy may be violated.[157]

These kinds of violations can have devastating consequences. Consider the recent example of the General Practice Data for Planning and Research (GPDPR), a proposal by England’s National Health Service to create a centralised database of pseudonymised patient data that could be made accessible to researchers and commercial partners.[158] The plan was criticised for failing to alert patients about the use of this data, leading to millions of patients in England opting out of their patient data being accessible for research purposes. As of this publication date, the UK Government has postponed the plan.

Expert participants highlighted that data sharing must be conducted responsibly, aligning with the values and expectations of affected communities, a similar view held by bodies like the UK’s Centre for Data Ethics and Innovation.[159] However, what these values and expectations are, and how to avoid making unwarranted assumptions, is less clear. Recent research suggests that participatory approaches to data stewardship may increase legitimacy of and confidence in the use of data that works for people and society.[160]

Challenge 5: RECs struggle to review potential harms and impacts that arise throughout AI and data science research

REC reviews of AI and data science research are ex ante assessments done before research takes place. However, many of the harms and risks in AI research may only become evident at later stages of the research. Furthermore, many of the types of harms that can arise – such as issues of bias, or wider misuses of AI or data – are challenging for a single committee to predict. This is particularly true with the broader societal impacts of AI research, which require a kind of evaluation and review that RECs currently do not undertake.

Bias and discrimination

Identifying or predicting the potential biases, and consequent discrimination, that can arise in datasets and AI models at various stages of development constitutes a significant challenge for the evaluation of AI and data science research. Numerous kinds of bias can arise during data collection, model development and deployment, leading to potentially harmful downstream effects.[161] For example, Buolamwini and Gebru demonstrate that many popular facial recognition systems perform much worse on darker skin and non-male identities due to sampling biases in the population dataset used to train the model.[162] Similarly, numerous studies have shown predictive algorithms for policing and law enforcement can reproduce societal biases due to choices in their model architecture, design and deployment.[163],[164],[165] In supervised machine learning, manually annotated datasets can harbour bias through problematic application of gender or race categories.[166],[167],[168] In unsupervised machine learning, datasets commonly embed different types of historical bias (because data reflect existing sociotechnical bias in the world), which can lead to problems such as a lack of demographic diversity, or aggregation and population biases.[169] Crawford argues that datasets used for model training purposes are asked to capture a very complex world through taxonomies consisting of discrete classifications, an act that requires non-trivial political, cultural and social choices.[170]

Figure 7: How bias can arise in different ways in the AI development lifecycle[171]

This figure uses the example of an AI-based healthcare application, to show how bias can arise from patterns in the real world, in the data, in the design of the system, and in its use.
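One practical check that follows from the audits cited above (such as Buolamwini and Gebru’s) is to disaggregate a model’s performance by demographic subgroup rather than reporting a single aggregate figure. The sketch below, in Python with pandas and entirely invented data and subgroup labels, is an assumption-laden illustration of that idea rather than a description of any particular audit.

```python
# Minimal sketch of a disaggregated accuracy check across demographic subgroups.
# The data, subgroup labels and results are hypothetical, for illustration only.
import pandas as pd

results = pd.DataFrame({
    "subgroup": ["darker_female", "darker_male", "lighter_female", "lighter_male"] * 2,
    "correct":  [0, 1, 1, 1, 1, 1, 1, 1],  # 1 = model prediction was correct
})

# Overall accuracy can look acceptable while hiding large subgroup disparities.
overall = results["correct"].mean()
by_group = results.groupby("subgroup")["correct"].mean()

print(f"Overall accuracy: {overall:.2f}")
print(by_group)
print(f"Largest subgroup gap: {by_group.max() - by_group.min():.2f}")
```

A reviewer could ask whether this kind of disaggregated reporting is planned as part of the research design, and how the relevant subgroups will be defined and justified.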

Understanding the ways in which biases can arise in different stages of an AI research project creates a challenge for RECs, which may not have the capacity, time or resources to determine what kinds of biases might arise in a particular project or how they should be evaluated and mitigated. Under current REC guidelines, it may be easier for RECs to challenge researchers on how they can address questions concerning data collection and sampling bias issues, but questions concerning whether research may be used to create biased or discriminatory outcomes at the point of application are outside the scope of most REC reviews.

Data provenance

Workshop participants identified data provenance – how data is originally collected or sourced by researchers – as another major challenge for RECs. The issue becomes especially salient when it comes to international and collaborative projects, which draw on complex networks of datasets. Some datasets may constitute ‘primary’ data – that is, data collected by the researchers themselves. Other data may be ‘secondary’, which includes data that is shared, disseminated or made public by others. With secondary data, the underlying purpose of its collection, its accuracy and the biases embedded at the stage of collection may be unclear.

RECs need to consider not just where data is sourced from, but also to probe what its intended purposes were, how it has been tested for potential biases that may be baked into a project, and other questions about the ethics of its collection. Some participants said that it is not enough to ask whether a dataset received ethical clearance when it was collected. One practical tool that might address this is the standardisation of dataset documentation practices by research institutions, for example through the use of datasheets, which list critical information about how a dataset was collected, who to contact with questions and what potential ethical issues it may raise, as sketched below.
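As an illustration of the datasheet idea mentioned above, the following sketch (Python; the schema fields and example values are our own assumptions, loosely inspired by the ‘datasheets for datasets’ proposal rather than any standard schema) shows the kind of structured record a REC could ask researchers to submit alongside any dataset they intend to reuse.

```python
# Minimal sketch of a machine-readable 'datasheet' record a REC could request.
# Field names and values are illustrative assumptions, not a standard schema.
from dataclasses import dataclass, field

@dataclass
class Datasheet:
    name: str
    purpose_of_collection: str          # why the data was originally collected
    collection_method: str              # e.g. survey, web scraping, sensor logs
    consent_obtained: bool              # whether informed consent was obtained
    consent_notes: str                  # how consent was (or was not) handled
    known_biases: list = field(default_factory=list)
    ethics_approval: str = ""           # which REC (if any) cleared the collection
    maintainer_contact: str = ""

example = Datasheet(
    name="example-faces-v1",            # hypothetical dataset
    purpose_of_collection="benchmarking face detection",
    collection_method="scraped from a photo-sharing site",
    consent_obtained=False,
    consent_notes="images shared under permissive licences; no study-specific consent",
    known_biases=["under-representation of darker skin tones"],
    ethics_approval="none recorded",
    maintainer_contact="dataset-maintainers@example.org",
)
print(example)
```

Even a simple record like this gives reviewers something concrete to interrogate: whether consent was obtained, what biases are already documented, and whether the original purpose of collection matches the proposed use.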

Labour practices around data labelling

Another issue flagged by our workshop participants related to considerations for the labour conditions and mental and physical wellbeing of data annotators. Data labellers form part of the backbone of AI and data science research, and include people who review, tag and label data to form a dataset, or evaluate the success of a model. These workers are often recruited from services like MTurk. Research and data labeller activism has shown that many face exploitative working conditions and underpayment.[172]

According to some workshop participants, it remains unclear whether data labellers are considered ‘human subjects’ in their reviews, and their wellbeing is not routinely considered by RECs. While some institutions maintain MTurk policies, these are often not written from the perspective of workers themselves and may not fully consider the variety of risks that workers face. These can include non-payment for services, or being asked to undertake too much work in too short a time.[173] Initiatives like the Partnership on AI’s Responsible Sourcing of Data Enrichment Services and the Northwestern Institutional Review Board’s Guidelines for Academic Requesters offer models for how corporate and academic RECs might develop such policies.[174]

Societal and downstream impacts

Several experts noted that standard REC practices can fail to assess the broader societal impacts of AI and data science research, leading to traditionally marginalised population groups being disproportionately affected by this research. Historically, RECs have had an anticipatory role, with potential risks assessed and addressed at the initial planning stage of the research. The focus on protecting individual research subjects means that RECs generally do not consider potential broader societal impacts, such as long-term harms to communities.[175]

For example, a study using facial recognition technology to determine the sexual orientation of people,[176] or to recognise Uighur minorities in China,[177] raises serious questions about societal benefit and the impacts on marginalised communities – yet the RECs that reviewed these projects did not consider these kinds of questions. Since the datasets used in these projects consisted of images scraped from the internet and curated, the research did not constitute human subjects research, and therefore passed ethics review.

Environmental impacts

The environmental footprint of AI and data science is a further significant impact that our workshop participants highlighted as an area most RECs do not currently review for. Some forms of AI research, such as deep learning and multi-agent learning, can be compute-intensive, raising questions about whether their benefits offset the environmental cost.[178] Similar questions have been raised about large language models (LLMs), such as OpenAI’s GPT-3, which rely on intensive computational methods without articulating a clearly defined benefit to society.[179] Our workshop participants noted that RECs could play a role in assessing whether a project’s aims justify computationally intensive methods, or whether a researcher is using the most computationally efficient method of training their model (avoiding unnecessary computational spend). However, there is no existing framework for RECs to use to help make these kinds of determinations, and it is unclear whether many REC members would have the right competencies to evaluate such questions.
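As a rough indication of what such an assessment could involve, the back-of-the-envelope sketch below (Python; every figure is an assumed placeholder, not a measurement or a recommended methodology) estimates the energy use and carbon emissions of a hypothetical training run from the number of accelerators, their average power draw, training time, data-centre overhead and grid carbon intensity.

```python
# Back-of-the-envelope estimate of the energy and carbon cost of a training run.
# All numbers below are illustrative assumptions, not measurements.
NUM_GPUS = 8
AVG_POWER_PER_GPU_KW = 0.3            # assumed average draw per accelerator (kW)
TRAINING_HOURS = 72
PUE = 1.5                             # assumed data-centre power usage effectiveness
CARBON_INTENSITY_KG_PER_KWH = 0.23    # assumed grid carbon intensity (kg CO2e/kWh)

# Energy = accelerators x average power x hours, scaled by data-centre overhead.
energy_kwh = NUM_GPUS * AVG_POWER_PER_GPU_KW * TRAINING_HOURS * PUE
emissions_kg = energy_kwh * CARBON_INTENSITY_KG_PER_KWH

print(f"Estimated energy use: {energy_kwh:.0f} kWh")
print(f"Estimated emissions: {emissions_kg:.0f} kg CO2e")
```

A REC could ask researchers to supply and justify the equivalent numbers for their own project, alongside an argument for why the expected benefit warrants the computational cost.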

Considerations of ‘legitimate research’

Workshop participants discussed whether RECs are well suited to determine what constitutes ‘legitimate research’. For example, some participants raised questions about the intellectual proximity of AI research to discredited forms of pseudoscience like phrenology, citing AI research that is based on flawed assumptions about race and gender – a point raised in empirical research evaluating the use of AI benchmark datasets.[180] AI and data science research regularly involves the categorisation of data subjects into particular groups, which may involve crude assumptions that can nonetheless lead to severe population-level consequences. These ‘hidden decisions’ are often baked into a dataset and, once shared, can remain unchallenged for long periods of time. To give one example, the MIT Tiny Images dataset, first created in 2006, was withdrawn in 2020 after it was discovered to include racist and sexist categorisations of images of minoritised people and women.[181] This dataset has been used to train a range of subsequent models and may still be in use today, given the ability to download and repost datasets without accompanying documentation explaining their limitations. Several participants noted that RECs are not set up to identify, let alone assess, these kinds of issues, and may consider defining ‘good science’ to be outside their remit.

A lack of incentives for researchers to consider broader societal impacts

Another point of discussion in the workshops was how to incentivise researchers to consider broader societal impact questions. Researchers are usually incentivised and rewarded by producing novel and innovative work, evidenced by publications in relevant scientific journals or conferences. Often, this involves researchers making broad statements about how AI or data science research can have positive implications for society, yet there is little incentive for researchers to consider potentially harmful impacts of their work.

Some of the expert participants pointed out that other actors in the research ecosystem, such as funders, could help to incentivise researchers to reflexively consider and document the potential broader societal impacts of their work. Stanford University’s Ethics and Society Review, for example, requires researchers seeking funding from the Stanford Institute for Human-Centered Artificial Intelligence to write an impact statement reflecting on how their proposal might create negative societal impacts and how those impacts can be mitigated, and to work with an interdisciplinary faculty panel to ensure those concerns are addressed before funding is received. Participants in this programme overwhelmingly described it as positive for their research and training experience.[182]

A more ambitious proposal from some workshop participants was to go beyond a risk-mitigation plan and incentivise research that benefits society. However, conceptualisations of social, societal or public good are contested at best – there is no universally agreed-upon theory of what these are.[183] There are also questions about who is included in ‘society’, and whether some benefits for those in a position of power would actively harm other members of society who are disadvantaged.

AI and data science research communities have not yet developed a rigorous method for deeply considering what constitutes public benefit, or a rigorous methodology for assessing the long-term impact of AI and data science interventions. Determining what constitutes the ‘public good’ or ‘public benefit’ would, at the very least, require some form of public consultation; even then, it may not be sufficient.[184]

One participant noted it is difficult in some AI and data science research projects to consider these impacts, particularly projects aimed at theory-level problems or small step-change advances in efficiency (for example, research that produces a more efficient and less computationally intensive method for training an image detection model). This dovetails with concerns raised by some in the AI and data science research community that there is too great a focus on creating novel methods for AI research instead of applying research to address applied, real-world problems.[185]

Workshop participants raised a similar concern about AI and data science research that is conducted without any clear rationale for addressing societal problems. Participants used the metaphor of a ‘fishing expedition’ to describe types of AI and data science research projects that have no clear aim or objective, but instead explore large datasets to see what can be found. As one workshop participant put it, researchers should always be aware that, just because data can be collected, or is already available, it does not mean that it should be collected or used for any purpose.

Challenge 6: Corporate RECs lack transparency in relation to their processes

Some participants noted that, while corporate lab reviews may be more extensive, they can also be more opaque, and are at risk of being driven by interests beyond research ethics, including whether research poses a reputational risk to the company if published. Moss and Metcalf note how ethics practices in Silicon Valley technology companies are often chiefly concerned with questions of corporate values and legal risk and compliance, and do not systematically address broader issues such as questions around moral, social and racial justice.[186] While corporate ethics reviewers draw on a variety of guidelines and frameworks, they may not address ongoing harms, evaluate these harms outside of the corporate context, or evaluate organisational behaviours and internal incentive structures.[187] It is worth noting that academic RECs have faced a similar criticism. Recent research has documented how academic REC decisions can be driven by a reputational interest to avoid ‘embarrassment’ of the institution.[188]

Several of our participants highlighted the relative lack of external transparency of corporate REC processes versus academic ones. This lack of transparency can make it challenging for other members of the research community to trust that corporate research review practices are sufficient.

Google, for example, launched a ‘sensitive topics’ review process in 2020 that asks researchers to run their work through legal, policy and public relations teams if it relates to certain topics like face and sentiment analysis or categorisations of race, gender or political affiliation.[189] According to the policy, ‘advances in technology and the growing complexity of our external environment are increasingly leading to situations where seemingly inoffensive projects raise ethical, reputational, regulatory or legal issues.’ In at least three reported instances, researchers were told to ‘strike a more positive tone’ and to remove references to Google products, raising concerns about the credibility of findings. In one notable example that became public in 2021, a Google ethical AI researcher was fired from their role after being told that a research paper they had written, which was critical of the use of large language models (a core component in Google’s search engine), could not be published under this policy.[190]

Recommendations

We conclude this paper with a set of eight recommendations, organised into sections aimed primarily at three groups of stakeholders in the research ethics ecosystem:

  1. Academic and corporate Research Ethics Committees (RECs) evaluating AI and data science research.
  2. Academic and corporate AI and data science research institutions.
  3. Funders, conference organisers, journal editors, and other actors in the wider AI and data science research ecosystems.

For academic and corporate RECs

Recommendation 1: Incorporate broader societal impact statements from researchers

The problem

Broader societal impacts of AI and data science research are not currently considered by RECs. These might include ‘dual-use’ research (meaning it can be used for both civilian and military purposes), possible harms to society or the environment, and the potential for discrimination against marginalised populations. Instead, RECs focus their reviews on questions of research methodology. Several workshop participants noted that there are few incentives for researchers to reflexively consider questions of societal impact. Workshop participants also noted that institutions do not offer any framework for RECs to follow, or training or guidance for researchers. Broader societal impact statements can ensure researchers reflect on, and document, the range of potential harms, risks and benefits their work may pose.

Recommendations

Researchers should be required to undertake an evaluation of broader societal impact as part of their ethics evaluation.
This would be an impact statement that includes a summary of the positive and negative impacts on society that researchers anticipate from their work. It should also set out any known limitations or risks of misuse, such as whether the research findings are premised on assumptions particular to a geographic region, or whether the findings could be used to exacerbate certain forms of societal injustice.

Training should be designed and implemented for researchers to adequately conduct stakeholder and impact assessment evaluations, as a precondition to receiving funding or ethics approval.[191]
These exercises should encourage researchers to consider the intended uses of their innovations and reflect on what kinds of unintended uses might arise. The result of these assessments can be included in research ethics documentation that reports on the researchers’ reflections on both discursive questions that invite open-ended opinion (such as what the intended use of the research may be) and categorical information that lists objective statistics and data about the project (such as the datasets that will be used, or the methods that will be applied). Some academic institutions are experimenting with this approach for research ethics applications.
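As an illustration only, the sketch below shows how such documentation might combine discursive (open-ended) and categorical (factual) fields in a single structured record. The field names, example values and the Python representation are illustrative assumptions rather than an established template.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical sketch of an impact-statement record combining discursive
# (open-ended) and categorical (factual) fields. Field names are illustrative.

@dataclass
class BroaderImpactStatement:
    # Discursive questions inviting open-ended reflection
    intended_use: str
    anticipated_benefits: str
    anticipated_harms: str
    known_limitations: str          # e.g. assumptions tied to a geographic region
    misuse_scenarios: str
    # Categorical information about the project
    datasets_used: List[str] = field(default_factory=list)
    methods_applied: List[str] = field(default_factory=list)
    affected_stakeholders: List[str] = field(default_factory=list)

statement = BroaderImpactStatement(
    intended_use="Early detection of sepsis in hospital inpatients",
    anticipated_benefits="Faster escalation of care",
    anticipated_harms="False negatives delaying treatment",
    known_limitations="Trained on data from a single hospital",
    misuse_scenarios="Deployment without clinical oversight",
    datasets_used=["hospital_ehr_2018_2020"],
    methods_applied=["gradient-boosted trees"],
    affected_stakeholders=["patients", "nurses", "clinicians"],
)
print(statement.affected_stakeholders)
```

Structuring the statement in this way keeps the open-ended reflections alongside the factual project details, so reviewers can read both in one place.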

Examples of good practice
Recent research from Microsoft provides a structured exercise for how researchers can consider, document and communicate potential broader societal impacts, including who the affected stakeholders are in their work, and what limitations and potential benefits it may have.[192]

Methods for impact assessment of algorithmic systems have emerged from the domains of human rights, environmental studies and data protection law. These methods are not necessarily standardised or consistent, but they seek to encourage researchers to reflect on the impacts of their work. Some examples include the use of algorithmic impact assessments in healthcare settings,[193] and in public sector uses of algorithmic systems in the Netherlands and Canada.[194]

In 2021, Stanford University tested an Ethics and Society Review board (ESR), which sought to supplement the role of its Institutional Review Board. The ESR requires researchers seeking funding from the Stanford Institute for Human-Centered Artificial Intelligence to consider negative ethical and societal risks arising from their proposal, to develop measures to mitigate those risks, and to collaborate with an interdisciplinary faculty panel to ensure concerns are addressed before funds are disbursed.[195] A pilot study of 41 submissions to this panel found that ‘58% of submitters felt that it had influenced the design of their research project, 100% are willing to continue submitting future projects to the ESR,’ and that submitting researchers sought additional training and scaffolding about societal risks and impacts.[196]

Figure 8: Stanford University Ethics and Society Review (ESR) process[197]

Understanding the potential impacts of AI and data science research can ensure researchers produce technologies that are fit for purpose and well-suited for the task at hand. The successful development and integration of an AI-powered sepsis diagnostic tool in a hospital in the USA offers an example of how researchers worked with key stakeholders to develop and design a life-changing product. Researchers on this project relied on continuous engagement with stakeholders in the hospital, including nurses, doctors and other staff members, to determine how the system could meet their needs.[198] By understanding these needs, the research team were able to tailor the final product so that it fitted smoothly within the existing practices and procedures of this hospital.

Open questions

There are several open questions on the use of broader societal impact statements. One relates to whether these statements should be a basis for a REC rejecting a research proposal. This was a major point of disagreement among our workshop participants. Some participants pushed back on the idea, out of concern that research institutions should not be in a position to determine what research is appropriate or inappropriate based on potential societal impacts, and that this may cause researchers to view RECs as a policing body for issues that have not yet occurred. Instead, these participants suggested a softer approach, whereby RECs require researchers to draft a broader societal impact statement but are not required to evaluate the substance of those statements. Other participants noted that these impact assessments would be likely to highlight clear cases where the societal risks are too great, and that RECs should incorporate these considerations into their final decisions.

Another open question is whether a broader societal impact evaluation should involve some form of ex post review, in which research institutions monitor the actual impacts of published research. This process would require significant resourcing. While there is no standard method for conducting these kinds of reviews yet, some researchers in the health field have called for ex post reviews conducted by an interdisciplinary committee of academics and stakeholders.[199]

Lastly, some workshop participants questioned whether a more holistic ethics review process could be broken up into parts handled by different sub-committees. For example, could questions of data ethics – how data should be handled, processed and stored, and which datasets are appropriate for researchers to use – have their own dedicated process or sub-committee? This sub-committee would need to adopt clear principles and set expectations with researchers for specific data ethics practices, and could also address the evolving dynamic between researcher and participants.

There was a suggestion that more input from data subjects could help, with a focus on how they can, and whether they should, benefit from the research, and on whether this would therefore constitute a different type or segment of ethical review. Participants mentioned the need for researchers to think relationally: to understand who the data subject is and the power dynamics at play, and to work out the best way of involving research participants in the analysis and dissemination of findings.

Recommendation 2: RECs should adopt multi-stage ethics review processes for AI and data science research

The problem

Ethical and societal risks of AI and data science research can manifest at different stages of research[200] – from early ideation to data collection, to pre-publication. Assessing the ethical and broader societal impacts of AI research can be difficult as the results of data-driven research cannot be known in advance of accessing and processing data or building machine learning (ML) models. Typically, RECs review research applications only once, before research begins, and with a narrow focus on ethical issues pertaining to methodology. This can mean that ethics review processes fail to catch risks that arise at later stages, such as potential environmental or privacy harms when research is published, particularly for ‘high-risk’ research that pertains to protected characteristics or has high potential for societal impact.

Recommendations

RECs should set up multi-stage and continuous ethics reviews, particularly for ‘high-risk’ AI research

RECs should experiment with requiring multiple stages of evaluation for research that raises particular ethical concern, such as an evaluation at the point of data collection and a separate evaluation at the point of publication. Ethics review processes should engage with considerations raised at all stages of the research lifecycle. RECs must move away from being the ‘owners’ of ethical thinking and towards being stewards who guide researchers through the review process.

This means challenging the notion of ethics review as a one-off exercise conducted at the start of a project, and instead shifting the approach of the REC and the ethics review process towards one that embeds ethical reflection throughout a project. This will benefit from more iterative ethics review processes, as well as additional interdisciplinary training for AI and data science researchers.

Several workshop participants suggested that multi-stage ethics review could consist of a combination of formal and informal review processes. Formal review processes could exist at the early and late stages, such as funding or publication, while at other points the research team could be asked to engage in more informal peer reviews or discussions with experts or reviewers. In the early stages of a project, milestones could be identified by the research team in collaboration with the REC. A milestone could be, for example, a grant submission, a change in researcher roles or the addition of new research partners to the project. Milestones could be used to trigger an interim review. Rather than following a standardised approach, this model allows for flexibility, as the milestones would be different for each project. It could also involve a tiered assessment: a standardised assessment of the risks a research project poses, which then determines the milestones.

Building on Burr & Leslie,[201] we can speak of four broad stages in an AI or data science research project: design, development, pre-publication and post-deployment.

At the stage of designing a research project, policies and resources should be in place to:

  • Ensure new funders and potential partnerships adhere to an ethical framework. Beyond legal due diligence, this is about establishing partnerships on the basis of their values and a project’s goals.
  • Implement scoping policies that establish whether a particular research project must undertake REC processes. Two approaches are suggested in the literature, and examining each organisation’s research and capabilities will help decide which is most suitable:
    • Sandler et al suggest a consultation process whereby RECs produce either ‘an Ethical Issues Profile report or a judgment that there are not substantive ethical issues raised’.[202]
    • The UK Statistics Authority employs an ethics self-assessment tool that determines a project’s level of risk.[203]
  • Additionally, scoping processes can result in establishing whether a project must undertake data, stakeholder, human rights or other impact assessments that focus on the broader societal impacts of the work (see Recommendation 1). Stanford’s Ethics and Society Review (ESR) offers one model for how institutions can set more ‘carrots and sticks’ for researchers to reflexively engage with the potential broader impacts of their research, by tying the completion of a societal impact statement to their funding proposal.

At the development stage of a project, a typical REC evaluation should be undertaken to consider any ethical risks. RECs should provide a point of contact to ensure that changes in the project’s aims and methods that raise new challenges are subjected to due reflection. This ensures an iterative process that aligns with the practicalities of research. RECs may also experiment with creating specialised sub-committees that address different issues, such as a separate data ethics review board that includes expertise in data ethics and domain-specific expertise, or a health data or social media data review board. Such a board could help evaluate potential impacts on people and society; depending on its composition, it could also be adept at reviewing the technical aspects of a research project.[204] This idea builds on a hybrid review mechanism that Ferretti et al propose, which merges aspects of the traditional model of RECs with specialised research committees that assess particular parts of a research project.[205]

One question that RECs must settle in practice is which projects should undergo particular REC processes, as it may be too burdensome for all projects to face this level of scrutiny. In some cases, a REC may determine that a project should undergo stricter scrutiny if an analysis of its potential impacts on various stakeholders highlights serious ethical issues. Whether or not a project is ‘in scope’ for a more substantial REC review process might depend on:

  • the level of risk it raises
  • the training or any certifications its researchers hold
  • whether it is reviewed by a relevant partner’s REC.

Determining what qualifies as a risk is challenging, as not all risks will be evident or within the imagination of a REC. More top-level guidance on risks (see Recommendation 4) and interdisciplinary and experiential membership on RECs (see Recommendation 3) can help ensure that a wider range of AI risks is identified.
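To make this concrete, the sketch below shows one hypothetical way the scoping criteria listed above could be combined into a triage decision about the depth of review a project receives. The tier names, thresholds and Python representation are illustrative assumptions, not a standard drawn from the literature or from our workshops.

```python
# Hypothetical triage sketch: combining the scoping criteria listed above
# (risk level, researcher training, partner REC review) into a review tier.
# Labels and thresholds are illustrative assumptions only.

def review_tier(risk_level: str, researchers_trained: bool,
                reviewed_by_partner_rec: bool) -> str:
    """Return a suggested depth of REC scrutiny for a project."""
    if risk_level == "high":
        # High-risk projects (e.g. involving protected characteristics or high
        # potential societal impact) always go to full multi-stage review.
        return "full multi-stage review"
    if risk_level == "medium" and not (researchers_trained or reviewed_by_partner_rec):
        return "standard REC review"
    # Lower-risk projects with trained researchers or an existing partner
    # review might only need a self-assessment plus spot checks.
    return "self-assessment with spot checks"

print(review_tier("high", True, False))   # full multi-stage review
print(review_tier("low", True, True))     # self-assessment with spot checks
```

The point of such a triage is not to automate judgement, but to make the criteria that route a project into deeper scrutiny explicit and consistent across applications.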

At the stage of pre-publication of a research project, RECs should encourage researchers to revisit the ethical and broader societal impact considerations that may have arisen earlier. In light of the research findings, have these changed at all? Have new risks arisen? At this stage, REC members can act as stewards to help researchers navigate publication requirements, which may include filling in the broader societal impact statements that some AI and ML conferences are beginning to implement. They might also connect researchers with subject-matter experts in particular domains, who can help them understand potential ethical risks with their research. Finally, RECs may be able to provide guidance on how to release research responsibly, including whether to release publicly a dataset or code that may be used to cause harm.

Lastly, RECs and research institutions should experiment with post-publication evaluations of the impacts of research. RECs could, for example, take a pool of research submissions that involved significant ethical review and conduct an analysis of how that work was received 2–3 years down the line. Criteria this assessment could look at may include how that work was received by the media or press, who has cited that work subsequently, and whether negative or positive impacts came to fruition.

Figure 9: Example of multi-stage ethics review process

This figure shows what a multi-stage ethics review process could look like. It involves an initial self-assessment of broader impact issues at the design stage, a REC review (and potential review by a specialised data ethics board) at the development stage, another review of high-risk research at the pre-publication stage, and a potential post-publication review of the research 2–3 years after it is published.

Examples of good practice

As explored above, there is not yet consensus on how to operationalise a continuous, multi-stage ethics review process, but there is an emerging body of work addressing ethics considerations at different stages in a project’s lifecycle. Building on academic research,[206] the UK’s Centre for Data Ethics and Innovation has proposed an ‘AI assurance’ framework for continuously testing the potential risks of AI systems. This framework involves the use of different mechanisms like audits, testing and evaluation at different stages of an AI product’s lifecycle.[207] However, this framework is focused on AI products rather than research, and further work would be needed to adapt it for research.

D’Aquin et al propose an ethics-by-design methodology for AI and data science research that takes a broader view of data ethics.[208] They note that assessment usually happens at the research design/planning stage, leaving researchers with no incentive to consider ethical issues as they emerge over the course of the research; instead, consideration of emerging ethical risks should be ongoing.[209] A few academic and corporate research institutions, such as the Alan Turing Institute, have already introduced or are in the process of implementing continuous ethics review processes (see Appendix 2). Further research is required to study how these work in practice.

Open questions

A multi-stage research review process should capture more of the ethical issues that arise in AI research, and enable RECs to evaluate the ex post impacts of the research they review. However, continuous, multi-stage reviews require a substantial increase in resources and so are an option only for institutions that are ready to invest in ethics practices. These proposals could require multiples of the current time commitments of REC members and officers, and therefore greater compensation for REC members.

The prospect of implementing a multi-stage review process raises further questions of scope, remit and role of ethics reviews. Informal reviews spread over time could see REC members take more of an advisory role than in the compliance-oriented models of the status quo, allowing researchers to informally check in with ethics experts, to discuss emerging issues and the best way to approach them. Dove argues that the role of RECs is to operate as regulatory stewards, who guide researchers through the review process.[210] To do this, RECs should establish communication channels for researchers to get in touch and interact. However, Ferretti et al warn there is a risk that ethics oversight might become inefficient if different committees overlap, or if procedures become confusing and duplicated. It would also be challenging to bring together different ethical values and priorities across a range of stakeholders, so this change needs sustaining over the long term.[211]

Recommendation 3: Include interdisciplinary expertise in REC membership

The problem

The make-up and scope of a REC review came up repeatedly in our workshops and literature reviews, with considerable concern raised about how RECs can accurately capture the wide array of ethical challenges posed by different kinds of AI and data science research. There was wide agreement within our workshop on the importance of ensuring that different fields of expertise have their voices heard in the REC process, and that the make-up of RECs should reflect a diversity of backgrounds.

Recommendations

RECs must include more interdisciplinary expertise in their membership

In recruiting new members, RECs should draw on people from research and professional fields beyond computer science, such as the social sciences, humanities and other STEM disciplines. By having these different disciplines present, each can bring a different ethical lens to the challenges that a project may raise. RECs might also consider including members who work in legal, communications or marketing teams to ensure that the concerns raised speak to a wider audience and respond to broader institutional contexts. Interdisciplinarity involves the development of a common language, a reflective stance towards research, and a critical perspective towards science.[212] If this expertise is not present at an institution, RECs could make greater use of external experts for specific questions that arise from data science research.[213]

RECs must include individuals with different experiential expertise

RECs must also seek to include members who represent different forms of experiential expertise, including individuals from historically marginalised groups whose perspectives are often not represented in these settings. This both brings more diverse experiences into discussions about data science and AI research outputs, and helps ensure that those outputs meet the values of a culturally rich and heterogeneous society.

Crucially, the mere representation of a diversity of viewpoints is not enough to ensure the successful integration of those views into REC decisions. Members must feel empowered to share their concerns and be heard, and careful attention must be paid to the power dynamics that underlie how decisions are made within a REC. Mechanisms for ensuring more transparent and ethical decision-making practices are an area of future research worth pursuing.

In terms of the composition of RECs, Ferretti et al suggest that these should become more diverse and include members of the public and research subjects or communities affected by the research.[214] Besides the public, members from inside an institution should also be selected to achieve a multi-disciplinary composition of the board.

Examples of good practice

One notable example is the SAIL (Secure Anonymised Information Linkage) Databank, a Wales-wide research databank with approximately 30 billion records of individual-level population datasets. Requests to access the databank are reviewed by an Information Governance Review Panel which includes representatives from public health agencies, clinicians, and members of the public who may be affected by the uses of this data. More information on SAIL can be found in Appendix 2.

Open questions

Increasing experiential and subject-matter expertise in AI and data science research reviews will hopefully lead to more holistic evaluations of the kinds of risks that may arise, particularly given the wide range of societal applications of AI and data science research. However, expertise from members of the public and external experts must be fairly compensated, and the impact of more diverse representation on these boards should be the subject of future study and evaluation.

Figure 10: The potential make-up of an AI/data science ethics committee[215]

For academic/corporate research institutions

Recommendation 4: Create internal training and knowledge-sharing hubs for researchers and REC members, and encourage more cross-institutional learning

The problem

A recurring concern raised by members of our workshops was a lack of shared resources to help RECs address common ethical issues in their research. This was coupled with a lack of transparency and openness of decision-making in many modern RECs, particularly for some corporate institutions where publication review processes can feel opaque to researchers. When REC processes and decisions are enacted behind closed doors, it becomes challenging to disseminate lessons learned to other institutions and researchers. It also raises a challenge for researchers who may come to view a REC as a ‘compliance’ body, rather than a resource for seeking advice and guidance. Several workshop participants noted that shared resources and trainings could help REC members, staff and students to better address these issues.

Recommendations

Research institutions should create institutional training and knowledge-sharing hubs

These hubs can serve five core functions:

1. Pooling shared resources on common AI and data science ethics challenges for students, staff and REC members to use.

The repository can compile resources, news articles and literature on ethical risks and impacts of AI systems, tagged and searchable by research type, risk or topic. These can prompt reflection on research ethics by providing students and staff with current, real-world examples of these risks in practice.

The hub could also provide a list of ‘banned’ or problematic datasets that staff or students should not use. This could help address concerns around datasets that are collected without underlying consent from research subjects, and which are commonly used as ‘benchmark’ datasets. The DukeMTMC dataset of videos recorded on Duke University’s campus, for example, continues to be used by computer vision researchers in papers, despite having been withdrawn by the university due to ethical concerns. Similar efforts to create a list of problematic datasets are underway at some major AI and ML conferences, and some of our workshop participants suggested that some research institutions already maintain lists like this.

2. Providing hypothetical or actual case studies of previous REC submissions and decisions to give a sense of the kinds of issues others are facing.

Training hubs could include repositories of previous applications that have been scrutinised and approved by the pertinent REC, which form a body of case studies that can inform both REC policies and individual researchers. Given the fast pace of AI and data science research, RECs can often encounter novel ethical questions. By logging past approved projects and making them available to all REC members, RECs can ensure consistency in their decisions about new projects.

We suggest that logged applications also be made available to the institution’s researchers for their own preparation when undertaking the REC process. Making applications available must be done with the permission of the relevant project manager or principal investigator, where necessary. To support the creation of these repositories, we have developed a resource consisting of six hypothetical AI and data science REC submissions that can be used for training purposes.[216]

3. Listing the institutional policies and guidance developed by the REC, such as policies outlining the research review process, self-assessment tools and societal impact assessments (see Recommendation 1).

By including a full list of its policies, a hub can foster dialogue between the different processes within a research institution. Documentation from across the organisation can be shared and framed in terms of its importance for pursuing thoughtful and responsible research.

In addition to institutional guidelines, we suggest training hubs include national, international or professional society guidelines that may govern specific kinds of research. For example, researchers seeking to advance healthcare technologies in the UK should ensure compliance with relevant Department of Health and Social Care guidelines, such as their guidelines for good practice for digital and data-driven health technologies.[217]

4. Providing a repository of external experts in subject-matter domains who researchers and REC members can consult with.

This would include a curated list of subject-matter experts in specific domains that students, staff and REC members can consult with. This might include contact details for experts in subjects like data protection law or algorithmic bias within or outside of the institution, but may extend to include lived experience experts and civil society organisations who can reflect societal concerns and potential impacts of a technology.

5. Signposting to other pertinent institutional policies (such as compliance, data privacy, diversity and inclusion).

By listing policies and resources on data management, sharing, access and privacy, training hubs could ensure researchers have more resources and training on how to properly manage and steward the data they use. Numerous frameworks are readily available online, such as the FAIR Principles,[218] promoting findability, accessibility, interoperability and reuse of digital assets; and DCC’s compilation of metadata standards for different research fields.[219]
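As an illustration only, the sketch below shows what a minimal, FAIR-inspired metadata record for a dataset might look like in a training hub. The field names, identifier and URL are hypothetical, and the record does not follow any particular standard from the DCC compilation cited above.

```python
# Illustrative sketch: a minimal, FAIR-inspired metadata record a training hub
# might encourage researchers to complete for each dataset. All values are
# hypothetical placeholders.

dataset_record = {
    # Findable: a persistent identifier and rich description
    "identifier": "doi:10.1234/example-dataset",            # hypothetical DOI
    "title": "Synthetic hospital admissions sample",
    "keywords": ["health", "synthetic data", "admissions"],
    # Accessible: where and under what conditions the data can be obtained
    "access_url": "https://example.org/data/admissions",    # placeholder URL
    "access_conditions": "registered researchers only",
    # Interoperable: formats and vocabularies others can work with
    "format": "CSV",
    "schema": "illustrative column dictionary supplied with the data",
    # Reusable: licence and provenance
    "licence": "CC-BY-4.0",
    "provenance": "Generated from anonymised 2019 records",
}

for key, value in dataset_record.items():
    print(f"{key}: {value}")
```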

Hubs could also include the institution’s policies on data labeller practices (if such policies exist). Several academic institutions have developed policies regarding Amazon Mechanical Turk (MTurk) workers that cover fair pay, communication and acknowledgment.[220] [221] Some resources have even been co-written with input directly from MTurk workers. These resources vary from institution to institution, and there is a need for UK Research and Innovation (UKRI) and other national research institutions to codify these requirements into practical guidance for research institutions. One resource we suggest RECs tap into is the know-how and policies of human resources departments. Most large institutions and companies will already have pay and reward schemes in place. Data labellers and annotators must have access to the same protections as other legally defined positions.

The hub can also host or link to forums or similar communication channels that encourage informal peer-to-peer discussions. All staff should be welcomed into such spaces.

Examples of good practice

There are some existing examples of shared databases of AI ethics issues, including the Partnership on AI’s AI Incident Database and Charlie Pownall’s AI, Algorithmic, and Automation Incidents and Controversies Database. These databases compile news reports of instances of AI risks and ethics issues and make them searchable by type and function.[222] [223]

The Turing Institute’s Turing Way offers an excellent example of a research institution’s creation of shared resources for training and research ethics issues. For more information on the Turing Way, see Appendix 2.

Open questions

One pertinent question is whether these hubs should exist at the institutional or national level. Training hubs could start at the institutional level in the UK and, over time, connect to a shared resource managed by a centralised body like UKRI. It may be easier to start at the institutional level with repositories of relevant documentation, and spaces that foster dialogue among an institution’s workforce. An international hub could help RECs coordinate with one another and with external stakeholders through international and cross-institutional platforms, and explore the opportunity of inter-institutional review standards and/or shared ethics review processes. We suggest that training hubs be made publicly accessible and open to other institutions, and that they are regularly reviewed and updated as appropriate.

Recommendation 5: Corporate labs must be more transparent about their decision-making and do more to engage with external partners

The problem

Several of our workshop participants noted that corporate RECs face particular opportunities and challenges in reviews of AI and data science research. Members of corporate RECs and research institutions shared that they are likely to have more resources to undertake ethical reviews than public labs, and several noted that these reviews often come at various stages of a project’s lifecycle, including near publication.

However, there are serious concerns around a lack of internal and external transparency in how some corporate RECs make their decisions. Some researchers within these institutions have said they are unable to assess what kind of work is acceptable or unacceptable, and there are reports of some companies changing research findings for reputational reasons. Some participants claimed that corporate labs can be more risk averse when it comes to seeking external stakeholder feedback, due to privacy and trade secret concerns. Finally, corporate RECs are made up of members of that institution, and do not reflect experiential or disciplinary expertise outside of the company. Several interview and workshop participants noted that corporate RECs often do not consult with external experts on research ethics or broader societal impact issues, choosing instead to keep such deliberations in house.

Recommendations

Corporate labs must publicly release their ethical review criteria and process

To address concerns around transparency, corporate RECs should publicly release details of their review processes, including what criteria they evaluate against and how decisions are made. This is crucial both for public-private research collaborations, where the findings of public institutions risk being censored over private reputational concerns, and for internal researchers, who need to know what ethical considerations to factor into their research. Corporate RECs should also commit to releasing transparency reports citing how many research studies they have rejected, amended and approved, on what grounds, and some example case studies (even if hypothetical) exploring the reasons why.

Corporate labs should consult with external experts on their research ethics reviews, and ideally include external and experiential experts as members of their ethics review boards

Given their research may have significant impacts on people and society, corporate labs must ensure their research ethics review boards include individuals who sit outside the company and reflect a range of experiential and disciplinary expertise. Not including this expertise will mean that corporate labs lack meaningful evaluations of the risks their research can pose. To complement their board membership, corporate labs should also consult more regularly on ethics issues with external experts to understand the impact of their research on different communities, disciplines and sectors.

Examples of good practice

In a blog post from 2022, the AI research company DeepMind explained how its ethical principles applied to the evaluation of a specific research project relating to the use of AI for protein folding.[224] In this post, DeepMind stated it had engaged with more than 30 experts outside the organisation to understand what kinds of challenges the research might pose, and how it might be released responsibly. This offers a model for how private research labs might consult external expertise, and could be replicated as standard practice across DeepMind’s and other companies’ activities.

In our research, we did not identify any corporate AI or data science research lab that has released their policies and criteria for ethical review. We also did not identify any examples of corporate labs that have experiential experts or external experts on their research ethics review boards.

Open questions

Some participants noted that it can be difficult for corporate RECs to be more transparent due to concerns around trade secrets and competition – if a company releases details on its research agenda, competitors may use this information for their own gain. One option suggested by our workshop participants is to engage in questions around research practices and broader societal impacts with external stakeholders at a higher level of abstraction that avoids getting into confidential internal details. Initiatives like the Partnership on AI seek to create a forum where corporate labs can more openly discuss common challenges and seek feedback in semi-private ways. However, corporate labs must engage in these conversations with some level of accountability. Reporting what actions they are taking as a result of those stakeholder engagements is one way to demonstrate how these engagements are leading to meaningful change.

For funders, conference organisers and other actors in the research ecosystem

Recommendation 6: Develop standardised principles and guidance for AI and data science research ethics

The problem

A major challenge observed by our workshop participants is that RECs often produce inconsistent decisions, due to a lack of widely accepted frameworks or principles that deal specifically with AI and data science research ethics issues. Institutions that are ready to update their processes and standards are left to take their own risks in deciding how to draft new rules. In the literature, a plethora of principles, frameworks and guidance around AI ethics has started to converge on principles like transparency, justice, fairness, non-maleficence, responsibility and privacy.[225] However, there has yet to be a global effort to translate these principles into AI research ethics practices, or to determine how ethical principles should be interpreted or operationalised by research institutions.[226] This requires researchers to consider how ethics is interpreted and understood in regions other than Western societies, which have so far not adequately featured in this debate.

Recommendations

UK policymakers should engage in a multi-stakeholder international effort to develop a ‘Belmont 2.0’ that translates AI ethics principles into specific guidelines for AI and data science research.

There is a significant need for a centralised body, such as the OECD, the Global Partnership on AI or another international body, to lead a multinational and inclusive effort to develop more consistent ethical guidance for RECs to use with AI and data science research. The UK must take a lead on this and use its position in these bodies to call for the development of a ‘Belmont 2.0’ for AI and data science.[227] This effort must involve representatives from all nations and avoid the pitfalls of previous research ethics principle developments that have overly favoured Western conceptions of ethics and principles. It should seek to define a minimum global standard of research ethics assessment that is flexible, responsive to and considerate of local circumstances.

By engaging in a multinational effort, UK national research ethics bodies like the UK Research Integrity Office (UKRIO) can develop more consistent guidance for UK academic RECs to address common challenges. This could include standardised trainings on broader societal impact issues, bias and consent challenges, privacy and identifiability issues, and other questions relating to research integrity, research ethics and broader societal impact considerations.

We believe that UKRIO can also help standardise REC practice by developing common guidance for public-private AI research partnerships, as well as consistent guidance for academic RECs. A substantial amount of AI research involves public-private partnerships. Common guidance could include specific support for core language around intellectual property concerns and data privacy issues.

Examples of good practice

There are some existing cross-national associations of RECs that jointly draft guidance documents or conduct training programmes. The European Network of Research Ethics Committees (EUREC) is one such example, though others could be created for other regions, or specifically for RECs that evaluate AI and data science research.[228]

With respect to laws and regulations, experts observe a gap in the regulation of AI and data science research. For example, the General Data Protection Regulation (GDPR) does provide some guidance on how European research institutions should collect, handle and use data for research purposes, though our participants noted this guidance has been interpreted by different institutions and researchers in widely different ways, leading to legal uncertainty.[229] While the UK Information Commissioner’s Office (ICO) has published guidance on AI and data protection,[230] it does not offer specific guidance for AI and data science researchers.

Open questions

It is important to note that standardised principles for AI research are not a silver bullet. Significant challenges will remain in the implementation of these principles. Furthermore, as the history of biomedical research ethics principle development has shown, it will be essential for a body or network of bodies with global legitimacy and authority to steer the development of these principles, and to ensure that they accurately reflect the needs of regions and communities that are traditionally underrepresented in AI and data science research.

Recommendation 7: Incentivise a responsible research culture

The problem

RECs form one part of the research ethics ecosystem, a complex matrix of responsibility shared and supported by other actors including funding bodies, conference organisers, journal editors and researchers themselves.[231] In our workshops, one of the many challenges that our participants highlighted was a lack of strong incentives in this research ecosystem to consider ethical issues. In some cases, considering ethical risks may not be rewarded or valued by journals, funders or conference organisers. Given the ethical issues that AI and data science research can raise, it is essential for these different actors to align their incentives and encourage AI and data science researchers to reflect on and document the societal impacts of their research.

Recommendations

Conference organisers, funders, journal editors and other actors in the research ecosystem must incentivise and reward ethical reflection

Different actors in the research ecosystem can encourage a culture of ethical behaviour. Funders, for example, can require researchers to produce a broader societal impact statement in order to receive a grant, and conference organisers and journal editors can put similar requirements in place when research is submitted, rewarding papers that exemplify strong ethical consideration. Publishers could potentially be assigned to evaluate broader societal impact questions in addition to research integrity issues.[232] By creating incentives throughout the research ecosystem, ethical reflection can become more desirable and rewarded.

Examples of good practice

Some AI and data science conference organisers are putting in place measures to incentivise researchers to consider the broader societal impacts of their research. The 2020 NeurIPS conference, one of the largest AI and machine learning conferences in the world, required submissions to include a statement reflecting on broader societal impact, and created guidance for researchers to complete this.[233] The conference had a set of reviewers who specifically evaluated these impact statements. The use of these impact statements led to some controversy, with some researchers suggesting they could lead to a chilling effect on particular types of research, and others pointing to the difficulty of creating these kinds of impact assessments for more theoretical forms of AI research.[234] As of 2022, the NeurIPS conference has included these statements as part of its checklist of expectations for submission.[235] In a 2022 report, the Ada Lovelace Institute, CIFAR, and the Partnership on AI identified several measures that AI conference organisers could take to incentivise a culture of ethical reflection.[236]

There are also proposals underway for funders to include these considerations. Gardner and colleagues recommend that grant funding and public tendering of AI systems require a ‘Trustworthy AI Statement’.[237]

Open questions

Enabling a stronger culture of ethical reflection and consideration in the AI and data science research ecosystem will require funding and resources. Reviewers of AI and data science research papers for conferences and journals already face a tough task; this work is voluntary and unpaid, and these reviewers often lack clear standards or principles to review against. We believe more training and support will be needed to ensure this recommendation can be successfully implemented.

Recommendation 8: Increase funding and resources for ethical reviews of AI and data science research

The problem

RECs face significant operational challenges around compensating their members for their time, providing timely feedback, and maintaining the necessary forms of expertise on their boards. A major challenge is the lack of resources that RECs face, and their reliance on voluntary and unpaid labour from institutional staff.

Recommendations

As part of their R&D strategy, UK policymakers must earmark additional funding for research institutions to provide greater resources, training and support to RECs.

In articulating national research priorities, UK policymakers should dedicate funding to initiatives that focus on interdisciplinary ethics training and support for research ethics committees. Funding must be made available for continuous, multi-stage research ethics review processes, and for rewarding this behaviour, from organisations including UK Research and Innovation (UKRI) and the UK research councils. Future iterations of the UK’s National AI Strategy should earmark funding for ethics training and for the work of RECs to expand their scope and remit.

Increasing funding and resources for institutional RECs will enable these essential bodies to undertake their critical work fully and holistically. Increased funding and support will also enable RECs to expand their remit and scope to capture risks and impacts of AI and data science research, which are essential for ensuring AI and data science are viewed as trustworthy disciplines and for mitigating the risks this research can pose. The traditional approach to RECs has treated their labour as voluntary and unpaid. RECs must be properly supported and resourced to meet the challenges that AI and data science pose.

Acknowledgements

This report was authored by:

  • Mylene Petermann, Ada Lovelace Institute
  • Niccolo Tempini, Senior Lecturer in Data Studies at the University of Exeter’s Institute for Data Science and Artificial Intelligence (IDSAI)
  • Ismael Kherroubi Garcia, Kairoi
  • Kirstie Whitaker, Alan Turing Institute
  • Andrew Strait, Ada Lovelace Institute

This project was made possible by the Arts and Humanities Research Council who provided a £100k grant for this work. We are grateful for our reviewers – Will Hawkins, Edward Dove and Gabrielle Samuel. We are also grateful for our workshop participants and interview subjects, who include the following and several others who wished to remain anonymous:

  • Alan Blackwell
  • Barbara Prainsack
  • Brent Mittelstadt
  • Cami Rincón
  • Claire Salinas
  • Conor Houghton
  • David Berry
  • Dawn Bloxwich
  • Deb Raji
  • Deborah Kroll
  • Edward Dove
  • Effy Vayena
  • Ellie Power
  • Elizabeth Buchanan
  • Elvira Perez
  • Frances Downey
  • Gail Seymour
  • Heba Youssef
  • Iason Gabriel
  • Jade Ouimet
  • Josh Cowls
  • Katharine Wright
  • Kerina Jones
  • Kiruthika Jayaramakrishnan
  • Lauri Kanerva
  • Liesbeth Venema
  • Mark Chevilet
  • Nicola Stingelin
  • Ranjit Singh
  • Rebecca Veitch
  • Richard Everson
  • Rosie Campbell
  • Sara Jordan
  • Shannon Vallor
  • Sophia Batchelor
  • Thomas King
  • Tristan Henderson
  • Will Hawkins

Appendix 1: Methodology and limitations

This report uses the term data science to mean the extraction of actionable insights and knowledge from data, which involves preparing data for analysis and performing data analysis using statistical methods, leading to the identification of patterns in the data.[238]

This report uses the term AI research in its broadest sense, to cover research into software and systems that display intelligent behaviour, which includes subdisciplines like machine learning, reinforcement learning, deep learning and others.[239]

This report relied on a review of the literature on RECs, research ethics and broader societal impact questions in AI, most of which covers challenges in academic RECs. This report also draws on a series of workshops with 42 members of public and private AI and data science research institutions in May 2021, along with eight interviews with experts in research ethics and AI issues. These workshops and interviews provided some additional insight into the ways corporate RECs operate, though we acknowledge that much of this information is challenging to verify given the relative lack of transparency of many corporate institutions in sharing their internal research review processes (one of our recommendations is explicitly aimed at this challenge). We are grateful to our workshop participants and research subjects for their support in this project.

This report contains two key limitations:

  1. While we sought to review the literature of ethics review processes in both commercial and academic research institutions, the literature on RECs in industry is scarce and largely reliant on statements and articles published by companies themselves. Their claims are therefore not easily verifiable, and sections relating to industry practice should be read with this in mind.
  2. The report exclusively focuses on research ethics review processes at institutions in the UK, Europe and the USA, and our findings are therefore not representative of a broader international context. We encourage future work to focus on how research ethics and broader societal impact reviews are conducted in other regions.

Appendix 2: Examples of ethics review processes

In our workshops, we invited presentations from four UK organisations to share how they currently construct their ethics review processes. We include short descriptions of three of these institutions below:

The Alan Turing Institute

The Alan Turing Institute was established in 2015 as the UK National Institute for Data Science. In 2017, artificial intelligence was added to its remit, on Government recommendation. The Turing Institute was created by five founding universities and the UK Engineering and Physical Sciences Research Council.[240] The Turing Institute has since published The Turing Way, a handbook for reproducible, ethical and collaborative data science. The handbook is open source and community-driven.[241]

In 2020, The Turing Way expanded to a series of guides that covered reproducible research,[242] project design,[243] communication,[244] collaboration[245] and ethical research.[246] For example, the Guide for Ethical Research advises researchers to consider consent in cases where the data is already available, and to understand the terms and conditions under which the data has been made available. The guide also advises researchers to consider further societal consequences. This involves an assessment of the societal, environmental and personal risks involved in research, and of the measures in place to mitigate these risks.

As of writing, the Turing Institute is working on changes to its ethics review processes towards a continuous integration approach based on the model of ‘DevOps’. This is a term used in software development for a process of continuous integration and feedback loops across the stages of planning, building and coding, deployment and operations. To ensure ethical standards are upheld in a project, this model involves frequent communication and ongoing, real-time collaboration between researchers and research ethics committees. Currently, an application to a REC for ethics review is usually submitted after a project has been defined and a funding application has been made. The continuous integration approach, by contrast, covers all stages in the research lifecycle, from project design to publication, communication and maintenance. For researchers, this means considering research ethics from the beginning of a research project and fostering a continuous conversation with RECs, for example when defining the project or when submitting an application for funding, where RECs could offer support. The project documentation would be updated continuously as the project progresses through various stages.

The project would go through several rounds of review by RECs, for example when accessing open data, during data analysis or at the publication stage. This is a rapid, collaborative process in which researchers incorporate the comments from the expert reviewers, and it ensures that researchers address ethical issues as they arise throughout the research lifecycle. For example, the ethical considerations of publishing synthetic data cannot be known in advance; an ongoing ethics review is therefore required.
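As a speculative sketch only (not a description of the Turing Institute’s actual system), the continuous integration idea could be represented as ethics checkpoints attached to stages of the project lifecycle, so that progressing to a new stage triggers the associated review conversation. The stage names and tasks below are assumptions for illustration.

```python
# Speculative sketch: ethics checkpoints attached to project stages, so that
# moving a project forward raises the associated review tasks with the REC.
# Stage names and tasks are illustrative assumptions.

from typing import Callable, Dict, List

ETHICS_CHECKPOINTS: Dict[str, List[str]] = {
    "project definition": ["initial self-assessment", "REC scoping conversation"],
    "funding application": ["REC support and sign-off"],
    "data access": ["data ethics review", "data access agreement check"],
    "analysis": ["interim review if methods change"],
    "publication": ["broader impact statement", "responsible release check"],
}

def advance_stage(stage: str, notify_rec: Callable[[str], None]) -> None:
    """Raise the ethics tasks associated with entering a new project stage."""
    for task in ETHICS_CHECKPOINTS.get(stage, []):
        notify_rec(f"{stage}: {task}")

advance_stage("data access", notify_rec=print)
```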

This model of research ethics review requires a pool of practising researchers as reviewers. There would also need to be decision-makers who are empowered by the institution to reject an ethics application, even if funding is in place. Furthermore, this model requires permanent specialised expert staff who would be able to hold these conversations with researchers, which also requires additional resources.

SAIL Databank

The Secure Anonymised Information Linkage (SAIL) Databank[247] is a platform for the robust, secure storage and use of anonymised person-based data for research to improve health, wellbeing and services in Wales. The data held in this repository can be linked together to address research questions, subject to safeguards and approvals. The databank contains over 30 billion records from individual-level population datasets from about 400 data providers, used by approximately 1,200 data users. The data is primarily sourced from Wales, but also from England.

The data is securely stored, and access is tightly controlled through a robust and proportionate ‘privacy by design’ methodology, which is regulated by a team of specialists and overseen by an independent Information Governance Review Panel (IGRP). The core datasets come from Welsh organisations, and include hospital inpatient and outpatient data. With the Core Restricted Datasets, the provider reserves the right to review every proposed use of the data, while approval for the Core Datasets is devolved to the IGRP.

The data provider divides the data into two parts. The demographic data goes to a trusted third party (an NHS organisation), which matches the data against a register of the population of Wales and assigns each person represented a unique anonymous code. The content data is sent directly to SAIL. The two parts can be brought together to create de-identified copies of the data, which are then subjected to further controls and presented to researchers in anonymised form.
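As a purely illustrative sketch of the general split-and-link pattern described above (not SAIL’s actual implementation), the following shows how demographic and content data might be separated, with a trusted third party assigning an anonymous linkage key. The hashing scheme, salt and field names are assumptions for illustration.

```python
# Illustrative sketch of the split-and-link pattern: the trusted third party
# sees only demographic data and returns an anonymous code; the content data
# is stored separately and never carries direct identifiers.

import hashlib

def trusted_third_party_key(name: str, date_of_birth: str, salt: str) -> str:
    """Toy stand-in for matching a person against a population register and
    assigning a unique anonymous code."""
    return hashlib.sha256(f"{salt}|{name}|{date_of_birth}".encode()).hexdigest()[:12]

record = {
    "name": "Jane Doe",
    "date_of_birth": "1980-01-01",
    "diagnosis_code": "A41.9",   # content data
}

# Split: demographic part to the trusted third party, content part to the databank.
demographics = {k: record[k] for k in ("name", "date_of_birth")}
content = {"diagnosis_code": record["diagnosis_code"]}

anon_key = trusted_third_party_key(demographics["name"],
                                   demographics["date_of_birth"],
                                   salt="secret-held-by-third-party")  # hypothetical salt

# The databank stores only the anonymous key alongside the content data.
deidentified_record = {"person_key": anon_key, **content}
print(deidentified_record)
```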

The ‘privacy by design’ methodology is enacted in practice by a suite of physical, technical and procedural controls. This is guided by the ‘five safes’ model, for example, ‘safe projects’, ‘safe people’ (through research accreditation) or ‘safe data’ (through encryption, anonymisation or control before information can be accessed).

In practice, if a researcher wishes to work with some of the data, they submit a proposal and SAIL reviews its feasibility and scope. The researcher is assigned an analyst who has extensive knowledge of the available datasets, and who advises on which datasets they need to request data from and which variables will help answer their questions. After this process, the researcher makes an application to SAIL, which goes to the IGRP. The application can be approved or rejected, or amendments can be recommended. The IGRP comprises representatives from organisations including Public Health Wales, the Welsh Government, Digital Health and Care Wales and the British Medical Association (BMA), as well as members of the public.

The criteria for review include, for example, an assessment of whether the research contributes new knowledge, whether it improves health, wellbeing and public services, whether there is a risk that the output may be disclosive of individuals or small groups, and whether measures are in place to mitigate the risks of disclosure. In addition, public engagement and involvement ensures that a public voice is present when considering potential societal impact, and provides a public perspective on the research.

Researchers must complete a recognised safe researcher training programme and abide by the data access agreement. The data is then provided through a virtual environment, which allows the researchers to carry out the data analysis and request results. However, researchers cannot transfer data out of the environment. Instead, researchers must propose to SAIL which results they would like to transfer for publication or presentation, and these are then checked by someone at SAIL to ensure that they do not contain any disclosive elements.

Previously, the main data type was health data, but more recently SAIL has dealt increasingly with administrative data (e.g. the UK Census) and with emerging data types, which may require multiple approval processes and can create coordination problems. For example, data access that falls under the Digital Economy Act must have approval from the Research Accreditation Panel, and there is an expectation that each project will have undergone formal research ethics review, in addition to the IGRP.

University of Exeter

The University of Exeter has a central University Ethics Committee (UEC) and 11 devolved RECs at college or discipline level. The devolved RECs report to the UEC, which is accountable to the University Council (the governing body).[248] The University also operates a dual assurance scheme, with an independent member of the governing body providing additional oversight.

The work of the RECs is based on a single research ethics framework,[249] first developed in 2013, which sets common standards and requirements while allowing flexibility to adapt to local circumstances. The framework underwent substantial revision in 2019/20, a collaborative process with researchers from all disciplines aimed at reflecting the requirements of each discipline while meeting common standards. Exeter also provides guidance and training on research ethics, as well as taught content for undergraduate and postgraduate students.

The REC operating principles[250] include:

  • independence (mitigating conflicts of interest and ensuring sufficient impartial scrutiny; enhancing lay membership of committees)
  • competence (ensuring that membership of committees/selection of reviewers is informed by relevant expertise and that decision-making is consistent, coherent, and well-informed; cross-referral of projects)
  • facilitation (recognising the role of RECs in facilitating good research and support for researchers; ethical review processes recognised as valuable by researchers)
  • transparency and accountability (REC decisions and advice to be open to scrutiny with responsibilities discharged consistently).

Some of the challenges include a lack of specialist knowledge, especially on emerging issues such as AI and data science, new methods or interdisciplinary research. Another challenge is information governance, e.g. ensuring that researchers have access to research data, as well as appropriate options for research data management and secure storage. Ensuring transparency and clarity for research participants is also important, e.g. active or ongoing consent, where relevant. Reviews of secondary data use take a risk-adapted or proportionate approach.

In terms of data sharing, researchers must have the appropriate permissions in place and understand their requirements. There are concerns about the potential misuse of data and research outputs, and researchers are encouraged to reflect on the potential implications or uses of their research and to consider the principles of Responsible Research and Innovation (RRI) with the support of RECs. The potential risks associated with data sharing and international collaborations mean that informed decision-making around these issues is important.

Due to the potentially significant risks of AI and data science research, the University currently focuses on the Trusted Research Guidance issued by the Centre for the Protection of National Infrastructure. Export Control compliance also plays a role, but there is a greater need for awareness and training.

The University of Exeter has scope within its existing research ethics framework to set up a specialist data science and AI ethics reference group (an advisory group). This requires further work, e.g. on how to balance having a highly specialist group of researchers review the research against maintaining a sufficient level of independence. It would also require more specialist training for RECs and researchers.

Furthermore, the University is currently evaluating how to review international and multi-site research, and how to streamline the process of ethics review as much as possible to avoid potential duplication in research ethics applications. This also requires capacity building with research partners.

Finally, improving capabilities for reporting, auditing and monitoring plays a significant role, especially as the University has recently implemented a new, single online research ethics application and review system.


Footnotes

[1] Source: Zhang, D. et al. (2022). ‘The AI Index 2022 Annual Report’. arXiv. Available at:
https://doi.org/10.48550/arXiv.2205.03468

[2] Bender, E.M. (2019). ‘Is there research that shouldn’t be done? Is there research that shouldn’t be encouraged?’. Medium. Available at: https://medium.com/@emilymenonbender/is-there-research-that-shouldnt-be-done-is-there-research-that-shouldn-t-be-encouraged-b1bf7d321bb6

[3] Truong, K. (2020). ‘This Image of a White Barack Obama Is AI’s Racial Bias Problem In a Nutshell’. Vice. Available at: https://www.vice.com/en/article/7kpxyy/this-image-of-a-white-barack-obama-is-ais-racial-bias-problem-in-a-nutshell

[4] Small, Z. ‘600,000 Images Removed from AI Database After Art Project Exposes Racist Bias’. Hyperallergic. Available at: https://hyperallergic.com/518822/600000-images-removed-from-ai-database-after-art-project-exposes-racist-bias/

[5] Richardson, R. (2021). ‘Racial Segregation and the Data-Driven Society: How Our Failure to Reckon with Root Causes Perpetuates Separate and Unequal Realities’. Berkeley Technology Law Journal, 36(3). Available at: https://papers.ssrn.com/abstract=3850317; and Buolamwini, J. and Gebru, T. (2018). ‘Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification’. Proceedings of the 1st Conference on Fairness, Accountability and Transparency. Conference on Fairness, Accountability and Transparency, PMLR, pp. 77–91. Available at: https://proceedings.mlr.press/v81/buolamwini18a.html

[6] Petrozzino, C. (2021). ‘Who pays for ethical debt in AI?’. AI and Ethics, 1(3), pp. 205–208. Available at: https://doi.org/10.1007/s43681-020-00030-3

[7] Abdalla, M. and Abdalla, M. (2021). ‘The Grey Hoodie Project: Big Tobacco, Big Tech, and the Threat on Academic Integrity’. AIES ’21: Proceedings of the 2021 AAAI/ACM Conference on AI, Ethics, and Society. Available at: https://doi.org/10.1145/3461702.3462563

[8] For example, a recent paper from researchers at Microsoft includes guidance for a structured exercise to identify potential limitations in AI research. See: Smith, J. J. et al. (2022). ‘REAL ML: Recognizing, Exploring, and Articulating Limitations of Machine Learning Research’. 2022 ACM Conference on Fairness, Accountability, and Transparency, pp. 587–597. Available at: https://doi.org/10.1145/3531146.3533122

[9] Metcalf, J. and Crawford, K. (2016). ‘Where are human subjects in big data research? The emerging ethics divide.’ Big Data & Society, 3(1). Available at: https://doi.org/10.1177/2053951716650211

[10] Metcalf, J. and Crawford, K. (2016).

[11] Hecht, B. et al. (2021). ‘It’s Time to Do Something: Mitigating the Negative Impacts of Computing Through a Change to the Peer Review Process’. arXiv. Available at:
https://doi.org/10.48550/arXiv.2112.09544

[12] Ashurst, C. et al. (2021). ‘AI Ethics Statements — Analysis and lessons learnt from NeurIPS Broader Impact Statements’. arXiv. Available at:
https://doi.org/10.48550/arXiv.2111.01705

[13] See: Ada Lovelace Institute. (2022). Looking before we leap: Case studies. Available at: https://www.adalovelaceinstitute.org/resource/research-ethics-case-studies/

[14] Raymond, N. (2019). ‘Safeguards for human studies can’t cope with big data’. Nature, 568(7752), pp. 277–277. Available at: https://doi.org/10.1038/d41586-019-01164-z

[15] The number of AI journal publications grew by 34.5% from 2019 to 2020, compared to a growth of 19.6% between 2018 and 2019. See: Stanford University. (2021). Artificial Intelligence Index 2021, chapter 1. Available at: https://aiindex.stanford.edu/wp-content/uploads/2021/03/2021-AI-Index-Report-_Chapter-1.pdf

[16] Chuvpilo, G. (2020). ‘AI Research Rankings 2019: Insights from NeurIPS and ICML, Leading AI Conferences’. Medium. Available at: https://medium.com/@chuvpilo/ai-research-rankings-2019-insights-from-neurips-and-icml-leading-ai-conferences-ee6953152c1a

[17] Minsky, C. (2020). ‘How AI helps historians solve ancient puzzles’. Financial Times. Available at: https://www.ft.com/content/2b72ed2c-907b-11ea-bc44-dbf6756c871a

[18] Zheng, S., Trott, A., Srinivasa, S. et al. (2020). ‘The AI Economist: Improving Equality and Productivity with AI-Driven Tax Policies’. Salesforce Research. Available at: https://blog.einstein.ai/the-ai-economist/

[19] Eraslan, G., Avsec, Ž., Gagneur, J. and Theis, F. J. (2019). ‘Deep learning: new computational modelling techniques for genomics’. Nature Reviews Genetics. Available at: https://doi.org/10.1038/s41576-019-0122-6

[20] DeepMind. (2020). ‘AlphaFold: a solution to a 50-year-old grand challenge in biology’. DeepMind Blog. Available at: https://deepmind.com/blog/article/alphafold-a-solution-to-a-50-year-old-grand-challenge-in-biology

[21] Boyarskaya, M., Olteanu, A. and Crawford, K. (2020). ‘Overcoming Failures of Imagination in AI Infused System Development and Deployment’. arXiv. Available at: https://doi.org/10.48550/arXiv.2011.13416

[22] Clifford, C. (2018). ‘Google CEO: A.I. is more important than fire or electricity’. CNBC. Available at: https://www.cnbc.com/2018/02/01/google-ceo-sundar-pichai-ai-is-more-important-than-fire-electricity.html

[23] Boyarskaya, M., Olteanu, A. and Crawford, K. (2020). ‘Overcoming Failures of Imagination in AI Infused System Development and Deployment’. arXiv. Available at: https://doi.org/10.48550/arXiv.2011.13416

[24] Metcalf, J. (2017). ‘“The study has been approved by the IRB”: Gayface AI, research hype and the pervasive data ethics…’ Medium. Available at: https://medium.com/pervade-team/the-study-has-been-approved-by-the-irb-gayface-ai-research-hype-and-the-pervasive-data-ethics-ed76171b882c

[25] Coalition for Critical Technology. (2020). ‘Abolish the #TechToPrisonPipeline’. Medium. Available at: https://medium.com/@CoalitionForCriticalTechnology/abolish-the-techtoprisonpipeline-9b5b14366b16.

[26] Ongweso Jr, E. (2020). ‘An AI Paper Published in a Major Journal Dabbles in Phrenology’. Vice. Available at: https://www.vice.com/en/article/g5pawq/an-ai-paper-published-in-a-major-journal-dabbles-in-phrenology

[27] Colaner, S. (2020). ‘AI Weekly: AI phrenology is racist nonsense, so of course it doesn’t work’. VentureBeat. Available at: https://venturebeat.com/2020/06/12/ai-weekly-ai-phrenology-is-racist-nonsense-so-of-course-it-doesnt-work/

[28] Hsu, J. (2019). ‘Microsoft’s AI Research Draws Controversy Over Possible Disinformation Use’.  IEEE Spectrum. Available at: https://spectrum.ieee.org/tech-talk/artificial-intelligence/machine-learning/microsofts-ai-research-draws-controversy-over-possible-disinformation-use

[29] Harlow, M., Murgia, M. and Shepherd, C. (2019). ‘Western AI researchers partnered with Chinese surveillance firms’. Financial Times. Available at: https://www.ft.com/content/41be9878-61d9-11e9-b285-3acd5d43599e

[30] This report does not focus on considerations relating to research integrity, though we acknowledge this is an important and related topic.

[31] For a deeper discussion on these issues, see: Ashurst, C. et al. (2022). ‘Disentangling the Components of Ethical Research in Machine Learning’. FAccT ’22: 2022 ACM Conference on Fairness, Accountability, and Transparency, pp. 2057–2068. Available at: https://doi.org/10.1145/3531146.3533781

[32] Dove, E. S., Townend, D., Meslin, E. M. et al. (2016). ‘Ethics review for international data-intensive research’. Science, 351(6280), pp. 1399–1400.

[33] Dove, E. S., Townend, D., Meslin, E. M. et al. (2016).

[34] UKRI. ‘Research integrity’. Available at: https://www.ukri.org/what-we-offer/supporting-healthy-research-and-innovation-culture/research-integrity/

[35] Engineering and Physical Sciences Research Council. ‘Responsible research and innovation’. UKRI. Available at: https://www.ukri.org/councils/epsrc/guidance-for-applicants/what-to-include-in-your-proposal/health-technologies-impact-and-translation-toolkit/research-integrity-in-healthcare-technologies/responsible-research-and-innovation/

[36] UKRI. ‘Research integrity’. Available at: https://www.ukri.org/what-we-offer/supporting-healthy-research-and-innovation-culture/research-integrity/

[37] Partnership on AI. (2021). Managing the Risks of AI Research. Available at: http://partnershiponai.org/wp-content/uploads/2021/08/PAI-Managing-the-Risks-of-AI-Resesarch-Responsible-Publication.pdf

[38] Korenman, S. G., Berk, R., Wenger, N. S. and Lew, V. (1998). ‘Evaluation of the research norms of scientists and administrators responsible for academic research integrity’. Jama, 279(1), pp. 41–47.

[39] Douglas, H. (2014). ‘The moral terrain of science’. Erkenntnis, 79(5), pp. 961–979.

[40] European Commission. (2018). Responsible Research and Innovation, Science and Technology. Available at: https://data.europa.eu/doi/10.2777/45726

[41] National Human Genome Research Institute. ‘Ethical, Legal and Social Implications Research Program’. Available at: https://www.genome.gov/Funded-Programs-Projects/ELSI-Research-Program-ethical-legal-social-implications

[42] Bazzano, L. A. et al. (2021). ‘A Modern History of Informed Consent and the Role of Key Information’. Ochsner Journal, 21(1), pp. 81–85. Available at: https://doi.org/10.31486/toj.19.0105

[43] Hedgecoe, A. (2017). ‘Scandals, Ethics, and Regulatory Change in Biomedical Research’. Science, Technology, & Human Values, 42(4), pp. 577–599.  Available at: https://journals.sagepub.com/doi/abs/10.1177/0162243916677834

[44] Israel, M. (2015). Research Ethics and Integrity for Social Scientists, second edition. SAGE Publishing. Available at: https://uk.sagepub.com/en-gb/eur/research-ethics-and-integrity-for-social-scientists/book236950

[45] The Nuremberg Code was in part based on pre-war medical research guidelines from the German Medical Association, which included elements of patient consent to a procedure. These guidelines were abandoned during the rise of the Nazi regime in favour of guidelines that contributed to the ‘healing of the nation’, as defendants at the Nuremberg trial put it. See: Ernst, E. and Weindling, P. J. (1998). ‘The Nuremberg Medical Trial: have we learned the lessons?’ Journal of Laboratory and Clinical Medicine, 131(2), pp. 130–135; and British Medical Journal. (1996). ‘Nuremberg’. British Medical Journal, 313(7070). Available at: https://www.bmj.com/content/313/7070

[46] Center for Disease Control and Prevention. (2021). The U.S. Public Health Service Syphilis Study at Tuskegee. Available at: https://www.cdc.gov/tuskegee/timeline.htm

[47] The National Commission for the Protection of Human Subjects of Biomedical and Behavioral Research. (1979). The Belmont Report.

[48] Council for International Organizations of Medical Sciences (CIOMS). (2016). International Ethical Guidelines for Health-related Research Involving Humans, Fourth Edition. Available at: https://cioms.ch/wp-content/uploads/2017/01/WEB-CIOMS-EthicalGuidelines.pdf

[49] A more extensive study of the history of research ethics is provided by: Garcia, K. et al. (2022). ‘Introducing An Incomplete History of Research Ethics’. Open Life Sciences. Available at: https://openlifesci.org/posts/2022/08/08/An-Incomplete-History-Of-Research-Ethics/

[50] Hoeyer, K. and Hogle, L. F. (2014). ‘Informed consent: The politics of intent and practice in medical research ethics’. Annual Review of Anthropology, 43, pp. 347–362.

 

Legal guardianship: The Helsinki Declaration specifies that underrepresented groups should have adequate access to research and to the results of research. However, vulnerable population groups are often excluded from research if they are not able to give informed consent. A legal guardian is usually appointed by a court and can give consent on the participants’ behalf, see: Brune, C., Stentzel, U., Hoffmann, W. and van den Berg, N. (2021). ‘Attitudes of legal guardians and legally supervised persons with and without previous research experience towards participation in research projects: A quantitative cross-sectional study’. PLoS ONE, 16(9).

 

Group or community consent refers to research that can generate risks and benefits as part of the wider implications beyond the individual research participant. This means that consent processes may need to be supplemented by community engagement activities, see: Molyneux, S. and Bull, S. (2013). ‘Consent and Community Engagement in Diverse Research Contexts: Reviewing and Developing Research and Practice: Participants in the Community Engagement and Consent Workshop, Kilifi, Kenya, March 2011’. Journal of Empirical Research on Human Research Ethics (JERHRE), 8(4), pp. 1–18. Available at: https://doi.org/10.1525/jer.2013.8.4.1

 

Blanket consent refers to a process by which individuals donate their samples without any restrictions. Broad (or ‘general’) consent refers to a process by which individuals donate their samples for a broad range of future studies, subject to specified restrictions, see: Wendler, D. (2013). ‘Broad versus blanket consent for research with human biological samples’. The Hastings Center report, 43(5), pp. 3–4. Available at: https://doi.org/10.1002/hast.200

[51] World Medical Association. (2008). WMA Declaration of Helsinki – ethical principles for medical research involving human subjects. Available at: https://www.wma.net/policies-post/wma-declaration-of-helsinki-ethical-principles-for-medical-research-involving-human-subjects/

[52] Ashcroft, R. ‘The Declaration of Helsinki’ in: Emanuel, E. J., Grady, C. C., Crouch, R. A., Lie, R. K., Miller, F. G. and Wendler, D. D. (eds.). (2008). The Oxford textbook of clinical research ethics. Oxford University Press.

[53] World Medical Association. (2008). WMA Declaration of Helsinki – ethical principles for medical research involving human subjects. Available at: https://www.wma.net/policies-post/wma-declaration-of-helsinki-ethical-principles-for-medical-research-involving-human-subjects/

[54] World Medical Association. (2008).

[55] Millum, J., Wendler, D. and Emanuel, E. J. (2013). ‘The 50th anniversary of the Declaration of Helsinki: progress but many remaining challenges’. Jama, 310(20), pp. 2143–2144.

[56] The Belmont Report was published by the National Commission for the Protection of Human Subjects of Biomedical and Behavioral Research, which was created for the U.S. Department of Health, Education, and Welfare (DHEW) based on authorisation by the U.S. Congress in 1974. The National Commission was tasked with identifying guiding research ethics principles in response to public outrage over the Tuskegee Syphilis Study and other ethically questionable projects that emerged during this time.

[57] The Nuremberg Code failed to deal with several related issues, including how international research trials should be run, questions of care for research subjects after a trial has ended, and how to assess the benefit of the research to a host community. See: Annas, G. and Grodin, M. (2008). The Nazi Doctors and the Nuremberg Code: Human Rights in Human Experimentation. Oxford University Press.

[58] In 1991, the regulations of the DHEW became a ‘common rule’ that covered 16 federal agencies.

[59] Office for Human Research Protections. (2009). Code of Federal Regulations, Part 46: Protection of Human Subjects. Available at: https://www.hhs.gov/ohrp/regulations-and-policy/regulations/45-cfr-46/index.html

[60] In 2000, the Central Office for Research Ethics Committees was formed, followed by the establishment of the National Research Ethics Service and later the Health Research Authority (HRA). See: NHS Health Research Authority. (2021). Research Ethics Committees – Standard Operating Procedures. Available at: https://www.hra.nhs.uk/about-us/committees-and-services/res-and-recs/research-ethics-committee-standard-operating-procedures/

[61] There is some guidance for non-health RECs in the UK – the Economic and Social Science Research Council released research ethics guidelines for any project funded by ESRC to undergo certain ethics review requirements if the project involves human subjects research. See: Economic and Social Research Council. (2015). ESRC Framework for Research Ethics. UKRI. Available at: https://www.ukri.org/councils/esrc/guidance-for-applicants/research-ethics-guidance/framework-for-research-ethics/

[62] Tinker, A. and Coomber, V. (2005). ‘University research ethics committees—A summary of research into their role, remit and conduct’. Research Ethics, 1(1), pp. 5–11.

[63] European Network of Research Ethics Committees. ‘Short description of the UK REC system’. Available at: http://www.eurecnet.org/information/uk.html

[64] University of Cambridge. ‘Ethical Review’. Available at: https://www.research-integrity.admin.cam.ac.uk/ethical-review

[65] University of Oxford. ‘Committee information: Structure, membership and operation of University research ethics committees’. Available at: https://researchsupport.admin.ox.ac.uk/governance/ethics/committees

[66] Tinker, A. and Coomber, V. (2005). ‘University Research Ethics Committees — A Summary of Research into Their Role, Remit and Conduct’. SAGE Journals. Available at: https://doi.org/10.1177/174701610500100103

[67] The Turing Way Community et al. Guide for Ethical Research – Introduction to Research Ethics. Available at: https://the-turing-way.netlify.app/ethical-research/ethics-intro.html

[68] For an example of a full list of risks and the different processes, see: University of Exeter. (2021). Research Ethics Policy and Framework: Appendix C – Risk and Proportionate Review checklist. Available at: https://www.exeter.ac.uk/media/universityofexeter/governanceandcompliance/researchethicsandgovernance/Appendix_C_Risk_and_Proportionate_Review_v1.1_07052021.pdf; and University of Exeter. (2021). Research Ethics Policy and Framework. Available at: https://www.exeter.ac.uk/media/universityofexeter/governanceandcompliance/researchethicsandgovernance/Revised_UoE_Research_Ethics_Framework_v1.1_07052021.pdf.

[69] NHS Health Research Authority. (2021). Governance arrangements for Research Ethics Committees. Available at: https://www.hra.nhs.uk/planning-and-improving-research/policies-standards-legislation/governance-arrangement-research-ethics-committees/; and Economic and Social Research Council. (2015). ESRC Framework for Research Ethics. UKRI. Available at: https://www.ukri.org/councils/esrc/guidance-for-applicants/research-ethics-guidance/framework-for-research-ethics/

[70] NHS Health Research Authority. (2021). Research Ethics Committee – Standard Operating Procedures. Available at: https://www.hra.nhs.uk/about-us/committees-and-services/res-and-recs/research-ethics-committee-standard-operating-procedures/

[71] NHS Health Research Authority. (2021).

[72] Economic and Social Research Council. (2015). ESRC Framework for Research Ethics. UKRI. Available at: https://www.ukri.org/councils/esrc/guidance-for-applicants/research-ethics-guidance/framework-for-research-ethics/

[73] See: saildatabank.com

[74] Moss, E. and Metcalf, J. (2020). Ethics Owners. A New Model of Organizational Responsibility in Data-Driven Technology Companies. Data & Society. Available at: https://datasociety.net/library/ethics-owners/

[75] We note this article reflects Facebook’s process in 2016, and that this process may have undergone significant changes since that period. See: Jackman, M. and Kanerva, L. (2016). ‘Evolving the IRB: building robust review for industry research’. Washington and Lee Law Review Online, 72(3), p. 442.

[76] See: Google AI. ‘Artificial Intelligence at Google: Our Principles’. Available at: https://ai.google/principles/.

[77] Future of Life Institute. (2018). Lethal autonomous weapons pledge. Available at: https://futureoflife.org/2018/06/05/lethal-autonomous-weapons-pledge/

[78] Moss, E. and Metcalf, J. (2020). Ethics Owners. A New Model of Organizational Responsibility in Data-Driven Technology Companies. Data & Society. Available at: https://datasociety.net/library/ethics-owners/

[79] Samuel, G., Derrick, G. E., and Van Leeuwen, T. (2019). ‘The ethics ecosystem: Personal ethics, network governance and regulating actors governing the use of social media research data.’ Minerva, 57(3), pp. 317–343. Available at: https://link.springer.com/article/10.1007/s11024-019-09368-3

[80] The Royal Society. ‘Research Culture’. Available at: https://royalsociety.org/topics-policy/projects/research-culture/

 

[81] Canadian Institute for Advanced Research, Partnership on AI and Ada Lovelace Institute. (2022). A culture of ethical AI: report. Available at: https://www.adalovelaceinstitute.org/event/culture-ethical-ai-cifar-pai/

[82] Prunkl, C. E. et al. (2021). ‘Institutionalizing ethics in AI through broader impact requirements’. Nature Machine Intelligence, 3(2), pp. 104–110. Available at: https://www.nature.com/articles/s42256-021-00298-y

[83] Prunkl et al. note that impact statements risk being uninformative, biased, misleading or overly speculative, and therefore of low quality. The statements could trivialise ethics and governance and the complexity involved in assessing ethical and societal implications. Researchers could develop a negative attitude towards submitting an impact statement, finding it a burden, confusing or irrelevant. The statements may also create a false sense of security where positive impacts are overstated or negative impacts understated, and may polarise the research community along political or institutional lines. See: Prunkl, C. E. et al. (2021).

[84] Some authors felt that requiring an impact statement is important, but there was uncertainty over who should complete these statements and how. Other authors did not feel qualified to address the broader impact of their work. See: Abuhamad, G. and Rheault, C. (2020). ‘Like a Researcher Stating Broader Impact For the Very First Time’. arXiv. Available at: https://arxiv.org/abs/2011.13032

[85] Committee on Publication Ethics. (2018). Principles of Transparency and Best Practices in Scholarly Publishing. Available at: https://publicationethics.org/files/Principles_of_Transparency_and_Best_Practice_in_Scholarly_Publishingv3_0.pdf

[86] Partnership on AI. (2021). Managing the Risks of AI Research: Six Recommendations for Responsible Publication. Available at: https://partnershiponai.org/workstream/publication-norms-for-responsible-ai/

[87] Partnership on AI. (2021).

[88] Gardner, A., Smith, A. L., Steventon, A. et al. (2021). ‘Ethical funding for trustworthy AI: proposals to address the responsibilities of funders to ensure that projects adhere to trustworthy AI practice’. AI and Ethics. pp.1–15. Available at: https://link.springer.com/article/10.1007/s43681-021-00069-w

[89] Vayena, E., Brownsword, R., Edwards, S. J. et al. (2016). ‘Research led by participants: a new social contract for a new kind of research’. Journal of Medical Ethics, 42(4), pp. 216–219.

[90] There are three types of disclosure risk that may allow reidentification of an individual despite masking or de-identification of data: identity disclosure; attribute disclosure, e.g. when a person is identified as belonging to a particular group; and inferential disclosure, e.g. when information about a person can be inferred from released data. See: Xafis, V., Schaefer, G. O., Labude, M. K. et al. (2019). ‘An ethics framework for big data in health and research’. Asian Bioethics Review, 11(3). Available at: https://doi.org/10.1007/s41649-019-00099-x

[91] Metcalf, J. and Crawford, K. (2016). ‘Where are human subjects in big data research? The emerging ethics divide’. Big Data & Society, 3(1). Available at: https://journals.sagepub.com/doi/full/10.1177/2053951716650211

[92] Metcalf, J. and Crawford, K. (2016).

[93] Samuel, G., Chubb, J. and Derrick, G. (2021). ‘Boundaries Between Research Ethics and Ethical Research Use in Artificial Intelligence Health Research’. Journal of Empirical Research on Human Research Ethics. Available at: https://journals.sagepub.com/doi/full/10.1177/15562646211002744

[94] Abbott, L. and Grady, C. (2011). ‘A systematic review of the empirical literature evaluating IRBs: What we know and what we still need to learn’. Journal of Empirical Research on Human Research Ethics, 6(1). Available at: https://doi.org/10.1525/jer.2011.6.1.3

[95] Zywicki, T. J. (2007). ‘Institutional review boards as academic bureaucracies: An economic and experiential analysis’. Northwestern University Law Review, 101(2), p.861. Available at: https://heinonline.org/HOL/LandingPage?handle=hein.journals/illlr101&div=36&id=&page=

[96] Abbott, L. and Grady, C. (2011). ‘A systematic review of the empirical literature evaluating IRBs: What we know and what we still need to learn’. Journal of Empirical Research on Human Research Ethics, 6(1). Available at: https://doi.org/10.1525/jer.2011.6.1.3

[97] Abbott, L. and Grady, C. (2011).

[98] Dove, E. S. and Garattini, C. (2018). ‘Expert perspectives on ethics review of international data-intensive research: Working towards mutual recognition’. Research Ethics, 14(1), pp. 1–25. Available at: https://journals.sagepub.com/doi/full/10.1177/1747016117711972

[99] Hibbin, R. A., Samuel, G. and Derrick, G. E. (2018). ‘From “a fair game” to “a form of covert research”: Research ethics committee members’ differing notions of consent and potential risk to participants within social media research’. Journal of Empirical Research on Human Research Ethics, 13(2). Available at: https://journals.sagepub.com/doi/full/10.1177/1556264617751510

[100] Guillemin, M., Gillam, L., Rosenthal, D. and Bolitho, A. (2012). ‘Human research ethics committees: examining their roles and practices’. Journal of Empirical Research on Human Research Ethics, 7(3). Available at: https://journals.sagepub.com/doi/abs/10.1525/jer.2012.7.3.38

[101] Ferretti, A., Ienca, M., Sheehan, M. et al. (2021). ‘Ethics review of big data research: What should stay and what should be reformed?’. BMC Medical Ethics, 22(1), pp. 1–13. Available at: https://bmcmedethics.biomedcentral.com/articles/10.1186/s12910-021-00616-4

[102] Guillemin, M., Gillam, L., Rosenthal, D. and Bolitho, A. (2012). ‘Human research ethics committees: examining their roles and practices’. Journal of Empirical Research on Human Research Ethics, 7(3). Available at: https://journals.sagepub.com/doi/abs/10.1525/jer.2012.7.3.38

[103] Yuan, H., Vanea, C., Lucivero, F. and Hallowell, N. (2020). ‘Training Ethically Responsible AI Researchers: a Case Study’. arXiv. Available at: https://arxiv.org/abs/2011.11393

[104] Samuel, G., Chubb, J. and Derrick, G. (2021). ‘Boundaries Between Research Ethics and Ethical Research Use in Artificial Intelligence Health Research’. Journal of Empirical Research on Human Research Ethics. Available at: https://journals.sagepub.com/doi/full/10.1177/15562646211002744

[105] Rawbone, R. (2010). ‘Inequality amongst RECs’. Research Ethics Review, 6(1), pp. 1–2. Available at: https://journals.sagepub.com/doi/pdf/10.1177/174701611000600101

[106] Hine, C. (2021). ‘Evaluating the prospects for university-based ethical governance in artificial intelligence and data-driven innovation’. Research Ethics. Available at: https://journals.sagepub.com/doi/full/10.1177/17470161211022790

[107] Page, S. A. and Nyeboer, J. (2017). ‘Improving the process of research ethics review’. Research integrity and peer review, 2(1), pp. 1–7. Available at: https://researchintegrityjournal.biomedcentral.com/articles/10.1186/s41073-017-0038-7

[108] Chadwick, G. L. and Dunn, C. M. (2000). ‘Institutional review boards: changing with the times?’. Journal of public health management and practice, 6(6), pp. 19–27. Available at: https://europepmc.org/article/med/18019957

[109] Association of Internet Researchers. (2020). Internet Research: Ethical Guidelines 3.0. Available at: https://aoir.org/reports/ethics3.pdf

[110] Emanuel, E. J., Grady, C. C., Crouch, R. A., Lie, R. K., Miller, F. G. and Wendler, D. D. (eds.). (2008). The Oxford textbook of clinical research ethics. Oxford University Press.

[111] Oakes, J. M. (2002). ‘Risks and wrongs in social science research: An evaluator’s guide to the IRB’. Evaluation Review, 26(5), pp. 443–479. Available at: https://journals.sagepub.com/doi/10.1177/019384102236520?url_ver=Z39.88-2003&rfr_id=ori:rid:crossref.org&rfr_dat=cr_pub%20%200pubmed; and Dyer, S. and Demeritt, D. (2009). ‘Un-ethical review? Why it is wrong to apply the medical model of research governance to human geography’. Progress in Human Geography, 33(1), pp. 46–64. Available at: https://journals.sagepub.com/doi/10.1177/0309132508090475

[112] Cannella, G. S. and Lincoln, Y. S. (2011). ‘Ethics, research regulations, and critical social science’. The Sage handbook of qualitative research, 4, pp. 81–90; and Israel, M. (2014). Research ethics and integrity for social scientists: Beyond regulatory compliance. SAGE Publishing.

[113] The ICO defines personal data as ‘information relating to natural persons who can be identified or who are identifiable, directly from the information in question; or who can be indirectly identified from that information in combination with other information’. See: Information Commissioner’s Office. Guide to the UK General Data Protection Regulation (UK GDPR) – What is Personal Data? Available at: https://ico.org.uk/for-organisations/guide-to-data-protection/guide-to-the-general-data-protection-regulation-gdpr/key-definitions/what-is-personal-data/

[114] Friesen, P., Douglas-Jones, R., Marks, M. et al. (2021). ‘Governing AI-Driven Health Research: Are IRBs Up to the Task?’ Ethics & Human Research, 43(2), pp. 35–42. Available at: https://onlinelibrary.wiley.com/doi/abs/10.1002/eahr.500085

[115] Karras, T., Laine, S. and Aila, T. (2019). ‘A style-based generator architecture for generative adversarial networks’. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 4401–4410.

[116] Ferretti, A., Ienca, M., Sheehan, M. et al. (2021). ‘Ethics review of big data research: What should stay and what should be reformed?’. BMC Medical Ethics, 22(1), pp. 1–13. Available at: https://bmcmedethics.biomedcentral.com/articles/10.1186/s12910-021-00616-4

[117] Ferretti, A., Ienca, M., Sheehan, M. et al. (2021).

[118] Radin, J. (2017). ‘“Digital Natives”: How Medical and Indigenous Histories Matter for Big Data’. Osiris, 32, pp. 43–64. Available at: https://doi.org/10.1086/693853

[119] Kramer, A. D., Guillory, J. E. and Hancock, J. T. (2014). ‘Experimental evidence of massive-scale emotional contagion through social networks’. Proceedings of the National Academy of Sciences, 111(24), pp. 8788–8790. Available at: https://www.pnas.org/doi/abs/10.1073/pnas.1320040111; and Selinger, E. and Hartzog, W. (2016). ‘Facebook’s emotional contagion study and the ethical problem of co-opted identity in mediated environments where users lack control’. Research Ethics, 12(1), pp. 35–43.

[120] Marks, M. (2020). ‘Emergent medical data: Health Information inferred by artificial intelligence’. UC Irvine Law Review, 995. Available at: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3554118

[121] Marks, M. (2020).

[122] Ferretti, A., Ienca, M., Sheehan, M. et al. (2021). ‘Ethics review of big data research: What should stay and what should be reformed?’. BMC Medical Ethics, 22(1), pp. 1–13. Available at: https://bmcmedethics.biomedcentral.com/articles/10.1186/s12910-021-00616-4

[123] Samuel, G., Ahmed, W., Kara, H. et al. (2018). ‘Is It Time to Re-Evaluate the Ethics Governance of Social Media Research?’. Journal of Empirical Research on Human Research Ethics, 13(4), pp. 452–454. Available at: https://www.jstor.org/stable/26973881

[124] Taylor, J. and Pagliari, C. (2018). ‘Mining Social Media Data: How are Research Sponsors and Researchers Addressing the Ethical Challenges?’. Research Ethics, 14(2). Available at: https://journals.sagepub.com/doi/10.1177/1747016117738559

[125] Iphofen, R. and Tolich, M. (2018). ‘Foundational issues in qualitative research ethics’. The Sage handbook of qualitative research ethics, pp. 1–18. Available at: https://methods.sagepub.com/book/the-sage-handbook-of-qualitative-research-ethics-srm/i211.xml

[126] Schrag, Z. M. (2011). ‘The case against ethics review in the social sciences’. Research Ethics, 7(4), pp. 120–131.

[127] Goodyear, M. et al. (2007). ‘The Declaration of Helsinki. Mosaic tablet, dynamic document or dinosaur?’. British Medical Journal, 335; and Ashcroft, R. E. (2008). ‘The declaration of Helsinki’. The Oxford textbook of clinical research ethics, pp. 141–148.

[128] Emanuel, E.J., Wendler, D. and Grady, C. (2008) ‘An Ethical Framework for Biomedical Research’. The Oxford Textbook of Clinical Research Ethics, pp. 123–135.

[129] Tsoka-Gwegweni, J. M. and Wassenaar, D.R. (2014). ‘Using the Emanuel et al. Framework to Assess Ethical Issues Raised by a Biomedical Research Ethics Committee in South Africa’. Journal of Empirical Research on Human Research Ethics, 9(5), pp. 36–45. Available at: https://journals.sagepub.com/doi/10.1177/1556264614553172?url_ver=Z39.88-2003&rfr_id=ori:rid:crossref.org&rfr_dat=cr_pub%20%200pubmed

[130] Hagendorff, T. (2020). ‘The ethics of AI ethics: An evaluation of guidelines’. Minds and Machines, 30(1), pp. 99–120. Available at: https://link.springer.com/article/10.1007/s11023-020-09517-8;

[131] Fjeld, J., Achten, N., Hilligoss, H., Nagy, A. and Srikumar, M. (2020). ‘Principled artificial intelligence: Mapping consensus in ethical and rights-based approaches to principles for AI’. Berkman Klein Center Research Publication No. 2020–1. Available at: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3518482

[132] Gardner, A., Smith, A. L., Steventon, A. et al. (2021). ‘Ethical funding for trustworthy AI: proposals to address the responsibilities of funders to ensure that projects adhere to trustworthy AI practice’. AI and Ethics, pp. 1–15. Available at: https://link.springer.com/article/10.1007/s43681-021-00069-w

[133] Floridi, L. and Cowls, J. (2019). ‘A unified framework of five principles for AI in society’. Social Science Research Network. Available at: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3831321

[134] These include standards initiatives like the IEEE’s P7000 series on the ethical design of AI systems, which includes P7001 – Standard for Transparency of Autonomous Systems (2021), P7003 – Algorithmic Bias Considerations (2018) and P7010 – Wellbeing Metrics Standard for Ethical Artificial Intelligence and Autonomous Systems (2020). ISO/IEC JTC 1/SC 42 – Artificial Intelligence develops a series of related standards on data management, the trustworthiness of AI systems and transparency.

[135] Jobin, A., Ienca, M. and Vayena, E. (2019). ‘The global landscape of AI ethics guidelines’. Nature Machine Intelligence, 1, pp. 389–399. Available at: https://doi.org/10.1038/s42256-019-0088-2

[136] Floridi, L. and Cowls, J. (2019). ‘A unified framework of five principles for AI in society’. Social Science Research Network. Available at: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3831321

[137] Yeung, K., Howes, A. and Pogrebna, G. (2019). ‘AI governance by human rights-centred design, deliberation and oversight: An end to ethics washing’. The Oxford Handbook of AI Ethics. Oxford University Press.

[138] Mittelstadt, B. (2019). ‘Principles alone cannot guarantee ethical AI’. Nature Machine Intelligence, 1(11), pp. 501–507. Available at: https://www.nature.com/articles/s42256-019-0114-4

[139] Mittelstadt, B. (2019).

[140] Sambasivan, N., Kapania, S., Highfill, H. et al. (2021). ‘“Everyone wants to do the model work, not the data work”: Data Cascades in High-Stakes AI’. Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems, pp. 1–15. Available at: https://research.google/pubs/pub49953/

[141] IEEE Standards Association. (2019). Ethically Aligned Design, First Edition. Available at: https://ethicsinaction.ieee.org/#ead1e 

[142] Samuel, G., Diedericks, H. and Derrick, G. (2021). ‘Population health AI researchers’ perceptions of the public portrayal of AI: A pilot study’. Public Understanding of Science, 30(2), pp. 196–211. Available at: https://journals.sagepub.com/doi/full/10.1177/0963662520965490

[143] Association of Internet Researchers. (2020). Internet Research: Ethical Guidelines 3.0. Available at: https://aoir.org/reports/ethics3.pdf

[144] Samuel, G., Derrick, G. E. and Van Leeuwen, T. (2019). ‘The ethics ecosystem: Personal ethics, network governance and regulating actors governing the use of social media research data’. Minerva, 57(3), pp. 317–343. Available at: https://link.springer.com/article/10.1007/s11024-019-09368-3

[145] Vadeboncoeur, C., Townsend, N., Foster, C. and Sheehan, M. (2016). ‘Variation in university research ethics review: Reflections following an inter-university study in England’. Research Ethics, 12(4), pp. 217–233. Available at: https://journals.sagepub.com/doi/full/10.1177/1747016116652650; and Abbott, L. and Grady, C. (2011). ‘A systematic review of the empirical literature evaluating IRBs: What we know and what we still need to learn’. Journal of Empirical Research on Human Research Ethics, 6(1), pp. 3–19. Available at: https://journals.sagepub.com/doi/abs/10.1525/jer.2011.6.1.3

[146] Silberman, G. and Kahn, K. L. (2011). ‘Burdens on research imposed by institutional review boards: the state of the evidence and its implications for regulatory reform’. The Milbank quarterly, 89(4), pp. 599–627. Available at: https://doi.org/10.1111/j.1468-0009.2011.00644

[147] Dove, E. S. and Garattini, C. (2018). ‘Expert perspectives on ethics review of international data-intensive research: Working towards mutual recognition’. Research Ethics, 14(1), pp. 1–25. Available at: https://journals.sagepub.com/doi/10.1177/1747016117711972

[148] Coleman, C. H., Ardiot, C., Blesson, S. et al. (2015). ‘Improving the Quality of Host Country Ethical Oversight of International Research: The Use of a Collaborative “Pre-Review” Mechanism for a Study of Fexinidazole for Human African Trypanosomiasis’. Developing World Bioethics, 15(3), pp. 241–247. Available at: https://onlinelibrary.wiley.com/doi/full/10.1111/dewb.12068

[149] Dove, E. S. and Garattini, C. (2018). ‘Expert perspectives on ethics review of international data-intensive research: Working towards mutual recognition’. Research Ethics, 14(1), pp. 1–25. Available at: https://journals.sagepub.com/doi/10.1177/1747016117711972

[150] Government of Canada. (2018). Tri-Council Policy Statement Ethical Conduct for Research Involving Humans, Chapter 9: Research Involving the First Nations, Inuit and Métis Peoples of Canada. Available at: https://ethics.gc.ca/eng/policy-politique_tcps2-eptc2_2018.html

[151] Dove, E. S. and Garattini, C. (2018). ‘Expert perspectives on ethics review of international data-intensive research: Working towards mutual recognition’. Research Ethics, 14(1), pp. 1–25. Available at: https://journals.sagepub.com/doi/10.1177/1747016117711972

[152] Source: Zhang, D. et al. (2022) ‘The AI Index 2022 Annual Report’. arXiv. Available at:
https://doi.org/10.48550/arXiv.2205.03468

[153] Ballantyne, A. and Stewart, C. (2019). ‘Big data and public-private partnerships in healthcare and research.’ Asian Bioethics Review, 11(3), pp. 315–326. Available at: https://link.springer.com/article/10.1007/s41649-019-00100-7

[154] Ballantyne, A. and Stewart, C. (2019).

[155] Ballantyne, A. and Stewart, C. (2019). ‘Big data and public-private partnerships in healthcare and research.’ Asian Bioethics Review, 11(3), pp. 315–326. Available at: https://link.springer.com/article/10.1007/s41649-019-00100-7

[156] Mittelstadt, B. and Floridi, L. (2016). ‘The ethics of big data: Current and foreseeable issues in biomedical contexts’. Science and Engineering Ethics, 22(2), pp. 303–341. Available at: https://link.springer.com/article/10.1007/s11948-015-9652-2

[157] Mittelstadt, B. (2017). ‘Ethics of the health-related internet of things: a narrative review’. Ethics and Information Technology, 19, pp. 157–175. Available at: https://doi.org/10.1007/s10676-017-9426-4

[158] Machirori, M. and Patel. R. (2021). ‘Turning distrust in data sharing into “engage, deliberate, decide”’. Ada Lovelace Institute. Available at: https://www.adalovelaceinstitute.org/blog/distrust-data-sharing-engage-deliberate-decide/

[159] Centre for Data Ethics and Innovation. (2020). Addressing trust in public sector data use. UK Government. Available at: https://www.gov.uk/government/publications/cdei-publishes-its-first-report-on-public-sector-data-sharing/addressing-trust-in-public-sector-data-use

[160] Ada Lovelace Institute. (2021). Participatory data stewardship: A framework for involving people in the use of data. Available at: https://www.adalovelaceinstitute.org/report/participatory-data-stewardship/

[161] Suresh, H. and Guttag, J. (2021). ‘Understanding Potential Sources of Harm throughout the Machine Learning Life Cycle’. MIT Schwarzman College of Computing. Available at: https://mit-serc.pubpub.org/pub/potential-sources-of-harm-throughout-the-machine-learning-life-cycle/release/1

[162] Buolamwini, J. and Gebru, T. (2018). ‘Gender shades: Intersectional Accuracy Disparities in Commercial Gender Classification.’ Proceedings of the 1st Conference on Fairness, Accountability and Transparency. Conference on Fairness, Accountability and Transparency, PMLR, pp. 77–91. Available at: https://proceedings.mlr.press/v81/buolamwini18a.html

[163] Asaro, P.M. (2019). AI Ethics in Predictive Policing: From Models of Threat to an Ethics of Care. Available at: https://peterasaro.org/writing/AsaroPredicitvePolicingAIEthicsofCare.pdf

[164] O’Neil, C. (2016). Weapons of math destruction: How big data increases inequality and threatens democracy. Crown.

[165] Angwin, J., Larson, J., Mattu, S. and Kirchner, L. (2016). ‘Machine Bias’. ProPublica. Available at: https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing

[166] Keyes, O. (2018). ‘The misgendering machines: Trans/HCI implications of automatic gender recognition’. Proceedings of the ACM on human-computer interaction, 2(CSCW), pp. 1–22.

[167] Hamidi, F., Scheuerman, M. K. and Branham, S. M. (2018). ‘Gender recognition or gender reductionism? The social implications of embedded gender recognition systems’. CHI ’18. Proceedings of the 2018 CHI conference on human factors in computing systems, pp. 1–13. Available at: https://dl.acm.org/doi/abs/10.1145/3173574.3173582

[168] Scheuerman, M. K., Wade, K., Lustig, C. and Brubaker, J. R. (2020). ‘How We’ve Taught Algorithms to See Identity: Constructing Race and Gender in Image Databases for Facial Analysis’. Proceedings of the ACM on Human-Computer Interaction, 4(CSCW1), pp. 1–35. Available at: https://dl.acm.org/doi/abs/10.1145/3392866

[169] Mehrabi, N., Morstatter, F., Saxena, N. et al. (2021). ‘A survey on bias and fairness in machine learning’. ACM Computing Surveys (CSUR), 54(6), pp. 1–35.

[170] Crawford, K. (2021). The Atlas of AI. Yale University Press.

[171] Source: Leslie, D. et al. (2021). ‘Does “AI” stand for augmenting inequality in the era of COVID-19 healthcare?’. BMJ, 372. Available at: https://www.bmj.com/content/372/bmj.n304

[172] Irani, L. C. and Silberman, M. S. (2013). ‘Amazon Mechanical Turk: Gold Mine or Coal Mine?’ CHI ’13: Proceedings of the SIGCHI conference on human factors in computing systems, pp. 611–620. Available at: https://dl.acm.org/doi/abs/10.1145/2470654.2470742

[173] Massachusetts Institute of Technology – Committee on the Use of Humans as Experimental Subjects. COUHES Policy for Using Amazon’s Mechanical Turk. Available at: https://couhes.mit.edu/guidelines/couhes-policy-using-amazons-mechanical-turk

[174] Jindal, S. (2021). ‘Responsible Sourcing of Data Enrichment Services’. Partnership on AI. Available at: https://partnershiponai.org/responsible-sourcing-considerations/; and Northwestern University. Guidelines for Academic Requesters. Available at: https://irb.northwestern.edu/docs/guidelinesforacademicrequesters-1.pdf

[175] Friesen, P., Douglas-Jones, R., Marks, M. et al. (2021). ‘Governing AI-Driven Health Research: Are IRBs Up to the Task?’ Ethics & Human Research, 43(2), pp. 35–42. Available at: https://onlinelibrary.wiley.com/doi/abs/10.1002/eahr.500085

[176] Wang, Y. and Kosinski, M. (2018). ‘Deep neural networks are more accurate than humans at detecting sexual orientation from facial images’. Journal of Personality and Social Psychology, 114(2), p. 246. Available at: https://psycnet.apa.org/doiLanding?doi=10.1037%2Fpspa0000098

[177] Wang, C., Zhang, Q., Duan, X. and Gan, J. (2018). ‘Multi-ethnical Chinese facial characterization and analysis’. Multimedia Tools and Applications, 77(23), pp. 30311–30329.

[178] Strubell, E., Ganesh, A. and McCallum, A. (2019). ‘Energy and policy considerations for deep learning in NLP’. arXiv. Available at: https://arxiv.org/abs/1906.02243

[179] Bender, E.M., Gebru, T., McMillan-Major, A. and Shmitchell, S. (2021). ‘On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?’ Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency (FAccT ‘21), pp. 610–623. Available at: https://doi.org/10.1145/3442188.3445922

[180] Denton, E., Hanna, A., Amironesei, R. et al. (2020). ‘Bringing the people back in: Contesting benchmark machine learning datasets’. arXiv. Available at: https://doi.org/10.48550/arXiv.2007.07399

[181] Birhane, A. and Prabhu, V. U. (2021). ‘Large image datasets: A pyrrhic win for computer vision?’. Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), pp. 1537–1547. Available at: https://doi.org/10.48550/arXiv.2006.16923

[182] Jensen, B. (2021). ‘A New Approach to Mitigating AI’s Negative Impact’. Institute for Human-Centered Artificial Intelligence. Available at: https://hai.stanford.edu/news/new-approach-mitigating-ais-negative-impact

[183] Green, B. (2019). ‘“Good” isn’t good enough’. Proceedings of the AI for Social Good workshop at NeurIPS. Available at: http://ai.ethicsworkshop.org/Library/LibContentAcademic/GoodNotGoodEnough.pdf

[184] For example, the UK National Data Guardian published the results of a public consultation on how health and care data should be used to benefit the public, which may prove a model for the AI and data science research communities to follow. See: National Data Guardian. (2021). Putting Good Into Practice. A public dialogue on making public benefit assessments when using health and care data. UK Government. Available at: https://www.gov.uk/government/publications/putting-good-into-practice-a-public-dialogue-on-making-public-benefit-assessments-when-using-health-and-care-data

[185] Kerner, H. (2020). ‘Too many AI researchers think real-world problems are not relevant’. MIT Technology Review. Available at: https://www.technologyreview.com/2020/08/18/1007196/ai-research-machine-learning-applications-problems-opinion/

[186] Moss, E. and Metcalf, J. (2020). Ethics Owners. A New Model of Organizational Responsibility in Data-Driven Technology Companies. Data & Society. Available at: https://datasociety.net/library/ethics-owners/

[187] Moss, E. and Metcalf, J. (2020).

[188] Hedgecoe, A. (2015). ‘Reputational Risk, Academic Freedom and Research Ethics Review’. British Sociological Association, 50(3), pp.486–501. Available at: https://journals.sagepub.com/doi/full/10.1177/0038038515590756

[189] Dave, P. and Dastin, J. (2020) ‘Google told its scientists to “strike a positive tone” in AI research – documents’. Reuters. Available at: https://www.reuters.com/article/us-alphabet-google-research-focus-idUSKBN28X1CB

[190] Simonite, T. (2021). ‘What Really Happened When Google Ousted Timnit Gebru’. Wired. Available at: https://www.wired.com/story/google-timnit-gebru-ai-what-really-happened/

[191] Ferretti, A., Ienca, M., Sheehan, M. et al. (2021). ‘Ethics review of big data research: What should stay and what should be reformed?’. BMC Medical Ethics, 22(1), pp. 1–13. Available at: https://bmcmedethics.biomedcentral.com/articles/10.1186/s12910-021-00616-4

[192] Smith, J. J., Amershi, S., Barocas, S. et al. (2022). ‘REAL ML: Recognizing, Exploring, and Articulating Limitations of Machine Learning Research’. 2022 ACM Conference on Fairness, Accountability, and Transparency (FaccT ’22). Available at: https://facctconference.org/static/pdfs_2022/facct22-47.pdf

[193] Ada Lovelace Institute. (2021). Algorithmic impact assessment: a case study in healthcare. Available at: https://www.adalovelaceinstitute.org/project/algorithmic-impact-assessment-healthcare/

[194] Zaken, M. van A. (2022). Impact Assessment Fundamental Rights and Algorithms. The Ministry of the Interior and Kingdom Relations. Available at: https://www.government.nl/documents/reports/2022/03/31/impact-assessment-fundamental-rights-and-algorithms; Government of Canada. (2021). Algorithmic Impact Assessment Tool. Available at: https://www.canada.ca/en/government/system/digital-government/digital-government-innovations/responsible-use-ai/algorithmic-impact-assessment.html

[195] Jensen, B. (2021). ‘A New Approach To Mitigating AI’s Negative Impact’. Institute for Human-Centered Artificial Intelligence. Available at: https://hai.stanford.edu/news/new-approach-mitigating-ais-negative-impact

[196] Bernstein, M. S., Levi, M., Magnus, D. et al. (2021). ‘ESR: Ethics and Society Review of Artificial Intelligence Research’. arXiv. Available at: https://arxiv.org/abs/2106.11521

[197] Center for Advanced Study in the Behavioral Sciences at Stanford University. ‘Ethics & Society Review – Stanford University’. Available at: https://casbs.stanford.edu/ethics-society-review-stanford-university

[198] Sendak, M., Elish, M.C., Gao, M. et al. (2020). ‘“The Human Body Is a Black Box”: Supporting Clinical Decision-Making with Deep Learning.’ FAT* ‘20: Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency, pp. 99–109. Available at: https://doi.org/10.1145/3351095.3372827

[199] Samuel, G. and Derrick, D. (2020). ‘Defining ethical standards for the application of digital tools to population health research’. Bulletin of the World Health Organization Supplement, 98(4), pp. 239–244. Available at: https://pubmed.ncbi.nlm.nih.gov/32284646/

[200] Kawas, S., Yuan, Y., DeWitt, A. et al (2020). ‘Another decade of IDC research: Examining and reflecting on values and ethics’. IDC ’20: Proceedings of the Interaction Design and Children Conference, pp. 205–215. Available at: https://dl.acm.org/doi/abs/10.1145/3392063.3394436

[201] Burr, C. and Leslie, D. (2021). ‘Ethical Assurance: A Practical Approach to the Responsible Design, Development, and Deployment of Data-Driven Technologies’. Social Science Research Network. Available at: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3937983

[202] Sandler, R. and Basl, J. (2019). Building Data and AI Ethics Committees, p. 19. Accenture. Available at: https://www.accenture.com/_acnmedia/pdf-107/accenture-ai-and-data-ethics-committee-report-11.pdf

 

[203] UK Statistics Authority. Ethics Self-Assessment Tool. Available at: https://uksa.statisticsauthority.gov.uk/the-authority-board/committees/national-statisticians-advisory-committees-and-panels/national-statisticians-data-ethics-advisory-committee/ethics-self-assessment-tool/

[204] Ferretti, A., Ienca, M., Sheehan, M. et al. (2021). ‘Ethics review of big data research: What should stay and what should be reformed?’. BMC Medical Ethics, 22(1), pp. 1–13. Available at: https://bmcmedethics.biomedcentral.com/articles/10.1186/s12910-021-00616-4

[205] Ferretti, A., Ienca, M., Sheehan, M. et al. (2021).

[206] The concept of ‘ethical assurance’ is a process-based form of project governance that supports inclusive and participatory ethical deliberation while also remaining grounded in social and technical realities. See: Burr, C. and Leslie, D. (2021). ‘Ethical Assurance: A Practical Approach to the Responsible Design, Development, and Deployment of Data-Driven Technologies’. Social Science Research Network. Available at: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3937983

[207] Centre for Data Ethics and Innovation (2022). The roadmap to an effective AI assurance ecosystem. UK Government. Available at: https://www.gov.uk/government/publications/the-roadmap-to-an-effective-ai-assurance-ecosystem

[208] d’Aquin, M., Troullinou, P., O’Connor, N. E. et al. (2018). ‘Towards an “Ethics by Design” Methodology for AI research projects’. AIES ’18: Proceedings of the 2018 AAAI/ACM Conference on AI, Ethics, and Society, pp. 54–59. Available at: https://dl.acm.org/doi/abs/10.1145/3278721.3278765

[209] d’Aquin, M., Troullinou, P., O’Connor, N. E. et al. (2018).

[210] Dove, E. (2020). Regulatory Stewardship of Health Research: Navigating Participant Protection and Research Promotion. Edward Elgar.

[211] Ferretti, A., Ienca, M., Sheehan, M. et al. (2021). ‘Ethics review of big data research: What should stay and what should be reformed?’. BMC Medical Ethics, 22(1), pp. 1–13. Available at: https://bmcmedethics.biomedcentral.com/articles/10.1186/s12910-021-00616-4

[212] d’Aquin, M., Troullinou, P., O’Connor, N. E. et al. (2018). ‘Towards an “Ethics by Design” Methodology for AI research projects’. AIES ’18: Proceedings of the 2018 AAAI/ACM Conference on AI, Ethics, and Society, pp. 54–59. Available at: https ://dl.acm.org/doi/abs/10.1145/3278721.3278765

[213] Ferretti, A., Ienca, M., Sheehan, M. et al. (2021). ‘Ethics review of big data research: What should stay and what should be reformed?’. BMC Medical Ethics, 22(1), pp. 1–13. Available at: https://bmcmedethics.biomedcentral.com/articles/10.1186/s12910-021-00616-4

[214] Ferretti, A., Ienca, M., Sheehan, M. et al (2021).

[215] Source: Sandler, R. and Basl, J. (2019). Building Data and AI Ethics Committees, p. 19. Accenture. Available at: https://www.accenture.com/_acnmedia/pdf-107/accenture-ai-and-data-ethics-committee-report-11.pdf

[216] See: Ada Lovelace Institute. (2022). Looking before we leap: Case studies. Available at: https://www.adalovelaceinstitute.org/resource/research-ethics-case-studies/

[217] Department of Health and Social Care. (2021). A guide to good practice for digital and data-driven health technologies. UK Government. Available at: https://www.gov.uk/government/publications/code-of-conduct-for-data-driven-health-and-care-technology/initial-code-of-conduct-for-data-driven-health-and-care-technology

[218] Go Fair. Fair principles. Available at: https://www.go-fair.org/fair-principles/

[219] Digital Curation Centre (DCC). ‘List of metadata standards’. Available at: https://www.dcc.ac.uk/guidance/standards/metadata/list

[220] Partnership on AI. (2021). Responsible Sourcing of Data Enrichment Services. Available at: https://partnershiponai.org/paper/responsible-sourcing-considerations/

[221] Northwestern University. (2014). Guidelines for Academic Requesters. Available at: https://irb.northwestern.edu/docs/guidelinesforacademicrequesters-1.pdf

[222]Partnership on AI. AI Incidents Database. Available at: https://partnershiponai.org/workstream/ai-incidents-database/

[223] AIAAIC. AIAAIC Repository. Available at: https://www.aiaaic.org/aiaaic-repository

 

[224] DeepMind. (2022). ‘How our principles helped define Alphafolds release’. Available at: https://www.deepmind.com/blog/how-our-principles-helped-define-alphafolds-release

[225] Jobin, A., Ienca, M. and Vayena, E. (2019). ‘The global landscape of AI ethics guidelines’. Nature, 1, pp. 389–399. Available at : https://doi.org/10.1038/s42256-019-0088-2

[226] Jobin, A., Ienca, M. and Vayena, E. (2019).

[227] Ferretti, A., Ienca, M., Sheehan, M. et al. (2021). ‘Ethics review of big data research: What should stay and what should be reformed?’. BMC Medical Ethics, 22(1), pp. 1–13. Available at: https://bmcmedethics.biomedcentral.com/articles/10.1186/s12910-021-00616-4

[228] Dove, E. S. and Garattini, C. (2018). ‘Expert perspectives on ethics review of international data-intensive research: Working towards mutual recognition’. Research Ethics, 14(1), pp. 1–25.

[229] Mitrou, L. (2018). ‘Data Protection, Artificial Intelligence and Cognitive Services: Is the General Data Protection Regulation (GDPR) “Artificial Intelligence-Proof”?’. Social Science Research Network. Available at: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3386914

[230] Information Commissioner’s Office (ICO). Guidance on AI and data protection. Available at: https://ico.org.uk/for-organisations/guide-to-data-protection/key-dp-themes/guidance-on-ai-and-data-protection/

[231] Samuel, G., Derrick, G. E. and Van Leeuwen, T. (2019). ‘The ethics ecosystem: Personal ethics, network governance and regulating actors governing the use of social media research data’. Minerva, 57(3), pp. 317–343. Available at: https://link.springer.com/article/10.1007/s11024-019-09368-3

[232] Ferretti, A., Ienca, M., Sheehan, M. et al. (2021). ‘Ethics review of big data research: What should stay and what should be reformed?’. BMC Medical Ethics, 22(1), pp. 1–13. Available at: https://bmcmedethics.biomedcentral.com/articles/10.1186/s12910-021-00616-4

[233]  Ashurst, C., Anderljung, M., Prunkl, C. et al. (2020). ‘A Guide to Writing the NeurIPS Impact Statement’. Centre for the Governance of AI. Available at: https://medium.com/@GovAI/a-guide-to-writing-the-neurips-impact-statement-4293b723f832

[234] Castelvecchi, D. (2020). ‘Prestigious AI meeting takes steps to improve ethics of research’. Nature, 589(7840), pp. 12–13. Available at: https://doi.org/10.1038/d41586-020-03611-8

[235] NeurIPS. (2021). NeurIPS 2021 Paper Checklist Guidelines. Available at: https://neurips.cc/Conferences/2021/PaperInformation/PaperChecklist

[236] Canadian Institute for Advanced Research, Partnership on AI and Ada Lovelace Institute. (2022). A culture of ethical AI: report. Available at: https://www.adalovelaceinstitute.org/event/culture-ethical-ai-cifar-pai/

[237] Gardner, A., Smith, A. L., Steventon, A. et al. (2021). ‘Ethical funding for trustworthy AI: proposals to address the responsibilities of funders to ensure that projects adhere to trustworthy AI practice’. AI and Ethics, 2. pp.1–15. Available at: https://link.springer.com/article/10.1007/s43681-021-00069-w

[238] Provost, F. and  Fawcett T. (2013). ‘Data science and its relationship to big data and data-driven decision making’. Big Data, 1(1), pp. 51–59.

[239] We borrow from the definition used by the European Commission’s High Level Expert Group on AI. See: European Commission. (2019). Ethics guidelines for trustworthy AI. Available at: https://digital-strategy.ec.europa.eu/en/library/ethics-guidelines-trustworthy-ai

[240] The Alan Turing Institute. ‘About us’. Available at: https://www.turing.ac.uk/about-us

[241] The Turing Way Community et al. (2019). The Turing Way: A Handbook for Reproducible Data Science. Available at: https://the-turing-way.netlify.app/welcome

[242] The Turing Way Community et al. (2020). Guide for Reproducible Research. Available at: https://the-turing-way.netlify.app/reproducible-research/reproducible-research.html

[243] The Turing Way Community et al. (2020). Guide for Project Design. Available at: https://the-turing-way.netlify.app/project-design/project-design.html

[244] The Turing Way Community et al. (2020). Guide for Communication. Available at: https://the-turing-way.netlify.app/communication/communication.html

[245] The Turing Way Community et al. (2020). Guide for Collaboration. Available at: https://the-turing-way.netlify.app/collaboration/collaboration.html

[246]The Turing Way Community et al. (2020). Guide for Ethical Research. Available at: https://the-turing-way.netlify.app/ethical-research/ethical-research.html

[247] See: https://saildatabank.com/

[248] University of Exeter. (2021). Ethics Policy. Available at: https://www.exeter.ac.uk/media/universityofexeter/governanceandcompliance/researchethicsandgovernance/Ethics_Policy_Revised_November_2020.pdf

 

[249] University of Exeter. (2021). Research Ethics Policy and Framework. Available at: https://www.exeter.ac.uk/media/universityofexeter/governanceandcompliance/researchethicsandgovernance/Revised_UoE_Research_Ethics_Framework_v1.1_07052021.pdf

[250] University of Exeter (2021).



This report sets out the first-known detailed proposal for the use of an algorithmic impact assessment for data access in a healthcare context – the UK National Health Service (NHS)’s proposed National Medical Imaging Platform (NMIP).

It proposes a process for AIAs, which aims to ensure that algorithmic uses of public-sector data are evaluated and governed to produce benefits for society, governments, public bodies and technology developers, as well as the people represented in the data and affected by the technologies and their outcomes.

This includes actionable steps for the AIA process, alongside more general considerations for the use of AIAs in other public and private-sector contexts.

This report is supported by a sample user guide and template for the process.

  1. User guide (HTML, PDF)
  2. AIA template (HTML)

Glossary of abbreviated terms

 

AIA: Algorithmic impact assessment

DAC: Data Access Committee

NMIP: National Medical Imaging Platform

Executive summary

Governments, public bodies and developers of artificial intelligence (AI) systems are becoming interested in algorithmic impact assessments (referred to throughout this report as ‘AIAs’) as a means to create better understanding of and accountability for potential benefits and harms from AI systems. At the same time – as a rapidly growing area of AI research and application – healthcare is recognised as a domain where AI has the potential to bring significant benefits, albeit with wide-ranging implications for people and society.

This report offers the first-known detailed proposal for the use of an algorithmic impact assessment for data access in a healthcare context – the UK National Health Service (NHS)’s proposed National Medical Imaging Platform (NMIP). It includes actionable steps for the AIA process, alongside more general considerations for the use of AIAs in other public and private-sector contexts.

There are a range of algorithmic accountability mechanisms being used in the public sector, designed to hold the people and institutions that design and deploy AI systems accountable to those affected by them.[footnote]Ada Lovelace Institute, AI Now Institute, Open Government Partnership. (2021). Algorithmic accountability for the public sector. Open Government Partnership. Available at: https://www.opengovpartnership.org/documents/algorithmic-accountability-public-sector/[/footnote] AIAs are an emerging mechanism, proposed as a method for building algorithmic accountability, as they have the potential to help build public trust, mitigate potential harm and maximise potential benefit of AI systems.

Carrying out an AIA involves assessing possible societal impacts of an AI system before implementation (with ongoing monitoring often advised).[footnote]Ada Lovelace Institute and DataKind UK. (2020). Examining the black box: tools for assessing AI systems. Ada Lovelace Institute. Available at: https://www.adalovelaceinstitute.org/report/examining-the-black-box-tools-for-assessing-algorithmic-systems/[/footnote]

AIAs are not a complete solution for accountability on their own: they are best complemented by other algorithmic accountability initiatives, such as audits or transparency registers.

AIAs are currently largely untested in public-sector contexts. This project synthesises existing literature with new research to propose both a use case for AIA methods and a detailed process for a robust algorithmic impact assessment. This research has been conducted in the context of a specific example of an AIA in a healthcare setting, to explore the potential for this accountability mechanism to help data-driven innovations to fulfil their potential to support new practices in healthcare.

In the UK, the national Department for Health and Social Care and the English National Health Service (NHS) are supporting public and private sector AI research and development, by enabling access for developers and researchers to high-quality medical imaging datasets to train and validate AI systems. However, data-driven healthcare innovations also have the potential to produce harmful outcomes and exacerbate existing health and social inequalities, by undermining patient consent to data use and public trust in AI systems. These impacts can result in serious harm to both individuals and groups who are often ‘left behind’ in provision of health and social care.[footnote]Ada Lovelace Institute. (2021). The data divide. Available at: https://www.adalovelaceinstitute.org/wp-content/uploads/2021/03/Thedata-divide_25March_final-1.pdf[/footnote]

Because of the risk and scale of harm, it is vital that developers of AI-based healthcare systems go through a process of assessing the potential impacts of their system throughout its lifecycle. This can help mitigate possible risks to patients and the public, reduce legal liabilities for healthcare providers who use their system, and build understanding of how the system can be successfully integrated and used by clinicians.

This report offers a proposal for the use of an algorithmic impact assessment for data access in a healthcare context – the proposed National Medical Imaging Platform (NMIP) from the NHS AI Lab. Uniquely, the focus of this research is a context where the public and private sector use of AIAs intersect – a public health body that has created a database of medical imaging records and, as part of the process for granting access, has requested private sector and academic researchers and developers complete an AIA.

This report proposes a seven-stage process for algorithmic impact assessments

Building on Ada’s existing work on assessing AI systems,[footnote]Ada Lovelace Institute. (2021). Technical methods for regulatory inspection of algorithmic systems. Available at: https://www. adalovelaceinstitute.org/report/technical-methods-regulatory-inspection/[/footnote] the project evaluates the literature on AIA methods and identifies a model for their use in a particular context. Through interviews with NHS stakeholders, experts in impact assessments and potential ‘users’ of the NMIP, this report explores how an AIA process can be implemented in practice, addressing three questions:

  1. As an emerging methodology, what does an AIA process involve, and what can it achieve?
  2. What is the current state of thinking around AIAs and their potential to produce accountability, minimise harmful impacts, and serve as a tool for the more equitable design of AI systems?
  3. How could AIAs be conducted in a way that is practical, effective, inclusive and trustworthy?

The report proposes a process for AIAs, which aims to ensure that algorithmic uses of public-sector data are evaluated and governed to produce benefits for society, governments, public bodies and technology developers, as well as the people represented in the data and affected by the technologies and their outcomes.

The report findings include actionable steps to help the NHS AI Lab establish this process, alongside more general considerations for the use of AIAs in other public and private-sector contexts. The process this report recommends the NHS AI Lab adopt includes seven steps (an illustrative sketch in code follows the list):

  1. AIA reflexive exercise: an impact-identification exercise is completed by the applicant team(s) and submitted to the NMIP Data Access Committee (DAC) as part of the NMIP filtering. This templated exercise prompts teams to detail the purpose, scope and intended use of the proposed system, model or research, and who will be affected. It also provokes reflexive thinking about common ethical concerns, consideration of intended and unintended consequences and possible measures to help mitigate any harms.
  2. Application filtering: an initial process of application filtering is completed by the NMIP DAC to determine which applicants proceed to the next stage of the AIA.
  3. AIA participatory workshop: an interactive workshop is held, which equips participants with a means to pose questions and pass judgement on the harm and benefit scenarios identified in the previous exercise (and possibly to uncover further impacts), broadening participation in the AIA process.
  4. AIA synthesis: the applicant team integrates the workshop findings into the template.
  5. Data-access decision: the NMIP DAC makes a decision about whether to grant data access. This decision is based on criteria  relating to the potential risks posed by this system and whether the product team has offered satisfactory mitigations to potentially harmful outcomes.
  6. AIA publication: the completed AIAs are published externally in a central, easily accessible location, probably the NMIP website.
  7. AIA iteration: the AIA is revised on an ongoing basis by project teams, and at certain trigger points, such as a process of significant model redevelopment.
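
Purely as an illustration – and not part of the NHS AI Lab’s specification – the sketch below models these seven stages and the DAC’s data-access gate as a minimal Python workflow. The class, field and function names (Stage, AIASubmission, dac_decision) are our own hypothetical choices.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from enum import Enum, auto


class Stage(Enum):
    """The seven stages of the proposed NMIP AIA process, in order."""
    REFLEXIVE_EXERCISE = auto()      # 1. impact-identification exercise
    APPLICATION_FILTERING = auto()   # 2. DAC filters applications
    PARTICIPATORY_WORKSHOP = auto()  # 3. wider participants interrogate impacts
    SYNTHESIS = auto()               # 4. workshop findings folded into the template
    DATA_ACCESS_DECISION = auto()    # 5. DAC decides whether to grant access
    PUBLICATION = auto()             # 6. completed AIA published centrally
    ITERATION = auto()               # 7. AIA revised at trigger points


@dataclass
class AIASubmission:
    """Hypothetical record of one applicant team's progress through the process."""
    applicant: str
    stage: Stage = Stage.REFLEXIVE_EXERCISE
    identified_impacts: list[str] = field(default_factory=list)
    mitigations: list[str] = field(default_factory=list)
    access_granted: bool | None = None  # set by the DAC at stage 5

    def advance(self) -> None:
        """Move to the next stage; stage 7 (iteration) repeats rather than advancing."""
        if self.stage is not Stage.ITERATION:
            self.stage = Stage(self.stage.value + 1)


def dac_decision(submission: AIASubmission, mitigations_satisfactory: bool) -> bool:
    """Toy stand-in for the DAC's stage-5 judgement: access depends on whether
    satisfactory mitigations have been offered for the identified impacts."""
    submission.access_granted = mitigations_satisfactory and bool(submission.mitigations)
    return submission.access_granted
```

Under these assumptions, a submission would advance stage by stage, with dac_decision acting as the gate before publication and the iteration stage repeating whenever a trigger point such as significant model redevelopment is reached.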

Alongside the AIA process detail, this report outlines seven ‘operational questions’ for policymakers, developers and researchers to consider before beginning to develop or implement an AIA:

  1. How to navigate the immaturity of the wider assessment ecosystem?
  2. What groundwork is required prior to the AIA?
  3. Who can conduct the assessment?
  4. How to ensure meaningful participation in defining and identifying impacts?
  5. What is the artefact of the AIA and where can it be published?
  6. Who will act as a decision-maker about the suitability of the AIA and the acceptability of the impacts it documents?
  7. How will trials be resourced, evaluated and iterated?

The report offers a clear roadmap towards the implementation of an AIA

In conclusion, the report offers a clear roadmap towards the implementation of an AIA. It will be of value to policymakers, public institutions and technology developers interested in algorithmic accountability mechanisms who need a high-level understanding of the process and its specific uses, alongside generalisable findings. It will also be useful for people interested in participatory methods for data governance (following on from our Participatory data stewardship report).[footnote]Ada Lovelace Institute. (2021). Participatory data stewardship. Available at: https://www.adalovelaceinstitute.org/report/participatorydata-stewardship/[/footnote]

In addition, for technology developers with an AI system that needs to go through an AIA process, or data controllers requiring external applicants to complete an AIA as part of a data-access process, the report offers a detailed understanding of the process through supporting documentation.

This documentation includes a step-by-step guide to completing the AIA for applicants to the NMIP, and a sample AIA output template, modelled on the document NMIP applicant teams would submit with a data-access application.
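
As a purely hypothetical sketch of the kind of structured record such a template might capture – the field names below are ours and do not come from the NMIP documentation – the elements the report describes (purpose and scope, affected groups, harms and benefits, mitigations, workshop findings) could be represented as:

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class AIATemplate:
    """Hypothetical structure for a completed AIA output, loosely following the
    elements described in this report: purpose and scope, affected groups,
    identified harms and benefits, mitigations and participatory workshop findings."""
    project_name: str
    purpose_and_scope: str
    intended_use: str
    affected_groups: list[str] = field(default_factory=list)
    identified_harms: list[str] = field(default_factory=list)
    identified_benefits: list[str] = field(default_factory=list)
    mitigations: list[str] = field(default_factory=list)
    workshop_findings: list[str] = field(default_factory=list)  # added at the synthesis stage

    def to_publication_record(self) -> dict:
        """Flatten the template into a plain dict, e.g. for central publication."""
        return self.__dict__.copy()
```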

Introduction

This project explores the potential for the use of AIAs in a real-world case study: AI in medical imaging

Rapid innovation in the use of analytics and data-driven technology (including AI) is shaping almost every aspect of our daily lives. The healthcare sector has seen significant growth in applications of data and AI – from automated diagnostics and personalised medicine to the analysis of medical imaging for screening, diagnosis and triage – as developers attempt to make existing tasks like diagnostic prediction more efficient and to reimagine new ways of delivering more personalised forms of healthcare.[footnote]Bohr, A. and Memarzadeh, K. (2020). ‘The rise of artificial intelligence in healthcare applications’. Artificial Intelligence in Healthcare, pp.25-60. Available at: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7325854/[/footnote]

However, while data-driven innovation holds the potential to revolutionise healthcare, it also has the potential to exacerbate health inequalities and increase demand on an already overstretched health and social care system. The risks of deploying AI and data-driven technologies in the health system include, but are not limited to:

  • The perpetuation of ‘algorithmic bias’,[footnote]Angwin, J., Larson, J., Mattu, S. and Kirchnir, L. (2016). ‘Machine bias’. ProPublica. Available at: https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing[/footnote] exacerbating health inequalities by replicating entrenched social biases and racism in existing systems.[footnote]Barocas, S. and Selbst, A. D. (2016). ‘Big data’s disparate impact’. California Law Review, 104, pp. 671- 732. [online] Available at: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2477899[/footnote]  [footnote]Buolamwini, J. and Gebru, T. (2018). ‘Gender shades: intersectional accuracy disparities in commercial gender classification’. Conference on Fairness, Accountability and Transparency, pp.1-15.[online] Available at: https://proceedings.mlr.press/v81/ buolamwini18a/buolamwini18a.pdf[/footnote]  [footnote]Miller, C. (2015). ‘When algorithms discriminate’. The New York Times. Available at: https://www.nytimes.com/2015/07/10/upshot/ when-algorithms-discriminate.html[/footnote]
  • Inaccessible language or lack of transparent explanations can make it hard for clinicians, patients and the public to understand the technologies and their uses, undermining public scrutiny and accountability.
  • The collection of personal data, tracking and the normalisation of surveillance, creating risks to individual privacy.

By exploring the applicability of AIAs toward a healthcare case study of medical imaging, we hope to gain a richer understanding of how AIAs should be adopted in practice

This project explores the potential for use of one approach to algorithmic accountability, algorithmic impact assessments or ‘AIAs’ (see: ‘What is an algorithmic impact assessment?’), in a real-world case study: AI in medical imaging. AIAs are an emerging approach for holding the people and institutions that design and deploy AI systems accountable to those who are affected by them, and a way to pre-emptively identify potential impacts arising from the design, development and deployment of algorithms on people and society.

The site of this research is unique among existing uses of AIAs: it is located in the domain of healthcare, which is heavily regulated, has a strong tradition of ethical awareness and public participation, and is likely to produce ‘high-risk’ applications.

While many AIA proposals have focused on public-sector uses of AI[footnote]Reisman, D., Schultz, J., Crawford, K. and Whittaker, M. (2018). Algorithmic impact assessments: a practical framework for public agency accountability. AI Now Institute. Available at: https://ainowinstitute.org/aiareport2018.pdf[/footnote]  [footnote]Government of Canada. (2020). Directive on Automated Decision-Making. Available at: https://www.tbs-sct.gc.ca/pol/doc-eng. aspx?id=32592[/footnote]  [footnote]Ada Lovelace Institute, AI Now Institute, Open Government Partnership.(2021). Algorithmic accountability for the public sector. Open Government Partnership[/footnote] (AIAs have not yet been adopted in the private sector), and there may be a health-related AIA completed under the Canadian AIA framework, this study looks at applications at the intersection of a public and private-sector data-access process. Applications in this context are developed on data originating in the public sector, by a range of mainly private actors, but with some oversight from a public-sector department (the NHS).

This new AIA is proposed as part of a data-access process for a public-sector dataset – the National Medical Imaging Platform (NMIP). This is, to our knowledge, unique in AIAs so far. Where other proposals for AIAs have used legislation or independent assessors, this model uses a Data Access Committee (DAC) as a forum for holding developers accountable – to require the completion of the AIA, to evaluate the AIA and to prevent a project proceeding (or at least, proceeding with NHS data) if the findings are not satisfactory.

These properties provide a unique context, and also have implications for the design of this AIA, which should be considered by anyone looking to apply parts of this process in another domain or context. We expect elements of this process, such as the AIA template and exercise formats, to prove transferable.

Some aspects, including using a DAC as the core accountability mechanism, and the centralisation of publication and resourcing for the participatory workshops, will not be directly transferable to all other cases but should form a sound structural basis for thinking about alternative solutions.

The generalisable findings to emerge from this research should be valuable to the regulators, policymakers and healthcare providers like the NHS, who will need to use a variety of tools and approaches to assess the potential and actual impacts of AI systems operating in the healthcare environment. In Examining the Black Box, we surveyed the state of the field in data-driven technologies and identified four notable methodologies under development, including AIAs,[footnote]Ada Lovelace Institute and DataKind UK. (2020). Examining the Black Box: tools for assessing algorithmic systems. Available at: https://www.adalovelaceinstitute.org/report/examining-the-black-box-tools-for-assessing-algorithmic-systems/[/footnote] and our study of algorithmic accountability mechanisms for the public sector identifies AIAs as forming part of the typology of other policies currently in use globally, including transparency mechanisms, audits and regulatory inspection, and independent oversight bodies.[footnote]Ada Lovelace Institute, AI Now Institute and Open Government Partnership. (2021). Algorithmic accountability for the public sector. Open Government Partnership. Available at: https://www.opengovpartnership.org/documents/algorithmic-accountabilitypublic- sector/[/footnote]

These tools and approaches are still very much in their infancy, with little consensus on how and when to apply them and what their stated aims should be, and few examples of these tools in practice. Most evidence for the usefulness of AIAs at present has come from examples of impact assessments in other sectors, rather than practical implementation. Accordingly, AIAs cannot be assumed to be ready to roll out.

By exploring the applicability of AIAs toward a healthcare case study of medical imaging – namely, the use of AIAs as part of the data release strategy of the forthcoming National Medical Imaging Platform (NMIP) from the NHS AI Lab – we hope to gain a richer understanding of how AIAs should be adopted in practice, and how such tools can be translated into meaningful algorithmic accountability and, ultimately, better outcomes for people and society.

AI in medical imaging has the potential to optimise existing processes in clinical pathways, support clinicians with decision-making and allow for better use of clinical data, but some have urged developers to adhere to regulation and governance frameworks to assure safety, quality and security and prioritise patient benefit and clinician support.[footnote]Royal College of Radiologists. Policy priorities: Artificial Intelligence. Available at: https://www.rcr.ac.uk/press-and-policy/policypriorities/artificial-intelligence[/footnote]

Understanding algorithmic impact assessments

What is an algorithmic impact assessment?

Algorithmic impact assessments (referred to throughout this report as ‘AIAs’) are a tool for assessing possible societal impacts of an AI system before the system is in use (with ongoing monitoring often advised).[footnote]Ada Lovelace Institute and DataKindUK. (2020). Examining the Black Box: tools for assessing algorithmic systems. Available at: https://www.adalovelaceinstitute.org/report/examining-the-black-box-tools-for-assessing-algorithmic-systems/[/footnote]

They have been proposed by researchers, policymakers and developers as one algorithmic accountability approach – a way to create greater accountability for the design and deployment of AI systems.[footnote]Knowles, B. and Richards, J. (2021). ‘The sanction of authority: promoting public trust in AI’. Computers and Society. Available at: https://arxiv.org/abs/2102.04221[/footnote] The intention of these approaches is to build public trust in the use of these systems, mitigate their potential to cause harm to people and groups,[footnote]Raji, D., Smart, A., White, R. N., Mitchell, M., Gebru, T., Hutchinson, B., Smith-Loud, J., Theron, D. and Barnes, P. (2020). ‘Closing the AI accountability gap: defining an end-to-end framework for internal algorithmic auditing’. Conference on Fairness, Accountability, and Transparency, pp.33–44. Barcelona: ACM. Available at: https://doi.org/10.1145/3351095.3372873[/footnote] and maximise their potential for benefit.[footnote]Leslie, D. (2019). Understanding artificial intelligence ethics and safety: A guide for the responsible design and implementation of AI systems in the public sector. The Alan Turing Institute. Available at: https://www.turing.ac.uk/sites/default/files/2019-06/ understanding_artificial_intelligence_ethics_and_safety.pdf[/footnote]

AIAs build on the broader methodology of impact assessments, a type of policy assessment with a long history of use in other domains, such as finance, cybersecurity and environmental studies.[footnote]Reisman, D., Schultz, J., Crawford, K. and Whittaker, M. (2018). Algorithmic impact assessments: a practical framework for public agency accountability. AI Now Institute. Available at: https://ainowinstitute.org/aiareport2018.pdf[/footnote] Other closely related types of impact assessments include data protection impact assessments (DPIAs), which evaluate the impact of a technology or policy on individual data privacy rights, and human rights impact assessments (HRIAs), originating in the development sector but increasingly used to assess the human rights impacts of business practices and technologies.[footnote]Recent examples include Facebook’s ex post HRIA of their platform’s effects on the genocide in Myanmar, and Microsoft’s HRIA of its use of AI. See: Latonero, M. and Agarwal, A. (2021). Human rights impact assessments for AI: learning from Facebook’s failure in Myanmar. CARR Center for Human Rights Policy Harvard Kennedy School. Available at: https://carrcenter.hks.harvard.edu/files/ cchr/files/210318-facebook-failure-in-myanmar.pdf; Article One. Challenge: From 2017 to 2018, Microsoft partnered with Article One to conduct the first-ever Human Rights Impact Assessment (HRIA) of the human rights risks and opportunities related to artificial intelligence (AI). Available at: https://www.articleoneadvisors.com/case-studies-microsoft[/footnote]

AIAs encourage developers of AI systems to consider the potential impacts of the development and implementation of their system

Conducting an impact assessment provides actors with a way to assess and evaluate the potential economic, social and environmental impacts of a proposed policy or intervention.[footnote]Adelle, C. and Weiland, S. (2012). ‘Policy assessment: the state of the art’. Impact Assessment and Project Appraisal 30.1, pp. 25- 33 Available at: https://www.tandfonline.com/doi/full/10.1080/14615517.2012.663256[/footnote] Some impact assessments are conducted prior to launching a policy or project as a way to foresee potential risks, known as ex ante assessments, while others are launched once the policy or project is already in place, to evaluate how the project went – known as ex post.

Unlike other impact assessments, AIAs specifically encourage developers of AI systems to consider the potential impacts of the development and implementation of their system. Will this system affect certain individuals disproportionately more than others? What kinds of socio-environmental factors – such as stable internet connectivity or a reliance on existing hospital infrastructure – will determine its success or failure? AIAs provide an ex ante assessment of these kinds of impacts and potential mitigations at the earliest stages of an AI system’s development.

Current AIA practice in the public and private sectors

AIAs are currently not widely used in either public or private sector contexts and there is no single accepted standard, or ‘one size fits all’, methodology for their use.

AIAs were first proposed by the AI Now Institute as a detailed framework for underpinning accountability in public sector agencies that engages communities impacted by the use of public sector algorithmic decision-making,[footnote]Reisman, D., Schultz, J., Crawford, K. and Whittaker, M. (2018). Algorithmic impact assessments: a practical framework for public agency accountability. AI Now Institute. Available at: https://ainowinstitute.org/aiareport2018.pdf[/footnote] building from earlier scholarship that proposed the use of ‘algorithmic impact statements’ as a way to manage predictive policing technologies.[footnote]Selbst, A.D. (2017). ‘Disparate impact in big data policing’. 52 Georgia Law Review 109, pp.109-195. Available at: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2819182[/footnote]

Though consensus is growing over the importance of principles like accountability, transparency and fairness for the development and use of AI systems, individual priorities and organisational interpretations of these terms differ. This lack of consistency means not all AIAs are designed to achieve the same ends, and the process for conducting AIAs will depend on the specific context in which they are implemented.[footnote]Metcalf, J., Moss, E., Watkins, E.A., Ranjit, S. and Elish, M.C. (2021). ‘Algorithmic impact assessments and accountability: the co-construction of impacts’. Conference on Fairness, Accountability, and Transparency [online] Available at: https://dl.acm.org/doi/pdf/10.1145/3442188.3445935[/footnote]

Recent scholarship from Data & Society identifies 10 ‘constitutive components’ that are common to different types of impact assessment and necessary for inclusion in any AIA. These include a ‘source of legitimacy’ – the idea that an impact assessment must be legally mandated and enforced through another institutional structure such as a government agency – and a relational dynamic between stakeholders, the accountable actor and an accountability forum, which describes how accountability relationships are formed.

In an ‘actor – forum’ relationship, an actor should be able to explain and justify conduct to an external forum, who are able to pass judgement.[footnote]Wieringa, M. (2020). ‘What to account for when accounting for algorithms: a systematic literature review on algorithmic accountability’. Conference on Fairness, Accountability, and Transparency, pp.1-18 [online] Barcelona: ACM. Available at: https://dl.acm.org/ doi/10.1145/3351095.3372833[/footnote]
Other components include ‘public consultation’, involving gathering feedback from external perspectives for evaluative purposes, and ‘public access’, which gives members of the public access to crucial material about the AIA, such as its procedural elements, in order to further build accountability.[footnote]Moss, E., Watkins, E.A., Singh, R., Elish, M.C. and Metcalf, J. (2021). Assembling accountability: algorithmic impact assessment for the public interest. Data & Society. Available at: https://datasociety.net/library/assembling-accountability-algorithmic-impactassessment- for-the-public-interest/[/footnote]

While varied approaches to AIAs have been proposed in theory, only one current model of AIA exists in practice, authorised by the Treasury Board of Canada Secretariat’s Directive on Automated Decision-Making,[footnote]Government of Canada. (2020). Directive on Automated Decision-Making. Available at: https://www.tbs-sct.gc.ca/pol/doc-eng. aspx?id=32592[/footnote] aimed at Canadian civil servants and used to manage public-sector AI delivery and procurement standards. The lack of more practical examples of AIAs is a known deficiency in the literature.

The lack of real-world examples and practical difficulty for institutions implementing AIAs remains a concern for those advocating for their widespread adoption, particularly as part of policy interventions.

An additional consideration is the inclusion of a diverse range of perspectives in the process of its development. Most AIA processes are controlled and determined by decision-makers in the algorithmic process, with less emphasis on the consultation of outside perspectives, including the experiences of those most impacted by the algorithmic deployment. As a result, AIAs are at risk of adopting an incomplete or incoherent view of potential impacts, divorced from these lived experiences.[footnote]Katell, M., Young, M., Dailey, D., Herman, B., Guetler, V., Tam, A., Bintz, C., Raz, D. and Krafft, P. M. (2020). ‘Toward situated interventions for algorithmic equity: lessons from the field’. Conference on Fairness, Accountability, and Transparency pp.44-45 [online] ACM: Barcelona. Available at: https://dl.acm.org/doi/abs/10.1145/3351095.3372874[/footnote] To practically seek and integrate those perspectives into the final AIA output has proven to be a difficult and ill-defined undertaking, with the required guidance being largely unavailable.

Canadian algorithmic impact assessment model

 

At the time of writing, the Canadian AIA is the only known and recorded AIA process implemented in practice. The Canadian AIA is a procurement management tool adopted under the Directive on Automated Decision-Making, aiming to guide policymakers into best practice use and procurement of AI systems that might be used to help govern service delivery at the federal level.

 

The Directive draws from administrative law principles of procedural fairness, accountability, impartiality and rationality,[footnote]Scassa, T. (2020). Administrative law and the governance of automated decision-making: a critical look at Canada’s Directive on Automated Decision-Making. Forthcoming, University of British Columbia Law Review. Available at: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3722192[/footnote] and is aimed at all AI systems that are used to make a decision about an individual.[footnote]Government of Canada. (2020). Directive on Automated Decision-Making. Available at: https://www.tbs-sct.gc.ca/pol/doc-eng.aspx?id=32592[/footnote] One of the architects of the AIA, Noel Corriveau, considers that a merit of impact assessments is that they facilitate compliance with legal and regulatory requirements.[footnote]Karlin, M. and Corriveau, N. (2018). ‘The Government of Canada’s Algorithmic Impact Assessment: Take Two’. Supergovernance. Available at: https://medium.com/@supergovernance/the-government-of-canadas-algorithmic-impact-assessment-take-two-8a22a87acf6f[/footnote]

 

The AIA itself consists of an online questionnaire of eight sections containing 60 questions related to technical attributes of the AI system, the data underpinning it and how the system designates decision-making, and frames ‘impacts’ as the ‘broad range of factors’ that may arise because of a decision made by, or supported by, an AI system. Four categories of ‘impacts’ are utilised in this AIA: the rights of individuals, health and wellbeing of individuals, economic interests of individuals and impacts on the ongoing sustainability of an environmental ecosystem.

 

Identified impacts are ranked according to a sliding scale, from little to no impact to very high impact, and weighted to produce a final impact score. Once complete, the AIA is exported to PDF format and published on the Open Canada website. At the time of writing, there are four completed Canadian AIAs, providing useful starting evidence for how AIAs might be documented and published.
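
To make the scoring mechanics concrete, here is a minimal sketch of a weighted questionnaire score of the kind described above. The four categories follow those listed in the Directive, but the weights, rating scale and level thresholds are invented for illustration and do not reproduce the Canadian AIA’s actual questionnaire or scoring formula.

```python
# Illustrative only: weights, ratings and thresholds are hypothetical.
IMPACT_CATEGORIES = {
    "rights_of_individuals": 0.3,
    "health_and_wellbeing_of_individuals": 0.3,
    "economic_interests_of_individuals": 0.2,
    "environmental_sustainability": 0.2,
}


def impact_score(answers: dict) -> float:
    """Weighted average of per-category ratings (each rated 0 = little to no impact
    up to 4 = very high impact), normalised to the range 0..1."""
    total = 0.0
    for category, weight in IMPACT_CATEGORIES.items():
        ratings = answers.get(category, [])
        category_mean = (sum(ratings) / len(ratings)) if ratings else 0.0
        total += weight * (category_mean / 4.0)  # scale the 0..4 ratings to 0..1
    return total


def impact_level(score: float) -> str:
    """Map the numeric score onto a coarse impact level (thresholds are illustrative)."""
    if score < 0.25:
        return "little to no impact"
    if score < 0.5:
        return "moderate impact"
    if score < 0.75:
        return "high impact"
    return "very high impact"


# Example: a hypothetical applicant's ratings per category.
example = {
    "rights_of_individuals": [2, 3],
    "health_and_wellbeing_of_individuals": [3],
    "economic_interests_of_individuals": [1],
    "environmental_sustainability": [0],
}
print(impact_level(impact_score(example)))  # -> "moderate impact"
```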

Many scholars and practitioners consider AIAs to hold great promise in assessing the possible impacts of the use of AI systems within the public sector, including applications that range from law enforcement to welfare delivery.[footnote]Margetts, H. and Dorobantu, C. (2019). ‘Rethink government with AI’. Nature. Available at: https://www.nature.com/articles/d41586- 019-01099-5[/footnote] For instance, the AI Now Institute’s proposed AIA sets out a process intended to build public agency accountability and public trust.[footnote]Reisman, D., Schultz, J., Crawford, K. and Whittaker, M. (2018). Algorithmic impact assessments: a practical framework for public agency accountability. AI Now Institute. Available at: https://ainowinstitute.org/aiareport2018.pdf[/footnote] 

As we explored in Algorithmic accountability for the public sector, AIAs can be considered part of a wider toolkit of algorithmic accountability policies and approaches adopted globally, including algorithm auditing,[footnote]Raji, D., Smart, A., White, R. N., Mitchell, M., Gebru, T., Hutchinson, B., Smith-Loud, J., Theron, D. and Barnes, P. (2020). ‘Closing the AI accountability gap: defining an end-to-end framework for internal algorithmic auditing’. Conference on Fairness, Accountability, and Transparency, pp.33–44. Barcelona: ACM. Available at: https://doi.org/10.1145/3351095.3372873[/footnote] and algorithm transparency registers.[footnote]Transparency registers document and make public the contexts where algorithms and AI systems are in use in local or federal Government, and have been adopted in cities including Helsinki, see: City of Helsinki AI register. What is the AI register? Available at: https://ai.hel.fi/ and Amsterdam, see: Amsterdam Algorithm Register Beta. What is the algorithm register? Available at: https:// algoritmeregister.amsterdam.nl/[/footnote] 

Other initiatives have been devised as ‘soft’ self-assessment frameworks, to be used alongside an organisation or institution’s existing ethics and norms guidelines, or in deference to global standards like the IEEE’s AI Standards or the UN Guiding Principles on Business and Human Rights. These kinds of initiatives often allow some flexibility in their recommendations to suit specific use cases, as seen in the European Commission’s High-level Expert Group on AI’s assessment list for trustworthy AI.[footnote]European Commission. (2020). Assessment List for Trustworthy Artificial Intelligence (ALTAI) for self-assessment. Available at: https://digital-strategy.ec.europa.eu/en/library/assessment-list-trustworthy-artificial-intelligence-altai-self-assessment[/footnote]

While many proponents of AIAs from civil society and academia see them as a method for improving public accountability,[footnote]Binns, R. (2018). ‘Algorithmic accountability and public reason’. Philosophy & Technology, 31, pp.543-556. [online] Available at: https://link.springer.com/article/10.1007/s13347-017-0263-5[/footnote] AIAs also have scope for adoption within private-sector institutions, provided that regulators and public institutions incentivise their adoption and compel their use in certain private-sector contexts. Conversely, AIAs also help provide a lens for regulators to view, understand and pass judgement on institutional cultures and practices.[footnote]Selbst, A.D. (2021). ‘An institutional view of algorithmic impact assessments’, Harvard Journal of Law & Technology (forthcoming). Available at: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3867634[/footnote] The US Algorithmic Accountability Act, proposed in 2019, sets out requirements for large private companies to undertake impact assessments,[footnote]Congress.Gov. (2019). H.R.2231 – Algorithmic Accountability Act of 2019. Available at: https://www.congress.gov/bill/116th-congress/house-bill/2231#:~:text=Introduced%20in%20House%20(04%2F10%2F2019)&text=This%20bill%20requires%20specified%20commercial,artificial%20intelligence%20or%20machine%20learning[/footnote] with progress on the Act beginning to regain momentum.[footnote]Johnson, K. (2021). ‘The movement to hold AI accountable gains more steam’. Ars Technica. Available at: https://arstechnica.com/tech-policy/2021/12/the-movement-to-hold-ai-accountable-gains-more-steam/3/[/footnote]

The focus of this case study is on a context where the public and private sector use of AIAs intersect – a public health body has created a database of medical imaging records and, as part of the process for granting access, has requested private-sector and academic researchers and developers complete an AIA. This is a novel context that presents its own unique challenges and learnings (see: ‘Annex 1: Proposed process in detail’), but has also yielded important considerations that we believe are pertinent and timely for other actors interested in AIAs (see: ‘Seven operational questions for AIAs’).

Goals of the NHS AI Lab NMIP AIA process

This report aims to outline a practical design of the AIA process for the NHS AI Lab’s NMIP project. To do this, we reviewed the literature to uncover both areas of consensus and uncertainty among AIA scholars and practitioners, in order to build on and extend existing research. We also interviewed key NHS AI Lab and NMIP stakeholders, employees at research labs and healthtech start-ups who would seek access to the NMIP and experts in algorithmic accountability issues in order to guide the development of our process (see: ‘Methodology’).

As discussed above, AIAs are context-specific and differ in their objectives and assumptions, and their construction and implementation. It is therefore vital that the NMIP AIA has clearly defined and explained goals in order to both communicate the purpose of an AIA for the NMIP context, and ensure the process works, enabling a thorough, critical and meaningful ex ante assessment of impacts.

This information is important for developers who undertake the AIA process to understand the assumptions behind its method, as well as policymakers interested in algorithmic accountability mechanisms, in order to usefully communicate the value of this AIA and distinguish it from other proposals.

In this context, this AIA process is designed to achieve the following goals:

  1. accountability
  2. reflection/reflexivity
  3. standardisation
  4. independent scrutiny
  5. transparency.

These goals emerged both from literature review and interviews, enabling us to identify areas where the AIA would add value, complement existing governance initiatives and contribute to minimising harmful impacts.

  1. Accountability
    It’s important to have a clear understanding of what accountability means in the context of the AIA process. The definition that is most helpful here understands accountability as a depiction of the social relationship between an ‘actor’ and a ‘forum’, where being accountable describes an obligation of the actor to explain and justify conduct to a forum.[footnote]Bovens, M. (2006). Analysing and assessing public accountability. A conceptual framework. European Governance Papers (EUROGOV) No. C-06-01. Available at: https://www.ihs.ac.at/publications/lib/ep7.pdf[/footnote] An actor in this context might be a key decision-maker within an applicant team, such as a technology developer and project principal investigator. The forum might comprise the arrangement of external stakeholders, such as clinicians who might use the system, members of the Data Access Committee (DAC) and members of the public. The forum must have the capacity to deliberate on the actor’s actions, ask questions, pass judgement and enforce sanctions if necessary.[footnote]Metcalf, J., Moss, E., Watkins, E.A., Ranjit, S. and Elish, M.C. (2021). ‘Algorithmic impact assessments and accountability: the coconstruction of impacts’. Conference on Fairness Accountability, and Transparency [online] Available at: https://dl.acm.org/doi/ pdf/10.1145/3442188.3445935[/footnote]
  2. Reflection/reflexivity
    An AIA process should prompt reflection from developers and critical dialogue with individuals who would be affected by this process about how the design and development of a system might result in certain harms and benefits – to clinicians, patients and society. Behaving reflexively means examining or responding to one’s own – or a team’s own – practices, motives and beliefs during a research process. Reflexivity is an essential principle for completing a thorough, meaningful and critical AIA, closely related to the concept of positionality, which has been developed through work on AI ethics and safety in the public sector.[footnote]Leslie, D. (2019). Understanding artificial intelligence ethics and safety. Available at: https://www.turing.ac.uk/sites/default/files/2019-08/understanding_artificial_intelligence_ethics_and_safety.pdf[/footnote] Our reflexive exercise enables this practice among developers by providing an actionable framework for discussing ethical considerations arising from the deployment of AI systems, and a forum for exploration of individual biases and ways of viewing and understanding the world.

    Broad participation from a range of perspectives is therefore a critical element of reflection that includes some awareness of positionality. The AIA exercises were built with continual reflexivity in mind, providing a means for technology developers to examine ethical principles thoroughly during the design and development phases.
  3. Standardisation
    Our literature review revealed that while many scholars have proposed possible approaches and methods for an AIA, these tend to be higher-level recommendations for an overall approach, with little discussion of how individual activities of the AIA should be structured, captured and recorded. A notable exception is the Canadian AIA, which makes use of a questionnaire to capture the impact assessment process, providing a format for the AIA ‘users’ to follow in order to complete the AIA, and for external stakeholders to view once the AIA is published.

    Some existing data/AI governance processes were confusing for product and development teams. One stakeholder interviewee commented: ‘Not something I’m an expert in – lots of the forms written in language I don’t understand, so was grateful that our information governance chaps took over and made sure I answered the right things within that.’ This underscored the need for a clear, coherent and standardised AIA process to ensure that applicant teams are able to engage fully with the task and that completed AIAs are of a consistent standard.

    To ensure NMIP applicants find the AIA as effective and practical as possible, and to build consistency between applications, it is important they undergo a clearly defined process that leads to an output that can be easily compared and evaluated. To this end, our AIA process provides a standard template document, both to aid the process and to keep relative uniformity between different NMIP applications. Over time, once this AIA has been trialled and tested, we envisage that standardised and consistent applications will also help the DAC and members of the public to begin to develop paradigms of the kinds of harms and benefits that new applicants should consider.

  4. Independent scrutiny
    The goal of independent scrutiny is to provide external stakeholders with the powers to scrutinise, assess and evaluate AIAs and identify any potential issues with the process. Many proposed AIAs argue for multistakeholder collaboration,[footnote]Reisman, D., Schultz, J., Crawford, K. and Whittaker, M. (2018). Algorithmic impact assessments: a practical framework for public agency accountability. AI Now Institute. Available at: https://ainowinstitute.org/aiareport2018.pdf[/footnote] but there is a notable gap in procedure for how participation would be structured in an AIA, and how external perspectives would be included in the process.
    We sought to address these gaps by building a participatory initiative as part of the NMIP AIA (for more information on the participatory workshop, see: ‘Annex 1: Proposed process in detail’). Independent scrutiny helps to build robust accountability, as it helps to formalise the actor-forum relationship, providing further opportunity for judgement and deliberation among the wider forum.[footnote]Wieringa, M. (2020). ’What to account for when accounting for algorithms: a systematic literature review on algorithmic accountability’. Conference on Fairness, Accountability and Transparency, p.1-18. ACM: Barcelona. Available at: https://dl.acm.org/doi/ abs/10.1145/3351095.3372833[/footnote] AIAs should be routinely scrutinised to ensure they are used and adopted effectively, that teams are confident and critical in their approach to examining impacts, and that AIAs provide continual value.
  5. Transparency
    In this context, we consider AIA transparency as building in critical oversight of the AIA process itself, focusing on making the AIA, as a mechanism of governance, transparent. This differs from making transparent details about the AI system and its logic – what has been referred to as ‘first-order transparency’.[footnote]Kaminski, M. (2020). ‘Understanding transparency in algorithmic accountability’. Cambridge Handbook of the Law of Algorithms, ed. Woodrow Barfield. Cambridge: Cambridge University Press [online] Available at: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3622657[/footnote] This AIA aims to improve transparency via both internal and external visibility, by prompting applicant teams to document the AIA process and findings, which are then published centrally for members of the public to view. Making this information publicly available provides more information for regulators, civil society organisations and members of the public about what kinds of systems are being developed in the UK healthcare context, and how their societal impacts are understood by those who develop or research them.

In order to achieve these goals, the AIA process and output make use of two principal approaches: documentation and participation.

  1. Documentation
    Thorough recordkeeping is critical to this AIA process and can produce significant benefits for developers and external stakeholders.

    Teams who have access to documentation stating ethical direction are more likely to address ethical concerns with a project at the outset.[footnote]Boyd, K.L (2021). ’Datasheets for datasets help ML engineers notice and understand ethical issues in training data’. Proceedings of the ACM on Human-Computer Interaction, 5, 438, pp.1-27. [online] Available at: https://dl.acm.org/doi/abs/10.1145/3479582[/footnote] Documentation can change internal process and practice, as it necessitates reflexivity, which creates opportunities to better identify, understand and question assumptions and behaviours.

    This shift in internal process may also begin to influence external practice: it has been argued that good AIA documentation process may create what sociologists call ‘institutional isomorphism’, where industry practice begins to homogenise owing to social and normative pressures.[footnote]Selbst, A. (2021). ’An institutional view of algorithmic impact assessments’. 35 Harvard Journal of Law & Technology (forthcoming), pp.1-79. [online] Available at: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3867634[/footnote] Through consistent documentation, teams gain a richer context for present and future analysis and evaluation of the project.
  2. Participation
    Participation is the mechanism for bringing a wider range of perspectives to the AIA process. It can take various forms – from soliciting written feedback through to deliberative workshops – but should always aim to bring the lived experiences of people and communities who are affected by an algorithm to bear on the AIA process.[footnote]See: Ada Lovelace Institute. (2021). Participatory data stewardship. Available at: https://www.adalovelaceinstitute.org/report/participatory-data-stewardship/ for a framework of different approaches to participation in relation to data-driven technologies and systems.[/footnote] When carried out effectively, participation supports teams in building higher quality, safer and fairer products.[footnote]Madaio, M.A. et al (2020) ‘Co-designing checklists to understand organizational challenges and opportunities around fairness in AI’ CHI Conference on Human Factors in Computing Systems, pp.1-14 [online]. Available at: https://doi.org/10.1145/3313831.3376445[/footnote]

    The participatory workshop in the NMIP AIA (see: ‘Annex 1: Proposed process in detail’ for a full description) enables the process of impact identification to go beyond the narrow scope of the applicant team(s). Building participation into the AIA process brings external scrutiny of an AI healthcare system from outside the team’s perspective, and provides alternative sources of knowledge and relevant lived experience and expertise. It also enables independent review of the impacts of an AI system, as participants are unencumbered by the typical conflicts of interest that may interfere with the ability of project stakeholders to judge their system impartially.

The context of healthcare AI

There is a surge in the development and trialling of AI systems in healthcare.[footnote]Davenport, T. and Kalakota, R. (2019). ‘The potential for artificial intelligence in healthcare’. Future Healthcare Journal, 6,2, pp.94-98. [online] Available at: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6616181/[/footnote] A significant area of growth is the use of AI in medical imaging, where AI imaging systems assist clinicians in cancer screening, supporting diagnosis/prognosis, patient triage and patient monitoring.[footnote]NHS AI Lab. AI in imaging. Available at: https://www.nhsx.nhs.uk/ai-lab/ai-lab-programmes/ai-in-imaging/[/footnote]

The UK Department of Health and Social Care (DHSC) has set out national commitments to support public and private sector AI research and development in healthcare by ensuring that developers and researchers have access to high-quality datasets to train and validate AI models, underlining four guiding principles that steer this effort:

  1. user need
  2. privacy and security
  3. interoperability and openness
  4. inclusion.[footnote]Department of Health and Social Care. (2018). The future of healthcare: our vision for digital, data and technology in health and care. UK Government. Available at: https://www.gov.uk/government/publications/the-future-of-healthcare-our-vision-for-digital-data-andtechnology- in-health-and-care/the-future-of-healthcare-our-vision-for-digital-data-and-technology-in-health-and-care[/footnote]

In the current NHS Long Term Plan, published in 2019, AI is described as a means to improve efficiency across service delivery by supporting clinical decisions, as well as a way to ‘maximise the opportunities for use of technology in the health service’.[footnote]NHS. (2019). The NHS Long Term Plan. Available at: https://www.longtermplan.nhs.uk/wp-content/uploads/2019/08/nhs-long-termplan- version-1.2.pdf[/footnote] Current initiatives to support this drive for testing, evaluation and scale of AI-driven technologies include the AI in Health and Care Award, run by the Accelerated Access Collaborative, in partnership with NHSX (now part of the NHS Transformation Directorate)[footnote]NHSX is now part of the NHS Transformation Directorate. More information is available at: https://www.nhsx.nhs.uk/blogs/nhsxmoves- on/ At the time of research and writing NHSX was a joint unit of NHS England and the UK Department of Health and Social Care that reported directly to the Secretary of State and the Chief Executive of NHS England and NHS Improvement. NHSX was also the parent organisation of the NHS AI Lab. For the remainder of the report, ‘NHSX’ will be used to refer to this organisation.[/footnote] and the National Institute for Health Research (NIHR).

However, while data-driven healthcare innovation holds the potential to support new practices in healthcare, careful research into the integration of AI systems in clinical practice is needed to ground claims of model performance and to uncover where systems would be most beneficial in the context of particular clinical pathways. For example, a recent systematic review of studies measuring test accuracy of AI in mammography screening practice has revealed that radiologists still outperform the AI in detection of breast cancer.[footnote]Freeman, K., Geppert, J., Stinton, C., Todkill, D., Johnson, S., Clarke, A. and Taylor-Phillips, S. (2021). ‘Use of artificial intelligence for image analysis in breast cancer screening programmes: systematic review of test accuracy’. British Medical Journal 2021, 374 [online] Available at: https://pubmed.ncbi.nlm.nih.gov/34470740/[/footnote]

To ensure healthcare AI achieves the benefits society hopes for, it is necessary to recognise the possible risks of harmful impacts from these systems. For instance, concerns have been raised that AI risks further embedding or exacerbating existing health and social inequalities – a risk that is evidenced in both systems that are working as designed,[footnote]Wen, D., Khan, S., Ji Xu, A., Ibrahim, H., Smith, L., Caballero, J., Zepeda, L., de Blas Perez, C., Denniston, A., Lui, X. and Martin, R. (2021). ‘Characteristics of publicly available skin cancer image datasets: a systematic review’. The Lancet: Digital Health [online]. Available at: https://www.thelancet.com/journals/landig/article/PIIS2589-7500(21)00252-1/fulltext[/footnote] and in those that are producing errors or are failing.[footnote]Banerje, I et al. (2021). ‘Reading race: AI recognises patient’s racial identity in medical images’. Computer Vision and Pattern Recognition. Available at: https://arxiv.org/abs/2107.10356[/footnote]  [footnote]Antun, V., Renna, F., Poon, C., Adcock, B., Hansen, A. C. (2020). ‘On instabilities of deep learning in image reconstruction and the potential costs of AI’. Proceedings of the National Academy of Sciences of the United States of America, p. 117, 48 [online] Available at: https://www.pnas.org/content/117/48/30088[/footnote]

Additionally, there are concerns around the kinds of interactions that take place between clinicians and AI systems in clinical settings: the AI system may contribute to human error, override much-needed human judgement, or lead to overreliance or misplaced faith in the accuracy metrics of the system.[footnote]Topol, E. (2019). ‘High performance medicine: the convergence of human and artificial intelligence’. Nature Medicine, 25, pp.45-56. [online] Available at: https://www.nature.com/articles/s41591-018-0300-7[/footnote]

The NHS has a longstanding commitment to privacy and processing personal data in accordance with the General Data Protection Regulation (GDPR),[footnote]NHSX. How NHS and care data is protected. Available at: https://www.nhsx.nhs.uk/key-tools-and-info/data-saves-lives/how-nhs-and-care-data-is-protected[/footnote] which may create tension with the more recent commitment to make patient data available to companies.[footnote]NHS Digital. How NHS Digital makes decisions about data access. Available at: https://digital.nhs.uk/services/data-access-request-service-dars/how-nhs-digital-makes-decisions-about-data-access[/footnote] Potential harmful impacts arising from use of these systems are myriad, ranging from healthcare-specific concerns around violating patient consent over the use of their data, to more generic risks such as creating public mistrust of AI systems and the institutions that develop or deploy them.

It is important to understand that impacts are not felt equally across people and groups: for example, a person belonging to a marginalised group may experience even greater mistrust around the use of AI, owing to past discrimination.

These impacts can result in serious harm to both individuals and groups, who are often ‘left behind’ in provision of health and social care.[footnote]Ada Lovelace Institute. (2021). The data divide. Available at: https://www.adalovelaceinstitute.org/report/the-data-divide/[/footnote] Harmful impacts can arise from endemic forms of bias during AI design and development, from error or malpractice at the point of data collection, and from over-acceptance of model output and reduced vigilance at the point of end use.[footnote]Data Smart Schools. (2021). Deb Raji on what ‘algorithmic bias’ is (…and what it is not). Available at: https://data-smart-schools.net/2021/04/02/deb-raji-on-what-algorithmic-bias-is-and-what-it-is-not/[/footnote] Human values and subjectivities such as biased or racist attitudes or behaviours can become baked into AI systems,[footnote]Balayn, A and Gürses, S. (2021). Beyond debiasing: regulating AI and its inequalities. European Digital Rights. Available at: https://edri.org/wp-content/uploads/2021/09/EDRi_Beyond-Debiasing-Report_Online.pdf[/footnote] and reinforce systems of oppression once in use, resulting in serious harm.[footnote]Noble, S.U. (2018). Algorithms of oppression: how search engines reinforce racism. NYU Press.[/footnote] For example, in the USA, an algorithm commonly used in hospitals to determine which patients required follow-up care was found to classify White patients as more ill than Black patients even when their level of illness was the same, affecting millions of patients for years before it was detected.[footnote]Chakradhar, S. (2019). ‘Widely used algorithm in hospitals is biased, study finds’. STAT. Available at: https://www.statnews.com/2019/10/24/widely-used-algorithm-hospitals-racial-bias/[/footnote]

Because of the risk and scale of harm, it is vital that developers of AI-based healthcare systems go through a process of assessing potential impacts of their system throughout its lifecycle. Doing so can help developers mitigate possible risks to patients and the public, reduce legal liabilities for healthcare providers who use their system, and consider how their system can be successfully integrated and used by clinicians.

Impacts arising from development and deployment
of healthcare AI systems

AI systems are valued by their proponents for their potential to support clinical decisions, monitor patient health, free up resources and improve patient outcomes. If realised, these would be beneficial, tangible outcomes, but there may also be unintended consequences both when the AI system is used as intended and when it produces errors or fails.

 

Many of these technologies are in their infancy and have often only recently been adopted into clinical settings, so there is a real risk of them producing adverse effects, causing harm to people and society in the near and long term. Given the scale at which these systems operate and the high risk of significant harm if they fail in a healthcare setting, it is essential for developers to consider the impacts of their system before it is put into use.

 

Recent evidence provides examples of some kinds of impacts (intended or otherwise) that have emerged from the development and deployment of healthcare AI systems:

  • A study released in July 2021 found that algorithms used in healthcare are able to read a patient’s race from medical images including chest and hand X-rays and mammograms.[footnote]Gichoya, J.W. et al. (2021). ‘Reading race: AI recognises patient’s racial identity in medical images’. arXiv. Available at: https://arxiv.org/abs/2107.10356[/footnote] Race is not an attribute normally detectable from scans. Other evidence shows that Black patients and patients from other marginalised groups may receive inferior care to that received by White patients.[footnote]Frakt, A. (2020). ‘Bad medicine: the harm that comes from racism’. The New York Times. [online] Available at: https://www.nytimes.com/2020/01/13/upshot/bad-medicine-the-harm-that-comes-from-racism.html[/footnote] Being able to identify race from a scan (with any level of certainty) raises the risk of introducing an unintended system impact that causes harm to both individuals and society, reinforcing systemic health inequalities.
  • A 2020 study of the development, implementation and evaluation of Sepsis Watch, an AI ‘early-warning system’ for assisting hospital clinicians in the early diagnosis and treatment of sepsis, uncovered unintended consequences.[footnote]Sendak, M. et al. (2020). ‘“The human body is a black box”: supporting clinical decision-making with deep learning’. In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency. ACM, New York, NY, USA, pp. 99–109. Available at: https://doi.org/10.1145/3351095.3372827[/footnote] Sepsis Watch was successfully integrated with clinical practice after close engagement with nurses and hospital staff to ensure it triggered an alarm in an appropriate way and led to a meaningful response. But the adoption of the system had an unanticipated impact: clinicians took on an intermediary role between the AI system and other clinicians in order to successfully integrate the tool for hospital use. This demonstrates that developers should take into account the socio-environmental requirements needed to successfully implement and run an AI system.
  • A study released in December 2021 revealed underdiagnosis bias in AI-based chest X-ray (CXR) prediction models among marginalised populations, particularly in intersectional subgroups.[footnote]Seyyed-Kalantari, L., Zhang, H., McDermott, M., Chen, I. Y., Ghassemi, M. (2021). ‘Underdiagnosis bias of artificial intelligence algorithms applied to chest radiographs in underserved patient populations’. Nature Medicine, 27, pp. 2176-2182. Available at: https://www.nature.com/articles/s41591-021-01595-0[/footnote] This example shows that analysis of how an AI system performs on certain societal groups may be missed, so careful consideration of user populations ex ante is critical to help mitigate harms ex post. It also demonstrates how some AI systems may result in a reduced quality of care that may cause injury to some patients.
  • A study on the implementation of an AI-based retinal scanning tool in Thailand for detecting diabetic eye disease found that its success depended on socio-environmental factors like whether the hospital had a stable internet connection and lighting conditions for taking photographs – when these were insufficient, the use of the AI system caused delays and disruption.[footnote]Beede, E., Elliott Baylor, E., Hersch, F., Iurchenko, A., Wilcox, L., Ruamviboonsuk, P. and Vardoulakis, L. (2020). ‘A human-centered evaluation of a deep learning system deployed in clinics for the detection of diabetic retinopathy’. In: CHI Conference on Human Factors in Computing Systems (CHI ‘20), April 25-30, 2020, Honolulu, HI, USA. ACM, New York, NY, USA. Available at: https://dl.acm.org/doi/fullHtml/10.1145/3313831.3376718[/footnote] The researchers found that clinicians unexpectedly created ‘work-arounds’ for the intended study design use of the AI system. This reflected unanticipated needs that affected how the process worked – in particular, that patients may struggle to attend distant hospitals for further examination, which made hospital referral a poor fallback when the AI system failed. This concern was identified through researchers’ discussions with clinicians, showing the potential value of participation early in the design and development process.

The utility of AIAs in health policy: complementing existing governance processes in the UK healthcare space

The AIA process is intended to complement and build from existing regulatory requirements imposed on proposed medical AI products, recognising the sanctity of well-established regulation. As a result, it is essential to survey that regulatory context before diving into the specifics of what an AIA requires, and where an AIA can add value.

Compared to most other domains, the UK’s healthcare sector already has in place relatively mature regulatory frameworks for the development and deployment of AI systems with a medical purpose. The UK Government has indicated that further updates to regulation are forthcoming, in order to be more responsive to data-driven technologies like AI.[footnote]Medicines and Healthcare products Regulatory Agency (2020). Regulating medical devices in the UK. UK Government. Available at: https://www.gov.uk/guidance/regulating-medical-devices-in-the-uk[/footnote] The result is a complex ecosystem of regulatory compliance, with several frameworks for risk assessment, technical, scientific and clinical assurance and data protection that those adopting or building these systems must navigate.

This AIA process is therefore proposed as one component in a broader accountability toolkit, which is intended to provide a standardised, reflexive framework for assessing impacts of AI systems on people and society. It was designed to complement – not replicate or override – existing governance processes in the UK healthcare space. Table 1 below compares the purpose, properties and evidence required by some of these processes, to map how this AIA adds value.

Table 1: How does this AIA complement some existing processes in the healthcare space?

Medical devices regulation
  • Type of initiative: Legislation.
  • Initiative details: Follows the EU risk-based classification of medical devices, implemented and enforced by a competent authority: in the UK, this is the Medicines & Healthcare products Regulatory Agency (MHRA). MHRA’s medical device product registration, known as a CE marking process, is a requirement under the UK’s Medical Device Regulations 2002. Higher-risk products will have conformity assessments carried out by third parties: notified bodies.[footnote]Medicines and Healthcare products Regulatory Agency (MHRA). (2020). Medical devices: conformity assessment and the UKCA mark. UK Government. Available at: https://www.gov.uk/guidance/medical-devices-conformity-assessment-and-the-ukca-mark[/footnote]
  • Which part of project lifecycle? Whole lifecycle, particularly development, and including post-deployment.
  • Purpose: To demonstrate the product meets regulatory requirements and to achieve a risk classification, from Class I (lowest perceived risk) to Class III (highest), that provides a quantified measure of risk.
  • Output? Classification of device, e.g. Class IIb, to be displayed outwardly. Technical documentation on metrics like safety and performance. Declaration of conformity resulting in CE/UKCA mark.
  • What evidence is needed? Chemical, physical and biological properties of the product, and that the benefits outweigh risks and achieve claimed performance (proven with clinical evidence). Manufacturers must also ensure ongoing safety by carrying out post-market surveillance under guidance of the MHRA.
  • How does the AIA differ from, and complement, this process? Building off the risk-based approach, the AIA encourages further reflexivity on who gets to decide and define these risks and impacts, broadening out the MHRA classification framework. It also helps teams better understand impacts beyond risk to the individual. This AIA proposes a DAC to assess AIAs; in future, this could be a notified body (as in the MHRA initiative).

NHS code of conduct for digital and data-driven health technologies (DHTs)
  • Type of initiative: Non-mandatory, voluntary best-practice standards.
  • Initiative details: The NHS outlines 12 key principles of good practice for innovators designing and developing data-driven healthcare products, including ‘how to operate ethically’, ‘usability and accessibility’, and technical assurance. There is considerable emphasis on ‘good data protection practice, including data transparency’.
  • Which part of project lifecycle? Development and procurement.
  • Purpose: To help developers understand NHS motivations and standards for buying digital and data-driven technology products.
  • Output? No specific output.
  • What evidence is needed? Value proposition, mission statement and assurance testing of product; also asks users to think of data ethics frameworks.
  • How does the AIA differ from, and complement, this process? The code of conduct mentions DPIAs; this AIA would move beyond data-processing risk. The guide considers impacts, such as impact on patient outcomes: the AIA adds weight by detailing procedure to achieve this impact, e.g. improving clinical outcomes through the comprehensive assessment of negative impacts, producing a record of this information to build evidence, and releasing it publicly for transparency.

NICE evidence standards frameworks for DHTs
  • Type of initiative: Non-mandatory, voluntary best-practice standards.
  • Initiative details: Outlines a set of standards for innovation, grouping DHTs into tiers based on functionality for a proportionate, streamlined framework. The framework’s scope covers DHTs that incorporate AI using fixed algorithms (but not DHTs using adaptive algorithms).
  • Which part of project lifecycle? Development and procurement.
  • Purpose: To help developers collect the appropriate evidence to demonstrate clinical effectiveness and economic impact for their data-driven product.
  • Output? No specific output.
  • What evidence is needed? Evidence of effectiveness of the technology and evidence of economic impact standards. Uses contextual questions to help identify ‘higher-risk’ DHTs, e.g. those with users from ‘vulnerable groups’.
  • How does the AIA differ from, and complement, this process? Our impact identification exercise uses similar Q&A prompts to help developers assess risk, but the AIA helps interrogate the ‘higher-risk’ framing: higher risk for whom? Who decides? The participatory workshop broadens out the people involved in these discussions, to help build a more holistic understanding of risk.

Data protection impact assessments (DPIAs)
  • Type of initiative: Mandatory impact assessment (with a legal basis under the GDPR).
  • Initiative details: Completed as a guardrail against improper data handling and to protect individual data rights (DPIAs are not specific to healthcare).
  • Which part of project lifecycle? Ideation to development.
  • Purpose: To ensure safe and fair handling of personal data and minimise risks arising from improper data handling, and as a legal compliance exercise.
  • Output? Completed DPIA document, probably a Word document or PDF saved as an internal record. While there is a general obligation to notify a data subject about the processing of their data, there is no obligation to publish the results of the DPIA.[footnote]Kaminski, M.E. and Malgieri, G. (2020). ‘Algorithmic impact assessments under the GDPR: producing multi-layered explanations’. International Data Privacy Law, 11,2, pp.125-144. Available at: https://doi.org/10.1093/idpl/ipaa020[/footnote]
  • What evidence is needed? Evidence of compliance with the GDPR regulation on data categories, data handling, redress procedures, scope, context and nature of processing. Asks users to identify the source and nature of risk to individuals, with an assessment of likelihood and severity of harm. The DPIA also includes questions on consultations with ‘relevant stakeholders’.
  • How does the AIA differ from, and complement, this process? AIAs and DPIAs differ in scope and procedure, and we therefore recommend a copy of the DPIA also be included as part of the NMIP data access process. AIAs seek to encourage a reflexive discussion among project teams to identify and mitigate a wider array of potential impacts, including environmental, societal or individual harms. DPIAs are generally led by a single data controller, processor, legal expert or information-governance team, limiting scope for broader engagement. The AIA encourages engagement of individuals who may be affected by an AI system even if they are not subjects of that data.

ISO clinical standards: 14155 & 14971
  • Type of initiative: Non-mandatory clinical standards for medical devices (including devices with an AI component).
  • Initiative details: From the International Standards Organisation; considered the gold standard, internationally recognised, and usable as a benchmark for regulatory compliance.
  • Which part of project lifecycle? Whole lifecycle.
  • Purpose: To provide ‘presumption of conformity’ of good clinical practice during design, conduct, recording and reporting of clinical investigations, to assess the clinical performance or effectiveness and safety of medical devices.
  • Output? No specific output.
  • What evidence is needed? Evidence of how rights, safety and wellbeing of subjects are protected, scientific conduct, and responsibilities of the principal investigator. ISO 14971 requires teams to build a risk-management plan, including a risk assessment to identify possible hazards.
  • How does the AIA differ from, and complement, this process? The process of identifying possible impacts and building them into a standardised framework is common to both ISO 14971 and the AIA. However, the AIA does not measure for quality assurance or clinical robustness, to avoid duplication. Instead, it extends these proposals by helping developers better understand the needs of their users through the participatory exercise.

 

There is no single body responsible for regulating data-driven technologies in healthcare. Some of the key bodies regulating the development of medical devices with an AI component in the UK are outlined in Table 2 below:

Table 2: Key regulatory bodies for data-driven technologies in healthcare

  • Medicines and Healthcare products Regulatory Agency (MHRA): The MHRA regulates medicine, medical devices and blood components in the UK. It ensures regulatory requirements are met and has responsibility for setting post-market surveillance standards for medical devices.[footnote]Health Research Authority (HRA). Research Ethics Service and Research Ethics Committees. Available at: https://www.hra.nhs.uk/about-us/committees-and-services/res-and-recs/[/footnote] AI systems are regulated by the MHRA as medical devices.
  • Health Research Authority (HRA): If AI systems are developed within the NHS, projects will need approval from the Health Research Authority, which oversees responsible use of medical data, through a process that includes seeking ethical approval from an independent Research Ethics Committee (REC).[footnote]Health Research Authority (HRA). Research Ethics Service and Research Ethics Committees. Available at: https://www.hra.nhs.uk/about-us/committees-and-services/res-and-recs/[/footnote] The REC evaluates for ethical concerns around research methodology but does not evaluate for the potential broader societal impacts of research.
  • Information Commissioner’s Office (ICO): The ICO is the UK’s data protection regulator. AI systems in health are often trained on, and process, individual patients’ health data. There must be a lawful basis for use of personal data in the UK,[footnote]Health Research Authority (HRA). Research Ethics Service and Research Ethics Committees. Available at: https://www.hra.nhs.uk/about-us/committees-and-services/res-and-recs/[/footnote] and organisations are required to demonstrate understanding of and compliance with data security policies, usually by completing a data protection impact assessment (DPIA). The ICO assurance team may conduct audits of different health organisations to ensure compliance with the Data Protection Act.[footnote]Information Commissioner’s Office (ICO). Findings from ICO audits of NHS Trusts under the GDPR. Available at: https://ico.org.uk/media/action-weve-taken/audits-and-advisory-visits/2618960/health-sector-outcomes-report.pdf[/footnote]
  • National Institute for Health & Care Excellence (NICE): NICE supports developers and manufacturers of healthcare products, including data-driven technologies like AI systems, to produce robust evidence for their effectiveness. It has produced comprehensive guidance pages for clinical conditions, quality standards and advice pages, including the NICE evidence standards framework for digital health technologies (see ‘Table 1’ above).[footnote]National Institute for Care Excellence (NICE). Evidence standards framework for digital health technologies. Available at: https://www.nice.org.uk/about/what-we-do/our-programmes/evidence-standards-framework-for-digital-health-technologies[/footnote]

 

It is important to emphasise that this proposed AIA process is not a replacement for the above governance and regulatory frameworks. NMIP applicants expecting to build or validate a product from NMIP data are likely to go on to complete (or in some cases, have already completed) the processes of product registration and risk classification, and are likely to have experience working with frameworks such as the ‘Guide to good practice’ and NICE evidence standards framework.

Similarly, DPIAs are widely used across multiple domains because of their legal basis and are critical in healthcare, where use of personal data is widespread across different research and clinical settings. As Table 1 shows, we recommend to the NHS AI Lab that NMIP applicant teams should be required to submit a copy of their DPIA as part of the data-access process, as it specifically addresses data protection and privacy concerns around the use of NMIP data, which have not been the focus of the AIA process.

The AIA process complements these by providing insights into potential impacts through a participatory process with patients and clinicians (see ‘What value can AIAs offer developers of medical technologies?’). The AIA is intended as a tool for building robust accountability by providing additional routes to participation and external scrutiny: for example, there is no public access requirement for DPIAs, so we have sought to improve documentation practice to provide stable records of the process.

This project also made recommendations to the NHS AI Lab around best practice for documenting the NMIP dataset itself, using a datasheet that includes information about the dataset’s sources, what level of consent it was collected under, and other necessary information to help inform teams looking to use NMIP data and conduct AIAs – because datasets can have downstream consequences for the impacts of AI systems developed with them. [footnote]Boyd, K.L. (2021). ‘Datasheets for datasets help ML engineers notice and understand ethical issues in training data’. Proceedings of the ACM on Human-Computer Interaction, 5, 438. [online] Available at: http://karenboyd.org/blog/wp-content/uploads/2021/09/ Datasheets_Help_CSCW-5.pdf[/footnote]  [footnote]Gebru, T., Mogenstern, J., Vecchione, B., Wortman Vaughan, J., Wallach, H., Daumé III, H. and Crawford, K. (2018). Datasheets for datasets. ArXiv [online] Available at: https://arxiv.org/abs/1803.09010[/footnote]
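As a rough sketch of the kind of information such a datasheet could capture (not the actual documentation recommended to the NHS AI Lab), the example below lists illustrative fields in Python; every field name and value here is a hypothetical placeholder.

```python
# A hypothetical sketch of the kinds of fields a dataset datasheet might record,
# loosely following the 'datasheets for datasets' idea cited above. The field
# names are illustrative, not the documentation recommended to the NHS AI Lab.
datasheet = {
    "dataset_name": "Example medical-imaging dataset (illustrative)",
    "sources": ["Contributing NHS trusts (hypothetical placeholder)"],
    "consent_level": "To be recorded per contributing source",
    "modalities": ["chest X-ray", "MRI", "CT"],
    "collection_period": "To be recorded",
    "known_gaps_or_biases": "Populations or conditions that are under-represented",
    "intended_uses": "Training and validation of medical-imaging models",
    "maintenance_and_contact": "Dataset steward role and review schedule",
}

# A team preparing an AIA could consult these fields when assessing how the
# dataset's provenance might shape the impacts of a system trained on it.
for field_name, value in datasheet.items():
    print(f"{field_name}: {value}")
```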

Where does an AIA add value among existing processes?

Viewing impacts of AI systems with a wider lens

Given the high-stakes context of healthcare, many accountability initiatives use metrics of technical assurance, such as accuracy, safety and quality. Additionally, technologies built from patient data need to be assessed for their impacts on individual data privacy and security.

This AIA process encourages project teams to consider a wider range of impacts on individuals, society and the environment in the early stages of their work. It encourages a reflexive consideration of common issues that AI systems in healthcare may face, such as considerations around the explainability and contestability of decisions, potential avenues for misuse or abuse of a system, and where different forms of bias may appear in the development and deployment of a system.

Broadening the range of perspectives in a governance process

Beyond third-party auditing, there is little scope in the current landscape for larger-scale public engagement activity to deliberate on governance or regulation of AI in the healthcare space. Public and patient participation in health processes is widespread, but many organisations lack the resources or support to complete public engagement work at the scale they would like. It emerged from stakeholder interviews that our AIA would need to include a bespoke participatory process to provide insight into potential algorithmic harms: this insight is needed to build meaningful, critical AIAs, which in turn help to build better products.

Standardised, publicly available documentation

Many risk assessments, including other impact assessments like DPIAs, do not require the completed documentation to be published, or other evidence of how the process was undertaken to be made available.[footnote]Gebru, T., Mogenstern, J., Vecchione, B., Wortman Vaughan, J., Wallach, H., Daumé III, H. and Crawford, K. (2018). Datasheets for datasets. ArXiv [online] Available at: https://arxiv.org/abs/1803.09010[/footnote] It has been demonstrated that the varied applications of AI in healthcare worldwide have led to a lack of consensus and standardisation of documentation around AI systems and their adoption in clinical decision-making settings, which has implications both for evaluation and auditing of these systems, and for ensuring harm prevention.[footnote]Sendak, M., Gao, M., Brajer, N. and Balu, S. (2020). ‘Presenting machine learning model information to clinical end users with model facts labels’. npj Digital Medicine, 3,41, p1-4. [online] Available at: https://www.nature.com/articles/s41746-020-0253-3[/footnote] For the NMIP context, the intention was to introduce a level of standardisation across all AIAs to help address this challenge.

What value can AIAs offer developers of medical
technologies?

With over 80 AI ethics guides and guidelines available, developers report confusion about how to translate ethical and social principles into practice, which can lead to inertia. To disrupt this cycle, it is vital that technology developers and organisations adopting AI systems have access to frameworks and step-by-step processes to proceed with ethical design.

We interviewed several research labs and private firms developing AI products to identify where an AIA would add value (see ‘Methodology’). Our research uncovered that academic research teams, small health-tech start-ups and more established companies all have different considerations, organisational resources and expertise to bring to the table, but there are still common themes that underscore why a developer benefits from this AIA process:

  1. Clearer frameworks for meeting NHS expectations. Developers see value in considering societal impacts at the outset of a project, but lack a detailed and actionable framework for thinking about impacts. This kind of AIA exercise can identify potential failure modes within the successful implementation of a medical technology, and can help developers meet the NHS’s compliance requirements.
  2. Early insights can support and improve patient care outcomes. Some technology developers we interviewed reported a struggle with reaching and engaging patients and representatives of the public at the scale they would like. The AIA enables this larger-scale, meaningful interaction, resulting in novel insights. For applicant teams early on in the development process, the participatory workshop provides important context for how an applicant’s AI system might be received. Better understanding patient needs before the majority of system development or application is underway allows for further consideration in design decisions that might have a tangible effect on the quality of patient care in settings supported by an AI system.
  3. Building on AI system risk categorisation. Applicants hoping to use NMIP data to build and validate products will also have to undertake the MHRA medical device classification, which asks organisations to assign a category of risk to the product. It can be challenging for AI developers to make a judgement on the risk level of their system, and so the framework requires developers to assign a pre-determined risk category using a flowchart for guidance. It may still be challenging for developers to understand why and how certain attributes or more detailed design decisions correspond to a higher level of risk. The AIA’s reflexive impact identification exercise and participatory workshop move beyond a process of mapping technical details and help build a comprehensive understanding of possible impacts. It also provides space for applicant teams to explore risks or impacts that they feel may not be wholly addressed by current regulatory processes, such as considering societal risk in addition to individual risk of harm.

Case study: NHS AI Lab’s National Medical Imaging Platform

In this research, the NHS AI Lab’s National Medical Imaging Platform (NMIP) operates as a case study: a specific research context to test the applicability of algorithmic impact assessments (AIAs) within the chosen domain of AI in healthcare. It should be emphasised that this is not an implementation case study – rather, it is a case study of designing and building an AIA process. Further work will be required to implement and trial the process, and to evaluate its effectiveness once in operation.

The NHS AI Lab – part of the NHS Transformation Directorate driving the digital transformation of care – aims to accelerate the safe, ethical and effective adoption of AI in healthcare, bringing together government, health and care providers, academics and technology companies to collaborate on achieving this outcome.[footnote]NHS AI Lab. The NHS AI Lab: accelerating the safe adoption of AI in health and care. Available at: https://www.nhsx.nhs.uk/ai-lab/[/footnote]

The NMIP is an initiative to bring together medical-imaging data from across the NHS and make it available to companies and research groups to develop and test AI models.[footnote]NHS AI Lab. National Medical Imaging Platform (NMIP). Available at: https://www.nhsx.nhs.uk/ai-lab/ai-lab-programmes/ai-in-imaging/ national-medical-imaging-platform-nmip/[/footnote]

It is envisioned as a large medical-imaging dataset, comprising chest X-ray (CXR), magnetic resonance imaging (MRI) and computed tomography (CT) images from a national population base. It is being scoped as a possible initiative after a precursor study, the National COVID Chest Imaging Database (NCCID), which was a centralised database that contributed to the early COVID-19 pandemic response.[footnote]NHS AI Lab. The National COVID Chest Imaging Database. Available at: https://www.nhsx.nhs.uk/covid-19-response/data-and-covid-19/national-covid-19-chest-imaging-database-nccid/[/footnote] The NMIP was designed with the intention of broadening the geographical base and diagnostic scope of the original NCCID platform. At the time of writing, the NMIP is still a proposal and does not exist as a database.

How is AI used in medical imaging?

 

When we talk about the use of AI in medical imaging, we mean the use of machine-learning techniques on images for medical purposes – such as CT scans, MRI images or even photographs of the body. Medical imaging is used in medical specialisms including radiology (using CT scans or X-rays) and ophthalmology (using retinal photographs). Machine learning describes when computer software ‘learns’ to do a task from the data it is given instead of being programmed explicitly to do that task. The use of machine learning with images is often referred to as ‘computer vision’, a field that has had an impact in medicine over a long period.[footnote]Esteva, A., Chou, K., Yeung, S., Naik, N., Madani, A., Mottaghi, A., Liu, Y., Topol, E., Dean, J., and Socher, R. (2021). ‘Deep learning-enabled medical computer vision’. npj Digital Medicine, pp.1-9 [online]. Available at: https://www.nature.com/articles/s41746-020-00376-2[/footnote]

 

For example, AI in medical imaging may be used to make a diagnosis from a radiology image. The machine learning model will be trained on many radiology images (‘training data’) – some of which exhibit the clinical condition, and some which don’t – and from this will ‘learn’ to recognise images with the clinical condition, with a particular level of accuracy (it won’t always be correct). This model could then be used in a radiology department for diagnosis. Other uses include identifying types or severity of a clinical condition. Currently, these models are mostly intended for use alongside clinicians’ opinions.
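To make the training step more concrete, the sketch below shows a minimal, illustrative version of this kind of binary image classifier in Python (using PyTorch). It is not any particular medical product or NMIP code: randomly generated tensors stand in for labelled radiology images so the example runs end to end, and the model sizes and settings are arbitrary.

```python
# A minimal, illustrative sketch (not a clinical tool): training a binary image
# classifier of the kind described above. Random tensors stand in for labelled
# radiology images so the example runs end to end.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Stand-in "training data": 200 single-channel 64x64 images with binary labels
# (1 = condition present, 0 = absent). In practice these would be labelled scans.
images = torch.randn(200, 1, 64, 64)
labels = torch.randint(0, 2, (200,)).float()
loader = DataLoader(TensorDataset(images, labels), batch_size=32, shuffle=True)

# A small convolutional network that outputs a score for the condition.
model = nn.Sequential(
    nn.Conv2d(1, 8, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(8, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(16 * 16 * 16, 1),
)
optimiser = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.BCEWithLogitsLoss()

for epoch in range(3):  # a few passes over the data, for illustration only
    for batch_images, batch_labels in loader:
        optimiser.zero_grad()
        logits = model(batch_images).squeeze(1)
        loss = loss_fn(logits, batch_labels)
        loss.backward()
        optimiser.step()

# The trained model produces a probability, not a certain answer, echoing the
# point above that such systems won't always be correct.
with torch.no_grad():
    probability = torch.sigmoid(model(images[:1])).item()
print(f"Predicted probability of condition: {probability:.2f}")
```

In a real setting, the stand-in tensors would be replaced by curated, labelled scans, and the model’s probabilistic output would be evaluated against clinical ground truth before any use alongside clinicians.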

 

An example of AI in medical imaging is software that uses machine learning to read chest CT scans, to detect possible early-stage lung cancer. It does this by identifying lung (pulmonary) nodules, a kind of abnormal growth that forms in the lung. Such products are intended to speed up the CT reading process and claim to lower the risk of misdiagnosis.

 

The NMIP, as part of the NHS AI Lab, is intended to collect medical images and associated data that could be used to train and validate machine learning models.[footnote]NHS AI Lab. National Medical Imaging Platform. Available at: https://www.nhsx.nhs.uk/ai-lab/ai-lab-programmes/ai-in-imaging/ national-medical-imaging-platform-nmip/[/footnote]

An example product that might be built from a dataset like the NMIP would be a tool that helps to detect the presence of a cardiac tumour by interpreting images, after training on thousands of MRI images that show both the presence and absence of a tumour. As well as detection, AI imaging products may help with patient diagnosis for clinical conditions like cancer, and may also help triage patients based on the severity of abnormality detected from a particular set of images. The developers of these products claim they have the potential to improve health outcomes – by speeding up waiting times for patient diagnosis, for example – and to ease possible resourcing issues at clinical sites.

The NMIP will be available, on application, for developers to test, train and validate imaging products. Organisations with a project that would benefit from access to the NMIP dataset would need to make an application to access the dataset, describing the project and how it will use NMIP data.

From interviews with stakeholders, we envisage that applicants will be seeking access to the NMIP for one of three reasons:

  1. To conduct academic or corporate research that uses images from the NMIP dataset.
  2. To train a new commercial medical product that uses NMIP data.
  3. To analyse and assess existing models or commercial medical products using NMIP data.

This AIA process is therefore aimed at both private and public-sector researchers and firms.

In this proposed process, access to the NMIP will be decided by an NHS-operated Data Access Committee (DAC). DACs are already used for access to other NHS datasets, such as the University College London Hospital (UCLH) DAC, which manages and controls access to COVID-19 patient data.[footnote]UCL. (2020). UCLH Covid-19 data access committee set up. Available at: https://www.ucl.ac.uk/joint-research-office/news/2020/jun/uclh-covid-19-data-access-committee-set[/footnote] There is also a DAC process in place for the NCCID, which will help inform the process for the NMIP.

For the NCCID, the DAC evaluates requests for access on criteria such as the scientific merit of the project, its technical feasibility, the track record of the research team, reasonable evidence that access to data can benefit patients and the NHS, and compliance with the GDPR and NHS standards of information governance and IT security. We anticipate the NMIP DAC will evaluate against similar criteria, and have structured this process so that the AIA complements these other criteria by encouraging research teams to think reflexively about the potential benefits and harms of their project, engage with patients and clinicians to surface critical responses, and present a document outlining those impacts to the DAC.

DACs can deliberate on a number of ethical and safety issues around use of data, as shown in the detailed process outlined below. For example, in the NMIP context, the DAC will be able to review submitted AIAs and make judgements about the clarity and strength of the process of impact identification, but they may also be required to review a DPIA, which we recommend would be a requirement of access. This would provide a more well-rounded picture of how each applicant has considered possible social impacts arising from their project. However, evidence suggests DACs often deliberate predominately around issues of data privacy and the rights of individual data subjects[footnote]Cheah, P.Y. and Piasecki, J. (2020). ’Data access committees‘. BMC Medical Ethics, 21, 12 [online] Available at: https://link.springer. com/article/10.1186/s12910-020-0453-z[/footnote]  [footnote]Thorogood A., and Knoppers, B.M. (2017). ‘Can research ethics committees enable clinical trial data sharing?’. Ethics, Medicine and Public Health, 3,1, pp.56-63.[online] Available at: https://www.sciencedirect.com/science/article/abs/pii/S2352552517300129[/footnote] which is not the sole focus of our AIA. Accordingly, the NMIP DAC will be expected to broaden their expertise and understanding of a range of possible harms and benefits from an AI system – a task that we acknowledge is essential but may require additional resource and support.

The proposed AIA process

Summary

Our AIA process is designed to ensure that National Medical Imaging Platform (NMIP) applicants have demonstrated a thorough and thoughtful evaluation of possible impacts, in order to be granted access to the platform. The process presented here is the final AIA process we recommend the NHS AI Lab implements and makes requisite for NMIP applicants.

While this process is designed specifically for NHS AI Lab and NMIP applicants, we expect it to be of interest to policymakers, AIA researchers and those interested in adopting algorithmic accountability mechanisms.

As this is the first draft of the process, we expect the advice to develop over time as teams trial the process and discover its strengths and limitations, as the public and research community provide feedback, and as new practical AIA frameworks emerge.

The process consists of seven steps, with three main exercises, or points of activity, from the NMIP applicant perspective: a reflexive impact identification exercise, a participatory workshop, and a synthesis of the two (AIA synthesis). See figure 1 (below) for an overview of the process.

Figure 1: Proposed AIA process

The described AIA process is initiated by a request from a team of technology developers to access the NMIP database. It is the project that sets the conditions for the AIA – for example, the dataset might be used to build a completely new model or, alternatively, the team may have a pre-existing, functioning model that they would like to retrain or validate on the NMIP. At the point that the applicant team decides the project would benefit from NMIP data access, they will be required to begin the AIA process as part of their data-access request.

  1. AIA reflexive exercise
    A reflexive impact identification exercise submitted to the NMIP DAC as part of the application to access the NMIP database. The exercise uses a questionnaire format, drawing from best-practice methodologies for impact assessments. It prompts teams to answer a set of questions that consider common ethical considerations in AI and healthcare literature, and potential impacts that could arise, based on the best-case and worst-case scenarios for their project. It then asks teams to discuss the potential harms arising from uses based on the identified scenarios, and who is most likely to be harmed. Applicants are required to consider harms in relation to their perceived importance or urgency, i.e. weight of the consequence, difficulty to remediate and detectability of the impact. Teams are then asked to consider possible steps to mitigate these harms. These responses will be captured in the AIA template.
  2. Application filtering
    At this stage, the NMIP DAC filters initial applications. Applications are judged according to the engagement shown toward the exercise: whether they have completed all the prompts set out in the AIA template, and whether the answers to the AIA prompts are written in an understandable format, reflecting serious and careful consideration of the potential impacts of this system. Those deemed to have met the above criteria will be invited to take part in the participatory workshop, and those that have not are rejected until the reflexive exercise is properly conducted.
  3. AIA participatory workshop
    Step three is a participatory process designed as an interactive workshop, which would follow a ‘citizens’ jury’ methodology,[footnote]Gastil, J. (ed.) (2005). The deliberative democracy handbook: strategies for effective civic engagement in the twenty-first century. 1. ed., 1. impr. Hoboken, N.J: Wiley.[/footnote] equipping patients and members of the public with a means to pose questions and pass judgement on the harm and benefit scenarios identified in the previous exercise (and possibly uncovering some further impacts). The workshop would be an informal setting, where participants should feel safe and comfortable to ask questions and receive support from the workshop facilitator and other experts present. An NHS AI Lab rapporteur would be present to document the workshop’s deliberations and findings on behalf of the patient and public participants. After the exercise has concluded, the participants will asynchronously review the rapporteur’s account and the list of impacts identified, and review any mitigation plans the applicant team has devised in this window.
  4. AIA synthesis
    The applicant team(s) revisit the template, and work the new knowledge back into the template document, based on findings from the participatory workshop.
  5. Data-access decision
    This updated template is re-submitted to the DAC, who will also receive the account of the participatory workshop from the NHS AI Lab rapporteur. The DAC then makes a decision on whether to grant access to the data, based on a set of criteria relating to the potential risks posed by this system, and whether the product team has offered satisfactory mitigations to potentially harmful outcomes.
  6. AIA publication
    The completed AIAs are then published in a central, easily accessible location – probably the NMIP website – for internal record-keeping and the potential for external viewing on request.
  7. AIA iteration
    The AIA is then revisited on an ongoing basis by project teams, and at certain trigger points.

    Such reviews may be prompted by notable events, such as changes to the proposed use case or a significant model update. In some cases, the DAC may, at its discretion and as part of its data-access decision, mandate that selected project teams revisit the AIA after a certain period of time to determine whether they may retain access.

Learnings from the AIA process

1. AIA reflexive exercise

Recommendation

For this first step we recommend a reflexive impact identification and analysis exercise to be run within teams applying for the NMIP. This exercise enables teams to identify possible impacts, including harms, arising from development and deployment of the applicant team’s AI system by working together through a template of questions and discussion prompts.

Implementation detail

  1. Applicant teams should identify a lead for this exercise (we recommend the project team lead, principal investigator or product lead) and a notetaker (for small teams, these roles may be combined).
  2. Once identified, the lead should organise and facilitate a meeting with relevant team members to work through the prompts (estimated time: two-to-three hours). The notetaker will be responsible for writing up the team’s answers in the template document (estimated time: one-to-two hours).
  3. Teams will first give some high-level project information: the purpose, intended uses and model of research for the system; the project team/organisation; the inputs and outputs of the system; and the stakeholders affected by the system, including users and the people it serves.
  4. The template then guides applicants through some common ethical considerations in the context of healthcare, AI and the algorithmic literature, including whether the project could exacerbate health inequalities, increase surveillance, impact the relationship between stakeholders, have environmental effects or whether it could be intentionally or unintentionally misused.
  5. In the next section, impact identification and scenarios, teams reflect on some possible scenarios arising from use of the system and what impacts they would have, including the best-case scenario when the system is working as designed and the worst-case scenario, when not working in some way. This section also asks for some likely challenges and hurdles encountered on the way to achieving the best-case scenario, and the socio-environmental requirements necessary to achieve success, such as a stable connection to the internet, or training for doctors and nurses.
  6. In the final section, teams undertake potential harms analysis – based on the scenarios identified earlier in the exercise, teams should consider which potential harms resulting from implementation should be anticipated and designed for, and who is at risk of being harmed. Teams should also make a judgement on the perceived importance, urgency, difficulty and detectability of the harms.
  7. Teams are given space to detail some possible mitigation plans to minimise identified harms.

All thinking is captured by the notetaker in the AIA template document. It is estimated that this exercise will take three-to-five hours in total (discussion and documentation) to complete.
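As an illustration of how the notetaker’s record might be kept in a consistent, machine-readable form, the sketch below shows one possible structure for the captured answers in Python. The field names are our own hypothetical shorthand for the sections described above, and the example values are invented; this is not an NHS AI Lab specification of the AIA template.

```python
# Illustrative only: one possible structured representation of the reflexive-exercise
# template described above. Field names are hypothetical shorthand, not a specification.
from dataclasses import dataclass, field
from typing import List

@dataclass
class HarmAssessment:
    description: str              # the potential harm identified by the team
    who_is_affected: str          # individuals or groups at risk of being harmed
    importance: str               # perceived weight of the consequence
    urgency: str                  # how soon the harm could materialise
    difficulty_to_remediate: str
    detectability: str            # how easily the harm would be noticed in use
    mitigations: List[str] = field(default_factory=list)

@dataclass
class AIATemplate:
    project_purpose: str
    intended_uses: List[str]
    system_inputs_outputs: str
    stakeholders: List[str]                      # users and the people the system serves
    ethical_considerations: List[str]            # e.g. health inequalities, surveillance, misuse
    best_case_scenario: str
    worst_case_scenario: str
    socio_environmental_requirements: List[str]  # e.g. stable internet, staff training
    harms: List[HarmAssessment] = field(default_factory=list)

# Example entry of the kind a notetaker might record (all values are invented):
example = AIATemplate(
    project_purpose="Validate an existing chest X-ray triage model on NMIP data",
    intended_uses=["Support clinicians in prioritising urgent cases"],
    system_inputs_outputs="Input: chest X-ray; output: triage priority score",
    stakeholders=["Radiologists", "Patients", "Hospital IT teams"],
    ethical_considerations=["Possible underdiagnosis bias in some patient groups"],
    best_case_scenario="Faster review of urgent cases without loss of accuracy",
    worst_case_scenario="Systematically lower priority scores for some groups",
    socio_environmental_requirements=["Reliable network connection", "Clinician training"],
    harms=[HarmAssessment(
        description="Delayed care for patients the model under-prioritises",
        who_is_affected="Patients in under-represented subgroups",
        importance="high", urgency="high",
        difficulty_to_remediate="medium", detectability="low",
        mitigations=["Subgroup performance analysis before deployment"],
    )],
)
```

Structuring the record in this way is optional; a document template works equally well, but a consistent structure can make it easier for the DAC to compare AIAs across applications.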

Frictions and learnings

The impact assessment design:
This exercise is designed to encourage critical dialogue and reflexivity within teams. By stipulating that evidence of these discussions should be built into a template, the AIA facilitates records and documentation for internal and external visibility.

This format of the exercise draws from an approach often used in impact assessments, including AIAs, adapting a Q&A or questionnaire format to prompt teams to consider possible impacts, discuss the circumstances in which impacts might arise and who might be affected. This exercise was also built to be aligned with traditional internal auditing processes – a methodical, internally conducted process with the intention to enrich understanding of possible risk once a system or product is in deployment.[footnote]Raji, D., Smart, A., White, R. N., Mitchell, M., Gebru, T., Hutchinson, B., Smith-Loud, J., Theron, D. and Barnes, P. (2020). ‘Closing the AI accountability gap: defining an end-to-end framework for internal algorithmic auditing’. Conference on Fairness, Accountability, and Transparency, pp.33–44. Barcelona: ACM. Available at: https://doi.org/10.1145/3351095.3372873[/footnote]

Other impact assessments request consideration of high-level categories of impact, such as privacy, health or economic impact.[footnote]Government of Canada. (2020). Algorithmic impact assessment tool. Available at: https://www.canada.ca/en/government/system/digital-government/digital-government-innovations/responsible-use-ai/algorithmic-impact-assessment.html[/footnote] In this process, we chose to prompt consideration of impacts by asking teams to consider what they perceive to be the best and worst-case arising from the use of the system. Our hope is this will make the exercise easier to digest and engage with for those less familiar with adopting ethics discussions into their work. Impacts should relate to the kinds of challenges that are associated with AI systems, such as concerns around bias, misuse, explainability of findings, contestability; but may also include several of the ‘Generic data and digital health considerations’ outlined by the NHS such as concerns around patient involvement and ownership of health and care data.[footnote]NHSX. (2019). Artificial Intelligence: how to get it right. Available at: https://www.nhsx.nhs.uk/media/documents/NHSX_AI_report.pdf[/footnote]

Some other formats of impact assessment ask an assessor to assign a risk level (e.g. low to high) for a product, which may in turn dictate additional follow-up actions from a developer. The regulatory framework adopted by the MHRA classifies AI as a medical device using a risk-level system. We therefore expect most applicants to be familiar with this format, with some projects having a system that will have already undergone this process at the point of NMIP application. This process is intended to complement risk categorisation, giving developers and project leads a richer understanding of potential harmful impacts of an AI system, to better inform this self-assessment of risk.

The current MHRA framework is focused primarily on risks to the individual, i.e. the risk of harm to a patient if a technology fails (similar to a DPIA, which focuses on the fundamental rights of an individual and the attendant risks of improper personal data handling). Assessment of individual risk is an important component of ensuring safe, fair and just patient outcomes, but we designed the reflexive exercise to go further than this framing of risk. By asking developers to reflexively examine impacts through a broader lens, they are able to consider some possible impacts of their proposed system on society, such as whether it might reinforce systemic biases and discrimination against certain marginalised groups. This process is not about identifying a total measure or quantification of risk, as incorporated in other processes such as the MHRA medical device classification system, but about better understanding impacts and broadening the range of impacts given due consideration.

It is possible that as a precedent of completed AIAs develops, the NHS may in future be able to ascribe particular risk categories based on common criteria or issues they see. But at this stage, we have intentionally chosen not to ask applicants to make a value judgement on risk severity. As described in ‘5. Data-access decision’, we recommend that applicants should also submit their DPIA as part of the application process, concurrently with the reflexive exercise. However, if the NHS team determined that not all project teams would have to undertake a DPIA prior to receiving full assessment by the DAC, then we would recommend a template amendment to reflect more considerations around data privacy.

Making complex information accessible:
We also experienced a particular challenge in this exercise of translating the language of ethical values like accountability and transparency into practical recommendations that technology developers can understand and comply with, given that many may be unfamiliar with AIAs and wider algorithmic accountability initiatives.

Expert stakeholder interviews with start-ups and research labs revealed enthusiasm for an impact assessment, but less clarity on and understanding of the types of questions that would be captured in an AIA and the kinds of impacts to consider. To address these concerns, we produced a detailed AIA template that helps applicant teams gain understanding of possible ethical considerations and project impacts, how they might arise, who they might impact the most, and which of the impacts are likely to result in harm. We also point to data, digital, health and algorithmic considerations from NHSX’s AI report,[footnote]NHSX. (2019). Artificial Intelligence: how to get it right. Available at: https://www.nhsx.nhs.uk/media/documents/NHSX_AI_report.pdf[/footnote] which many teams may be familiar with, and instruct applicants on how best to adopt plain language in their answers.

Limitations of the proposed process
We note that different NMIP applicants may have different interactions with the exercise once trialled: for example, an applicant developing an AI system from scratch may have higher expectations of what it might achieve or what its outcomes will be once deployed than an applicant who is seeking to validate an existing system and may be armed with prior evidence. Once these sorts of considerations emerge after the AIA process is trialled, we may be able to make a more robust claim about the utility of a reflexive exercise as a quality component of this AIA.

This exercise (in addition to the participatory workshop, below) is probably applicable to other AIA-process proposals, with some amendments. However, we emphasise that domain expertise should be used to ensure the reflexive exercise operates as intended, as a preliminary exploration of potential benefits and harms. Further study will be required to see how well this exercise works in practice for both project teams and the NHS DAC, how effective it proves at foreseeing which impacts may arise, and whether any revisions or additions to the
process are required.

2. Application filtering

Recommendation

We recommend that the DAC conducts an application-filtering exercise once the reflexive exercise has been submitted, to remove applications that are missing basic requirements, or will not meet the criteria for reasons other than the AIA.

Depending on the strength of the application, the DAC can choose to either reject NMIP applications at this stage, or invite applicants to proceed to the participatory workshop. Most of these criteria will be established by the NMIP team, and we anticipate they will be similar to
those for the National COVID Chest Imaging Database (NCCID), which sees an administrator and a subset of the DAC members involved in application screening for completeness, technical and scientific quality, and includes safeguards for conflicts of interest.[footnote]Based on NHS AI Lab documentation reviewed in research.[/footnote]

Implementation detail

Based on the review process for the National COVID Chest Imaging Database (NCCID), the precursor platform to the NMIP, the NHS AI Lab is likely to adopt the following filtering procedure:

  1. After the applicant team completes the reflexive exercise and builds the evidence into the template document, the team submits the exercise as part of their application for access to the NMIP dataset.
  2. The person(s) fielding the data-access requests, such as the administrator, will check that all relevant information required in the AIA template has been submitted. If any is missing, the administrator will go back to the applicant to request it. If the initial submission is very incomplete, the application is declined at this step.
  3. The administrator passes the acceptable applications to members of the DAC.
  4. The members of the DAC chosen to filter the application are given the opportunity to declare any conflicts of interest with the applicant’s project. If one or both members have a conflict of interest, another expert should be selected to review the AIA at this stage.
  5. The selected members assess the AIA and make a judgement call about whether the applicant team can proceed on to the participatory workshop. The decision will be based on technical and scientific quality criteria established by the DAC, as well as review of the AIA, for which we suggest the following initial filtering criteria (a minimal sketch of how these checks might be recorded follows this list):
    1. The project team has completed the reflexive exercise.
    2. The answers to the AIA prompts are written in an understandable
      format, avoiding jargon and other technical language.
    3. The answers given do not identify any unacceptable impacts that would place severe risk on the health and wellbeing of patients and clinicians.
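
To make these checks concrete, the sketch below shows one hypothetical way the outcome of the filtering step could be recorded for each application. The Application fields and the filter_application helper are illustrative assumptions rather than part of the NMIP documentation, and the screening itself remains a human judgement made by the administrator and DAC members.

```python
# Hypothetical sketch only: a structured record of the three initial
# filtering criteria. The checks themselves are made by people, not code;
# this simply illustrates how the outcome could be logged consistently.
from dataclasses import dataclass


@dataclass
class Application:
    applicant: str
    reflexive_exercise_complete: bool       # criterion 1: exercise completed
    answers_in_plain_language: bool         # criterion 2: reviewer's judgement
    unacceptable_impacts_identified: bool   # criterion 3: reviewer's judgement


def filter_application(app: Application) -> tuple[bool, list[str]]:
    """Return whether the application can proceed to the participatory
    workshop, along with the reasons recorded for any rejection."""
    reasons = []
    if not app.reflexive_exercise_complete:
        reasons.append("Reflexive exercise not completed")
    if not app.answers_in_plain_language:
        reasons.append("Answers rely on jargon or technical language")
    if app.unacceptable_impacts_identified:
        reasons.append("Unacceptable risk to patient or clinician wellbeing identified")
    return (len(reasons) == 0, reasons)


if __name__ == "__main__":
    example = Application("Example applicant team", True, True, False)
    proceed, reasons = filter_application(example)
    print(proceed, reasons)  # True, []
```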

Frictions and learnings

Scale and resource:
The NHS AI Lab raised a challenge of scale for this process and suggested that a triage phase might be needed to identify applicants to prioritise for the participatory workshop in the event of a high volume of applications. However, in prioritising some applications over others for the participatory workshop, there is a risk of pre-empting what the findings would be by implicitly judging which applications might pose greater risk. Without a history of participatory impact identification, that judgement is a challenging one for the DAC or NHS AI Lab to make, and risks prioritising impacts and harms already well understood by established processes.

Consequently, we have recommended that all applicants should undertake the participatory workshop but that, as the process begins to be trialled, the DAC will develop a paradigm based on previous applications, which may enable it to make a judgement call on applicant suitability.

3. AIA participatory workshop

Recommendation

We recommend that the NHS AI Lab runs centralised participatory impact assessment workshops, in order to bring a diverse range of lived experience and perspectives to the AIA process, and to support NMIP applicants who may not have the resources or skills to run a participatory process of this size. We believe this process will also help project teams identify risks and mitigation measures for their project and provide valuable feedback on how their project might be successfully integrated into the UK’s healthcare system.

Implementation detail

  1. We recommend the NHS AI Lab sets up a paid panel of 25–30 patients and members of the public who represent traditionally underrepresented groups likely to be affected by algorithms that interact with NMIP data, across dimensions such as age, gender, region, ethnic background and socio-economic background. This includes members of the impacted groups, in addition to suitable representatives of certain communities (e.g. a representative from a grassroots immigrant support organisation being able to speak to migrant experience and concerns).
  2. All panellists will be briefed on their role at an induction session, where they will be introduced to each other, learn more about AI and its uses in healthcare, and about the NMIP and its aims and purpose. Participants should also be briefed on the aims of the workshops, how the participatory process will work and what is expected of
    them.
  3. The panellists will be invited to discuss the applicant team’s answers to the reflexive exercise, possibly identifying other harms and impacts not already addressed by applicant teams. This is designed as an interactive workshop following a ‘citizen’s jury’ methodology,[footnote]For more information on citizen’s jury methodology, see Involve. Citizen’s jury. Available at: https://www.involve.org.uk/resources/methods/citizens-jury[/footnote] equipping participants with a means to deliberate on the harm and benefit scenarios identified in the previous exercises (and possibly uncovering some further impacts). The workshop would be designed to be an informal setting, where participants should feel safe and comfortable to ask questions and receive support from the workshop facilitator and other experts present. The workshops will involve a presentation from the developers of each applicant team on what their system does or will do, what prompted the need for it, how the system uses NMIP data, what outputs it will generate, how the AI system will be deployed and used, what benefits and impacts it will bring and how these were considered (reporting back evidence from the reflexive exercise).
  4. The panellists will then deliberate on the impacts identified to consider whether they agree with the best, worst and most-likely scenarios produced, what other considerations might be important and possible next steps. The facilitator will support this discussion, offering further questions and support where necessary.
  5. The Lab would have ownership over this process, and contribute staffing support by supplying facilitators for the workshops, as well as other miscellaneous resources such as workshop materials. The facilitator will coordinate and lead the workshop, with responsibility for overseeing the impact-identification tasks, fielding questions and leading the induction session.
  6. 8–12 panellists will be present per workshop, to avoid the same people reviewing every application (8–12 participants per applicant project suggests a different combination for each workshop if there are six or more applications to review). We recommend one workshop per application, and that workshops are batched, so they can run quarterly.
  7. Also present at these workshops would be two ‘critical friends’: independent experts in the fields of data and AI ethics/computer science and biomedical sciences, available to judge the proposed model with a different lens and offer further support. An NHS rapporteur will also be present to provide an account of the workshop on behalf of the patient and public panellists that is fed back to the NHS AI Lab. The rapporteur’s account will be reviewed by the panellists to ensure it is an accurate and full representation of the workshop deliberations.
  8. Members of the applicant team will be present to observe the workshop and answer any questions as required, and will then return to their teams to update the original AIA with the new knowledge and findings.
  9. This updated AIA is then re-submitted to the NMIP DAC.

For the full participatory AIA implementation detail, see Annex 3.

Frictions and learnings

The bespoke participatory framework:
The impetus for producing a tailor-made participatory impact assessment framework came from combining learning from the AIA literature with challenges in public and patient involvement (PPI) processes raised in our stakeholder interviews.

There is consensus within the AIA literature that public consultation and participation are an important component of an AIA, but little consensus as to what that process should involve. Within the UK healthcare context, there is an existing participatory practice known as PPI: frameworks that aim to improve consultation and engagement in how healthcare research and delivery is designed and conducted.[footnote]Health Research Authority (HRA). What is public involvement in research? Available at: https://www.hra.nhs.uk/planning-and-improving-research/best-practice/public-involvement/[/footnote] PPI is well-supported and a common feature in healthcare research: many research funders now require evidence of PPI activity from research labs or companies as a condition of funding.[footnote]University Hospital Southampton. Involving patients and the public. Available at: https://www.uhs.nhs.uk/ClinicalResearchinSouthampton/For-researchers/PPI-in-your-research.aspx[/footnote] Our interviews revealed a multitude of different approaches to PPI in healthcare, with varying levels of maturity and formality. This echoes research findings that PPI activities, and particularly reporting and documentation, can often end up as an ‘optional extra’ instead of an embedded practice.[footnote]Price, A., Schroter, S., Snow, R., et al. (2018). ‘Frequency of reporting on patient and public involvement (PPI) in research studies published in a general medical journal: a descriptive study’. BMJ Open 2018;8:e020452. [online] Available at: https://bmjopen.bmj.com/content/8/3/e020452[/footnote]

Our stakeholder interviews highlighted that PPI processes are generally supported among both public and private-sector organisations and are in use across the breadth of organisations in the healthcare sector, but many expressed challenges with engagement from patients and the public. One interviewee lamented the struggle to recruit participants: ‘Why would they want to talk to us? […] it might be that we’re a small company: why engage with us?’

In an earlier iteration of our AIA, we recommended applicants design and run the participatory process themselves, but our interviews identified varying capacity for such a process, and it was decided that this would be too onerous for individual organisations to manage on their own – particularly for small research labs. One organisation, a healthtech start-up, reported that having more access to funding enabled them to increase the scope and reach of their activity: ‘We would always have been keen to do [PPI work], but [funding] is an enabler to do it bigger than otherwise’. These interviews demonstrated the desire to undertake public participation, but also highlighted a lack of internal resources to do so effectively.

There is also the risk that having individual applicants run this process themselves may create perverse incentives for ‘participation washing’, in which perspectives from the panellists are presented in a way that downplays their concerns.[footnote]Sloane, M., Moss, E., Awomolo, O., & Forlano, L. (2020). ‘Participation is not a design fix for machine learning’. arXiv. [online] Available at: https://arxiv.org/abs/2007.02423[/footnote] It will be preferable for this process to be run by a centralised body that is independent from applicants, as well as independent from the NHS, and can provide a more standardised and consistent experience across different applications. This led to the proposal that the NHS AI Lab run the participatory process centrally, to ease the burden on applicants, reduce any conflicts of interest, and to gain some oversight over the quality of the process.

Lack of standardised method:
To address the challenge of a lack of standardised methods for how to run public engagement, we decided building a bespoke methodology for participation in impact assessment was an important recommendation for this project. This would provide a way to stabilise the differing PPI approaches currently in use in healthcare research, align the NMIP with best practice for public deliberation methods and ameliorate some concerns over a lack of standard procedure.

This process also arose from an understanding that developing a novel participatory process for an AIA requires a large amount of both knowledge and capacity for the process to operate meaningfully and produce high-quality outputs. To address this challenge, we have drawn from the Ada Lovelace Institute’s experience and expertise in designing and delivering public engagement in the data/AI space, as well as best practice from the field in order to co-ordinate an approach to a participatory AIA.

Resource versus benefit:
The participatory workshop is an extension of many existing participatory procedures in operation, and consequently is time and resource intensive for the stakeholders involved, but has significant benefits.

Beyond bringing traditionally underrepresented patients into the process, which is an important objective, we believe that the workshop offers the potential to build more intuitive, higher-quality products that understand and can respond to the needs of end users.

For applicants early on in the project lifecycle, the participatory workshop is a meaningful opportunity to engage with the potential beneficiaries of their AI system: patients (or patient representatives). It means possible patient concerns around the scope, applicability or use case for the proposed system can be surfaced while there is still opportunity to make changes or undertake further reflection before the system is in use. In this way, the participatory workshop strengthens the initial internally conducted exercise of impact identification.

Researchers have argued that, to support positive patient outcomes in clinical pathways in which AI systems are used to administer or support care, evaluation metrics must go beyond measures of technical assurance and look at how use of AI might impact on the quality of
patient care.[footnote]Kelly, C.J., Karthikesalingam, A., Suleyman, M., Corrado, G. and King. D. (2019). ‘Key challenges for delivering clinical impact with artificial intelligence’ BMC Medicine. 29 October, 17: 195. [online] Available at: https://bmcmedicine.biomedcentral.com/ articles/10.1186/s12916-019-1426-2[/footnote] The process provides a useful forum for communication between patients and developers, in which developers may be able to better understand the needs of the affected communities, and therefore build products better suited to them.

The process at scale:
Given that the NMIP is intended to be a national initiative, challenges and uncertainties have been raised by the NHS AI Lab around how this process would operate at scale. We have sought to address this by recommending the NHS AI Lab run the workshops in batches, as opposed to on a rolling basis. We have also suggested that over time the NHS AI Lab may be able to use previous data-access decisions to triage future applications, and possibly even have applicants with similar projects sharing the same workshop.

Recompensing panellists appropriately:
All participants must be remunerated for their time, but we also recognise the inherent labour of attending these workshops, which may not be adequately covered or reflected by the remuneration offered.

Limitations of resource:
Other organisations hoping to adopt this exercise may be practically constrained by a lack of funding or available expertise. We hope that in future, as participatory methods and processes grow in prominence and the AIA ecosystem develops further, alternate sources of funding and support will be available for organisations wanting to adopt or adapt this framework for their contexts.

4. AIA synthesis

Recommendation

We recommend that after the participatory workshop is completed, the applicant team synthesises the workshop’s findings with those from their original AIA template (completed in the reflexive exercise), building the knowledge produced back into the AIA, in order to ensure the deliberation-based impacts are incorporated and that applicant teams respond to them.

The synthesis step is a critical phase in accountability processes.[footnote]Raji, D., Smart, A., White, R. N., Mitchell, M., Gebru, T., Hutchinson, B., Smith-Loud, J., Theron, D. and Barnes, P. (2020). ‘Closing the AI accountability gap: defining an end-to-end framework for internal algorithmic auditing’. Conference on Fairness, Accountability, and Transparency, pp.33–44. Barcelona: ACM. Available at: https://doi.org/10.1145/3351095.3372873[/footnote] It serves to summarise and consolidate the information gained throughout the AIA process and ensure it is incorporated into the assessment of impacts and into actionable steps for mitigating harm.

Implementation detail

  1. Throughout the AIA process, thinking and discussion should be captured in the AIA template document, allowing documentation to be revisited after each exercise and serving as a record for future updates of the AIA.
  2. Once the synthesis exercise has been completed, the AIA is considered complete. The AIA is then ready to be resubmitted to the NMIP DAC.

Frictions and learnings

Goals of the process:

The synthesis exercise serves two purposes:

  1. Documentation provides a stable record of activity throughout the AIA process, for internal and external viewing (by the DAC at resubmission phase, and the public, post-publication).
  2. It encourages a critical, reflexive response to the impact-identification process, by asking applicants to revisit their responses in the light of new information and knowledge from the panel in the participatory workshop.

The NHS rapporteur report also incentivises a high-quality synthesis exercise, as it allows the DAC to refer back to a full account of the workshop, which has been reviewed by the patient and public panellists, to come to a final judgement of the applicant team’s willingness to incorporate new feedback and ability to be critical of their own processes and assumptions.

Supporting meaningful participation:
Some scholars in the public-participation literature have argued that meaningful participation should be structured around co-design and collaborative decision-making.[footnote]Madaio, M, Stark, L, Wortman Vaughan, J, Wallach, H. (2020). ‘Co-designing checklists to understand organisational challenges and opportunities around fairness in AI’. Proceedings of the CHI Conference on Human Factors in Computing Systems, pp.1-14 [online] Available at: https://dl.acm.org/doi/abs/10.1145/3313831.3376445[/footnote] We designed our participatory process as a feed-in point, for patients and the public to discuss and put forward ideas on how developers of AI systems might address possible benefits and harms.

The synthesis exercise ensures that these ideas become incorporated into the AIA template document, creating an artefact that the DAC and members of the public can view. Developers can then refer back to the AIA template as required throughout the remainder of the project development process.

Though undertaking a process of synthesis increases the opportunity for reflexivity and reflection, there is no guarantee that the broader stakeholder discussions that occur in the participatory workshop will lead to tangible changes in design or development practice.

In an ideal scenario, participants would be given the opportunity to exercise direct decision-making power over design decisions. It has been argued that the ideal level of participation in AI contexts amounts to participatory co-design, a process that sees people and communities affected by the adoption of AI systems become directly involved in the design decisions that may impact them.[footnote]Costanza-Chock, S. (2020). Design justice: community-led practices to build the worlds we need. Cambridge: MIT Press[/footnote] In Participatory data stewardship, the Ada Lovelace Institute describes a vision for people being enabled and empowered to actively collaborate with designers, developers and deployers of AI systems – a step which goes further than both ‘informing’ people about data use, via transparency measures, and ‘consulting’ people about AI system design, via surveys or opinion polls.[footnote]Ada Lovelace Institute. (2021). Participatory data stewardship. Available at: https://www.adalovelaceinstitute.org/report/participatory-data-stewardship/[/footnote]

Similarly, while beyond the scope of this study, it is suggested that participants would ideally be invited for multiple rounds of involvement, such as a second workshop prior to system deployment.[footnote]Madaio, M, Stark, L, Wortman Vaughan, J, Wallach, H. (2020). ‘Co-designing checklists to understand organisational challenges and opportunities around fairness in AI’. Proceedings of the CHI Conference on Human Factors in Computing Systems, pp.1-14 [online] Available at: https://dl.acm.org/doi/abs/10.1145/3313831.3376445[/footnote] This would enable participants to make a permissibility call on whether the system should be deployed, based on the applicant team’s monitoring and mitigation plan, and to be consulted again once the system is in use, so participants can voice concerns or opinions on any impacts that have surfaced ex post.

5. Data-access decision

Recommendation

We recommend that the NHS AI Lab uses the NMIP DAC to assess the strength and quality of each AIA, alongside the assessment of other material required as part of the NMIP application.

Implementation detail

  1. We recommend the DAC comprises at least 11 members, including academic representatives from social sciences, biomedical sciences, computer science/AI and legal fields and representatives from patient communities (see ‘NMIP DAC membership, process and criteria for assessment’ below).
  2. Once the participatory workshop is complete, and the applicant team has revised their AIA template, providing new evidence, the template is resubmitted to the DAC. In order to come to a data-access decision, the DAC follows the assessment guidelines, reviewing the quality of both the reflexive exercise and the workshop based on the detail in the AIA output template and the strength of engagement in the participatory workshop, as well as the supporting evidence from the NHS rapporteur. If the accounts and evidence have significant divergence, the applicant team may either be instructed to undertake further review and synthesis, or be denied access.
  3. The assessment guidelines include questions on whether the DAC agrees with the most-likely, worst-case, and best-case scenarios identified, based on their knowledge of the project team’s proposal, and whether the project meets the requirements and expectations of existing NHSX frameworks for digital health technologies.[footnote]Such as the NHS Code of Conduct for Data-driven Health and Care Technology, available at: https://www.gov.uk/government/publications/code-of-conduct-for-data-driven-health-and-care-technology/initial-code-of-conduct-for-data-driven-health-and-care-technology and NHSX’s ‘What Good Looks Like’ framework, available at: https://www.nhsx.nhs.uk/digitise-connect-transform/what-good-looks-like/what-good-looks-like-publication/[/footnote] The guidelines also establish normative criteria for the DAC to ascertain the acceptability of the AIA based on whether the project meets the requirements and expectations of NHSX’s ‘What good looks like’ framework, which includes ‘being well led’, ‘empowering citizens’ and ‘creating healthy populations’, among others. If the process is deemed to have been completed incorrectly or insufficiently, or if the project is deemed to have violated normative or legal red lines, the DAC would be instructed to reject the application.
  4. In the NCCID data-access process, if the application is accepted, the applicant team would be required to submit a data-access framework contract and a data-access agreement. We believe the existing documentation from the NCCID, if replicated, would probably require applicant teams to undertake a DPIA, to be submitted with the AIA and other documentation at this stage. (If the Lab decides that not all applicant teams would be required to undertake a DPIA prior to this stage, we recommend the reflexive exercise be amended to include more data privacy considerations – see ‘AIA reflexive exercise’). Once these additional documents are completed and signed, access details are granted to the applicant.

NMIP DAC membership, process and criteria for assessment

We recommended to the NHS AI Lab that the NMIP DAC comprises at least 11 members:

  1. a chair from an independent institution
  2. an independent deputy chair from a patients-rights organisation or civil-society organisation
  3. two representatives from the social sciences
  4. one representative from the biomedical sciences
  5. one representative from the computer science/AI field
  6. one representative with legal and data ethics expertise
  7. two representatives from patient communities or patients-rights organisations
  8. two members of the NHS AI Lab.

For the NCCID, an administrator was required to help manage access requests, and one would probably be required in the NMIP context. Similarly, we anticipate that, in addition to the core committee, a four-person technical-review team of relevant researchers, data managers and lab managers who can assess data privacy and security questions may be appointed by the DAC (as per the NCCID terms).

The responsibilities of the DAC in this context are to consider and authorise requests for access to the NMIP, as well as deciding whether to continue or disable access. They will base this decision on criteria and protocols for assessment and will assess the completed AIA, including the participatory workshop, using the NHS AI Lab rapporteur’s account of the exercise (as described previously) as additional evidence.

For the NCCID project, the DAC assessed applications against the criteria of scientific merit, technical feasibility and reasonable evidence that access to the data can benefit patients and the NHS. This may be emulated in the NMIP, but broader recommendations for application assessment beyond the AIAs are out of scope for this study.

As guidance to support the DAC in making an assessment about the strength of the AIA, we provide two groups of questions to consider: one on the AIA process and one on the impacts identified as part of the process.

Questions on the process include:

  1. Did the project team revise the initial reflexive exercise after the participatory workshop was conducted?
  2. Are the answers to the AIA prompts written in an understandable format, reflecting serious and careful consideration to the potential impacts of this system?
  3. Did the NHS AI Lab complete a participatory AIA with a panel featuring members of the public?
  4. Was that panel properly recruited according to the NHS AI Lab AIA process guide?
  5. Are there any noticeable differences between the impacts/concerns/risks/challenges that the NHS AI Lab rapporteur identified and what the AIA document states? Is there anything unaddressed or missing?

Questions on the impacts include:

  1. Based on your knowledge of the project team’s proposal, do you agree with the most likely, worst-case, and best-case scenarios they have identified?
  2. Are there any potential stakeholders who may be more seriously affected by this project? Is that reason well-justified?
  3. For negative impacts identified, has the project team provided a satisfactory mitigation plan to address these harms?
    1. If you were to explain these plans to a patient who would be
      affected by this system, would they agree these are reasonable?

Frictions and learnings

The role of the DAC and accountability:
In an accountability relationship between applicant teams, the NHS AI Lab and members of the public, the DAC is the body that can pose questions and pass judgement, and ultimately is the authoritative body to approve, deny or remove access to the NMIP.

The motivation behind this design choice was the belief that a DAC could contribute to two primary goals of this AIA: accountability, by building an external forum to which the actor must be accountable; and standardisation, whereby, as applications grow in volume, the DAC will be able to build a body of case law of common benefits and harms arising from impact assessments and independent scrutiny, which may bring different or novel priorities to the AIA not considered by the applicant team(s).

Recommendations for the composition of the DAC contribute to broadening participation in the process, by bringing different forms of expertise and knowledge into the foreground, particularly those not routinely involved in data-access decision-making such as patient
representatives.

The literature review surfaced a strong focus on mandatory forms of assessment and governance in both the healthcare domain and AIA scholarship. In healthcare, many regulatory frameworks and pieces of legislation, including the MHRA Medical Device Directive (a liability-based regime), ask developers to undertake a risk assessment to provide an indication of the safety and quality of a product and to gatekeep entry to the market.

Initiatives like the MHRA Medical Device Directive address questions relating to product safety, but lack robust accountability mechanisms, a transparency or public-access requirement, participation and a broader lens to impact assessment, as discussed in this report. This AIA was designed to add value for project teams, the NHS and patients in these areas.

Legitimacy without legal instruments:
In the AIA space, recent scholarship from Data & Society argues that AIAs benefit from a ‘source of legitimacy’ of some kind in order to scaffold accountability and suggests that this might include being adopted under a legal instrument.[footnote]Madaio, M, Stark, L, Wortman Vaughan, J, Wallach, H. (2020). ‘Co-designing checklists to understand organisational challenges and opportunities around fairness in AI’. Proceedings of the CHI Conference on Human Factors in Computing Systems, pp.1-14 [online] Available at: https://dl.acm.org/doi/abs/10.1145/3313831.3376445[/footnote] However, there is not currently a legal requirement for AIAs in the UK, and the timeline for establishing such a legal basis is outside the scope of this case study, necessitating a divergence from the literature. This will be a recurring challenge for AIAs as people look to trial and evaluate them as a tool at a faster pace than they are being adopted in policy.

This AIA process attempts to address this challenge by considering how alternative sources of legitimacy can be wielded, in lieu of law and regulation. Where top-down governance frameworks like legal regimes may leave little room for participation and deliberation in decision-making, this AIA process brings in both internal and external perspectives on the harms and benefits of AI systems. We recommended the NHS AI Lab make use of a DAC to prevent organisations from building and assessing their AIAs entirely on their own, as self-assessments. This may allay some concerns around interpretability and whether the AIA might end up being self-affirming.[footnote]Individual interpretation of soft governance frameworks may lead to some organisations picking and choosing which elements to enact, which is known as ‘ethics washing’. See: Floridi, L. (2019). ‘Translating principles into practices of digital ethics: five risks of being unethical’. Philosophy & technology, 32, pp.185-193 [online] Available at: https://link.springer.com/article/10.1007/s13347-019-00354-x[/footnote]

Potential weaknesses of the DAC model:
In this study, the DAC has the benefit of giving the AIA process a level of independence and some external perspective. We recognise however that the appointment of a DAC may prove to be an insufficient form of accountability and legitimacy once in place. We recommend the membership of the DAC comprise experts from a variety of fields to ensure diverse forms of knowledge. Out of 11 members, only two are patient representatives, which may disempower the patients and undermine their ability to pass judgement.

The DAC functions as an accountability mechanism in our context because the committee members are able to scrutinise and pass judgement on the completed AIAs. However, the fairly narrow remit of a DAC may result in an AIA expertise deficit, where the committee may find it challenging to take on the new role of understanding and responding to AIAs and adopting a broad lens on impacts.

The data-access context means that it is not possible to specify further project points where applicant teams might benefit from reflexive analysis of impacts, such as at the ideation phase or immediately before deployment, which would make the process more iterative.

Additionally, the DAC still sits inside the NHS as a mechanism and is not wholly external: in an ideal scenario, an AIA might be scrutinised and evaluated by an independent third party. This also raises some tensions around whether there might be, in some cases, political pressures on
the NMIP DAC to favour certain decisions. The DAC also lacks statutory footing, putting it at the mercy of NHS funding: if funds were to be redirected elsewhere, this could leave the DAC on uncertain ground.

As other AIAs outside this context begin to be piloted, a clearer understanding of what ‘good’ accountability might look like will emerge, alongside the means to achieve this as an ideal.

6. AIA publication

Recommendation

To build transparency and accountability, we recommend that the NHS AI Lab publishes all completed AIAs, by publishing the final AIA template alongside the name and contact details of a nominated applicant team member who is willing to field questions and provide further information on the process to interested parties on request.

We also recommend the Lab publishes information on the membership of the DAC, its role and the assessment criteria, so that external viewers can learn how data-access decisions are made.

Implementation detail

  1. We recommend that the Lab publishes completed AIAs on a central repository, such as an NMIP website,[footnote]Such as the website designed for the National Covid Chest Imaging Database (NCCID), see: https://nhsx.github.io/covid-chest-imaging-database/[/footnote] that allows for easy access by request from the public. Only AIAs that have completed both the reflexive exercise and the participatory workshop will be published. However, there may be value in the DAC periodically publishing high-level observations around the unsuccessful AIAs (as a collective, as opposed to individual AIAs), and we also note that individual applicant teams may want to publish their AIA independently, regardless of the access decision.
  2. The AIA template is designed to ensure that completed AIAs can be easily published by the Lab without further workload: it is an accessible document that follows a standard format. A nominated NHS AI Lab team member, such as an administrator, will probably be needed to publish the AIAs.

Frictions and learnings

Public access to AIAs:
There is widespread consensus within the AIA and adjacent literature that public access to AIAs and transparent practice are important ideals.[footnote]Latonero, M. and Agarwal, A. (2021). Human rights impact assessments for AI: learning from Facebook’s failure in Myanmar. Carr Center for Human Rights Policy: Harvard Kennedy School. Available at: https://carrcenter.hks.harvard.edu/publications/humanrights-impact-assessments-ai-learning-facebook%E2%80%99s-failure-myanmar[/footnote] [footnote]Loi, M. in collaboration with Matzener, A., Muller, A. and Spielkamp, M. (2021). Automated decision-making systems in the public sector. An impact assessment tool for public authorities. Algorithm Watch. Available at: https://algorithmwatch.org/en/wp-content/ uploads/2021/06/ADMS-in-the-Public-Sector-Impact-Assessment-Tool-AlgorithmWatch-June-2021.pdf[/footnote] [footnote]Selbst, A.D. (2018). ‘The intuitive appeal of explainable machines’. Fordham Law Review 1085. [online] Available at: https://papers.ssrn. com/sol3/papers.cfm?abstract_id=3126971[/footnote] Public access to documentation associated with decision-making has been put forward as a way to build transparency and, in turn, public trust in the use of AI systems.[footnote]Reisman, D., Schultz, J., Crawford, K. and Whittaker, M. (2018). Algorithmic impact assessments: a practical framework for public agency accountability. AI Now Institute. Available at: https://ainowinstitute.org/aiareport2018.pdf[/footnote] This is a particularly significant
dimension for a public-sector agency.[footnote]Hildebrandt, M. (2012). ‘The dawn of a critical transparency right for the profiling era’. Digital Enlightment Yearbook, pp.41-56 [online] Available at: https://repository.ubn.ru.nl/handle/2066/94126[/footnote]

Transparency is an important underpinning for accountability, where access to reviewable material helps to structure accountability relationships and improves the strength and efficacy of an impact assessment process.[footnote]Metcalf, J., Moss, E., Watkins, E.A., Ranjit, S. and Elish, M.C. (2021). ‘Algorithmic impact assessments and accountability: the coconstruction of impacts’. Conference on Fairness Accountability, and Transparency [online] Available at: https://dl.acm.org/doi/ pdf/10.1145/3442188.3445935[/footnote] Making AIAs public means they can be scrutinised and evaluated by interested parties, including patients and the public, and also enables deeper understanding and learning from approaches among research communities. Publication in our context also helps standardise applicants’ AIAs.

Other impact assessments, such as data protection impact assessments (DPIAs) and human rights impact assessments (HRIAs), have drawn criticism for not demonstrating consistent publication practice,[footnote]Metcalf, J., Moss, E., Watkins, E.A., Ranjit, S. and Elish, M.C. (2021). ‘Algorithmic impact assessments and accountability: the co-construction of impacts’. Conference on Fairness, Accountability, and Transparency [online] Available at: https://dl.acm.org/doi/pdf/10.1145/3442188.3445935[/footnote] thereby missing opportunities to build accountability and public scrutiny. We also base our recommendation in part on audit processes, where transparent, auditable systems equip developers, auditors and regulators with knowledge and investigatory powers for the benefit of the system itself, as well as the wider ecosystem.[footnote]Singh, J., Cobbe, J. and Norval, C. (2019). ‘Decision provenance: harnessing data flow for accountable systems’. IEEE Access, 7, pp. 6562–6574 [online]. Available at: https://arxiv.org/abs/1804.05741[/footnote]

Putting transparency into practice:
In this study, we found that translating transparency ideals into practice in this context required some discussion and consensus around establishing the publishable output of the AIA. During our interview process, we surfaced some potential concerns around publishing commercially sensitive information from private companies. The AIA, as it appears in the AIA template document, does not require the disclosure of commercially sensitive information or detailed technical attributes.

Further transparency mechanisms:
In this context, full transparency is not necessarily achieved by publishing the AIA, and other mechanisms might be needed for more robust transparency. For example, for organisations interested in transparent model reporting, we recommend developers consider completing and publishing a model card – a template developed by Google researchers to increase machine learning model transparency by providing a standardised record of system attributes.[footnote]More information on model cards, including example model cards, can be found here: https://modelcards.withgoogle.com/about[/footnote] This framework has been adapted to a medical context, based on the original proposal from the team at Google.[footnote]Sendak, M., Gao, M., Brajer, N. and Balu, S. (2020). ‘“The human body is a black box”: supporting clinical decision-making with deep learning’. Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency. ACM: New York, NY, USA, pp. 99–109. Available at: https://doi.org/10.1145/3351095.3372827[/footnote]
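
As an illustration, the sketch below shows what a minimal, machine-readable model card might look like for a hypothetical NMIP applicant. The field names follow the sections commonly described in the model card literature (model details, intended use, metrics, ethical considerations and caveats); they are illustrative assumptions, not a schema prescribed by Google, the NHS AI Lab or this AIA process.

```python
# Minimal, hypothetical sketch of a model card as a structured record that
# could be published alongside an AIA. All field names and values are
# illustrative; they are not a prescribed NMIP or Google schema.
from dataclasses import dataclass, field, asdict
import json


@dataclass
class ModelCard:
    model_name: str
    model_details: str           # developers, version, model type
    intended_use: str            # primary use cases and intended users
    out_of_scope_uses: str       # uses the developers advise against
    training_data: str           # description of the data the model was trained on
    evaluation_data: str         # description of the data used for evaluation
    metrics: dict = field(default_factory=dict)  # e.g. performance by subgroup
    ethical_considerations: str = ""
    caveats_and_recommendations: str = ""

    def to_json(self) -> str:
        """Serialise the card for publication in a standard, accessible format."""
        return json.dumps(asdict(self), indent=2)


if __name__ == "__main__":
    card = ModelCard(
        model_name="chest-imaging-triage (hypothetical)",
        model_details="v0.1, convolutional neural network, hypothetical NMIP applicant team",
        intended_use="Flagging imaging studies for prioritised radiologist review",
        out_of_scope_uses="Autonomous diagnosis without clinician oversight",
        training_data="Subset of NMIP imaging data (see accompanying datasheet)",
        evaluation_data="Held-out NMIP validation split",
        metrics={"sensitivity": 0.91, "specificity": 0.87},
        ethical_considerations="Performance not yet validated across all patient subgroups",
        caveats_and_recommendations="Revisit at the two-year AIA review point",
    )
    print(card.to_json())
```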

7. AIA iteration

Recommendation

We recommend that project teams revisit and update the AIA document at certain trigger points: primarily if there is a significant change to the system or its application.

We also recommend a two-year review point in all cases, because it can be hard to identify what constitutes a ‘significant change’. The exercise is designed to be a valuable reflection opportunity for a team, and a chance to introduce the AIA process to new team members who may have joined in the intervening time. The DAC might also suggest an appropriate time period for revision in certain cases, and revision of the AIA could be a requirement of continued access.

Implementation detail

A potential process of iteration might be:

  1. After a regular interval of time has elapsed (e.g. two years), project teams should revisit the AIA. For some applicants, this might occur after the proposed AI system has entered into deployment. In this scenario, previously unanticipated impacts may have emerged.
  2. Reviewing the AIA output template and updating with new learnings and challenges will help strengthen record-keeping and reflexive practice.
  3. All iterations are recorded in the same way to allow stable documentation and comparison over time (a minimal sketch of such a versioned record follows this list).
  4. If revision is a condition of continued access, the DAC may see fit to review the revised AIA.
  5. The revised AIA is then published alongside the previous AIA, providing important research and development findings to the research community, since each AIA iteration may surface new knowledge and evidence.
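
The sketch below is one hypothetical way such a versioned record could be kept, so that every iteration of the AIA is logged in the same format and published alongside its predecessors. The field names and example entries are illustrative assumptions, not part of the NMIP process documentation.

```python
# Hypothetical sketch of an append-only revision log for an AIA, so that each
# iteration is recorded in the same way and can be compared over time.
# Field names and example values are illustrative only.
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class AIARevision:
    version: int
    revised_on: date
    trigger: str             # e.g. "two-year review" or "significant change to the system"
    summary_of_changes: str
    reviewed_by_dac: bool    # whether revision was reviewed as a condition of continued access


# Revisions are appended, never overwritten, and each is published alongside
# the previous versions rather than replacing them.
revision_log: list[AIARevision] = [
    AIARevision(1, date(2022, 1, 10), "initial AIA (pre data access)",
                "Baseline AIA completed and published", True),
    AIARevision(2, date(2024, 1, 15), "two-year review",
                "Previously unanticipated post-deployment impacts added", True),
]

for revision in revision_log:
    print(revision)
```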

Frictions and learnings

Benefits of ex post assessment:
Although we consider our AIA primarily as a tool for pre-emptive impact assessment, this iterative process provides a means for an AIA to function as both an ex ante and ex post assessment, bridging different impact-assessment methodologies to help build a more holistic picture of benefits and harms. This will capture instances where impacts emerge that have not been adequately anticipated by a pre-emptive AIA.

This would align our AIA with methods like AI assurance,[footnote]Information Commissioner’s Office (ICO). (2019). An overview of the auditing framework for artificial intelligence and its core components. Available at: https://ico.org.uk/about-the-ico/news-and-events/ai-blog-an-overview-of-the-auditing-framework-forartificial-intelligence-and-its-core-components/[/footnote] which offer a possible governance framework across the entire AI-system lifecycle, of which impact assessment is one component. There are other similar mechanisms already in place in the healthcare sector, such as the ISO/TR 20416 post-market surveillance standards, which provide users with a way to identify ‘undesirable effects’ at pace.[footnote]International Standards Organization (ISO). (2021). New ISO standards for medical devices. Available at: https://www.iso.org/news/ ref2534.html[/footnote]

Revising the AIA also equips teams with further meaningful opportunity for project reflection.[footnote]Raji, D., Smart, A., White, R. N., Mitchell, M., Gebru, T., Hutchinson, B., Smith-Loud, J., Theron, D. and Barnes, P. (2020). ‘Closing the AI accountability gap: defining an end-to-end framework for internal algorithmic auditing’. Conference on Fairness, Accountability, and Transparency, pp.33–44. Barcelona: ACM. Available at: https://doi.org/10.1145/3351095.3372873[/footnote]

Limitations of the model:
Many impact assessment proposals suggest adopting an incremental, iterative approach to impact identification and evaluation, identifying several different project points for activity across the lifecycle.[footnote]Information Commissioner’s Office (ICO). Data protection impact assessments. Available at: https://ico.org.uk/for-organisations/ guide-to-data-protection/guide-to-the-general-data-protection-regulation-gdpr/accountability-and-governance/data-protection-impact- assessments/[/footnote]  [footnote]The Equality and Health Inequalities Unit. (2020). NHS England and NHS Improvement: Equality and Health Inequalities Impact Assessment (EHIA). Available at: https://www.england.nhs.uk/wp-content/uploads/2020/11/1840-Equality-Health-Inequalities- Impact-Assessment.pdf[/footnote] However, as with other components of AIAs, many do not detail a specific procedure for monitoring and mitigation once the model is deployed.

Trigger points for iteration will probably vary across NMIP use cases owing to the likely breadth and diversity of potential applicants. The process anticipates that many applicants will not have fully embarked on research and development at the time of application, so the AIA is designed primarily as an ex ante tool, equipping NMIP applicants with a way to assess risk prior to deployment, while there is still opportunity to make design changes. We consider it a mechanism equipped to diagnose possible harms; accordingly, the AIA alone may be insufficient to treat or address those harms.

Healthcare and other contexts:
Although we recommend iteration of an AIA, the proposed process does not include an impact mitigation procedure. In the context of AI systems for healthcare, post-deployment monitoring falls under the remit of medical post-market surveillance, known as the medical device vigilance system, through which any ‘adverse incidents’ can be reported to the MHRA.[footnote]Medicines & Healthcare products Regulatory Agency (MHRA). Guidance: Medical device stand-alone software including apps (including IVDMDs). Available at: https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/999908/Software_flow_chart_Ed_1-08b-IVD.pdf[/footnote]

The aim of iteration of the AIA is therefore to ensure impacts anticipated by the participatory process are addressed and new potential impacts can be identified. It provides impetus for continual reflection, building good practice for future products, and for ensuring thorough documentation into the future. This context is specific to our study: policymakers and researchers interested in trialling AIAs may find that building an ex post monitoring or evaluation framework is appropriate in domains where existing post-deployment monitoring is lacking.

Applicability of this AIA to other use cases

This case study differs from existing proposals and examples of AIAs in three ways. Those wanting to apply the AIA process will need to consider the specific conditions of other domains or contexts:

  1. Healthcare domain
    At the time of writing, this is the first detailed proposal for use of an AIA in a healthcare context. Healthcare is a significantly regulated area in the UK, particularly in comparison to other public-sector domains. There is also notable discussion and awareness of ethical issues in the sector, with recognition that many AI applications in healthcare would be considered ‘high risk’. In the UK, there are also existing public participation practices in healthcare – typically referred to as ‘patient and public involvement and engagement’ (PPIE) – and requirements for other forms of impact assessment, such as DPIAs and Equalities Impact Assessments. This means that an AIA designed for this context can rely on existing processes – and will seek to avoid unnecessary duplication of those processes – that AIAs in other domains cannot.
  2. Public and private-sector intersection
    AIA proposals and implementation have been focused on public-sector uses, with an expectation that those conducting most of the process will be a public-sector agency or team.[footnote]Reisman, D., Schultz, J., Crawford, K. and Whittaker, M. (2018). Algorithmic impact assessments: a practical framework for public agency accountability. AI Now Institute. Available at: https://ainowinstitute.org/aiareport2018.pdf[/footnote]  [footnote]Ada Lovelace Institute, AI Now Institute, Open Government Partnership. (2021). Algorithmic accountability for the public sector. Available at: https://www.opengovpartnership.org/wp-content/uploads/2021/08/algorithmic-accountability-public-sector.pdf[/footnote]  [footnote]Government of Canada. (2020). Algorithmic impact assessment tool. Available at: https://www.canada.ca/en/government/system/digital-government/digital-government-innovations/responsible-use-ai/algorithmic-impact-assessment.html[/footnote] While AIAs have not yet been applied in the private sector, there has been some application of human rights impact assessments to technology systems,[footnote]WATERFRONToronto. (2020). Preliminary Human Rights Impact Assessment for Quayside Project. Available at: http://blog. waterfrontoronto.ca/nbe/portal/wt/home/blog-home/posts/preliminary+human+rights+impact+assessment+for+quayside+project[/footnote] which may surface overlapping concerns through a human-rights lens. There are also similarities with proposals around internal-auditing approaches in the private sector.[footnote]Raji, D., Smart, A., White, R. N., Mitchell, M., Gebru, T., Hutchinson, B., Smith-Loud, J., Theron, D. and Barnes, P. (2020). ‘Closing the AI accountability gap: defining an end-to-end framework for internal algorithmic auditing’. Conference on Fairness, Accountability, and Transparency, pp.33–44. Barcelona: ACM. Available at: https://doi.org/10.1145/3351095.3372873[/footnote] To date, this case study is unique in looking explicitly at the intersection of public and private sector – with applications being developed by a range of mainly private actors for use of data
    originating in the public sector, with some oversight from a public-sector department (NHS).
  3. Data-access context
    This AIA is being proposed as part of a data-access process for a public-sector dataset (the NMIP). This is, to our knowledge, unique in AIAs so far. The DAC provides a forum for holding developers accountable where other proposals for AIAs have used legislation or independent assessors – to require the completion of the AIA, to evaluate the AIA and to prevent a project proceeding (or at least, proceeding with NHS data) if the findings are not satisfactory.

    These differences, and their implications for the design of this AIA, should be considered by anyone looking to apply parts of this process in another domain or context. We expect elements of this process, such as the AIA template and exercise formats, to prove highly transferable. However, the core accountability mechanism – that the AIA is both required and reviewed by the DAC – is not transferable to many potential AIA use cases outside data access; an alternative mechanism would be needed. Similarly, the centralisation of both publication and resourcing for the participatory workshops with the NHS AI Lab may not be immediately transferable – though one could imagine a central transparency register and public-sector resource for participatory workshops providing this role for mandated public-sector AIAs.

Seven operational questions for AIAs

Drawing on findings from this case study, we identify seven operational questions for those considering implementing an AIA process in any context, as well as considerations for how the NMIP AIA process addresses these issues.

1. How to navigate the immaturity of the assessment ecosystem?

AIAs are an emerging method for holding AI systems more accountable to those who are affected by them. There is not yet a mature ecosystem of possible assessors, advisers or independent bodies to contribute to or run all or part of an AIA process. For instance, in environmental and fiscal impact assessment, there is a market of consultants available to act as assessors. There are public bodies and regulators who have the power to require their use in particular contexts under relevant legal statutes, and there are more established norms and standards around how these impact assessments should be conducted. [footnote]Moss, E., Watkins, E.A., Singh, R., Elish, M.C. and Metcalf, J. (2021). Assembling accountability: algorithmic impact assessment for the public interest. Data & Society. Available at: https://datasociety.net/library/assembling-accountability-algorithmic-impactassessment- for-the-public-interest/[/footnote]

In contrast, AIAs do not yet have a consistent methodology, lack any statutory footing to require their use, and do not have a market of assessors who are empowered to conduct these exercises. A further complexity is that AI systems can be used in a wide range of different
contexts – from healthcare to financial services, from criminal justice to the delivery of public services – making it a challenge to identify the proper scope of questions for different contexts.

This immaturity of the AIA ecosystem poses a challenge to organisations hoping to build and implement AIAs, who may not have the skills or experience in house. It also limits the options for external and independent scrutiny or assessment within the process. Future AIA
processes must identify the particular context they are operating in, and scope their questions to meet that context.

In the NMIP case study, this gap is addressed by centring the NMIP DAC as the assessor of NMIP AIAs. The DAC is a pre-existing group already intended to bring together a range of relevant skills and experience, with independence from the teams developing AI, as well as the authority to require and review the process.

We focus the NMIP AIA’s scope on the specific context of the kinds of impacts that healthcare products and research could raise for patients in the UK, and borrow from existing NHS guidance on the ethical use of data and AI systems to construct our questions. In addition, under this proposal, the NHS AI Lab itself would organise facilitation of the participatory workshops within the AIA.

2. What groundwork is required prior to an AIA?

AIAs are not an end-to-end solution for ethical and accountable use of AI, but part of a wider AI-development and governance process.

AIAs are not singularly equipped to identify and address the full spectrum of possible harms arising from the deployment of an AI system,[footnote]Metcalf, J., Moss, E., Watkins, E.A., Ranjit, S. and Elish, M.C. (2021). ‘Algorithmic impact assessments and accountability: the co-construction of impacts’. Conference on Fairness, Accountability, and Transparency [online] Available at: https://dl.acm.org/doi/pdf/10.1145/3442188.3445935[/footnote] given that societal harms are unpredictable and some harms are experienced more profoundly by those who hold multiple, intersecting marginalised identities. Accordingly, our AIA should not be understood as a complete solution for governing AI systems.

This AIA process does not replace other standards for quality and technical assurance or risk management already in use in the medical-device sector (see: ‘The utility of AIAs in health policy’). Teams hoping to implement AIAs should consider the ‘pre’ and ‘post’ AIA work that might be required, particularly given projects may be at different stages, or with different levels of AI governance maturity, at the point that they begin the AIA process.

For example, one proposed stakeholder impact assessment framework sets out certain activities to take place at the ‘alpha phase’[footnote]Leslie, D. (2019). Understanding artificial intelligence ethics and safety. Alan Turing Institute. Available at: https://www.turing.ac.uk/sites/default/files/2019-08/understanding_artificial_intelligence_ethics_and_safety.pdf[/footnote] (problem formulation), which includes ‘identifying affected stakeholders’: applicants may find it helpful to use this as a guide to identify affected individuals and communities early on in the process, and to be clear on how different interests might coalesce in this project. This is a useful precursor for completing the impact identification exercises in this AIA.

In the NMIP case study, in recognition of the fact that applicant teams are likely to be in differing stages of project development at the point of application, we make some recommendations for ‘pre-AIA’ exercises and initiatives that might capture other important project-management processes considered out of the scope of this AIA.

It is also important to have good documentation of the dataset any model or product will be developed on, to inform the identification of impacts. In the case of the NMIP, the AIAs will all relate to the same dataset (or subsets thereof). There is a significant need for documentation of the NMIP dataset that sets out key information such as the level of consent the data was collected under, where the data comes from, what form it takes, and what kinds of biases it has been tested for, among other essential information.

We made recommendations to the NHS AI Lab to take on the burden of documenting the NMIP dataset using datasheets.[footnote]Boyd, K.L. (2021). ‘Datasheets for datasets help ML engineers notice and understand ethical issues in training data’. Proceedings of the ACM on Human-Computer Interaction, 5, 438. [online] Available at: http://karenboyd.org/blog/wp-content/uploads/2021/09/Datasheets_Help_CSCW-5.pdf[/footnote] [footnote]Gebru, T., Morgenstern, J., Vecchione, B., Wortman Vaughan, J., Wallach, H., Daumé III, H. and Crawford, K. (2018). Datasheets for datasets. ArXiv [online] Available at: https://arxiv.org/abs/1803.09010[/footnote] For AIAs in different contexts, dataset documentation may also be an essential precondition, as it provides an important source of information for considering the potential impacts of uses of that data.
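To illustrate the kind of documentation this points to, the sketch below shows a minimal, datasheet-style record for an imaging dataset. All field names and values are illustrative assumptions, not an NMIP specification.

```python
# A minimal sketch of datasheet-style dataset documentation, loosely inspired by
# 'datasheets for datasets' (Gebru et al., 2018). Illustrative only, not an NMIP spec.
nmip_datasheet = {
    "name": "Medical imaging subset (illustrative)",
    "provenance": "Collected from participating NHS trusts",
    "consent_level": "Secondary use for research under an approved data-access process",
    "data_format": ["DICOM images", "linked metadata (age band, sex, scan date)"],
    "known_gaps": "Coverage varies by region and scanner type",
    "bias_checks": ["demographic representativeness", "label quality by subgroup"],
    "intended_uses": "Training and validating imaging models via approved access",
    "prohibited_uses": "Re-identification of patients",
}

if __name__ == "__main__":
    for field_name, value in nmip_datasheet.items():
        print(f"{field_name}: {value}")
```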

3. Who can conduct the assessment?

Previous studies highlight the importance of an independent ‘assessor’ in successful impact-assessment models in other domains, such as environmental or fiscal impact assessments.[footnote]Moss, E., Watkins, E.A., Singh, R., Elish, M.C. and Metcalf, J. (2021). Assembling accountability: algorithmic impact assessment for the public interest. Data & Society. Available at: https://datasociety.net/library/assembling-accountability-algorithmic-impact-assessment-for-the-public-interest/[/footnote] However, most proposals
for AIA processes, and the Canadian AIA model in implementation,[footnote]Government of Canada. (2020). Algorithmic impact assessment tool. Available at: https://www.canada.ca/en/government/system/digital-government/digital-government-innovations/responsible-use-ai/algorithmic-impact-assessment.html[/footnote] have instead used self-assessment as the main mechanism.

Part of this difference may be due to whether the focus of an AIA is accountability or reflexivity: accountability prioritises independence of assessment as it creates a relational dynamic between a forum and an actor, whereas reflexivity prioritises self-assessment as a mechanism for learning and improvement on the part of the system developers.

In our NMIP case study, we seek to capture both interests – with the initial exercise allowing a reflexive, in-team process for developers, and the DAC review playing the role of an independent assessor. We acknowledge the significant power this process gives the DAC and the potential limitations of delegating this power to a committee established by the NHS. For example, there may be concerns about whether the DAC can make impartial decisions rather than decisions that serve wider NHS aims. We have included in our recommendations a potential composition of this DAC that includes members of the public, patients or patient-rights advocates, and other independent experts who are external to the NHS.

There is, however, a more immediate and practical constraint for those considering AIAs currently: who can assess. Without the wider assessment ecosystem mentioned previously – for AIAs proposed in contexts outside a data-access process, or without a centralised body to rely on – it may be a necessary short-term solution for companies to run and assess the AIA and participatory processes themselves. This, however, eliminates much of the possibility for independence, external visibility and scrutiny to improve accountability, so should not be considered a long-term ideal, but rather a response to current practical constraints. For those building AIA processes in other domains, it will be essential to consider which actors are best equipped to play the role of an independent assessor.

4. How to ensure meaningful participation in defining and identifying impacts?

The literature on AIAs, and other methods of assessing AI systems, makes the case for consultation and participation of multidisciplinary stakeholders,[footnote]It should be noted that public consultation is distinct from public access, which refers to the publication of key documentation and other material from the AIA, as a transparency mechanism. See: Ada Lovelace Institute. (2021). Participatory data stewardship, and Moss, E., Watkins, E.A., Singh, R., Elish, M.C. and Metcalf, J. (2021). Assembling accountability: algorithmic impact assessment for the public interest. Data & Society. Available at: https://datasociety.net/library/assembling-accountability-algorithmic-impact-assessment-for-the-public-interest/[/footnote] affected communities and the wider public.[footnote]Reisman, D., Schultz, J., Crawford, K. and Whittaker, M. (2018). Algorithmic impact assessments: a practical framework for public agency accountability. AI Now Institute. Available at: https://ainowinstitute.org/aiareport2018.pdf[/footnote] [footnote]European Commission. (2020). The Assessment List for Trustworthy Artificial Intelligence (ALTAI) for self-assessment. Available at: https://op.europa.eu/en/publication-detail/-/publication/73552fcd-f7c2-11ea-991b-01aa75ed71a1[/footnote] [footnote]Moss, E., Watkins, E.A., Singh, R., Elish, M.C. and Metcalf, J. (2021). Assembling accountability: algorithmic impact assessment for the public interest. Data & Society. Available at: https://datasociety.net/library/assembling-accountability-algorithmic-impact-assessment-for-the-public-interest/[/footnote] [footnote]Institute for the Future of Work. (2021). Artificial intelligence in hiring: assessing impacts on equality. Available at: https://www.ifow.org/publications/artificial-intelligence-in-hiring-assessing-impacts-on-equality[/footnote] This can create a more accountable relationship between developers
of a technology and those affected by it, by ensuring that impacts are constructed from the perspective of those affected by a system, and not simply those developing a system.

There is, however, differing opinion on the people or groups that should be involved: some proposals are explicit in the requirement to include public perspectives in the impact assessment process, while others simply suggest a mix of internal and external stakeholders. Types of participation also vary, and range from simply informing key stakeholders, to consultation, to collaboration for consensus-building.[footnote]For further information on participatory approaches, see: Ada Lovelace Institute. (2021). Participatory data governance, and Arnstein, S. (1969). ‘A ladder of citizen participation’. Journal of the American Institute of Planners, 36, pp.216-224 [online] Available at: https://www.tandfonline.com/doi/full/10.1080/01944363.2018.1559388[/footnote]

As with other constitutive components of AIAs, there is currently little procedure for how to engage practically with the public in these processes. Our framework seeks to bridge that gap, drawing from Ada’s internal public deliberation/engagement expertise to build a
participatory workshop for the NMIP AIA.

A key learning from this process is that there are significant practical challenges to implementing participatory ideals:

  • Some participatory exercises may be tokenistic or perfunctory, which means they do nothing to rebalance power between developers and affected communities and may be harmful for participants.[footnote]Sloane, M., Moss, E., Awomolo, O., & Forlano, L. (2020). ‘Participation is not a design fix for machine learning’. ArXiv. [online] Available at: https://arxiv.org/abs/2007.02423[/footnote] Beginning to address this must involve participants being remunerated for their time, given the safety and security to deliberate freely and provide critical feedback, and having assurance that their contributions will be addressed by a developer who will be required to respond to their concerns before the DAC.
  • There is a potential challenge in implementing robust, meaningful participatory processes at scale. In our case, the NMIP – as a large dataset comprising different image formats – has scope to underpin a variety of different algorithms, models and products, so is expected to receive a large number of data-access applications. This means that any framework needs to be flexible and accommodating, and able to be scaled up as required. This may place considerable demand on resources. Pilot studies of our participatory workshop would help us further understand and account for some of these demands, as they arise in practice.

5. What is the artefact of the AIA and where can it be published?

Whether the goal of an AIA process is to encourage greater reflexivity and consideration for harmful impacts from developers, or to hold developers of a technology more accountable to those affected by that system, an AIA needs an artefact – a document that comes out of the process – that can be shared with others, reviewed and assessed. Most proposals for AIAs recommend publication of results or key information,[footnote]Reisman, D., Schultz, J., Crawford, K. and Whittaker, M. (2018). Algorithmic impact assessments: a practical framework for public agency accountability. AI Now Institute. Available at: https://ainowinstitute.org/aiareport2018.pdf[/footnote] but do not provide a format or template in which to do so.

In public-sector use cases, the Canadian AIA has seen three published AIAs to date, with departments conducting the AIA being responsible for publication of results in an accessible format, and in both official languages – English and French – on the Canadian Open Government portal.[footnote]Government of Canada (2020). Algorithmic impact assessment tool. Available at: https://www.canada.ca/en/government/system/digital-government/digital-government-innovations/responsible-use-ai/algorithmic-impact-assessment.html[/footnote]

When publishing completed AIAs, an AIA process must account for the
following:

  • What will be published: what content, in what format? Our case study provides a template for developers to complete as the first exercise, and update throughout the AIA process, producing the artefact of the AIA that can then be published.
  • Where it will be published: is there a centralised location that the public can use to find relevant AIAs? In our case study, as all AIAs are being performed by applicants looking to use NMIP data, the NMIP can act as a central hub, listing all published AIAs. In public-sector use cases, a public-sector transparency register could be that centralised location.[footnote]GOV.UK. (2021). Algorithmic transparency standard. Available at: https://www.gov.uk/government/collections/algorithmic-transparency-standard[/footnote]
  • What are the limitations and risks of publishing: several of our interview subjects flagged concerns that publishing an AIA may raise intellectual property or commercial sensitivities, and may create a perverse incentive for project teams to write with a mindset for public relations rather than reflexivity. These are very real concerns, but they must be balanced with the wider goal of an AIA to increase accountability over AI systems.

This study seeks to balance these concerns in a few ways. In this case study, the AIA document would not contain deep detail on the functioning of the system that might raise commercial sensitivities, but would instead focus on the potential impacts and a plain-language explanation of the system’s intended use.
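As an illustration only, one possible machine-readable shape for a published AIA artefact is sketched below. The field names are assumptions made for this example; the report’s actual template is the reflexive exercise document described in Annex 1.

```python
from dataclasses import dataclass


@dataclass
class PublishedAIARecord:
    """Hypothetical structure for a published AIA artefact (illustrative only)."""
    system_name: str
    applicant_organisation: str
    intended_use: str          # plain-language description, not technical detail
    impacts_identified: list   # from the reflexive exercise and participatory workshop
    mitigations: list
    dac_decision: str          # e.g. "approved", "approved with conditions"
    publication_url: str = ""  # e.g. a central listing hosted by the NMIP
```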

Study respondents flagged that an AIA within a data-access process may also raise concerns about publishing ‘unsuccessful’ AIAs – AIAs from applicants to the NMIP who were rejected (which may have been on grounds other than the AIA) – which could raise potential liability issues. Given this constraint, we have chosen to prioritise publication of AIAs that have completed the reflexive exercise and the participatory workshop, and not AIAs that did not proceed past DAC filtering. However, we recognise there could be valuable learnings from AIAs that have been rejected, and would encourage the DAC to share observations and learnings from them, as well as enabling individual teams to voluntarily publish AIAs regardless of data-access outcome.

6. Who will act as a decision-maker on the suitability of the AIA and the acceptability of the impacts it documents?

As well as identifying what standards an AIA should be assessed against, it is necessary to decide who can assess the assessment.

There is not yet a standard for assessing the effectiveness of AIAs in a particular context, or a clear benchmark for what ‘good’ looks like. This makes it hard both to measure the effectiveness of an individual AIA process – in terms of what effects have been achieved or what harms have been prevented – and to evaluate different AIA approaches to ascertain which is more effective in a particular context.

A potential failure mode of an AIA would be a process that carefully documented a series of likely negative impacts of a system, but then saw the team proceed undeterred with development and deployment.[footnote]Moss, E., Watkins, E.A., Singh, R., Elish, M.C. and Metcalf, J. (2021). Assembling accountability: algorithmic impact assessment for the public interest. Data & Society. Available at: https://datasociety.net/library/assembling-accountability-algorithmic-impact-assessment-for-the-public-interest/[/footnote] Similarly concerning would be a scenario where an AIA is poorly conducted, surfacing few of the potential impacts, but a team is able to point to a completed AIA as justification for continued development.

An AIA will require a decision to be made about what to do in response to impacts identified – whether that is to take no action (impacts considered acceptable), to amend parts of the AI system (impacts require mitigation or action), or to not proceed with the development or use of the system (impacts require complete prevention). This is a high-level decision about
the acceptability of the impacts documented in an AIA.
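These three outcomes could be represented as follows; a minimal sketch for illustration only (the report does not prescribe any automated mapping from findings to decisions – that judgement sits with the applicant team and, ultimately, the DAC):

```python
from enum import Enum


class ImpactDecision(Enum):
    """The three high-level responses to documented impacts described above."""
    ACCEPT = "impacts considered acceptable; no action required"
    MITIGATE = "impacts require mitigation or amendment of the system"
    DO_NOT_PROCEED = "impacts require complete prevention; development or use stops"
```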

In our contextual example, the applicant team is a voluntary decision-maker (they could propose changes to their system, or choose to end their NMIP application or entire system development as a result of AIA findings). However, the ultimate decision about acceptability of impacts lies with the NMIP DAC, who would decide whether data can be made available for the applicant’s use case – this is, implicitly, a decision about the acceptability of impacts documented in the AIA (along with other documents) and whether the AIA has been completed to a sufficient standard.

To help the DAC in its decision-making, the proposal includes a draft terms of reference that specifies what a ‘good’ AIA in this context might look like and the rubric under which it should be reviewed. In order to prevent a myopic reading of an AIA, the DAC should comprise a diverse panel of representatives, including representatives from the NHS AI Lab, the social sciences, biomedical sciences, computer science/AI, legal and data ethics, and community representatives. It should also follow standards set for the cadence of committee meetings.

The guidelines instruct the DAC to accept or reject the applicant based on whether the AIA process has been run correctly, with evidence from both the reflexive exercise and the participatory workshop produced as part of the application, reflecting serious and careful consideration of impacts. The impetus behind these approaches is to provide a level of external scrutiny and visibility, which legitimises the process when compared with a wholly self-assessed approach.

In our context, we entrust the NMIP DAC with making the judgement call about the suitability of each AIA, and this then informs the final data-access decision. However, the role of the DAC in the NMIP context is broader than typical, as we are asking members to make an assessment
of a variety of potential impacts and harms to people and society, beyond privacy and security of individuals and their data.

Accordingly, AIAs designed for different contexts may require the chosen assessor to fulfil a slightly different role or to bring additional expertise. Over time, assessors of AIAs will need to arbitrate on the acceptability of the possible harmful impacts of a system, and will probably begin to construct clear, normative red lines. Regular and routine AIAs in operation across different domains will lead to clearer benchmarks for evaluation.

7. How will trials be resourced, evaluated and iterated?

Governments, public bodies and developers of AI systems are looking to adopt AIAs to create better understanding of, and accountability for, potential harms from AI systems. The evidence for AIAs as a useful tool to achieve this is predominantly theoretical, or based on examples from other sectors or domains. We do not yet know if AIAs achieve these goals in practice.

Anyone looking to adopt or require an AIA should therefore consider trialling the process, evaluating it and iterating on it. It cannot be assumed that an AIA is ‘ready to go’ out of the box.

This project has helped to bridge the gap between AIAs as a proposed governance mechanism and AIAs in practice. Questions about the kinds of expertise, resources and timeframe needed to build and implement an AIA should be discussed early on in the process.

For trials, we anticipate three key considerations: resourcing, funding and evaluation.

  1. Resourcing the design and trialling of an AIA process will require skills from multiple disciplines: we drew on a mix of data ethics, technical, public deliberation, communications and domain expertise (in this case, health and medical imaging).
  2. Funding is a necessary consideration, as our findings suggest the process may prove more costly than other forms of impact assessment, such as a data protection impact assessment (DPIA), due predominantly to the cost of running a participatory process. We argue that such costs should be considered a necessary condition of building an AI system with an application in high-stakes clinical pathways. The cost of running a participatory AIA will bring valuable insight, enabling developers to better understand how their system works in clinical pathways, outside of a research lab environment.
  3. A useful trial will require evaluation, considering questions such as: is the process effective in increasing the consideration of impacts? Does it include those who may be affected by the system in the identification of impacts? Does it result in the reduction of negative impacts? This may be done as part of the trial, or through documentation and publication of the process and results for others to review and learn from.

Currently, there are very few examples of AIA practice – just four published AIAs from the Canadian government’s AIA process[footnote]An example of a publicly-available AIA, from the Canadian Government Directive on Automated Decision-making, can be found here: https://open.canada.ca/aia-eia-js/?lang=en[/footnote] – with few details on the experience of the process or the changes resulting from it. As the ecosystem continues to develop, we hope that clearer routes to funding, trialling and evaluation will emerge, generating new AIAs: though policymakers may be disappointed to find that AIAs are not an ‘oven-ready’ approach, and that this AIA will need amendments before being directly transferable to other domains, we argue there is real value to be had in beginning to test AIA approaches within and across different domains.

Conclusion

This report has set out the assumptions, goals and practical features of a proposed algorithmic impact assessment process for the NHS AI Lab’s National Medical Imaging Platform, to contribute to the evidence base for AIAs as an emerging governance mechanism.

It argues that meaningful accountability depends on an external forum being able to pass judgement on an AIA, enabled through standardisation of documentation for public access and scrutiny, and through participation in the AIA, bringing diverse perspectives and relevant lived experience.

By mapping out the existing healthcare ecosystem, detailing a step-by-step process tailored to the NMIP context, including a participatory workshop, and presenting avenues for future research, we demonstrate how a holistic understanding of the use case is necessary to build an AIA that can confront and respond to a broad range of possible impacts arising from a specific use of AI.

As the first detailed proposal for the use of AIAs in a healthcare context, the process we have built was constructed according to the needs of the NMIP: our study adds weight to the argument that AIAs are not ‘ready to roll out’ across all sectors. However, we have argued that testing, trialling and evaluating AIA approaches will help build a responsive and robust assessment ecosystem, which may in turn generate further AIAs by providing a case law of examples, and demonstrating how certain resources and expertise might be allocated.

This report addresses three key audiences: policymakers interested in AIAs, AIA practitioners and researchers, and developers of AI systems in the healthcare space.

Policymakers should pay attention to how this proposed AIA fits in the existing landscape, and to the findings related to process development that show some challenges, learnings and uncertainties when adopting AIAs.

For AIA practitioners and researchers, there is further research to be carried out to develop robust AIA practices.

Developers of AI systems that may be required to complete an AIA will want to use the report to learn how it was constructed and how it is implemented, as well as Annex 1 for the ‘AIA user guide’, which provides step-by-step detail. Building a shared understanding of the value of AIAs, who could adopt them, and what promise they hold for the AI governance landscape, while responding to the nuances of different domain contexts, will be critical for future applications of AIA.

This project has offered a new lens through which to examine and develop AIAs at the intersection of private and public-sector development, and to understand how public-sector activity could shape industry practice in the healthcare space. But this work is only in its
infancy.

As this report makes clear, the goals of AIAs – accountability, transparency, reflection, standardisation, independent scrutiny – can only be achieved if there is opportunity for proposals to become practice through new sites of enquiry that test, trial and evaluate AIAs, helping to make sure AI works for people and society.

Methodology

To investigate our research questions and create recommendations for an NMIP-specific AIA process, we adopted three main methods:

  • a literature review
  • expert interviews
  • process development

Our literature review surveyed AIAs in both theory and practice, as well as analogous approaches to improving algorithmic accountability, such as scholarship on algorithm audits and other impact assessments for AI that are frequently adopted in tandem with AIAs. In order to situate discussion on AIAs within the broader context, we reviewed research from across the fields of AI and data ethics, public policy/public administration, political theory and computer science.

We held 20 expert interviews with a range of stakeholders from within the NHS AI Lab, NHSX and outside. These included clinicians and would-be applicants to the National Medical Imaging Platform, such as developers from healthtech companies building on imaging data, to understand how they would engage with an AIA and how it would slot into existing workstreams.

Finally, we undertook documentation analysis of material provided by the NHS AI Lab, NMIP and NCCID teams to help understand their needs, in order to develop a bespoke AIA process. We present the details of this process in ‘Annex 1: Proposed process in detail’, citing insights from the literature review and interviews to support the design decisions that define the proposed NMIP AIA process.

This partnership falls under NHS AI Lab’s broader work programme known as ‘Facilitating early-stage exploration of algorithmic risk’.[footnote]NHS AI Lab. The AI Ethics Initiative: Embedding ethical approaches to AI in health and care. Available at: https://www.nhsx.nhs.uk/ai-lab/ai-lab-programmes/ethics/[/footnote]

Acknowledgements

We would like to thank the following colleagues for taking time to review a draft of this paper or offering their expertise and feedback:

  • Brhmie Balaram, NHS AI Lab
  • Dominic Cushnan, NHS AI Lab
  • Emanuel Moss, Data & Society
  • Maxine Mackintosh, Genomics England
  • Lizzie Barclay, Aidence
  • Xiaoxuan Liu, University Hospitals Birmingham NHS Foundation Trust
  • Amadeus Stevenson, NHSX
  • Mavis Machirori, Ada Lovelace Institute.

This report was lead-authored by Lara Groves, with substantive contributions from Jenny Brennan, Inioluwa Deborah Raji, Aidan Peppin and Andrew Strait.

 

Annex 1: Proposed process in detail

As well as synthesising information about AIAs, this project has developed a first iteration of a process for using an AIA in a public-sector, data-access context. The detail of the process will not be applicable to every set of conditions in which AIAs might be used, but we expect it will
provide opportunities to develop further thinking for these contexts.

People and organisations wishing to understand more about, or implement, an AIA process will be interested in the detailed documentation developed for the NMIP and NHS AI Lab:

  • NMIP AIA user guide: a step-by-step guide to completing the AIA for
    applicants to the NMIP.
  • AIA reflexive template: the document NMIP applicants will fill in during
    the AIA and submit to the NMIP with their application.

Annex 2: NMIP Data Access Committee Terms of Reference

Responsibilities

  • To consider and authorise requests for access to the National Medical Imaging Platform (NMIP), a centralised database of medical images collected from NHS trusts.
  • To consider and authorise applications for the use of data from the NMIP.
  • To consider continuing or disabling access to the NMIP and uses of its data.
  • To judge applications using the criteria and protocols outlined in the NMIP’s data access documentation request forms, which include but are not limited to:
    • an algorithmic impact assessment (AIA) reflexive template
      (completed by requesting project teams)
    • an accompanying participatory workshop report (completed by an NHS AI Lab rapporteur on behalf of the patient and public participants for the participatory workshop)
    • a data protection impact assessment (DPIA).
  • To judge applications according to the NMIP Data Access Committee (DAC) policy, which includes guidance on the reflexive exercise and participatory workshop requirements. This guidance will be updated regularly and placed on the NMIP website.
  • To establish a body of published decisions on NMIP data access requests, as precedents which can inform subsequent requests for NMIP access and use.
  • To disseminate policies to applicants and encourage adherence to all
    guidance and requirements.

Membership

  • Membership of the DAC will comprise at least eleven members as
    outlined below:

    • a chair from an independent institution
    • an independent deputy Chair
    • two academic representatives from the social sciences
    • one academic representative from the biomedical sciences
    • one academic representative from the computer science/AI field
    • one academic representative with legal and data ethics expertise
    • two non-academic representatives from patient communities
    • two members of the NHS AI Lab.
  • In addition to the core DAC, a four-person technical review team will comprise relevant researchers, data managers and lab managers who can assess data privacy and security questions. This team will be appointed by the DAC.
  • DAC members will be remunerated for their time according to an hourly wage set by NHS AI Lab.
  • An NHS AI Lab participatory workshop rapporteur will attend DAC meetings to provide relevant information when necessary to inform the decisions.
  • When reviewing data access requests, the following members from the project team will be in attendance to present their case:
    • the study’s principal investigator (PI)
    • a member of the study’s technical team.
  • When reviewing data-access requests, the DAC may request that a representative of the project’s funding organisation, members of a technical review team, or representatives reflecting experiential expertise relevant to the project attend in an ex officio capacity to observe and provide information to help inform decisions.
  • Members, including the Chair and Deputy Chair, will usually be appointed for three years, with the option to extend for a further three after the first term only. Appointment to the DAC will be staggered in order to ensure continuity of membership. The recruitment process will occur annually, when new appointments are necessary, ahead of the second face-to-face meeting of the year.
  • The DAC will co-opt members as and when there is a need for additional expertise. These members will have full voting rights and their term will end on appointment of new members through the annual recruitment process.

Modes of operation

  • The DAC will follow the guidance for assessing data access request documentation. Updates to this guidance must be approved by a majority vote of the DAC.
  • The DAC will meet virtually to address data access requests once each month. The DAC will meet face to face three times a year to discuss emerging issues in relation to data access and provide information on these to the individual studies and funders. Project leads will be copied into email correspondence regarding individual applications.
  • Quoracy formally requires the attendance of half of the full independent membership (with at least one independent member with biomedical science expertise and one with social science expertise) and that either the Chair or the Deputy Chair is present for continuity (a sketch of this rule follows this list). For face-to-face meetings, where it is unavoidable, attendance of a member by teleconference will count as being present.
  • Comments from the technical review team will be circulated to the DAC along with any applications requesting access to the data.
  • Decisions of the DAC on whether to grant access to applications will be based on a majority vote. In the event that either a) a majority decision amongst DAC members is not reached; or b) a project lead has grave concerns that the DAC’s decision creates unreasonable risk for the project, the Chair of the DAC will refer the decision to the relevant appeals body.
  • Where appropriate, the DAC will take advantage of third-party specialist knowledge, particularly where an applicant seeks to use depletable samples. Where necessary the specialist will be invited to sit on the DAC as a co-opted member.
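A minimal sketch of the quoracy rule above, with assumed member attributes; this is an illustration of the stated rule only, not part of the terms of reference:

```python
from dataclasses import dataclass


@dataclass
class Member:
    name: str
    independent: bool
    expertise: set          # e.g. {"biomedical science"}, {"social science"}
    role: str = "member"    # "chair", "deputy chair" or "member"


def is_quorate(attendees, total_independent_members: int) -> bool:
    """Half of the full independent membership present, including at least one
    biomedical and one social science member, plus the Chair or Deputy Chair."""
    independents = [m for m in attendees if m.independent]
    has_chairing = any(m.role in ("chair", "deputy chair") for m in attendees)
    has_biomedical = any("biomedical science" in m.expertise for m in independents)
    has_social = any("social science" in m.expertise for m in independents)
    return (
        len(independents) >= total_independent_members / 2
        and has_biomedical
        and has_social
        and has_chairing
    )
```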

Reporting

  • Decisions of the Committee will be reported on the NMIP website and must be published no more than one month after a decision has been reached. Decisions must be accompanied by relevant documentation from the research.

Annex 3: Participatory AIA process

NHS AI Lab NMIP participatory AIA process outline

Overview

  • The recommendation is that NHS AI Lab sets up a paid panel of 25-30 patients and members of the public who reflect the diversity of the population who will be affected by algorithms that interact with NMIP data.
  • This panel will form a pool of participants to take part in a small series of activities that form the participatory component of the AIA process.
  • When an applicant to NMIP data is running their AIA process, the NHS AI Lab should work with them to set up a workshop with the panel to identify and deliberate on impacts. The applicant then develops responses that address the identified impacts, which the panel members review and give feedback on. The Data Access Committee (DAC) uses the outcomes of this process to support their consideration of the application, alongside the wider AIA.
  • The five stages of the participatory component are:
  1. recruit panel members
  2. induct panel members
  3. hold impact identification workshops
  4. technology developers (the NMIP applicants) review the impacts identified in the workshops and develop a plan to address or mitigate them
  5. the panel reviews the mitigation plans and feeds back to the NHS AI Lab DAC.

These stages are detailed below, along with an indication of required costs and resources, and additional links for information.

Panel recruitment

The panel forms a pool of people who can be involved in the impact identification workshop for each project. This is designed to factor in the panel recruitment and induction burden, by enabling projects to be reviewed in ‘batches’ – for instance, if the NMIP had quarterly application rounds, a panel may be recruited to be involved in all the workshops for that round.

Note: the following numbers are estimates based on best practice. Exact numbers may vary depending on expected and actual application numbers.

  • 25–30 people who reflect the diversity of the population that might be affected by the algorithm across: age, gender, region, ethnic background, socio-economic background, health condition and access to care. The pool size of 25–30 assumes multiple AIAs are required and ensures the same people aren’t reviewing every algorithm: it allows a different combination of 8–12 participants for each algorithm if there are six or more to review (see the sketch after this list). If the number of AIAs needed is smaller than this, then a smaller panel could be used.
  • Recruited either via a social research recruitment agency, or via NHS trusts involved.
  • Panel does not need to be statistically representative of the UK public, but instead should reflect the diversity of perspectives and experiences in the populations/communities likely to be affected by the algorithms.[footnote]Steel, D., Bolduc, N., Jenei, K. and Burgess, M. (2020). ‘Rethinking representation and diversity in deliberative minipublics’. Journal of Deliberative Democracy, 16,1, pp.46-57 [online]. Available at: https://delibdemjournal.org/article/id/626/[/footnote]
  • (Ideally) one or two panel members should sit on the DAC as full members.
  • Panel members should be remunerated for their involvement on the panel. The amount should reflect the hours required to participate in all the activities: the induction, the assessment workshops, reviewing materials and feeding back on impact mitigation plans (inc. travel if necessary) (see ‘Resourcing and costs’).
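As referenced above, the sketch below shows one way a pool of 30 could be rotated across six workshops of 10 so that participation is spread evenly and the mix varies. The rotation scheme, group size and seed are assumptions for illustration, not a recommendation from this report.

```python
import random


def assign_workshops(panel, n_workshops, group_size=10, seed=0):
    """Spread workshop participation evenly across the pool, while varying
    the combination of panellists reviewing each algorithm."""
    rng = random.Random(seed)
    counts = {p: 0 for p in panel}
    assignments = []
    for _ in range(n_workshops):
        # least-used panellists go first; random tie-breaking varies the mix
        group = sorted(panel, key=lambda p: (counts[p], rng.random()))[:group_size]
        for p in group:
            counts[p] += 1
        assignments.append(group)
    return assignments


panel = [f"panellist_{i:02d}" for i in range(30)]
workshops = assign_workshops(panel, n_workshops=6)
```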

Panel induction

After being recruited, the NHS AI Lab should run an induction session to inform the panel members about the NMIP, how the application and AIA process works and their role.

Participants:

  • All panel members: to attend and learn about the NMIP, AIAs – including where this exercise sits in the timeline of the AIA process (i.e. after NMIP applicants have completed internal AIA exercises) – and their role.
  • NHS panel coordinator: to run the session and facilitate discussion.
  • Technology and Society (T&S) professional: to present to the panel on what algorithms are, what the AIA process is, and some common issues or impacts that may arise.

Structure:

  • Two hours, virtual or in-person (for either format, ensure participants have support to access and engage fully).
  • Suggested outline agenda:
    • introduction to each other
    • introduction to the NMIP – what it is, what it aims to do
    • introduction to the panel’s purpose and aims
    • presentation from T&S professional on what an algorithm is and what an AIA is followed by Q&A
    • interactive exercises and discussion of case studies of specific algorithm use cases, with strawperson examples; mapping how different identities/groups would interact with the algorithm (with a few example patients from different groups).
    • how the panel and participatory AIA process will work
    • what is required of the panel members.

Equipment and tools required:

  • Accessible room/venue and/or online video-conferencing tool (e.g. Zoom – with provisions for visually or hearing-impaired and neurodiverse people as required).
  • Slide deck for introductions and presentations (with accessibility provisions).
  • Any documentation for further reading (e.g. links to ‘about’ page of the NMIP, information about AIAs, document outlining participatory process and requirements of participants).

Outputs:

  • Participants are equipped with the knowledge they need to be able to
    be active members of the participatory process.

Participatory workshop

The participatory workshop follows the reflexive exercise and provides the forum for a broad range of people to discuss and deliberate on some impacts of the applicant’s proposed system, model or research.

Participants:

  • Panel members (8–12 people): to participate in the workshop and share their perspectives on the algorithm’s potential impacts.
  • Facilitator (one or two people): to lead the workshop, guide discussion and ensure the participants’ views are listened to. Facilitators could be an NHS AI Lab staff member, a user researcher from the applicant organisation or a consultant; either way, they must have facilitation experience and remain impartial to the process. Their role is to ensure the process happens effectively and rigorously, and they should have the skills and position to do so.
  • Rapporteur (one person, may be a facilitator): to serve the panel in documenting the workshop.
  • Technology developer representative (one or two people): to represent the technology development team, explain the algorithm, take questions and, crucially, listen to the participants and take notes.
  • (Ideally) ‘critical friend’ (one person): a technology and society (T&S) professional to join the workshop, help answer participants’ questions, and support participants to fully explore potential impacts. They are not intended to be deeply critical of the algorithm, but to impartially support the participants in their enquiry.
  • (Optional) a clinical ‘critical friend’ (one person): a medical professional to play a similar role to the T&S professional.

Structure:

  • Three hours, virtual or in-person (for either format, ensure participants
    have support to access and engage fully).
  • Suggested agenda:
    • Introductions to each other and the session, with a reminder of the
      purpose and agenda (10 mins).
    • Presentation from technology developers about their algorithm, in
      plain English (20 mins), covering:

      • Who their organisation is, its aims and values, whether it is for-profit or non-profit, and whether it already works with the NHS and how.
      • What their proposed algorithm is: what it aims to do (and what prompted the need for the algorithm), how it works (not in technical detail), what data will be input (both how the algorithm uses NMIP data and any other datasets used for training, if applicable), what outputs the algorithm will generate, how the algorithm will be deployed and used (e.g. in hospitals, via a direct-to-patient app etc.), who it will affect, what benefits it will bring, and what impacts the team has already considered.
    • Q&A led by the lead facilitator (20 mins).
    • A session to identify potential impacts (45–60 mins with a break part way through, and a facilitator taking notes on a [virtual] whiteboard):
      • As one group or in two breakout groups, the participants consider the algorithm and generate ideas for how it could create impacts. With reference to the best, worst and most likely scenarios for deployment of the algorithm that applicant teams set out in the reflexive exercise, participants will discuss these answers and provide their thoughts. Technology developer observes but does not participate unless the facilitator brings them in to address a technical or factual point. Critical friend observes and supports as required (guided by facilitator).
      • This task should be guided by the facilitator, asking questions to prompt discussion about the scenarios, such as:
        • What groups or individuals would be affected by this
          project?
        • What potential risks, biases or harms do you foresee
          occurring from use/deployment of this algorithm?
        • Who will benefit most from this project and how?
        • Who could be harmed if this system fails?
        • What benefits will this project have for patients and the NHS?
        • Of the impacts identified, what would be potential causes for this impact?
        • What solutions or measures would they like to see adopted to reduce the risks of harm?
    • A session to group themes in the impacts into the template and prioritise them (25 mins):
      • As one group or in two breakout groups, the participants consider any common themes in their identified impacts and group them together (e.g. multiple impacts might relate to discrimination, or to reduced quality of care). Technology developer observes but does not participate unless the facilitator brings them in to address a technical or factual point. Critical friend observes and supports as required (guided by facilitator). The facilitator should use a (virtual) whiteboard to fill out the template.
      • The participants then prioritise the themes and specific impacts by dot-voting. They should be guided by the facilitator, asking questions such as:

        • Of the impacts identified, which are likely to cause high and very high risk of harm?
        • Of the impacts identified, which would you consider to be the most important? How consequential is this harm for the wellbeing of which stakeholders?
        • Of the impacts identified, which are urgent? How immediate would the threat of this impact be?
        • Of the impacts identified, which will be the most difficult to mitigate?
        • Of the impacts identified, which will be the most difficult to detect, given the current design?
    • Participants take a break while the technology developer reviews the templates of identified impacts (10 mins).
    • A session with facilitated discussion so the technology developer can ask questions back to the participants, to clarify and further flesh out the impacts identified, and to provide an overview of next steps: how the developers will confront and respond to the impacts identified in the development process, what that could look like (e.g. model retraining), and updates to the AIA (25 mins).
    • Wrap up and close (5 mins).
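For those adapting this agenda, a quick arithmetic check (a sketch using the durations listed above, with the impact identification session taken at its 60-minute upper bound) confirms the suggested agenda fits within the three-hour session:

```python
# Suggested agenda durations in minutes, taken from the outline above.
agenda_minutes = {
    "introductions": 10,
    "developer presentation": 20,
    "Q&A": 20,
    "impact identification": 60,          # 45-60 mins; upper bound used here
    "grouping and prioritisation": 25,
    "break / developer review": 10,
    "facilitated discussion with developer": 25,
    "wrap up and close": 5,
}

total = sum(agenda_minutes.values())      # 175 minutes
assert total <= 3 * 60, "agenda exceeds the three-hour session"
print(f"Total: {total} minutes of a 180-minute session")
```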

Equipment and tools required:

  • Accessible room/venue and/or online video-conferencing tool (e.g. Zoom – with provisions for visually or hearing-impaired and neurodiverse people as required).
  • Slide deck for introductions and presentations (with accessibility
    provisions). Ideally shared beforehand.
  • (Virtual) whiteboard/flipchart and post-its.
  • Impacts template prepared and ready to be filled in.

Outputs:

  • Filled-out template that lists the impacts and their priority (based on dot-votes).
  • Technology developers should take notes to deepen their understanding of their algorithm’s potential impacts and of the public’s perspectives.

Applicant teams devise ideas to address or mitigate impacts

Following the workshop, the applicant team should consider mitigations, solutions or plans to address the impacts identified during the workshop, and update the first iteration of the AIA in light of the participants’ deliberations.

This analysis should be worked back into the template as part of the synthesis exercise.

Panel reviews mitigation plans

  • The applicant team’s plans to address impacts are shared with the panel participants, who review them and share any feedback or reflections. This can be done asynchronously via email, over a period of two to four weeks, assuming all participants are supported to engage (accessible materials, access to the web, etc.).
  • Panel can make a judgement call on permissibility of the algorithm based on the developer’s updated proposals, for the DAC to consider.
  • The panel’s comments are used by the NHS AI Lab NMIP DAC to support their assessment of the overall AIA.

Resourcing and costs

Staff resources:

  • Panel coordinator: a member of NHS AI Lab staff to coordinate and run the panel process, and to ensure it is genuinely and fully embedded in the wider AIA and considered by the DAC. This individual should have experience and knowledge of: public engagement, public and stakeholder management, workshop design and working with those with complex health conditions. This could be a stand-alone role, or a part of another person’s role, as long as they are given sufficient time and capacity to run the process well.
  • Facilitators: additional NHS staff, partner organisation staff or freelancers to support workshop facilitation as required.
  • Technology developers and critical friends who participate in the impact identification workshops should be able to do this as part of their professional roles, so would not typically require remuneration.

Panel participant cost estimates:

Recruitment: there are two approaches to recruiting participants:

  1. Panel coordinator works with NHS trusts and community networks to directly recruit panel members (e.g. by sending email invitations). The coordinator would need to ensure they reach a wide population, so that the final panel is sufficiently diverse. This option has no additional cost, but is significantly time-intensive, and would require the co-ordinator to have sufficient capacity, support and skills to do so.
  2. Commission a research participant recruitment agency to source panel members and manage communication and remuneration.

Costs:

  • Recruitment cost estimated at £100 per person for 30 people: £3,000.
  • Administrative cost estimate for communications and remunerating
    participants: £2,000 – £4,000.
  • Remuneration: participants should be remunerated at industry best practice rates of £20–£30 per hour of activity.
  • Assuming 30 participants, each of whom takes part in the induction (two hours) and a single ‘batch’ of NMIP applications – for example, five workshops (15 hours) and reviews of five mitigation plans (six hours) – estimated remuneration costs would be £13,800–£20,700 (see the worked example after this list).
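The worked example referenced above reproduces the remuneration estimate from the figures listed; the participant numbers, hours and rates are the report’s estimates, and combining them with the recruitment and administrative costs is purely illustrative.

```python
# Remuneration estimate for one 'batch' of NMIP applications, using the figures above.
participants = 30
hours_per_participant = 2 + 15 + 6           # induction + five workshops + plan reviews
rate_low, rate_high = 20, 30                 # £ per hour of activity

remuneration_low = participants * hours_per_participant * rate_low    # £13,800
remuneration_high = participants * hours_per_participant * rate_high  # £20,700

recruitment = 100 * participants             # £3,000
admin_low, admin_high = 2_000, 4_000

total_low = remuneration_low + recruitment + admin_low
total_high = remuneration_high + recruitment + admin_high
print(f"Remuneration: £{remuneration_low:,}-£{remuneration_high:,}; "
      f"with recruitment and admin: £{total_low:,}-£{total_high:,}")
```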

Miscellaneous costs to consider:

  • If hosting workshops virtually: cost for any software and accessibility support such as interactive whiteboards, video-conferencing software, live captioning, etc.
  • If hosting workshops in-person: venue hire, catering, travel etc.
  • Materials: printing, design of templates, information packs etc. as required.

 

 

 

 



COVID-19 Data Explorer

This report is accompanied by the 'COVID-19 Data Explorer', a resource containing country-specific data on timelines, technologies, and public response to be used to explore the legacy and implications of the rapid deployment of contact tracing apps and digital vaccine passports across the world.

Executive summary

The COVID-19 pandemic is the first global public health crisis of ‘the algorithmic age’.[1] In response, hundreds of new data-driven technologies have been developed to diagnose positive cases, identify vulnerable populations and conduct public health surveillance of individuals known to be infected.[2] Two of the most widely deployed are digital contact tracing apps and digital vaccine passports.

For many governments, policymakers and public health experts across the world, these technologies raised hopes through their potential to assist in the fight against the COVID-19 virus. At the same time, they provoked concerns about privacy, surveillance, equity and social control because of the sensitive social and public health surveillance data they use – or are seen as using.

An analysis of the evidence on how contact tracing apps and digital vaccine passports were deployed can provide valuable insights about the uses and impacts of technologies at the crossroads of public emergency, health and surveillance.

Analysis of their role in societies can shed light on the responsibilities of the technology industry and policymakers in building new technologies, and on the opinions and experiences of members of the public who are expected to use them to protect public health.

These technologies were rolled out rapidly at a time when countries were under significant pressure from the financial and societal costs of the pandemic. Public healthcare systems struggled to cope with the high numbers of patients, and pandemic restrictions such as lockdowns resulted in severe economic crises and challenges to education, welfare and wellbeing.

Governments and policymakers needed to make decisions and respond urgently, and they turned to new technologies as a tool to help control the spread of infection and support a return to ‘normal life’. This meant that – as well as guiding the development of technologies – they had an interest in convincing the public that they were useful and safe.

Technologies such as contact tracing apps and digital vaccine passports have significant societal implications: for them to be effective, people must consent to share their health data and personal information.

Members of the public were expected to use the technologies in their everyday lives and change their behaviour because of them – for example, proving their vaccination status to access workplaces, or staying at home after receiving a COVID-19 exposure alert.

Examining these technologies therefore helps to build an understanding of the public’s attitudes to consent in sharing their health information, as well as public confidence in and compliance with health technologies more broadly.

As COVID-19 technologies emerged, the Ada Lovelace Institute was one of the first research organisations to investigate their potential legislative, technical and societal implications. We reviewed the available evidence and made a wide range of policy and practice recommendations, focusing on effectiveness, public legitimacy, governance and potential impact on inequalities.

This report builds on this work: revisiting those early recommendations; assessing the evidence available now; and drawing out lessons for policymakers, technology developers, civil society and public health organisations. Research from academia and civil society into the technologies concentrates largely on specific country contexts.[3]

There are also international studies that provide country-specific information and synthesise cross-country evidence but focus primarily on one aspect of law and governance or public attitudes. [4], [5], [6] This body of research provides valuable insights into diverse policies and practices and unearths legislative and societal implications of these technologies at different stages of the pandemic.

Yet research that investigates COVID-19 technologies in relation to public health, societal inequalities and regulations simultaneously and at an international level remains limited. In addition, efforts to track the development of global policy and practice have slowed in line with the reduced use of these technologies in many countries.

However, it remains important to understand the benefits and potential harms of these technologies by considering legislative, technical and societal aspects simultaneously. Despite the limitations, presenting the evidence and identifying gaps can provide cross-cutting lessons for governments and policymakers, to inform policy and practice both now and in the future.

These lessons concern the wide range of technical, legislative and regulatory requirements needed to build public trust and cooperation, and to mitigate harms and risks when using technologies in public crises, and in health and social care provision.

Learning from the deployment of contact tracing apps and digital vaccine passports remains highly relevant. As the infrastructure remains in place in many countries (for example, authentication services, external data storage systems, security operations built within applications, etc.), the technologies are easy to reinstate or repurpose.

Some countries have already transformed them into new health data and digital identity systems – for example, the Aarogya Setu app in India. In addition, on 27 January 2023, the World Health Organization (WHO) stated: ‘While the world is in a better position than it was during the peak of the Omicron transmission one year ago, more than 170,000 COVID-19-related deaths have been reported globally within the last eight weeks’.[7]

And on 5 May 2023, the WHO confirmed that while COVID-19 no longer constitutes a public health emergency of international concern and the number of weekly reported deaths and hospitalisations has continued to decrease, it is concerned that ‘surveillance reporting to WHO has declined significantly, that there continues to be inequitable access to life-saving interventions, and that pandemic fatigue continues to grow’. [8]

In other words, the pandemic is far from over, and we need to pay attention to the place of these technologies in our societies now and in future pandemics.

This report synthesises the available evidence on a cross-section of 34 countries, exploring technical considerations and societal implications relating to the effectiveness, public legitimacy, inequalities and governance of COVID-19 technologies.

Evidence was gathered from a wide range of sources across different disciplines, including academic and grey literature, policy papers, the media and workshops with experts.

Existing research demonstrates that governments recognised the value of health, mobility, economic or other kinds of personal data in managing the COVID-19 pandemic and deployed a wide range of technologies to collect and share data.

However, given that the technologies were developed and deployed at pace, it was difficult for governments to adequately prepare to use them – and the data collected and shared through them – in their broader COVID-19 pandemic management.[9]

It is therefore unsurprising that governments did not clearly define how to measure the effectiveness and social impacts of COVID-19 technologies. This leaves us with important evidence gaps, making it harder to fully evaluate the effectiveness of the technologies and understand their impact on health and other forms of social inequalities.

We also highlight evidence gaps that indicate where evaluation and learning mechanisms fell short when technologies were used in response to COVID-19. We call on governments to consider these gaps and retrospectively evaluate the effectiveness and impact of COVID-19 technologies.

This will enable them to improve their evaluation and monitoring mechanisms when using technologies in future pandemics, public health, and health and social care provision.

The report’s findings should guide governments, policymakers and international organisations when deploying data-driven technologies in the context of public emergency, health and surveillance. They should also support civil society organisations and those advocating for technologies that support fundamental rights and protections, public health and public benefit.

‘COVID-19 technologies’ refers to data-driven technologies and AI tools that were built and deployed to support the COVID-19 pandemic response. Two of the most widely deployed are contact tracing apps and digital vaccine passports, and they are the main focus of this report. Both technologies aim to assess the risk an individual poses to others and to allow or restrict freedoms accordingly. There are varying definitions of these technologies. In this report we define them through their common purposes and properties, as follows (with a short illustrative sketch after the definitions):

  • Contact tracing apps aim to measure an individual’s risk of becoming infected with COVID-19 and of transmitting the virus to others based on whether they have been in close proximity to a person known to be infected. If a positive COVID-19 test result is reported to the app (by the user or the health authorities), the app alerts other users who might have been in close proximity to the person known to be infected with COVID-19. App users who have received an alert are expected to get tested and/or self-isolate at home for a certain period of time.[10]
  • Digital vaccine passports show the identity of a person and their COVID-19 vaccine status or antigen test results. They are used to prove the level of risk an individual poses to others based on their COVID-19 test results, and proof of recovery or vaccine status. They function as a pass that blocks or allows access to spaces and activities (such as travelling, leisure or work).[11]
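To make these definitions concrete, the following minimal sketch (in Python) models the decision each technology makes. All names, fields and thresholds here are illustrative assumptions for explanation only; they do not describe any specific app or passport scheme.

```python
from dataclasses import dataclass

# Hypothetical, simplified models of the two technologies defined above.

@dataclass
class ExposureEvent:
    """A close-proximity encounter recorded by a contact tracing app."""
    other_user_reported_positive: bool  # a positive test was reported to the app
    minutes_in_proximity: int

def should_alert(event: ExposureEvent, min_minutes: int = 15) -> bool:
    # Alert the user if they were near a person known to be infected for long
    # enough; the 15-minute threshold is an illustrative assumption.
    return event.other_user_reported_positive and event.minutes_in_proximity >= min_minutes

@dataclass
class VaccinePassport:
    """A digital vaccine passport: identity plus vaccination, test or recovery status."""
    holder_name: str
    vaccinated: bool
    recent_negative_test: bool
    proof_of_recovery: bool

def allow_entry(passport: VaccinePassport) -> bool:
    # A venue or border check passes if any accepted form of proof is present.
    return passport.vaccinated or passport.recent_negative_test or passport.proof_of_recovery
```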

Cross-cutting findings

Despite the complex, conflicting and limited evidence available about contact tracing and digital vaccine passports, this report uses a wide range of available resources and identifies the cross-cutting findings summarised here under the four themes of effectiveness; public legitimacy; inequalities; and governance, regulation and accountability.

Effectiveness: Did COVID-19 technologies work?

  • Contact tracing apps and digital vaccine passports were – necessarily – rolled out quickly, without consideration of what evidence would be needed to demonstrate their effectiveness. There was insufficient consideration and no consensus reached on how to define, monitor, evaluate or demonstrate their effectiveness and impacts.
  • There are indications of the effectiveness of some technologies, for example the NHS COVID-19 app (used in England and Wales). However, the limited evidence base makes it hard to evaluate their technical efficacy or epidemiological impact overall at an international level.
  • The technologies were not well integrated into broader public health systems and pandemic management strategies, and this reduced their effectiveness. However, the evidence on this is limited in most of the countries in our sample (with a few exceptions, for example Brazil and India), and we do not have clear evidence to compare COVID-19 technologies with non-digital interventions or to weigh up their relative benefits and harms.
  • The evidence is inadequate on whether COVID-19 technologies resulted in positive change in people’s health behaviours (for example, whether people self-isolated after receiving an alert from a contact tracing app), either when the technologies were first deployed or over time.
  • Similarly, it is not clear how the apps’ technical properties and the various policies and approaches impacted on public uptake of the apps or adherence to relevant guidelines (for example, self-isolation after receiving an alert from a contact tracing app).

Public legitimacy: Did people accept COVID-19 technologies?

  • Public legitimacy was key to ensuring the success of these technologies, affecting uptake and behaviour.
  • People were concerned about the use of digital vaccine passports to enforce restrictions on liberty and increased surveillance. People protested against them, and the restrictive policies they enabled, in more than half of the countries in our sample.
  • Public acceptance of contact tracing apps and digital vaccine passports depended on trust in their effectiveness, as well as trust in governments and institutions to safeguard civil rights and liberties. Individuals and communities who encounter structural inequalities are less likely to trust government institutions and the public health advice they offer. Not surprisingly, these groups were less likely than the general population to use these technologies.
  • The lack of targeted public communications resulted in poor understanding of the purpose and technical properties of COVID-19 technologies. This reduced public acceptance and social consensus around whether and how to use the technologies.

Inequalities: How did COVID-19 technologies affect inequalities?

  • Some social groups faced barriers to accessing, using or following the guidelines for contact tracing apps and digital vaccine passports, including unvaccinated people, people structurally excluded from sufficient digital access or skills, and people who could not self-isolate at home due to financial constraints. A small number of sample countries adopted policies and practices to mitigate the risk of widening existing inequalities. For example, the EU allowed paper-based Digital COVID Certificates for those with limited digital access and skills.
  • This raises the question of whether COVID-19 technologies widened health and other societal inequalities. In most of the sample countries, there is no clear evidence whether governments adopted effective interventions to help those who were less able to use or benefit from these technologies (for example, whether they provided financial support for those who could not self-isolate after receiving an exposure alert due to not being able to work from home).
  • Most sample countries requested proof of vaccination from inbound travellers before allowing unconditional entry (that is, without a quarantine or self-isolation period) at some stage of the pandemic. This amplified global inequalities by discriminating against the residents of countries that could not secure adequate vaccine supply or had low vaccine uptake – specifically, many African countries.

Governance, regulation and accountability: Were COVID-19 technologies governed well and with accountability?

  • Contact tracing apps and digital vaccine passports combine health information with social or surveillance data. As they limit rights (for example, by blocking access to travel or entrance to a venue for people who do not have a digital vaccine passport), their use must be proportional. This means striking a balance between limitations of rights, potential harms and the intended purpose. To achieve this, it is essential that these tools are governed by robust legislation, regulation and oversight mechanisms, and that there are clear ‘sunset mechanisms’ in place to determine when they no longer need to be used.
  • Most countries in our sample governed these technologies in line with pre-existing legislative frameworks, which were not always comprehensive. Only a few countries enacted robust regulations and oversight mechanisms specifically governing contact tracing apps and digital vaccine passports, including the UK, EU member states, Taiwan and South Korea.
  • The lack of robust data governance frameworks, regulation and oversight mechanisms led to a lack of clarity about who was accountable for misuse or poor performance of COVID-19 technologies. Not surprisingly, there were incidents of data leaks, technical errors and data being reused for other purposes. For example, contact tracing app data was used in police investigations in Singapore and Germany, and sold to third parties for commercial purposes in the USA.[12]
  • Many governments relied on private technology companies to develop and deploy these technologies, demonstrating and reinforcing the industry’s influence and the power located in digital infrastructure.

Lessons

These findings present clear lessons for governments and policymakers deciding how to use contact tracing apps and digital vaccine passports in the future. These lessons may also apply more generally to the development and deployment of any new data-driven technologies and approaches.

Effectiveness

To build evidence on the effectiveness of contact tracing apps and digital vaccine passports:

  • Support research and learning efforts which review the impact of these technologies on people’s health behaviours.
  • Weigh up the technologies’ benefits and harms by considering their role within the broader COVID-19 response and comparing them with non-digital interventions (for example, manual contact tracing).
  • Understand the varying impacts of apps’ different technical properties, and of policies and approaches to implementation on people’s acceptance of, and experiences of, these technologies in specific socio-cultural contexts and across geographic locations.
  • Use this impact evaluation to help set standards and strategies for the future use of these technologies in public crises.

To ensure the effective use of technology in future pandemics:

  • Invest in research and evaluation from the start, and implement a clear evaluation framework to build evidence during deployment that supports understanding of the role that technologies play in broader pandemic health strategies.
  • Define criteria for effectiveness using a human-centred approach that goes beyond technical efficacy and builds an understanding of people’s experiences.
  • Establish how to measure and monitor effectiveness by working closely with public health experts and communities, and set targets accordingly.
  • Carry out robust impact assessments and evaluation.

Public legitimacy

To improve public acceptance:

  • Build public trust by publishing guidance and enacting clear legislation that sets out permitted and restricted uses, mechanisms to support rights (for example, the right to privacy), and routes to tackle legal issues and enable redress (for example, in the event of a data leak or of collected data being used for reasons other than health).
  • Effectively communicate the purpose of using technology in public crises, including the technical infrastructure and legislative framework for specific technologies, to address public hesitancy and build social consensus.

Inequalities

To avoid entrenching and exacerbating societal inequalities:

  • Create monitoring mechanisms that specifically address the impact of technology on inequalities. Monitor the impact on public health behaviours, particularly in relation to social groups who are more likely to encounter health and other forms of social inequalities.
  • Use the impact evidence to identify marginalised and disadvantaged communities and to establish strong public health services, interventions and social policies to support them.

To avoid creating or reinforcing global inequalities and tensions:

  • Harmonise global, national and regional regulatory tools and mechanisms to address global inequalities and tensions.

Governance and accountability

To ensure that individual rights and freedoms are protected:

  • Establish strong data governance frameworks and ensure regulatory bodies and clear sunset mechanisms are in place.
  • Create specific guidelines and laws to ensure technology developers follow privacy-by-design and ethics-by-design principles, and that effective monitoring and evaluation frameworks and sunset mechanisms are in place for the deployment of technologies.
  • Build clear evidence about the effectiveness of new technologies to make sure that their use is proportionate to their intended results.

To reverse the growing power imbalance between governments and the technology industry:

  • Develop the public sector’s technical literacy and ability to create technical infrastructure. This does not mean that the private sector should be excluded from developing technologies related to public health, but it is crucial that technical infrastructure and governance are effectively co-designed by government, civil society and private industry.

Effectiveness, public legitimacy, inequalities and accountability have varying definitions across disciplines. In this report we define them as follows:

 

Effectiveness: We define the effectiveness of contact tracing apps and digital vaccine passports in terms of the extent to which they positively affect public health, that is, result in decreasing the rate of transmission. We use a non-technocentric approach, distinguishing technical efficacy from effectiveness. Technical efficacy refers to a technology’s ability to perform a technical task (for example, a digital vaccine passport’s ability to generate a QR code to share data).

 

Public legitimacy: We define this in terms of public acceptance of using contact tracing apps and digital vaccine passports. We also focus specifically on marginalised and disadvantaged communities, whose opinions and experiences might differ from the dominant dispositions.

 

Inequalities: We investigate inequalities both within and across countries. We look at whether COVID-19 technologies create new or reinforce existing health and other types of societal inequalities for disadvantaged and vulnerable groups (for example, people who could not use COVID-19 technologies due to inadequate digital access and skills). We also examine their impact on global inequalities by focusing on inequalities of resources, opportunities and power between countries and regions (for example, around access to vaccine supply).

 

Accountability: We use this to refer to the regulation, institutions and mechanisms that hold governments and officials accountable for preserving civil rights and freedoms.

Introduction

The COVID-19 pandemic is the first global epidemic of ‘the algorithmic age’.[13] In response, hundreds of new technologies have been developed, to diagnose patients, identify vulnerable populations and conduct surveillance of individuals known to be infected.[14] Data and artificial intelligence (AI) have therefore played a key role in how policymakers and international and national health authorities have responded to the pandemic.

Digital contact tracing apps and digital vaccine passports, which are the focus of this report, are two of the most widely deployed new technologies. Although versions of contact tracing apps had previously been deployed in some countries, such as in Sierra Leone as part of the Ebola response, for most countries across the world this was their first experience of such technologies.[15]

These technologies differ from pre-existing state surveillance tools, such as CCTV, and from other types of technologies deployed in the context of the COVID-19 pandemic, such as machine learning algorithms that profile the risk of incoming travellers or predict infected patients at high risk of developing severe symptoms.[16]

To be effective, contact tracing apps and digital vaccine passports require public acceptance and cooperation, as individuals need to consent to share their health and other types of personal information and change their behaviour, for example, by showing evidence of health status to enter a venue via a digital vaccine passport, or by staying at home on receiving an exposure notification from a contact tracing app.[17]

These technologies are therefore at the crossroads of public emergency, health and surveillance and so have significant societal implications.

The emergence of contact tracing apps and digital vaccine passports resulted in public anxiety and resistance related to their effectiveness, legitimacy and proportionality, as well as concern about the implications for informed consent, privacy, surveillance, equality, discrimination and the role of technology in broader public health management.

These technologies were therefore high stakes: they were perceived as necessary but high-risk measures for dealing with the pandemic.

As the technologies brought together a range of highly sensitive data, they were a test of the extent of the public’s willingness to share sensitive personal data and to accept limits on freedoms and rights.

The technologies were developed and deployed to save lives, but in practice they both enabled and limited people’s individual freedoms, by scoring the risk they posed to others based on their health status, location or mobility data.

Despite the risks and sensitivities, due to the challenging conditions of the pandemic, they were created and implemented quickly, and without a clear consensus on how they should be designed, governed and regulated.

Countries adopted different approaches, and – while there are some commonalities across countries and dominant infrastructures – the technical choices, policies and practices were neither unified nor consistent. Frequent changes were made even at a regional level.

It was particularly challenging for countries with weaker technological infrastructures, financial capabilities or legislative frameworks to develop and deploy COVID-19 technologies. Even in countries with relatively comprehensive regulation, these technologies caused fresh concerns for human rights and civil liberties, as they intensified ‘top-down institutional data extraction’ across the world.[18]

Many critics correctly anticipated that such technologies would normalise surveillance via state ownership of sensitive data in a way that would persist beyond the pandemic.

This creates a complex picture, made more challenging by incomplete evidence on how the technologies were developed, used and governed – and, most importantly, on their impact on people, health, healthcare provision and society. It is therefore important to monitor their development, understand their impact and consider what legacy they might have as well as the lessons we can learn for the future.

A range of studies focus on aspects of contact tracing apps and digital vaccine passports at different stages of the pandemic. The Ada Lovelace Institute has monitored the evolution of these technologies over the last three years. However, compared with more traditional health technologies or policy interventions, there is a lack of in-depth research into them or evaluation of their effectiveness.

As the infrastructure is still in place in most countries, these technologies can easily be re-used or transformed into new technologies for new purposes. Therefore, these are live questions with tangible effects on people and societies.

By synthesising evidence from a cross-section of 34 countries, this report identifies cross-cutting issues and challenges, and considers what lessons we should learn from the deployment of COVID-19 technologies as examples of new and powerful technologies that have been embedded across society.

Scope and rationale of this report

In the first two years of the pandemic, from early 2020, the Ada Lovelace Institute conducted extensive research first on contact tracing apps and then on digital vaccine passports. This research focused on the technical considerations and societal implications of these new technologies and included public attitudes research, expert deliberations, workshops, webinars and evidence reviews.

To conduct this research, we engaged multidisciplinary experts from the fields of behavioural science, bioethics, ethics, development studies, immunology, law, public health and sociology. As well as analysing the technical efficacy of the technologies, this created a holistic picture of their legal, societal and public health implications.

We published nine reports based on our research, and two international monitors, which tracked policy and practice developments related to digital vaccine passports and contact tracing apps.

In this work, we acknowledged the potential of new data-driven technologies in the fight against COVID-19. However, we also identified the risks of rapid decision-making by governments and policymakers.

In most cases, there was not sufficient time or adequate research to consider and address the wide range of societal, political, legal and ethical risks. This led to significant challenges, related to effectiveness, public legitimacy, inequalities, and governance and accountability.

Risks and challenges of COVID-19 technologies contained in the Ada Lovelace Institute’s previous publications

When contact tracing apps and digital vaccine passports first emerged, we argued that governments and policymakers should pay attention to a wide range of risks and challenges when deploying these technologies.

From early 2020, the Ada Lovelace Institute – through reports, trackers and monitors – identified and warned about the risks of these technologies.[19]

The risks we identified and highlighted can be summarised as:

Effectiveness

  • Lack of resources to monitor effectiveness and impact. Impact monitoring and evaluation strategies were not developed, making it difficult to assess the effectiveness of the technologies. Digital vaccine passports and contact tracing apps were new technologies, developed and deployed at pace, so there was not enough time or resource to establish effective strategies and monitoring mechanisms to investigate their impacts on public health.
  • Undermining public health by treating a collective problem (public health) as an individual one (personal safety). This placed the emphasis on individualised risks or requirements, and greater health surveillance at an individual level. For example, these technologies categorise an individual as lower or higher risk based on their own test, vaccine or exposure status, rather than focusing on the more contextual risk of local infection in a specific area.
  • An increase in higher-risk behaviours due to the technologies fostering a false sense of security. Experts highlighted that COVID-19 technologies could create a false sense of security and discourage people from adhering to other protection measures that reduce the risk of transmission, for example, wearing a mask.[20]

Public legitimacy

  • Harming public trust in health data-driven technologies if they were not governed properly or were used for reasons other than health (for example, surveillance). Damaged public trust could make it difficult for governments to roll out new data-driven approaches and technologies to deal with public crises and in general.

Inequalities

  • Creating new forms of stratification and discrimination (for example, discrimination against unvaccinated people or those unable to access accepted vaccines or tests) or amplifying existing societal inequalities (for example, digital exclusion or poor access to healthcare).
  • Amplifying existing global inequalities and geopolitical tensions, particularly in the case of inequitable access to vaccines on a global level. Digital vaccine passport schemes required proof of vaccination for international travel or access to domestic activities (for example, entering a venue for a concert) across the world. This created the risk of a global race for vaccine supply, leaving many low- and middle-income countries scrambling for access.

Governance and accountability

  • Facilitating restrictions on individual liberty and increased surveillance. Members of the public were expected to use these powerful and potentially invasive technologies that collected and stored their personal data. These tools could therefore be used for surveillance, invading privacy or controlling individuals’ activities and mobility in general.
  • Repurposing individuals’ data for reasons other than health, for example, tracking dissidents’ activities, selling data to third parties for commercial purposes, etc.
  • Uncertainty and lack of transparency about private sector involvement and the risks of concentrating power and enabling long-term digital infrastructure that is reliant on private actors.[21]

Our reports made several recommendations for policymakers about how to mitigate these risks and challenges. As well as detailed recommendations for each technology, our cross-cutting recommendations covered the lifecycle of development and implementation.

Recommendations for policymakers made in previous Ada Lovelace Institute reports (2020–2022)

Effectiveness

  • Demonstrate the effectiveness of these technologies within the broader public health ecosystem, publishing modelling and testing; considering uptake and adherence to guidelines around these technologies (for example, reporting a positive COVID-19 test result, self-isolating on receiving an exposure notification or getting vaccinated); and publicly setting success criteria and outcomes and identifying risks and harms, particularly for vulnerable groups.

Public legitimacy

  • Build public trust through clear public communications and transparency. These communications should consider ethical considerations; establish clear legal guidance about permitted and restricted uses and mechanisms to support rights; and demonstrate how to tackle legal issues and enable redress (for example, by making a formal complaint in the case of a privacy breach).

Inequalities

  • Proactively address the needs of, and risks in relation to, vulnerable groups.
  • Work with international bodies to seek cross-border agreements and mechanisms to counteract the creation or amplification of global inequalities.

Governance and accountability

  • Ensure data protection by design to prevent data breaches or misuse.
  • Develop legislation with clear, specific and delimited purposes, and ensure clear sunset clauses for the technologies, and the legislation governing them.[22]

The focus of this research

The Ada Lovelace Institute’s original research in 2020 and 2021 focused on the conditions and principles required to safely deploy and monitor COVID-19 technologies.

By early 2022 many countries had deployed these technologies. Therefore, we shifted our focus and began investigating whether the risks and challenges we identified had materialised and, if so, what could be done differently in deploying technologies in the future.

As identified above, contact tracing apps and digital vaccine passports were deployed without consistent research and monitoring mechanisms. This contributed to a limited evidence base and meant that we needed to use a broad range of resources and research methods to develop this report (see Methodology).

Academic and grey literature provided valuable insights. This was supplemented by media and civil society coverage, for example of the repurposing of data collected through the contact tracing app Luca in Germany or the blocking of protests through the Health Code app in China.[23]

The evidence in this report includes qualitative and quantitative data related to the uses and impacts of COVID-19 technologies drawn from policy trackers, the media, policy papers, research papers and workshops convened with experts between January 2022 and December 2022.

To accompany the report, we have created the ‘COVID-19 Data Explorer: Policies, Practices and Technology’[24] to enable civil society organisations, researchers, journalists and members of the public to access the body of data.

The COVID-19 Data Explorer supports the discovery and exploration of policies and practices relating to digital vaccine passports and contact tracing apps across the world. The data on timelines, technologies and public response demonstrates the legacy and implications of their rapid deployment.

By using a wide range of resources, reviewing the existing evidence and identifying evidence gaps, we draw important cross-cutting lessons to inform policy and practice.

We synthesise the available evidence from a sample of 34 countries, with the aim of taking a macro view and identifying cross-cutting issues at an international level. The report contributes to the growing body of research on COVID-19 technologies, improving how we understand, investigate and build data-driven technologies for public good.

The evidence sources include:

  • the Ada Lovelace Institute’s previous work on contact tracing apps and digital vaccine passports in the first two years of the pandemic
  • academic and grey literature on digital vaccine passports, contact tracing apps and COVID-19 pandemic management, focusing on the 34 countries in our sample
  • government websites and policy papers
  • a workshop delivered by the Ada Lovelace Institute with cross-country experts, focusing on the effectiveness of contact tracing apps in Europe
  • papers submitted in response to the Ada Lovelace Institute’s international call for evidence on the effectiveness of digital vaccine passports and contact tracing apps
  • news media coverage of digital vaccine passports, contact tracing apps and pandemic management in the 34 countries in our sample.

See Methodology for more information on methods, sampling and resources.

Ada Lovelace Institute publications on COVID-19 technologies from 2020 to 2023[25]

  • Exit through the App Store? (April 2020): A rapid evidence review of the technical considerations and societal implications of using technology to transition from the first COVID-19 lockdown.
  • Confidence in a crisis? (August 2020): Findings of a public online deliberation project on attitudes to the use of COVID-19 technologies to transition out of lockdown.
  • Provisos for a contact tracing app (May 2020): A report that highlights the milestones that would have to be met by the UK Government to ensure the safety, equity and transparency of digital contact tracing apps.
  • COVID-19 digital contact tracing tracker (July 2020): A resource for monitoring the development, uptake and efficacy of global attempts to use smartphones and other digital devices for contact tracing.
  • No green lights, no red lines (November 2020): A report that explores the public perspectives on COVID-19 technologies and draws lessons to assist governments and policymakers when deploying data-driven technologies in the context of the pandemic.
  • What place should COVID-19 vaccine passports have in society? (February 2021): Findings from an expert deliberation on the potential roll-out of digital vaccine passports.
  • Public attitudes to COVID-19, technology and inequality (March 2021): A tracker summarising studies and projects that offer insights into people’s attitudes to and perspectives on COVID-19, technology and inequality.
  • The data divide (March 2021): Public attitudes research in partnership with the Health Foundation to explore the impacts of data-driven technologies and systems on inequalities in the context of the pandemic.
  • Checkpoints for vaccine passports (May 2021): A report on the requirements that governments and developers need to meet for any vaccine passport system to deliver societal benefit.
  • International COVID-19 monitor (June 2021): A policy and practice tracker that summarises developments concerning digital vaccine passports and COVID-19 status apps.
  • The rule of trust (July 2022): Principles identified by citizens’ juries to ensure that data-driven technologies are implemented in ways that the public can trust and have confidence in.

List of countries in our sample:

  1. Argentina (ARG)
  2. Australia (AUS)
  3. Brazil (BRA)
  4. Botswana (BWA)
  5. Canada (CAN)
  6. China (CHN)
  7. Germany (DEU)
  8. Egypt (EGY)
  9. Estonia (EST)
  10. Ethiopia (ETH)
  11. Finland (FIN)
  12. France (FRA)
  13. United Kingdom (GBR)
  14. Greece (GRC)
  15. India (IND)
  16. Israel (ISR)
  17. Italy (ITA)
  18. Jamaica (JAM)
  19. Kyrgyzstan (KGZ)
  20. South Korea (KOR)
  21. Morocco (MAR)
  22. Mexico (MEX)
  23. Nigeria (NGA)
  24. New Zealand (NZL)
  25. Romania (ROU)
  26. Russia (RUS)
  27. Saudi Arabia (SAU)
  28. Singapore (SGP)
  29. Tunisia (TUN)
  30. Türkiye (TUR)
  31. Taiwan (TWN)
  32. United States of America (USA)
  33. South Africa (ZAF)
  34. Zimbabwe (ZWE)

Contact tracing apps

Emergence

Contact tracing is an established disease control measure. Public health experts help patients recall everyone they have come into close contact with during the timeframe in which they may have been infectious. Contact tracing teams then inform exposed individuals that they are at risk of infection and provide them with guidance and information.[26]

In the early phase of the pandemic, the idea of building on this practice by digitising contact tracing quickly became prominent. With lockdowns contributing to social and economic hardships, the objective was to return to the pre-pandemic ‘normal’ as soon as possible, and the global consensus at the time was that vaccination would be the only long-term solution to achieve this.

While vaccines were being developed, many countries relied on contact tracing to break chains of infection so that they could ease pandemic restrictions such as lockdowns.

Research shows that contact tracing as a disease control measure reaches its full potential when carried out by trained public health experts, who are able to engage with patients and their contacts rapidly and sensitively.[27] However, many countries lacked adequate numbers of trained public health staff and resources (for example, testing capacity to detect contacts known to be infected) for this kind of manual tracking and isolation.[28] In this context, digital contact tracing offered the possibility of accelerating contact tracing.

Countries had varying approaches to contact tracing and the use of digital contact tracing technologies, depending on their existing infrastructure. South Korea, for example, established a national control tower that oversaw data collection and monitoring activities. This was built on existing smart city infrastructures which contained data collected from immigration records, CCTV footage, card transaction data and medical records.[29]

Research in South Africa highlights the state’s surveillance capabilities using mobile network systems and tracking internet users’ online activities.[30] South Africa used location information from mobile network operators to help contact tracing teams who ‘tracked and traced’ people infected with COVID-19 with no prior public announcement or consultation, although it later abandoned this approach.[31]

In Asia and Africa, digital contact tracing involved extensive collection of personal data through mass surveillance. In Europe and the USA, on the other hand, the idea of digital contact tracing through a mobile app on citizens’ smartphones began to be considered. Contact tracing apps were considered a lower-risk alternative to the mass surveillance tools adopted in Asia and Africa.

The idea of building contact tracing apps eventually gained momentum not only in Europe and the USA but across the world. Governments needed to consider the technical infrastructure, efficacy and purpose of this new technology, and the related benefits, risks and harms.

As early research from the Ada Lovelace Institute showed, public legitimacy and trust were critical for these technologies to work effectively.[32] Members of the public had to use contact tracing apps in the way intended by governments and technology companies, such as by uploading their health information if diagnosed with COVID-19 or isolating after being informed they had had close contact with someone known to be infected with COVID-19. This was particularly challenging for countries and regions with low levels of digital access and skills.[33]

To support public trust, contact tracing apps needed to be built using established best-practice methods and principles, and uses of the technology and data had to be controlled through strong regulation. If the data were to be repurposed, such as for surveillance purposes, it could damage public trust in the government, limiting the effectiveness of using COVID-19 technologies to deal with public crises in the future.

Despite these challenges, many countries across the world deployed contact tracing apps at pace in 2020.[34] In this chapter, we outline the various technical approaches and infrastructure behind contact tracing apps to build understanding of the different debates and concerns around them. We then assess their effectiveness, public legitimacy, impact on inequalities and governance.

Types of contact tracing apps

Contact tracing apps can be divided into two types: centralised or decentralised. This determines where data is stored and who can access it.[35]

Table 1: Design approaches for contact tracing apps

Centralised system approach
  • How is data generated, stored and processed? Users’ data is generated, stored and processed on a central server operated by public authorities.
  • Who can access the data? Public authorities have access to the data. They score users according to their risk and decide which users to inform. For example, if person x has been in close proximity to person y, who is known to be infected with COVID-19, public authorities will be able to identify x and contact them.

Decentralised system approach
  • How is data generated, stored and processed? Users’ data is generated, stored and processed on users’ mobile phones. The data gathered through mobile phones can also be shared with a backend server, which is responsible for storing, processing and communicating data.
  • Who can access the data? Decentralised contact tracing systems use arbitrary identifiers (for example, a set of numbers and letters) rather than direct identifiers (for example, an IP address). Hence, even when public authorities access the data on a backend server, they cannot identify users or reconstruct their locations and social interactions.[36]

 

There are three main technologies that are used in both centralised and decentralised systems to detect and trace users’ contacts and estimate their risk of infection.

Table 2: Technologies of contact tracing apps

How do apps decide if a user has been in contact with a person known to be infected?

Bluetooth exposure notification system
  • This approach is based on proximity tracing: determining whether two individuals were near each other in a particular context for a specific duration.[37] Contacts are identified through Bluetooth technology on mobile phones. By giving permission for contact tracing apps to use their smartphone’s Bluetooth function, users allow the app to track real-time and historical proximity to other smartphones using the app. The app will share an infection alert if a user has been in proximity to a person who is known to be infected with COVID-19.
  • Contact tracing apps based on Bluetooth technology are also referred to as exposure notification apps.

Location (GPS) data
  • This approach is based on location: contact tracing apps use the mobile device’s location (GPS) feature to identify contacts who have been in the same location as a person who is known to be infected with COVID-19.

QR code
  • This approach is based on presence tracing: determining whether two individuals were present at the same time in a venue where infection could have taken place.[38] Users scan a QR code with their smartphone on entry to venues. If a user who is known to be infected with COVID-19 uploads this information to the app, other users who have scanned the same QR code are notified.
  • New Zealand incorporated Near Field Communication (NFC) codes as an alternative to QR codes in the NZ COVID Tracer app. NFC is a technology that allows two devices to connect through proximity. NFC codes work by tapping mobile phones on or near NFC readers, in the same way that contactless credit cards, Google Pay and Apple Pay work by tapping on or near card readers.[39]
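As a concrete illustration of the Bluetooth approach in Table 2, the sketch below (in Python) shows how an exposure notification app might decide whether to alert a user: the phone keeps a local log of identifiers it has overheard, then checks them against identifiers later published for users who reported a positive test. The data layout, signal-strength shortcut and 15-minute threshold are assumptions made for illustration, not the logic of any real app.

```python
from dataclasses import dataclass

@dataclass
class Sighting:
    """One Bluetooth observation made by the user's phone."""
    ephemeral_id: str      # rotating identifier broadcast by a nearby phone
    minutes: int           # how long the two phones stayed in range
    close_proximity: bool  # e.g. inferred from signal strength (assumption)

def total_exposure_minutes(sightings: list[Sighting], infected_ids: set[str]) -> int:
    """Sum time spent in close proximity to identifiers later reported as infected."""
    return sum(s.minutes for s in sightings
               if s.close_proximity and s.ephemeral_id in infected_ids)

def should_notify(sightings: list[Sighting], infected_ids: set[str],
                  threshold_minutes: int = 15) -> bool:
    # In a decentralised design this check runs on the user's own phone,
    # so the health authority never learns who was near whom.
    return total_exposure_minutes(sightings, infected_ids) >= threshold_minutes

# Example: two sightings of an identifier later published as belonging to an infected user.
sightings = [Sighting("a1b2", 10, True), Sighting("a1b2", 8, True), Sighting("zz99", 30, True)]
print(should_notify(sightings, infected_ids={"a1b2"}))  # True (18 minutes >= 15)
```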

When contact tracing apps were being considered for development, many countries were enthusiastic about deploying apps with a centralised system approach, which stores the data of app users on a central server.

Supporters of this centralised approach argued that access to data would give epidemiologists and health authorities valuable information for analysis. However, many privacy, data security and human rights researchers and activists highlighted the risks created by user data being accessible to third parties through a centralised server. These risks included privacy infringements, data repurposing and increased surveillance.

In this context, proposals emerged for technical protocols that would enable decentralised contact tracing, designed to be ‘privacy preserving’ by enabling users’ data to be stored on their mobile smartphones rather than on a centralised server.

Several decentralised protocols emerged in April 2020, including the open protocol DP-3T (Decentralized Privacy-Preserving Proximity Tracing), PEPP-PT (Pan-European Privacy-Preserving Proximity Tracing) and the Apple/Google Exposure Notification protocol (GAEN API).
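The privacy case for these decentralised protocols rests on phones broadcasting short-lived, unlinkable identifiers derived from a secret key that never leaves the device unless the user chooses to report a positive test. The sketch below illustrates that idea in Python; it is a simplification loosely inspired by DP-3T-style designs, not the actual DP-3T or GAEN specification, and the key size, derivation function and 15-minute rotation interval are assumptions.

```python
import hashlib
import hmac
import os

def new_daily_key() -> bytes:
    """A fresh secret key generated on the phone each day; it stays on the device
    unless the user chooses to report a positive test."""
    return os.urandom(32)

def ephemeral_id(daily_key: bytes, interval: int) -> bytes:
    """Derive a short-lived broadcast identifier for a given time window
    (for example, one identifier per 15-minute window)."""
    return hmac.new(daily_key, interval.to_bytes(4, "big"), hashlib.sha256).digest()[:16]

# Phones broadcast only these 16-byte identifiers over Bluetooth. Without the daily
# key, an observer (or a backend server) cannot link them to a person or to each other.
key = new_daily_key()
broadcasts = [ephemeral_id(key, i) for i in range(96)]  # 96 x 15-minute windows in a day

# If the user later reports a positive test, they upload their daily keys; other phones
# re-derive the identifiers locally and check them against their own sightings.
rederived = [ephemeral_id(key, i) for i in range(96)]
assert broadcasts == rederived
```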

In our research, we collected evidence about the system approaches of contact tracing apps in 25 countries.[40] We discovered that 15 out of 25 countries used a decentralised system approach. Not all of these 15 countries based their decision on the privacy-preserving nature of that infrastructure.

The Apple/Google protocol quickly became the dominant decentralised protocol, because of the control exercised by the platforms over the two main smartphone operating systems (iOS and Android, respectively).

The Apple/Google protocol gained dominance in part because centralised contact tracing apps could not perform well on Google and Apple’s operating systems[41] without the platforms making technical changes to these systems, which they refused to do because of concerns about users’ privacy.[42]

The centralised contact tracing apps of Australia and France, for example, had major technical problems.[43] In June 2020, France’s junior minister for digital affairs highlighted that the poor technical efficacy of France’s centralised app had led to decreased public confidence in the app, stating: ‘There has been an upward trend in uninstalling over the last few days, to the tune of several tens of thousands per day’.

Similarly, Australia’s contact tracing app, which combined Bluetooth technology with a centralised server approach, identified only 17 contacts in two years that had not already been found through manual contact tracing.

This caused tensions between technology companies and governments that wanted to use centralised systems with Bluetooth technology, which was considered less invasive of privacy than collecting geographical location data. Countries such as the UK and Germany, which initially pursued centralised apps independently of the Apple/Google protocols, eventually had to deploy the GAEN API to enable their Bluetooth notification systems to work effectively.[44]

In some cases, the distinction between centralised and decentralised systems was blurred. There are decentralised contact tracing systems that centralise information, if users voluntarily upload data.

For example, Singapore’s Bluetooth exposure notification app is decentralised in that it does not store users’ data on a central server. However, when users sign up for TraceTogether, they provide their phone number and ‘unique identification number’ (a government ID used for a range of activities).

If a user is known to be infected with COVID-19, they can grant the Ministry of Health access to their Bluetooth proximity data. This allows the ministry to identify people who have had close contact with the infected app user within the last 25 days, so it follows a more centralised model at that point.[45]

The developers emphasised that they built this ‘hybrid model of decentralised and centralised approach specifically for Singapore’.[46] Similarly, Ireland’s COVID Tracker allows users to upload their contact data, age, sex and health status to a centralised data storage server.[47] There are also apps that use both GPS data and a Bluetooth exposure system, such as India’s Aarogya Setu.

QR codes were also widely used in contact tracing apps, especially those with Bluetooth exposure notification systems, such as the UK’s NHS COVID-19 app.

  • Romania, the USA, Russia and Greece are the only countries in our sample that did not launch a national contact tracing app.[48]
  • India, Ghana, South Korea, Türkiye, Israel and Saudi Arabia used both Bluetooth and location data with a centralised approach.[49]
  • Estonia, France, Finland, Canada, India and Australia discontinued their contact tracing apps and deleted all of the data gathered and stored through them.[50] England and Wales also closed down their contact tracing app NHS COVID-19, and the personal data collected was deleted, but anonymous analytical data may be retained for up to 20 years.[51]
  • Several contact tracing apps were expanded to include vaccine information – for example, Italy’s Immuni app, Türkiye’s Hayat Eve Sığar (HES; Life Fits into Home) app and Singapore’s TraceTogether (TT) app.
  • The USA did not have a federal contact tracing app. MIT Technology Review’s COVID Tracing Tracker demonstrates that only 19 states out of 50 had rolled out contact tracing apps as of December 2020, and to the best of our knowledge no contact tracing app was developed in the USA after this date.[52]

Effectiveness of contact tracing apps

In April 2020, the Ada Lovelace Institute published the rapid evidence review Exit through the App Store?[53] This report explored the technical and societal implications of a variety of COVID-19 technologies, including contact tracing apps. The review acknowledged that, given the potential of data-driven technologies ‘to inform research into the disease, prevent further infections and support the restoration of system capacity and the opening up of the economy’, it was right for governments to consider their use.

However, we urged decision-makers to consider the lack of scientific evidence demonstrating the potential efficacy and impact of contact tracing apps. And we pointed out that there had not been adequate time or resources to establish effective strategies and monitoring mechanisms to investigate their impacts on public health.

We emphasised that lack of credible evidence supporting the apps’ effectiveness could undermine public trust and hinder implementation due to low uptake.

Since then, a considerable number of studies have emerged investigating the effectiveness of contact tracing apps. This body of literature offers four key findings:

  • Some Bluetooth exposure notification apps with decentralised systems have been effective in identifying and notifying close contacts of people known to be infected with COVID-19, for example the UK’s NHS COVID-19 app.[54] However, the technical efficacy of this kind of system cannot be generalised at an international level. The evidence from South Africa and Canada, for example, indicates technical problems, including insufficient Bluetooth accuracy and smartphone batteries being quickly drained.[55] Such technical issues affected the apps’ ability to identify and notify close contacts of people who were known to be infected with COVID-19.
  • Apps with centralised systems and Bluetooth exposure notification systems, which were not compatible with Google and Apple’s GAEN API, had significant technical problems. This reduced their ability to identify close contacts.[56] For example, France’s contact tracing app had sent only 14 notifications after 2 million downloads as of June 2020.[57]
  • Low uptake of contact tracing apps reduced their effectiveness in some countries, for example in Australia.[58] This is because the proportion of potentially exposed people who actually receive an exposure notice and stay at home is, by definition, lower if fewer people are using the app overall: a notification requires both the infected person and their contact to have the app, so under simple mixing assumptions the share of contact pairs covered falls roughly with the square of uptake (see the sketch after this list).
  • Contact tracing apps were insufficiently integrated with government services and public health systems. An investigation of the effectiveness of contact tracing apps from a public health perspective in six countries found that apps did not reach their full potential, due to inadequate testing capacity and poor data sharing across local and central government authorities.[59]
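The sketch referenced in the third bullet above is set out here. It is a back-of-the-envelope calculation only: it assumes app users and non-users mix randomly, which real populations do not, so the figures are an order-of-magnitude intuition for why low uptake erodes effectiveness rather than an empirical estimate.

```python
# Illustrative only: assumes app users and non-users mix randomly.
def pair_coverage(uptake: float) -> float:
    """Fraction of contact pairs in which both people have the app."""
    return uptake ** 2

for uptake in (0.1, 0.2, 0.4, 0.6):
    print(f"{uptake:.0%} uptake -> ~{pair_coverage(uptake):.0%} of contact pairs covered")
# 10% uptake -> ~1%, 20% -> ~4%, 40% -> ~16%, 60% -> ~36%
```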

However, there are still important evidence gaps which prevent us from definitively assessing the effectiveness of contact tracing apps.

To explore these gaps, we organised a multidisciplinary workshop with experts from the USA and Europe in October 2022 to discuss the effectiveness of contact tracing apps. The findings from the workshop (listed below) demonstrate the limitations of the evidence.

It was clear that there is still no consensus on what effectiveness means beyond apps’ technical efficacy. How can we define people-centred effectiveness?

Research is also limited on how contact tracing apps affected individual behaviours that would have supported wider public health measures: for example, whether users self-isolated after a COVID-19 exposure notification. The existing evidence is limited in both sample size and scope,[60] because (to date) people’s real-life experiences of contact tracing apps have received little research attention.

A Digital Global Health and Humanitarianism Lab (DGHH Lab) investigation of contact tracing apps provides a useful framework for how further research should evaluate people’s real-life experiences of such apps. The investigation looks at people’s opinions and experiences of contact tracing apps in five countries: Cyprus, Iceland, Ireland, Scotland and South Africa.[61] It concludes that user engagement with the apps should be seen in four stages (a short illustrative sketch follows the list):

  1. Uptake (users download the app).
  2. Use (users run the app and keep it updated).
  3. Report (users report a positive COVID-19 diagnosis via the app).
  4. React (users follow the necessary next steps when they receive an exposure notification from the app).[62]
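One way to read these four stages is as a funnel in which each stage loses users, which is consistent with figures cited later in this chapter (for example, Protect Scotland’s two million downloads against roughly 950,000 active users). The sketch below (in Python) models such a funnel; the stage names follow the list above, but the later-stage counts are invented purely for illustration.

```python
from enum import Enum

class Stage(Enum):
    UPTAKE = 1   # users download the app
    USE = 2      # users run the app and keep it updated
    REPORT = 3   # users report a positive diagnosis via the app
    REACT = 4    # users follow next steps after an exposure notification

def funnel_shares(counts: dict[Stage, int]) -> dict[Stage, float]:
    """Share of people who downloaded the app that reach each later stage."""
    downloads = counts[Stage.UPTAKE]
    return {stage: n / downloads for stage, n in counts.items()}

# UPTAKE and USE echo the Protect Scotland figures cited below; REPORT and REACT
# are invented numbers, purely to show how uptake alone can overstate engagement.
example = {Stage.UPTAKE: 2_000_000, Stage.USE: 950_000,
           Stage.REPORT: 60_000, Stage.REACT: 40_000}
for stage, share in funnel_shares(example).items():
    print(f"{stage.name:<6} {share:.1%}")
```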

Uptake alone does not guarantee continued use and change in behaviour (for example, getting tested or staying at home when notified of an exposure). The stage-based approach should therefore guide our understanding of individuals’ actual, ongoing usage of COVID-19 technologies.

Several studies demonstrate that uptake does not guarantee continued use. In France, for example, only a minority of users of the TousAntiCovid (Everyone Against COVID, formerly StopCovid) app used the contact tracing feature.

BBC News reported that although two million people downloaded the Protect Scotland app, only 950,000 people actively used it, and that around 50,000 people stopped using it a few months after its launch.[63] Similarly, there is evidence that millions of people who downloaded the NHS COVID-19 app (used in England and Wales) never technically enabled it, so despite having an intention to engage with it, they did not use it in practice.[64]

This evidence does not suggest that contact tracing apps were completely ineffective. But it challenges us to consider why people did not use the apps as anticipated by policymakers and developers.

Exploring this will help ensure that contact tracing apps and similar health technologies reach their full potential in the future.

A research study on the UK’s contact tracing apps demonstrates that some people also stopped using the apps after a while because they lost confidence in their effectiveness.[65] Similarly, the Government of Canada’s evaluation of the COVID Alert app notes that its perceived lack of effectiveness among the public led to fewer downloads and less continued usage, which prevented the app from reaching its full potential.[66]

These findings demonstrate that more research is needed to investigate people’s views and practices in relation to contact tracing apps in real-life contexts and over time. This will help review the apps’ effectiveness, not just technically but in terms of outcomes for people and society.

How did different technologies, policies and public communications impact public attitudes when the apps were first deployed and over time?

We need more comparative evidence to understand how different technologies, policies and public communication strategies impacted public attitudes. The existing evidence, despite its limitations, indicates the importance of comparative research.

For example, there is an important distinction between tracing apps (location GPS data) and exposure notification apps (Bluetooth technology), in terms of the risks and challenges they pose. Yet there is no adequate research into how the public perceives the respective risks and effectiveness of these two different types of contact tracing apps.

A qualitative research study with 20 users of Canada’s COVID Alert app confirms the significance of this evidence gap. It demonstrates that participants favoured the app’s decentralised approach over centralised systems because of the higher level of privacy protection and optional level of cooperation.[67] The research also finds that users’ motivation to notify the app if known to be infected with COVID-19, and to follow government guidelines, increases with their understanding of the purpose and technical functionality of the app.

A limitation of the evidence base is that existing research largely investigates contact tracing apps in the first year of the pandemic. There is a need to understand their success and effectiveness in the context of the changing nature of the pandemic. This will help us understand how people’s confidence in the apps’ effectiveness, and their usage practices, changed over time.

Our recommendation when contact tracing apps emerged in 2020:

  • Establish the effectiveness of contact tracing apps as part of a wider pandemic response strategy.[68]

 

In 2023, the evidence on the effectiveness of the various apps can be summarised as follows:

  • Countries did not decide what effectiveness would look like when rolling out these apps.
  • Contact tracing apps have demonstrated that digital contact tracing is feasible. Some decentralised contact tracing apps with Bluetooth technology worked well, in that they demonstrated technical efficacy (for example, the NHS COVID-19 app in England and Wales[69]). However, the technical efficacy of decentralised Bluetooth exposure notification systems cannot be generalised at an international level. The evidence from South Africa and Canada, for example, indicates technical problems.
  • Apps with centralised systems and Bluetooth exposure notification systems, which were not compatible with Google and Apple’s GAEN API, had significant technical problems. This negatively impacted their ability to identify and notify close contacts (for example, in France).
  • Existing research and expert opinion indicate that the apps were not well integrated within broader public health systems and pandemic management strategies, which negatively impacted their effectiveness.
  • The impact of contact tracing apps on public health is unclear because significant evidence gaps remain that prevent understanding of their impact on public health behaviours at different stages of the pandemic. There is also a lack of clear evidence around how different technologies, policies and public communications have affected public attitudes towards the apps.

 

Lessons learned:

To build evidence around the effectiveness of contact tracing apps as part of the wider pandemic response strategy:

  • Support research and learning efforts on the impact of contact tracing apps on people’s health behaviours.
  • Understand how the apps’ technical properties, and different policies and implementation approaches, impact on people’s experiences of contact tracing apps in specific socio-cultural contexts and across geographic areas.
  • Use this impact evaluation to help set standards and strategies for the future use of technology in public crises. Weigh up digital tools’ benefits and harms by considering their role within the broader COVID-19 response and comparing them with non-digital interventions (for example, manual contact tracing).

 

To ensure the effective use of technologies in future pandemics:

  • Invest in research and evaluation from the outset, and implement a clear evaluation framework to build evidence during deployment that supports understanding of the role that COVID-19 technologies play in broader pandemic health strategies.
  • Define criteria for effectiveness using a human-centred approach that goes beyond technical efficacy and builds an understanding of people’s experiences.
  • Establish how to measure and monitor effectiveness by working closely with public health experts and communities, and set targets accordingly.
  • Carry out robust impact assessments and evaluation of technologies, both when first deployed and over time.

Public legitimacy of contact tracing apps

When they first emerged, we argued that public legitimacy was key to the success of contact tracing apps.

Members of the public were more likely to use the apps and follow the guidelines (for example, self-isolating after receiving a notification) if they trusted the technology’s effectiveness and believed that adequate regulatory mechanisms were in place to safeguard their privacy and freedoms.[70]

We also demonstrated that public support for contact tracing apps was contextual: people had varying views and experiences of the apps depending on how they were implemented locally (for example, whether uptake was mandatory or voluntary).[71]

In countries where contact tracing app use was mandatory, members of the public had to use them even if they did not think that they were legitimate technologies. For example, in China, the Health Code app was automatically integrated into users’ WeChat and Alipay, so that they could only deactivate the COVID-related functionality by deleting these applications.[72]

These applications are widely used, as smartphone-based digital payment is the main method of payment in China.[73] The app was therefore mandatorily assigned to 900 million users (out of a population of 1.4 billion) in over 300 cities, using pre-existing legal mechanisms to justify and enforce the policy (for example, the Novel Coronavirus Pneumonia Prevention and Control Plans).[74]

The Health Code app was not the only automatically assigned technology across China. Cities and regions required their residents to use multiple technologies depending on their own local COVID-19 pandemic measures and mechanisms; however, there is not much information regarding local authorities’ administration of these technologies. Similarly, it was not always clear which government department had ultimate authority for oversight and enforcement.[75]

In the majority of the countries in our sample, contact tracing apps were voluntary. People were not obliged through legislation to use them, and only did so if they believed in their effectiveness and had the resources to adopt them and adhere to guidelines.

Seen through this lens, contact tracing apps can be taken as a test of public acceptance of powerful technologies that entail sensitive data and are embedded in everyday life.

A study that investigated voluntary contact tracing app adoption in 13 countries found that the adoption rate was 9% on average.[76] In 2020, the Ada Lovelace Institute conducted an online public deliberation project on the UK Government’s use of the NHS COVID-19 contact tracing app to transition out of lockdown.[77] This research demonstrated that the public demanded clarity on data use and potential risks as well as independent expert review of the technology’s efficacy. Since then, there has been a boom in research into public attitudes to contact tracing apps that confirms this point.

This research demonstrates the reasons for low levels of public support for contact tracing apps. These include low levels of trust in government and concerns about apps’ security and effectiveness, leading to low adoption (or high rates of people discontinuing use) in some countries, for example, Australia, France and South Africa.[78]

While we do not have in-depth insights about public support for apps in the countries where uptake was mandatory, recent developments in China demonstrate people’s dissatisfaction with the Health Code app and the restrictions it enabled. When the Chinese government ended the Health Code mandate in December 2022, many people shared celebratory content on social media platforms.

Some of this content suggested that people were happy to make decisions and take precautions for themselves rather than rely on the Health Code algorithm.[79] A considerable number of privacy and human rights law experts were explicitly critical of the Health Code system (both its use in general and its use beyond the height of the pandemic) and urged the Chinese government to discontinue its use beyond the COVID-19 pandemic.[80]

Experts emphasise the importance of effective public communication strategies in pandemic management.[81] The existing research demonstrates that many governments across the world have not been able to communicate scientific evidence effectively, particularly to address vaccine hesitancy and misinformation.[82] This finding includes communications around digital interventions.

Research undertaken in the UK shows that the public do not have a clear understanding of the technical capabilities and uses of COVID-19 technologies.

When asked about digital contact tracing apps, participants in the research imagined these apps ‘being able to “see” or “visualise” their every move’.[83]

This indicates a misunderstanding (or lack of knowledge) regarding the apps’ infrastructure. Contact tracing apps in the UK are built on the GAEN API using Bluetooth technology, so they do not collect geo-location data and are not able to track users’ location in the literal sense of knowing where a user is at a given point in time.
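To illustrate why such apps cannot track location, the sketch below shows the general pattern of a decentralised, Bluetooth-based exposure notification system in the spirit of the GAEN design: phones broadcast short-lived random identifiers derived from a key held on the device, record the identifiers they encounter, and match them locally against keys voluntarily published by people who test positive. The function names, key-derivation scheme and interval counts are simplified assumptions for illustration, not the actual GAEN protocol.

```python
# A minimal conceptual sketch of decentralised, Bluetooth-style exposure
# notification. The key-derivation scheme, interval counts and function names
# are simplified assumptions, not the real GAEN protocol.
import hashlib
import secrets


def new_daily_key() -> bytes:
    # Each phone generates a fresh random key per day; it stays on the device
    # unless the user tests positive and chooses to share it.
    return secrets.token_bytes(16)


def rolling_ids(daily_key: bytes, intervals: int = 144) -> list:
    # Derive short-lived, random-looking identifiers from the daily key.
    # Only these identifiers are broadcast over Bluetooth; they contain no
    # location or identity information.
    return [
        hashlib.sha256(daily_key + i.to_bytes(2, "big")).digest()[:16]
        for i in range(intervals)
    ]


# Phone A belongs to a user who will later test positive.
key_a = new_daily_key()

# Phone B locally records the identifiers it 'hears' nearby; no server is involved.
heard_by_b = set(rolling_ids(key_a)[10:20])          # an encounter with Phone A
heard_by_b |= set(rolling_ids(new_daily_key())[:5])  # a passing stranger

# When A tests positive, only A's daily key is added to a public list.
published_positive_keys = [key_a]

# Phone B re-derives identifiers from the published keys and checks them against
# its own local log. The match happens on the device: the server never learns
# who met whom, or where.
exposed = any(
    identifier in heard_by_b
    for key in published_positive_keys
    for identifier in rolling_ids(key)
)
print("Exposure notification:", exposed)  # True
```

The contrast with a centralised design is that, in the latter, encounter logs must be uploaded to a server where the matching takes place, which is what made decentralised systems more privacy-preserving but harder for health authorities to monitor.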

In Europe, Bluetooth technology has been widely used instead of geo-location data.[84] However, the perceived risk of surveillance and literal tracking has been a public concern in the majority of European countries, especially among social groups with lower levels of trust in government.[85] Similar evidence exists for South Africa, where the lack of focused and targeted communications reduced public trust, and the COVID Alert SA app was not widely used by members of the public.[86]

Perhaps an exception within our sample is Canada, which established an extensive communications campaign to increase awareness and understanding of the COVID Alert app.[87] Health Canada, the government department responsible for national health policy, spent C$21 million on this campaign to encourage Canadians to download and use the app.[88]

The official evaluation of the app published by Health Canada and the Public Health Agency of Canada concludes that these campaigns resulted in millions of downloads.[89] This evidence demonstrates the importance of effectively communicating the apps’ purpose and technical infrastructure to members of the public.

Existing political structures and socio-economic inequalities were also important in determining uptake. In many parts of the world, structural factors and inequalities mean that marginalised and disadvantaged communities are more likely to distrust the government, institutions and public health advice.[90]

It is unsurprising that these groups were less likely to use contact tracing apps. There is strong online survey research evidence from the UK that confirms this point, in an investigation of the adoption of and attitudes towards the NHS COVID-19 app:

  • 42% of Black, Asian and minority ethnic respondents downloaded the app compared with 50% of white respondents
  • 13% of Black, Asian and minority ethnic respondents downloaded then deleted the app compared with 7% of white respondents
  • Black, Asian and minority ethnic respondents were more concerned about how their data would be used and felt more frustrated as a result of a notification from the app than white respondents
  • Black, Asian and minority ethnic respondents had lower levels of trust in the National Health Service (NHS) and were less likely to download the app to help the NHS.[91]

Our recommendations when contact tracing apps emerged:

  • Build public trust by publicly setting out guidance and enacting clear law about permitted and restricted uses. Explain the legal guidance and mechanisms to support rights through clear public communications and transparency.
  • Ensure users understand an app’s purpose, the quality of its evidence, its risks and limitations, and users’ rights, as well as how to use the app.[92]

 

In 2023, the evidence that has emerged on the public legitimacy of contact tracing apps demonstrates these points:

  • Public acceptance of contact tracing apps depended on public trust in apps’ effectiveness and in governments and institutions, as well as the safeguard mechanisms in place to protect privacy and individual freedoms.
  • Individuals and communities who encounter structural inequalities were less likely to trust in government institutions and the public health advice they offered. Hence, they were less likely than the general population to use contact tracing apps.
  • Governments did not always do well at communicating with the public about the properties, purpose and legal mechanisms of contact tracing apps. This negatively impacted public legitimacy, since governments could not gain public trust in the safety and effectiveness of the apps.

 

Lessons learned:

To achieve public legitimacy for the use of technology in future pandemics:

  • Reinforce the need to build public trust by publicly setting out guidance and enacting clear law about permitted and restricted uses. Explain the legal guidance and mechanisms to support rights through clear public communications and transparency.
  • Effectively communicate the purpose, governance and properties of contact tracing technologies to the public.

Inequalities

The international evidence concerning the impact of COVID-19 on communities demonstrates higher infection and mortality rates among the most disadvantaged communities.

It highlights the intersections of socio-economic, ethnic, geographical, digital and health inequalities, particularly in unequal societies and regions.[93]

The introduction of contact tracing apps led to concerns that they could widen health inequalities for vulnerable and marginalised individuals in society (for example, around digital exclusion and poor access to healthcare). In this context, we called on governments to carefully consider the potential negative social impacts of contact tracing apps, especially on vulnerable and disadvantaged groups.[94]

As part of pandemic management, policymakers and technology companies developed and adopted new technologies rapidly. This left insufficient room to discuss questions about equality and impact, such as whether contact tracing apps would benefit everyone in society equally, who might not be able to benefit from them, and what the alternatives were for those individuals and communities.

There was a surge in techno-solutionism – the view that technologies can solve complex real-world problems – during the pandemic. As Marelli and others (2022) argue, ‘the rollout of COVID interventions in many countries has tended to replicate a mode of intervention based on “technological fixes” and “silver-bullet solutions”, which tend to erase contextual factors and marginalize other rationales, values, and social functions that do not explicitly support technology-based innovation efforts’.[95]

This meant that non-digital interventions that could perhaps have benefited marginalised and disadvantaged communities – particularly manual contact tracing – were not adequately considered.

Research shows that contact tracing as a disease control measure, if effectively conducted in a timely way, can save lives, particularly for disadvantaged and marginalised communities.[96]

Manual contact tracing teams should ideally be trained to help individuals and families to access testing, identify symptoms, and secure food and medication when isolating. This type of in-depth case investigation and contact tracing requires knowing and effectively communicating with communities, which cannot be done via a mobile application.

Some contact tracing apps recognised this need and attempted to incorporate a manual function. COVID Tracker Ireland, for example, offered users the option of providing a phone number if they wanted to be contacted by public health staff.[97] This is important because it gives contact tracers the opportunity to contact people who are known to be infected with COVID-19 and address their needs.

However, it was unclear how these apps were intended to work alongside manual contact tracers, since it is a core function of the majority of contact tracing apps that they inform individuals of exposure directly, with no involvement from public health staff.[98]

This raises the question of whether digital contact tracing was carried out at the expense of other health interventions (most notably, manual contact tracing) and led to the needs of particular individuals and families not being sufficiently considered.[99]

Furthermore, contact tracing apps’ success relies on the assumption that people will self-isolate if notified as a contact of someone who has tested positive for COVID-19. Yet as Landau, the author of People Count: Contact-Tracing Apps and Public Health, argues: ‘the privilege of staying at home is not evenly distributed’.[100]

While some people were able to work from home, many were not and therefore did not have the opportunity to self-isolate if notified of exposure. This shows that technologies cannot work efficiently in isolation and must be supported by strong social policies.

In some countries, governments introduced financial support for those who were ill or self-isolating. In the UK, for example, the Government enabled citizens to claim a payment if notified by the NHS COVID-19 app.[101] But a report by the Nuffield Foundation and the Resolution Trust found that the financial support given by the Government during the pandemic covered only a quarter of workers’ earnings.[102]

For health technologies such as contact tracing apps to result in changes in behaviour, policymakers need to address structural factors and inequalities that affect disadvantaged groups.

Similarly, people who did not have adequate digital access and skills were not able to use contact tracing apps, even if they wanted to. And these apps were particularly challenging for countries with low levels of internet access, such as South Africa and Nigeria.[103]

Our recommendation when contact tracing apps emerged:

  • Proactively address the needs of, and risks relating to, vulnerable groups.[104]

 

In 2023, the evidence on the impact of contact tracing apps on inequalities demonstrates these points:

  • The rapid introduction of apps caused concerns that they would widen health inequalities for vulnerable and marginalised individuals in society (for example, those who are digitally excluded or have poor access to healthcare) who would not be able to benefit from them.
  • The evidence is unclear around the impact of contact tracing apps on health inequalities and whether authorities produced effective non-digital solutions and services for marginalised and disadvantaged communities.
  • Marginalised and disadvantaged communities (for example, those facing digital exclusion or lacking the financial security to self-isolate) were less likely to use contact tracing apps. To increase their adoption, they had to be supported with non-digital solutions and public services (for example, with manual contact tracing or financial support).

 

Lessons learned:

To mitigate the risk of increasing inequalities when using technology in future pandemics:

  • Consider and monitor the impact of technologies on disadvantaged and marginalised communities. These communities may not benefit from technological solutions as much as the general population, which might increase health inequalities.
  • Mitigate the risk of increasing (health) inequalities for these groups by establishing non-digital services and policies that will help them use the technologies and adhere to guidelines (for example, providing financial support for those who cannot work from home).

Governance, regulation and accountability

In deciding to introduce contact tracing apps, governments had to consider trade-offs between human rights and public health interests, because the apps used sensitive personal information and determined the freedoms and rights of individuals.

In the early stages of the pandemic, the Ada Lovelace Institute recommended that if governments wanted to build contact tracing apps, they should ensure that these new tools were governed by strong regulations and oversight mechanisms. We argued that contact tracing apps should be designed and governed in line with data protection and privacy principles.[105]

We acknowledge that these principles are not universal but are informed by political, cultural and social values. But they are underpinned by an international framework that informs the legal protection of human rights around the world.[106] It is beyond the scope of this report to evaluate country-specific laws. But the evidence we have uncovered suggests that different political cultures and pre-existing legislative frameworks of countries yielded varying governance mechanisms, which sometimes fell short of protecting civil rights and freedoms.

One of the most polarising issues concerning the launch of contact tracing apps was whether they should be mandatory or voluntary.

When contact tracing apps first emerged, we argued that making their use mandatory would not be proportionate given the lack of evidence for such apps’ effectiveness.

We also highlighted that contact tracing apps could facilitate surveillance and result in discrimination against certain groups (for example, those who are digitally excluded or refuse to use contact tracing apps). If these risks and challenges materialised, they could be detrimental to human rights.[107]

A comparative analysis of legislation and digital contact tracing policies in 12 countries shows that, in western countries, where privacy legislation strongly emphasises individual freedoms and rights, contact tracing app use was voluntary (for example, France, Austria and the UK).[108]

In Israel, China, Taiwan and South Korea, contact tracing app use was mandatory. Several studies demonstrate how the pre-existing laws and confidentiality requirements allowed Taiwan’s and South Korea’s governments to collect a wide range of social and surveillance data with relatively high levels of public acceptance.[109]

Both Taiwan and South Korea had had recent experiences of dealing with pandemics, and there was pre-existing legislation that permitted tracking through contact tracing apps, CCTV and credit card companies. These laws allowed the governments to carry out large-scale data collection programmes, and there were also strict confidentiality requirements in place.

Although digital contact tracing was mandatory and extensive, contact tracing app governance was transparent and civilian-run in both countries, based on pre-existing public emergency and data protection legislation.[110]

In China, on the other hand, there was no pre-existing comprehensive privacy legislation when the Health Code was deployed (as the Personal Information Protection Law came into effect in November 2021).[111] China enforced mandatory use of the Health Code app between February 2020 and December 2022.

Health Code served as both a contact tracing app and a digital vaccine passport, linked with users’ national identity numbers. It used GPS location in combination with data gathered through WeChat and Alipay, two of the most popular social commerce platforms in China.

These platforms were chosen to guarantee widescale adoption, since they provide the backbone for electronic financial transactions in China. The app assigned users to one of three categories to determine their risk score: green (low risk, free movement); yellow (medium risk, 7-day self-isolation); and red (high risk, 14-day mandatory quarantine).[112]
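As a purely illustrative sketch, the snippet below maps a user’s status to the colour tiers and restrictions described above. The input fields and decision rules are hypothetical placeholders: the actual scoring logic behind Health Code, which also drew on GPS and platform data, has not been published in detail.

```python
# Illustrative sketch of the green / yellow / red tiers described above.
# The input fields and decision rules are hypothetical placeholders; the real
# Health Code scoring logic has not been published in detail.
from dataclasses import dataclass


@dataclass
class UserStatus:
    close_contact_of_case: bool    # e.g. flagged as a close contact
    visited_high_risk_area: bool   # e.g. inferred from location or platform data


def health_code_colour(status: UserStatus) -> str:
    # Map a user's status to a colour tier and the corresponding restriction.
    if status.close_contact_of_case:
        return "red: high risk, 14-day mandatory quarantine"
    if status.visited_high_risk_area:
        return "yellow: medium risk, 7-day self-isolation"
    return "green: low risk, free movement"


print(health_code_colour(UserStatus(close_contact_of_case=False,
                                    visited_high_risk_area=True)))
# yellow: medium risk, 7-day self-isolation
```

The sketch only shows how an individual-level risk score translates directly into a movement restriction; the real system combined many more data sources, as described above.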

Health code systems were automatically added to citizens’ smartphones through Alipay and WeChat, and Chinese authorities were accused of misusing the systems to stop protests and conduct surveillance of activists.[113]

In Israel, where the contact tracing app was mandatory and centralised, the legislation relating to pandemics does not include digital data collection because it was established in 1940. When a state of emergency is declared, the government is empowered to enact emergency regulations that may suspend the validity of other laws that protect individual rights and freedoms.

In this context, the absence of digital data collection in the legislation relating to pandemics allowed the government to enact emergency regulations allowing the authorities to conduct extensive digital contact monitoring.[114]

The Lex-Atlas COVID-19 project also highlights that emergency powers were used to justify excessive data gathering and surveillance mechanisms in various countries.[115] Some countries unlawfully attempted to make the apps mandatory for domestic activities.

For example, in spring 2020, India made it mandatory for government and private sector employees to download the Aarogya Setu app. This decision was questioned by experts, including a former Supreme Court judge, and challenged in the Kerala High Court, due to the lack of any law that backed mandatory use of the app.[116]

After the challenge was heard in early May 2020, the Ministry of Home Affairs issued a notification on 17 May 2020, clarifying that use of the Aarogya Setu app should be changed from mandatory to a ‘best effort’ basis.[117] This allowed government employees to challenge the mandatory use of the app enforced by the government or a government institution.

In this case, the ‘competent authority’ to extend the scope of Aarogya Setu’s Data Access and Sharing Protocol was the Empowered Group on Technology and Data Management. However, the group was dissolved in September 2020, and the Protocol expired in May 2022. Therefore, the use of the app was anchored in a discontinued protocol and regulatory authority.[118]

Norton Rose Fulbright’s contact tracing global snapshot project demonstrates that countries with weaker legislation and enforcement mechanisms were less transparent when communicating information about their contact tracing apps. Türkiye and Russia, for example, did not clarify how long the data would be stored, whether a privacy risk assessment had been completed, or whether the data would be stored on a centralised or decentralised server.[119]

Another example demonstrating the importance of strong data protection mechanisms comes from the USA, where there are no federal privacy laws regulating companies’ data governance.[120] [121]

In 2020, we highlighted the risk of contact tracing apps being repurposed, that is, the technology and the data collected being used for reasons other than health.[122]

The company that owns the privacy and security assistant app Jumbo investigated the contact tracing app of the state of North Dakota in the USA. It reported that user location data was being shared with a third party, location data platform Foursquare.

Foursquare’s business model is based on providing advertisers with tools and data to target audiences at specific locations.[123] This exemplifies the repurposing of the data collected through a contact tracing app for commercial purposes, highlighting the importance of strong laws and mechanisms to safeguard users’ data.

Another important investigation was carried out by the Civil Liberties Union for Europe in 10 EU countries.[124] According to the EU General Data Protection Regulation (GDPR), providers should carry out a data protection and equality impact assessment before deploying contact tracing apps, as they posed risks to people’s rights and freedoms.

Yet the Civil Liberties Union for Europe investigation demonstrates that although these countries launched contact tracing apps in 2020, none had yet conducted these assessments by October 2021.

This point is also supported by Algorithm Watch’s evaluation of contact tracing apps in 12 European countries. It found that contact tracing app policies varied significantly within the EU, and that apps were deployed ‘not in an evidence-based fashion and mostly based on contradictory, faulty, and incomparable methods, and results’.[125]

Another relevant example is Singapore. The Criminal Procedure Code (2010) in Singapore allowed the police to use the data collected by the contact tracing app TraceTogether for purposes other than health.[126] In February 2021, it was reported that police had used the app’s data in a murder investigation.[127]

Following this, the government amended the COVID-19 (Temporary Measures) Act (2020) to restrict the use of the data. But according to this Act, personal data collected through digital contact tracing can still be used by law enforcement in investigations of ‘serious offences’.[128]

As the examples above show, unsurprisingly, countries with more comprehensive data protection and privacy legislation applied data protection principles more effectively than countries with weak legislation.

But incidents of privacy breaches and repurposing data also took place in countries with relatively strong laws and regulatory mechanisms. Germany has comprehensive personal data protection regulations under the EU GDPR and the new Federal Data Protection Act (BDSG).[129]

The Civil Liberties Union for Europe report highlights that Germany is one of the few EU countries that built and rolled out its contact tracing apps in line with the principles of transparency, public debate and impact assessments.[130] But the data gathered and stored through the Luca app, which provides QR codes to check in at restaurants, events and venues, was shared with the police and used in a murder investigation case.[131]

The role of the private sector

Our research reveals that contact tracing apps with centralised data systems were repurposed and/or used to restrict individual freedoms and privacy. This finding is also supported by Algorithm Watch’s COVID-related automated decision-making database project.

As highlighted in Algorithm Watch’s final report, there have been fewer cases of dangerous uses of data-driven technology and AI in EU countries, which largely used the decentralised GAEN API with Bluetooth technology, than in Asia and Africa.[132]

Many privacy advocates supported GAEN technology, which stored data on a decentralised server, since its use would prevent government mass surveillance and oppression.

Nonetheless, as this initiative was led by Google and Apple and not by policymakers and public health experts, it generated questions about the legitimacy of having private corporations decide the properties and uses of this kind of sensitive digital infrastructure.[133]

As digital rights academic Michael Veale argues, a GAEN-based contact tracing system may be ‘great for individual privacy, but the kind of infrastructural power it enables should give us sleepless nights’.[134] The pandemic demonstrated that big tech companies like Apple and Google hold enormous power over computing infrastructure, and therefore over significant health interventions such as digital contact tracing apps.

Apple and Google partnered to influence the properties of contact tracing apps in a way that was not favourable to particular nation states (for example, France, which pursued a centralised system approach despite its incompatibility with the GAEN API).

This revealed the difficulty, even at state level, of engaging in advanced use of data without the cooperation of the corporations that control the software and hardware infrastructure.[135] While preventing government abuse is crucial, the growing power of technology companies, whose main interest is profit rather than public good, is equally concerning.

Some critics also – and rightly – challenge the common claim that contact tracing apps with GAEN API have been privacy preserving. The reason for the challenge is that it is very difficult to verify whether the data collected has been stored and processed as technology companies claim.[136] This indicates a wider problem: the lack of strong regulation to ensure clear and transparent insight into the workings of technology companies.

These concerns raise two important questions: how will governments rebalance power against dominant technology corporations; and how will they ensure that power is distributed to individuals and communities? As Knodel argues, governments need to move toward designing multistakeholder initiatives with increased ability ‘to respond and help check private sector motivations’.[137]

And as GOVLAB and Knight Foundation argue in their review of the use of data during the pandemic, more coordination between stakeholders would prevent fragmentation in management efforts and functions in future pandemics.[138]

In the light of evidence identified above, as we have already recommended, strong legislation and regulations should be enacted to impose strict purpose and time limitations on digital interventions in times of public crisis. Regulations and oversight mechanisms should be incorporated into emergency legal systems to curb state powers. Governments need to consider a long-term strategy that focuses on collaborating effectively with private technology companies.

Our recommendation when contact tracing apps emerged:

  • Governments should develop legislation, regulations and accountability mechanisms to impose strict purpose and time limitations.[139]

 

In 2023 the evidence on the governance, regulations and accountability of contact tracing apps demonstrates that:

  • Most countries in our sample rolled out contact tracing apps at pace, without strong legislation or public consultation. The different political cultures and pre-existing legislative frameworks of countries yielded varying governance mechanisms, which sometimes fell short of protecting civil rights and freedoms.
  • Some countries used existing emergency powers to sidestep democratic processes and regulatory mechanisms (for example, Türkiye, Russia and India). Even in those countries with relatively strong regulations, privacy breaches and repurposing of data took place, most notably in Germany.
  • We have not come across any incidents of misuse of the decentralised contact tracing apps using the Apple/Google GAEN API. But private sector influence on public health technologies is a factor in the ability of governments to develop regulation and accountability mechanisms. The COVID-19 pandemic (and particularly the roll-out of contact tracing apps) showed that national governments are not always able to use their regulatory powers, due to their reliance on large corporations’ infrastructural power.

Lessons learned:

  • Define specific guidelines and laws when deploying new technologies in emergency situations.
  • Develop the public sector’s technical literacy and ability to create technical infrastructure. This does not mean that the private sector should be excluded from developing technologies related to public health. But it is crucial that the technical infrastructure and governance are effectively co-designed by government, civil society and private industry.

Digital vaccine passports

Emergence

From the beginning of the COVID-19 pandemic, establishing some form of ‘immunity passport’ based on evidence or assumption of natural immunity and antibodies after infection with COVID-19 was seen as a possible route out of restrictions.

Governments hoped that immunity passports would allow them to lift mobility restrictions and restore individual freedoms, at least for those who had acquired immunity to the virus.

However, our understanding of infection-induced immunity was still inadequate, due to a lack of evidence concerning the level and longevity of antibodies against COVID-19 after infection. In this context, these plans were slowed down to allow evidence to accumulate about the efficacy of natural immunity in protecting people.[140]

In the meantime, there was considerable investment in efforts to develop vaccines against COVID-19 to protect people through vaccine-induced immunity. On 7 October 2020, Estonia and the World Health Organization (WHO) announced a collaboration to develop a digitally enhanced international certificate of vaccination to help strengthen the effectiveness of the COVAX initiative, which provides COVID-19 vaccines to poorer countries.[141]

The WHO eventually decided to discontinue this project, because the impacts and effectiveness of digital vaccine passports could not be estimated. It also pointed to several scientific, technical and societal concerns with the idea of an international digital vaccine passport system, including the fact that it could prevent citizens of countries unable to secure a vaccine supply from studying, working or travelling abroad.[142]

In November 2020, Pfizer and BioNTech announced their vaccine’s efficacy against COVID-19.[143] In December 2020, the first patient received COVID-19 vaccination in the UK.[144] In the same month, China approved its state-owned COVID vaccine for general use.[145]

Many other vaccines were quickly rolled out, including Moderna, Oxford AstraZeneca and Sputnik V. Countries aimed to roll out vaccination programmes as rapidly as possible to bring down numbers of deaths and cases, and facilitate the easing of COVID-19 restrictions.[146]

This re-energised the idea of establishing national and regional digital vaccine passport systems – among governments, but also among universities, retailers and airlines that sought an alternative to lockdowns.[147]

Despite the lack of scientific evidence on their effectiveness, the majority of countries in our sample eventually introduced digital vaccine passports, with two main purposes: to create a sense of security and to increase vaccine uptake when ending lockdowns.[148]

Unsurprisingly, technology companies raced towards building digital vaccine passports to be used domestically and internationally.[149] The digital identity industry strongly advocated for the introduction of digital vaccine passports.[150] Their argument in support of this was that, if enacted successfully, digital vaccine passports could prove the feasibility of national, regional and international schemes based on proving one’s identity and health status digitally.[151]

Private companies went on to build vaccine passports with the potential to be used in various industries as well as by governments, for example, the International Air Transport Association’s Travel Pass app for international travel.[152]

Vaccine passports are not a new concept: paper vaccine passports have been around since the development of smallpox vaccines in the eighteenth century.[153] Although yellow fever is the only disease specified in the International Health Regulations (2005) for which countries may require proof of vaccination as a condition of entry, in the event of outbreaks the WHO recommends that countries ask for proof of vaccines.[154]

COVID-19 vaccine passports are the first digital health certificates to indicate someone’s vaccination status for a particular disease. Because of their data-driven digital infrastructure, individuals’ health information can be easily collected, stored and shared. This digital infrastructure caused public controversy.

When digital vaccine passports emerged, arguments offered in support of them included that they could: allow countries to lift lockdown measures more safely; enable those at lower risk of infection and transmission to help to restart local economies; and allow people to re-engage in social contact with reduced risk and anxiety.

Using a digital rather than a paper-based approach would accommodate future changes in policy, for example vaccine passes expiring or being re-enabled after subsequent infections, based on individual circumstances, countrywide policies or emerging scientific evidence.

Arguments against digital vaccine passports highlighted their potential risks and challenges. These included creating a two-tier society between unvaccinated and vaccinated people, amplifying digital exclusion, and risking privacy and personal freedoms. Experts also highlighted that vaccine passports attempt to manage risks and permit or restrict liberties at an individual level, rather than supporting collective action and contextual measures.

They categorise an individual as lower risk based on their vaccine or test status rather than taking into account a more contextual risk of local infection in a given area. They could also reduce the likelihood of individuals observing social distancing or mask wearing to protect themselves and others.[155]

Digital vaccine passport systems carry specific risks because they gather and store medical and other forms of sensitive personal information that can be compromised through hacking, leaking or selling of data to third parties. They can also be linked to other digital systems that store personal data, for example, the digital identity system Aadhaar in India and the health system Conecte SUS in Brazil.

Experts recommended that strong privacy-preserving technical designs and regulations were needed to prevent such problems, but these were challenging to establish at pace.[156]

These risks and challenges raised questions around public legitimacy and fuelled public resistance to digital vaccine passports in some countries, making it difficult for countries to gain public trust – particularly given the sharp rise in public discontent with governments and political systems due to the pressures of the pandemic.[157]

The Ada Lovelace Institute closely followed the debate regarding digital vaccine passports as they emerged. We conducted evidence reviews, convened workshops with scientists and experts, and published evidence-based research to support decision-making at pace.

Based on the evidence we gathered, we argued that although governments’ attempts to find digital solutions were understandable, rolling out these technologies without high standards of governance could lead to wider societal harms.

The expert deliberation we convened in 2021 suggested that governments should pause their digital vaccine passport plans until there was clear evidence that vaccines were effective in preventing transmission, and that they would be durable and effective against new variants of COVID-19.[158]

We also concluded that it was important to address public concerns and build public legitimacy through transparent adoption policies, secure technical designs and effective communication strategies.

Finally, we highlighted the risk of poorly governed vaccine passports being incorporated into broader systems of identification, and the wider implications of this for the UK and other countries (a risk that has been realised in various countries).[159]

Before proceeding to explain whether the risks, aspirations and challenges outlined above have materialised, we need to identify the various digital vaccine passport restrictions and understand how these new technologies have been implemented across the world. In the next section, we discuss digital vaccine passport systems, and the restrictions they have enabled based on a person’s vaccination status or test results.

Types of digital vaccine passport systems and restrictions

In this section, we identify the types of digital vaccine passport systems and restrictions in 34 countries. All countries in our sample introduced digital vaccine passports between January and December 2021 – with varying adoption policies.

Digital vaccine passports were in use in two important public health contexts to either limit or enable individuals’ ability to access certain spaces and activities during the COVID-19 pandemic:

  1. Domestic vaccine passport schemes: providing a valid vaccine passport to prove immunity status when participating in public activities (for example, going to a restaurant).
  2. International vaccine passport schemes: providing a valid vaccine passport to show immunity status when travelling from one country to another.

The majority of the countries in our sample changed their vaccine passport schemes multiple times throughout the pandemic.[160] For example, both Türkiye and France introduced digital vaccine passports in summer 2021, internationally for inbound travellers and domestically for residents to access particular spaces (for example, restaurants, museums and concert halls).

By spring 2022, both countries had lifted vaccine passport mandates domestically but still required inbound travellers to provide immunity proof to avoid self-isolation and testing.

By August 2022, digital vaccine passports were no longer in use or enforced in either country (although the infrastructure is still in place in both countries and can be reused at any time). At the time, China and New Zealand were still enforcing digital vaccine passports – to varying degrees – to maintain their relatively low number of deaths and cases by restricting residents’ eligibility for domestic activities and inbound travellers’ eligibility to visit.

In contrast to China and New Zealand’s strict vaccine passport schemes, many countries, especially in Europe, implemented domestic vaccine passport schemes to ease COVID-19 measures and transition out of lockdown, despite increasing numbers of cases and hospitalisations (for example, in summer 2022).[161]

We identified eight different vaccine passport systems that allowed or blocked freedoms for residents and inbound travellers in the 34 countries in our sample.

We have coded them according to the severity of their implementation; a simple, illustrative representation of this ordinal coding is sketched after the list below.

Digital vaccine passport restrictions

  1. Available but not compulsory. In use but not enforced for inbound travellers and domestic use.
  2. Mandatory for inbound travellers. Not mandatory for domestic use.
  3. Not mandatory for inbound travellers. Domestic use decided by regional governments.
  4. Mandatory for inbound travellers unless they are nationals and/or residents. Domestic use decided by regional governments.
  5. Mandatory for inbound travellers. Domestic use decided by regional governments.
  6. Mandatory for inbound travellers unless they are nationals and/or residents. Domestic use decided at a federal level.
  7. Mandatory self-isolation for non-national inbound travellers, regardless of possession of vaccine passports.
  8. Mandatory self-isolation for non-national inbound travellers, regardless of vaccine passport. Federal policy for domestic use.
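As a small illustration of how this ordinal coding can be used, the sketch below represents the eight levels as an enumeration and compares hypothetical snapshots of one country’s scheme over time. The level labels abbreviate the categories above; the example entries are placeholders, not data from our sample.

```python
# A sketch of the eight-level severity coding as an ordinal enumeration.
# Level labels abbreviate the categories listed above; the example entries
# are hypothetical placeholders, not data from the report's sample.
from enum import IntEnum


class Severity(IntEnum):
    AVAILABLE_NOT_COMPULSORY = 1
    MANDATORY_INBOUND_ONLY = 2
    DOMESTIC_REGIONAL_ONLY = 3
    INBOUND_EXCEPT_RESIDENTS_DOMESTIC_REGIONAL = 4
    MANDATORY_INBOUND_DOMESTIC_REGIONAL = 5
    INBOUND_EXCEPT_RESIDENTS_DOMESTIC_FEDERAL = 6
    ISOLATION_FOR_NON_NATIONALS = 7
    ISOLATION_FOR_NON_NATIONALS_DOMESTIC_FEDERAL = 8


# Hypothetical snapshots of one country's scheme at different points in time.
scheme_over_time = {
    "2021-Q3": Severity.MANDATORY_INBOUND_ONLY,
    "2022-Q2": Severity.AVAILABLE_NOT_COMPULSORY,
}

# Because the coding is ordinal, schemes can be ranked and compared by severity.
print(max(scheme_over_time.values()).name)  # MANDATORY_INBOUND_ONLY
```

Treating the coding as ordinal is what allows schemes to be tracked and compared across countries and over time, as in the analysis that follows.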

There is currently no universal vaccine passport scheme that can determine how and under what circumstances digital vaccine passports can be used internationally as well as for domestic purposes.[162]

In the absence of internationally accepted criteria, countries determined when and how to use digital vaccine passports themselves, leading to a wide range of adoption policies.

A map of the world showing the introduction of vaccine passports in the countries in our sample, by quarter, indicates that:

  • Asian and European countries were among the first to introduce digital vaccine passports in early 2021
  • North and South America from mid-2021
  • Oceania from late 2021.

The different approaches to using digital vaccine passports in different countries stem from their different technical capabilities, politics, public tolerance, finance and, most importantly, approaches to pandemic management.

Countries with zero-COVID policies, for example China and New Zealand, implemented stringent vaccine passport policies along with closing borders and imposing strict lockdowns on residents to suppress transmission.[163]

Many countries relied on a combination of various measures at different phases of the pandemic. In 2023, all countries in our sample have either no measures or only moderate measures in place, and seem to have chosen a ‘living with COVID’ policy.

Despite the varying approaches, in all the countries in our sample the technological and legislative infrastructure of vaccine passports are still in place. This is important not only because vaccine passports can still be reused, but because they can be transformed into other forms of digital systems in the future.

Examples of how varying pandemic management approaches and political contexts affected digital vaccine passport systems across the world include:

  • Brazil: Former Brazilian president Bolsonaro was against vaccination in general.[164] This meant that most of the pressure for vaccination campaigns came from the federal regions. The judiciary also played a strong role in pressuring the government to take measures against COVID-19, including vaccination. A Supreme Court justice ruled that inbound travellers had to show digital or paper-based proof of vaccination against COVID-19.[165]
  • USA: Digital vaccine passports, particularly for domestic use, were a politically divisive issue in the USA. Some states banned vaccine mandates and the use of digital vaccine passports within their states. Citizens in these states could acquire paper-based vaccine passports to prove their vaccination status for international travel. Several studies demonstrated that political affiliation, perceived effectiveness of vaccines and education level shaped individuals’ attitudes towards digital vaccine passports. Unsurprisingly, fear of surveillance was prominent in determining whether people trusted the government and corporations with their personal data.[166] The federal US administration did not initiate a national domestic vaccine passport but was involved in efforts to establish standards for vaccine passports for international travel.
  • Italy: Italy was the first country in Europe to be hit by the COVID-19 pandemic.[167] The government was confronted with high numbers of hospitalisations and deaths, and faced criticism for being slow to act. It responded by taking stricter measures than many of its European counterparts, and so Italy had one of the strictest vaccine passport schemes in Europe. It separated each region into a coloured zone depending on the severity of transmission rates and hospitalisation numbers in that area. It operated a two-tiered green pass system: the ‘super green pass’ was valid proof of vaccination or recovery, while the ‘green pass’ was proof of a negative COVID test. Different venues and activities required one or both of the passes.[168]
  • The EU: Member states in the EU experienced the pandemic differently – some countries had higher numbers of deaths, cases and hospitalisations than others – and vaccine uptake across the member states differed significantly.[169] While the EU Digital COVID Certificate helped the EU to reintroduce freedom of movement and revive the economy within the zone, member states have the liberty to implement vaccine passports domestically as they see fit. This led to considerable differences in domestic vaccine passport schemes across the EU zone.[170] For example, Romania, one of the least vaccinated countries in the EU, made digital vaccine passports mandatory for inbound national travellers for only a short period of time, to address the surge in numbers of cases and deaths as lockdowns were ended. Finland, which had a high vaccination rate, required a digital vaccine passport for all inbound travellers, including nationals, for nine months before it stopped enforcing digital vaccine passports completely.

Effectiveness

Digital vaccine passports essentially demonstrate an individual’s transmission risk to other people.

A digital vaccine passport scheme relies on the assumption that an individual is a lower risk to others if they have been vaccinated (or if they have gained natural immunity after being infected with and recovering from the disease).

In early 2021, we argued that there was no clear evidence about whether being vaccinated reduced an individual’s risk of transmitting the disease. We suggested that governments should pause deploying vaccine passports until the evidence was clearer.[171]

We also called on governments to build evidence that considers the benefits and risks of digital vaccine passports – in particular, whether they would increase risky behaviours (for example, not observing social distancing) by creating a false sense of security.

Despite this lack of evidence, many governments across the world moved forward to introduce digital vaccine passports in 2021.[172]

Policymakers saw digital vaccine passports as valuable public health tools, once the initial scientific trials of vaccines suggested that they would reduce the likelihood of severe symptoms, and hence hospitalisations and deaths.

This was critical for policymaking in many countries whose healthcare systems were under immense pressure.

At the same time, vaccine scepticism was on the rise in many countries. In this context, the idea developed that digital vaccine passport schemes would give people an incentive to get vaccinated. This represented a considerable shift in their purpose, from a digital health intervention aimed at reducing transmission to a behaviour control tool aimed at increasing vaccine uptake.

Many countries considered mandatory vaccination for domestic activities as a way to increase uptake. For example, in January 2022, announcing domestic vaccine mandates, French President Macron stated ‘the unvaccinated, I really want to hassle them. And so, we will continue to do it, until the end.’[173]

Mandatory digital vaccine passport schemes raise the question of ‘whether that is ethically acceptable or instead may be an unacceptable form of coercion, detrimental to the right to free self-determination, which is guaranteed for any medical treatment, thus coming to resemble a sort of roundabout coercion’.[174]

In short, it was hoped that digital vaccine passports would positively impact public health in two main ways: (1) reducing transmission, hospitalisations and deaths, and (2) increasing vaccine uptake.

In this section, we will look at the evidence on the effectiveness of digital vaccine passports in both of these senses. We will then briefly explain several evidence gaps that prevent us from building a full understanding of digital vaccine passports’ overall impact on public health.

Impact of digital vaccine passports on reducing transmission, hospitalisations and deaths

In 2023, the scientific evidence on the efficacy of vaccines in reducing transmission remains unclear. Although there is some evidence that being vaccinated makes it less likely that one will transmit the virus to others, experts largely agree that ‘a vaccinated person’s risk of transmitting the virus is not considerably lower than an unvaccinated person’.[175] [176] Yet there is strong evidence that vaccines are effective in protecting individuals from developing severe symptoms (although experts say that their efficacy reduces over several months).[177]

Therefore, even if mandatory domestic vaccine passport schemes did not help to decrease rates of transmission, they might have reduced the pressure on public healthcare because fewer people needed medical care. This would only be the case if digital vaccine passports were indeed effective in increasing vaccine uptake (see the next section below).

Vaccines have been found to be effective against new variants, but the level of effectiveness is unclear.[178] According to the WHO, there are five predominant variants of COVID-19 and more than 200 subvariants. The WHO also reports that it is becoming more difficult to monitor new variants, since many countries have stopped testing and surveillance.

The infrastructure and legislation of digital vaccine passports are still in place, meaning that they can be reused at any time.

But limited monitoring of and research on (sub)variants raises concerns about vaccines’ durability and their suitability for wider use. Governments need to invest in building evidence on vaccines’ efficacy against rapidly evolving variants if they decide to reuse digital vaccine passports.

Impact of digital vaccine passports on vaccine uptake

Digital vaccine passport systems had a mixed impact on vaccine uptake at an international level. Several countries reported a significant increase in vaccination after the introduction of digital vaccine passports. In France for example, after the digital vaccine passports were introduced, ‘the overall uptake of first doses… increased by around 15% in the last month following a lull in vaccinations.’[179]

Another study suggests that the vaccine passport requirement for domestic travelling and accessing different social settings led to higher vaccination rates in the majority of the EU countries.[180] However, levels of COVID-19 vaccine acceptance were low particularly in West Asia, North Africa, Russia, Africa and Eastern Europe despite the use of digital vaccine passports.[181]

For example, one out of four Russians continued to refuse vaccination despite the government’s plan to introduce mandatory digital vaccine passports for accessing certain spaces (for example, workplaces).[182] Similarly, in Nigeria, Bulgaria, Russia and Romania, black markets for fake passports were created by anti-vaxxers,[183] demonstrating the strength of resistance among some people to getting vaccinated or sharing their data. These examples indicate the importance of political and cultural contexts and urge us to avoid broad international conclusions.

Important evidence gaps

As well as vaccination, the scientific evidence shows that a wide range of measures can reduce the risk of COVID-19 transmission. How have vaccine passports affected individuals’ motivation to follow other COVID-19 protection measures? This question is fundamental: one of the major concerns about digital vaccine passports was that they might give people a false sense of security, leading them to stop following other important COVID-19 health measures such as wearing a face mask.

Some experts argue that digital vaccine passport schemes in the EU led to more infections because they led to increased social contact.[184] But studies that explore this were either conducted in the early phase of the pandemic or remain limited in their scope. This means that we cannot fully evaluate the impact of digital vaccine passports on public health behaviours, so we cannot weigh their benefits against the risks in a comprehensive manner.

To fill this evidence gap, we need studies that examine (and compare) unvaccinated and vaccinated people’s attitudes to other COVID-19 protection measures over time.

A systematic review of community engagement to support national and regional COVID-19 vaccination campaigns demonstrates that working with members (or representatives) of communities to co-design vaccination strategies, build trust in authorities and address misinformation is an effective way to increase vaccine uptake.

The review points to the success of several COVID-19 vaccination rollout programmes, including the United Nations High Commissioner for Refugees’ efforts to reach migrant workers and refugees, a female-led vaccination campaign for women in Sindh province in Pakistan, and work with community leaders to reach the indigenous population in Malaysia.[185]

The standard and quality of countries’ healthcare systems also played a huge role in how successfully they tackled vaccine hesitancy. For example, Morocco’s pre-existing national immunisation programme, supported by a successful COVID-19 communications campaign, led to higher vaccination rates in Morocco compared with other African countries.[186]

This raises another important question, which cannot be comprehensively answered due to limited evidence: were digital vaccine passport policies deployed at the expense of other (non-digital) interventions, such as targeted community-based vaccination programmes?

Governments’ ambition to increase vaccine uptake by using digital vaccine passport schemes (for example, by not allowing unvaccinated people to enter venues) raises the question of whether they expected digital vaccine passports to ‘fix’ the problem of vaccine hesitancy instead of working with communities and effectively communicating scientific evidence.

To comprehensively address this question, governments would need to provide detailed documentation of vaccination rollout programmes and activities and support expert evaluations of the risks and benefits of digital vaccine passport systems, compared with non-digital interventions like vaccination campaigns targeted at communities with high levels of vaccine hesitancy.

Our recommendations when digital vaccine passports emerged:

  • Build an in-depth understanding of the level of protection offered by individual vaccines in terms of duration, generalisability, efficacy regarding mutations and protection against transmission.
  • Build evidence of the benefits and risks of digital vaccine passports. For example, consider whether they reduce transmission but also increase risky behaviours (for example, not observing social distancing), creating a new harmful effect.[187]

 

In 2023, the evidence on the effectiveness of digital vaccine passports reveals:

  • Countries initially aimed to use digital vaccine passports to score an individual’s transmission risk based on their vaccination status, test results or proof of recovery. They established digital vaccine passport schemes without clear evidence of the vaccine’s effectiveness in reducing transmission risk. Governments hoped that even if vaccines did not reduce transmission risk, digital vaccine passports would still increase vaccine uptake, and hence decrease an individual’s risk of developing severe symptoms.
  • Vaccines were effective at reducing the likelihood of developing severe symptoms, and therefore of hospitalisations and deaths. This meant that they decreased the pressure on health systems because fewer people required medical care.
  • However, there is no clear evidence that vaccinated people are less likely to transmit the virus than unvaccinated people, which means that vaccines have not reduced transmissions as hoped by governments and policymakers.
  • In some countries (for example, France) digital vaccine passport schemes increased vaccine uptake, but in other countries (for example, Russia and Romania) people resisted vaccinations despite digital vaccine passport restrictions. Black markets for fake digital vaccine passports were created in some places (for example, Italy, Nigeria and Romania). This demonstrates that we cannot reach broad international conclusions about digital vaccine passports’ impact on vaccine uptake.
  • Significant gaps in the evidence prevent us from weighing the benefits of digital vaccine passport systems against the harms. These include the impact of digital vaccine passports on other COVID-19 protection measures (for example, wearing a mask) and whether governments relied on digital vaccine passport systems to increase vaccine uptake instead of establishing non-digital, community-targeted interventions to address vaccine hesitancy.

 

Lessons learned:

To build evidence on the effectiveness of digital vaccine passports as part of the wider pandemic response strategy:

  • Support research and learning to understand the impact of digital vaccine passports on other COVID-19 protection measures (for example, wearing a mask and observing social distancing).
  • Support research and learning to understand the impact of digital vaccine passports on non-digital interventions (for example, effective public communications to address vaccine hesitancy).
  • Use this impact evaluation to weigh up the risks and harms of digital vaccine passports and to help set standards and strategies for the future use of technology in public crises.

To ensure the effective use of technologies in future pandemics:

  • Invest in research and evaluation from the outset, and implement a clear evaluation framework to build evidence during deployment that supports understanding of the role that digital technologies play in broader pandemic health strategies.
  • Define criteria for effectiveness using a societal approach that goes beyond technical efficacy and takes account of people’s experiences.
  • Establish how to measure and monitor effectiveness by closely working with public health experts and communities, and set targets accordingly.
  • Carry out robust impact assessments and evaluation of technologies, both when first deployed and over time.

Public legitimacy

Public legitimacy was key to ensuring that digital vaccine passports were accepted and effective as health interventions. In the first two years of the pandemic, we conducted survey and public deliberation research to investigate public attitudes to digital vaccine passports in the UK.

We found that digital vaccine passports needed to be supported by strong governance and accountability mechanisms to build public trust. Our work also highlighted public concern with regards to digital vaccine passport schemes’ potential negative impacts on marginalised and disadvantaged communities. We called on governments to build public trust and create social consensus on whether and how to use digital vaccine passports.[188]

Since then, wider evidence has emerged that complements our findings. For example, an IPSOS Mori survey from March 2021 found that minority ethnic communities in the UK were more concerned than white respondents about vaccine passports being used for surveillance.[189]

This reflects a general trend in UK society: minoritised and disadvantaged people trust public institutions less with personal data than the white majority do.[190] Unsurprisingly, there is also a link between people’s attitudes to digital vaccine passports and vaccine hesitancy.

Those who are less likely to take up the COVID-19 vaccine feel their sense of personal autonomy is threatened by mandatory vaccine passport schemes.[191]

It is difficult to draw conclusions about public acceptance of digital vaccine passports at an international level, since public legitimacy depends on existing legal and constitutional frameworks as well as moral, cultural and political factors in a society.

But we can say that more than 50% of countries in our sample experienced protests against digital vaccine passports and the restrictive measures that they enabled (for example, not being eligible to enter the workplace or travel without proof of vaccination), showing widespread public resistance across the world.

Countries that saw such protests vary in terms of political cultures and attitudes to technology, including Italy, Russia, France, Nigeria and South Africa. In most cases, anti-digital vaccine passport protests started shortly after national or regional governments had announced mandatory schemes, demonstrating public resistance to using data-driven technology in everyday contexts.

Several studies demonstrated that people were less favourable towards domestic uses of digital vaccine passports than towards their use for international travel.

This was particularly the case for schemes that required people to use a digital vaccine passport to access work, education, and religious settings and activities.[192] Lack of trust in government and institutions, vaccine efficacy and digital vaccine passports’ effectiveness all contributed to public resistance to digital vaccine passport systems.[193]

Our recommendations when digital vaccine passports emerged:

  • Build public trust through strong regulation, effective public communication and consultation.[194]
  • Ensure social consensus on whether and how to use digital vaccine passports.

 

In 2023, the evidence on the public legitimacy of digital vaccine passports reveals that:

  • Many countries experienced protests against digital vaccine passports (more than half of the countries in our sample) and the restrictive measures that they enabled. This demonstrates the lack of public acceptance of, and social consensus around, digital vaccine passport systems.
  • Lack of trust in government and institutions, vaccine efficacy and digital vaccine passports’ effectiveness all contributed to public resistance to digital vaccine passports.[195]

 

Lesson learned:

  • Ensure that people’s rights and freedoms are safeguarded with strong regulations, oversight and redressal mechanisms. Effectively communicate the purpose and legislative and regulatory basis of health technologies to build public trust and social consensus.

Inequalities

Digital vaccine passports posed significant inequality risks, including discrimination based on immunity status, excess policing of citizens, and amplification of digital inequalities and other forms of societal inequalities.[196]

In this context, one of the major risks highlighted by the Ada Lovelace Institute was that mandatory vaccine passports could lead to discrimination against unvaccinated people. Mandatory vaccination policies were frequently adopted by (national or regional) governments or workplaces across the countries in our sample.[197]

For example, in November 2021, the Austrian government announced mobility restrictions for unvaccinated people.[198] The measure was ended in January 2022 due to dropping case numbers and decreasing pressure on hospitals. However, the government announced a vaccine mandate policy with penalties of up to €3,000 for anyone who refused to be vaccinated. The controversial law was never enforced due to civil unrest and international criticism.[199]

In Italy, people had to show a ‘green pass’, which included vaccination proof, recovery proof and a negative Polymerase Chain Reaction (PCR) test, to access workplaces between October and December 2021.

The policy officially ended on 1 May 2022, making it illegal for employers to ask for vaccine passports.[200] In 2021, the Moscow Department of Health declared that only vaccinated people could receive medical care.[201] The Mayor of Moscow also instituted a mandatory vaccine passport system for gaining entry to restaurants, bars and clubs after 11pm in the city.

In relation to digital exclusion, we recommended that if governments were to pursue digital vaccine passport plans, they should create non-digital (paper) alternatives for those with no or limited digital access and skills. We also recommended that plans should include different forms of immunity in vaccine passports – such as antigen test results – to prevent discrimination against unvaccinated people.[202]

In some countries, for example Türkiye, although physical vaccine passports were available, people had to download their vaccination proof as an electronic PDF (portable document format), which excluded those who were unable to use the internet.[203]

Some countries adopted good practices and policies to mitigate the inequality risks. In India, for example, the Supreme Court decided that vaccination could not be made compulsory for domestic activities and directed the federal government to provide publicly available information on any adverse effects of vaccination.[204]

The UK Government introduced a non-digital NHS COVID Pass letter.[205] Those who did not have access to a smartphone or internet could request this physical letter via telephone.

The European Union’s Digital COVID Certificate could also be obtained after taking a biochemical test demonstrating a form of immunity or lack of infection, and hence did not discriminate against those who could not be, or refused to be, vaccinated. This made the Digital COVID Certificate available to a wider population, as 25% of the EU population remained unvaccinated as of August 2022.[206]

Global inequalities

Tackling pandemics requires global cooperation. Effective collaboration is needed to fight diseases at regional and global levels.[207] Digital vaccine passports, which were used for border management in the name of public health, created vaccine nationalism, and as a result they amplified global inequalities.[208]

Digital vaccine passports did not emerge in a vacuum; state-centric perspectives that prioritise the ‘nation’s health’ by restricting or controlling certain communities and nations have existed for decades.[209] Securitising trends that rely on the unprecedented compilation and analysis of personal data intensified following the 9/11 terrorist attacks in New York.[210]

Countries compiled pandemic-related data about other countries to score risk and produce entry schemes for inbound travellers. This led to the emergence of an international digital vaccine passport scheme where individuals were linked to a verifiable test or vaccine.[211]

Low-income countries found it difficult to meet rigid standards for compliance due to low access to and uptake of vaccines.[212]

There is a positive correlation between a country’s GDP and the share of vaccinated individuals in the population.[213]

According to Our World in Data, when digital vaccine passports were introduced, the share of fully vaccinated people was 17% in Jamaica, 18% in Tunisia and 11% in Egypt.[214] At the other end of the scale, 56% of the population was fully vaccinated in Singapore, 32% in Italy and 37% in Germany.[215]

International digital vaccine passport schemes also resulted in new global tensions. The COVAX initiative, led by the WHO, aimed to ensure equitable access to COVID-19 treatments and vaccines through global collaboration.[216]

COVISHIELD, a COVID-19 vaccine manufactured in India, was distributed largely to African countries through the COVAX initiative. Nonetheless, the EU, which donated €500 million to support the initiative, did not authorise COVISHIELD as part of the EU Digital COVID Certificate system.[217] This meant that the digital vaccine passports of people who had received COVISHIELD in Africa were not recognised as valid in the EU, restricting their ability to travel to EU countries.

As of December 2022, Africa still had the slowest vaccination rate of any continent, with just 33% of the population receiving at least one dose of a vaccine.[218]

In this context, many low- and middle-income countries sought vaccines approved by the European Medicines Agency (EMA). This was challenging due to a lack of financial means and the limited number of vaccine manufacturing companies.

The EU Digital COVID Certificate system eventually expanded to include only 49 non-EU countries, including Monaco, Türkiye, the UK and Taiwan (to give a few examples from our sample).[219] These countries’ national vaccination programmes offered vaccines authorised for use in the EU by the EMA.

Our recommendations when digital vaccine passports emerged:

  • Carefully consider the groups that might face discrimination if mandatory domestic and international vaccine passport policies are adopted (for example, unvaccinated people).
  • Make sure policies and interventions are in place to mitigate the amplification of societal and global inequalities – for example, provide paper-based vaccine certificates for people who are not able or not willing to use digital vaccine passports.[220]

 

In 2023, the evidence on the impact of digital vaccine passports on inequalities demonstrates that:

  • The majority of countries in our sample adopted mandatory domestic and international vaccine passport schemes at different stages of the pandemic, which restricted the freedoms of individuals.
  • Some countries in our sample (for example, the EU and UK) offered physical (non-digital) alternatives to digital vaccine passports and accepted a biochemical test demonstrating a form of immunity or lack of infection as part of their vaccine passport schemes. These measures helped to mitigate the risk of discrimination against unvaccinated individuals and individuals who lack adequate digital access and skills.
  • Countries compiled pandemic-related data about other countries to score risk and produce entry schemes for inbound travellers. This led to the emergence of an international digital vaccine passport scheme where individuals were linked to a verifiable test or vaccine. Low-income countries found it difficult to meet rigid standards of compliance due to low access to and uptake of vaccines.

 

Lessons learned:

  • Address the needs of vulnerable groups and offer non-digital solutions where necessary to prevent discrimination and amplification of inequalities.
  • Consider the implications of national policies and practices relating to technologies at a global level. Cooperate with national, regional and international actors to make sure technologies do not reinforce existing global inequalities.

Governance, regulation and accountability

Like contact tracing apps, digital vaccine passports had implications for data privacy and human rights, provoking reasonable concerns about proportionality, legality and ethics.

Data protection regimes are based largely on principles that aim to protect rights and freedoms. Included within these is a set of principles and ‘best practices’ that guide data collection in disaster conditions. These include that:

  • measures are transparent and accountable
  • the limitations of rights are proportional to the harms they are intended to prevent or limit
  • data collection is minimised and time constrained
  • data is retained for research or public use purposes and unused personal data is destroyed
  • data is anonymised in such a way that individuals cannot be reidentified
  • third party sharing both within and outside of government is prevented.[221]

In the Checkpoints for vaccine passports report, we made a set of legislative, regulatory and technical recommendations in line with the principles outlined above.

We highlighted the importance of oversight mechanisms to ensure technical efficacy and security, as well as the enforcement of relevant regulations.[222] It is beyond the scope of this report to analyse country-specific regulations and how they were shaped by differences in legal systems and ethical and societal values. But there are several cross-cutting issues and reflections that are worth drawing attention to.

As far as we know, there were fewer incidents of repurposing data and privacy breaches in the case of digital vaccine passports than in relation to contact tracing apps. Yet in some countries, critics warned that data protection principles were not always followed despite relevant regulations being in place.[223] For example, central data systems in Brazil and Jamaica had security flaws that resulted in people’s health records being hacked.[224]

The effectiveness of digital vaccine passports was critical when deciding whether they were proportionate to their intended purpose.[225] When they emerged, some bioethicists argued that digital vaccine passport policies were a justified restriction on civil liberties, since vaccinated people were unlikely to spread the disease and hence posed no risk to others’ right to life.[226]

However, as explained in the previous sections, the evidence does not confirm vaccines’ effectiveness at reducing transmission. And it is noteworthy that some places, for example Vietnam, successfully managed the disease without a focus on technology, thanks to their pre-existing strong healthcare systems.[227]

Our evidence also reveals that although some countries established specific regulations for digital vaccine passports (for example, UK and Canada), this was not the case for most of the countries in our sample.

In many countries, digital vaccine passports were regulated through existing public laws, protocols and general data protection regulations.

This created concerns in those countries without data protection frameworks, for example, South Africa.[228]

In our sample of 34 countries, the EU Digital COVID Certificate regulation is the most comprehensive. It clearly states when the vaccine passport scheme will end (June 2023).[229] It also provides detailed information regarding security safeguards and time limitations.

But it is important to note that the EU does not determine member states’ national policies on vaccine passport use, which means that countries can choose to keep the infrastructure and reuse digital vaccine passports domestically.

Our recommendations when digital vaccine passports emerged:

  • Use scientific evidence to justify the necessity and proportionality of digital vaccine passport systems.
  • Establish regulations with clear, specific and delimited purposes, and with clear sunset mechanisms.
  • Follow best-practice design principles to ensure data minimisation, privacy and safety.
  • Ensure that strong regulations, regulatory bodies and redressal mechanisms are in place to safeguard individual freedoms and privacy.

 

In 2023, the evidence on governance, regulations and accountability of digital vaccine passports demonstrates that:

  • Only a handful of countries (for example, the UK and the EU) enacted specific regulations before rolling out digital vaccine passports.
  • In many countries, digital vaccine passports were regulated using existing public laws, protocols and general data protection regulations. This created concerns in countries without data protection frameworks, for example, South Africa.
  • There were fewer incidents of repurposing data and privacy breaches in the case of digital vaccine passports than there were in connection with contact tracing apps. But the lack of strong regulation or oversight mechanisms and poor design still resulted in data leakages, privacy breaches and repurposing of the technology in some countries (for example, hacking digital vaccine passport data in Brazil).

 

Lessons learned:

  • Justify the necessity and proportionality of technologies with sufficient relevant evidence in public health emergencies.
  • If technologies are found to be necessary and proportional and therefore justified, create specific guidelines and regulations. These guidelines and regulations should ensure that mechanisms for enforcement are in place as well as methods of legal redress.

Conclusions

Contact tracing apps and digital vaccine passports have been two of the most widely deployed technologies in COVID-19 pandemic response across the world.

They raised hopes through their potential to assist countries in their fight against the COVID-19 virus. At the same time, they provoked concerns about privacy, surveillance, equity and social control, because of the sensitive social and public health surveillance data they use – or are perceived to use.

In the first two years of the pandemic, the Ada Lovelace Institute extensively investigated the societal, legislative and regulatory challenges and risks of contact tracing apps and digital vaccine passports. We published nine reports containing a wide range of recommendations for governments and policymakers about what they should do to mitigate these risks and challenges when using these two technologies.

This report builds on this earlier work. It synthesises the evidence on contact tracing apps and digital vaccine passports from a cross-section of 34 countries. The findings should guide governments, policymakers and international organisations when using data-driven technologies in the context of public emergencies, health and surveillance.

They should also support civil society organisations and those advocating for technologies that support fundamental rights and protections, public health and public benefit.

We also identify important gaps in the evidence base. COVID-19 was the first global health crisis of ‘the algorithmic age’, and evaluation and monitoring efforts fell short in understanding the effectiveness and impacts of the technologies holistically.

The evidence gaps identified in this report indicate the need to continue research and evaluation efforts, to retrospectively investigate the impact of COVID-19 technologies so that we can decide on their role in our societies, now and in the future. The gaps should also guide evaluation and monitoring frameworks when using technology in future pandemics and in broader contexts of public health and social care provision.

This report synthesises the evidence by focusing on four questions:

  1. Did the new technologies work?
  2. Did people accept them?
  3. How did they affect inequalities?
  4. Were they well governed and accountable?

The limited and inconsistent evidence base and the wide-ranging, international scope present some challenges to answering these questions. Using a wide range of resources, we aim to provide some balance and context to compensate for missing information.

These resources include the media, policy papers, findings from the Ada Lovelace Institute’s workshops, evidence reviews of academic and grey literature, and material submitted to international calls for evidence.

We illustrate the findings on both contact tracing apps and digital vaccine passports with policy and practice examples from the sample countries.

Within the evidence base, the two technologies were implemented using a wide range of technical infrastructures and adoption policies. Despite these divergences and the often hard-to-uncover evidence, there are important cross-cutting findings that can support current and future decision-making around pandemic preparedness, and health and social care provision more broadly.

Cross-cutting findings

Effectiveness: did COVID-19 technologies work?

  • Digital vaccine passports and contact tracing apps were – of necessity – rolled out quickly, but without consideration of what evidence would be required to demonstrate their effectiveness. There was insufficient consideration and no consensus reached on how to define, monitor, evaluate or demonstrate their effectiveness and impacts.
  • There are indications of the effectiveness of some technologies, for example the NHS COVID-19 app (used in England and Wales). However, the limited evidence base makes it hard to evaluate their technical efficacy or epidemiological impact overall at an international level.
  • The technologies were not well integrated within broader public health systems and pandemic management strategies, and this reduced their effectiveness. However, the evidence on this is limited in most of the countries in our sample (with a few exceptions, for example Brazil and India), and we do not have clear evidence to compare COVID-19 technologies with non-digital interventions and weigh up their relative benefits and harms.
  • It is not clear whether COVID-19 technologies resulted in positive change in people’s health behaviours (for example, whether people self-isolated after receiving an alert from a contact tracing app).
  • It is also not clear if public support was impacted by the apps’ technical properties, or the associated policies and implementations.

Public legitimacy: Did people accept COVID-19 technologies?

  • Public legitimacy was key to ensuring the success of these technologies, affecting uptake and behaviour.
  • The use of digital vaccine passports to enforce restrictions on liberty and increased surveillance caused concern. There were protests against them, and the restrictive policies they enabled, in more than half the countries in our sample.
  • Public acceptance of contact tracing apps and digital vaccine passports depended on trust in their effectiveness, as well as trust in governments and institutions to safeguard civil rights and liberties. Individuals and communities who encounter structural inequalities are less likely to trust government institutions and the public health advice they offer. Not surprisingly, these groups were less likely than the general population to use these technologies.
  • The lack of targeted public communications resulted in poor understanding of the purpose and technical properties of COVID-19 technologies. This reduced public acceptance and social consensus around whether and how to use the technologies.

Inequalities: How did COVID-19 technologies affect inequalities?

  • Some social groups faced barriers to accessing, using or following the guidelines for contact tracing apps and digital vaccine passports, including unvaccinated people, people structurally excluded from sufficient digital access or skills, and people who could not self-isolate at home due to financial constraints. A small number of sample countries adopted policies and practices to mitigate the risk of widening existing inequalities. For example, the EU allowed paper-based Digital COVID Certificates for those without sufficient digital access and skills.
  • This raises the question of whether these technologies widened health and other societal inequalities. In the majority of sample countries, there is no clear evidence as to whether governments adopted effective interventions to help those who were less able to use or benefit from these technologies (for example, whether financial support was provided for those who could not self-isolate after receiving an exposure alert due to not being able to work from home).
  • The majority of sample countries requested proof of vaccination from inbound travellers before allowing unconditional entry (that is, without a quarantine or self-isolation period) at some stage of the pandemic. This amplified global inequalities by discriminating against the residents of countries that could not secure adequate vaccine supply or had low vaccine uptake – specifically, many African countries.

Governance, regulation and accountability: Were COVID-19 technologies well governed and accountable?

  • Contact tracing apps and digital vaccine passports combine health information with social or surveillance data. As they limit rights (for example, by blocking access to travel or entrance to a venue for people who do not have a digital vaccine passport), they must be proportional. This means striking a balance between limitations of rights, potential harms and intended purpose. To achieve this, it is essential that they are governed by robust legislation, regulation and oversight mechanisms, and that there are clear sunset mechanisms in place to determine when they no longer need to be used.
  • Most countries in our sample governed these technologies in line with pre-existing legislative frameworks, which were not always comprehensive. Only a few countries enacted robust regulations and oversight mechanisms specifically governing contact tracing apps and digital vaccine passports, including the UK, EU member states, Taiwan and South Korea.
  • The lack of robust data governance frameworks, regulation and oversight mechanisms led to lack of clarity about who was accountable for misuse or poor performance of COVID-19 technologies. Not surprisingly, there were incidents of data leaks, technical errors and data being reused for other purposes. For example, contact tracing app data was used in police investigations in Singapore and Germany, and sold to third parties for commercial purposes in the USA.[230]
  • Many governments relied on private technology companies to develop and deploy these technologies, demonstrating and reinforcing the industry’s influence and the power located in digital infrastructure.

Lessons

In light of these findings, there are clear lessons for governments and policymakers deciding how to use digital vaccine passports and contact tracing apps in the future.

These lessons may also apply more generally to the development and deployment of new data-driven technologies and approaches.

Effectiveness

To build evidence on the effectiveness of contact tracing apps and digital vaccine passports:

  • Support research and learning efforts on the impact of these technologies on people’s health behaviours.
  • Understand the impacts of apps’ technical properties, and of policies and approaches to implementation, on people’s acceptance of, and experiences of, these technologies in specific socio-cultural contexts and across geographic locations.
  • Weigh up their benefits and harms by considering their role within the broader COVID-19 response and comparing with non-digital interventions (for example, manual contact tracing).
  • Use this impact evaluation to help set standards and strategies for the future use of these technologies in public crises.

To ensure the effective use of technology in future pandemics:

  • Invest in research and evaluation from the start, and implement a clear evaluation framework to build evidence during deployment that supports understanding of the role that technologies play in broader pandemic health strategies.
  • Define criteria for effectiveness using a human-centred approach that goes beyond technical efficacy and builds an understanding of people’s experiences.
  • Establish how to measure and monitor effectiveness by working closely with public health experts and communities, and set targets accordingly.
  • Carry out robust impact assessments and evaluation.

Public legitimacy

To improve public acceptance:

  • Build public trust by publicly setting out guidance and enacting clear law on permitted and restricted uses, with mechanisms to support rights, provide redress and tackle legal issues.
  • Effectively communicate the purpose of using technology in public crises, including the technical infrastructure and legislative framework of specific technologies, to address public hesitancy and create social consensus.

Inequalities

To avoid making societal inequalities worse:

  • Create monitoring mechanisms that specifically address the impact of technology on inequalities. Monitor the impact on public health behaviours, particularly in relation to social groups who are more likely to encounter health and other forms of social inequalities.
  • Use the impact evidence to identify marginalised and disadvantaged communities and to establish strong public health services, interventions and social policies to support them.

To avoid creating or reinforcing global inequalities and tensions:

  • Harmonise global, national and regional regulatory tools and mechanisms to address global inequalities and tensions.

Governance and accountability

To ensure that individual rights and freedoms are protected:

  • Establish strong data governance frameworks and make sure that regulatory bodies and clear sunset mechanisms are in place.
  • Create specific guidelines and laws to make sure that technology developers follow privacy-by-design and ethics-by-design principles, and that effective monitoring and evaluation frameworks and sunset mechanisms are in place for the deployment of technologies.
  • Build clear evidence about the effectiveness of new technologies to make sure that their use is proportionate to their intended results.

To reverse the growing power imbalance between governments and the technology industry:

  • Develop the public sector’s technical literacy and ability to create technical infrastructure. This does not mean that the private sector should be excluded from developing technologies related to public health, but it is crucial that technical infrastructure and governance are effectively co-designed by government, civil society and private industry.

The legacy of COVID-19 technologies? Outstanding questions

This report synthesises evidence that has emerged on contact tracing apps and digital vaccine passports from 2020 to 2023. These technologies have short histories, but they have potential long-term, societal implications and bring opportunities as well as challenges.

In this research we have attempted to uncover evidence of existing practices rather than speculating about the potential long-term impacts.

In the first two years of the pandemic, the Ada Lovelace Institute raised concerns about the potential risks and negative longer-term implications of COVID-19 technologies for society, beyond the COVID-19 pandemic. The main concerns were about:

  • repurposing of digital vaccine passports and contact tracing apps beyond the health context, such as for generalised surveillance
  • expanding or transforming digital vaccine passports into wider digital identity systems, by allowing digital vaccine passports to ‘set precedents and norms that influence and accelerate the creation of other systems for identification and surveillance’
  • damaging public trust in health and social data-sharing technologies if these technologies were mismanaged, repurposed or ineffective.[231]

In this section, we identify three outstanding research questions which would allow these three potential longer-term risks and implications to be examined. Addressing these questions will require consistent research and thinking on the evolution of COVID-19 technologies and their longer-term implications for society and technology.

Governments, civil society and the technology industry should consider the following under-researched questions, and should work together to increase understanding of contact tracing apps and digital vaccine passports and their long-term impact.

Question 1: Will contact tracing apps and digital vaccine passports continue to be used? If so, what will happen to the collected data?

Only a minority of countries, including Australia, Canada and Estonia,[232] have decommissioned their contact tracing apps and deleted the data collected. Digital vaccine passport infrastructure is still in place in many countries across the world, despite most countries having adopted a ‘living with COVID’ policy.

It is important to consider the current and future objectives of governments that are preserving these technological infrastructures, as well as how they intend to use the collected data beyond the pandemic. Given that most countries in our sample did not enact strong regulations with sunset clauses that restrict use and clarify structures or guidance to support deletion, it is crucial that we continue to monitor the future uses of these technologies and ensure that they are not repurposed beyond the health context.

Question 2: How will the infrastructure of COVID-19 technologies and related regulation persist in future health data and digital identity systems?

Digital vaccine passports have accelerated moves towards digital identity schemes in many countries and regional blocs.[233] In Saudi Arabia, the Tawakkalna contact tracing app has been transformed into a comprehensive digital identity system, which received a public service award from the United Nations for institutional resilience and innovative responses to the COVID-19 pandemic.[234]

The African Union, which built the My COVID Pass vaccine passport app in collaboration with the Africa Centres for Disease Control and Prevention, is working towards building a digital ID framework for the African continent. The EU introduced uniform and inter-operable proofs of vaccination through the EU Digital COVID Certificate.

It is not yet clear what the societal implications of these changes of use are, or how they will affect fundamental rights and protections. Following the Digital COVID Certificate’s perceived success among policymakers, the European Commission plans to introduce an EU digital wallet that will give every EU citizen digital identity credentials that are recognised throughout the EU zone.

In some countries, healthcare systems have been transformed as a result of COVID-19 technologies. India has transformed its contact tracing app Aarogya Setu to become the nation’s health app.[235]

In the UK, data and AI have been central to the Government’s response to the pandemic. This has accelerated proposals to use health data for research and planning services. NHS England has initiated a ‘federated data platform’. This will enable NHS organisations to share their operational data through software.

It is hoped that researchers and experts from academia, industry and the charity sector will use the data gathered on the platform for research and analysis to improve the health sector in England.[236]

The federated data platform initiative has been recognised for its potential to transform the healthcare system, but it has also caused concerns about accountability and trustworthiness, as patients’ data will be accessible to many stakeholders.[237] These include private technology companies like Palantir, which has been reported as not always being transparent in how it gathers, analyses and uses people’s data.[238]

These changes in digital identity and health ecosystems can provide significant economic and societal benefits to individuals and nations.[239] But they should be well designed and governed in order to benefit everyone in society. In this context, it is necessary to continue monitoring the evolution of COVID-19 technologies into new digital platforms and to understand their legislative, technical and societal legacies.

Question 3: How have COVID-19 technologies affected the public’s attitudes towards data-driven technologies in general?

There is a lot of research on public attitudes towards COVID-19 technologies. This body of research was largely undertaken in the first years of the pandemic.[240] But the question of whether, and how, these technologies have affected people’s attitudes towards data-driven technologies beyond the pandemic has received little attention.

People had to use these technologies in their everyday lives to prove their identity and share their health and other kinds of personal information. But, as demonstrated in this report, there have been incidents that might have damaged people’s confidence in the technologies’ safety and effectiveness.

In this context, we believe that it is crucial to continue to reflect on COVID-19 technologies’ persistent impacts on public attitudes towards data-driven technologies – particularly, those technologies that entail sensitive personal data.

Methodology

In 2020 and 2021, the Ada Lovelace Institute conducted extensive research on COVID-19 technologies. We organised workshops and webinars, and conducted public attitudes research, evidence reviews and desk research. We published nine reports and two monitors. This body of research highlighted the risks and challenges these technologies posed and made policy recommendations to ensure that they would not cause or exacerbate harms and would benefit everyone in society equally.

In the first two years of the pandemic, many countries rolled out digital vaccine passports and contact tracing apps, as demonstrated in ‘International monitor: vaccine passports and COVID-19 status apps’.[241] In January 2022, as we were entering the third year of the pandemic, we adjusted the scope and objectives of the COVID-19 technologies project. Having focused on the benefits, risks and challenges in the first two years, from January 2022 onwards we focused on the lessons learned from these technologies. We aimed to address the following questions:

  1. Did COVID-19 technologies work? Were they effective public health tools?
  2. Did people accept them?
  3. How did they affect inequalities?
  4. Were they governed well and with accountability?
  5. What lessons can we learn from the deployment and uses of these new technologies?

Sampling

We aimed for regional representation in our sample. We decided to focus on policies and practices in 34 countries in total. We based our sampling on the geographical regions of North Africa, Central Africa, Southern Africa, South East Asia, Central Asia, East Asia, North America, South America, Eastern Europe, the European Union, West Asia and Oceania.

Relying on Our World in Data[242] datasets on total deaths, total cases and the share of people who had completed the initial vaccine protocol in 194 countries on 5 June 2022, we created a pandemic impact score for each country, giving equal weight to each of the three variables.

In each geographical region, we then selected two countries with the highest impact scores, two countries with medium impact scores and two countries with low impact scores for detailed review. A minimal sketch of this scoring and selection follows.
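
The sketch below illustrates, under stated assumptions, how an equal-weight impact score could be computed and how countries could be ranked within regions. The column names ("total_deaths", "total_cases", "share_fully_vaccinated", "region"), the min-max normalisation step and the way the medium-impact countries are picked are illustrative assumptions; the report does not specify the exact scaling or tie-breaking used.

```python
import pandas as pd

# Illustrative sketch only. Column names, the min-max normalisation and the
# selection of "medium impact" countries are assumptions for clarity; they
# are not the project's exact method.
METRICS = ["total_deaths", "total_cases", "share_fully_vaccinated"]  # hypothetical columns

def add_impact_score(df: pd.DataFrame) -> pd.DataFrame:
    scored = df.copy()
    for col in METRICS:
        lo, hi = scored[col].min(), scored[col].max()
        scored[f"{col}_norm"] = (scored[col] - lo) / (hi - lo)
    # Equal weight to each of the three normalised variables
    scored["impact_score"] = scored[[f"{c}_norm" for c in METRICS]].mean(axis=1)
    return scored

def select_countries(scored: pd.DataFrame, n: int = 2) -> pd.DataFrame:
    """Pick n high-, n medium- and n low-impact countries per region."""
    def pick(group: pd.DataFrame) -> pd.DataFrame:
        ranked = group.sort_values("impact_score", ascending=False)
        start = max(len(ranked) // 2 - n // 2, 0)
        middle = ranked.iloc[start:start + n]
        return pd.concat([ranked.head(n), middle, ranked.tail(n)]).drop_duplicates()
    return scored.groupby("region", group_keys=False).apply(pick)
```

Because deaths, cases and vaccination share sit on very different scales, some form of normalisation (here, min-max) is needed before the three variables can be averaged with equal weights.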

Methods and evidence

This research project encompasses evidence from 34 countries (see the list of the countries in our sample).

Unsurprisingly, the amount and type of evidence on each country varies significantly. Our aim in this research project is not to compare these countries with very different technical infrastructures, political cultures and pandemic management strategies, but to have a number of shared criteria against which we can assess the policies, practices and technical infrastructure in these countries.

With this aim in mind, we established a list of data categories for collecting country-specific information (a sketch of a corresponding per-country record follows the list):

  • introduction date of vaccine passports
  • end date of vaccine passport regulations
  • protests against vaccine passports or contact tracing apps
  • implementations of vaccine passports, for example, being mandatory in workplaces, for international travel, etc.
  • cumulative number of cases when digital vaccine passports were introduced
  • cumulative number of deaths when digital vaccine passports were introduced
  • share of the vaccinated people when digital vaccine passports were introduced
  • whether there was a government-launched contact tracing app
  • technical infrastructure of contact tracing apps
  • reported cases of surveillance
  • reported cases of repurposing data
  • reported cases of rights infringements
  • evidence on whether COVID-19 technologies increased societal inequalities (for example, around digital exclusion)
  • evidence on whether COVID-19 technologies increased global inequalities
  • evidence on the effectiveness of digital vaccine passports and contact tracing apps.
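
Purely as an illustration of how these categories could be held together in analysis, the sketch below defines a per-country record. All field names and types are our own assumptions and do not reflect the project’s actual data structures.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Illustrative per-country record mirroring the data categories listed above.
# Field names and types are assumptions, not the project's schema.
@dataclass
class CountryRecord:
    country: str
    passport_introduced: Optional[str] = None           # introduction date of vaccine passports
    passport_regulation_ended: Optional[str] = None     # end date of vaccine passport regulations
    protests_reported: bool = False                      # protests against passports or apps
    passport_uses: List[str] = field(default_factory=list)   # e.g. workplaces, international travel
    cases_at_introduction: Optional[int] = None
    deaths_at_introduction: Optional[int] = None
    vaccinated_share_at_introduction: Optional[float] = None
    official_contact_tracing_app: Optional[bool] = None
    app_infrastructure: Optional[str] = None              # e.g. centralised or decentralised
    surveillance_cases: List[str] = field(default_factory=list)
    data_repurposing_cases: List[str] = field(default_factory=list)
    rights_infringement_cases: List[str] = field(default_factory=list)
    evidence_societal_inequalities: Optional[str] = None
    evidence_global_inequalities: Optional[str] = None
    evidence_effectiveness: Optional[str] = None
```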

We used the following methods and resources to gather evidence on the data categories outlined above:

External datasets

We used quantitative datasets from other organisations’ data trackers and policy monitors for the following data categories (a minimal data-loading sketch follows the list):

  • proportion of vaccinated people, from Our World in Data[243]
  • COVID-19 restrictions (for example, school closures and lockdowns), from the Blavatnik School of Government, Oxford University[244]
  • cumulative number of cases, from Our World in Data[245]
  • cumulative number of deaths, from Our World in Data.[246]
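
As a minimal sketch of how such quantitative indicators can be pulled for a given cut-off date, the snippet below reads the public Our World in Data COVID-19 dataset. The URL and column names reflect our understanding of that dataset and should be checked against its current data dictionary before use.

```python
import pandas as pd

# Assumed location and columns of the public OWID COVID-19 dataset; verify
# against the dataset's documentation before relying on them.
OWID_URL = "https://covid.ourworldindata.org/data/owid-covid-data.csv"

def snapshot(cutoff: str) -> pd.DataFrame:
    """Return per-country totals on a single date (e.g. a sampling cut-off)."""
    df = pd.read_csv(OWID_URL, parse_dates=["date"])
    cols = ["location", "total_cases", "total_deaths",
            "people_fully_vaccinated_per_hundred"]
    return df.loc[df["date"] == pd.Timestamp(cutoff), cols]

# Example: snapshot("2022-06-05")  # the cut-off date mentioned in the sampling section
```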

Call for evidence

In July 2022, we announced an international call for input on the effectiveness and social impact of digital vaccine passports and contact tracing apps. We incorporated the relevant evidence submitted to this call into the evidence base. For some countries, the evidence submitted was helpful as it either provided us with the missing information or confirmed that the respective country did not have an official regulation (or protocol) to govern vaccine passports or contact tracing apps.

We also worked with some of the individuals and organisations that submitted evidence as consultants to acquire further information on their respective country of expertise.

Workshop

We organised an evidence-building workshop in October 2022 to deliberate on the effectiveness of contact tracing apps in Europe, with experts from the disciplines of epidemiology, cybersecurity, public health, law, and media and communications.

The participants’ multidisciplinary backgrounds allowed a focus on effectiveness beyond technical efficacy, taking in the social, legislative and regulatory impacts of the apps.

Desk research

Between August 2022 and January 2023 we conducted multiple structured internet search queries for each country in our sample. The keywords included ‘vaccine certificate’, ‘vaccine passport’, ‘immunity certificate’, ‘digital contact tracing’, ‘contact tracing app’ and ‘COVID technologies’, each combined with the name of the country (a minimal sketch of the query construction follows).
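
A minimal sketch of how such query strings can be composed is shown below; the keyword list comes from the text above, while the query format and helper names are assumptions for illustration.

```python
from typing import List

# Keywords taken from the desk-research description above; the query format
# (quoted keyword plus country name) is an assumption for illustration.
KEYWORDS = [
    "vaccine certificate", "vaccine passport", "immunity certificate",
    "digital contact tracing", "contact tracing app", "COVID technologies",
]

def build_queries(countries: List[str]) -> List[str]:
    """One search query per (keyword, country) pair, e.g. '"vaccine passport" Nigeria'."""
    return [f'"{kw}" {country}' for country in countries for kw in KEYWORDS]

if __name__ == "__main__":
    print(build_queries(["Nigeria", "Brazil"])[:3])
```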

This approach to desk research enabled collection and analysis of evidence from three different types of resources: media news, government websites, and academic and grey literature (material produced by organisations that are not traditional publishers, such as government documents and third-sector organisation reports).

Limitations

There are 34 countries in this research sample. Although the sampling covers every continent, as discussed in the sampling section, we do not claim that our country-specific findings are representative of continents, regions or political blocs. Similarly, we also do not claim exhaustive evidence on developments in every country.

We also recognise that, as a UK-based organisation, we might face barriers to discovering evidence emerging from various parts of the world. Our qualitative evidence on media reports in particular is largely in the English language, although there are a few exceptions. We worked with consultants from Brazil, India, Egypt, China and South Africa who provided us with non-English language media and government reports that we had not been able to capture through desk research.

The language barrier also emerged in our policy analysis. We aimed to collect data on policies and regulations from government websites and official policy papers. We used online translation software to conduct research in the official languages of the countries in our sample.

The low rate of success in discovering countries’ official policy papers indicates the limitations of this method. Not all governments made their policies and practices for contact tracing apps and digital vaccine passports publicly available. In this context, while the small number of policy papers we gathered is partly due to the language barrier, it also reflects governments’ lack of transparency about the uses and governance of these technologies.

Acknowledgements

This report was lead-authored by Melis Mevsimler, with substantive contributions from Bárbara Prado Simão, Dr Nagla Rizk, Gabriella Razzano and Prateek Waghre, who provided evidence and analysis as consultants.

Participants in the workshop:

Professor Christophe Fraser, University of Oxford

Professor Susan Landau, Tufts University

Dr Frans Folkvord, Tilburg University

Claudia Wladdimiro Quevedo, Uppsala University

Dr Simon Williams, Swansea University

Francisco Lupianez Villanueva, Open University of Catalonia

Krzysztof Izdebski, Open Spending EU Coalition

Dr Stephen Farrell, Trinity College Dublin

Dr Laszlo Horvath, Birkbeck, University of London

Dr Mustafa Al-Haboubi, London School of Hygiene & Tropical Medicine

Danqi Guo, Free University of Berlin

Dr Federica Lucivero, University of Oxford

Shahrzad Seyfafheji, Bilkent University

Dr Agata Ferretti, ETH Zurich

Yasemin Gumus Agca, Bilkent University

Boudewijn van Eerd, AWO

Peer reviewers:

Eleftherios Chelioudakis, AWO

Hunter Dowart, Bird & Bird

Professor Ana Beduschi, University of Exeter


Footnotes

[1] Carly Kind, ‘What will the first pandemic of the algorithmic age mean for data governance?’ (Ada Lovelace Institute, 2 April 2020) www.adalovelaceinstitute.org/blog/first-pandemic-of-the-algorithmic-age-data-governance/#:~:text=Coronavirus%20is%20the%20first%20pandemic,its%20detection%2C%20treatment%20and%20prevention accessed 12 April 2023.

[2] The BMJ, ‘Artificial intelligence and Covid-19’, www.bmj.com/AICOVID19 accessed 31 March 2023.

[3] For example, G Samuel and others, ‘COVID-19 Contact Tracing Apps: UK Public Perceptions’ (2021) 32:1 Critical Public Health 31, https://doi.org/10.1080/09581596.2021.1909707; MC Mills and T Ruttanauer, ‘The Effect of Mandatory COVID-19 Certificates on Vaccine Uptakes: Synthetic-Control Modelling of Six Countries’ (2022) 7:1 The Lancet 15, https://doi.org/10.1016/S2468-2667(21)00273-5.

[4] ‘COVID-19 Law Lab’ https://covidlawlab.org accessed 31 March 2023; ‘Lex-Atlas: Covid-19’ https://lexatlas-c19.org accessed 31 March 2023; ‘Digital Global Health and Humanitarianism Lab (DGHH Lab)’ https://dghhlab.com/publications/#PUB-DRCOVID19 accessed 31 March 2023.

[5] AWO, ‘Assessment of Covid-19 response in Brazil, Colombia, India, Iran, Lebanon and South Africa’ (29 July 2021) www.awo.agency/blog/covid-19-app-project accessed 13 April 2023.

[6] MIT Technology Review, ‘Covid Tracing Tracker’ www.technologyreview.com/tag/covid-tracing-tracker accessed 31 March 2023.

[7] World Health Organization, ‘Statement on the fourteenth meeting of the International Health Regulations (2005) Emergency Committee regarding the coronavirus disease (COVID-19) pandemic’ (WHO, 30 January 2023) www.who.int/news/item/30-01-2023-statement-on-the-fourteenth-meeting-of-the-international-health-regulations-(2005)-emergency-committee-regarding-the-coronavirus-disease-(covid-19)-pandemic accessed 31 March 2023.

[8] World Health Organization, ‘Statement on the fifteenth meeting of the IHR (2005) Emergency Committee on the COVID-19 pandemic’, (WHO 5 May 2023) https://www.who.int/news/item/05-05-2023-statement-on-the-fifteenth-meeting-of-the-international-health-regulations-(2005)-emergency-committee-regarding-the-coronavirus-disease-(covid-19)-pandemic accessed 31 May 2023

[9] GOVLAB and Knight Foundation, ‘The #Data4Covid19 Review’ https://review.data4covid19.org accessed 12 April 2023.

[10] M Shahroz and others, ‘COVID-19 Digital Contact Tracing Applications and Techniques: A Review Post Initial Deployments’ (2021) 5 Transportation Engineering 100072, https://doi.org/10.1016/j.treng.2021.100072.

[11] Ada Lovelace Institute, Checkpoints for vaccine passports (2021) www.adalovelaceinstitute.org/report/checkpoints-for-vaccine-passports accessed 12 April 2023.

[12] A Hussain, ‘TraceTogether data used by police in one murder case: Vivian Balakrishnan’ (Yahoo! News, 5 January 2021) https://uk.style.yahoo.com/trace-together-data-used-by-police-in-one-murder-case-vivian-084954246.html?guccounter=2 accessed 12 April 2023; DW, ‘German police under fire for misuse of COVID app’ DW (11 January 2022) www.dw.com/en/german-police-under-fire-for-misuse-of-covid-contact-tracing-app/a-60393597 accessed 31 March 2023.

[13] Carly Kind, ‘What will the first pandemic of the algorithmic age mean for data governance?’ (Ada Lovelace Institute, 2 April 2020) www.adalovelaceinstitute.org/blog/first-pandemic-of-the-algorithmic-age-data-governance/#:~:text=Coronavirus%20is%20the%20first%20pandemic,its%20detection%2C%20treatment%20and%20prevention accessed 26 April 2023.

[14] The BMJ, ‘Artificial intelligence and covid-19’, www.bmj.com/AICOVID19 accessed 31 March 2023.

[15] LO Danquah and others, ‘Use of a Mobile Application for Ebola Contact Tracing and Monitoring in Northern Sierra Leone: A Proof-of-Concept Study’ (2019) 19 BMC Infectious Diseases 810, https://doi.org/10.1186/s12879-019-4354-z.

[16] Fabio Chiusi and others, ‘Automating COVID Responses: The Impact of Automated Decision-Making on the COVID-19 Pandemic’ (AlgorithmWatch 2022) https://algorithmwatch.org/en/wp-content/uploads/2021/12/Tracing-The-Tracers-2021-report-AlgorithmWatch.pdf accessed 26 April 2023.

[17] F Yang, L. Heemsbergen and R Fordyce, ‘Comparative Analysis of China’s Health Code, Australia’s COVIDSafe and New Zealand’s COVID Tracer Surveillance App: A New Corona of Public Health Governmentality?’ (2020) 178:1 Media International Australia 182, 10.1177/1329878X20968277.

[18] F Yang, L Heemsbergen and R Fordyce, ‘Comparative Analysis of China’s Health Code, Australia’s COVIDSafe and New Zealand’s COVID Tracer Surveillance App: A New Corona of Public Health Governmentality?’ (2020) 178:1 Media International Australia 182, 10.1177/1329878X20968277.

[19] Ada Lovelace Institute, ‘Health data and COVID-19 technologies’ https://www.adalovelaceinstitute.org/our-work/programmes/health-data-covid-19-tech/
accessed 31 May 2023.

[20] Ada Lovelace Institute, Checkpoints for vaccine passports (2021) www.adalovelaceinstitute.org/report/checkpoints-for-vaccine-passports accessed 30 March 2023.

[21] Ada Lovelace Institute, Checkpoints for vaccine passports (2021) www.adalovelaceinstitute.org/report/checkpoints-for-vaccine-passports accessed 30 March 2023; Ada Lovelace Institute, ‘Exit through the App Store? COVID-19 rapid evidence review’ (2020) www.adalovelaceinstitute.org/evidence-review/covid-19-rapid-evidence-review-exit-through-the-app-store accessed 30 March 2023.

[22] Ada Lovelace Institute, Checkpoints for vaccine passports (2021) www.adalovelaceinstitute.org/report/checkpoints-for-vaccine-passports accessed 12 April 2023; ‘Exit through the App Store? COVID-19 Rapid Evidence Review’ (2020) www.adalovelaceinstitute.org/evidence-review/covid-19-rapid-evidence-review-exit-through-the-app-store accessed 12 April 2023; ‘No Green Lights, No Red Lines’ (2020) www.adalovelaceinstitute.org/report/covid-19-no-green-lights-no-red-lines accessed 12 April 2023; ‘Confidence in a Crisis? Building Public Trust in a Contact Tracing App’ (2020) www.adalovelaceinstitute.org/report/confidence-in-crisis-building-public-trust-contact-tracing-app accessed 12 April 2023.

[23] DW, ‘German police under fire for misuse of COVID app’ DW (11 January 2022) www.dw.com/en/german-police-under-fire-for-misuse-of-covid-contact-tracing-app/a-60393597 accessed 31 March 2023; E Tham, ‘China Bank Protest Stopped by Health Codes Turning Red, Depositors Say’ (Reuters, 16 June 2022) www.reuters.com/world/china/china-bank-protest-stopped-by-health-codes-turning-red-depositors-say-2022-06-14 accessed 31 March 2023.

[24] Ada Lovelace Institute, ‘COVID-19 Data Explorer: Policies, Practices and Technology’ (2023) https://covid19.adalovelaceinstitute.org accessed 31 May 2023.

[25] Ada Lovelace Institute, ‘Health data and COVID-19 technologies’  https://www.adalovelaceinstitute.org/our-work/programmes/health-data-covid-19-tech accessed 31 May 2023.

[26] Centers for Disease Control and Prevention ‘Contact Tracing’ (2022) www.cdc.gov/coronavirus/2019-ncov/easy-to-read/contact-tracing.html accessed 31 March 2023.

[27] M Hunter, ‘Track and Trace, Trial and Error: Assessing South Africa’s Approaches to Privacy in Covid-19 Digital Contact Tracing’ (December 2020) www.researchgate.net/publication/350896038_Track_and_trace_trial_and_error_Assessing_South_Africa%27s_approaches_to_privacy_in_Covid-19_digital_contact_tracing accessed 31 March 2023.

[28] Some areas used manual contact tracing effectively, for example Vietnam and the Indian state of Kerala. See G Razzano, ‘Digital hegemonies for COVID-19’ (Global Data Justice, 5 November 2020) https://globaldatajustice.org/gdj/188 accessed 31 March 2023.

[29] C Yang, ‘Digital Contact Tracing in the Pandemic Cities: Problematizing the Regime of Traceability in South Korea’ (2022) 9:1 Big Data & Society https://doi.org/10.1177/20539517221089294.

[30] Freedom House ‘Freedom on the net 2021: South Africa’ (2021) https://freedomhouse.org/country/south-africa/freedom-net/2021 accessed 31 March 2023.

[31] M Hunter, ‘Track and Trace, Trial and Error: Assessing South Africa’s Approaches to Privacy in Covid-19 Digital Contact Tracing’ (December 2020) www.researchgate.net/publication/350896038_Track_and_trace_trial_and_error_Assessing_South_Africa%27s_approaches_to_privacy_in_Covid-19_digital_contact_tracing accessed 31 March 2023.

[32] Ada Lovelace Institute, Checkpoints for vaccine passports (2021) www.adalovelaceinstitute.org/report/checkpoints-for-vaccine-passports accessed 9 June 2023.

[33] Ada Lovelace Institute, ‘Provisos for a contact tracing app: The route to trustworthy digital contact tracing’ (4 May 2020) www.adalovelaceinstitute.org/evidence-review/provisos-covid-19-contact-tracing-app accessed 31 March 2023.

[34] Ada Lovelace Institute, ‘COVID-19 Data Explorer: Policies, Practices and Technology’ (2023), https://covid19.adalovelaceinstitute.org  accessed 31 May 2023

[35] M Ciucci and F Gouarderes, ‘National COVID-19 Contact Tracing Apps’ (Think Tank European Parliament, 15 May 2020) www.europarl.europa.eu/thinktank/en/document/IPOL_BRI(2020)652711 accessed 31 March 2023.

[36] M Briers, C Holmes and C Fraser, ‘Demonstrating the impact of the NHS COVID-19 app: Statistical analysis from researchers supporting the development of the NHS COVID-19 app’ (The Alan Turing Institute, 2020) www.turing.ac.uk/blog/demonstrating-impact-nhs-covid-19-app accessed 31 March 2023.

[37] M Veale, ‘The English Law of QR Codes: Presence Tracing and Digital Divides’ (Lex-Atlas: Covid-19, 25 May 2021) https://lexatlas-c19.org/the-english-law-of-qr-codes accessed 31 March 2023.

[38] M Veale, ‘The English Law of QR Codes: Presence Tracing and Digital Divides’ (Lex-Atlas: Covid-19, 25 May 2021) https://lexatlas-c19.org/the-english-law-of-qr-codes accessed 31 March 2023.

[39] Ministry of Health, ‘Ministry of Health to trial Near Field Communication (NFC) tap in technology with NZ COVID Tracer’ (Ministry of Health, New Zealand, 2021) www.health.govt.nz/news-media/media-releases/ministry-health-trial-near-field-communication-nfc-tap-technology-nz-covid-tracer accessed 14 April 2023.

[40] We draw on evidence from a cross-section of 34 countries in this report. Three countries in our sample never launched a national contact tracing app, and we could not find reliable information on six countries. You can find more information on the technical infrastructure of contact tracing apps in the COVID-19 Data Explorer. Ada Lovelace Institute, ‘COVID-19 Data Explorer: Policies, Practices and Technology’ (May 2023) https://covid19.adalovelaceinstitute.org accessed 31 May 2023.

[41] L White and P Basshuysen, ‘Privacy versus Public Health? A Reassessment of Centralised and Decentralised Digital Contact Tracing’ (2021) 27 Science and Engineering Ethics 23 https://doi.org/10.1007/s11948-021-00301-0 accessed 31 March 2023

[42] M Ciucci and F Gouarderes, ‘National COVID-19 Contact Tracing Apps’ (Think Tank European Parliament, 15 May 2020) www.europarl.europa.eu/thinktank/en/document/IPOL_BRI(2020)652711 accessed 31 March 2023.

[43] E Braun, ‘French contact-tracing app sent just 14 notifications after 2 million downloads’ (Politico, 23 June 2020) www.politico.eu/article/french-contact-tracing-app-sent-just-14-notifications-after-2-million-downloads accessed 31 March 2023; BBC News ‘Australia Covid: Contact tracing app branded expensive “failure”’ (10 August 2022) www.bbc.co.uk/news/world-australia-62496322 accessed 31 March 2023.

[44] M Veale, ‘Opinion: Privacy is not the problem with the Apple-Google contact tracing app’ (UCL News, 1 July 2020) www.ucl.ac.uk/news/2020/jul/opinion-privacy-not-problem-apple-google-contact-tracing-app accessed 31 March 2023; N Lomas ‘Germany ditches centralized approach to app for COVID-19 contacts tracing’ (TechCrunch, 27 April 2020) https://techcrunch.com/2020/04/27/germany-ditches-centralized-approach-to-app-for-covid-19-contacts-tracing accessed 31 March 2023.

[45] G Goggin, ‘COVID-19 Apps in Singapore and Australia: Reimagining Health Nations with Digital Technology’ (2020) 177:1 Media International Australia 61, 10.1177/1329878X20949770.

[46] G Goggin, ‘COVID-19 Apps in Singapore and Australia: Reimagining Health Nations with Digital Technology’ (2020) 177:1 Media International Australia 61, 10.1177/1329878X20949770.

[47] M Ciucci and F Gouarderes, ‘National COVID-19 Contact Tracing Apps’ (Think Tank European Parliament, 15 May 2020) www.europarl.europa.eu/thinktank/en/document/IPOL_BRI(2020)652711 accessed 26 May 2023; C Gorey, ‘4 things you need to know before installing the HSE Covid-19 contact-tracing app’ (Silicon Republic, 7 July 2020) www.siliconrepublic.com/enterprise/hse-COVID-19-contact-tracing-app accessed 31 March 2023.

[48] AL Popescu, ‘România în urma pandemiei. Statul ignoră propria aplicație anti-Covid, dar și una lansată gratis’ [‘Romania in the wake of the pandemic: The state ignores its own anti-Covid app, as well as one launched for free’] (Europa Libera Romania, 27 November 2020) https://romania.europalibera.org/a/rom%C3%A2nia-%C3%AEn-urma-pandemiei-statul-ignor%C4%83-propria-aplica%C8%9Bie-anti-covid-dar-%C8%99i-una-lansat%C4%83-gratis/30972627.html accessed 31 March 2023; Fabio Chiusi and others, ‘Automating COVID Responses: The Impact of Automated Decision-Making on the COVID-19 Pandemic’ (AlgorithmWatch 2022) https://algorithmwatch.org/en/wp-content/uploads/2021/12/Tracing-The-Tracers-2021-report-AlgorithmWatch.pdf accessed 31 March 2023.

[49] Several countries in our sample, such as China and India, had a very fragmented contact tracing app ecosystem, with various states, cities and municipalities attempting to create their own apps. There are therefore notable differences across provinces, making it difficult to capture the diversity of implementation and experiences.

[50] Ada Lovelace Institute, ‘COVID-19 Data Explorer: Policies, Practices and Technology’ (2023) https://covid19.adalovelaceinstitute.org accessed 31 May 2023.

[51] UK Health Security Agency, ‘NHS COVID-19 app’ (gov.uk, 2020) www.gov.uk/government/collections/nhs-covid-19-app accessed 31 March 2023.

[52] MIT Technology Review, ‘Covid Tracing Tracker’ (2021) www.technologyreview.com/tag/covid-tracing-tracker accessed 31 March 2023.

[53] Ada Lovelace Institute, ‘Exit through the App Store? COVID-19 rapid evidence review’ (19 April 2020) www.adalovelaceinstitute.org/evidence-review/covid-19-rapid-evidence-review-exit-through-the-app-store accessed 31 March 2023, 4.

[54] C Wymant, ‘The epidemiological impact of the NHS COVID-19 app’ (National Institutes of Health, 2021) https://pubmed.ncbi.nlm.nih.gov/33979832/ accessed 31 March 2023.

[55] RW Albertus and F Makoza, ‘An Analysis of the COVID-19 Contact Tracing App in South Africa: Challenges Experienced by Users’ (2022) 15:1 African Journal of Science, Technology, Innovation and Development 124,  https://doi.org/10.1080/20421338.2022.2043808; Office of Audit and Evaluation (Health Canada) and the Public Health Agency of Canada, ‘Evaluation of the National COVID-19 Exposure Notification App’ (Health Canada, 20 June 2022) www.canada.ca/en/health-canada/corporate/transparency/corporate-management-reporting/evaluation/covid-alert-national-covid-19-exposure-notification-app.html accessed 26 May 2023.

[56] F Vogt and others, ‘Effectiveness Evaluation of Digital Contact Tracing for COVID-19 in New South Wales, Australia’ (2022) 7:3 The Lancet E250, https://doi.org/10.1016/S2468-2667(22)00010-X; Ada Lovelace Institute, ‘Provisos for a contact tracing app: The route to trustworthy digital contact tracing’ (2020) www.adalovelaceinstitute.org/evidence-review/provisos-covid-19-contact-tracing-app accessed 26 May 2023.

[57] E Braun, ‘French contact-tracing app sent just 14 notifications after 2 million downloads.’ (Politico, 23 June 2020) www.politico.eu/article/french-contact-tracing-app-sent-just-14-notifications-after-2-million-downloads accessed 31 March 2023.

[58] F Vogt and others, ‘Effectiveness Evaluation of Digital Contact Tracing for COVID-19 in New South Wales, Australia’ (2022) 7:3 The Lancet E250, https://doi.org/10.1016/S2468-2667(22)00010-X; AWO, ‘Assessment of Covid-19 response in Brazil, Colombia, India, Iran, Lebanon and South Africa’ (29 July 2021) www.awo.agency/blog/covid-19-app-project accessed 13 April 2023.

[59] AWO, ‘Assessment of Covid-19 response in Brazil, Colombia, India, Iran, Lebanon and South Africa’ (29 July 2021) www.awo.agency/blog/covid-19-app-project accessed 13 April 2023.

[60] For example, see Y Huang and others, ‘Users’ Expectations, Experiences, and Concerns with COVID Alert, an Exposure-Notification App’ (2022) 6: CSCW2 ACM Journals: Proceedings of the ACM on Human–Computer Interaction 350, https://doi.org/10.1145/3555770.

[61] ‘Digital Global Health and Humanitarianism Lab (DGHH Lab)’ https://dghhlab.com/publications/#PUB-DRCOVID19 accessed 31 March 2023.

[62] ‘Digital Global Health and Humanitarianism Lab (DGHH Lab)’ https://dghhlab.com/publications/#PUB-DRCOVID19 accessed 31 March 2023.

[63] BBC News, ‘Covid in Scotland: Thousands turn off tracking app’ (24 July 2021) www.bbc.co.uk/news/uk-scotland-57941343 accessed 31 March 2023.

[64] S Trendall, ‘Data suggests millions of users have not enabled NHS contact-tracing app’ (Public Technology, 30 June 2021) www.publictechnology.net/articles/news/data-suggests-millions-users-have-not-enabled-nhs-contact-tracing-app accessed 31 March 2023.

[65] V Garousi and D Cutting, ‘What Do Users Think of the UK’s Three COVID-19 Contact Tracing Apps? A Comparative Analysis’ (2021) 28:1 BMJ Health Care Inform e100320, 10.1136/bmjhci-2021-100320.

[66] Office of Audit and Evaluation (Health Canada) and the Public Health Agency of Canada, ‘Evaluation of the National COVID-19 Exposure Notification App’ (Health Canada, 20 June 2022) www.canada.ca/en/health-canada/corporate/transparency/corporate-management-reporting/evaluation/covid-alert-national-covid-19-exposure-notification-app.html accessed 31 March 2023.

[67] Y Huang and others, ‘Users’ Expectations, Experiences, and Concerns with COVID Alert, an Exposure-Notification App’ (2022) 6: CSCW2 ACM Journals: Proceedings of the ACM on Human–Computer Interaction 350, https://doi.org/10.1145/3555770.

[68] Ada Lovelace Institute, ‘Exit through the App Store? COVID-19 rapid evidence review’ (2020) www.adalovelaceinstitute.org/evidence-review/covid-19-rapid-evidence-review-exit-through-the-app-store accessed 31 March 2023.

[69] C Wymant, ‘The epidemiological impact of the NHS COVID-19 app’ (National Institutes of Health, 2021) https://directorsblog.nih.gov/2021/05/25/u-k-study-shows-power-of-digital-contact-tracing-in-the-pandemic accessed 26 May 2023.

[70] Ada Lovelace Institute, ‘Confidence in a crisis? Building public trust in a contact tracing app’ (2020) www.adalovelaceinstitute.org/report/confidence-in-crisis-building-public-trust-contact-tracing-app accessed 26 May 2023.

[71] Ada Lovelace Institute, ‘Exit through the App Store? COVID-19 rapid evidence review’ (2020) www.adalovelaceinstitute.org/evidence-review/covid-19-rapid-evidence-review-exit-through-the-app-store accessed 26 May 2023.

[72] F Yang, L. Heemsbergen and R Fordyce, ‘Comparative Analysis of China’s Health Code, Australia’s COVIDSafe and New Zealand’s COVID Tracer Surveillance App: A New Corona of Public Health Governmentality?’ (2020) 178:1 Media International Australia 182, 10.1177/1329878X20968277.

[73] Planet Payment, ‘Alipay and WeChat Pay’ https://www.planetpayment.com/en/merchants/alipay-and-wechat-pay/ accessed 26 May 2023.

[74] F Liang, ‘COVID-19 and Health Code: How Digital Platforms Tackle the Pandemic in China’ (2021) 6:3 Social Media + Society, https://doi.org/10.1177/2056305120947657; National Health Commission of the People’s Republic of China, ‘Prevention and control of novel coronavirus pneumonia’ (7 March 2020) www.nhc.gov.cn/xcs/zhengcwj/202003/4856d5b0458141fa9f376853224d41d7.shtml accessed 26 May 2023.

[75] W Bin and others, ‘Depositors Are Forcibly Given Red Codes, the Latest Responses from All Parties’ (Southern Metropolis Daily, 14 June 2022) https://mp.weixin.qq.com/s/KAc8_3rCviqnVv05aQvSlw?fbclid=IwAR1xfMQtjZsRikz9vkisYxQBVAAkE9tgekKnMQ4nPaynr2BN9Ceyep3mjq8 accessed 13 April 2023.

[76] S Chan, ‘COVID-19 contact tracing apps reach 9% adoption in most populous countries’ (Sensor Tower, July 2020) https://sensortower.com/blog/contact-tracing-app-adoption accessed 26 May 2023.

[77] Ada Lovelace Institute, ‘Confidence in a crisis? Building public trust in a contact tracing app’ (2020) www.adalovelaceinstitute.org/report/confidence-in-crisis-building-public-trust-contact-tracing-app accessed 26 May 2023

[78] L Muscato, ‘Why people don’t trust contact tracing apps, and what to do about it’ (Technology Review, 12 November 2020) www.technologyreview.com/2020/11/12/1012033/why-people-dont-trust-contact-tracing-apps-and-what-to-do-about-it accessed 31 March 2023; AWO, ‘Assessment of Covid-19 response in Brazil, Colombia, India, Iran, Lebanon and South Africa’ (29 July 2021) www.awo.agency/blog/covid-19-app-project accessed 13 April 2023; L Horvath and others, ‘Adoption and Continued Use of Mobile Contact Tracing Technology: Multilevel Explanations from a Three-Wave Panel Survey and Linked Data’ (2022) 12:1 BMJ Open e053327, 10.1136/bmjopen-2021-053327; Ada Lovelace Institute, ‘Public attitudes to COVID-19, technology and inequality: A tracker’ (2021) https://www.adalovelaceinstitute.org/resource/public-attitudes-covid-19/ accessed 26 May 2023; A Kozyreva and others, ‘Psychological Factors Shaping Public Responses to COVID-19 Digital Contact Tracing Technologies in Germany’ (2021) 11 Scientific Reports 18716, https://doi.org/10.1038/s41598-021-98249-5; G Samuel and others, ‘COVID-19 Contact Tracing Apps: UK Public Perceptions’ (2022) 32:1 Critical Public Health 31, 10.1080/09581596.2021.1909707; M Caserotti and others, ‘Associations of COVID-19 Risk Perception with Vaccine Hesitancy Over Time for Italian Residents’ (2021) 272 Social Science & Medicine 113688, 10.1016/j.socscimed.2021.113688.

[79] M Koetse ‘Goodbye, Health Code: Chinese netizens say farewell to the green horse’ (What’s on Weibo, 8 December 2022) www.whatsonweibo.com/goodbye-health-code-chinese-netizens-say-farewell-to-the-green-horse accessed 26 May 2023; L Houchen, ‘Are you ready to use the “Health Code” all the time?’ (7 April 2020) https://mp.weixin.qq.com/s/xDKKicV22IBRGnNnNStOVg accessed 26 May 2023. The National Health Commission’s notice to end the Health Code mandate did not immediately translate into municipal governments discontinuing their policies. See Health Commission, ‘Notice on printing and distributing the Prevention and Control Plan for Novel Coronavirus Pneumonia (Ninth Edition)’ (Health Commission, 28 June 2022) www.gov.cn/xinwen/2022-06/28/content_5698168.htm accessed 13 April 2023.

[80] For example, see Southern Metropolis Daily’s interview with a number of experts on the impacts of using health codes in China. W Bin and others, ‘Depositors Are Forcibly Given Red Codes, the Latest Responses from All Parties’ (Southern Metropolis Daily, 14 June 2022) https://mp.weixin.qq.com/s/KAc8_3rCviqnVv05aQvSlw?fbclid=IwAR1xfMQtjZsRikz9vkisYxQBVAAkE9tgekKnMQ4nPaynr2BN9Ceyep3mjq8 accessed 13 April 2023.

[81] M Caserotti and others, ‘Associations of COVID-19 Risk Perception with Vaccine Hesitancy Over Time for Italian Residents’ (2021) 272 Social Science & Medicine 113688, 10.1016/j.socscimed.2021.113688.

[82] M Dewatripont, ‘Policy Insight 110: Vaccination Strategies in the Midst of an Epidemic’ (Centre for Economic Policy Research, 1 October 2021) https://cepr.org/publications/policy-insight-110-vaccination-strategies-midst-epidemic accessed 13 April 2023.

[83] G Samuel and others, ‘COVID-19 Contact Tracing Apps: UK Public Perceptions’ (2022) 32:1 Critical Public Health 31, 10.1080/09581596.2021.1909707.

[84] S Landau, People Count: Contact-Tracing Apps and Public Health (The MIT Press, 2021).

[85] J Amann, J Sleigh and E Vayena, ‘Digital Contact-Tracing during the Covid-19 Pandemic: An Analysis of Newspaper Coverage in Germany, Austria, and Switzerland’ (2021) PLOS ONE, https://doi.org/10.1371/journal.pone.0246524.

[86] AWO, ‘Assessment of Covid-19 response in Brazil, Colombia, India, Iran, Lebanon and South Africa’ (29 July 2021) www.awo.agency/blog/covid-19-app-project accessed 13 April 2023.

[87] Office of Audit and Evaluation (Health Canada) and the Public Health Agency of Canada, ‘Evaluation of the National COVID-19 Exposure Notification App’ (Health Canada, 20 June 2022) www.canada.ca/en/health-canada/corporate/transparency/corporate-management-reporting/evaluation/covid-alert-national-covid-19-exposure-notification-app.html accessed 13 April 2023.

[88] J Ore, ‘Where did things go wrong with Canada’s COVID Alert App?’ (CBC, 9 February 2022) www.cbc.ca/radio/costofliving/from-boycott-to-bust-we-talk-spotify-and-neil-young-and-take-a-look-at-covid-alert-app-1.6339708/where-did-things-go-wrong-with-canada-s-covid-alert-app-1.6342632 accessed 13 April 2023.

[89] Office of Audit and Evaluation (Health Canada) and the Public Health Agency of Canada, ‘Evaluation of the National COVID-19 Exposure Notification App’ (Health Canada, 20 June 2022) www.canada.ca/en/health-canada/corporate/transparency/corporate-management-reporting/evaluation/covid-alert-national-covid-19-exposure-notification-app.html accessed 13 April 2023.

[90] S Landau, People Count: Contact-Tracing Apps and Public Health (The MIT Press, 2021).

[91] L Dowthwaite and others, ‘Public Adoption of and Trust in the NHS COVID-19 Contact Tracing App in the United Kingdom: Quantitative Online Survey Study’ (2021) 23:9 JMIR Publications e29085, 10.2196/29085.

[92] Ada Lovelace Institute, ‘Confidence in a crisis? Building public trust in a contact tracing app’ (2020) www.adalovelaceinstitute.org/report/confidence-in-crisis-building-public-trust-contact-tracing-app accessed 13 April 2023; ‘Provisos for a contact tracing app: The route to trustworthy digital contact tracing’ (2020) www.adalovelaceinstitute.org/evidence-review/provisos-covid-19-contact-tracing-app accessed 13 April 2023.

[93] C Bambra and others, ‘The COVID-19 Pandemic and Health Inequalities’ (2020) 74:11 Journal of Epidemiology & Community Health 964, http://dx.doi.org/10.1136/jech-2020-214401; E Yong, ‘The Pandemic’s Legacy Is Already Clear’ (The Atlantic, 30 September 2022) www.theatlantic.com/health/archive/2022/09/covid-pandemic-exposes-americas-failing-systems-future-epidemics/671608 accessed 13 April 2023.

[94] Ada Lovelace Institute, ‘Exit through the App Store? COVID-19 Rapid Evidence Review’ (2020) www.adalovelaceinstitute.org/evidence-review/covid-19-rapid-evidence-review-exit-through-the-app-store accessed 26 May 2023.

[95] L Marelli, K Kieslich and S Geiger, ‘COVID-19 and Techno-Solutionism: Responsibilization without Contextualization?’ (2022) 32:1 Critical Public Health 1, https://doi.org/10.1080/09581596.2022.2029192.

[96] S Landau, People Count: Contact-Tracing Apps and Public Health (The MIT Press, 2021).

[97] Government of Ireland, ‘COVID Tracker app’ www.covidtracker.ie accessed 31 March 2023.

[98]  S Landau, People Count: Contact-Tracing Apps and Public Health (The MIT Press, 2021).

[99] S Landau, People Count: Contact-Tracing Apps and Public Health (The MIT Press, 2021).

[100] S Landau, People Count: Contact-Tracing Apps and Public Health (The MIT Press, 2021).

[101] M Veale, ‘The English Law of QR Codes: Presence Tracing and Digital Divides’ (Lex-Atlas: Covid-19, 25 May 2021) https://lexatlas-c19.org/the-english-law-of-qr-codes accessed 31 March 2023.

[102] S Reed and others, ‘Tackling Covid-19: A Case for Better Financial Support to Self-Isolate’ (Nuffield Trust, 14 May 2021) www.nuffieldtrust.org.uk/research/tackling-covid-19-a-case-for-better-financial-support-to-self-isolate accessed 26 May 2023.

[103] Statista, ‘Internet user penetration in Nigeria from 2018 to 2027’ (June 2022) www.statista.com/statistics/484918/internet-user-reach-nigeria accessed 26 May 2023; G Razzano, ‘Privacy and the pandemic: An African response’ (Association For Progressive Communications, 21 June 2020) www.apc.org/en/pubs/privacy-and-pandemic-african-response accessed 31 March 2023.

[104] Ada Lovelace Institute, ‘Confidence in a Crisis? Building Public Trust in a Contact Tracing App’ (2020) www.adalovelaceinstitute.org/report/confidence-in-crisis-building-public-trust-contact-tracing-app accessed 26 May 2023; ‘Provisos for a Contact Tracing App: The Route to Trustworthy Digital Contact Tracing’ (2020) www.adalovelaceinstitute.org/evidence-review/provisos-covid-19-contact-tracing-app accessed 26 May 2023.

[105] Privacy International, ‘The principles of data protection: not new and actually quite familiar’ (24 September 2018) https://privacyinternational.org/news-analysis/2284/principles-data-protection-not-new-and-actually-quite-familiar accessed 31 March 2023; Ada Lovelace Institute, ‘Provisos for a contact tracing app: The route to trustworthy digital contact tracing’ (2020) www.adalovelaceinstitute.org/evidence-review/provisos-covid-19-contact-tracing-app accessed 26 May 2023; Ada Lovelace Institute, ‘Exit through the App Store? COVID-19 rapid evidence review’ (2020) www.adalovelaceinstitute.org/evidence-review/covid-19-rapid-evidence-review-exit-through-the-app-store accessed 26 May 2023.

[106] Ada Lovelace Institute, Checkpoints for vaccine passports (2021) www.adalovelaceinstitute.org/report/checkpoints-for-vaccine-passports accessed 26 May 2023.

[107] Ada Lovelace Institute, ‘Exit through the App Store? COVID-19 rapid evidence review’ (2020) www.adalovelaceinstitute.org/evidence-review/covid-19-rapid-evidence-review-exit-through-the-app-store accessed 26 May 2023.

[108] TT Altshuler and RA Hershkovitz, ‘Digital Contact Tracing and the Coronavirus: Israeli and Comparative Perspectives’ (The Brookings Institution, August 2020) www.brookings.edu/wp-content/uploads/2020/08/FP_20200803_digital_contact_tracing.pdf accessed 31 March 2023.

[109] P Garrett and others, ‘High Acceptance of COVID-19 Tracing Technologies in Taiwan: A Nationally Representative Survey Analysis’ (2022) 19:6 International Journal of Environmental Research and Public Health 3323, 10.3390/ijerph19063323.

[110] TT Altshuler and RA Hershkovitz, ‘Digital Contact Tracing and the Coronavirus: Israeli and Comparative Perspectives’ (The Brookings Institution, August 2020) www.brookings.edu/wp-content/uploads/2020/08/FP_20200803_digital_contact_tracing.pdf accessed 31 March 2023.

[111] J Zhu, ‘The Personal Information Protection Law: China’s version of the GDPR?’ (Columbia Journal of Transnational Law, 14 February 2022) www.jtl.columbia.edu/bulletin-blog/the-personal-information-protection-law-chinas-version-of-the-gdpr accessed 26 May 2023. It is noteworthy that there were pre-existing privacy rules embedded in several laws and regulations; however, these were not enforced with adequate oversight capacity. See A Geller, ‘How Comprehensive Is Chinese Data Protection Law? A Systematisation of Chinese Data Protection Law from a European Perspective’ (2020) 69:12 GRUR International Journal of European and International IP Law 1191, https://doi.org/10.1093/grurint/ikaa136.

[112] H Yu, ‘Living in the Era of Codes: A Reflection on China’s Health Code System’ (2022) Biosocieties, 10.1057/s41292-022-00290-8.

[113] A Li, ‘Explainer: China’s Covid-19 Health Code System’ (Hong Kong Free Press, 13 July 2022) https://hongkongfp.com/2022/07/13/explainer-chinas-COVID-19-health-code-system accessed 31 March 2023; A Clarance, ‘Aarogya Setu: Why India’s Covid-19 contact tracing app is controversial’ (BBC News, 15 May 2020) www.bbc.co.uk/news/world-asia-india-52659520 accessed 31 March 2023; W Bin and others, ‘Depositors Are Forcibly Given Red Codes, the Latest Responses from All Parties’ (Southern Metropolis Daily, 14 June 2022) https://mp.weixin.qq.com/s/KAc8_3rCviqnVv05aQvSlw?fbclid=IwAR1xfMQtjZsRikz9vkisYxQBVAAkE9tgekKnMQ4nPaynr2BN9Ceyep3mjq8 accessed 13 April 2023.

[114] TT Altshuler and RA Hershkovitz, ‘Digital Contact Tracing and the Coronavirus: Israeli and Comparative Perspectives’ (The Brookings Institution, August 2020) www.brookings.edu/wp-content/uploads/2020/08/FP_20200803_digital_contact_tracing.pdf accessed 31 March 2023.

[115] ‘Lex-Atlas: Covid-19’ https://lexatlas-c19.org accessed 31 March 2023.

[116] A Clarance, ‘Aarogya Setu: Why India’s Covid-19 contact tracing app is controversial’ (BBC News, 15 May 2020) www.bbc.co.uk/news/world-asia-india-52659520 accessed 31 March 2023.

[117] Internet Freedom Foundation, ‘Statement: Victory! Aarogya Setu changes from mandatory to, “best efforts”’ (18 May 2020) https://internetfreedom.in/aarogya-setu-victory accessed 26 May 2023.

[118] Evidence submitted to Ada Lovelace Institute by Internet Freedom Foundation, India.

[119] Norton Rose Fulbright, ‘Contact Tracing Apps: A New World for Data Privacy’ (February 2021) www.nortonrosefulbright.com/en/knowledge/publications/d7a9a296/contact-tracing-apps-a-new-world-for-data-privacy accessed 26 May 2023.

[120] T Klosowski, ‘The State of Consumer Data Privacy Laws in the US (and Why It Matters)’ (New York Times, 6 September 2021) www.nytimes.com/wirecutter/blog/state-of-privacy-laws-in-us accessed 26 May 2023.

[121] The Health Insurance Portability and Accountability Act is a federal law that protects sensitive patient health information, but contact tracing apps were not covered because they are not ‘regulated entities’ under the Act. Centers for Disease Control and Prevention, ‘Health Insurance Portability and Accountability Act of 1996 (HIPAA)’ https://www.cdc.gov/phlp/publications/topic/hipaa.html accessed 26 May 2023.

[122] Ada Lovelace Institute, ‘Exit through the App Store? COVID-19 rapid evidence review’ (2020) www.adalovelaceinstitute.org/evidence-review/covid-19-rapid-evidence-review-exit-through-the-app-store accessed 31 March 2023.

[123] P Valade, ‘Jumbo Privacy Review: North Dakota’s Contact Tracing App’ (Jumbo, 21 May 2020) https://blog.withjumbo.com/jumbo-privacy-review-north-dakota-s-contact-tracing-app.html accessed 31 March 2023.

[124]  Civil Liberties Union for Europe, ‘Do EU Governments Continue to Operate Contact Tracing Apps Illegitimately?’ (October 2021) https://dq4n3btxmr8c9.cloudfront.net/files/Nv4A36/DO_EU_GOVERNMENTS_CONTINUE_TO_OPERATE_CONTACT_TRACING_APPS_ILLEGITIMATELY.pdf accessed 31 March 2023.

[125] Fabio Chiusi and others, ‘Automating COVID Responses: The Impact of Automated Decision-Making on the COVID-19 Pandemic’ (AlgorithmWatch 2022) https://algorithmwatch.org/en/wp-content/uploads/2021/12/Tracing-The-Tracers-2021-report-AlgorithmWatch.pdf accessed 31 March 2023.

[126] A Hussain, ‘TraceTogether data used by police in one murder case: Vivian Balakrishnan’ (Yahoo! News, 5 January 2021) https://uk.style.yahoo.com/trace-together-data-used-by-police-in-one-murder-case-vivian-084954246.html?guccounter=2 accessed 31 March 2023.

[127] K Han, ‘COVID app triggers overdue debate on privacy in Singapore’ (Al Jazeera, 10 February 2021) www.aljazeera.com/news/2021/2/10/covid-app-triggers-overdue-debate-on-privacy-in-singapore accessed 31 March 2023.

[128] K Han, ‘COVID app triggers overdue debate on privacy in Singapore’ (Al Jazeera, 10 February 2021) www.aljazeera.com/news/2021/2/10/covid-app-triggers-overdue-debate-on-privacy-in-singapore accessed 31 March 2023.

[129] S Hilberg, ‘The new German Privacy Act: An overview’ (Deloitte) www2.deloitte.com/dl/en/pages/legal/articles/neues-bundesdatenschutzgesetz.html accessed 26 May 2023.

[130] Civil Liberties Union for Europe, ‘Do EU Governments Continue to Operate Contact Tracing Apps Illegitimately?’ (October 2021) https://dq4n3btxmr8c9.cloudfront.net/files/Nv4A36/DO_EU_GOVERNMENTS_CONTINUE_TO_OPERATE_CONTACT_TRACING_APPS_ILLEGITIMATELY.pdf accessed 31 March 2023.

[131] H Heine, ‘Check-In feature: Corona-Warn-App can now scan luca’s QR codes’ (Corona Warn-app Open Source Project, 9 November 2021) www.coronawarn.app/en/blog/2021-11-09-cwa-luca-qr-codes accessed 26 May 2023.

[132] Fabio Chiusi and others, ‘Automating COVID Responses: The Impact of Automated Decision-Making on the COVID-19 Pandemic’ (AlgorithmWatch 2022) https://algorithmwatch.org/en/wp-content/uploads/2021/12/Tracing-The-Tracers-2021-report-AlgorithmWatch.pdf accessed 26 May 2023.

[133] M Knodel, ‘Public Health, Big Tech, and Privacy: Multistakeholder Governance and Technology-Assisted Contact tracing’ (Global Insights, January 2021) www.ned.org/wp-content/uploads/2021/01/Public-Health-Big-Tech-Privacy-Contact-Tracing-Knodel.pdf accessed 16 April 2023.

[134] M Veale, ‘Opinion: Privacy is not the problem with the Apple-Google contact tracing app’ (UCL News, 1 July 2020) www.ucl.ac.uk/news/2020/jul/opinion-privacy-not-problem-apple-google-contact-tracing-app accessed 31 March 2023.

[135] Ada Lovelace Institute, Rethinking Data and Rebalancing Digital Power (2022) www.adalovelaceinstitute.org/report/rethinking-data accessed 16 April 2023.

[136] H Mance, ‘Shoshana Zuboff: “Privacy Has Been Extinguished. It Is Now a Zombie”’ (Financial Times, 30 January 2023) www.ft.com/content/0cca6054-6fc9-4a94-b2e2-890c50d956d5#myft:my-news:page accessed 16 April 2023.

[137] M Knodel, ‘Public Health, Big Tech, and Privacy: Multistakeholder Governance and Technology-Assisted Contact tracing’ (Global Insights, January 2021) www.ned.org/wp-content/uploads/2021/01/Public-Health-Big-Tech-Privacy-Contact-Tracing-Knodel.pdf accessed 16 April 2023.

[138] GOVLAB and Knight Foundation, ‘The #Data4Covid19 Review’ https://review.data4covid19.org accessed 16 April 2023.

[139] Ada Lovelace Institute, ‘Exit through the App Store? COVID-19 rapid evidence review’ (2020) www.adalovelaceinstitute.org/evidence-review/covid-19-rapid-evidence-review-exit-through-the-app-store accessed 16 April 2023.

[140] Ada Lovelace Institute, Checkpoints for vaccine passports (2021) www.adalovelaceinstitute.org/report/checkpoints-for-vaccine-passports accessed 16 April 2023.

[141] World Health Organization, ‘Estonia and WHO to jointly develop digital vaccine certificate to strengthen COVAX’ (WHO, 7 October 2020) www.who.int/news-room/feature-stories/detail/estonia-and-who-to-jointly-develop-digital-vaccine-certificate-to-strengthen-covax accessed 16 April 2023.

[142] World Health Organization, ‘Estonia and WHO to jointly develop digital vaccine certificate to strengthen COVAX’ (WHO, 7 October 2020) www.who.int/news-room/feature-stories/detail/estonia-and-who-to-jointly-develop-digital-vaccine-certificate-to-strengthen-covax accessed 16 April 2023.

[143] Pfizer, ‘Pfizer and BioNtech announce vaccine candidate against COVID-19 achieved success in first interim analysis from Phase 3 Study’ (9 November 2020) www.pfizer.com/news/press-release/press-release-detail/pfizer-and-biontech-announce-vaccine-candidate-against accessed 16 April 2023.

[144] NHS England, ‘Landmark moment as first NHS patient receives COVID-19 vaccination’ (NHS England News, December 2020) www.england.nhs.uk/2020/12/landmark-moment-as-first-nhs-patient-receives-COVID-19-vaccination accessed 12 April 2023.

[145] H Davidson, ‘China Approves Sinopharm Covid-19 Vaccine for General Use’ (Guardian, 31 December 2020) www.theguardian.com/world/2020/dec/31/china-approves-sinopharm-covid-19-vaccine-for-general-use accessed 12 April 2023.

[146] NHS England, ‘Landmark moment as first NHS patient receives COVID-19 vaccination’ (NHS England News, December 2020) www.england.nhs.uk/2020/12/landmark-moment-as-first-nhs-patient-receives-COVID-19-vaccination accessed 12 April 2023.

[147] Y Noguchi, ‘The history of vaccine passports in the US and what’s new’ (NPR, 8 April 2021) www.npr.org/2021/04/08/985253421/the-history-of-vaccine-passports-in-the-u-s-and-whats-new accessed 12 April 2023.

[148] Ada Lovelace Institute, ‘COVID-19 Data Explorer: Policies, Practices and Technology’ (2023) https://covid19.adalovelaceinstitute.org accessed 31 May 2023.

[149] Y Noguchi, ‘The history of vaccine passports in the US and what’s new’ (NPR, 8 April 2021) www.npr.org/2021/04/08/985253421/the-history-of-vaccine-passports-in-the-u-s-and-whats-new accessed 12 April 2023.

[150] K Teyras, ‘Covid-19 health passes can open the door to a digital ID revolution’ (THALES, 30 November 2021) https://dis-blog.thalesgroup.com/identity-biometric-solutions/2021/06/23/covid-19-health-passes-can-open-the-door-to-a-digital-id-revolution accessed 12 April 2023; Privacy International, ‘Covid-19 vaccination certificates: WHO sets minimum demands, governments must do even better’ (9 August 2021) https://privacyinternational.org/advocacy/4607/covid-19-vaccination-certificates-who-sets-minimum-demands-governments-must-do-even accessed 12 April 2023.

[151] S Davidson, ‘How vaccine passports could change digital identity’ (Digicert, 6 November 2021) www.digicert.com/blog/how-vaccine-passports-could-change-digital-identity accessed 12 April 2023.

[152] Ada Lovelace Institute, ‘International monitor: vaccine passports and COVID-19 status apps’ (2021) https://www.adalovelaceinstitute.org/resource/international-monitor-vaccine-passports-and-covid-19-status-apps/ accessed 12 April 2023.

[153] F Kritz, ‘The vaccine passport debate actually began in 1897 over a plague vaccine’ (NPR, 8 April 2021) www.npr.org/sections/goatsandsoda/2021/04/08/985032748/the-vaccine-passport-debate-actually-began-in-1897-over-a-plague-vaccine accessed 12 April 2023.

[154] F Kritz, ‘The vaccine passport debate actually began in 1897 over a plague vaccine’ (NPR, 8 April 2021) www.npr.org/sections/goatsandsoda/2021/04/08/985032748/the-vaccine-passport-debate-actually-began-in-1897-over-a-plague-vaccine accessed 12 April 2023.

[155] Ada Lovelace Institute, Checkpoints for Vaccine Passports (2021) www.adalovelaceinstitute.org/report/checkpoints-for-vaccine-passports accessed 12 April 2023.

[156] ibid.

[157] S Subramanian, ‘Biometric tracking can ensure billions have immunity against Covid-19’ (Bloomberg, 13 August 2020) https://www.bloomberg.com/features/2020-COVID-vaccine-tracking-biometric accessed 13 April 2023

[158] Ada Lovelace Institute, Checkpoints for vaccine passports (2021) www.adalovelaceinstitute.org/report/checkpoints-for-vaccine-passports accessed 12 April 2023.

[159] See ‘The legacy of COVID-19 technologies?: Outstanding questions’ section, p. 118.

[160] Ada Lovelace Institute, ‘COVID-19 Data Explorer: Policies, Practices and Technology’ (2023) https://covid19.adalovelaceinstitute.org accessed 31 May 2023.

[161] World Health Organization, ‘Rapidly escalating Covid-19 cases amid reduced virus surveillance forecasts a challenging autumn and winter in the WHO European Region’ (WHO, 19 July 2022) www.who.int/europe/news/item/19-07-2022-rapidly-escalating-COVID-19-cases-amid-reduced-virus-surveillance-forecasts-a-challenging-autumn-and-winter-in-the-who-european-region accessed 12 April 2023.

[162] F Kritz, ‘The vaccine passport debate actually began in 1897 over a plague vaccine’ (NPR, 8 April 2021) www.npr.org/sections/goatsandsoda/2021/04/08/985032748/the-vaccine-passport-debate-actually-began-in-1897-over-a-plague-vaccine accessed 12 April 2023.

[163] The New Zealand government shifted its policy towards COVID-19 acceptance by opening the borders and ending lockdowns in October 2021. See J Curtin, ‘The end of New Zealand’s zero-COVID policy’ (Think Global Health, 28 October 2021) www.thinkglobalhealth.org/article/end-new-zealands-zero-COVID-policy accessed 12 April 2023.

[164] Reuters, ‘Brazil health regulator asks Bolsonaro to retract criticism over vaccines’ (9 January 2022) www.reuters.com/business/healthcare-pharmaceuticals/brazil-health-regulator-asks-bolsonaro-retract-criticism-over-vaccines-2022-01-09 accessed 12 April 2023.

[165] Al Jazeera, ‘Brazil judge mandates proof of vaccination for foreign visitors’ (12 December 2021) www.aljazeera.com/news/2021/12/12/brazil-justice-mandates-vaccine-passport-for-visitors accessed 12 April 2023.

[166]  Y Noguchi, ‘The history of vaccine passports in the US and what’s new’ (NPR, 8 April 2021) www.npr.org/2021/04/08/985253421/the-history-of-vaccine-passports-in-the-u-s-and-whats-new accessed 12 April 2023.

[167] M Bull, ‘The Italian Government Response to Covid-19 and the Making of a Prime Minister’ (2021) 13:2 Contemporary Italian Politics 149, https://doi.org/10.1080/23248823.2021.1914453.

[168] A Peacock, ‘What is the Covid ‘Super Green’ pass?’ (Tuscany Now & More) www.tuscanynowandmore.com/discover-italy/essential-advice/travelling-italy-COVID-green-pass accessed 12 April 2023.

[169] Ada Lovelace Institute, ‘COVID-19 Data Explorer: Policies, Practices and Technology’ (May 2023) https://covid19.adalovelaceinstitute.org accessed 31 May 2023.

[170] DF Povse, ‘Examining the pros and cons of digital COVID certificates in the EU’ (Ada Lovelace Institute, 15 December 2022) www.adalovelaceinstitute.org/blog/examining-digital-covid-certificates-eu accessed 12 April 2023.

[171] Ada Lovelace Institute, ‘What place should COVID-19 vaccine passports have in society?’ (17 February 2021) www.adalovelaceinstitute.org/report/covid-19-vaccine-passports accessed 31 March 2023.

[172] Ada Lovelace Institute, ‘International Monitor: vaccine passports and COVID-19 status apps’ (15 October 2021) https://www.adalovelaceinstitute.org/resource/international-monitor-vaccine-passports-and-covid-19-status-apps/ accessed 31 March 2023

[173] S Amaro, ‘France’s Macron sparks outrage as he vows to annoy the unvaccinated’ (CNBC, 5 January 2022) www.cnbc.com/2022/01/05/macron-french-president-wants-to-annoy-the-unvaccinated-.html accessed 12 April 2023.

[174] G Vergallo and others, ‘Does the EU COVID Digital Certificate Strike a Reasonable Balance between Mobility Needs and Public Health?’ (2021) 57:10 Medicina (Kaunas) 1077, 10.3390/medicina57101077.

[175] C Franco-Paredes, ‘Transmissibility of SARS-CoV-2 among Fully Vaccinated Individuals’ (2022) 22:1 The Lancet P16, https://doi.org/10.1016/S1473-3099(21)00768-4.

[176] World Health Organization, ‘Information for the public: COVID-19 vaccines’ (WHO, 18 November 2022) https://www.who.int/westernpacific/emergencies/covid-19/information-vaccines accessed 1 June 2023.

[177] World Health Organization, ‘Vaccine efficacy, effectiveness and protection’ (WHO, 14 July 2021) www.who.int/news-room/feature-stories/detail/vaccine-efficacy-effectiveness-and-protection accessed 12 April 2023; A Allen, ‘Pfizer CEO pushes yearly shots for Covid: Not so fast, experts say’ (KFF Health News, 21 March 2022) https://kffhealthnews.org/news/article/pfizer-ceo-albert-bourla-yearly-COVID-shots accessed 31 March 2023.

[178] World Health Organization, ‘Tracking SARS-CoV-2 variants’ www.who.int/activities/tracking-SARS-CoV-2-variants accessed 31 March 2023.

[179] G Warren and R Lofstedt, ‘Risk Communication and COVID-19 in Europe: Lessons for Future Public Health Crises’ (2021) 25:10 Journal of Risk Research 1161, https://doi.org/10.1080/13669877.2021.1947874.

[180] DF Povse, ‘Examining the pros and cons of digital COVID certificates in the EU’ (Ada Lovelace Institute, 15 December 2022) www.adalovelaceinstitute.org/blog/examining-digital-covid-certificates-eu accessed 31 March 2023.

[181] M Sallam, ‘COVID-19 Vaccine Hesitancy Worldwide: A Concise Systematic Review of Vaccine Acceptance Rates’ (2021) 9:2 Vaccines 160, https://doi.org/10.3390/vaccines9020160.

[182] SuperJob, ‘Most often, the introduction of QR codes is approved at mass events, least often – in non-food stores, but 4 out of 10 Russians are against any QR codes’ (16 November 2021) www.superjob.ru/research/articles/113182/chasche-vsego-vvod-qr-kodov-odobryayut-na-massovyh-meropriyatiyah accessed 31 March 2023.

[183] G Salau, ‘How vaccine cards are procured without jabs’ (The Guardian [Nigeria], 23 December 2021) https://guardian.ng/features/how-vaccine-cards-are-procured-without-jabs accessed 26 May 2023; E de Bre, ‘Fake COVID-19 vaccination cards emerge in Russia’ (Organized Crime and Corruption Reporting Project, 30 June 2021) www.occrp.org/en/daily/14733-fake-COVID-19-vaccination-cards-emerge-in-russia accessed 31 March 2023.

[184] J Ceulaer, ‘Viroloog Emmanuel Andre: “Covid Safe Ticket leidde tot meer besmettingen”’ [‘Virologist Emmanuel Andre: “Covid Safe Ticket led to more infections”’] (De Morgen, 29 November 2021) www.demorgen.be/nieuws/viroloog-emmanuel-andre-covid-safe-ticket-leidde-tot-meer-besmettingen~bae41a3e/?utm_source=link&utm_medium=social&utm_campaign=shared_earned accessed 12 April 2023.

[185] Gilmore and others, ‘Community Engagement to Support COVID-19 Vaccine Uptake: A Living Systematic Review Protocol’ (2022) 12 BMJ Open e063057, http://dx.doi.org/10.1136/bmjopen-2022-063057.

[186] AD Bourhanbour and O Ouchetto, ‘Morocco Achieves the Highest COVID-19 Vaccine Rates in Africa in the First Phase: What Are Reasons for Its Success?’ (2021) 28:4 Journal of Travel Medicine taab040, https://doi.org/10.1093/jtm/taab040.

[187] Ada Lovelace Institute, Checkpoints for vaccine passports (2021) www.adalovelaceinstitute.org/report/checkpoints-for-vaccine-passports accessed 26 May 2023.

[188] Ada Lovelace Institute, Checkpoints for vaccine passports (2021) www.adalovelaceinstitute.org/report/checkpoints-for-vaccine-passports accessed 26 May 2023.

[189] K Beaver, G Skinner and A Quigley, ‘Majority of Britons support vaccine passports but recognise concerns in new Ipsos UK Knowledge Panel poll’ (Ipsos, 31 March 2021) www.ipsos.com/en-uk/majority-britons-support-vaccine-passports-recognise-concerns-new-ipsos-uk-knowledgepanel-poll accessed 12 April 2023.

[190] H Kennedy, ‘The vaccine passport debate reveals fundamental views about how personal data should be used, its role in reproducing inequalities, and the kind of society we want to live in’ (LSE, 12 August 2021) https://blogs.lse.ac.uk/impactofsocialsciences/2021/08/12/the-vaccine-passport-debate-reveals-fundamental-views-about-how-personal-data-should-be-used-its-role-in-reproducing-inequalities-and-the-kind-of-society-we-want-to-live-in accessed 26 May 2023.

[191] C Brogan, ‘Vaccine passports linked to COVID-19 vaccine hesitancy in UK and Israel’ (Imperial College London, 2 September 2021) www.imperial.ac.uk/news/229153/vaccine-passports-linked-covid-19-vaccine-hesitancy accessed 12 April 2023.

[192] J Drury, ‘Behavioural Responses to Covid-19 Health Certification: A Rapid Review’ (2021) 21 BMC Public Health 1205, https://doi.org/10.1186/s12889-021-11166-0; JR de Waal, ‘One year on: Global update on public attitudes to government handling of Covid’ (YouGov, 19 November 2021) https://yougov.co.uk/topics/international/articles-reports/2021/11/19/one-year-global-update-public-attitudes-government accessed 12 April 2023.

[193] H Kennedy, ‘The vaccine passport debate reveals fundamental views about how personal data should be used, its role in reproducing inequalities, and the kind of society we want to live in’ (LSE, 12 August 2021) https://blogs.lse.ac.uk/impactofsocialsciences/2021/08/12/the-vaccine-passport-debate-reveals-fundamental-views-about-how-personal-data-should-be-used-its-role-in-reproducing-inequalities-and-the-kind-of-society-we-want-to-live-in accessed 12 April 2023.

[194] Ada Lovelace Institute, Checkpoints for vaccine passports (2021) www.adalovelaceinstitute.org/report/checkpoints-for-vaccine-passports accessed 12 April 2023.

[195] H Kennedy, ‘The vaccine passport debate reveals fundamental views about how personal data should be used, its role in reproducing inequalities, and the kind of society we want to live in’ (LSE, 12 August 2021) https://blogs.lse.ac.uk/impactofsocialsciences/2021/08/12/the-vaccine-passport-debate-reveals-fundamental-views-about-how-personal-data-should-be-used-its-role-in-reproducing-inequalities-and-the-kind-of-society-we-want-to-live-in accessed 12 April 2023.

[196] Ada Lovelace Institute, Checkpoints for vaccine passports (2021) www.adalovelaceinstitute.org/report/checkpoints-for-vaccine-passports accessed 12 April 2023.

[197] Ada Lovelace Institute, ‘COVID-19 Data Explorer: Policies, Practices and Technology’ (May 2023) https://covid19.adalovelaceinstitute.org accessed 31 May 2023.

[198] B Bell, ‘Covid: Austrians heading towards lockdown for unvaccinated’ (BBC News, 12 November 2021) www.bbc.co.uk/news/world-europe-59245018 accessed 12 April 2023.

[199] B Bell, ‘Covid: Austrians heading towards lockdown for unvaccinated’ (BBC News, 12 November 2021) www.bbc.co.uk/news/world-europe-59245018 accessed 12 April 2023.

[200] Simmons + Simmons, ‘COVID-19 Italy: An easing of covid restrictions’ (1 May 2022) www.simmons-simmons.com/en/publications/ckh3mbdvv151g0a03z6mgt3dr/covid-19-decree-brings-strict-restrictions-for-italy accessed 12 April 2023.

[201] E de Bre, ‘Fake COVID-19 vaccination cards emerge in Russia’ (Organized Crime and Corruption Reporting Project, 30 June 2021) www.occrp.org/en/daily/14733-fake-COVID-19-vaccination-cards-emerge-in-russia accessed 31 March 2023.

[202] Ada Lovelace Institute, Checkpoints for vaccine passports (2021) www.adalovelaceinstitute.org/report/checkpoints-for-vaccine-passports accessed 26 May 2023.

[203] Health Pass, ‘Sıkça Sorulan Sorular’ [‘Frequently Asked Questions’] https://healthpass.saglik.gov.tr/sss.html accessed 12 April 2023.

[204] S Dwivedi, ‘“No one can be forced to get vaccinated”: Supreme Court’s big order’ (NDTV, 2 May 2022) www.ndtv.com/india-news/coronavirus-no-one-can-be-forced-to-get-vaccinated-says-supreme-court-adds-current-vaccine-policy-cant-be-said-to-be-unreasonable-2938319 accessed 12 April 2023.

[205] NHS, ‘NHS COVID Pass’ www.nhs.uk/nhs-services/covid-19-services/nhs-covid-pass accessed 12 May 2021.

[206] Our World in Data, ‘Coronavirus (COVID-19) Vaccinations’ https://ourworldindata.org/COVID-vaccinations?country=OWID_WRL accessed 12 April 2023.

[207] Harvard Global Health Institute, ‘From Ebola to COVID-19: Lessons in digital contact tracing in Sierra Leone’ (1 September 2020) https://globalhealth.harvard.edu/from-ebola-to-covid-19-lessons-in-digital-contact-tracing-in-sierra-leone accessed 26 May 2023.

[208] Ada Lovelace Institute, Checkpoints for vaccine passports (2021) www.adalovelaceinstitute.org/report/checkpoints-for-vaccine-passports accessed 26 May 2023. Riaz and colleagues define vaccine nationalism as ‘an economic strategy to hoard vaccinations from manufacturers and increase supply in their own country’. See M Riaz and others, ‘Global Impact of Vaccine Nationalism during COVID-19 Pandemic’ (2021) 49 Tropical Medicine and Health 101, https://doi.org/10.1186/s41182-021-00394-0.

[209] E Racine, ‘Understanding COVID-19 certificates in the context of recent health securitisation trends’ (Ada Lovelace Institute, 9 March 2023) www.adalovelaceinstitute.org/blog/covid-certificates-health-securitisation accessed 26 May 2023.

[210] E Racine, ‘Understanding COVID-19 certificates in the context of recent health securitisation trends’ (Ada Lovelace Institute, 9 March 2023) www.adalovelaceinstitute.org/blog/covid-certificates-health-securitisation accessed 26 May 2023.

[211] J Atick, ‘Covid vaccine passports are important but could they also create more global inequality?’ (Euro News, 17 August 2021) www.euronews.com/next/2021/08/16/covid-vaccine-passports-are-important-but-could-they-also-create-more-global-inequality accessed 12 April 2023.

[212] E Racine, ‘Understanding COVID-19 certificates in the context of recent health securitisation trends’ (Ada Lovelace Institute, 9 March 2023) www.adalovelaceinstitute.org/blog/covid-certificates-health-securitisation accessed 12 April 2023.

[213] A Suarez-Alvarez and AJ Lopez-Menendez, ‘Is COVID-19 Vaccine Inequality Undermining the Recovery from the COVID-19 Pandemic?’ (2022) 12 Journal of Global Health 05020, 10.7189/jogh.12.05020. Share of vaccinated people refers to the total number of people who received all doses prescribed by the initial vaccination protocol, divided by the total population of the country.

[214]  Ada Lovelace Institute, ‘COVID-19 Data Explorer: Policies, Practices and Technology’ (May 2023), https://covid19.adalovelaceinstitute.org accessed 31 May 2023.

[215]  ibid.

[216] World Health Organization, ‘COVAX: Working for global equitable access to COVID-19 vaccines’ www.who.int/initiatives/act-accelerator/covax accessed 12 April 2023.

[217] European Commission, ‘Team Europe contributes €500 million to COVAX initiative to provide one billion COVID-19 vaccine doses for low and middle income countries’ (15 December 2020) https://ec.europa.eu/commission/presscorner/detail/en/ip_20_2262 accessed 12 April 2023.

[218] J Holder, ‘Tracking Coronavirus Vaccinations Around the World’ (The New York Times, 2023) https://www.nytimes.com/interactive/2021/world/covid-vaccinations-tracker.html accessed 12 April 2023.

[219] European Council, ‘EU digital COVID certificate: how it works’ https://www.consilium.europa.eu/en/policies/coronavirus/eu-digital-covid-certificate/

[220] Ada Lovelace Institute, Checkpoints for vaccine passports (2021) www.adalovelaceinstitute.org/report/checkpoints-for-vaccine-passports accessed 12 April 2023.

[221] A Gillwald and others, ‘Mobile phone data is useful in coronavirus battle: But are people protected enough?’ (The Conversation, 27 April 2020) https://theconversation.com/mobile-phone-data-is-useful-in-coronavirus-battle-but-are-people-protected-enough-136404 accessed 26 May 2023.

[222] Ada Lovelace Institute, Checkpoints for vaccine passports (2021) www.adalovelaceinstitute.org/report/checkpoints-for-vaccine-passports accessed 26 May 2023.

[223] A Gillwald and others, ‘Mobile phone data is useful in coronavirus battle: But are people protected enough?’ (The Conversation, 27 April 2020) https://theconversation.com/mobile-phone-data-is-useful-in-coronavirus-battle-but-are-people-protected-enough-136404 accessed 26 May 2023.

[224] ABC News, ‘Brazil’s health ministry website hacked, vaccination information stolen and deleted’ (11 December 2021) www.abc.net.au/news/2021-12-11/brazils-national-vaccination-program-hacked-/100692952 accessed 12 April 2023; Z Whittaker, ‘Jamaica’s immigration website exposed thousands of travellers’ data’ (TechCrunch, 17 February 2021) https://techcrunch.com/2021/02/17/jamaica-immigration-travelers-data-exposed accessed 12 April 2023.

[225] Proportionality is a general principle in law which refers to striking a balance between the means used and the intended aim. See European Data Protection Supervisor, ‘Necessity and proportionality’ https://edps.europa.eu/data-protection/our-work/subjects/necessity-proportionality_en accessed 12 April 2023.

[226] Ada Lovelace Institute, Checkpoints for vaccine passports (2021) www.adalovelaceinstitute.org/report/checkpoints-for-vaccine-passports accessed 26 May 2023.

[227] G Razzano, ‘Privacy and the pandemic: An African response’ (Association For Progressive Communications, 21 June 2020) www.apc.org/en/pubs/privacy-and-pandemic-african-response accessed 26 May 2023.

[228] A Gillwald and others, ‘Mobile phone data is useful in coronavirus battle: But are people protected enough?’ (The Conversation, 27 April 2020) https://theconversation.com/mobile-phone-data-is-useful-in-coronavirus-battle-but-are-people-protected-enough-136404 accessed 26 May 2023.

[229] European Commission ‘Coronavirus: Commission proposes to extend the EU Digital COVID Certificate by one year’ (3 February 2022) https://ec.europa.eu/commission/presscorner/detail/en/ip_22_744 accessed 26 May 2023.

[230] A Hussain, ‘TraceTogether data used by police in one murder case: Vivian Balakrishnan’ (Yahoo! News, 5 January 2021) https://uk.style.yahoo.com/trace-together-data-used-by-police-in-one-murder-case-vivian-084954246.html?guccounter=2 accessed 30 March 2023; DW, ‘German police under fire for misuse of COVID app’ DW (11 January 2022) www.dw.com/en/german-police-under-fire-for-misuse-of-covid-contact-tracing-app/a-60393597 accessed 31 March 2023.

[231] Ada Lovelace Institute, Checkpoints for vaccine passports (2021) www.adalovelaceinstitute.org/report/checkpoints-for-vaccine-passports accessed 12 April 2023; ‘Confidence in a Crisis? Building Public Trust in a Contact Tracing App’ (17 August 2020) www.adalovelaceinstitute.org/report/confidence-in-crisis-building-public-trust-contact-tracing-app accessed 12 April 2023; ‘Exit through the App Store? COVID-19 Rapid Evidence Review’ (19 April 2020) www.adalovelaceinstitute.org/evidence-review/covid-19-rapid-evidence-review-exit-through-the-app-store accessed 12 April 2023.

[232] Ada Lovelace Institute, ‘COVID-19 Data Explorer: Policies, Practices and Technology’ (May 2023) https://covid19.adalovelaceinstitute.org accessed 31 May 2023.

[233] European Council, ‘European digital identity (eID): Council makes headway towards EU digital wallet, a paradigm shift for digital identity in Europe’ (6 December 2022) https://www.consilium.europa.eu/en/press/press-releases/2022/12/06/european-digital-identity-eid-council-adopts-its-position-on-a-new-regulation-for-a-digital-wallet-at-eu-level accessed 12 April 2023; Y Theodorou, ‘On the road to digital-ID success in Africa: Leveraging global trends’ (Tony Blair Institute, 13 June) www.institute.global/insights/tech-and-digitalisation/road-digital-id-success-africa-leveraging-global-trends accessed 12 April 2023.

[234] The Tawakkalna app is available at https://ta.sdaia.gov.sa/en/index; Saudi–US Trade Group, ‘United Nations recognizes Saudi Arabia’s Tawakkalna app with Public Service Award for 2022’ www.sustg.com/united-nations-recognizes-saudi-arabias-tawakkalna-app-with-public-service-award-for-2022 accessed 12 April 2023.

[235] Varindia, ‘Aarogya Setu has been transformed as nation’s health app’ (26 July 2022) https://varindia.com/news/aarogya-setu-has-been-transformed-as-nations-health-app accessed 13 April 2023.

[236] NHS England, ‘Digitising, connecting and transforming health and care’ www.england.nhs.uk/digitaltechnology/digitising-connecting-and-transforming-health-and-care accessed 13 April 2023.

[237] DHI News Team, ‘The role of a successful federated data platform programme’ (Digital Health, 27 September 2022) www.digitalhealth.net/2022/09/the-role-of-a-successful-federated-data-platform-programme accessed 12 April 2023; Department of Health and Social Care, ‘Better, broader, safer: Using health data for research and analysis (gov.uk, 7 April 2022) www.gov.uk/government/publications/better-broader-safer-using-health-data-for-research-and-analysis accessed 13 April 2023.

[238] N Sherman, ‘Palantir: The controversial data firm now worth £17bn’ (BBC News, 1 October 2020) www.bbc.co.uk/news/business-54348456 accessed 13 April 2023.

[239] C Handforth, ‘How digital can close the “identity gap”’ (UNDP, 19 May 2022) www.undp.org/blog/how-digital-can-close-identity-gap?utm_source=EN&utm_medium=GSR&utm_content=US_UNDP_PaidSearch_Brand_English&utm_campaign=CENTRAL&c_src=CENTRAL&c_src2=GSR&gclid=CjwKCAiA0J accessed 13 April 2023.

[240] L Muscato, ‘Why people don’t trust contact tracing apps, and what to do about it’ (Technology Review, 12 November 2020) www.technologyreview.com/2020/11/12/1012033/why-people-dont-trust-contact-tracing-apps-and-what-to-do-about-it accessed 31 March 2023; AWO, ‘Assessment of Covid-19 response in Brazil, Colombia, India, Iran, Lebanon and South Africa’ (29 July 2021) www.awo.agency/blog/covid-19-app-project accessed 13 April 2023; L Horvath and others, ‘Adoption and Continued Use of Mobile Contact Tracing Technology: Multilevel Explanations from a Three-Wave Panel Survey and Linked Data’ (2022) 12:1 BMJ Open e053327, 10.1136/bmjopen-2021-053327; A Kozyreva and others, ‘Psychological Factors Shaping Public Responses to COVID-19 Digital Contact Tracing Technologies in Germany’ (2021) 11 Scientific Reports 18716, https://doi.org/10.1038/s41598-021-98249-5; G Samuel and others, ‘COVID-19 Contact Tracing Apps: UK Public Perceptions’ (2022) 32:1 Critical Public Health 31, 10.1080/09581596.2021.1909707; M Caserotti and others, ‘Associations of COVID-19 Risk Perception with Vaccine Hesitancy Over Time for Italian Residents’ (2021) 272 Social Science & Medicine 113688, 10.1016/j.socscimed.2021.113688. Ada Lovelace Institute’s ‘Public attitudes to COVID-19, technology and inequality: A tracker’ summarises a wide range of studies and projects that offer insight into people’s attitudes and perspectives. See Ada Lovelace Institute, ‘Public attitudes to COVID-19, technology and inequality: A tracker’ (2021) https://www.adalovelaceinstitute.org/resource/public-attitudes-covid-19/ accessed 12 April 2023.

[241] Ada Lovelace Institute, ‘International monitor: vaccine passports and COVID-19 status apps’ (15 October 2021) https://www.adalovelaceinstitute.org/resource/international-monitor-vaccine-passports-and-covid-19-status-apps/ accessed 30 March 2023.

[242] Our World in Data, ‘Coronavirus Pandemic (COVID-19) https://ourworldindata.org/coronavirus, accessed 31 May 2023

[243] Our World in Data ‘Coronavirus Pandemic (COVID-19)’ https://ourworldindata.org/coronavirus#explore-the-global-situation accessed 12 April 2023.

[244] University of Oxford ‘COVID-19 Government Response Tracker’ https://www.bsg.ox.ac.uk/research/covid-19-government-response-tracker  accessed 12 April 2023.

[245] Our World in Data ‘Coronavirus Pandemic (COVID-19)’ https://ourworldindata.org/coronavirus#explore-the-global-situation accessed 12 April 2023.

[246] Our World in Data ‘Coronavirus Pandemic (COVID-19)’ https://ourworldindata.org/coronavirus#explore-the-global-situation accessed 12 April 2023.


Executive summary

‘Where should I go for dinner? What should I read, watch or listen to next? What should I buy?’ To answer these questions, we might go with our gut and trust our intuition. We could ask our friends and family, or turn to expert reviews. Recommendations large and small can come from a variety of sources in our daily lives, but in the last decade there has been a critical change in where they come from and how they’re used.

Recommendations are now a pervasive feature of the digital products we use. We are increasingly living in a world of recommendation systems, a type of software designed to sift through vast quantities of data to guide users towards a narrower selection of material, according to a set of criteria chosen by their developers.

Examples of recommendation systems include Netflix’s ‘Watch next’ and Amazon’s ‘Other users also purchased’; TikTok’s recommendation system drives its main content feed.

But what is the risk of a recommendation? As recommendations become more automated and data-driven, the trade-offs in their design and use are becoming more important to understand and evaluate.

Background

This report explores the ethics of recommendation systems as used in public service media organisations. These independent organisations have a mission to inform, educate and entertain the public, and are often funded by and accountable to the public.

In media organisations, producers, editors and journalists have always made implicit and explicit decisions about what to give prominence to, both in terms of what stories to tell and what programmes to commission, but also in how those stories are presented. Deciding what makes the front page, what gets the primetime slot, what makes top billing on the evening news – these are all acts of recommendation. While private media organisations like Netflix primarily use these systems to drive user engagement with their content, public service media organisations, like the British Broadcasting Corporation (BBC) in the UK, operate with a different set of principles and values.

This report also explores how public service media organisations are addressing the challenge of designing and implementing recommendation systems within the parameters of their mission, and identifies areas for further research into how they can accomplish this goal.

While there is an extensive literature exploring public service values and a separate literature around the ethics and operational challenges of designing and implementing recommendation systems, there are still many gaps in the literature around how public service media organisations are designing and implementing these systems. Addressing these gaps can help ensure that public service media organisations are better able to design these systems. With this in mind, this project has explored the following questions:

  • What are the values that public service media organisations adhere to? How do these differ from the goals that private-sector organisations are incentivised to pursue?
  • In what contexts do public service media use recommendation systems?
  • What value can recommendation systems add for public service media and how do they square with public service values?
  • What are the ethical risks that recommendation systems might raise in those contexts? And what challenges should teams consider?
  • What are the mitigations that public service media can implement in the design, development, and implementation of these systems?

In answering these questions, we focused on European public service media organisations and in particular on the BBC in the UK, who are project partners on this research.

The BBC is the world’s largest public service media organisation and has been at the forefront among public service broadcasters in exploring the use of recommendation systems. As the BBC has historically set precedents that other public service media have followed, it is valuable to understand its work in depth in order to draw wider lessons for the field.

In this report, we explore an in-depth snapshot of the BBC’s development and use of several recommendation systems from summer and autumn 2021, alongside an examination of the work of several other European public service media organisations. We place these examples in the broader context of debates around 21st century public service media and use them to explore the motivations, risks and evaluation of the use of recommendation systems by public service media and their use more broadly.

The evidence for this report stems from interviews with 11 current staff from editorial, product and engineering teams involved in recommendation systems at the BBC, along with interviews with representatives of six other European public service broadcasters that use recommendation systems. This report also draws on a review of the existing literature on public service media recommendation systems and on interviews with experts from academia, civil society and government.

Findings

Across these different public service media organisations, our research produced five key findings:

  1. The context in which public service media organisations operate is a major driver of their increasing use of recommendation systems. The last few decades have seen public service media organisations lose market share of news and entertainment to private providers, putting pressure on public service media organisations to use recommendation systems to stay competitive.
  2. The values of public service media organisations create different objectives and practices to those in the private sector. While private-sector media organisations are primarily driven to maximise shareholder revenue and market share, with some consideration of social values, public service media organisations are legally mandated to operate with a particular set of public interest values at their core, including universality, independence, excellence, diversity, accountability and innovation.
  3. These value differences translate into different objectives for the use of recommendation systems. While private firms seek to maximise metrics like user engagement, ‘time on product’ and subscriber retention in the use of their recommendation systems, public service media organisations seek related but different objectives. For example, rather than maximising engagement with recommendation systems, our research found public service media providers want to broaden their reach to a more diverse set of audiences. Rather than maximising time on product, public service media organisations are more concerned with ensuring the product is useful for all members of society, in line with public interest values.
  4. Public service media recommendation systems can raise a range of well-documented ethical risks, but these will differ depending on the type of system and context of its use. Our research found that public service media recognise a wide array of well-documented ethical risks of recommendation systems, including risks to personal autonomy, privacy, misinformation and fragmentation of the public sphere. However, the type and severity of the risks highlighted depended on which teams we spoke with, with audio-on-demand and video-on-demand teams raising somewhat different concerns to those working on news.
  5. Evaluating the risks and mitigations of recommendation systems must be done in the context of the wider product. Addressing the risks of public service media recommendation systems should not just focus on technical fixes. Aligning product goals and other product features with public service values is just as important in ensuring recommendation systems contribute positively to the experiences of audiences and to wider society.

Recommendations

Based on these key findings, we make nine recommendations for future research, experimentation and collaboration between public service media organisations, academics, funders and regulators:

  1. Define public service value for the digital age. Recommendation systems are designed to optimise against specific objectives. However, the development and implementation of recommendation systems is happening at a time when the concept of public service value and the role of public service media organisations is under question. Unless public service media organisations are clear about their own identities and purpose, it will be difficult for them to build effective recommendation systems. In the UK, significant work has already been done by Ofcom as well as by the House of Commons Digital, Culture, Media and Sport Select Committee to identify the challenges public service media face and offer new approaches to regulation. Their recommendations must be implemented so that public service media can operate within a paradigm appropriate to the digital age and build systems that address a relevant mission.
  2. Fund a public R&D hub for recommendation systems and responsible recommendation challenges. There is a real opportunity to create a hub for R&D into recommendation systems that is not tied to industry goals. This is especially important as recommendation systems are one of the prime use cases of behaviour modification technology, yet research into them is impaired by a lack of access to interventional data. UKRI should therefore fund the development of a public research hub on recommendation technology as part of its National AI Research and Innovation (R&I) Programme set out in the UK AI Strategy.
  3. Publish research into audience expectations of personalisation. There was a striking consensus in our interviews with public service media teams working on recommendations that personalisation was both wanted and expected by the audience. However, there is limited publicly available evidence underlying this belief and more research is needed. Understanding audiences’ views towards recommendation systems is an important part of ensuring those systems are acting in the public interest. Public service media organisations should not widely adopt recommendation systems without evidence that they are either wanted or needed by the public. Otherwise, public service media risk simply following a precedent set by commercial competitors, rather than defining a paradigm aligned to their own missions.
  4. Communicate and be transparent with audiences. Although most public service media organisations profess a commitment to transparency about their use of recommendation systems, in practice there is little effective communication with their audiences about where and how recommendation systems are being used. Public service media should invest time and research into understanding how to usefully and honestly articulate their use of recommendation systems in ways that are meaningful to their audiences. This communication must not be one way. There must be opportunities for audiences to give feedback and interrogate the use of the systems, and raise concerns.
  5. Balance user control with convenience. Transparency alone is not enough. Giving users agency over the recommendations they see is an important part of responsible recommendation. Simply giving users direct control over the recommendation system is an obvious and important first step, but it is not a universal solution. We recommend that public service media providers experiment with different kinds of options, including enabling algorithmic choice of recommendation systems and ‘joint’ recommendation profiles.
  6. Expand public participation. Beyond transparency or individual user choice and control over the parameters of the recommendation systems already deployed, users and wider society could also have greater input during the initial design of the recommendation systems and in the subsequent evaluations and iterations. This is particularly salient for public service media organisations as, unlike private companies which are primarily accountable to their customers and shareholders, public service media organisations have an obligation to serve the interests of society. Therefore, even those who are not direct consumers of content should have a say in how public service media recommendations are shaped.
  7. Standardise metadata. Inconsistent, poor-quality metadata – an essential resource for training and developing recommendation systems – was consistently highlighted as a barrier to building recommendation systems in public service media, particularly for more novel approaches that go beyond user engagement and try to create diverse feeds of recommendations. Each public service media organisation should have a central function that standardises the format, creation and maintenance of metadata across the organisation. Institutionalising the collection of metadata and making access to it more transparent across each individual organisation is an important investment in public service media’s future capabilities. An illustrative sketch of what a standardised metadata record might look like follows this list.
  8. Create shared recommendation system resources. Given their limited resources and shared interests, public service media organisations should invest more heavily in creating common resources for evaluating and using recommendation systems. This could include a shared repository for evaluating recommendation systems on metrics valued by public service media, including libraries in common coding languages.
  9. Create and empower integrated teams. When developing and deploying recommendation systems, public service media organisations need to integrate editorial and development teams from the start. This ensures that the goals of the recommendation system are better aligned with the organisation’s goals as a whole and that the systems augment and complement existing editorial expertise.
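
To make Recommendation 7 concrete, the sketch below shows one way a standardised, machine-readable metadata record might look. It is illustrative only: the field names, the dataclass approach and the example values are our assumptions, not any public service media organisation’s actual schema.

```python
# Illustrative sketch of a standardised content metadata record (hypothetical
# fields, not any broadcaster's real schema). A shared record format like this
# is the kind of asset a central metadata function could own and maintain.
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import date

@dataclass
class ContentMetadata:
    content_id: str                          # stable identifier across systems
    title: str
    content_type: str                        # e.g. "audio", "video", "article"
    genres: list[str] = field(default_factory=list)
    topics: list[str] = field(default_factory=list)
    language: str = "en"
    regions: list[str] = field(default_factory=list)  # regions the content reflects
    publication_date: date | None = None
    editorial_flags: list[str] = field(default_factory=list)  # e.g. "election-period"

# Example record (values are invented for illustration).
record = ContentMetadata(
    content_id="prog-0001",
    title="Example documentary",
    content_type="video",
    genres=["documentary"],
    topics=["climate"],
    regions=["Wales"],
    publication_date=date(2021, 9, 1),
)
print(record.title, record.genres)
```

Consistent records of this kind make it easier for editorial, product and engineering teams to describe content in the same way, and to build recommendations that go beyond engagement signals.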

How to read this report

This report examines how European public service media organisations think about using automated recommendation systems for content curation and delivery. It covers the context in which recommendation systems are being deployed, why that matters, the ethical risks and evaluation difficulties posed by these systems and how public service media are attempting to mitigate these risks. It also provides ideas for new approaches to evaluation that could enable better alignment of their systems with public service values.

If you need an introduction or refresher on what recommendation systems are, we recommend starting with the ‘Introducing recommendation systems’ chapter.

If you work for a public service media organisation

  • We recommend the chapters on ‘Stated goals and potential risks of using recommendation systems in public service media’ and ‘Evaluation of recommendation systems’.
  • For an understanding of how the BBC has deployed recommendation systems, see the case studies.
  • For ideas on how public service media organisations can advance their responsible use of recommendation systems, see the chapter on ‘Outstanding questions and areas for further research and experimentation’.

If you are a regulator of public service media

  • We recommend you pay particular attention to the section on ‘Stated goals and potential risks of using recommendation systems in public service media’ and ‘How do public service media evaluate their recommendation systems?’.
  • In addition, to understand the practices and initiatives that we believe should be encouraged within and experimented with by public service media organisations to ensure responsible and effective use of recommendation systems, see ‘Outstanding questions and areas for further research and experimentation’.

If you are a regulator of online platforms

  • If you need an introduction or refresher on what recommendation systems are, we recommend starting with the ‘Introducing recommendation systems’ chapter. Understanding this context can help disentangle the challenges in regulating recommendation systems, by highlighting where problems arise from the goals of public service media versus the process of recommendation itself.
  • To understand the issues faced by all deployers of recommendation systems, see the sections on the ‘Stated goals of recommendation systems’ and ‘Potential risks of using recommendation systems’.
  • To better understand how these risks change due to the context and choices of public service media, relative to other online platforms, and the difficulties even organisations explicitly oriented towards public value have in auditing their own recommendation systems to determine whether they are socially beneficial, beyond simple quantitative engagement metrics, see the section on ‘How these risks are viewed and addressed by public service media’ and the chapter on ‘Evaluation of recommendation systems’.

If you are a funder of research into recommendation systems or a researcher interested in recommendation systems

  • Public service media organisations, with mandates that emphasise social goals of universality, diversity and innovation over engagement and profit-maximising, can offer an important site of study and experimentation for new approaches to recommendation system design and evaluation. We recommend starting with the sections on ‘The context of public service values and public service media’ and ‘why this matters’, to understand the different context within which public service media organisations operate.
  • Then, the sections on ‘How do public service media evaluate their recommendation systems?’ and ‘How could evaluations be done differently?’, followed by the chapter on ‘Outstanding questions and areas for further research and experimentation’, could provide inspiration for future research projects or pilots that you could undertake or fund.

Introduction

Scope

Recommendation systems are tools designed to sift through the vast quantities of data available online and use algorithms to guide users towards a narrower selection of material, according to a set of criteria chosen by their developers. Recommendation systems sit behind a vast array of digital experiences. ‘Other users also purchased…’ on Amazon or ‘Watch next’ on Netflix guide you to your next purchase or night on the sofa. Deliveroo will suggest what to eat, LinkedIn where to work and Facebook who your friends might be.
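
As a purely illustrative sketch of what ‘a set of criteria chosen by their developers’ can mean in practice, the toy example below ranks a tiny hypothetical catalogue against a user’s history using a simple tag-overlap rule. The catalogue, tags and scoring rule are invented for the example; real systems are far larger and combine many more signals.

```python
# Toy content-based recommender: rank unseen items by how much their tags
# overlap with what the user has already consumed. Catalogue and tags are
# hypothetical; the scoring rule is the "criteria chosen by the developers".
from collections import Counter

CATALOGUE = {
    "climate_news_bulletin": {"news", "climate"},
    "period_drama_ep1":      {"drama", "history"},
    "science_documentary":   {"documentary", "science", "climate"},
    "panel_comedy_show":     {"comedy", "panel"},
}

def recommend(history: list[str], top_n: int = 2) -> list[str]:
    """Return the top_n unseen items with the highest tag overlap."""
    seen_tags = Counter(tag for item in history for tag in CATALOGUE[item])
    def score(item: str) -> int:
        return sum(seen_tags[tag] for tag in CATALOGUE[item])
    unseen = [item for item in CATALOGUE if item not in history]
    return sorted(unseen, key=score, reverse=True)[:top_n]

print(recommend(["climate_news_bulletin"]))
# ['science_documentary', ...] – the shared "climate" tag pushes it to the top.
```

Changing the scoring rule changes what users are steered towards, which is why the choice of criteria is an editorial as well as a technical decision.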

These practices are credited with driving the success of companies like Netflix and Spotify. But they are also blamed for many of the harms associated with the internet, such as the amplification of harmful content, the polarisation of political viewpoints (although the evidence is mixed and inconclusive)[footnote]Cobbe, J. and Singh, J. (2019). ‘Regulating Recommending: Motivations, Considerations, and Principles’. European Journal of Law and Technology, 10(3), pp. 8–10. Available at: https://ejlt.org/index.php/ejlt/article/view/686; Steinhardt, J. (2021). ‘How Much Do Recommender Systems Drive Polarization?’. UC Berkeley. Available at: https://jsteinhardt.stat.berkeley.edu/blog/recsys-deepdive; Stray, J. (2021). ‘Designing Recommender Systems to Depolarize’, p. 2. arXiv. Available at: http://arxiv.org/abs/2107.04953[/footnote] and the entrenchment of inequalities.[footnote]Born, G. Morris, J. Diaz, F. and Anderson, A. (2021). Artificial intelligence, music recommendation, and the curation of culture: A white paper, pp. 10–13. Schwartz Reisman Institute for Technology and Society. Available at: https://static1.squarespace.com/static/5ef0b24bc96ec4739e7275d3/t/60b68ccb5a371a1bcdf79317/1622576334766/Born-Morris-etal-AI_Music_Recommendation_Culture.pdf[/footnote] Regulators and policymakers worldwide are paying increasing attention to the potential risks of recommendation systems, with proposals in China and Europe to regulate their design, features and uses.[footnote]See: European Union. (2022). Digital Services Act, Article 27. Available at: https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=OJ:L:2022:277:TOC; For details of Article 17 of the Cybersecurity Administration of China (CAC)’s Internet Information Service Algorithm Recommendation Management Regulations, see: Huld, A. (2022). ‘China Passes Sweeping Recommendation Algorithm Regulations’. China Briefing News. Available at: https://www.china-briefing.com/news/china-passes-sweeping-recommendation-algorithm-regulations-effect-march-1-2022/[/footnote]

Public service media organisations are starting to follow the example of their commercial rivals and adopt recommendation systems. Like the big digital streaming service providers, they sit on huge catalogues of news and entertainment content, and can use recommendation systems to direct audiences to particular options.

But public service media organisations face specific challenges in deploying these technologies. Recommendation systems are designed to optimise for certain objectives: a hotel’s website is aiming for maximum bookings, Spotify and Netflix want you to renew your subscription.

Public service media serve many functions. They have a duty to serve the public interest, not the company bottom line. They are independently financed and are answerable to, if not controlled by, the public.[footnote]Conseil mondial de la radiotélévision. (2001). Public broadcasting: why? how? pp. 11–15. UNESCO Digital Library. Available at: https://unesdoc.unesco.org/ark:/48223/pf0000124058[/footnote] Their mission is to inform, educate and entertain. Public service media are committed to values including independence, excellence and diversity.[footnote]European Broadcasting Union. (2012). Empowering Society: A Declaration on the Core Values of Public Service Media. Available at: https://www.ebu.ch/files/live/sites/ebu/files/Publications/EBU-Empowering-Society_EN.pdf[/footnote] They must fulfil an array of duties and responsibilities set down in legislation that often predates the digital era. How do you optimise for all that?
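
One way to see why ‘optimising for all that’ is hard: in code, an objective has to be written down as a concrete score. The hedged sketch below contrasts a purely engagement-driven ranking score with one that blends engagement against rough proxies for public service aims such as diversity of exposure and reach to under-served audiences. The feature names, proxies and weights are our own illustrative assumptions, not any broadcaster’s actual formula, and choosing them is precisely the judgement the rest of this report discusses.

```python
# Two hypothetical ranking objectives. Every number and feature name here is
# invented for illustration: the point is that public service aims only enter
# the ranking if someone decides how to represent and weight them.

def commercial_score(item: dict) -> float:
    # Rank purely on predicted engagement (clicks, watch time, retention).
    return item["predicted_engagement"]

def public_service_score(item: dict,
                         w_engagement: float = 0.5,
                         w_diversity: float = 0.3,
                         w_reach: float = 0.2) -> float:
    # Blend engagement with crude proxies for diversity and universality.
    return (w_engagement * item["predicted_engagement"]
            + w_diversity * item["novelty_for_this_user"]        # something different from usual habits
            + w_reach * item["value_for_underserved_audiences"])  # e.g. regional or minority-language content

item = {
    "predicted_engagement": 0.9,
    "novelty_for_this_user": 0.1,
    "value_for_underserved_audiences": 0.2,
}
print(commercial_score(item))       # 0.9
print(public_service_score(item))   # ~0.52 under these assumed weights
```

Under these assumed weights, an item that is highly engaging but familiar and mainstream ranks lower: the trade-off a public service recommender has to make explicit.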

Developing recommendation systems for public service media is not just about finding technical fixes. It requires an interrogation of the organisations’ role in democratic societies in the digital age. How do the public service values that have guided them for a century translate to a context where the internet has fragmented the public sphere and audiences are defecting to streaming services? And how can public service media use this technology in ways that serve the public interest?

These are questions that resonate beyond the specifics of public service media organisations. All public institutions that wish to use technologies for societal benefit must grapple with similar issues. And all organisations – public or private – have to deploy technologies in ways that align with their values. Asking these questions can be helpful to technologists more generally.

In a context where the negative impacts of recommendation systems are increasingly apparent, public service media must tread carefully when considering their use. But there is also an opportunity for public service media to do what, historically, it has excelled at – innovating in the public interest.

A public service approach to building recommendation systems that are both engaging and trustworthy could not only address the needs of public service media in the digital age, but provide a benchmark for scrutiny of systems more widely and create a challenge to the paradigm set by commercial operators’ practices.

In this report, we explore how public service media organisations are addressing the challenge of designing and implementing recommendation systems within the parameters of their organisational mission, and identify areas for further research into how they can accomplish this goal.

While there is an extensive literature exploring public service values and a separate literature around the ethics and operational challenges of designing and implementing recommendation systems, there are still many gaps in the literature around how public service media organisations are designing and implementing these systems. Addressing that gap can help ensure that public service media organisations are better able to design these systems. With that in mind, this report explores the following questions:

  • What are the values that public service media organisations adhere to? How do these differ from the goals that private-sector organisations are incentivised to pursue?
  • In what contexts do public service media use recommendation systems?
  • What value can recommendation systems add for public service media and how do they square with public service values?
  • What are the ethical risks that recommendation systems might raise in those contexts? And what challenges should different teams within public service media organisations (such as product, editorial, legal and engineering) consider?
  • What are the mitigations that public service media can implement in the design, development and implementation of these systems?

In answering these questions, this report:

  • provides greater clarity about the ethical challenges that developers of recommendation systems must consider when designing and maintaining these systems
  • explores the social benefit of recommendation systems by examining the trade-offs between their stated goals and their potential risks
  • provides examples of how public service broadcasters are grappling with these challenges, which can help inform the development of recommendation systems in other contexts.

This report focuses on European public service media organisations and in particular on the British Broadcasting Corporation (BBC) in the UK, who are project partners on this research. The BBC is the world’s largest public service media organisation and has been at the forefront among public service broadcasters in exploring the use of recommendation systems. As the BBC has historically set precedents that other public service media have followed, it is valuable to understand its work in depth in order to draw wider lessons for the field.

In this report, we explore an in-depth snapshot of the BBC’s development and use of several recommendation systems as it stood in 2021, alongside an examination of the work of several other European public service media organisations. We place these examples in the broader context of debates around 21st century public service media and use them to explore the motivations, risks and evaluation of the use of recommendation systems by public service media and their use more broadly.

The evidence for this report stems from interviews with 11 current staff from editorial, product and engineering teams involved in recommendation systems at the BBC, along with interviews with representatives of six other European public service broadcasters that use recommendation systems. This report also draws on a review of the existing literature on public service media recommendation systems and on interviews with experts from academia, civil society and regulators who work on the design, development and evaluation of recommendation systems.

Although a large amount of the academic literature focuses on the use of recommendations in news provision, we look at the full range of public service media content, as we found that many of the more advanced implementations of recommendation systems lie in other domains. We have drawn on published research about recommendation systems from commercial platforms; however, internal corporate studies are unavailable to independent researchers and our requests to interview both researchers and corporate representatives of platforms were unsuccessful.

Background

In this chapter, we set out the context for the rest of the report. We outline the history and context of public service media organisations, what recommendation systems are and how they are approached by public service media organisations, and what external and internal processes and constraints govern their use.

The context of public service values and public service media

The use of recommendation systems in public service media is informed by their history, values and remit, their governance and the landscape in which they operate. In this section we situate the deployment of recommendation systems in this context.

Broadly, public service media are independent organisations that have a mission to inform, educate and entertain. Their values are rooted in the founding vision for public service media organisations a century ago and remain relevant today, codified into regulatory and governance frameworks at organisational, national and European levels. However, the values that public service media operate under are inherently qualitative and, even with the existence of extensive guidelines, are interpreted through the daily judgements of public service media staff and the mental models and institutional culture built up over time.

Although public service media have been resilient to change, they currently face a trio of challenges:

  1. Losing audiences to online digital content providers including Netflix, Amazon, YouTube and Spotify.
  2. Budget cuts and outdated regulation, framed around analogue broadcast commitments, hampering their ability to respond to technological change.
  3. Populist political movements undermining their independence.

Public service media are independent media organisations financed by and answerable to the publics they serve.[footnote]Conseil mondial de la radiotélévision. (2001). Public broadcasting: why? how? pp. 11–15. UNESCO Digital Library. Available at: https://unesdoc.unesco.org/ark:/48223/pf0000124058[/footnote] Their roots lie in the 1920s technological revolution of radio broadcasting when the BBC was established as the world’s first public service broadcaster, funded by a licence fee, and with the ambition to ‘bring the best of everything to the greatest number of homes’.[footnote]BBC. (2022). The BBC Story – 1920s factsheet. Available at: http://downloads.bbc.co.uk/historyofthebbc/1920s.pdf[/footnote] Other national broadcasters were soon founded across Europe and also adopted the BBC’s mission to ‘inform, educate and entertain’. Although there are now public service media organisations in almost every country in the world, this report focuses on European public service media, which share comparable social, political and regulatory developments and therefore a similar context when considering the implementation of recommendation systems.

Public service media organisations have come to play an important institutional role within democratic societies in Europe, creating a bulwark against the potential control of public opinion either by the state or by particular interest groups.[footnote]Tambini, D. (2021). ‘Public service media should be thinking long term when it comes to AI’. Media@LSE. Available at: https://blogs.lse.ac.uk/medialse/2021/05/12/public-service-media-should-be-thinking-long-term-when-it-comes-to-ai/[/footnote] The establishment of public service broadcasters for the first time created a universally accessible public sphere where, in the words of the BBC’s founding director-general Lord Reith, ‘the genius and the fool, the wealthy and the poor listen simultaneously’. They aimed to forge a collective experience, ‘making the nation as one man’.[footnote]Higgins, C. (2014). ‘What can the origins of the BBC tell us about its future?’. The Guardian. Available at: https://www.theguardian.com/media/2014/apr/15/bbc-origins-future[/footnote] At the same time, public service media are expected to reflect the diversity of a nation, enabling the wide representation of perspectives in a democracy, as well as giving people sufficient information and understanding to make decisions on issues of public importance. These two functions create an inherent tension between public service media as an agonistic space where different viewpoints compete and a consensual forum where the nation comes together.

Public service values

The founding vision for public service media has remained within the DNA of organisations as their public service values – often called Reithian principles, in reference to the influence of the BBC’s founding director-general.

The European Broadcasting Union (EBU), the membership organisation for public service media in Europe, has codified the public service mission into six core values: universality, independence, excellence, diversity, accountability and innovation, and member organisations commit to strive to uphold these in practice.[footnote]European Broadcasting Union. (2012). Empowering Society: A Declaration on the Core Values of Public Service Media. Available at: https://www.ebu.ch/files/live/sites/ebu/files/Publications/EBU-Empowering-Society_EN.pdf[/footnote]

 

Public service values and their meanings:

Universality

  • reach all segments of society, with no-one excluded
  • share and express a plurality of views and ideas
  • create a public sphere, in which all citizens can form their own opinions and ideas, aiming for inclusion and social cohesion
  • multi-platform
  • accessible for everyone
  • enable audiences to engage and participate in a democratic society.

Independence

  • trustworthy content
  • act in the interest of audiences
  • completely impartial and independent from political, commercial and other influences and ideologies
  • autonomous in all aspects of the remit such as programming, editorial decision-making, staffing
  • independence underpinned by safeguards in law.

Excellence

  • high standards of integrity, professionalism and quality; create benchmarks within the media industries
  • foster talent
  • empower, enable and enrich audiences
  • audiences are also participants.

Diversity

  • reflect diversity of audiences by being diverse and pluralistic in the genres of programming, the views expressed, and the people employed
  • support and seek to give voice to a plurality of competing views – from those with different backgrounds, histories and stories. Help build a more inclusive, less fragmented society.

Accountability

  • listen to audiences and engage in a permanent and meaningful debate
  • publish editorial guidelines. Explain. Correct mistakes. Report on policies, budgets, editorial choices
  • be transparent and subject to constant public scrutiny
  • be efficient and managed according to the principles of good governance.

Innovation

  • enrich the media environment
  • be a driving force of innovation and creativity
  • develop new formats, new technologies, new ways of connectivity with audiences
  • attract, retain and train our staff so that they can participate in and shape the digital future, serving the public.

As well as signing up to these common values, each individual public service media organisation has its own articulation of its mission, purpose and values, often set out as part of its governance.[footnote]Statutory governance of public service media also varies from country to country and reflects national political and regulatory norms. The BBC is regulated by the independent broadcasting regulator Ofcom. The European Union’s revised Audio Visual Service Directive requires member states to have an independent regulator but this can take different forms. See: European Commission. (2018). Digital Single Market: updated audiovisual rules. Available at: https://ec.europa.eu/commission/presscorner/detail/en/MEMO_18_4093. For example, France has a central regulator, the Conseil Supérieur de l’Audiovisuel. But in Germany, although public service media objectives are defined in the constitution, oversight is provided by a regional broadcasting council, Rundfunkrat, reflecting the country’s federal structure. In Belgium too, regulation is devolved to two separate councils representing the country’s French and Flemish speaking regions.[/footnote] Ultimately these will align with those described by the EBU but may use different terms or have a different emphasis. Policymakers and practitioners operating at a national level are more likely to refer to these specific expressions of public values. The overarching EBU values are often referenced in academic literature as the theoretical benchmark for public service values. 

In the case of the BBC, the Royal Charter between the Government and the BBC is agreed for a 10-year period.[footnote]BBC. (2017). ‘Mission, values and public purposes’. Available at: https://www.bbc.com/aboutthebbc/governance/mission/. For comparison, ARD, the German public service media organisation, articulates its values as: ‘Participation, Independence, Quality, Diversity, Localism, Innovation, Value Creation, Responsibility’. See: ARD. (2021). Die ARD – Unser Beitrag zum Gemeinwohl. Available at: https://www.ard.de/die-ard/was-wir-leisten/ARD-Unser-Beitrag-zum-Gemeinwohl-Public-Value-100[/footnote]

The BBC: governance and values

 

Mission: to act in the public interest, serving all audiences through the provision of impartial, high-quality and distinctive output and services which inform, educate and entertain.

 

Public purposes:

  1. To provide impartial news and information to help people understand and engage with the world around them.
  2. To support learning for people of all ages.
  3. To show the most creative, highest quality and distinctive output and services.
  4. To reflect, represent and serve the diverse communities of all of the United Kingdom’s nations and regions and, in doing so, support the creative economy across the United Kingdom.
  5. To reflect the United Kingdom, its culture and values to the world.

 

Additionally, the BBC has its own set of organisational values that are not part of the governance agreement but that ‘represent the expectations we have for ourselves and each other, they guide our day-to-day decisions and the way we behave’:

  • Trust: Trust is the foundation of the BBC – we’re independent, impartial and truthful.
  • Respect: We respect each other – we’re kind, and we champion inclusivity.
  • Creativity: Creativity is the lifeblood of our organisation.
  • Audiences: Audiences are at the heart of everything we do.
  • One BBC: We are One BBC – we collaborate, learn and grow together.
  • Accountability: We are accountable and deliver work of the highest quality.

These kinds of regulatory requirements and values are then operationalised internally through organisations’ editorial guidelines, which again will vary from organisation to organisation, depending on the norms and expectations of their publics. Guidelines can be extensive and their aim is to help teams put public service values into practice. For example, the current BBC guidelines run to 220 pages, covering everything from how to run a competition to reporting on wars and acts of terror.

Nonetheless, such guidelines leave a lot of room for interpretation. Public service values are, by their nature, qualitative and difficult to measure objectively. For instance, consider the BBC guidelines on impartiality – an obligation that all regulated broadcasters in the UK must uphold, and one over which the BBC has faced intense scrutiny:

‘The BBC is committed to achieving due impartiality in all its output. This commitment is fundamental to our reputation, our values and the trust of audiences. The term “due” means that the impartiality must be adequate and appropriate to the output, taking account of the subject and nature of the content, the likely audience expectation and any signposting that may influence that expectation.’

‘Due impartiality usually involves more than a simple matter of ‘balance’ between opposing viewpoints. We must be inclusive, considering the broad perspective and ensuring that the existence of a range of views is appropriately reflected. It does not require absolute neutrality on every issue or detachment from fundamental democratic principles, such as the right to vote, freedom of expression and the rule of law. We are committed to reflecting a wide range of subject matter and perspectives across our output as a whole and over an appropriate timeframe so that no significant strand of thought is under-represented or omitted.’ 

It’s clear that impartiality is a question of judgement and may be expressed not in a single piece of content but across the range of BBC output over a period of time. In practice, teams internalise these expectations and make decisions based on institutional culture and internal mental models of public service value, rather than continually checking the editorial guidelines or referencing any specific public values matrix.[footnote]Mazzucato, M., Conway, R., Mazzoli, E., Knoll E. and Albala, S. (2020). Creating and measuring dynamic public value at the BBC, p.22. UCL Institute for Innovation and Public Purpose. Available at: https://www.ucl.ac.uk/bartlett/public-purpose/sites/public-purpose/files/final-bbc-report-6_jan.pdf[/footnote]

How public service media differ from other media organisations

Public service media are answerable to the publics they serve.[footnote]Not all public service media are publicly funded. Channel 4 in the UK for example is financed through advertising but owned by the public (although the UK Government has opened a consultation on privatisation).[/footnote] They should be independent from both government influence and from the influence of commercial owners. They operate to serve the public interest.

Commercial media, however, serve the interests of their owners or shareholders. Success for Netflix, for example, is measured in the number of subscribers, which then translates into revenue.[footnote]Circulation and profits for print media have declined in recent years, but some titles still promote their proprietors’ interests through political influence – for instance the Murdoch-owned Sun in the UK or the Axel Springer-owned Bild Zeitung in Germany.[/footnote]

The activities of commercial media are nonetheless limited by regulation. In the UK the independent regulator Ofcom’s Broadcasting Code requires all broadcasters (not just public service media) to abide by principles such as fairness and impartiality.[footnote]Ofcom. (2020). The Ofcom Broadcasting Code (with the Cross-promotion Code and the On Demand Programme Service Rules). Available at: https://www.ofcom.org.uk/tv-radio-and-on-demand/broadcast-codes/broadcast-code[/footnote] Russia Today, for example, has been investigated for allegedly misleading reporting on the conflict in Ukraine.[footnote]Ofcom. (2022). ‘Ofcom launches 15 investigations into RT’. Available at: https://www.ofcom.org.uk/news-centre/2022/ofcom-launches-investigations-into-rt[/footnote] Streaming services are subject to more limited regulation, covering child protection, incitement to hatred and product placement,[footnote]Ofcom. (2021). Guide to video on demand. Available at: https://www.ofcom.org.uk/tv-radio-and-on-demand/advice-for-consumers/television/video-on-demand[/footnote] while the press – both online and in print – is lightly self-regulated, largely through the Independent Press Standards Organisation, with some publications regulated by IMPRESS.[footnote]Independent Press Standards Organisation (IPSO). (2022). ‘What we do’. Available at: https://www.ipso.co.uk/what-we-do/; IMPRESS. ‘Regulated Publications’. Available at: https://impress.press/regulated-publications/[/footnote]

However, public service media have extensive additional obligations, amongst others to ‘meet the needs and satisfy the interests of as many different audiences as practicable’ and ‘reflect the lives and concerns of different communities and cultural interests and traditions within the United Kingdom, and locally in different parts of the United Kingdom’.[footnote]UK Government. Communications Act 2003, section 265. Available at: https://www.legislation.gov.uk/ukpga/2003/21/section/265[/footnote]

These regulatory systems vary from country to country but share broadly the same characteristics. In all cases, the public service remit entails far greater duties than in the private sector and broadcasters are more heavily regulated than digital providers.

These obligations are also framed in terms of public or societal benefit. This means public service media are striving to achieve societal goals that may not be aligned with a pure maximisation of profits, while commercial media pursue interests more aligned with revenue and the interests of their shareholders.

Nonetheless, public service media face scrutiny about how well they meet their objectives and have had to create proxies for these intangible goals to demonstrate their value to society.

‘[Public service media] is fraught today with political contention. It must justify its existence and many of its efforts to governments that are sometimes quite hostile, and to special interest groups and even competitors. Measuring public value in economic terms is therefore a focus of existential importance; like it or not diverse accountability processes and assessment are a necessity.’[footnote]Lowe, G. and Martin, F. (eds.). (2014). The Value and Values of Public Service Media.[/footnote]

In practice this means public service media organisations measure their services against a range of hard metrics, such as audience reach and value for money, as well as softer measures like audience satisfaction surveys.[footnote]BBC. (2021). BBC Annual Plan 2021-22, Annex 1. Available at: http://downloads.bbc.co.uk/aboutthebbc/reports/annualplan/annual-plan-2021-22.pdf[/footnote] In the mid-2000s the BBC developed a public value test to inform strategic decisions, which has since been adopted as a public interest test that remains part of the BBC’s governance. Similar processes have been created in other public service media systems, such as the ‘Three Step Test’ in German broadcasting.[footnote]The 12th Inter-State Broadcasting Treaty, the regulatory framework for public service and commercial broadcasting across Germany’s federal states, introduced a three-step test for assessing whether online services offered by public service broadcasters met their public service remit. Under the three-step test, the broadcaster needs to assess: first, whether a new or significantly amended digital service satisfies the democratic, social and cultural needs of society; second, whether it contributes to media competition from a qualitative point of view; and third, the associated financial cost. See: Institute for Media and Communication Policy. (2009). Drei-Stufen-Test. Available at: http://medienpolitik.eu/drei-stufen-test/[/footnote] These methods have their own limitations, drawing public media into a paradigm of cost-benefit analysis and market fixing, rather than articulating wider values to individuals, society and industry.[footnote]Mazzucato, M., Conway, R., Mazzoli, E., Knoll E. and Albala, S. (2020). Creating and measuring dynamic public value at the BBC, p.22. UCL Institute for Innovation and Public Purpose. Available at: https://www.ucl.ac.uk/bartlett/public-purpose/sites/public-purpose/files/final-bbc-report-6_jan.pdf[/footnote]
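
To give a concrete, if simplified, sense of what such measures can look like, the sketch below computes a hard metric (reach) and one possible quantitative proxy for a softer goal (how varied the genre mix consumed is, using Shannon entropy) from hypothetical consumption logs. The logs, the population figure and the choice of entropy as a diversity proxy are all illustrative assumptions, not the BBC’s or any regulator’s actual public value measures.

```python
# Illustrative metrics over hypothetical consumption logs. Neither the data
# nor the choice of measures reflects any broadcaster's real evaluation.
import math
from collections import Counter

logs = [  # (user_id, genre) pairs from an invented week of consumption
    ("u1", "news"), ("u1", "news"), ("u1", "drama"),
    ("u2", "comedy"), ("u3", "news"), ("u3", "documentary"),
]
population = 10  # size of the hypothetical target audience

def reach(logs, population):
    """Share of the target audience that consumed anything at all."""
    return len({user for user, _ in logs}) / population

def genre_entropy(logs):
    """Shannon entropy (in bits) of the genre mix actually consumed."""
    counts = Counter(genre for _, genre in logs)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

print(f"reach = {reach(logs, population):.0%}")        # 30%
print(f"genre diversity = {genre_entropy(logs):.2f} bits")
```

Even then, such figures are proxies that quantify only one slice of a value like universality or diversity, which is why evaluation is treated in this report as a broader question than metrics alone.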

This does not mean commercial media are devoid of values. Spotify for example says its mission ‘is to unlock the potential of human creativity—by giving a million creative artists the opportunity to live off their art and billions of fans the opportunity to enjoy and be inspired by it’,[footnote]Spotify. (2022). ‘About Spotify’. Available at: https://newsroom.spotify.com/company-info/[/footnote] while Netflix’s organisational values are judgment, communication, curiosity, courage, passion, selflessness, innovation, inclusion, integrity and impact.[footnote]Netflix. (2022). ‘Netflix Culture’. Available at: https://jobs.netflix.com/culture[/footnote] Commercial media are also sensitive to issues that present reputational risk, for instance the outcry over Joe Rogan’s Spotify podcast propagating disinformation about COVID-19 or Jimmy Carr’s joke about the Holocaust.[footnote]Silberling, A. (2022). ‘Spotify adds COVID-19 content advisory’. TechCrunch. Available at: https://social.techcrunch.com/2022/03/28/spotify-covid-19-content-advisory-joe-rogan/; Jackson, S. (2022). ‘Jimmy Carr condemned by Nadine Dorries for “shocking” Holocaust joke about travellers in Netflix special His Dark Material’. Sky News. Available at: https://news.sky.com/story/jimmy-carr-condemned-for-disturbing-holocaust-joke-about-travellers-in-netflix-special-his-dark-material-12533148[/footnote]

However, commercial media harness values in service of their business model, whereas for public service media the values themselves are the organisational objective. Therefore, while the ultimate goal of a commercial media organisation is quantitative (revenue) the ultimate goal of public service media is qualitative (public value) – even if this is converted into quantitative proxies.

This difference between public and private media companies is fundamental in how they adopt recommendation systems. We discuss this further later in the report when examining the objectives of using recommendation systems.

Current challenges for public service media

Since their inception, public service media and their values have been tested and reinterpreted in response to new technologies.

The introduction of the BBC Light Programme in 1945, a light entertainment alternative to the serious fare offered by the BBC Home Service, challenged the principle of universality (not everyone was listening to the same content at the same time) as well as the balance between the mission to inform, educate and entertain (should public service broadcasting give people what they want or what they need?). The arrival of the video recorder, and then new channels and platforms, gave audiences an option to opt out of the curated broadcast schedule – where editors determined what should be consumed. While this enabled more and more personalised and asynchronous listening and viewing, it potentially reduced exposure to the serendipitous and diverse content that is often considered vital to the public service remit.[footnote]van Es, K. F. (2017). ‘An Impending Crisis of Imagination: Data‐Driven Personalization in Public Service Broadcasters’. Media@LSE. Available at: https://dspace.library.uu.nl/handle/1874/358206[/footnote] The arrival and now dominance of digital technologies comes amid a collision of simultaneous challenges which, in combination, may be existential.

Audience

Public service media have always had a hybrid role. They are obliged to serve the public simultaneously as citizens and consumers.[footnote]BBC Trust. (2012). BBC Trust assessment processes Guidance document. Available at: http://downloads.bbc.co.uk/bbctrust/assets/files/pdf/about/how_we_govern/pvt/assessment_processes_guidance.pdf[/footnote]

Their public service mandate requires them to produce content and serve audiences that the commercial market does not provide for. At the same time, their duty to provide a universal service means they must aim to reach a sizeable mainstream audience and be active participants in the competitive commercial market.

Although people continue to use and value public service media, the arrival of streaming services such as Netflix, Amazon and Spotify, as well as the availability of content on YouTube, has had a massive impact on public service media audience share.

In the UK, the COVID-19 pandemic has seen people return to public service media as a source of trusted information, and with more time at home they have also consumed more public service content.[footnote]BBC. (2021). Annual Plan 2021-22. Available at: http://downloads.bbc.co.uk/aboutthebbc/reports/annualplan/annual-plan-2021-22.pdf[/footnote]

But lockdowns also supercharged the uptake of streaming. By September 2020, 60% of all UK households subscribed to an on-demand service, up from 49% a year earlier. Just under half (47%) of all adults who go online now consider online services to be their main way of watching TV and films, rising to around two-thirds (64%) among 18–24 year olds.[footnote]Ofcom. (2021). Small Screen: Big Debate – Recommendations to Government on the future of Public Service Media. Available at: https://www.smallscreenbigdebate.co.uk/__data/assets/pdf_file/0023/221954/statement-future-of-public-service-media.pdf[/footnote]

Public service media are particularly concerned about their failure to reach younger audiences.[footnote]Lowe, G.F. and Maijanen, P. (2019). ‘Making sense of the public service mission in media: youth audiences, competition, and strategic management’. Journal of Media Business Studies. doi: 10.1080/16522354.2018.1553279; Schulz, A., Levy, D. and Nielsen, R.K. (2019). ‘Old, Educated, and Politically Diverse: The Audience of Public Service News’, pp. 15–19, 29–30. Reuters Institute for the Study of Journalism. Available at: https://reutersinstitute.politics.ox.ac.uk/our-research/old-educated-and-politically-diverse-audience-public-service-news[/footnote] Although this group still encounters public service media content, they tend to do so on external services: younger viewers (16–34 year olds) are more likely to watch BBC content on subscription video-on-demand (SVoD) services than through BBC iPlayer (4.7 minutes per day on SVoD vs. 2.5 minutes per day on iPlayer).[footnote]Ofcom. (2021). Small Screen: Big Debate – Recommendations to Government on the future of Public Service Media. Available at: https://www.smallscreenbigdebate.co.uk/__data/assets/pdf_file/0023/221954/statement-future-of-public-service-media.pdf[/footnote] They are not necessarily aware of the source of the content and do not form an emotional connection with the public service media as a trusted brand. Meanwhile, platforms gain valuable audience insight data through this consumption, which they do not pass on to the public service media organisations.[footnote]House of Commons Digital, Culture, Media and Sport Committee. (2021). The future of public service broadcasting, HC 156. Available at: https://publications.parliament.uk/pa/cm5801/cmselect/cmcumeds/156/156.pdf[/footnote]

Regulation

Legislation has not kept pace with the rate of technological change. Public service media are trying to grapple with the dynamics of the competitive digital landscape on stagnant or declining budgets, while continuing to meet their obligations to provide linear TV and radio broadcasting to a still substantial legacy audience.

The UK broadcasting regulator Ofcom published recommendations in 2021, repeating its previous demands for an urgent update to the public service media system to make it sustainable for the future. These include modernising the public service objectives, changing licences to apply across broadcast and online services and allowing greater flexibility in commissioning across platforms.[footnote]Ofcom. (2021). Small Screen: Big Debate – Recommendations to Government on the future of Public Service Media. Available at: https://www.smallscreenbigdebate.co.uk/__data/assets/pdf_file/0023/221954/statement-future-of-public-service-media.pdf[/footnote]

The Digital, Culture, Media and Sport Select Committee of the House of Commons has also demanded regulatory change. It warned that ‘hurdles such as the Public Interest Test inhibit the ability of [public service broadcasters] to be agile and innovate at speed in order to compete with other online services’ and that the core principle of universality would be threatened unless public service media were better able to attract younger audiences.[footnote]House of Commons Digital, Culture, Media and Sport Committee. (2021). The future of public service broadcasting, HC 156. Available at: https://publications.parliament.uk/pa/cm5801/cmselect/cmcumeds/156/156.pdf[/footnote]

Although there has been a great deal of activity around other elements of technology regulation, particularly the Online Safety Bill in the UK and the Digital Services Act in the European Union, the regulation of public service media has not been treated with the same urgency. There is so far no Government white paper for a promised Media Bill that would address this in the UK and the European Commission’s proposals for a European Media Freedom Act are in the early stages of consultation.[footnote]European Commission. (2022). ‘European Media Freedom Act: Commission launches public consultation’. Available at: https://ec.europa.eu/commission/presscorner/detail/en/ip_22_85[/footnote]

Political context

Public service media have always been a political battleground and have often had fractious relationships with the government of the day. But the rise of populist political movements and governments has created new fault lines and made public service media a battlefield in the culture wars. The Polish and Hungarian Governments have moved to undermine the independence of public service media, while the far-right AfD party in eastern Germany refused to approve funding for public broadcasting.[footnote]The Economist. (2021). ‘Populists are threatening Europe’s independent public broadcasters’. Available at: https://www.economist.com/europe/2021/04/08/populists-are-threatening-europes-independent-public-broadcasters[/footnote] In the UK, the Government has frozen the licence fee for two years and has said future funding arrangements are ‘up for discussion’. It has also been accused of trying to appoint an ideological ally to lead the independent media regulator Ofcom. Elsewhere in Europe, journalists from public service media have been attacked by anti-immigrant and COVID-denial protesters.[footnote]The Economist. (2021).[/footnote]

At the same time, public service media are criticised as unrepresentative of the publics they are supposed to serve. In the UK, both the BBC and Channel 4 have attempted to address this by moving parts of their workforce out of London.[footnote]The Sutton Trust. (2019). Elitist Britain, pp. 40–42. Available at: https://www.suttontrust.com/our-research/elitist-britain-2019/; Friedman, S. and Laurison, D. (2019). ‘The class pay gap: why it pays to be privileged’. The Guardian. Available at: https://www.theguardian.com/society/2019/feb/07/the-class-pay-gap-why-it-pays-to-be-privileged[/footnote] As social media has removed traditional gatekeepers to the public sphere, there is less acceptance of and deference towards the judgement of media decision-makers. In a fragmented public sphere, it becomes harder for public service media to ‘hold the ring’ – on issues like Brexit, COVID-19, race and transgender rights, public service media find themselves distrusted by both sides of the argument.

Although the provision of information and educational resources through the COVID-19 pandemic has given public service media a boost, both in audiences and in levels of trust, they can no longer take their societal value or even their continued existence for granted.[footnote]BBC. (2021). Annual Plan 2021-22. Available at: http://downloads.bbc.co.uk/aboutthebbc/reports/annualplan/annual-plan-2021-22.pdf[/footnote] Since the arrival of the internet, their monopoly on disseminating real-time information to a wide public has been broken and so their role in both the media and democratic landscape is up for grabs.[footnote]Interview with Jannick Kirk Sørensen, Associate Professor in Digital Media, Aalborg University (2021).[/footnote] For some, this means public service media is redundant.[footnote]Booth, P. (2020). New Vision: Transforming the BBC into a subscriber-owned mutual. Institute of Economic Affairs. Available at: https://iea.org.uk/publications/new-vision[/footnote] For others, its function should now be to uphold national culture and distinctiveness in the face of the global hegemony of US-owned platforms.[footnote]Department for Digital, Culture, Media & Sport and John Whittingdale OBE MP. (2021). John Whittingdale’s speech to the RTS Cambridge Convention 2021. UK Government. Available at: https://www.gov.uk/government/speeches/john-whittingdales-speech-to-the-rts-cambridge-convention-2021[/footnote]

The Institute for Innovation and Public Purpose has proposed reimagining the BBC as a ‘market shaper’ rather than a market fixer, based on a concept of dynamic public value,[footnote]Mazzucato, M., Conway, R., Mazzoli, E., Knoll E. and Albala, S. (2020). Creating and measuring dynamic public value at the BBC, p.22. UCL Institute for Innovation and Public Purpose. Available at: https://www.ucl.ac.uk/bartlett/public-purpose/sites/public-purpose/files/final-bbc-report-6_jan.pdf[/footnote] while the Media Reform Coalition calls for the creation of a Media Commons of independent, democratic and accountable media organisations, including a People’s BBC and Channel 4.[footnote]Grayson, D. (2021). Manifesto for a People’s Media. Media Reform Coalition. Available at: https://drive.google.com/file/u/1/d/1_6GeXiDR3DGh1sYjFI_hbgV9HfLWzhPi/view?usp=embed_facebook[/footnote] The wide range of ideas in play demonstrates how open the possible futures of public service media could be.

Introducing recommendation systems

The main steps in the development of a recommendation: user engagement with the platform, data gathering, algorithmic analysis and recommendation generation.

Day-to-day, we might turn to friends or family for recommendations on decisions large and small, from dining out and entertainment to big purchases. We might also look at expert reviews. But in the last decade, there has been a critical change in where recommendations come from and how they are used: recommendations have become a pervasive feature of the digital products we use.

Recommendation systems are a type of software that filters information based on contextual data and according to criteria set by its designers. In this section, we briefly outline how recommendation systems operate and how they are used in practice by European public service media. At least a quarter of European public service media have begun deploying recommendation systems. They are mainly used on video platforms, and only on small sections of those services – the vast majority of public service content continues to be manually curated by editors.

In media organisations, producers, editors and journalists have always made implicit and explicit decisions about what to give prominence to, from what stories to tell and what programmes to commission, to – just as importantly – how those stories are presented. Deciding what makes the front page, what gets prime time, what makes top billing on the evening news – these are all acts of recommendation. For some, the entire institution is a system for recommending content to their audiences.

Public service media organisations are starting to automate these decisions by using recommendation systems.

Recommendation systems are context-driven information filtering systems. They don’t use explicit search queries from the user (unlike search engines) and instead rank content based only on contextual information.[footnote]Tennenholtz, M. and Kurland, O. (2019). ‘Rethinking Search Engines and Recommendation Systems: A Game Theoretic Perspective’. Communications of the ACM, December 2019, 62(12), pp. 66–75. Available at: https://cacm.acm.org/magazines/2019/12/241056-rethinking-search-engines-and-recommendation-systems/fulltext; Jannach, D. and Adomavicius, G. (2016), ‘Recommendations with a Purpose’. RecSys ’16: Proceedings of the 10th ACM Conference on Recommender Systems, pp7–10. Available at: https://doi.org/10.1145/2959100.2959186; Jannach, D., Zanker, M., Felfernig, and Friedrich, G. (2010). Recommender Systems: An Introduction. Cambridge University Press. doi: 10.1017/CBO9780511763113; Ricci, F., Rokach, L. and Shapira, B. (2015). Recommender Systems Handbook. Springer New York: New York. doi: 10.1007/978-1-4899-7637-6[/footnote]

This can include:

  • the item being viewed, e.g. the current webpage, the article being read, the video that just finished playing etc.
  • the items being filtered and recommended, e.g. the length of the content, when it was published and its characteristics (such as drama, sport or news) – often described as metadata about the content
  • the users, e.g. their location or language preferences, their past interactions with the recommendation system etc.
  • the wider environment, e.g. the time of day.
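To make this concrete, the snippet below is a minimal, hypothetical sketch of a context-driven filter in Python. The item and context fields, the weights and the scoring rule are illustrative assumptions rather than any broadcaster’s actual implementation; the point is only that ranking relies on contextual signals rather than an explicit query.

```python
# Minimal, hypothetical sketch of context-driven filtering.
# All fields, weights and rules are illustrative assumptions, not a real system.
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class Item:
    item_id: str
    genres: set            # content metadata, e.g. {"drama", "news"}
    published: datetime
    language: str

@dataclass
class Context:
    current_item_genres: set   # genres of the item just viewed
    user_language: str         # user preference
    previously_seen: set       # past interactions with the system
    hour_of_day: int           # wider environment, e.g. time of day (unused in this toy scorer)

def score(item: Item, ctx: Context) -> float:
    """Rank a candidate item using only contextual signals - no search query."""
    if item.item_id in ctx.previously_seen:
        return 0.0                                                   # drop already-seen items
    age = datetime.now() - item.published
    recency = max(0.0, 1.0 - age / timedelta(days=28))               # newer items score higher
    genre_overlap = len(item.genres & ctx.current_item_genres) / max(len(item.genres), 1)
    language_match = 1.0 if item.language == ctx.user_language else 0.2
    return 0.4 * recency + 0.4 * genre_overlap + 0.2 * language_match

def recommend(candidates, ctx: Context, k: int = 5):
    """Return the k highest-scoring items for this context."""
    return sorted(candidates, key=lambda item: score(item, ctx), reverse=True)[:k]
```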

Examples of well-known products utilising recommendation systems include:

  • Netflix’s homepage
  • Spotify’s auto-generated playlists and auto-play features
  • Facebook’s ‘People You May Know’ and ‘News Feed’
  • YouTube’s video recommendations
  • TikTok’s ‘For You’ page
  • Amazon’s ‘Recommended For You’, ‘Frequently Bought Together’, ‘Items Recently Viewed’, ‘Customers Who Bought This Item Also Bought’, ‘Best-Selling’ etc.[footnote]Singh, S. (2020). Why Am I Seeing This? – Case study: Amazon. New America. Available at: https://www.newamerica.org/oti/reports/why-am-i-seeing-this[/footnote]
  • Tinder’s swiping page[footnote]Liu, S. (2017). ‘Personalized Recommendations at Tinder’ [presentation]. Available at: https://www.slideshare.net/SessionsEvents/dr-steve-liu-chief-scientist-tinder-at-mlconf-sf-2017[/footnote]
  • LinkedIn’s ‘Recommend for you’ jobs page.
  • Deliveroo or UberEats’ ‘recommended’ sort for restaurants.

Recommendation systems and search engines

It is worth acknowledging the difference between recommendation systems and search engines, which can be thought of as query-driven information filtering systems. They filter, rank and display webpages, images and other items primarily in response to a query from a user (such as searching Google for ‘restaurants near me’). This is then often combined with the contextual information mentioned above. Google Search is the archetypal search engine in most Western countries, but other widely used search engines include Yandex, Baidu and Yahoo. Many public service media organisations offer a query-driven search feature on their services that enables users to search for news stories or entertainment content.

In this report, we have chosen to focus on recommendation systems rather than search engines as the context-driven rather than query-driven approach of recommendation systems is much more analogous to traditional human editorial judgment and content curation.

Broadly speaking, recommendation systems take a series of inputs, filter and select which ones are most important, and produce an output (the recommendation). The inputs and outputs of recommendation systems are subject to content moderation (in which the pool of content is pre-screened and filtered) and curation (in which content is selected, organised and presented).

This starts by deciding what to input into the recommendation system. The pool of content to draw from is often dictated by the nature of the platform itself, such as activity from your friends, groups, events, etc. alongside adverts, as in the case of Facebook. In the case of public service media, the pool of content is often their back catalogue of audio, video or news content.

This content will have been moderated in some way before it reaches the recommendation system, either manually by human moderators or editors, or automatically through software tools. On Facebook, this means attempts to remove inappropriate user content, such as misinformation or hate speech, from the platform entirely, according to moderation guidelines. For a public service media organisation, this will happen in the commissioning and editing of articles, radio programmes and TV shows by producers and editorial teams.

The pool of content will then be further curated as it moves through the recommendation system, as certain pieces of content might be deemed appropriate to publish but not to recommend in a particular context, e.g. Facebook might want to avoid recommending posts in languages you don’t speak. In the case of public service media, this generally takes the form of business rules, which are editorial guidelines implemented directly into the recommendation system.

Some business rules apply equally across all users and further constrain the set of content from which the system recommends, such as only selecting content from the past few weeks. Other rules apply after individual user recommendations have been generated and filter those recommendations based on specific information about the user’s context, such as not recommending content the user has already consumed.

For example, below are business rules that were implemented in BBC Sounds’ Xantus recommendation system, as of summer 2021:[footnote]Note that the business rules are subject to change, and so the rules given here are intended to be an indicative example only, representing a snapshot of practice at one point in time. See: Al-Chueyr Martins, T. (2021). ‘From an idea to production: the journey of a recommendation engine’ [presentation recording]. MLOps London. Available at: https://www.youtube.com/watch?v=dFXKJZNVgw4[/footnote]

Non-personalised business rules:

  • Recency
  • Availability
  • Excluded ‘master brands’, e.g. particular radio channels[footnote]Smethurst, M. (2014). Designing a URL structure for BBC programmes. Available at: https://smethur.st/posts/176135860[/footnote]
  • Excluded genres
  • Diversification (1 episode per brand/series)

Personalised business rules:

  • Already seen items
  • Local radio (if not consumed previously)
  • Specific language (if not consumed previously)
  • Episode picking from a series
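In code, a two-stage rule pipeline of this kind might look roughly like the sketch below. This is an illustrative assumption only – the field names, the six-week recency window, the excluded values and the ranker are placeholders, not the actual Xantus implementation.

```python
# Hypothetical sketch of business rules wrapped around a recommender.
# Field names, thresholds and exclusion lists are placeholders, not real BBC rules.
from datetime import datetime, timedelta

EXCLUDED_BRANDS = {"example_excluded_brand"}
EXCLUDED_GENRES = {"example_excluded_genre"}

def apply_non_personalised_rules(catalogue, now=None):
    """Constrain the candidate pool before any per-user ranking happens."""
    now = now or datetime.now()
    pool, series_seen = [], set()
    for item in catalogue:
        if now - item["published"] > timedelta(weeks=6):    # recency
            continue
        if not item["available"]:                           # availability
            continue
        if item["master_brand"] in EXCLUDED_BRANDS:         # excluded 'master brands'
            continue
        if item["genre"] in EXCLUDED_GENRES:                # excluded genres
            continue
        if item["series"] in series_seen:                   # diversification: 1 episode per series
            continue
        series_seen.add(item["series"])
        pool.append(item)
    return pool

def apply_personalised_rules(ranked_items, user):
    """Filter one user's ranked recommendations using information about that user."""
    return [item for item in ranked_items
            if item["id"] not in user["already_seen"]]      # drop already-seen items

def recommend_for(user, catalogue, ranker, k=10):
    candidates = apply_non_personalised_rules(catalogue)
    ranked = ranker(candidates, user)                       # e.g. a collaborative filtering ranker
    return apply_personalised_rules(ranked, user)[:k]
```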

How different types of recommendation systems work

Not all recommendation systems are the same. One major difference relates to what categories of items a system is filtering and curating for. This can include, but isn’t limited to:

  • content, e.g. news articles, comments, user posts, podcasts, songs, short-form video, long-form video, movies, images etc. or any combination of these content types
  • people, e.g. dating app profiles, Facebook profiles, Twitter accounts etc.
  • metadata, e.g. the time, date, location, category etc. of a piece of content or the age, gender, location etc. of a person.

In this report, we mainly focus on:

  1. Media content recommendation systems: these systems rank and display pieces of media content, e.g. news articles, podcasts, short-form videos, radio shows, television shows, movies etc. to users of news websites, video-on-demand and streaming services, music and podcast apps etc.
  2. Media content metadata recommendation systems: these rank and display suggestions for information to classify pieces of media content, e.g. genre, people or places which appear in the piece of media, or other tags, to journalists, editors or other members of staff at media organisations.

Another important distinction between applications of recommendation systems is the role of the provider in choosing which set of items the recommendation system is applied to. There are three categories of use for recommendation systems:

  1. Open recommending: The recommendation system operates primarily on items that are generated by users of the platform, or automatically and indiscriminately aggregated from other sources, without the platform curating or individually approving the items. Examples include YouTube, TikTok’s ‘For You’ page, Facebook’s ‘News Feed’ and many dating apps.
  2. Curated recommending: The recommendation system operates on items which are curated, approved or otherwise editorialised by the platform operating the recommendation system. These systems still primarily rely on items generated by external sources, sometimes blended with items produced by the platform. Often these external items will come in the form of licensed or syndicated content such as music, films, TV shows, etc. rather than user-generated items. Examples include Netflix, Spotify and Disney+.
  3. Closed recommending: The recommendation system operates exclusively on items generated or commissioned by the platform operating the recommendation system. Examples include most recommendation systems used on the website of news organisations.

Lastly, there are different types of technical approaches that a recommendation system may use to sort and filter content. The approaches detailed below are not mutually exclusive and can be combined in recommendation systems in particular contexts:

  • Collaborative filtering (example: ‘Customers Who Bought This Item Also Bought’ on Amazon): The system recommends items to users based on the past interactions and preferences of other users who are classified as having similar past interactions and preferences. These patterns of behaviour from other users are used to predict how the user seeing the recommendation would rate new items. Those predicted ratings are then used to generate recommendations of items that are similar to content previously popular with similar users.
  • Matrix factorisation (example: Netflix’s ‘Watch Next’ feature): A subclass of collaborative filtering, this method codifies users and items into a small set of categories based on all the user ratings in a system. When Netflix recommends movies, a user may be codified by how much they like action, comedy, etc. and a movie might be codified by how much it fits into these genres. This codified representation can then be used to guess how much a user will like a movie they haven’t seen before, based on whether these codified summaries ‘match’.
  • Content-based filtering (example: Netflix’s ‘Action Movies’ list): These methods recommend items based on the codified properties of the item stored in the database. If the profile of items a user likes mostly consists of action films, the system will recommend other items that are tagged as action films. The system does not draw on other users’ data or behaviour to make recommendations.
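The toy sketch below illustrates the difference between these approaches on a tiny, made-up ratings matrix and set of content tags; the data and similarity measures are illustrative assumptions, not production techniques.

```python
# Toy sketches of the three filtering approaches described above.
# The ratings, tags and similarity measures are made up for illustration only.
import numpy as np

# Ratings matrix: rows = users, columns = items, 0 = not rated.
ratings = np.array([
    [5, 4, 0, 1],
    [4, 5, 0, 0],
    [0, 1, 5, 4],
], dtype=float)

# 1. Collaborative filtering: weight other users' ratings by how similar they are to the target user.
def collaborative_scores(user_index):
    target = ratings[user_index]
    sims = np.array([
        np.dot(target, other) / (np.linalg.norm(target) * np.linalg.norm(other) + 1e-9)
        for other in ratings
    ])
    sims[user_index] = 0.0              # ignore the user's own row
    return sims @ ratings               # similarity-weighted sum of other users' ratings per item

# 2. Matrix factorisation: compress users and items into a few latent 'taste' dimensions.
def matrix_factorisation_scores(latent_dims=2):
    u, s, vt = np.linalg.svd(ratings, full_matrices=False)
    approx = u[:, :latent_dims] @ np.diag(s[:latent_dims]) @ vt[:latent_dims, :]
    return approx                       # predicted ratings, including items the user has not seen

# 3. Content-based filtering: match item tags against the tags of an item the user liked.
item_tags = [{"action"}, {"action", "comedy"}, {"documentary"}, {"documentary", "news"}]

def content_based_scores(liked_item_index):
    liked = item_tags[liked_item_index]
    return [len(liked & tags) / len(liked | tags) for tags in item_tags]   # Jaccard similarity

print(collaborative_scores(0))
print(matrix_factorisation_scores())
print(content_based_scores(0))
```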

Of these typologies, the public service media that we surveyed only use closed recommendation systems as they are applying recommendations to content they have commissioned or produced. However, we found examples of public service media using all types of filtering approaches: collaborative filtering, content-based filtering and hybrid recommendation systems.

How do European public service media organisations use recommendation systems?

The use of recommendation systems is common but not ubiquitous among public service media organisations in Europe. As of 2021, at least a quarter of European Broadcasting Union (EBU) member organisations were using recommendation systems on at least one of their content delivery platforms.[footnote]See Annex 1 for more details.[/footnote] Video-on-demand platforms are the most common use case for recommendation systems, followed by audio-on-demand and news content. As well as these public-facing recommendation systems, some public service media also use recommendation systems for internal-only purposes, such as systems that assist journalists and producers with archival research.[footnote]Interview with Ben Fields, Lead Data Scientist, Digital Publishing, BBC (2021).[/footnote]

Figure 1: Recommendation system use by European public service media by platform (EBU, 2020)

Platforms on which public service media offer personalised recommendations, with the number of European Broadcasting Union member organisations doing so and examples:

  • Video-on-demand: at least 18 organisations (e.g. BBC iPlayer)
  • Audio-on-demand: at least 10 organisations (e.g. BBC Sounds, ARD Audiothek)
  • News content: at least 7 organisations (e.g. VRT NWS app)

Among the EBU member organisations which reported using recommendation systems in a 2020 survey, recommendations were displayed:

  • in a dedicated section on the on-demand homepage (by at least 16 organisations)
  • in the player as ‘play next’ suggestions (by at least 10 organisations)
  • as ‘top picks’ on the on-demand homepage (by at least 9 organisations).

Even among organisations that have adopted recommendation systems, their use remains very limited. NPO in the Netherlands was the only organisation we encountered that aims to have a fully algorithmically driven homepage on its main platform. In most cases, the vast majority of content remains under human editorial control, with only small sub-sections of the interface offering recommended content.

As editorial independence is a key public service value, as well as a differentiator of public service media from their private-sector competitors, it is likely that most public service media will retain a significant element of editorial curation. The requirement for universality also creates a strong incentive to ensure that there is a substantial foundation of shared information to which everyone in society should be exposed.

Recommendation systems in the BBC

The BBC is significantly larger in staff, output and audience than other European public service media organisations. It has a substantial research and development department and has been exploring the use of recommendation systems across a range of initiatives since 2008.[footnote]See Annex 2 for more details.[/footnote]

In 2017, the BBC Datalab was established with the aim of helping audiences discover relevant content by bringing together data from across the BBC, augmented by machine learning and editorial expertise.[footnote]BBC. (2019). ‘Join the DataLab team at the BBC!’. BBC Careers. Available at: https://careerssearch.bbc.co.uk/jobs/job/Join-the-DataLab-team-at-the-BBC/40012; BBC Datalab. ‘Machine learning at the BBC’. Available at: https://datalab.rocks/[/footnote] It was envisioned as a central capability across the whole of the BBC (TV, radio, news and web) which would build a data platform for other BBC teams that would create consistent and relevant experiences for audiences across different products. In practice, this has meant collaborating with different product teams to develop recommendation systems.

The BBC now uses several recommendation systems, at different degrees of maturity, across different forms of media, including:

  • written content, e.g. the BBC News app and some international news services, such as the Spanish-language BBC Mundo, recommending additional news stories[footnote]McGovern, A. (2019). ‘Understanding public service curation: What do “good” recommendations look like?’. BBC. Available at: https://www.bbc.co.uk/blogs/internet/entries/887fd87e-1da7-45f3-9dc7-ce5956b790d2[/footnote]
  • audio-on-demand, e.g. BBC Sounds recommending radio programmes and music mixes a user might like
  • short-form video, e.g. BBC Sport and BBC+ (now discontinued) recommending videos the user might like
  • long-form video, e.g. BBC iPlayer recommending TV shows or films the user might like.

Approaches to the development of recommendation systems

Public service media organisations face a choice: buy an external ‘off-the-shelf’ recommendation system or build one themselves.

The BBC initially used third-party providers of recommendation systems but, as part of a wider review of online services, began to test the pros and cons of bringing this function in-house. Building on years of their own R&D work, the BBC found they were able to build a recommendation system that not only matched but could outperform the bought-in systems. Once it was clear that personalisation would be central to the future strategy of the BBC, they decided to bring all systems in-house with the aim of being ‘in control of their destiny’.[footnote]Interview with Andrew McParland, Principal Engineer, BBC R&D (2021).[/footnote] The perceived benefits include building up technical capability and understanding within the organisation, better control and integration of editorial teams, better alignment with public service values and greater opportunity to experiment in the future.[footnote]Commercial (i.e. non public service) BBC services however still use external recommendation providers. See: Taboola. (2021). ‘BBC Global News Chooses Taboola as its Exclusive Content Recommendations Provider’. Available at: https://www.taboola.com/press-release/bbc-global-news-chooses-taboola-as-its-exclusive-content-recommendations-provider[/footnote]

The BBC has far greater budgets and expertise than most other public service media organisations to experiment with and develop recommendation systems. But many other organisations have also chosen to build their own products. Dutch broadcaster NPO has a small team of only four or five data scientists, focused on building ‘smart but simple’ recommendations in-house, having found third-party products did not cater to their needs. It is also important to them that they should be able to safeguard their audience data and be able to offer transparency to public stakeholders about the way their algorithms work, neither of which they felt confident about when using commercial providers.[footnote]Interview with Arno van Rijswijk, Head of Data & Personalization, and Sarah van der Land, Digital Innovation Advisor, Nederlandse Publieke Omroep (NPO) (2021).[/footnote]

Several public service media organisations have joined forces through the EBU to develop PEACH[footnote]European Broadcasting Union. PEACH. Available at: https://peach.ebu.io/[/footnote] – a personalisation system that can be adopted by individual organisations and adapted to their needs. The aim is to share technical expertise and capacity across the public service media ecosystem, enabling those without their own in-house development teams to still adopt recommendation systems and other data-driven approaches. Although some public service media feel this is still not sufficiently tailored to their work,[footnote]Interview with Arno van Rijswijk, Head of Data & Personalization, and Sarah van der Land, Digital Innovation Advisor, Nederlandse Publieke Omroep (NPO) (2021).[/footnote] others find it not only caters to their needs but that it embodies their public service mission through its collaborative approach.[footnote]Interview with Matthias Thar, Bayerische Rundfunk (2021).[/footnote]

Although we are aware that some public service media continue to use third-party systems, we did not manage to secure research interviews with any organisations that currently do so.

How are public service media recommendation systems currently governed and overseen?

The governance of recommendation systems in public service media is created through a combination of data protection legislation, media regulation and internal guidelines. In this section, we outline the present and future regulatory environment in the UK and EU, and how internal guidelines influence development in the BBC and other public service media. Some public service media have reinterpreted their existing guidelines for operationalising public service values to make them relevant to the use of recommendation systems.

The use of recommendation systems in public service media is not governed by any single piece of legislation or governance framework. Oversight is generated through a combination of the statutory governance of public service media, general data protection legislation, and internal frameworks and mechanisms. This complex and fragmented picture makes it difficult to assess the effectiveness of current governance arrangements.

External regulation

The structures that have been established to regulate public service media are based around analogue broadcast technologies. Many are ill-equipped to provide oversight of public service media’s digital platforms in general, let alone to specifically oversee the use of recommendation systems.

For instance, although Ofcom regulates all UK broadcasters, including the particular duties of public service media, its remit only covers the BBC’s online platforms and not, for example, the ITV Hub or All 4. Its approach to the oversight of BBC iPlayer is to set broad obligations rather than specific requirements, and it does not inspect the use of recommendation systems. Both the incentives and sanctions available to Ofcom are based around access to the broadcasting spectrum and so are not relevant to the digital dissemination of content. In practice this means that the use of recommendation systems within public service media is not subject to scrutiny by the communications regulator.

However, like all other organisations that process data, public service media within the European Union are required to comply with the General Data Protection Regulation (GDPR). The UK adopted this legislation before leaving the EU, though  a draft Data Protection and Digital Information Bill (‘Data Reform Bill’) introduced in July 2022 includes a number of important changes, including removing the prohibition on automated decision-making, and maintaining restrictions for automated decision-making only if special categories of data are involved. The draft bill also introduces a new ground to allow the processing of special categories of data for the purpose of monitoring and correcting algorithmic bias in AI systems. A separate set of provisions centred around fairness and explainability for AI systems is also expected as part of the Government’s upcoming white paper on AI governance.

The UK GDPR shapes the development and implementation of recommendation systems because it requires:

  • Consent: where consent is the lawful basis for processing, the UK GDPR requires that it is freely given, genuine and unambiguous. There are other lawful bases for processing personal data that do not require consent, including legal obligations, processing in a vital interest and processing for a ‘legitimate interest’ (a justification that public authorities cannot rely on if they are processing for their tasks as a public authority).
  • Data minimisation: under Article 5(1), the ‘data minimisation’ principle of the UK GDPR states that personal data should be ‘adequate, relevant and limited to what is necessary in relation to the purposes for which they are processed’. Under Article 17 of the UK GDPR, the ‘right to erasure’ grants individuals the right to have personal data erased that is not necessary for the purposes of processing.
  • Automated decision-making, the right to be informed and explainability:  under the UK GDPR, data subjects have a right not to be subject to solely automated decisions that do not involve human intervention, such as profiling.[footnote]The Article 29 Working Group defines profiling in this instance as ‘automated processing of data to analyze or to make predictions about individuals’.[/footnote] Where such automated decision-making occurs, meaningful information about the logic involved, the significance and the envisaged consequences of such processing need to be provided to the data subject (Article 15 (1) h). Separate guidance from the Information Commissioner’s Office also touches on making AI systems explainable for users.[footnote]Information Commissioner’s Office and The Alan Turing Institute. (2021). Explaining decisions made with AI. Available at: https://ico.org.uk/for-organisations/guide-to-data-protection/key-dp-themes/explaining-decisions-made-with-artificial-intelligence/[/footnote]
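These requirements translate into concrete engineering constraints for a recommendation service. The sketch below is a hypothetical illustration, assuming invented field names and a simplified consent model, of how consent checks, data minimisation and erasure requests might be reflected in the code that stores user interactions; it is not drawn from any broadcaster’s actual system.

```python
# Hypothetical sketch of GDPR-motivated safeguards around interaction data.
# Field names and the consent model are assumptions, not any real implementation.
from dataclasses import dataclass, field

@dataclass
class InteractionStore:
    consented_users: set = field(default_factory=set)
    interactions: dict = field(default_factory=dict)     # user_id -> list of minimal events

    def record(self, user_id: str, raw_event: dict):
        """Store an interaction only if there is a recorded lawful basis, keeping minimal fields."""
        if user_id not in self.consented_users:
            return                                        # no consent recorded: store nothing
        minimal = {key: raw_event[key]                    # data minimisation: keep only the
                   for key in ("item_id", "timestamp")    # fields the recommender actually needs
                   if key in raw_event}
        self.interactions.setdefault(user_id, []).append(minimal)

    def erase(self, user_id: str):
        """Right to erasure: remove personal data that is no longer necessary."""
        self.interactions.pop(user_id, None)
        self.consented_users.discard(user_id)
```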

Our interviews with practitioners indicated that GDPR compliance is foundational to their approach to recommendation systems, and that careful consideration must be paid to how personal data is collected and used. While the forthcoming Data Reform Bill makes several changes to the UK GDPR, most of these requirements, and their effects on the development and implementation of recommendation systems, are likely to continue under the bill’s current language.

GDPR regulates the use of data that a recommendation system draws on, but there is not currently any legislation that specifically regulates the ways in which recommendation systems are designed to operate on that data, although there are a number of proposals in train at national and European levels.

In July 2022, the European Parliament adopted the Digital Services Act, which includes (in Article 24a) an obligation for all online platforms to explain, in their terms and conditions, the main parameters of their recommendation system and the options for users to modify or influence those parameters. There are additional requirements imposed on very large online platforms (VLOPs) to provide at least one option for each of their recommendation systems which is not based on profiling (Article 29). There are also further obligations for VLOPs in Article 26 to perform systemic risk assessments, including taking into account the design of the recommendation systems (Article 26 (2) a) and to implement steps to mitigate risk by testing and adapting their recommendation systems (Article 27 (1) ca).

In order to ensure compliance with the transparency provisions in the regulation, the Digital Services Act includes a provision that enables independent auditors and vetted researchers to have access to the data that led to the company’s risk assessment conclusions and mitigation decisions (Article 31). This provision ensures oversight over the self-assessment (and over the independent audit) that companies are required to carry out, as well as scrutiny over the choices large companies make around their recommendation systems.

The draft AI Act proposed by the European Commission in 2021 also includes recommendation systems in its remit. The proposed rules require harm mitigations such as risk registers, data governance and human oversight but only make obligations mandatory for AI systems used in ‘high-risk’ applications. Public service media are not mentioned within this category, although due to their democratic significance it’s possible they might come into consideration. Outside the high-risk categories, voluntary adoption is encouraged. These proposals are still at an early stage of development and negotiation and are unlikely to be adopted until at least 2023.

In another move, in January 2022 the European Commission launched a public consultation on a proposed European Media Freedom Act that aims to further increase the ‘transparency, independence and accountability of actions affecting media markets, freedom and pluralism within the EU’. The initiative is a response to populist governments, particularly in Poland and Hungary, attempting to control media outlets, as well as an attempt to bring media regulation up to speed with digital technologies. The proposals aim to secure ‘conditions for [media markets’] healthy functioning (e.g. exposure of the public to a plurality of views, media innovation in the EU market)’. Though there is little detail so far, this framing could allow for the regulation of recommendation systems within media organisations.

In the UK, public service media are excluded from the draft Online Safety Bill, which imposes responsibilities on platforms to safeguard users from harm. Ofcom, as well as the Digital, Culture, Media and Sport Select Committee, has called for urgent reform to regulation that would update the governance of public service media for the digital age. As of this report, there has been no sign of progress on a proposed Media Bill that would provide this guidance.

Internal oversight

Public service media have well-established practices for operationalising their mission and values through the editorial guidelines described earlier. But the introduction of recommendation systems has led many of them to reappraise these and, in some cases, introduce additional frameworks to translate these values for the new context.

The BBC has brought together teams from across the organisation to discuss and develop a set of machine learning engine principles, which they believe will uphold the Corporation’s mission and values:[footnote]Macgregor, M. (2021). Responsible AI at the BBC: Our Machine Learning Engine Principles. BBC Research and Development. Available at: https://www.bbc.co.uk/rd/publications/responsible-ai-at-the-bbc-our-machine-learning-engine-principles[/footnote]

  • Reflecting the BBC’s values of trust, diversity, quality, value for money and creativity
  • Using machine learning to improve our audience’s experience of the BBC
  • Carrying out regular review, ensuring data is handled securely and that algorithms serve our audiences equally and fairly
  • Incorporating the BBC’s editorial values and seeking to broaden, rather than narrow, horizons
  • Continued innovation and human-in-the-loop oversight.

These have then been adopted into a checklist for teams to use in practice:

‘The MLEP [Machine Learning Engine Principles] Checklist sections are designed to correspond to each stage of developing a ML project, and contain prompts which are specific and actionable. Not every question in the checklist will be relevant to every project, and teams can answer in as much detail as they think appropriate. We ask teams to agree and keep a record of the final checklist; this self-audit approach is intended to empower practitioners, prompting reflection and appropriate action.’[footnote]Macgregor, M. (2021).[/footnote]

Reflecting on putting this into practice, BBC staff members observed that ‘the MLEP approach is having real impact in bringing on board stakeholders from across the organisation, helping teams anticipate and tackle issues around transparency, diversity, and privacy in ML systems early in the development cycle’.[footnote]Boididou, C., Sheng, D., Moss, M. and Piscopo, A. (2021), ‘Building Public Service Recommenders: Logbook of a Journey’. RecSys ’21: Proceedings of the 15th ACM Conference on Recommender Systems, pp. 538–540. Available at: https://doi.org/10.1145/3460231.3474614[/footnote]

Other public service media organisations have developed similar frameworks. Bayerische Rundfunk, the public broadcaster for Bavaria in Germany, found that their existing values needed to be translated into practical guidelines for working with algorithmic systems and developed ten core principles.[footnote]Bedford-Strohm, J., Köppen, U. and Schneider, C. (2020). ‘Our AI Ethics Guidelines’. Bayerisch Rundfunk. https://www.br.de/extra/ai-automation-lab-english/ai-ethics100.html[/footnote] These align in many ways to the BBC principles but have additional elements, including a commitment to transparency and discourse, ‘strengthening open debate on the future role of public service media in a data society’, support for the regional innovation economy, engagement in collaboration and building diverse and skilled teams.[footnote]Bedford-Strohm, J., Köppen, U. and Schneider, C. (2020).[/footnote]

In the Netherlands, public service broadcaster NPO along with commercial media groups and the Netherlands Institute for Sound and Vision drew up a declaration of intent.[footnote]Media perspectives. (2021). ‘Intentieverklaring voor verantwoord gebruik van KI in de media. [Letter of intent for responsible use of AI in the media]’. Available at: https://mediaperspectives.nl/intentieverklaring/[/footnote] Drawing on the European Union high-level expert group principles on ethics in AI, the declaration is a commitment to the responsible use of AI in the media sector. NPO are developing this into a ‘data promise’ that offers transparency to audiences about their practices. 

Other stakeholders

Beyond these formal structures, the use of recommendation systems in public service media is shaped by these organisations’ accountability to, and scrutiny by, wider society.

All the public service media organisations we interviewed welcomed this scrutiny in principle and were committed to openness and transparency. Most publish regular blogposts, present at academic conferences and invite feedback about their work. These activities, however, reach a small and specialist audience.

There are limited opportunities for the broader public to understand and influence the use of recommendation systems. In practice, there is little accessible information about recommendation systems on most public service media platforms and even where it exists, teams admit that it is rarely read.

The Voice of the Listener and Viewer, a civil society group that represents audience interests in the UK, has raised concerns with the BBC about a lack of transparency in its approach to personalisation but has been dissatisfied with the response. The Media Reform Coalition has proposed that recommendation systems used in UK public service media should be co-designed with citizens’ media assemblies and that the underlying algorithms should be made public.[footnote]Grayson, D. (2021). Manifesto for a People’s Media. Media Reform Coalition. Available at: https://drive.google.com/file/u/1/d/1_6GeXiDR3DGh1sYjFI_hbgV9HfLWzhPi/view?usp=embed_facebook[/footnote]

Despite this low level of public engagement, public service media organisations were sensitive to external perceptions of their use of recommendation systems. Teams expected that, as public service media, they would be held to a higher standard than their commercial competitors. At the BBC in particular, staff frequently mentioned concerns about how their work might be seen by the press, the majority of which tends to take an anti-BBC stance. In practice, we have found little coverage of the BBC’s use of algorithms outside of specialist publications such as Wired.

Public service media have a dual role, both as innovators in the use of recommendation services and as scrutineers of the impacts of new technologies. The BBC believes it has a ‘critical contribution, as part of a mixed AI ecosystem, to the development of beneficial AI both technically, through the development of AI services, and editorially, by encouraging informed and balanced debate’.[footnote]BBC. (2017). Written evidence to the House of Lords Select Committee on Artificial Intelligence. Available at: https://data.parliament.uk/writtenevidence/committeeevidence.svc/evidencedocument/artificial-intelligence-committee/artificial-intelligence/written/70493.html[/footnote] At Bayerische Rundfunk, this combined responsibility has been operationalised by integrating the product team and data investigations team into an AI and Automation Lab. However, we are not aware of any instances where public service media have reported on their own products and subjected them to critical scrutiny. 

Why this matters

The history of public service media, their current challenges and the systems for their governance are the framing context in which these organisations are developing and deploying recommendation systems. As with any technology, organisations must consider how the tool can be used in ways that are consistent with their values and culture and whether it can address the problems they face.

In his inaugural speech, BBC Director-General Tim Davie identified increased personalisation as a pillar of addressing the future role of public service media in a digital world:[footnote]BBC Media Centre. (2020). Tim Davie’s introductory speech as BBC Director-General. Available at: https://www.bbc.co.uk/mediacentre/speeches/2020/tim-davie-intro-speech[/footnote]

‘We will need to be cutting edge in our use of technology to join up the BBC, improving search, recommendations and access. And we must use the data we hold to create a closer relationship with those we serve. All this will drive love for the BBC as a whole and help make us an indispensable part of everyday life. And create a customer experience that delivers maximum value.’

But recommendation systems also crystallise the current existential dilemmas of public service media. The development of a technology whose aim is optimisation requires an organisation to be explicit about what and who it is optimising for. A data-driven system requires an institution to quantify those objectives and evaluate whether or not the tool is helping to achieve them.

This can seem relatively straightforward when setting up a recommendation system for e-commerce, for example, where the goal is to sell more units. Other media organisations may also have clear metrics around time spent on a platform, advertising revenues or subscription renewals.

In this instance, the broadly framed public service values that have proven flexible to changing contexts in the past are a hindrance rather than a help. A concept like ‘diversity’ is hard to pin down and feed into a system.[footnote]Hildén, J. (2021). ‘The Public Service Approach to Recommender Systems: Filtering to Cultivate’. Television & New Media, 23(7). Available at: https://doi.org/10.1177/15274764211020106[/footnote] Organisations that are supposed to serve the public as both citizens and consumers must decide which role gets more weight.

Recommendation systems might offer an apparently obvious solution to the problem of falling public service media audience share – if you are able to better match the vast amount of content in public service media catalogues to listeners and viewers, you should be able to hold and grow your audience. But is universality achieved if you reach more people but they don’t share a common experience of a service? And how do you measure diversity and ensure personalised recommendations still offer a balance of content?

‘The introduction of algorithmic systems will force [public service media] to express its values and goals as measurable key performance indicators, which could be useful and perhaps even necessary. But this could also create existential threats to the institution by undermining the core principles and values that are essential for legitimacy.’[footnote]Sørensen, J.K. and Hutchinson, J. (2018). ‘Algorithms and Public Service Media’. Public Service Media in the Networked Society: RIPE@2017, pp.91–106. Available at: http://www.nordicom.gu.se/sites/default/files/publikationer-hela-pdf/public_service_media_in_the_networked_society_ripe_2017.pdf[/footnote]

Recommendation systems force product teams within public service media organisations to settle on an interpretation of public service values, at a time when the regulatory, social and political context makes them particularly unclear.

It also means that this interpretation will be both instantiated and then systematised in a way that has never previously occurred. As we saw with the example of the impartiality guidelines of the BBC, individuals and teams have historically made decisions under a broad governance framework and founded on editorial judgement. Inconsistencies in those judgements could be ironed out through the multiplicity of individual decisions, the diversity of contexts and the number of different decision-makers. Questions of balance could be considered over a wider period of time and breadth of output. Evolving societal norms could be adopted as audience expectations change.

However, building a decision-making system sets a standardised response to a set of questions and repeats it every time. In this way it nails an organisation’s colours to one particular mast and then applies that approach at scale.

Stated goals and potential risks of using recommendation systems in public service media

Organisations deploy recommendation systems to address certain objectives. However, these systems also bring potential risks. In this chapter, we look at what public service media aim to achieve through deploying recommendation systems and the potential drawbacks.

Stated goals of recommendation systems

In this section, we look at the stated objectives for the use of recommendation systems and the degree to which public service media reference those objectives and motivations when justifying their own use of recommendation systems.

Recommendation systems bring several benefits to different actors, including users who access the recommendations (in the case of public service media, audiences), as well as the organisations and businesses that maintain the platforms on which recommendation systems operate. Some of the effects of recommendation systems are also of broader societal interest, especially where the recommendations interact with large numbers of users, with the potential to influence their behaviour. Because they serve the interests of multiple stakeholders,[footnote]Milano, S., Taddeo, M. and Floridi, L. (2021). ‘Ethical aspects of multi-stakeholder recommendation systems’. The Information Society, 37(1). Available at: https://doi.org/10.1080/01972243.2020.1832636; Abdollahpouri, H., Adomavicius, G., Burke, R., et al. (2020). ‘Multistakeholder recommendation: Survey and research directions’. User Modeling and User-Adapted Interaction, pp.127–158. Available at: https://doi.org/10.1007/s11257-019-09256-1[/footnote] recommendation systems support data-based value creation in multiple ways, which can pull in different directions.[footnote]Tempini, N. (2017). ‘Till data do us part: Understanding data-based value creation in data-intensive infrastructures’. Information and Organization, 27(4). Available at: http://dx.doi.org/10.1016/j.infoandorg.2017.08.001 [/footnote]

Four key areas of value creation are:

  1. Reducing information overload for the receivers of recommendations: It would be overwhelming for individuals to trawl the entire catalogue of Netflix or Spotify, for example. Their recommendation systems reduce the amount of content to a manageable number of choices for the audience. This creates value for users.
  2. Improved discoverability of items: E-commerce sites can recommend items they are particularly keen to sell, or direct people to niche products for which there is a specific customer base. This creates value for businesses and other actors that provide the items in the recommender’s catalogue. It can also be a source of societal value, for example where improved discoverability increases the diversity of news items that are accessed by the audience.
  3. Attention capture: Targeted recommendations which cater to users’ preferences encourage people to spend more time on services. This is a source of economic value for platform providers, who monetise attention through advertising revenue or paid subscriptions. But it can also be a source of societal value, if it means that people pay more attention to content that has public service value, in line with the mandate for universality.
  4. Data gathering to derive business insights and analysis: For example, platforms gain valuable insights into their audience through A/B testing which enables them to plan marketing campaigns or commission content. This is a source of economic value, when it is used to derive business insights. But under appropriate conditions, it could be a source of societal value, for example by enabling socially responsible scientific research (see our recommendations below).

We explored how these objectives map to the motivations articulated by public service media organisations for their use of recommendation systems.

1. Reducing information overload

‘Under conditions of information abundance and attention scarcity, the modern challenges to the realisation of media diversity as a policy goal lie less and less in guaranteeing a diversity of supply and more in the quest to create the conditions under which users can actually find and choose between diverse content.’[footnote]Helberger, N., Karppinen, K. and D’Acunto, L. (2018). ‘Exposure diversity as a design principle for recommender systems’. Information, Communication & Society, 21(2). Available at: https://doi.org/10.1080/1369118X.2016.1271900[/footnote]

We heard from David Graus: ‘So finding different ways to enable users to find content is core there. And in that context, I think recommender systems really serve to be able to surface content that users may not have found otherwise, or may surface content that users may not know they’re interested in.’

2. Improved discoverability

Public service media also deploy recommendation systems with the objective of showcasing much more of their vast libraries of content. BBC Sounds, for example, has more than 200,000 items available, of which only a tiny fraction can be surfaced either through broadcast schedules or an editorially curated platform. Recommendation systems can potentially unlock the long tail of rarely viewed content and allow individuals’ specific interests to be met.

They can also, in the view of some organisations, meet the public service obligation of diversity by exposing audiences to a greater variety of content.[footnote]Interview with David Graus, Lead Data Scientist, Randstad Groep Nederland (2021). This point was also captured in separate studies of public service media organisations – see: Hildén, J. (2021). ‘The Public Service Approach to Recommender Systems: Filtering to Cultivate’. Television & New Media, 23(7). Available at: https://doi.org/10.1177/15274764211020106[/footnote] Recommendation systems need not simply cater to, or replicate people’s existing interests but can actively push new and surprising content.

This approach is also deployed in commercial settings, notably in Spotify’s ‘Discover’ playlists, as novelty is also required for audience retention. Additionally, some public service media organisations, such as Swedish Radio and NPO, are experimenting with approaches that promote content they consider particularly high in public value.

Traditional broadcasting provides one-to-many communication. Through personalisation, platforms have created a new model of many-to-many communication, creating ‘fragmented user needs’.[footnote]Interview with Uli Köppen, Head of AI + Automation Lab, Co-Lead BR Data, Bayerische Rundfunk (2021).[/footnote] Public service media must now grapple with how they create their own way of engaging in this landscape. The BBC’s ambition for the iPlayer is to make output, ‘accessible to the audience wherever they are, whatever devices they are using, finding them at the right moments with the right content’.[footnote]BBC. (2021). BBC Annual Plan 2021-22. Available at: http://downloads.bbc.co.uk/aboutthebbc/reports/annualplan/annual-plan-2021-22.pdf[/footnote]

Jonas Schlatterbeck, ARD (German public broadcaster), takes a similar view:

‘We can’t actually serve majorities anymore with one content. It’s not like the one Saturday night show that will attract like half of the German population […] but more like tiny mosaic pieces of different content that are always available to pretty much everyone but that are actually more targeted.’[footnote]Interview with Jonas Schlatterbeck, Head of Content ARD Online & Leiter Programmplanung, ARD (2021).[/footnote]

3. Attention capture

The need to maintain audience reach in a fiercely competitive digital landscape was mentioned by almost every public service media organisation we spoke to.

Universality, the obligation to reach every section of society, is central to the public service remit.

And if public service media lose their audience to their digital competitors, they cannot deliver the other societal benefits within their mission. As Koen Muylaert of Belgian VRT said: ‘we want to inspire people, but we also know that you can only inspire people if they intensively use your products, so our goal is to increase the activity on our platform as well. Because we have to fight for market share’.[footnote]Interview with Koen Muylaert, Project Lead, VRT data platform and data science initiative, Vlaamse Radio- en Televisieomroeporganisatie (VRT) (2021).[/footnote]

The assumption among most public service media organisations is that recommendation systems improve engagement, although there is still little conclusive evidence of this in academic literature. The BBC has specific targets for 16-34 year-olds to use the iPlayer and BBC Sounds, and staff consider recommendations as a route to achieving those metrics.[footnote]BBC. (2021). BBC Annual Plan 2021-22. Available at: http://downloads.bbc.co.uk/aboutthebbc/reports/annualplan/annual-plan-2021-22.pdf[/footnote]

From our interview with David Caswell, Executive Product Manager, BBC News Labs:

‘We have seen that finding in our research on several occasions: there’s sort of some transition that audiences, and particularly younger audiences, have gone through where there’s an expectation of personalization. They don’t expect to be doing the same thing again and again and again, and in terms of active searching for things they expect a personalized experience… There isn’t a lot of tolerance, increasingly with younger and digitally native audiences, for friction in the experience. And so personalization is a major technique for removing friction from the experience, because audience members don’t have to do all the work of discovery and selection and so on, they can have that done for them.’[footnote]Interview with David Caswell, Executive Product Manager, BBC News Labs (2021).[/footnote]

Across the teams we interviewed from European public service media organisations there was widespread consensus that audiences now expect content to be personalised. Netflix and Spotify’s use of recommendation systems was described as a ‘gold standard’ for public service media organisations to aspire to. But few of our interviewees offered evidence to support this view of audience expectations.

‘I see the risk that when we are compared with some of our competitors that are dabbling with a much more sophisticated personalisation, there is a big risk of our services being perceived as not adaptable and not relevant enough.’[footnote]Interview with Olle Zachrison, Deputy News Commissioner & Head of Digital News Strategy, Swedish Radio (2021).[/footnote]

4. Data gathering and behavioural interventions

Recommendation systems collect and analyse a wealth of data in order to serve personalised recommendations to their users. The data collected often pertains to user interactions with the system, including data that is produced as a result of interventions on the part of the system that are intended to influence user behaviour (interventional data).[footnote]Greene, T., Martens, D. and Shmueli, G. (2022) ‘Barriers to academic data science research in the new realm of algorithmic behaviour modification by digital platforms’. Nature Machine Intelligence, 4(4), pp. 323–330. Available at: https://doi.org/10.1038/s42256-022-00475-7[/footnote] For example, user data collected by a recommendation system may include data about how different users responded to A/B tests, so that the system developers can track the effectiveness of different designs or recommendation strategies in stimulating some desired user behaviour. 
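To make this concrete, the sketch below shows one way interventional data from an A/B test of two recommendation strategies might be logged and compared. It is a minimal, hypothetical illustration: the variant names, event fields and click-through comparison are assumptions for the example, not a description of any platform discussed in this report.

```python
import random
from collections import defaultdict

# Hypothetical event log for an A/B test of two recommendation strategies.
# Each record captures which variant intervened and how the user responded.
event_log = []

def log_exposure(user_id: str, variant: str, item_id: str, clicked: bool) -> None:
    """Record an interventional data point: the system's action and the user's response."""
    event_log.append({"user": user_id, "variant": variant, "item": item_id, "clicked": clicked})

def conversion_by_variant(events: list[dict]) -> dict[str, float]:
    """Compare how effective each recommendation strategy was at prompting clicks."""
    shown, clicked = defaultdict(int), defaultdict(int)
    for e in events:
        shown[e["variant"]] += 1
        clicked[e["variant"]] += int(e["clicked"])
    return {v: clicked[v] / shown[v] for v in shown}

# Simulated usage: users are assigned at random to variant 'A' or 'B'.
for i in range(1000):
    variant = random.choice(["A", "B"])
    log_exposure(f"user{i}", variant, item_id="item42",
                 clicked=random.random() < (0.10 if variant == "A" else 0.12))

print(conversion_by_variant(event_log))
```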

Interventional data can thus be used to support targeted behavioural interventions, as well as scientific research into the mechanisms that underpin the effectiveness of recommendations. This marks recommendation systems as a key instrument of what Shoshana Zuboff has called a system of ‘surveillance capitalism’.[footnote]Zuboff, S. (2015). ‘Big other: Surveillance Capitalism and the Prospects of an Information Civilization’. Journal of Information Technology, 30(1). Available at: https://doi.org/10.1057/jit.2015.5[/footnote] In this system, platforms extract economic value from personal data, usually in the form of advertising revenue or subscriptions, at the expense of the individual autonomy afforded to individual users of the technology.

As access to the services provided by the platforms becomes essential to daily life, users increasingly find themselves tracked in all aspects of their online experience, without meaningful options to avoid it. The possibility of surveillance constitutes a grave risk associated with the use of recommendation systems.

Because recommendation systems have been mainly researched and developed in commercial settings, many of the techniques and types of data collected work within this logic of surveillance.[footnote]van Dijck, J. (2014). ‘Datafication, dataism and dataveillance: Big Data between scientific paradigm and ideology’. Surveillance & Society, 12(2). Available at: https://doi.org/10.24908/ss.v12i2.4776; Srnicek, N. (2017). Platform capitalism. Polity.[/footnote] However, it is also possible to envisage uses of recommendation systems that do not obey the same logic.[footnote]Lane, J. (2020). Democratizing Our Data: A Manifesto. MIT Press.[/footnote] Recommendation systems used by public service media are a case in point. Public service media organisations are in a position to decide which data to collect and use in the service of creating public value, scientific value and individual value for their audiences, instead of economic value that would be captured by shareholders.[footnote]Tempini, N. (2017). ‘Till data do us part: Understanding data-based value creation in data-intensive infrastructures’. Information and Organization, 27(4). Available at: http://dx.doi.org/10.1016/j.infoandorg.2017.08.001[/footnote]

Examples of public value that could be created from user data include insights into effective and impartial communication that serves the public interest and fosters community building. Social science research into the effectiveness of behavioural interventions, and basic research into the psychological mechanisms that underpin audiences’ trust in recommendations, would contribute to the creation of scientific value from behavioural data. From the perspective of the audience, value could be created by empowering users to learn more about their own interests and develop their tastes, helping them feel more in control and understand the value of the content they can access.

We found little evidence of public service media deploying recommendation systems with the explicit aim of capturing data on their audiences and content or deriving greater insights. On the contrary, interviewees stressed the importance of data minimisation and privacy. At Bayerische Rundfunk, for example, a product owner said that the collection of demographic data on the audience was a red line that they would not cross.[footnote]Interview with Matthias Thar, Bayerische Rundfunk (2021).[/footnote]

However, we did find that most public service media organisations introduced recommendation systems as part of a wider deployment of automated and data-driven approaches. In many cases, these are accompanied by significant organisational restructures to create new ways of working adapted to the technologies, as well as to respond to the budget cuts that almost all public service media are facing.

Public service media organisations are often fragmented, with teams separated by region and subject matter and with different systems for different channels and media that have evolved over time. The use of recommendation systems requires a consistent set of information about each item of content (commonly known as metadata). As a result, some public service media have started to better connect different services so that recommendation systems can draw on them.

For instance, Swedish Radio has overhauled its entire news output to improve its digital service, creating standalone items of content that do not need to be slotted into a particular programme or schedule but can be presented in a variety of contexts. Alongside this, it has introduced a scoring system to rank its content against its own public values, prompting a rearticulation of those values as well as a renewed emphasis on their importance.

Bayerische Rundfunk (BR) is creating a new infrastructure for the consistent use of data as a foundation for the future use of recommendation systems. This already allows news stories to be automatically populated with data specific to different localities, and enables automated text generation for data-heavy stories such as sports results. This allows BR to cover a broader range of sports and cater to more specialist interests, as well as freeing up editorial teams from mundane tasks.

While there is not a direct objective of behavioural intervention and data capture at present, the introduction of recommendation systems is part of a wider orientation towards data-driven practices across public service media organisations. This has the potential to enable wider data collection and analysis to generate business insights in the future.

Conclusion

We find that, in their motivations for deploying recommendation systems, public service media organisations articulate similar objectives to the wider field, although unlike commercial actors they do not currently use recommendations with the explicit aim of data capture and behavioural intervention. In some respects they reframe these established motivations to align with their public service mission and values.

Many staff across public service media organisations display a belief that because the organisation is motivated by public service values, and produces content that adheres to those values, the use of recommendation systems to filter that content is a furtherance of their mission.

This has meant that staff at public service media organisations have not always critically examined whether the recommendation system itself is operating in accordance with public service values.

However, public service media organisations have begun to put in place principles and governance mechanisms to encourage staff to explicitly and systematically consider how the development of their systems furthers their public service values. For example, the BBC published its Machine Learning Engine Principles in 2019 and has since continued to iterate on a checklist for project teams to put those principles into practice.[footnote]Macgregor, M. (2021). Responsible AI at the BBC: Our Machine Learning Engine Principles. BBC Research and Development. Available at: https://www.bbc.co.uk/rd/publications/responsible-ai-at-the-bbc-our-machine-learning-engine-principles[/footnote]

Public service media organisations are also in the early stages of developing new metrics and methods to measure the public service value of the outputs of the recommendation systems, both with explicit measures of ‘public service value’ and implicitly through evaluation by editorial staff. We explore these more in our chapter on evaluation and in our case studies on the BBC’s use of recommendation systems.

Additionally, we found that alongside these stated motivations, public service media interviewees had internalised a set of normative values around recommendation systems. When asked to define what a recommendation system is in their own terms, they spoke of systems helping users to find ‘relevant’, ‘useful’, ‘suitable’, ‘valuable’ or ‘good’ content.[footnote]This is not unique to the BBC, and many academic papers and industry publications also reflect a similar implicit normative framework in their definitions of recommendation systems.[/footnote]

This framing around user benefit obscures the fact that the systems are ultimately deployed to achieve organisations’ goals, and so if they are ‘relevant’ or ‘useful’ this is because that helps achieve the organisations’ goals, not because of an inherent property of the system.[footnote]The organisations’ goals are not necessarily in tension with those of users, e.g. helping audiences find more relevant content might help audiences get better value for money (which is a goal of many public service media organisations), but that is still a goal which shapes how the recommendation system is developed, rather than a necessary feature of the system.[/footnote] It also adopts the vocabulary of commercial recommendation systems (e.g. targeted advertising options encourage users to opt for more ‘relevant’ adverts), which the Competition and Markets Authority has identified as problematic. This indicates that public service media are essentially adopting the paradigm established by the use of commercial recommendation systems.

Potential risks from recommendation systems

In this section, we explore some of the ethical risks associated with the use of recommendation systems and how they might manifest in uses by public service media.

A review of the literature on recommendation systems helps identify some of the potential ethical and societal risks that have been raised in relation to their use beyond the specific context of public service media. Milano et al highlight six areas of concern for recommendation systems in general:[footnote]Milano, S., Taddeo, M. and Floridi, L. (2020). ‘Recommender systems and their ethical challenges’. AI & Society, 35, pp.957–967. Available at: https://doi.org/10.1007/s00146-020-00950-y[/footnote]

  1. Privacy risks to users of a recommendation system: including direct risks from non-compliance with existing privacy regulations and/or malicious use of personal data, and indirect risks resulting from data leaks, deanonymisation of public datasets or unwanted exposure of inferred sensitive characteristics to third parties.
  2. Problematic or inappropriate content could be recommended and amplified by a recommendation system.
  3. Opacity in the operation of a recommendation system could lead to limited accountability and lower the trustworthiness of the recommendations.
  4. Autonomy: recommendations could limit users’ autonomy by manipulating their beliefs or values, and by unduly restricting the range of meaningful options that are available to them.
  5. Fairness constitutes a challenge for any algorithmic system that operates using human-generated data and is therefore liable to (re)produce social biases. Recommendation systems are no exception, and can exhibit unfair biases affecting a variety of stakeholders whose interests are tied to recommendations.
  6. Social externalities such as polarisation, the formation of echo chambers, and epistemic fragmentation, can result from the operation of recommendation systems that optimise for poorly defined objectives.

How these risks are viewed and addressed by public service media

In this section, we examine the extent to which ethical risks of recommendation systems, identified in the literature, are present in the development and use of recommendation systems in practice by public service media.

1. Privacy

The data gathering and operation of recommendation systems can pose direct and indirect privacy risks. Direct privacy risks come from how personal data is handled by the platform, as its collection, usage and storage need to follow procedures that ensure prior consent from individual users. In the context of EU law, these stages are covered by the General Data Protection Regulation (GDPR).

Indirect privacy risks arise when recommendation systems expose sensitive user data unintentionally. For instance, indirect privacy risks may come about as a result of unauthorised data breaches, or when a system reveals sensitive inferred characteristics about a user (e.g. targeted advertising for baby products could indicate a user is pregnant).

Privacy relates to a number of public service values: independence (act in the interest of audiences), excellence (high standards of integrity) and accountability (good governance).

Privacy was raised as a potential risk by every interviewee from a public service organisation. Specifically, public service media were concerned about users’ consent to the use of their data, emphasising data security as a key concern for the responsible collection and use of user data.[footnote]Interview with Jonas Schlatterbeck, Head of Content ARD Online & Leiter Programmplanung, ARD (2021).[/footnote] Several interviewees stressed that public service media organisations do not generally require mandatory sign-in for certain key products, such as news. Other services, focusing more on entertainment, such as BBC iPlayer, do require sign-in, but the amount of personal data collected is limited.

Sébastien Noir, Head of Software, Technology and Innovation at the European Broadcasting Union, emphasised how the need to comply with privacy regulations in practice means that projects have to jump through several hoops with legal teams before trials with user data are allowed. While this uses up time and resources in project development, it also means that robust measures are in place to protect users from direct threats to privacy. Koen Muylaert of Belgian VRT also spoke to us about a distinction between personal data, which poses privacy risks, and behavioural data, which may be safer to use for public service media recommendation systems and which they actively monitor.[footnote]Interview with Koen Muylaert, Project Lead, VRT data platform and data science initiative, Vlaamse Radio- en Televisieomroeporganisatie (VRT) (2021).[/footnote]

None of the organisations that we interviewed spoke to us about indirect threats to privacy or ways to mitigate them.

2. Problematic or inappropriate content

Open recommendation systems on commercial platforms that host limitless, user-generated content have a high risk of recommending low-quality or harmful content. This risk is lower for public service media that deploy closed recommendation systems to filter their own catalogue of content, which has already been extensively scrutinised for quality and adherence to editorial guidelines. Nonetheless, some risk may still exist for closed recommendation systems, such as the risk of recommending age-inappropriate content to younger users.

The risk of inappropriate content relates to the public service media values of excellence (high standards of integrity, professionalism and quality) and independence (completely impartial and independent from political, commercial and other influences and ideologies).

In interviews, many members of public service media staff were generally confident that recommendations would be of high quality and represent public service values because the content pool had already passed that test. Nonetheless, some staff identified a risk that the system could surface inappropriate content: for example, archive items that include sexist or racist language that is no longer acceptable, or a juxtaposition of items that could be jarring.

However, a more commonly identified potential risk arises in connection with independence and impartiality. Many of the interviewees we spoke to mentioned that the algorithms used to generate user recommendations needed to be impartial. The BBC and other public service media organisations have traditionally operated a policy of ‘balance over time and output’, meaning a range of views on a subject, or party political voices, will be heard over a given period of programming on a specific channel. However, recommendation systems disrupt this. The audience is no longer exposed to a range of content broadcast through scheduled channels. Instead, individuals are served up specific items of content without the balancing context of other programming. In this way they may only encounter one side of an argument.

Therefore, some interviewees expressed that fine-tuning balanced recommendations is especially important in this context. This is an area where the close integration of editorial and technical teams was seen to be essential.

3. Opacity of the recommendation

Like many other algorithmic systems, recommendation systems often operate as black boxes whose internal workings are sometimes difficult to interpret, even for their developers. The process by which a recommendation is generated is often not transparent to individual users or other parties that interact with a recommendation system. This can have negative effects, by limiting the accountability of the system itself and diminishing the trust that audiences put in the good operation of the service.

Opacity is a challenge to the public service media values of independence (autonomous in all aspects of the remit) and accountability (be transparent and subject to constant public scrutiny). The issue of opacity and the risks that it raises were touched upon in several of our interviews.

The need to exert more control over the data and algorithms used to build recommendation systems was among the BBC’s motivations for bringing their development in-house. A similar desire for control is evident among other public service media in Europe: while most European broadcasters have not brought the development of recommendation systems in-house, many of them now rely on PEACH, a recommendation system developed collaboratively by several public service media organisations under the umbrella of the European Broadcasting Union (EBU).

Previously, the BBC, as well as other public service media, had relied on external commercial contractors to build the recommendation systems they used. This, however, meant that they could exert little control over the data and algorithms used, which represented a risk. In the words of Sébastien Noir, Head of Software, Technology and Innovation at the EBU:

‘As a broadcaster, you are defined by what you promote to the people, that’s your editorial line. This is, in a way, also your brand or your user experience. If you delegate that to a third party company, […] then you have a problem, because you have given your very identity, the way you are perceived by the people to a third party company […] No black box should be your editorial line.’[footnote]Interview with Sébastien Noir, Head of Software, Technology and Innovation, and Dmytro Petruk, Developer, European Broadcasting Union (2021).[/footnote]

But bringing the development of recommendation systems in-house does not solve all the issues connected with the opacity of these systems. Jannick Sørensen, Associate Professor in Digital Media at Aalborg University, summarised the concern:

‘I think the problem of the accountability, first within the public service institution, is that editors, they have no real chance to understand what data scientists are doing. And data scientists, neither they do. […] And so the dilemma here is that it requires a lot of specialised knowledge to understand what is going on inside this process of computing recommendation[s]. Right. And, I mean, with Machine Learning, it’s become literally impossible to follow.’[footnote]Interview with Jannick Kirk Sørensen, Associate Professor in Digital Media, Aalborg University (2021).[/footnote]

Sørensen highlighted how the issue of opacity arises both internally and externally for public service media.

Internally, the opacity of the systems used to produce recommendations hinders collaboration between editorial and technical staff. Some public service media organisations, such as Swedish Radio, have tried to tackle this issue by explicitly having both a technical and an editorial project lead, while Bayerische Rundfunk has established an interdisciplinary team with its AI and Automation Lab.[footnote]We explore these examples in more detail later in the chapter.[/footnote]

Documentation is another approach taken by public service media organisations to reduce the opacity of the system. For example, the BBC’s Machine Learning Engine Principles checklist (as of version 2.0) explicitly asks teams to document what their model does and how it was created, e.g. via a data science decision log, and to create a Plain English explanation or visualisation of the model to communicate the model’s purpose and operation.
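The sketch below illustrates, in machine-readable form, the kind of information such documentation might capture. The field names and example entries are hypothetical and are not drawn from the BBC’s actual checklist.

```python
from dataclasses import dataclass, field

@dataclass
class ModelDocumentation:
    """Illustrative record of the information a documentation checklist might ask for."""
    model_name: str
    purpose: str                    # what the model does, in plain English
    training_data: str              # where the data came from and how it was processed
    plain_english_explanation: str  # how a recommendation is produced, for non-technical readers
    decision_log: list[str] = field(default_factory=list)  # key decisions and their rationale

doc = ModelDocumentation(
    model_name="homepage-recommender-v2",
    purpose="Suggest programmes a signed-in user may want to watch next.",
    training_data="Anonymised viewing histories from the last 90 days.",
    plain_english_explanation="The system compares your recent viewing with that of similar audiences.",
)
doc.decision_log.append("2021-06-01: excluded demographic attributes from the feature set.")
```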

Externally, public service media struggle to provide effective explanations to audiences about the systems that they use. The absence of industry standards for explanation and transparency was identified as a risk. Olle Zachrison, Deputy News Commissioner & Head of Digital News Strategy, Swedish Radio, also expressed this worry:

‘One particular risk, I think, with all these kind of more automatic services, and especially with the introduction of […] AI powered services, is that the audience doesn’t understand what we’re doing. And […] I know that there’s a big discussion going on at the moment, for example, about Explainable AI. How should we explain in a better way what the services are doing? […] I think that there’s a very big need for kind of industry dialogue about setting standards here, you know.’[footnote]Interview with Olle Zachrison, Deputy News Commissioner & Head of Digital News Strategy, Swedish Radio (2021).[/footnote]

Other interviewees, however, highlighted that the use of explanations has limited efficacy in addressing the external opacity of individual recommendations, since users rarely pay attention to them. Sarah van der Land, Digital Innovation Advisor at NPO in the Netherlands, cited internally conducted consumer studies as evidence that audiences might not care about explanations:

‘Recently, we did some experiments also on data insight, into what extent our consumers want to have feedback on why they get a certain recommendation? And yeah, unfortunately, our research showed that a lot of consumers are not really interested in the why. […] Which was quite interesting for us, because we thought, yeah, of course, as a public value, we care about our consumers. We want to elaborate on why we do the things we do and why, based on which data, consumers get these recommendations. But yeah, they seem to be very little interested in that.’[footnote]Interview with Arno van Rijswijk, Head of Data & Personalization, and Sarah van der Land, Digital Innovation Advisor, Nederlandse Publieke Omroep (2021).[/footnote]

This finding indicates that pursuing this strategy has limited practical effects in improving the value of recommendations for audiences. David Graus, Lead Data Scientist, Randstad Groep Nederland, also told us that he is sceptical of the use of technical explanations, but that ‘what is more important is for people to understand what a recommender system is, and what it aims to do, and not how technically a recommendation was generated.’[footnote]Interview with David Graus, Lead Data Scientist, Randstad Groep Nederland (2021).[/footnote] This could be achieved by providing high-level explanations of the processes and data that were used to produce the recommendations, instead of technical details of limited interest to non-technical stakeholders.

4. Autonomy

Research on recommendation systems has highlighted how they could pose risks to user autonomy, by restricting people’s access to information and by potentially being used to shape preferences or emotions. Autonomy is a fundamental human value which ‘generally can be taken to refer to a person’s effective capacity for self-governance’.[footnote]Prunkl, C. (2022). ‘Human autonomy in the age of artificial intelligence’. Nature Machine Intelligence, 4, pp.99–101. Available at: doi: https://doi.org/10.1038/s42256-022-00449-9[/footnote] Writing on the concept of human autonomy in the age of AI, Prunkl distinguishes two dimensions of autonomy: one internal, relating to the authenticity of the beliefs and values of a person; and the other external, referring to the person’s ability to act, or the availability of meaningful options that enables them to express agency.

The risk to autonomy relates to the public service media value of universality (creating a public sphere, in which all citizens can form their own opinions and ideas, aiming for inclusion and social cohesion).

Public service media historically have made choices on behalf of their audiences in line with what the organisation has determined is in the public interest. In this sense audiences have limited autonomy due to public service media organisations restricting individuals’ access to information, albeit with good intentions.

The use of recommendation systems could, in one respect, be seen as increasing the autonomy of audiences. A more personalised experience, that is more tailored to the individual and their interests, could support the ‘internal’ dimension of autonomy, because it could enable a recommendation system to more accurately reflect the beliefs and values of an individual user, based on what other users of that demographic, region or age might like.

At the same time, public service media strive to ‘create a public sphere, in which all citizens can form their own opinions and ideas, aiming for inclusion and social cohesion’.[footnote]European Broadcasting Union. (2012). Empowering Society: A Declaration on the Core Values of Public Service Media, p. 4. Available at: https://www.ebu.ch/files/live/sites/ebu/files/Publications/EBU-Empowering-Society_EN.pdf[/footnote] There is a risk in using recommendation systems that public service media might filter information in such a way that they inhibit people’s autonomy to form their views independently.[footnote]Interview with David Caswell, Executive Product Manager, BBC News Labs (2021).[/footnote]

By design, recommendation systems tailor recommendations to a specific individual, often in such a way that these recommendations are not visible to other people. This means individual members of the audience may not share a common context or may be less aware of what information others have access to, a condition that Milano et al have called ‘epistemic fragmentation’.[footnote]Milano, S., Mittelstadt, B., Wachter, S. and Russell, C. (2021), ‘Epistemic fragmentation poses a threat to the governance of online targeting’. Nature Machine Intelligence. Available at: https://doi.org/10.1038/s42256-021-00358-3[/footnote] Coming to an informed opinion often requires being able to have meaningful conversations about a topic with other people. If recommendations isolate individuals from each other, then this may undermine the ability of audiences to form authentic beliefs and reason about their values. Since this ability is essential to having autonomy, epistemic fragmentation poses a risk.

Recommendations are also based on an assumption that there is such a thing as a single, legible individual for whom content can be personalised. In practice, people’s needs vary according to context and relationships. They may want different types of content at different times of day, whether they are watching videos with family or listening to the news in the car, for example. However, contextual information is difficult to factor into a recommendation, and doing so requires access to more user data, which could pose additional privacy risks. Moreover, recommendations are often delivered via a user’s account with a service that uses recommendation systems. However, some people may choose to share accounts, create a joint one or maintain multiple personal accounts to compartmentalise different aspects of their information needs and public presence.[footnote]Milano, S., Taddeo, M. and Floridi, L. (2021). ‘Ethical aspects of multi-stakeholder recommendation systems’. The Information Society, 37(1). Available at: https://doi.org/10.1080/01972243.2020.1832636[/footnote]

Finally, the use of recommendation systems by public service media can pose a risk to autonomy when the categories that are used to profile users are not accurate, not transparent or not easily accessible and modifiable by the users themselves. This concern is linked to the opacity of the system, but it was not addressed explicitly as a risk to user autonomy in our interviews.

As above, several interviewees highlighted that internal research indicates users do not want more explanations and control over the recommendation system when this comes at the cost of a frictionless experience. If so, public service media need to consider whether there is a trade-off between supporting autonomy and the ease of use of a recommendation system, and research alternative strategies to provide audiences with more meaningful opportunities to participate in the construction of their digital profiles.

5. Fairness

Researchers have documented how the use of machine learning and AI in applications ranging from credit scoring to facial recognition,[footnote]Buolamwini, J. and Gebru, T. (2018). ‘Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification’. Proceedings of the 1st Conference on Fairness, Accountability and Transparency. Conference on Fairness, Accountability and Transparency, PMLR, pp. 77–91. Available at: https://proceedings.mlr.press/v81/buolamwini18a.html[/footnote] medical triage to parole decisions,[footnote]Angwin, J., Larson, J., Mattu, S. and Kirchner, L. (2016). ‘Machine Bias’. ProPublica. Available at: https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing[/footnote] advert delivery[footnote]Sweeney, L. (2013). ‘Discrimination in online ad delivery’. arXiv. Available at: https://doi.org/10.48550/arXiv.1301.6822[/footnote] to automatic text generation[footnote]Noble, S. U. (2018). Algorithms of Oppression. New York: New York University Press; Bender, E.M., Gebru, T., McMillan-Major, A. and Shmitchell, S. (2021). ‘On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?’. FAccT ’21: Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, pp.610–623. Available at: https://doi.org/10.1145/3442188.3445922[/footnote] and many others, often leads to unfair outcomes which perpetuate historical social biases or introduce new, machine-generated ones. Given the pervasiveness of these systems in our societies, this has given rise to increasing pressure to improve their fairness, which has contributed to a burgeoning area of research.

This risk relates to the public service media values of universality (reach all segments of society, with no-one excluded) and diversity (support and seek to give voice to a plurality of competing views – from those with different backgrounds, histories and stories. Help build a more inclusive, less fragmented society).

Developers of algorithmic systems today can draw on a growing array of technical approaches to addressing fairness issues; however, fairness remains a challenging issue that cannot be fully solved by technical fixes. Instead, as Wachter et al argue in the context of EU law, the best approach may be to recognise that algorithmic systems are inherently and inevitably biased, and to put in place accountability mechanisms to ensure that biases do not perpetuate unfair discrimination but are instead used to help redress historical injustices.[footnote]Wachter, S., Mittelstadt, B. and Russell, C. (2020). ‘Why Fairness Cannot Be Automated: Bridging the Gap Between EU Non-Discrimination Law and AI’. Computer Law & Security Review, 41. Available at: http://dx.doi.org/10.2139/ssrn.3547922[/footnote]

Recommendation systems are no exception. Biases in recommendation can arise at a variety of levels and for different stakeholders. From the perspective of users, a recommendation system could be unfair if the quality of the recommendations varies across users. For example, if a music recommendation system is much worse at predicting the tastes of and serving interesting recommendations to a minority group, this could be unfair.

Recommendations could also be unfair from a provider perspective. For instance, one recent study found a film recommendation system trained on a well-known dataset (MovieLens 10M), and designed to optimise for relevance to users, systematically underrepresented films by female directors.[footnote]Boratto, L., Fenu, G. and Marras, M. (2021) ‘Interplay between upsampling and regularization for provider fairness in recommender systems’. User Modeling and User-Adapted Interaction, 31(3), pp. 421–455. Available at: https://doi.org/10.1007/s11257-021-09294-8[/footnote] This example illustrates a phenomenon that is more pervasive. Since recommendation systems are primarily built to optimise for user relevance, provider-side unfairness has been observed to emerge in a variety of settings, ranging from content recommendations to employment websites.[footnote]Biega, A. J., Gummadi, K. P. and Weikum, G. (2018). ‘Equity of Attention: Amortizing Individual Fairness in Rankings’. SIGIR ’18: The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval, pp. 405–414. Available at: https://dl.acm.org/doi/10.1145/3209978.3210063[/footnote]
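A hedged sketch of how such disparities might be quantified is given below: one function compares average recommendation quality across user groups, the other measures the share of recommendation slots given to each provider group. The data structures and group labels are assumptions for illustration, not metrics used by any organisation we interviewed.

```python
from collections import defaultdict

def group_hit_rate(recs: dict[str, list[str]], relevant: dict[str, set[str]],
                   group: dict[str, str]) -> dict[str, float]:
    """Average hit rate (share of recommended items the user found relevant) per user group."""
    totals, counts = defaultdict(float), defaultdict(int)
    for user, items in recs.items():
        hits = sum(1 for item in items if item in relevant.get(user, set()))
        totals[group[user]] += hits / max(len(items), 1)
        counts[group[user]] += 1
    return {g: totals[g] / counts[g] for g in totals}

def provider_exposure(recs: dict[str, list[str]], provider_group: dict[str, str]) -> dict[str, float]:
    """Share of all recommendation slots given to each provider group (e.g. films by female directors)."""
    counts, total = defaultdict(int), 0
    for items in recs.values():
        for item in items:
            counts[provider_group.get(item, "unknown")] += 1
            total += 1
    return {g: c / total for g, c in counts.items()} if total else {}
```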

Because different categories of stakeholders derive different types of value from recommendation systems, issues of fairness can arise separately for each of them. In e-commerce applications, for example, users derive value from relevant recommendations for items that they might be interested in buying, while sellers derive value from their items being exposed to more potential buyers. Moreover, attempts to address unfair bias for one category of stakeholders might lead to making things worse for another category. In the case of e-commerce applications, for example, attempts to improve provider-side fairness could have negative effects on the relevance of recommendations for users. Bringing these competing interests together, comparing them and devising overarching fairness metrics remains an open challenge.[footnote]Abdollahpouri, H., Adomavicius, G., Burke, R., et al. (2020). ‘Multistakeholder recommendation: Survey and research directions’. User Modeling and User-Adapted Interaction, pp.127–158. Available at: https://doi.org/10.1007/s11257-019-09256-1[/footnote]

Issues of fairness were not prominently mentioned by our interview participants. When fairness was referenced, it was primarily with regard to fairness concerns for users and whether recommendation systems performed better for some demographics than others. However, recommendation systems are currently used so sparingly across the public service media organisations we spoke to that the risk did not generate much concern among staff. Sébastien Noir, European Broadcasting Union, said that ‘Recommendation appears, at least for the moment more than something like [the] cherry on the cake, it’s a little bit of a personalised touch on the world where everything is still pretty much broadcast content where everyone gets to receive the same content.’[footnote]Interview with Sébastien Noir, Head of Software, Technology and Innovation, and Dmytro Petruk, Developer, European Broadcasting Union (2021).[/footnote] Since, for now, recommendations represent a very small portion of the content that users access on these platforms, the risk that this poses to fairness was deemed to be very low.

However, if recommendations were to take a more prominent role in future, this would pose concerns that need to be addressed. Some of our BBC interviewees expressed a concern that some recommendations currently cater best to the interests of some demographics, while they work less well for others. Differential levels of accuracy and quality of experience across groups of users is a known issue in recommendation systems, although the way in which it manifests can be difficult to predict before the system is deployed.

In general, our respondents believed that ‘majority’ users, whose informational needs and preferences are closest to the average, and therefore more predictable, tend to be served best by a recommendation system – though many acknowledge this assertion has been difficult to empirically prove. If the majority of BBC users belong to a specific demographic, this could skew the system towards their interests and tastes, posing fairness issues with respect to other demographics. However, this can sometimes be reversed when other factors beyond user relevance, such as increasing the diversity of users and the diversity of content, are introduced. Therefore, the emerging patterns from recommendations are difficult to predict, but will need to be monitored on an ongoing basis. BBC interviewees reported that this issue is currently addressed by looping in more editorial oversight.

6. Social effects or externalities

One of the features of recommendation systems that has attracted most controversy in recent years is their apparent tendency to produce negative social effects. Social media networks that use recommendation systems to structure user feeds, for instance, have come under scrutiny for increasing polarisation by optimising for engagement. Other social networks have come under fire for facilitating the spread of disinformation.

The social externality risk relates to the public service media values of universality (create a public sphere, in which all citizens can form their own opinions and ideas, aiming for inclusion and social cohesion) and diversity (support and seek to give voice to a plurality of competing views – from those with different backgrounds, histories and stories. Help build a more inclusive, less fragmented society).

Pariser introduced the concept of a ‘filter bubble’, which can be understood as an informational ecosystem where individuals are only or predominantly exposed to certain types of content, while they never come into contact with other types.[footnote]Pariser, E. (2011). The filter bubble: what the Internet is hiding from you. Penguin Books.[/footnote] The philosopher C Thi Nguyen has offered an analysis of how filter bubbles might develop into echo chambers, where users’ beliefs are reflected at them and reinforced through interaction with media that validates them, leading to potentially dangerous escalation.[footnote]Nguyen, C. T. (2018). ‘Why it’s as hard to escape an echo chamber as it is to flee a cult’. Aeon. Available at: https://aeon.co/essays/why-its-as-hard-to-escape-an-echo-chamber-as-it-is-to-flee-a-cult[/footnote] However, some recent empirical research has cast doubt on the extent to which recommendation systems deployed on social media really give rise to filter bubbles and political polarisation in practice.[footnote]Arguedas, A. R., Robertson, C. T., Fletcher, R. and Nielsen R.K. (2022). ‘Echo chambers, filter bubbles, and polarisation: a literature review.’ Reuters Institute for the Study of Journalism. Available at: https://reutersinstitute.politics.ox.ac.uk/echo-chambers-filter-bubbles-and-polarisation-literature-review[/footnote]

In one study, it was observed that consuming news through social media increases the diversity of content consumed, with users engaging with a larger and more varied selection of news sources.[footnote]Scharkow, M., Mangold, F., Stier, S. and Breuer, J. (2020). ‘How social network sites and other online intermediaries increase exposure to news’. Proceedings of the National Academy of Sciences, 117(6), pp. 2761–2763. Available at: https://doi.org/10.1073/pnas.1918279117[/footnote] These studies highlight how recommendation systems can be programmed to increase the diversity of exposure to varied sources of content.[footnote]A similar finding exists in other studies of public service media organisations – see: Hildén, J. (2021). ‘The Public Service Approach to Recommender Systems: Filtering to Cultivate’. Television & New Media, 23(7). Available at: https://doi.org/10.1177/15274764211020106[/footnote] However, they do not control for the quality of the sources or the individual reaction to the content (e.g. does the user pay attention or merely scroll down on some of the news items?). Without this information it is difficult to know what the effects are of exposure to different types of sources. More research is needed to probe the links between exposure to diverse sources and the influence this has on the evolution of political opinions. 

Another known risk for recommendation systems is exposure to manipulation by external agents. Various states, for example Russia and China, have been documented to engage in what has been called ‘computational propaganda’. This type of propaganda exploits some features of recommendation systems on social media to spread mis- or disinformation, with the aim of destabilising the political context of the countries targeted. State-sponsored ‘content farms’ have been documented to produce content that is engineered to be picked up by recommendation systems and go viral. This kind of hostile strategy is made possible by the vulnerability of recommendation systems, especially open ones, which are programmed to optimise for engagement.

The risk that the use of recommendation systems could increase polarisation and create filter bubbles was regarded as very low by our interviewees. Unlike social media that recommend content generated by users or other organisations, the BBC and other public service media that we spoke to operate closed content platforms. This means that all the content recommended on their platforms has already passed multiple editorial checks, including for balanced and truthful reporting.

The relatively minor role that recommendation systems play on the platform currently also means that they do not pose a risk of creating filter bubbles. Therefore, this was not recognised as a pressing concern.

However, many raised concerns that recommendation systems could undermine the principle of diversity by serving audiences homogenous content. Historically, programme schedulers have had mechanisms to expose audiences to content they might not choose of their own accord – for example by ‘hammocking’ programmes of high public value between more popular items on the schedule and relying on audiences not to switch channels. Interviewees also mentioned the importance of serendipity and surprise as part of the public service remit. This could be lost if audiences are only offered content based on their previous preferences. These concerns motivate ongoing research into new methods for producing more accurate and diversified recommendations.[footnote]Paudel, B., Christoffel, F., Newell, C. and Bernstein, A. (2017). ‘Updatable, Accurate, Diverse, and Scalable Recommendations for Interactive Applications’. ACM Transactions on Interactive Intelligent Systems, 7(1), pp.1–34. Available at: https://doi.org/10.1145/2955101[/footnote]
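One family of techniques for addressing this is diversity-aware re-ranking. The sketch below shows a simple greedy re-ranker that trades predicted relevance against similarity to items already selected. It is an illustrative approximation of this general idea, not the specific method described in the cited research, and the relevance and similarity inputs are assumed to come from an upstream model.

```python
def rerank_for_diversity(candidates, relevance, similarity, k=5, trade_off=0.7):
    """Greedy re-ranking: pick items that are relevant but not too similar to what is already selected.

    candidates: list of item ids; relevance: dict mapping item -> predicted relevance score;
    similarity: function (item_a, item_b) -> similarity in [0, 1];
    trade_off: weight on relevance versus diversity.
    """
    selected, remaining = [], list(candidates)
    while remaining and len(selected) < k:
        def score(item):
            # Penalise items that closely resemble something already in the slate.
            max_sim = max((similarity(item, s) for s in selected), default=0.0)
            return trade_off * relevance[item] - (1 - trade_off) * max_sim
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```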

Conclusion

The categories of risk related to the use of recommendation systems, identified in the literature, can be applied to their use in the context of public service media. However, the way in which these risks manifest and the emphasis that organisations put on them can be quite different to a commercial context.

We found that public service media have, to a greater or lesser extent, mitigated their exposure to these risks through a number of factors such as the high quality of the content being recommended; the limited deployment of the systems; the substantial level of human curation; a move towards greater integration of technical and editorial teams; ethical principles; associated practice checklists and system documentation. It is not enough for public service media organisations to believe that having a public service mission will ensure that recommendation systems serve the public. If public service media are to use recommendation systems responsibly, they must interrogate and mitigate the potential risks.

We find these risks can also be seen in relation to the six core public service values of universality, independence, excellence, diversity, accountability and innovation.

We believe it is useful for public service media to consider both the known risks, as understood within the wider research field, as well as the risks in relation to public service values. By approaching the potential challenges of recommendation systems through this dual lens, public service media organisations should be able to develop and deploy systems in line with their public service remit.

An additional consideration, broader than any specific risk category, is that of audience trust in public service media. Trust doesn’t fall under any specific category because it is associated with the relationship between public service media and their audience more broadly. But failure to address the risks identified by the categories can negatively affect trust. All public service media organisations place trust as central to their mission. In the context of a fragmented digital media environment, their trustworthiness has taken on increased importance and is now a unique quality that distinguishes them from other media and is pivotal to the argument in favour of sustaining public service media. Many public service media organisations are beginning to recognise and address the potential risks of recommendation systems and it is vital that this continues in order to retain audience trust.

Additional challenges for public service media

As well as the ethical risks described above, public service media face practical challenges in implementing recommendation systems that stem from their mission, the make-up of their teams and their organisational infrastructure.

Quantifying values

Recommendation systems filter content according to criteria laid down by the system developers. Public service media organisations that want to filter content in ways that prioritise public service values first need to translate these values into information that is legible to an algorithmic system. In other words, the values must be quantified as data.

However, as we noted above, public service values are fluid, can change over time and depend on context. And as well as the stated mission of public service media, laid down in charters, governance and guidelines, there are a set of cultural norms and individual gut instincts that determine day-to-day decision making and prioritisation in practice. Over time, public service media have developed a number of ways to measure public value, through systems such as the public value test assessment and with metrics such as audience reach, value for money and surveys of public sentiment (see section above). However, these only account for public value at a macro level. Recommendation systems that are filtering individual items of content require metrics that quantify values at a micro level.
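As an illustration of what micro-level quantification might look like, the sketch below scores an individual content item against a handful of editorial value dimensions and collapses them into a single number a recommender could rank on. The dimensions, weights and scale are hypothetical and do not represent any broadcaster’s actual scheme.

```python
# Hypothetical per-item scoring of editorial "public value" dimensions on a 0-5 scale.
# The dimensions and weights are illustrative only, not any broadcaster's actual scheme.
PUBLIC_VALUE_WEIGHTS = {
    "civic_importance": 0.4,    # does the item help citizens understand an issue of public concern?
    "originality": 0.2,         # is it distinctive reporting rather than duplication?
    "diversity_of_voices": 0.2,
    "local_relevance": 0.2,
}

def public_value_score(editorial_scores: dict[str, int]) -> float:
    """Collapse editor-assigned dimension scores into one number a recommender can rank on."""
    return sum(PUBLIC_VALUE_WEIGHTS[d] * editorial_scores.get(d, 0) for d in PUBLIC_VALUE_WEIGHTS)

item_scores = {"civic_importance": 5, "originality": 3, "diversity_of_voices": 4, "local_relevance": 2}
print(public_value_score(item_scores))  # 3.8
```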

Swedish Radio is a pioneer in attempting to do this work of translation. Olle Zachrison of Swedish Radio summarised it as: ‘we have central tenets to our public service mission stuff that we have been talking about for decades and also stuff that is in the kind of gut of the news editors. But in a way, we had to get them out there in an open way and into a system also, that we in a way could convert those kinds of editorial values that have been sitting in these kind of really wise news assessments for years, but to get them out there into a system that we also convert them into data.’[footnote]Interview with Olle Zachrison, Deputy News Commissioner & Head of Digital News Strategy, Swedish Radio (2021).[/footnote]

Working across different teams and different disciplines

The development and deployment of recommendation systems for public service media requires expertise in both technical development and content creation and curation. This proves challenging in a number of ways.

Firstly, technology talent is hard to come by, especially when public service media cannot offer anything near the salaries available at commercial rivals.[footnote]Interview with Dietmar Jannach, Professor, University of Klagenfurt (2021).[/footnote] Secondly, editorial teams often do not trust or value the role of technologists, especially when the two do not work closely with each other.[footnote]Interview with Nic Newman, Senior Research Associate, Reuters Institute for the Study of Journalism (2021).[/footnote] In some organisations, the introduction of recommendation systems stalls because it is perceived as a direct threat to editorial jobs and an attempt to replace journalists with algorithms.[footnote]Interview with Sébastien Noir, Head of Software, Technology and Innovation, and Dmytro Petruk, Developer, European Broadcasting Union (2021).[/footnote]

Success requires bridging this gap and coordinating between teams of experts in technical development, such as developers and data scientists, and experts in content creation and curation, the journalists and editors.[footnote]Boididou, C., Sheng, D., Moss, M. and Piscopo, A. (2021), ‘Building Public Service Recommenders: Logbook of a Journey’. RecSys ’21: Proceedings of the 15th ACM Conference on Recommender Systems, pp. 538–540. Available at: https://doi.org/10.1145/3460231.3474614[/footnote]

As Sørensen and Hutchinson note: ‘Data analysts and computer programmers (developers) now perform tasks that are key determinants for exposure to public service media content. Success is no longer only about making and scheduling programmes. This knowledge is difficult to communicate to journalists and editors, who typically don’t engage in these development projects […] Deep understanding of how a system recommends content is shared among a small group of experts’.[footnote] Sørensen, J.K. and Hutchinson, J. (2018). ‘Algorithms and Public Service Media’. Public Service Media in the Networked Society: RIPE@2017, pp.91–106. Available at: http://www.nordicom.gu.se/sites/default/files/publikationer-hela-pdf/public_service_media_in_the_networked_society_ripe_2017.pdf[/footnote]

Some, such as Swedish Radio and BBC News Labs, have tried to tackle this issue by explicitly having two project leads, one with an editorial background and one with a technical background, to emphasise the importance of working together and symbolically indicate that this was a joint process.[footnote]Interview with Olle Zachrison, Deputy News Commissioner & Head of Digital News Strategy, Swedish Radio (2021); BBC News Labs. ‘About’. Available at: https://bbcnewslabs.co.uk/about[/footnote] Swedish Radio’s Olle Zachrison noted that: 

‘We had a joint process from day one. And we also deliberately had kind of two project managers, one, clearly from the editorial side, like a very experienced local news editor. And the other guy was the product owner for our personalization team. So they were the symbols internally of this project […] that was so important for the, for the whole company to kind of team up behind this and also for the journalists and the product people to do it together.’

If this coordination fails, this can ‘weaken the organisation strategically and, on a practical level, create problems caused by failing to include or correctly mark the metadata that is essential for findability’.

Bayerische Rundfunk has established a unique interdisciplinary team. The AI and Automation Lab has a remit to not only create products, but also produce data-driven reporting and coverage of the impacts of artificial intelligence on society. Building from the existing data journalism unit, the Lab fully integrates the editorial and technical teams under the leadership of Director Uli Köppen. Although she recognises the challenges of bringing together people from different backgrounds, she believes the effort has paid off:

‘This technology is so new, and it’s so hard to persuade the experts to work in journalism. We had the data team up and running, these are journalists that are already in the mindset at this intersection of tech and journalism. And I had the hope that they are able to help people from other industries to dive into journalism, and it’s easier to have this kind of conversation with people who already did this cultural step in this hybrid world.

‘It was astonishing how those journalists helped the new people to onboard and understand what kind of product we are. And we are also reinventing our role as journalists in the product world. And this really worked out so I would say it’s worth the effort.’

Metadata, infrastructure and legacy systems

In order to filter content, recommendation systems require clear information about what that content is. For example, if a system is designed to show people who enjoyed soap operas other series that they might enjoy, individual items of content must be labelled as being soap operas in a machine-readable format. This kind of labelling is called metadata.
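A minimal, hypothetical example of such a metadata record, and of the kind of filtering it makes possible, is sketched below. The field names are illustrative assumptions rather than any broadcaster’s actual schema.

```python
# A minimal, hypothetical metadata record for one item of content.
# A recommender can only filter on fields like "genre" if they are present and labelled consistently.
programme = {
    "id": "prog-0001",
    "title": "Riverside Lives",
    "genre": "soap opera",
    "language": "en",
    "regions": ["North West"],
    "first_broadcast": "2021-03-14",
    "age_rating": "PG",
}

def similar_by_genre(catalogue: list[dict], seed: dict, limit: int = 10) -> list[dict]:
    """Return other items sharing the seed item's genre - only possible if genre metadata exists."""
    return [item for item in catalogue
            if item["genre"] == seed["genre"] and item["id"] != seed["id"]][:limit]
```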

However, public service media have developed their programming around the needs of individual channels and stations, organised according to particular audiences and tastes (e.g. BBC Radio 1 is aimed at a younger audience around music, BBC Radio 4 at an older audience around speech content) or by region (e.g. in Germany, Bayerische Rundfunk serves Bavaria and WDR serves North Rhine-Westphalia, but both are members of the national ARD network). Each of these channels will have evolved its own protocols and systems and may label content differently – or not at all. This means the metadata to draw on for the deployment of recommendation systems is often sparse and low quality, and the metadata infrastructure is often disjointed and unsystematic.

We heard from many interviewees across public service media organisations that access to high-quality metadata was one of the most significant barriers to implementing recommendation systems. This was particularly an issue when they wanted to go beyond the most simplistic approaches and experiment with assigning public service value to pieces of content or measuring the diversity of recommended content.

Recommendation system projects often required months of setting up systems for data collection, then assessing and cleaning that data, before the primary work of building a recommendation system could begin. To achieve this requires a significant strategic and financial commitment on the part of the organisation, as well as buy-in from the editorial teams involved in labelling.

Evaluation of recommendation systems

We’ve explored the possible benefits and harms of recommendation systems, and how those benefits and harms might manifest in a public service media context. To try to understand whether and when those benefits and harms occur, developers of recommendation systems need to evaluate their systems. Conversely, looking at how developers and organisations evaluate their recommendation systems can tell us what benefits and harms, and to whom, they prioritise and optimise for in their work.[footnote]Evaluation of recommendation systems is not limited to the developers and deployers of those systems. Other stakeholders such as users, government, regulators, journalists and civil society organisations may all have their own goals for what they think a particular recommendation system should be optimising for. Here, however, we focus on evaluation as seen by the developer and deployer of the system, as this is where there is the tightest feedback loop between evaluation and changes to the system, and the developers and deployers generally have privileged access to information about the system and a unique ability to run tests and studies on the system. For more on how regulators (and others) can evaluate social media companies in an online-safety context, see: Ada Lovelace Institute. (2021). Technical methods for regulatory inspection of algorithmic systems. Available at: https://www.adalovelaceinstitute.org/report/technical-methods-regulatory-inspection/[/footnote]

In this chapter, we look at:

  • how recommendation systems can be evaluated
  • how public service media organisations evaluate their own recommendation systems
  • how evaluation might be done differently in future.

How recommendation systems are evaluated

In this section, we lay out a framework for understanding the evaluation of recommendation systems as a three-stage process of:

  1. Setting objectives.
  2. Identifying metrics.
  3. Selecting methods to measure those metrics.

This framework is informed by three aspects of evaluation (objectives, metrics and methods) as identified by Francesco Ricci, Professor of Computer Science at the Free University of Bozen-Bolzano.

Objectives

Evaluation is a process of determining how well a particular system achieves a particular set of goals or objectives. To evaluate a system, you need to know what goals you are evaluating against.[footnote]Interview with Francesco Ricci, Professor of Computer Science, Free University of Bozen-Bolzano (2021).[/footnote]

However, this is not a straightforward exercise. There is no singular goal for a recommendation system and different stakeholders will have different goals for the system. For example, on a privately-owned social media platform:

  • the engineering team’s goal might be to create a recommendation system that serves ‘relevant’ content to users
  • the CEO’s goal might be to maximise profit while minimising personal reputational risk
  • the audience’s goal may be to discover new and unexpected content (or just avoid boredom).

If a developer wants to take into account the goals of all the stakeholders in their evaluation, they will need to decide how to prioritise or weigh these different goals.

Balancing goals is ultimately a ‘political’ or ‘moral’ question, not a technical one, and there will never be a universal answer about how to weigh these different factors, or even who the relevant stakeholders whose goals should be weighted are.

Any process of evaluation ultimately needs a process to determine the relevant stakeholders for a recommendation system and how their priorities should be weighted.

This is made more difficult because people are often confused or uncertain about their goals, or have multiple competing goals, and so the process of evaluation will need to help people clarify their goals and their own internal weightings between those goals.[footnote]Interview with Francesco Ricci.[/footnote]

Metrics

Furthermore, goals are often quite general and whether they have been met cannot be directly observed.[footnote]Interview with Francesco Ricci, Professor of Computer Science, Free University of Bozen-Bolzano (2021).[/footnote] Therefore, once a goal has been decided, such as ‘relevance to the user’, the goal needs to be operationalised into a set of specific metrics to judge the recommendation system against.[footnote]Operationalising is a process of defining how a vague concept, which cannot be directly measured, can nevertheless be estimated by empirical measurement. This process inherently involves replacing one concept, such as ‘relevance’, with a proxy for that concept, such as ‘whether or not a user clicks on an item’ and thus will always involve some degree of error.[/footnote] These metrics can be quantitative, such as the number of users who click on an item, or qualitative, such as written feedback from users about how they feel about a set of recommendations.

Whatever the metrics used, the choice of metrics is always a choice of a particular interpretation of the goal. The metric will always be a proxy for the goal, and determining a proxy is a political act that grants power to the evaluator to decide what metrics reflect their view of the problem to be solved and the goals to be achieved.[footnote]Beer, D. (2016). Metric Power. London: Palgrave Macmillan. Available at: https://doi.org/10.1057/978-1-137-55649-3[/footnote]

The people who define these metrics for the recommendation system are often the engineering or product teams. However, these teams are not always the same people who set the goals of an organisation. Furthermore, they may not directly interact with other stakeholders who have a role in setting the goals of the organisation or the goal of deploying the recommendation system.

Therefore, through misunderstanding, lack of knowledge or lack of engagement with others’ views, the engineering and product teams’ interpretation of the goal will likely never quite match the intention of the goal as envisioned by others.

Metrics will also always be a simplified vision of reality, summarising individual interactions with the recommendation system into a smaller set of numbers, scores or lines of feedback.[footnote]Raji, I. D., Bender, E. M., Paullada, A. et al. (2021). ‘AI and the Everything in the Whole Wide World Benchmark’, p2. arXiv. Available at: https://doi.org/10.48550/arXiv.2111.15366[/footnote] This does not mean metrics cannot be useful indicators of real performance; this very simplicity is what makes them useful in understanding the performance of the system. However, those creating the metrics need to be careful not to confuse the constructed metric with the reality underlying the interactions of people with the recommendation system. The metric is a measure of the interaction, not the interaction itself.

Methods

Evaluating is then the process of measuring these metrics for a particular recommendation system in a particular context, which requires gathering data about the performance of the recommendation system. Recommendation systems are evaluated in three main ways:[footnote]Gunawardana, A. and Shani, G. (2015). ‘Evaluating Recommender Systems’. Recommender Systems Handbook, pp 257–297. Available at: https://doi.org/10.1007/978-0-387-85820-3_8[/footnote]

  1. Offline evaluations test recommendation systems without real users interacting with the system, for example by measuring recommendation system performance on historical user interaction data or in a synthetic environment with simulated users.
  2. User studies test recommendation systems against a small set of users in a controlled environment with the users being asked to interact with the system and then typically provide explicit feedback about their experience afterwards.
  3. Online evaluations test recommendation systems deployed in a live environment, where the performance of the recommendation system is measured against interactions with real users.

These methods of evaluation are not mutually exclusive and a recommendation system might be tested with each method sequentially, as it moves from design to development to deployment.

Offline evaluation has been a historically popular way to evaluate recommendation systems. It is comparatively easy to do, because it does not require interaction with real users or a live platform. In principle, offline evaluations are reproducible by other evaluators and allow standardised comparison of the results of different recommendation systems.[footnote]Jannach, D. and Jugovac, M. (2019). ‘Measuring the Business Value of Recommender Systems’. ACM Transactions on Management Information Systems, 10(4), pp 1–23. Available at: https://doi.org/10.1145/3370082[/footnote]

However, there is increasing concern that offline evaluation results based on historical interaction data do not translate well into real-world recommendation system performance. This is because the training data is based on a world without the new recommendation system in it, and evaluations therefore cannot account for how that system might itself shift wider aspects of the service like user preferences.[footnote]Rohde, D., Bonner, S., Dunlop, T., et al. (2018). ‘RecoGym: A Reinforcement Learning Environment for the problem of Product Recommendation in Online Advertising’. arXiv. Available at: https://doi.org/10.48550/arXiv.1808.00720; Beel, J. and Langer, S. (2015). ‘A Comparison of Offline Evaluations, Online Evaluations, and User Studies in the Context of Research-Paper Recommender Systems’. Proceedings of the 19th International Conference on Theory and Practice of Digital Libraries (TPDL), pp. 153–168. Available at: https://doi.org/10.1007/978-3-319-24592-8_12; Jannach, D., Pu, P., Ricci, F. and Zanker, M. (2021). ‘Recommender Systems: Past, Present, Future’. AI Magazine, 42 (3). Available at: https://doi.org/10.1609/aimag.v42i3.18139[/footnote] This limits their usefulness in evaluating which recommendation system would actually be the best performing in the dynamic live environments most stakeholders are interested in, such as a video-sharing website with an ever-growing set of videos and ever-changing set of viewers and content creators.

Academics we spoke to in the field of recommendation systems identified user studies in labs and simulations as the state of the art in academic recommendation system evaluation, whereas in industry common practice is to use online evaluation via A/B testing to optimise key performance indicators.[footnote]Interview with Dietmar Jannach, Professor, University of Klagenfurt (2021).[/footnote]

How do public service media evaluate their recommendation systems?

In this section, we use the framework of objectives, metrics and methods to examine how public service media organisations evaluate their recommendation systems in practice.

Objectives

As we discussed in the previous chapter, recommendation systems are ultimately developed and deployed to serve the goals of the organisation using them; in this case, public service media organisations. In practice, however, the objectives that recommendation systems are evaluated against are often multiple levels of operationalisation and contextualisation down from the overarching public service values of the organisation.

For example, as discussed previously, the BBC Charter agreement sets out the mission and public purposes of the organisation for the following decade. These are derived from the public service values, but are also shaped by political pressures as the Charter is negotiated with the British Government of the time.

The BBC then publishes an annual plan setting out the organisation’s strategic priorities for that year, drawing explicitly on the Charter’s mission and purposes. These annual plans are equally shaped by political pressures, regulatory constraints and challenges from commercial providers. The plan also sets out how each product and service will contribute towards meeting those strategic priorities and purposes, setting the goals for each of the product teams.

For example, the goals of BBC Sounds as a product team in 2021 were to:

  1. Increase the audience size of BBC Sounds’ digital products.
  2. Increase the demographic breadth of consumption across BBC Sounds’ products, especially among the young.
  3. Convert ‘lighter users’ into regular users.
  4. Enable users to more easily discover content from the more than 50 hours of new audio produced by the BBC on an hourly basis.[footnote]According to David Jones (Executive Product Manager, BBC Sounds, interviewed in 2021), his top-line KPI is to reach 900,000 members of the British population who are under 35 by March 2022. These numbers are determined centrally by BBC senior managers based on the BBC’s Service Licence for BBC Online and Red Button. See: BBC Trust. (2016). BBC Online and Red Button Service Licence. Available at: http://downloads.bbc.co.uk/bbctrust/assets/files/pdf/regulatory_framework/service_licences/online/2016/online_red_button_may16.pdf[/footnote]

These objectives map onto the goals for using recommendation systems we discussed in the previous chapter. Specifically, the first three relate to capturing audience attention and the fourth relates to reducing information overload and improving discoverability for audiences.

These product goals then inform the objectives of the engineering and product teams in the development and deployment of a recommendation system, as a feature within the wider product.

At each stage, as the higher level objectives are interpreted and contextualised lower down, they may not always align with each other.

The objectives for the development and deployment of recommendation systems in public service media seem most clear for entertainment products, e.g. audio-on-demand and video-on-demand. Here, the goal of the system is clearly articulated as a combination of audience engagement, reaching underserved demographics and serving more diverse content. These goals are often explicitly linked by the development teams to achieving the public service values of diversity and a personalised version of universality, which they see as serving the needs of each and every group in society.

In these cases, public service media organisations seem better at articulating goals for recommendation systems when they are using recommendation systems for a similar purpose as private-sector commercial media organisations. This seems, in part, because there is greater existing knowledge of how to operationalise those objectives, and the developers can draw on their own private sector experience and existing industry practice, open-source libraries and similar resources.

However, when setting objectives that focus more on public service value, public service media organisations often seem less clear about the goals of the recommendation system within the wider product.

This seems partly because in the domain of news, for example, the use of recommendation systems by public service media is more experimental and at an earlier stage of maturity. Here, the motivations diverge further from those of commercial providers: the implicit motivation of public service media developers seems to be to augment existing editorial capabilities with a recommendation system, rather than to drive engagement with the news content. This means public service media developers have fewer existing practices and resources to draw upon for translating product goals and articulating recommendation system objectives in these domains.

In general, it seems that some public service values, such as diversity and universality, are easier to operationalise in the context of recommendation systems than others. These values get privileged over others, such as accountability, in the development of recommendation systems, as they are the easiest to translate from the overarching set of organisational values down to the product and feature objectives.

Metrics

Public service media organisations have struggled to operationalise their complex public service values into specific metrics. There seem to be three broad responses to this:

  1. Fall back on established engagement metrics, e.g. click-through rate and watch time, often with additional quantitative measures of the diversity of audience content consumption.
  2. The above approach combined with attempts to create crude numerical measures (e.g. a score from 1 to 5) of ‘public service value’ for pieces of content, often reducing complex values to a single number subjectively judged by journalists, then measuring the consumption of content with a ‘high’ public service value score.
  3. Try to indirectly optimise for public service value by making their metrics the satisfaction of editorial stakeholders, whose preferences are seen as the best ‘ground truth’ proxy for public service value. Then optimise for lists of recommendations which are seen to have high public service value by editorial stakeholders.

Karin van Es found that, as of 2017, the European Broadcasting Union and the Dutch public service media organisation NPO evaluated pilot algorithms using the same metrics found in commercial systems i.e. stream starts and average‐minute ratings.[footnote]van Es, K. F. (2017). ‘An Impending Crisis of Imagination : Data‐Driven Personalization in Public Service Broadcasters’. Media@LSE. Available at: https://dspace.library.uu.nl/handle/1874/358206[/footnote] As van Es notes, these metrics are a proxy for audience retention and even if serving diverse content was an explicit goal in designing the system, the chosen metrics reflect – and will ultimately lead to – a focus on engagement over diversity.

Therefore, despite different stated goals, the public service media use of recommendation systems ends up optimising for similar outcomes as private providers.

By now, most public service media organisations using recommendation systems also have explicit metrics for diversity, although there is no single shared definition of diversity across the different organisations, nor is there one single metric used to measure the concept.

However, most quantitative metrics for diversity in the evaluation of public service media recommendation systems focus on diversity in terms of audience exposure to unique pieces of content or to categories of content, rather than on the representation of demographic groups and viewpoints across the content audiences are exposed to.[footnote]This was generally attributed by interviewees to a combination of a lack of metadata to measure the representativeness within content and assumption that issues of representation within content were better dealt with at the point at which content is commissioned, so that the recommendation systems have diverse and representative content over which to recommend.[/footnote]

Some aspects of diversity, as Hildén observes, are easier to define and ‘to incorporate into a recommender system than others. For example, genres and themes are easy to determine at least on a general level, but questions of demographic representation and the diversity of ideas and viewpoints are far more difficult as they require quite detailed content tags in order to work. Tagging content and attributing these tags to users might also be politically sensitive especially within the context of news recommenders’.[footnote]Hildén, J. (2021). ‘The Public Service Approach to Recommender Systems: Filtering to Cultivate’. Television & New Media, 23(7). Available at: https://doi.org/10.1177/15274764211020106[/footnote]

Commonly used metrics for diversity include intra-list diversity, i.e. the average difference between each pair of items in a list of recommendations, and inter-list diversity, i.e. the ratio of unique items recommended to total items recommended across all the lists of recommendations.
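
As a rough illustration of these two metrics, the sketch below computes intra-list diversity from item feature vectors and inter-list diversity across a set of recommendation lists. The item names, vectors and data structures are illustrative assumptions, not taken from any broadcaster’s codebase.

```python
# Minimal sketch of the two diversity metrics described above, assuming each
# item is represented by a feature vector (e.g. of genre or topic tags).
import numpy as np

def intra_list_diversity(item_vectors: np.ndarray) -> float:
    """Average pairwise (1 - cosine similarity) between items in one list."""
    n = len(item_vectors)
    if n < 2:
        return 0.0
    norms = np.linalg.norm(item_vectors, axis=1, keepdims=True)
    unit = item_vectors / np.clip(norms, 1e-12, None)   # rows as unit vectors
    sims = unit @ unit.T
    pairs = np.triu_indices(n, k=1)                     # distinct item pairs
    return float(np.mean(1.0 - sims[pairs]))

def inter_list_diversity(recommendation_lists: list[list[str]]) -> float:
    """Ratio of unique items recommended to total items recommended."""
    all_items = [item for rec_list in recommendation_lists for item in rec_list]
    return len(set(all_items)) / len(all_items)

# Example: two users each receive three recommendations.
lists = [["doc_a", "comedy_b", "news_c"], ["doc_a", "comedy_b", "drama_d"]]
print(inter_list_diversity(lists))  # 4 unique items / 6 recommended = 0.67
```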

Some public service media organisations are experimenting with more complex measures of exposure diversity. For example, Koen Muylaert at Belgian VRT explained how they measure an ‘affinity score’ for each user for each category of content, e.g. your affinity with documentaries or with comedy shows, which increases as you watch more pieces of content in that category.[footnote]Interview with Koen Muylaert, Project Lead, VRT data platform and data science initiative, Vlaamse Radio- en Televisieomroeporganisatie (VRT) (2021).[/footnote] VRT then measures the diversity of content that each user consumes by looking at the difference between a user’s affinity scores for different categories.[footnote]By measuring the entropy of the distribution of affinity scores across categories, and trying to improve diversity by increasing that entropy.[/footnote] VRT see this method of measuring diversity as valuable because they can explain it to others and measure it across users over time, to track how new iterations of their recommendation system increase users’ exposure to diverse content.
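
To make the entropy-based measure described in the footnote concrete, the sketch below shows one way such a score could be computed per user, under our own assumptions about how affinity scores are stored; it is not VRT’s actual implementation.

```python
# Shannon entropy of a user's normalised affinity distribution: higher entropy
# means consumption is spread more evenly across categories; a single dominant
# category gives entropy close to 0.
import numpy as np

def affinity_entropy(affinity_scores: dict[str, float]) -> float:
    values = np.array(list(affinity_scores.values()), dtype=float)
    probs = values / values.sum()
    probs = probs[probs > 0]                    # treat 0 * log(0) as 0
    return float(-np.sum(probs * np.log2(probs)))

# A user who mostly watches comedy scores lower than one with broad tastes.
print(affinity_entropy({"comedy": 8.0, "documentary": 1.0, "news": 1.0}))  # ~0.92
print(affinity_entropy({"comedy": 3.0, "documentary": 3.0, "news": 3.0}))  # ~1.58 (log2 of 3)
```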

To improve on this, some public service media organisations have tried to implement ‘public service value’ as an explicit metric in evaluating their recommendation systems. NPO, for example, ask a panel of 1,500 experts and ordinary citizens to assess the public value of each piece of content, including the diversity of actors and viewpoints represented in the content, and then ask those panellists to assign a single ‘public value’ from 1 to 100 to all pieces of content on their on-demand platform. They then calculate an average ‘public value’ score for the consumption history of each user. According to Sara van der Land, Digital Innovation Advisor at NPO, their target is to make sure that the average ‘public value’ score of every user rises over time.[footnote]Interview with Arno van Rijswijk, Head of Data & Personalization, and Sarah van der Land, Digital Innovation Advisor, Nederlandse Publieke Omroep (2021).[/footnote]
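
As a simple illustration of this metric, the sketch below computes a per-user average of panel-assigned scores; the data structures and numbers are invented for illustration and are not NPO’s.

```python
# Panel-assigned 'public value' scores (1-100) per item, averaged over each
# user's consumption history; the stated target is for this average to rise
# over time for every user.
panel_scores = {"item_1": 82, "item_2": 35, "item_3": 60}

def average_public_value(viewing_history: list[str]) -> float:
    scores = [panel_scores[item] for item in viewing_history]
    return sum(scores) / len(scores)

print(average_public_value(["item_1", "item_2", "item_2"]))  # ~50.7
```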

At the moment, they are only optimising for that metric within a dedicated ‘public value’ recommendations section within their wider on-demand platform, which mixes recommendations based on user engagement with the ‘public value’ of the content. However, through experiments, they found there was a trade-off between optimising for ‘public value’ and viewership, as noted by Arno van Rijswijk, Head of Data & Personalization at NPO:

‘When we’re focusing too much on the public value, we see that the percentage of people that are watching the actual content from the recommender is way lower than when you’re using only the collaborative filtering algorithm […] So when you are focusing more on the relevance then people are willing to watch it. And when you’re adding too much weight on the public values, people are not willing to watch it anymore.’

This resulted in them choosing to have a ‘low ratio’ of public value content to engaging content, making explicit the trade-off that public service media organisations often have to make between audience retention and other public service values like diversity, at least over the short term that these metrics measure.

Others, when faced with the inadequacy of conventional engagement and diversity metrics, have tried to indirectly optimise for public service value by making their metrics the satisfaction of editorial stakeholders, whose preferences are seen as the best ‘ground truth’ proxy for public service value.

In the early stages of developing an article-to-article news recommendation system in 2018,[footnote]The Datalab team was experimenting with and evaluating a number of approaches using a combination of content and user interaction data, such as neural network approaches that combine both content and user data as well as collaborative filtering models based only on user interactions.[/footnote] the BBC Datalab initially used a number of quantitative metrics for its offline evaluation.[footnote]Panteli, M., Piscopo, A., Harland, A., Tutcher, J. and Moss, F. M. (2019). ‘Recommendation systems for news articles at the BBC’, p. 4. CEUR Workshop Proceedings. Available at: http://ceur-ws.org/Vol-2554/paper_07.pdf[/footnote]

They evaluated these using offline metrics, with proxies for engagement, diversity and relevance to audiences (two of which are sketched in code after the list), including:

  • hit rate, i.e. whether the list of recommended articles includes an article a user did in fact view within 30 minutes of viewing the original article
  • normalised discounted cumulative gain, i.e. how relevant the recommended articles were assumed to be to the user, with a higher weighting for the relevance of articles higher up in the list of recommendations
  • intra-list diversity, i.e. the average difference between every pair of articles in a list of recommendations
  • inter-list diversity, i.e. the ratio of unique articles recommended to total articles recommended across all the lists of recommendations
  • popularity-based surprisal, i.e. how novel the articles recommended were
  • recency, i.e. how old the articles recommended were when shown to the user.
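
The sketch below gives hedged, generic implementations of two of these metrics, hit rate and normalised discounted cumulative gain. It follows the standard formulations of those metrics rather than the BBC’s actual evaluation code, and the article identifiers are invented.

```python
# Generic offline metrics for one list of recommendations.
import numpy as np

def hit_rate(recommended: list[str], actually_viewed: set[str]) -> float:
    """1.0 if any recommended article was in fact viewed (e.g. within 30 minutes), else 0.0."""
    return 1.0 if any(article in actually_viewed for article in recommended) else 0.0

def ndcg(relevances: list[float]) -> float:
    """Relevance of each position, discounted by log of rank, normalised by the
    score of the ideal (best possible) ordering of the same items."""
    gains = np.array(relevances, dtype=float)
    discounts = 1.0 / np.log2(np.arange(2, len(gains) + 2))
    dcg = float(np.sum(gains * discounts))
    ideal = float(np.sum(np.sort(gains)[::-1] * discounts))
    return dcg / ideal if ideal > 0 else 0.0

print(hit_rate(["a1", "a2", "a3", "a4"], {"a3"}))  # 1.0
print(ndcg([0.0, 1.0, 0.0, 1.0]))                  # ~0.65: relevant items sit low in the list
```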

However, they found that performance on these metrics didn’t match the editorial teams’ priorities. When they tried instead to operationalise into metrics what public service value meant to the editors, existing quantitative metrics were unable to capture editorial preferences and creating new ones was not straightforward. As Alessandro Piscopo, Principal Data Scientist, BBC Datalab, notes:[footnote]Interview with Alessandro Piscopo, Principal Data Scientist, BBC Datalab (2021).[/footnote]

‘We did notice that in some cases, one of the recommender prototypes was going higher in some metrics and went to editorial and [they would] say well we just didn’t like it […] Sometimes it was just comments from editorial world, we want to see more depth. We want to see more breadth. Then you have to interpret what that means.’

This difficulty in finding appropriate metrics led to the Datalab team changing their primary method of evaluation, from offline evaluation to user studies with BBC editorial staff, which they called ‘subjective evaluation’.[footnote]Piscopo, A. (2021). ‘Building public service recommenders: Logbook of a journey’ [presentation recording]. The Academic Fringe Festival. Available at: https://www.youtube.com/watch?v=Q2EYAxX5Pnk[/footnote]

In this approach, they asked editorial staff to score each list of articles generated by the recommendation systems as either: unacceptable, inappropriate, satisfactory or appropriate. The editors were then prompted to describe what properties they considered in choosing how appropriate the recommendations were. The development team would then iterate the recommendation system based on the scoring and written feedback, along with discussions with the editorial team about the recommendations.

Early in the process, the Datalab team agreed with editorial what percentage of each grade they were aiming for, and so what would be a benchmark for success in creating a good recommendation system. In this case, the editorial team decided that they wanted (a simple check of this benchmark is sketched after the list):[footnote]Piscopo, A. (2021); Interview with Alessandro Piscopo, Principal Data Scientist, BBC Datalab (2021).[/footnote]

  1. No unacceptable recommendations, on the basis that any unacceptable recommendations would be detrimental to the reputation of the BBC.
  2. Maximum 10% inappropriate recommendations.
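
A benchmark like this can be checked automatically over a batch of graded recommendation lists; the function below is our own illustrative sketch, not part of the BBC’s tooling.

```python
# Check a batch of editorial grades against the agreed thresholds:
# no 'unacceptable' lists, and at most 10% graded 'inappropriate'.
from collections import Counter

def meets_editorial_benchmark(grades: list[str]) -> bool:
    counts = Counter(grades)
    if counts["unacceptable"] > 0:
        return False
    return counts["inappropriate"] / len(grades) <= 0.10

grades = ["appropriate"] * 17 + ["satisfactory"] * 2 + ["inappropriate"]
print(meets_editorial_benchmark(grades))  # True: no unacceptable, 1/20 = 5% inappropriate
```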

This change of metrics meant that the evaluation of the recommendation system, and the iteration of the system as a result, was optimising for the preferences of the editorial team, over imperfect measures of audience engagement, relevance and diversity. The editors are seen as the most reliable ‘source of truth’ for public service value, in lieu of better quantitative metrics.

Methods

Public service media often rely on internal user studies with their own staff as an evaluation method during the pre-deployment stage of recommendation system development. For example, Greg Detre, ex-Chief Data Scientist at Channel 4, said that when developing a recommendation system for All 4 in 2016, they would ask staff to subjectively compare the output of two recommendation systems side by side, based on the staff’s understanding of Channel 4’s values:

‘So we’re making our recommendations algorithms fight, “Robot Wars” style, pick the one that you think […] understood this view of the best, good recommendations are relevant and interesting to the viewer. Great recommendations go beyond the obvious. Let’s throw in something a little unexpected, or showcase the Born Risky programming that we’re most proud of, [clicking the] prefer button next to the […]one you like best […] Born Risky, which was one of the kind of Channel Four cultural values for like, basically being a bit cheeky. Going beyond the mainstream, taking a chance. It was one of, I think, a handful of company values.’[footnote]Interview with Greg Detre, ex-Chief Data Scientist, Channel 4 (2021).[/footnote]

Similarly, when developing a recommendation system for BBC Sounds, the BBC Datalab decided to use a process of qualitative evaluation. BBC Sounds uses a factorisation machine approach, which is a mixture of content matching and collaborative filtering. This uses your listening history, metadata about the content and other users’ listening history to make recommendations in two ways (a generic sketch of this kind of model follows the list):

  1. It recommends items that have similar metadata to items you have already listened to.
  2. It recommends items that have been listened to by people with otherwise similar listening histories.
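
For readers unfamiliar with this family of models, the sketch below shows a generic second-order factorisation machine scoring function over a combined user-and-item feature vector. It illustrates the general technique rather than the BBC Sounds implementation, and all feature encodings and parameters are invented.

```python
# Generic factorisation machine score: a global bias, linear weights and
# pairwise feature interactions modelled via low-dimensional factor vectors.
import numpy as np

def fm_score(x: np.ndarray, w0: float, w: np.ndarray, V: np.ndarray) -> float:
    """x: feature vector combining user history and item metadata features;
    w0: global bias; w: linear weights; V: (n_features, n_factors) factor matrix,
    where <V[i], V[j]> models the interaction between features i and j."""
    linear = w0 + w @ x
    # Efficient pairwise term: 0.5 * sum_f ((V^T x)_f^2 - ((V^2)^T x^2)_f)
    xv = x @ V
    x2v2 = (x ** 2) @ (V ** 2)
    pairwise = 0.5 * float(np.sum(xv ** 2 - x2v2))
    return float(linear + pairwise)

# Score one (user, item) candidate; in practice, candidate items are ranked by this score.
rng = np.random.default_rng(0)
x = rng.integers(0, 2, size=20).astype(float)          # toy one-hot style features
print(fm_score(x, w0=0.1, w=rng.normal(size=20), V=rng.normal(size=(20, 4))))
```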

When evaluating this approach, the BBC compared the new factorisation machine recommendation system head-to-head with the existing external provider’s recommendations.

They recruited 30 BBC staff members under the age of 35 to be test users.[footnote]Al-Chueyr Martins, T. (2021). ‘From an idea to production: the journey of a recommendation engine’ [presentation recording]. MLOps London. Available at: https://www.youtube.com/watch?v=dFXKJZNVgw4[/footnote] They then showed these test users two sets of nine recommendations side by side. One set was provided by the current external provider’s recommendation system, and the other set was provided by the team’s internal factorisation machine recommendation system. The users were not told which system had produced which set of recommendations, and had to choose whether they preferred ‘A’ or ‘B’, or ‘both’ or ‘neither’, and then explain their decision in words.

Over 60% of test users preferred the recommendation sets provided by the internal factorisation machine.[footnote]Al-Chueyr Martins, T. (2021).[/footnote] This convinced the stakeholders that the system should move into production and A/B testing, and helped editorial teams get hands-on experience evaluating automated curations, increasing their confidence in the recommendation system.

Similarly, when later deploying the recommendation system to create a personalised sorting system for featured items, the Datalab team held a number of digital meetings with editorial staff, showing them the personalised and non-personalised featured items side-by-side. The Datalab then got feedback from the editors on which they preferred.[footnote]Interview with Alessandro Piscopo, Principal Data Scientist, BBC Datalab (2021).[/footnote] This approach allowed them to more directly capture internal staff preferences and manually step towards meeting those preferences. However, the team acknowledged its limitations upfront, particularly in terms of scale.[footnote]Interview with Alessandro Piscopo.[/footnote] Editorial teams and other internal staff only have so much capacity to judge recommendations, and thus would struggle to assess every edge case or judge every recommendation, especially if every recommendation changed depending on the demographics of the audience member viewing it.

Once the recommendation systems are deployed to a live environment, i.e. accessible by audiences on their website or app, public service media all have some form of online evaluation in place, most commonly A/B testing, in which different groups of users are shown recommendations from different versions of the system and their subsequent behaviour is compared.

Channel 4 used online evaluation in the form of A/B testing to evaluate the recommendation system used by their video-on-demand service, All 4. Greg Detre noted that:

‘We did A/B test it eventually. And it didn’t show a significant effect. That said [Channel 4] had an already somewhat good system in place. That was okay. And we were very constrained in terms of the technical solutions that we were allowed, there were only a very, very limited number of algorithms that we were able to implement, given the constraints that have already been agreed when I got there. And so as a result, the solution we came up with was, you know, efficient in terms of it was fast to compute in real time, and easy to sort of deploy, but it wasn’t that great… I think perhaps it didn’t create that much value.’[footnote]Interview with Greg Detre, ex-Chief Data Scientist, Channel 4 (2021).[/footnote]

BBC Datalab also used A/B testing in combination with continued user studies and behavioural testing. By April/May 2020, editorial had given sign-off and the recommendation system was deemed ready for initial deployment.[footnote]Piscopo, A. (2021). ‘Building public service recommenders: Logbook of a journey’ [presentation recording]. The Academic Fringe Festival. Available at: https://www.youtube.com/watch?v=Q2EYAxX5Pnk[/footnote]

During deployment, the team took a ‘failsafe approach’ with weekly monitoring of the live version of the recommendation system by editorial staff. This included further subjective evaluation described above and behavioural tests. In these behavioural tests, developers use a list of pairs of inputs and desired outputs, comparing the output of the recommendation system with the desired output for each given input.[footnote]See: BBC. RecList. GitHub. Available at: https://github.com/bbc/datalab-reclist; Tagliabue, J. (2022). ‘NDCG Is Not All You Need’. Towards Data Science. Available at: https://towardsdatascience.com/ndcg-is-not-all-you-need-24eb6d2f1227[/footnote]
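
The sketch below illustrates the general shape of such behavioural tests, using invented case names and a stub recommender; it does not use the RecList library referenced in the footnote.

```python
# Behavioural tests: pairs of inputs and desired properties of the output,
# checked against whatever recommender function is passed in.
from typing import Callable

behavioural_cases = [
    {"input": "uk_budget_article", "must_include": ["uk_budget_reaction"], "must_exclude": ["wrong_language_item"]},
    {"input": "election_article", "must_include": [], "must_exclude": ["court_case_item"]},
]

def run_behavioural_tests(recommend: Callable[[str], list[str]]) -> list[str]:
    """Return human-readable failure messages for the given recommender."""
    failures = []
    for case in behavioural_cases:
        recs = recommend(case["input"])
        for item in case["must_include"]:
            if item not in recs:
                failures.append(f"{case['input']}: expected {item} in recommendations")
        for item in case["must_exclude"]:
            if item in recs:
                failures.append(f"{case['input']}: {item} should never be recommended")
    return failures

# Example with a stub standing in for the live recommendation system.
print(run_behavioural_tests(lambda article_id: ["uk_budget_reaction", "sport_item"]))  # []
```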

After deployment, there was still a need to understand the effect and success of the recommendation systems. This took the form of A/B testing the live system. This included measuring the click-through rate on the recommended articles. However, members of the development team noted it was only a rough proxy for user satisfaction and were working to go beyond click-through rate.

Ultimately at the post-deployment stage, the success of the recommendation system is determined by the product teams, with input from development teams in identifying appropriate metrics. Editorial considerations are central to how product teams decide which metrics they are best suited to evaluate against.[footnote]Interview with Alessandro Piscopo, Principal Data Scientist, BBC Datalab (2021).[/footnote]

Once the system reaches the stage of online evaluation, these methods can only tell public service media whether the recommendation system was worthwhile after it has already been built and the time and resources to build it have been spent. The evaluation therefore becomes a question of whether to continue to use and maintain the system, weighing its operating costs against the costs involved in removing or replacing it. This can mean that even systems which provide only limited value to the audience or to the public service media organisation remain in use at this phase of evaluation.

How could evaluations be done differently?

In this section, we explore how the objectives, metrics and methods for evaluating recommendation systems could be done differently by public service media organisations.

Objectives

Some public service media organisations could benefit from more explicitly drawing a connection from their public service values to the organisational and product goals and finally to the recommendation system itself, showing how each level links to the next. This can help prevent value drift as goals go through several levels of interpretation and operationalisation, and help contextualise the role of the recommendation system in achieving public value within the wider process of content delivery.

More explicitly connecting these objectives can help organisations to recognise that, while a product as a whole should achieve public service objectives, a recommendation system doesn’t need to achieve every objective in isolation. While a recommendation system’s objectives should not be in conflict with the higher level objectives, they may only need to achieve some of those goals (e.g. its primary purpose might be to attract and engage younger audiences and thus promote diversity and universality). Therefore, its contribution to the product and organisational objectives should be seen in the context of the overall audience experience and the totality of the content an individual user interacts with. Evaluating against the recommendation system’s feature-level objectives alone is not enough to know whether a recommendation system is also consistent with product and organisational objectives.

Audience involvement in goal-setting

Another area worthy of further exploration is providing greater audience input and control over the objectives and therefore the initial system design choices. This could involve eliciting individual preferences from a panel of audience members and then working with staff to collaboratively trade-off and explicitly set different weighting for different objectives of the system. This should take place as part of a broader co-design approach at the product level. This is because the evaluation process for a recommendation system should include the option to say a recommendation system is not the most appropriate tool for achieving the higher-level objectives of the product and providing the outcomes the staff and the audiences want from the product, rather than constraining audiences to just choose between different versions of a recommendation system.

Making safeguards an explicit objective in system evaluation

A final area worthy of exploration is building in system safeguards like accountability, transparency and interpretability as explicit objectives in the development of the system, rather than just as additional governance considerations. Some interviewees suggested making considerations such as interpretability a specific objective in evaluating recommendation systems. By explicitly weighing those considerations against other objectives and attempting to measure the degree of interpretability or transparency, it would ensure greater salience of those safeguards in the selection of systems.[footnote]Interview with Greg Detre, ex-Chief Data Scientist, Channel 4 (2021).[/footnote]

Metrics

More nuanced metrics for public service value

If public service media organisations want to move beyond optimising for a mix of engagement and exposure diversity in their recommendation systems, then they will need to develop better metrics to measure public service value. As we’ve seen above, some are already moving in this direction with varying degrees of success, but more experimentation and learning will be required.

When creating metrics for public service value, it will be important to disambiguate between different meanings of ‘public service value’. A public service media organisation cannot expect to have a single quantitative measure of ‘public service value’, because the concept conflates a number of priorities that can be in tension with one another.

One approach would be to explicitly break each public service value down into separate metrics for universality, independence, excellence, diversity, accountability and innovation, and most likely sub-values within those. This could help public service media developers to clearly articulate the components of each value and make it explicit how they are weighted against each other. However, quantifying concepts like accountability and independence can be challenging to do, and this approach may struggle to work in practice. More experimentation is needed.

The most promising approach may be to adopt more subjective evaluations of recommendation systems. This approach recognises that ‘public service value’ is going to be inherently subjective and uses metrics which reflect that. Qualitative metrics based on feedback from individuals interacting with the recommendation system can let developers balance the tensions between different aspects of public service value. This places less of a burden on developers to weight those values themselves, which they might be poorly suited to, and can accommodate different conceptions of public service value from different stakeholders.

However, subjective evaluations do have their limits. They are only able to evaluate a tiny subset of the overall recommendations, and will only capture the subjective evaluation of features appearing in that subset. These evaluations may miss features that were not present in the content evaluated, or which are only able to be observed in aggregate over some wider set of recommendations. These challenges can be mitigated by broadening subjective evaluations to a more representative sample of the public, but that may raise other challenges around the costs of running these evaluations at that scale.

More specific metrics

In a related way, evaluation metrics could be improved by greater specificity and explicitness about what concept the metric is trying to measure and therefore explicitness about how different interpretations of the same high-level concept are weighted.[footnote]van Es, K. F. (2017). ‘An Impending Crisis of Imagination : Data‐Driven Personalization in Public Service Broadcasters’. Media@LSE. Available at: https://dspace.library.uu.nl/handle/1874/358206[/footnote] In particular, public service media organisations could be more explicit about the kind of diversity they want to optimise, e.g. unique content viewed, the balance of categories viewed or the representation of demographics and viewpoints across recommendations, and whether they care about each individual’s exposure or exposure across all users.

Longer-term metrics

Another issue identified is that most metrics used in the evaluation of recommendation systems, within public service media and beyond, are short-term metrics, measured in days or weeks, rather than years. Yet at least some of the goals of stakeholders will be longer-term than the metrics used to approximate them. Users may be interested in both immediate satisfaction and in discovering new content so they continue to be informed and entertained in the future. Businesses may both be trying to maximise quarterly profits and also trying to retain users into the future to maximise profits in the quarters to come.

Short-term metrics are not entirely ineffective at predicting long-term outcomes. Better outcomes right now could mean better outcomes months or years down the road, so long as the context the recommendation system is operating in stays relatively stable and the recommendation system itself doesn’t change user behaviour in ways that lead to poorer long-term outcomes.

By definition, long-term consequences take a longer time to occur, and thus there is a longer waiting period between a change in the recommendation system and the resulting change in outcome. A longer period between action and evaluation also means a greater number of confounding variables which make it more challenging to assess the causal link between the change in the system and the change in outcomes.

Dietmar Jannach, Professor at the University of Klagenfurt, highlighted this was a problem across academic and industry evaluations, and that ‘when Netflix changes the algorithms, they measure, let’s see, six weeks, two months to try out different things in parallel and look what happens. I’m not sure they know what happens in the long run.’[footnote]Interview with Dietmar Jannach, Professor, University of Klagenfurt (2021).[/footnote]

Methods

Simulation-based evaluation

One possible method to estimate long-term metrics is to use simulation-based offline evaluation approaches. In this approach, the developers use a virtual environment with a set of content which can be recommended and a user model which simulates the expected preferences of users based on parameters selected by the developers (which could include interests, demographics, time already spent on the product, previous interactions with the product etc.).[footnote]Ie, E., Hsu, C., Mladenov, M. et al. (2019). ‘RecSim: A Configurable Simulation Platform for Recommender Systems’. arXiv. Available at: https://doi.org/10.48550/arXiv.1909.04847[/footnote] The recommendation system under evaluation then makes recommendations to the user model, which generates a simulated response to each recommendation. The user model can also update its preferences in response to the recommendations it has received, e.g. a simulated user might become more or less interested in a particular category of content, and can track the simulated users’ overall satisfaction with the recommendations over time.
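
The toy simulation below gives a flavour of this approach: a simple user model with per-category preferences responds to, and drifts in response to, a naive recommender. The catalogue, preference dynamics and satisfaction measure are all invented for illustration; a production simulator would be far richer.

```python
# Toy simulation-based evaluation: simulate 100 recommendation steps and log
# the user model's satisfaction as its preferences drift.
import numpy as np

rng = np.random.default_rng(1)
n_categories = 5
item_catalogue = rng.integers(0, n_categories, size=200)   # each item belongs to a category

preferences = rng.dirichlet(np.ones(n_categories))          # user model: weight per category
satisfaction_log = []

def recommend(prefs: np.ndarray) -> int:
    """Naive recommender: pick an item from the user's current favourite category."""
    favourite = int(np.argmax(prefs))
    candidates = np.where(item_catalogue == favourite)[0]
    return int(rng.choice(candidates))

for step in range(100):
    item = recommend(preferences)
    category = item_catalogue[item]
    satisfaction_log.append(preferences[category])           # simulated response
    preferences[category] += 0.05                            # preferences drift towards what is shown
    preferences = preferences / preferences.sum()

print(f"mean simulated satisfaction over 100 steps: {np.mean(satisfaction_log):.2f}")
```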

Simulation-based evaluation provides some indication of how the dynamics of the recommendation system and changes to it might play out over a long period of time. It can evaluate how users respond to a series of recommendations over time and therefore whether a recommendation system could lead to audience satisfaction or diverse content exposure over a period longer than a single recommendation or user session. However, this approach still has many of the limitations of other kinds of offline evaluation. Historical user interaction data is still required to model the preferences of users, and that data is not neutral because it is itself the product of interaction with the previous system, including any previous recommendation system that was in place.

The user model is also only based on data from previous users, which might not generalise well to new users. Given that many of these recommendation systems are put in place to reach new audiences, specifically younger and more diverse audiences than those who currently use the service, the simulation-based evaluation might lead to unintentionally underserving those audiences and overfitting to existing user preferences.

Furthermore, the simulation can only model the impact of parameters coded into it by the developers. The simulation only reflects the world as a developer understands it, and may not reflect the real considerations users take into account in interacting with recommendation systems, nor the influences on user behaviour beyond the product.

This means that if there are unexpected shocks, exogenous to the recommendation system, that change user interaction behaviour to a significant degree, then the simulation will not take those factors into account. For example, a simulation of a news recommendation system’s behaviour in December 2019 would not be a good source of truth for a recommendation system in operation during the COVID-19 pandemic. The further the simulation tries to look ahead at outcomes, the more vulnerable it will be to changes in the environment that may invalidate its results.

User panels and retrospective feedback

After deployment, asking audiences for informed and retrospective feedback on their recommendations is a promising method for short-term and long-term recommendation system evaluation.[footnote]Stray, J., Adler, S. and Hadfield-Menell, D. (2020), ‘What are you optimizing for? Aligning Recommender Systems with Human Values’, pp. 4–5. Participatory Approaches to Machine Learning ICML 2020 Workshop (July 17). Available at: https://participatoryml.github.io/papers/2020/42.pdf[/footnote] This could involve asking the users to review, rate and provide feedback on a subsection of the recommendations they received over the previous month, in a similar manner to the subjective evaluations undertaken by the BBC Datalab. This would provide development and product teams with much more informative feedback than through A/B testing.

This could be particularly effective in the form of a representative longitudinal user panel which returns to the same audience members at regular intervals to get their detailed feedback on recommendations.[footnote]Stray, J. (2021). ‘Beyond Engagement: Aligning Algorithmic Recommendations With Prosocial Goals’. Partnership on AI. Available at: https://www.partnershiponai.org/beyond-engagement-aligning-algorithmic-recommendations-with-prosocial-goals/[/footnote] Participants in these panels should be compensated for their participation, to recognise the contribution they are making to the improvement of the system and to ensure long-term retention of participants. This would allow development and product teams to gauge how audience responses change over time, for example by seeing how panellists react to the same recommendations months later, including in response to changes to the underlying system over longer periods.

Case studies

Through two case studies, we examine how the differing prioritisation of values in different forms of public service media, and the differing nature of the content itself, manifest themselves in different approaches to recommendation systems. We will focus on the use of recommendation systems across BBC News for news content, and BBC Sounds for audio-on-demand.

Case study 1: BBC News

Introduction

BBC News is the UK’s dominant news provider and one of the world’s most influential news organisations.[footnote]This case study focuses on the parts of BBC News that function as a public service, rather than BBC Global News, the international commercial news division.[/footnote] It reaches 57% of UK adults every week and a global audience of 456 million adults. Its news websites are the most-visited English-language news websites on the internet.[footnote]As of 2021, BBC News on TV and radio reaches 57% of UK adults every week and, across all channels, BBC News reaches a weekly global audience of 456 million adults. See: BBC Media Centre. (2021). ‘BBC on track to reach half a billion people globally ahead of its centenary in 2022’. BBC Media Centre. Available at: https://www.bbc.co.uk/mediacentre/2021/bbc-reaches-record-global-audience; BBC News is equally influential globally within the domain of digital news. By one measure, the BBC News and BBC World News websites combined are the most-visited English-language news websites, receiving three to four times the website traffic of the New York Times, Daily Mail, or The Guardian, see: Majid, A. (2021). ‘Top 50 largest news websites in the world: Surge in traffic to Epoch Times and other right-wing sites’. Press Gazette. Available at: https://pressgazette.co.uk/top-50-largest-news-websites-in-the-world-right-wing-outlets-see-biggest-growth/; As of 2021, BBC News Online reaches 45% of UK adults every week, approximately triple the reach of its nearest competitors: The Guardian (17%), Sky News Online (14%) and the MailOnline (14%). Estimates of UK reach are based on a sample of 2,029 adults surveyed by YouGov (and their partners) using an online questionnaire at the end of January and beginning of February 2021. See: Reuters Institute for the Study of Journalism. Reuters Institute Digital News Report 2021, 10th Edition, p. 62. Available at: https://reutersinstitute.politics.ox.ac.uk/sites/default/files/2021-06/Digital_News_Report_2021_FINAL.pdf[/footnote] For most of the time that BBC News has had an online presence, it has not used any recommendation systems on its platforms.

In recent years, BBC News has taken a more experimental approach to recommendation systems, with a number of different systems for recommending news content developed, piloted and deployed across the organisation.[footnote]The team initially developed an experimental recommendation system for BBC Mundo, the BBC World Service’s Spanish-language news website. See: Panteli, M., Piscopo, A., Harland, A., Tutcher, J. and Moss, F. M. (2019). ‘Recommendation systems for news articles at the BBC’, p.1. CEUR Workshop Proceedings. Available at: http://ceur-ws.org/Vol-2554/paper_07.pdf; These are also live on BBC World Service websites in Russian, Hindi and Arabic and in beta on the BBC News App. See: Piscopo, A. (2021). ‘Building public service recommenders: Logbook of a journey’ [presentation recording]. The Academic Fringe Festival. Available at: https://www.youtube.com/watch?v=Q2EYAxX5Pnk; Al-Chueyr Martins, T. (2019). ‘Responsible Machine Learning at the BBC’ [presentation]. Available at: https://www.slideshare.net/alchueyr/responsible-machine-learning-at-the-bbc-194466504[/footnote]

Goal

For editorial teams, the goal of adding recommendation systems to BBC News was to augment editorial curation and make it easier to scale on a more personalised level. This addresses challenges relating to editors facing an ‘information overload’ of content to recommend. Additionally, product teams at BBC believed this feature would improve the discoverability of news content for different users.[footnote]Panteli, M., Piscopo, A., Harland, A., Tutcher, J. and Moss, F. M. (2019). ‘Recommendation systems for news articles at the BBC’, p. 4. CEUR Workshop Proceedings. Available at: http://ceur-ws.org/Vol-2554/paper_07.pdf[/footnote]

What did they build?

From around 2019, a team (which later became part of BBC Datalab) collaborated with a team building out the BBC News app to develop a content-to-content recommendation system. This focused on ‘onward journeys’ from news articles. Partway through each article, the recommendation system generated a section titled ‘You might be interested in’ (in the language relevant to that news website) that listed four recommended articles.[footnote]Interview with Alessandro Piscopo, Principal Data Scientist, BBC Datalab (2021).[/footnote]

Figure 2: BBC News ‘You might be interested in’ section (image courtesy of the BBC)

The recommendation system is combined with a set of business rules which constrain the set of articles that the system recommends content from. The rules aim to ensure ‘sufficient quality, breadth, and depth’ in the recommendations.[footnote]Piscopo, A. (2021). ‘Building public service recommenders: Logbook of a journey’ [presentation recording]. The Academic Fringe Festival. Available at: https://www.youtube.com/watch?v=Q2EYAxX5Pnk[/footnote]

For example, these included:

  • recency, e.g. only selecting content from the past few weeks
  • unwanted content, e.g. content in the wrong language
  • contempt of court
  • elections
  • children-safe content.

In an earlier project, this team had developed an experimental recommendation system for BBC Mundo, the BBC World Service’s Spanish-language news website.[footnote]Panteli, M., Piscopo, A., Harland, A., Tutcher, J. and Moss, F. M. (2019). ‘Recommendation systems for news articles at the BBC’, p. 4. CEUR Workshop Proceedings. Available at: http://ceur-ws.org/Vol-2554/paper_07.pdf[/footnote] Similar recommendation systems are also live on BBC World Service websites in Russian, Hindi and Arabic and in beta on the BBC News App.[footnote]Piscopo, A. (2021). ‘Building public service recommenders: Logbook of a journey’ [presentation recording]. The Academic Fringe Festival. Available at: https://www.youtube.com/watch?v=Q2EYAxX5Pnk; Al-Chueyr Martins, T. (2019). ‘Responsible Machine Learning at the BBC’ [presentation]. Available at: https://www.slideshare.net/alchueyr/responsible-machine-learning-at-the-bbc-194466504[/footnote]

Figure 3: BBC Mundo recommendation system (image courtesy of the BBC)

Figure 4: Recommendation system on BBC World Service website in Hindi (image courtesy of the BBC)

Criteria (and how they relate to public service values)

The BBC News team eventually settled on a content-to-content recommendation system using a text-representation technique called ‘tf-idf’ to encode article data (like text) and metadata (like the categorical tags that editorial teams gave the article) into vectors. Once articles were represented as vectors, similarity metrics could be applied to measure how close any two articles were to each other. The approach also made it possible to penalise more popular content.[footnote]Crooks, M. (2019). ‘A Personalised Recommender from the BBC’. BBC Data Science. Available at: https://medium.com/bbc-data-science/a-personalised-recommender-from-the-bbc-237400178494[/footnote]
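
The sketch below shows the general shape of a tf-idf, content-to-content recommender with a simple popularity penalty, using scikit-learn. The corpus, the form of the penalty and all names are illustrative assumptions and do not reflect the BBC’s implementation or business rules.

```python
# Content-to-content recommendations: encode article text as tf-idf vectors,
# rank other articles by cosine similarity, lightly penalising popular items.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

articles = [
    "budget tax economy chancellor spending",
    "economy inflation interest rates bank",
    "football cup final goal",
    "election poll vote parliament",
]
views = np.array([900, 400, 5000, 700])                # historical view counts per article

vectors = TfidfVectorizer().fit_transform(articles)    # article text -> tf-idf vectors
similarity = cosine_similarity(vectors)                # pairwise article similarity

def recommend(article_index: int, k: int = 2, popularity_weight: float = 0.1) -> list[int]:
    """Return the indices of the k best 'onward journey' candidates."""
    penalty = popularity_weight * np.log1p(views) / np.log1p(views).max()
    scores = similarity[article_index] - penalty
    scores[article_index] = -np.inf                    # never recommend the article itself
    return [int(i) for i in np.argsort(scores)[::-1][:k]]

print(recommend(0))  # articles most related to the budget story, e.g. [1, 3]
```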

The business rules the BBC used sought to ensure ‘sufficient quality, breadth, and depth’ in the recommendations, which aligns with the BBC’s values around universality and excellence.[footnote]Piscopo, A. (2021). ‘Building public service recommenders: Logbook of a journey’ [presentation recording]. The Academic Fringe Festival. Available at: https://www.youtube.com/watch?v=Q2EYAxX5Pnk[/footnote]

There was also an emphasis on the recommendation system needing to be easy to understand and explain. This can be attributed to BBC News being more risk-averse than other parts of the organisation.[footnote]Piscopo, A. (2021).[/footnote] Given the BBC’s mandate to be a ‘provider of accurate and unbiased information’, and that staff themselves identify BBC News as ‘the product that likely contributes most to its reputation as a trustworthy and authoritative media outlet’,[footnote]Panteli, M., Piscopo, A., Harland, A., Tutcher, J. and Moss, F. M. (2019). ‘Recommendation systems for news articles at the BBC’, p. 4. CEUR Workshop Proceedings. Available at: http://ceur-ws.org/Vol-2554/paper_07.pdf[/footnote] it is unsurprising that they would want to pre-empt any accusations of bias for an automated news recommendation system by making it understandable to audiences.

Evaluation

The Datalab team experimented with a number of approaches using a combination of content and user interaction data.

Early on, they found that a content-to-content approach to item recommendations was more suited to the editorial requirements for the product, and that user interaction data was therefore less relevant to the evaluation of the recommender, prompting a shift away from approaches that relied on it.

As they began to compare different content-to-content approaches, they found that performance on quantitative metrics often didn’t match the editorial teams’ priorities, and it was difficult to operationalise editorial judgement of public service value into metrics. As Alessandro Piscopo notes: ‘We did notice that in some cases, one of the recommender prototypes was going higher in some metrics and went to editorial and [they would] say well we just didn’t like it.’ And, ‘Sometimes it was just comments from editorial world, we want to see more depth. We want to see more breadth. Then you have to interpret what that means.’[footnote]Interview with Alessandro Piscopo, Principal Data Scientist, BBC Datalab (2021).[/footnote]

The Datalab team chose to take a subjective evaluation-first approach, whereby editors would directly compare and comment on the output of two recommendation systems. This approach allowed them to capture editorial preferences more directly and manually work towards meeting those preferences.

However, the team acknowledged its limitations upfront, particularly in terms of scale.[footnote]Interview with Alessandro Piscopo.[/footnote] They tried to pick articles that would bring up the most challenging cases, but editorial teams only have so much capacity to judge recommendations, and would struggle to assess every edge case or judge every recommendation. This issue would be even more acute if, in a future recommendation system, every article’s associated recommendations changed depending on the demographics of the audience member viewing it.

By May 2020, editorial had given sign-off and the recommendation system was deemed ready for initial deployment.[footnote]Piscopo, A. (2021). ‘Building public service recommenders: Logbook of a journey’ [presentation recording]. The Academic Fringe Festival. Available at: https://www.youtube.com/watch?v=Q2EYAxX5Pnk[/footnote] During deployment, the team took a ‘failsafe approach’, with weekly monitoring of the live version of the recommendation system by editorial staff, alongside A/B testing measuring the click-through rate on recommended articles. However, members of the development team noted that click-through rate was only a rough proxy for user satisfaction and said they were working to move beyond it.

Case Study 2: BBC Sounds

Introduction

BBC Sounds is the BBC’s audio streaming and download service for live radio, music, audio-on-demand and podcasts,[footnote]BBC. ‘What is BBC Sounds?’. Available at: https://www.bbc.co.uk/contact/questions/help-using-bbc-services/what-is-sounds[/footnote] replacing the BBC’s previous live and catch-up audio service, iPlayer Radio.[footnote]The BBC Sounds website replaced the iPlayer Radio website in October 2018; the BBC Sounds app was launched in beta in the United Kingdom in June 2018 and made available internationally in September 2020, with the iPlayer Radio app decommissioned for the United Kingdom in September 2019 and internationally in November 2020. See: BBC. (2018). ‘The next major update for BBC Sounds’ Available at: https://www.bbc.co.uk/blogs/aboutthebbc/entries/03e55526-e7b4-45de-b6f1-122697e129d9; BBC. (2018). ‘Introducing the first version of BBC Sounds’, Available at: https://www.bbc.co.uk/blogs/aboutthebbc/entries/bde59828-90ea-46ac-be5b-6926a07d93fb; BBC. (2020). ‘An international update on BBC Sounds and BBC iPlayer Radio’. Available at: https://www.bbc.co.uk/blogs/internet/entries/166dfcba-54ec-4a44-b550-385c2076b36b; BBC Sounds. ‘Why has the BBC closed the iPlayer Radio app?’. Available at: https://www.bbc.co.uk/sounds/help/questions/recent-changes-to-bbc-sounds/iplayer-radio-message[/footnote] A key difference between BBC Sounds and iPlayer Radio is that BBC Sounds was built with personalisation and recommendation as a core component of the product, rather than as a radio catch-up service.[footnote]In May 2019, six months after the launch of BBC Sounds, James Purnell, then Director of Radio & Education at the BBC, said that ‘“The [BBC Sounds] app, for instance, is built for personalisation, but is not yet fully personalised. This means that right now a user sees programmes that have not been curated for them. That is changing, as of this month in fact. By the autumn, Sounds will be highly personalised.’” See: BBC Media Centre. (2019). ‘Changing to stay the same – Speech by James Purnell, Director, Radio & Education, at the Radio Festival 2019 in London.’ Available at: https://www.bbc.co.uk/mediacentre/speeches/2019/bbc.com/mediacentre/speeches/2019/james-purnell-radio-festival/[/footnote]

Goal

The goals of BBC Sounds as a product team are:

  • increase the audience size of BBC Sounds’ digital products
  • increase the demographic breadth of consumption across BBC Sounds’ products, especially among the young[footnote]According to David Jones (Executive Product Manager, BBC Sounds, interviewed in 2021), his top-line KPI is to reach 900,000 members of the British population who are under 35 by March 2022. These numbers are determined centrally by BBC senior managers based on the BBC’s Service Licence for BBC Online and Red Button. See: BBC Trust. (2016). BBC Online and Red Button Service Licence. Available at: http://downloads.bbc.co.uk/bbctrust/assets/files/pdf/regulatory_framework/service_licences/online/2016/online_red_button_may16.pdf [/footnote]
  • convert ‘lighter users’ who only engage a certain number of times a week into regular users
  • enable users to more easily discover content from the more than 50 hours of new audio produced by the BBC on an hourly basis.

Product

BBC Sounds initially used a recommendation system outsourced to a third-party provider. The development team found it challenging to request changes from the external provider, and saw knowledge of the inner workings of the recommendation system and the ability to iterate quickly as valuable. The BBC decided it wanted to own the technology and the experience as a whole, and believed it could achieve better value for money for TV licence fee payers by bringing the system in-house. The BBC Datalab therefore developed a hybrid recommendation system for BBC Sounds, named Xantus.

BBC Sounds uses a factorisation machine approach, which mixes content matching and collaborative filtering. It draws on a user’s listening history, metadata about the content, and other users’ listening histories to make recommendations in two ways (a simplified sketch follows the list below):

  1. It recommends items whose metadata is similar to items the user has already listened to.
  2. It recommends items that have been listened to by people with otherwise similar listening histories.
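The sketch below illustrates the factorisation machine idea in general terms, not the Xantus system itself: a single feature vector concatenates user, item and metadata indicators (an assumed layout), and the score combines linear weights with pairwise interactions between latent feature vectors.

```python
# A minimal factorisation machine scoring sketch (assumed, not the BBC's model).
import numpy as np

rng = np.random.default_rng(0)
n_features = 6   # e.g. user indicator, item indicator, genre tags (assumed layout)
k = 4            # latent dimension

w0 = 0.0                              # global bias
w = rng.normal(size=n_features)       # linear weights, one per feature
V = rng.normal(size=(n_features, k))  # latent vectors used for pairwise interactions

def fm_score(x: np.ndarray) -> float:
    """Factorisation machine score for one user-item feature vector x."""
    linear = w0 + w @ x
    # Efficient pairwise-interaction term: 0.5 * sum_f [ (V^T x)_f^2 - ((V^2)^T x^2)_f ]
    interactions = 0.5 * np.sum((V.T @ x) ** 2 - (V ** 2).T @ (x ** 2))
    return float(linear + interactions)

# Example: a feature vector mixing a user indicator, an item indicator and a genre tag.
x = np.array([1, 0, 0, 1, 0, 1], dtype=float)
print(fm_score(x))
```

Because every feature, whether a user indicator, an item indicator or a metadata tag, has its own latent vector, the same interaction term can capture both content matching and collaborative signals.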

Figure 5: BBC Sounds’ ‘Recommended For You’ section (image courtesy of the BBC)

Figure 6: ‘Music Mixes’ on BBC Sounds (image courtesy of the BBC)

Criteria (and how they relate to public service media values)

On top of this factorisation machine approach sit a number of business rules. Some rules apply equally across all users and constrain the pool of content the system can recommend from, e.g. only selecting content from the past few weeks. Other rules apply after recommendations have been generated for an individual user and filter them based on specific information about that user, e.g. not recommending content the user has already consumed.

As of summer 2021, the business rules used in the BBC Sounds’ Xantus recommendation system were:[footnote]Note that the business rules are subject to change, and so the rules given here are intended to be an indicative example only, representing a snapshot of practice at one point in time. See: Al-Chueyr Martins, T. (2021). ‘From an idea to production: the journey of a recommendation engine’ [presentation recording]. MLOps London. Available at: https://www.youtube.com/watch?v=dFXKJZNVgw4[/footnote]

Non-personalised business rules:

  • Recency
  • Availability
  • Excluded ‘master brands’, e.g. particular radio channels[footnote]Smethurst, M. (2014). Designing a URL structure for BBC programmes. Available at: https://smethur.st/posts/176135860[/footnote]
  • Excluded genres
  • Diversification (1 episode per brand/series)

Personalised business rules:

  • Already seen items
  • Local radio (if not consumed previously)
  • Specific language (if not consumed previously)
  • Episode picking from a series
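As an illustration of how such rules might be applied in code (an assumed pipeline, not the BBC's implementation), the sketch below first constrains the candidate pool with non-personalised rules, then filters a user's ranked recommendations, and finally applies a diversification pass. All field names and thresholds are hypothetical.

```python
# A hypothetical business-rules pipeline: candidate filtering, per-user
# filtering, then diversification. Field names and thresholds are assumptions.
from datetime import datetime, timedelta, timezone

def apply_non_personalised_rules(items, excluded_brands, excluded_genres, max_age_days=28):
    """Recency, availability and excluded brands/genres, applied identically for all users."""
    cutoff = datetime.now(timezone.utc) - timedelta(days=max_age_days)
    return [
        i for i in items
        if i["published"] >= cutoff
        and i["available"]
        and i["brand"] not in excluded_brands
        and i["genre"] not in excluded_genres
    ]

def apply_personalised_rules(ranked, already_seen_ids):
    """Drop items this particular user has already consumed."""
    return [i for i in ranked if i["id"] not in already_seen_ids]

def diversify(ranked, max_per_brand=1):
    """Keep at most one episode per brand/series in the final list."""
    counts, out = {}, []
    for item in ranked:
        if counts.get(item["brand"], 0) < max_per_brand:
            counts[item["brand"]] = counts.get(item["brand"], 0) + 1
            out.append(item)
    return out
```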

Governance

Editorial staff and others help define the business rules for Sounds.[footnote]Interview with Kate Goddard, Senior Product Manager, BBC Datalab (2021).[/footnote] The product team adopted the business rules from the incumbent system, checked whether they still made sense in the context of the new system, and keeps them under constant review. Kate Goddard, Senior Product Manager, BBC Datalab, noted that:

‘Making sure you are involving [editorial values] at every stage and making sure there is strong collaboration between data scientists in order to define business rules to make sure we can find good items. For instance with BBC Sounds you wouldn’t want to be recommending news content to people that’s more than a day or two old and that would be an editorial decision along with UX research and data. So, it’s a combination of optimizing for engagement while making sure you are working collaboratively with editorial to make sure you have the right business rules in there.’

Evaluation

To decide whether to progress further with the prototype, the team used a process of subjective evaluation. The Datalab team showed editors recommendations generated by the new factorisation machine system head-to-head with those from the existing external provider and asked which of the two they preferred.[footnote]Interview with Alessandro Piscopo, Principal Data Scientist, BBC Datalab (2021).[/footnote] The editors preferred the factorisation machine system, which was then deployed into the live environment.

After deployment, UX testing, qualitative feedback and A/B testing were used to fine-tune the system. In their initial A/B tests, the team optimised for engagement, looking at click-throughs, play-throughs and play completions. In these tests, they achieved:[footnote]Al-Chueyr Martins, T. (2021). ‘From an idea to production: the journey of a recommendation engine’ [presentation recording]. MLOps London. Available at: https://www.youtube.com/watch?v=dFXKJZNVgw4[/footnote]

  • 59% increase in interactions in the ‘Recommended for You’ rail
  • 103% increase in interactions for under-35s.
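For readers unfamiliar with how such engagement uplifts are typically computed, the sketch below (with made-up numbers, not BBC data) compares click-through rates between a control and a test variant, reporting the relative uplift alongside a two-proportion z-test p-value.

```python
# A minimal A/B comparison sketch with illustrative numbers.
from math import sqrt
from statistics import NormalDist

def ab_uplift(clicks_a, views_a, clicks_b, views_b):
    """Relative uplift in click-through rate and a two-proportion z-test p-value."""
    ctr_a, ctr_b = clicks_a / views_a, clicks_b / views_b
    pooled = (clicks_a + clicks_b) / (views_a + views_b)
    se = sqrt(pooled * (1 - pooled) * (1 / views_a + 1 / views_b))
    z = (ctr_b - ctr_a) / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))        # two-sided p-value
    return (ctr_b - ctr_a) / ctr_a, p

uplift, p_value = ab_uplift(clicks_a=4_000, views_a=100_000,   # control rail (assumed)
                            clicks_b=6_400, views_b=100_000)   # new recommender (assumed)
print(f"relative uplift: {uplift:.0%}, p-value: {p_value:.3g}")
```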

 

Outstanding questions and areas for further research and experimentation

Through this research we have built up an understanding of the use of recommendation systems in public service media at the BBC and across Europe, as well as the opportunities and challenges that arise. This section offers recommendations to address some of the issues that have been raised and indicates areas beyond the scope of this project that merit further research. These recommendations are directed at the research community, including funders, regulators and public service media organisations themselves.

There is an opportunity for public service media to define a new, responsible approach to the development of recommendation systems that work to the benefit of society as a whole and offer an alternative to the paradigm established by big technology platforms. Some initiatives that are already underway could underpin this, such as the BBC’s Databox project with the University of Nottingham and subsequent work on developing personal data stores.[footnote]Sharp, E. (2021). ‘Personal data stores: building and trialling trusted data services’. BBC Research & Development. Available at: https://www.bbc.co.uk/rd/blog/2021-09-personal-data-store-research; Leonard, M. and Thompson, B. (2020). ‘Putting audience data at the heart of the BBC’. BBC Research & Development. Available at: https://www.bbc.co.uk/rd/blog/2020-09-personal-data-store-privacy-services[/footnote] These personal data stores primarily aim to address issues around data ownership and portability, but could also act as a foundation for more holistic recommendations across platforms and greater user control over the data used to recommend content to them.

But in making recommendations to public service media we recognise the pressures they face. In the course of this project, a real-terms cut to BBC funding has been announced and the corporation has said it will have to reduce the services it offers in response.[footnote]Hansard – Volume 707: debated on Monday 17 January 2022. ‘BBC Funding’. UK Parliament. Available at: https://hansard.parliament.uk//commons/2022-01-17/debates/7E590668-43C9-43D8-9C49-9D29B8530977/BBCFunding[/footnote] We acknowledge that, in the absence of new resources and faced with the reality of declining budgets, public service media organisations would have to cut other activities to carry out our suggestions. 

We therefore encourage both funders and regulators to support organisations to engage in public service innovation as they further explore the use of recommendation systems. Historically the BBC has set a precedent for using technology to serve the public good, and in doing so brought soft power benefits to the UK. As the UK implements its AI strategy, it should build on this strong track record and comparative advantage and invest in the research and implementation of responsible recommendation systems.

1. Define public service value for the digital age

Recommendation systems are designed to optimise against specific objectives. However, the development and implementation of recommendation systems is happening at a time when the concept of public service value and the role of public service media organisations in the wider media landscape is rapidly changing.

Although we make specific suggestions for approaches to these systems, unless public service media organisations are clear about their own identities and purpose, it will be difficult for them to build effective recommendation systems. It is essential that public service media revisit their values in the digital age, and articulate their role in the contemporary media ecosystem.

In the UK, significant work has already been done by Ofcom as well as the Digital, Culture, Media and Sport Select Committee to identify the challenges public service media face and offer new approaches to regulation. Their recommendations must be implemented so that public service media can operate within a paradigm appropriate to the digital age and build systems that address a relevant mission.

2. Fund a public R&D hub for recommendation systems and responsible recommendation challenges

There is a real opportunity to create a hub for the research and development of recommendation systems that are not tied to industry goals. This is especially important as recommendation systems are one of the prime use cases of behaviour modification technology, but research into it is impaired by lack of access to interventional data.[footnote]Greene, T., Martens, D. and Shmueli, G. (2022). ‘Barriers to academic data science research in the new realm of algorithmic behaviour modification by digital platforms’. Nature Machine Intelligence, 4, pp.323–330. Available at: https://www.nature.com/articles/s42256-022-00475-7[/footnote]

Existing academic work on responsible recommendations could be brought together into a public research hub on responsible recommendation technology, with the BBC as an industry partner. It could involve developing and deploying methods for democratic oversight of the objectives of recommendation systems and the creation and maintenance of useful datasets for researchers outside of private companies.

We recommend that the strategy for using recommendation systems in public service media should be integrated within a broader vision to make this part of a publicly accountable infrastructure for social scientific research.

Therefore, as part of UKRI’s National AI Research and Innovation (R&I) Programme, set out in the UK AI Strategy, it should fund the development of a public research hub on recommendation technology. This programme could also connect with the European Broadcasting Union’s PEACH project, which has similar goals and aims.

Furthermore, one of the programme’s aims is to create challenge-driven AI research and innovation programmes for key UK priorities. The arrival of Netflix in 2006 spurred the development of today’s recommendation systems. The UK could create new challenges to spur the development of responsible recommendation system approaches  encouraging a better information environment. For example, the hub could release a dataset and benchmark for a challenge on generating automatic labels for a dataset of news items.

3. Publish research into audience expectations of personalisation

There was a striking consensus in our interviews with public service media teams working on recommendation systems that personalisation was both wanted and expected by the audience. However, we were offered little evidence to support this belief. Research in this area is essential for a number of reasons.

  1. Public service media exist to serve the public. They must not assume they are acting in the public interest without any evidence of their audience’s views towards recommendation systems.
  2. The adoption of recommendation systems without evidence that they are either wanted or needed by the public raises the risk that public service media are blindly following a precedent set by commercial competitors, rather than defining a paradigm aligned to their own missions.
  3. Public service media have limited resources and multiple demands. It is not strategic to invest heavily in the development and implementation of these systems without an evidence base to support their added value.

If research into user expectations of recommendation systems does exist, the BBC should strive to make this public.

4. Communicate and be transparent with audiences

Although most public service media organisations profess a commitment to transparency about their use of recommendation systems, in practice there is limited effective communication with their audiences about where and how recommendation systems are being used.

What communication there is tends to adopt the language of commercial services, for example talking about ‘relevance’. In our interviews, we found that within teams there was no clear responsibility for audience communication. Staff often assumed that few people would want to know more, and that any information provided would only be accessed by a niche group of users and researchers.

However, we argue that public service organisations have a responsibility to explain their practices clearly and accessibly and to put their values of transparency into practice. This should not only help retain public trust at a time when scandals from big technology companies have understandably made people view algorithmic systems with suspicion, but also develop a new, public service narrative around the use of these technologies.

Part of this task is to understand what a meaningful explanation of a recommendation system looks like. Describing the inner workings of algorithmic decision-making is not only unfeasible but probably unhelpful. However, public service media can educate audiences about the interactive nature of recommendation systems. They can make salient the idea that, when consuming content through a recommendation system, audiences are in effect ‘voting with their attention’: their viewing behaviour is private, but at the same time it affects what the system learns and what others will be shown.

Public service media should invest time and research into understanding how to usefully and honestly articulate their use of recommendation systems in ways that are meaningful to their audiences.

This communication must not be one-way. There must be opportunities for audience members to give feedback and interrogate the use of the systems, and raise concerns where things have gone wrong.

5. Balance user control with convenience

However, transparency alone is not enough. Giving users agency over the recommendations they see is an important part of responsible recommendation. Simply giving users direct control over the recommendation system is an obvious and important first step, but it is not a universal solution.

Some interviewees pointed to evidence that the majority of users do not choose to use these controls and instead opt for the default setting. But there is also evidence that younger users are beginning to use a variety of accounts, browsers and devices, with different privacy settings and aimed at ‘training’ the recommendation algorithm to serve different purposes.

Many public service media staff we spoke with described providing this level of control. Some challenges that were identified include the difficulty of measuring how well the recommendations meet specific targets, as well as risks relating to the potential degradation of the user experience.

Firstly, some of our interviewees noted how it would be more difficult to measure how well the recommendation system is performing on dimensions such as diversity of exposure, if individual users were accessing recommendations through multiple accounts. Secondly, it was highlighted how recommendation systems are trained on user behavioural data, and therefore giving more latitude to users to intentionally influence the recommendations may give rise to negative dynamics that degrade the overall experience for all users over the long run, or even expose the system to hostile manipulation attempts.

While these are valid concerns, we believe that there is some space for experimentation, between giving users no control and too much control. For example, users could be allowed to have different linked profiles, and key metrics could be adjusted to take into account the content that is accessed across these profiles. Users could be more explicitly shown how to interact with the system to obtain different styles of recommendations, making it easy to maintain different ‘internet personas’. Some form of ongoing monitoring for detecting adversarial attempts at influencing recommendation choices could also be explored. We encourage the BBC to experiment with these practices and publish research on their findings.

Another trial worth exploring is allowing ‘joint’ user recommendation profiles, where recommendations are based on the aggregated interaction history and preferences of multiple individuals, such as a couple, a group of friends or a whole community. This would allow users to create their own communities and ‘opt in’ to who and what influences their recommendations in an intuitive way. This could be enabled by the kind of personal data stores being explored by the BBC and the Belgian broadcaster VRT.[footnote]Sharp, E. (2021). ‘Personal data stores: building and trialling trusted data services’. BBC Research & Development. Available at: https://www.bbc.co.uk/rd/blog/2021-09-personal-data-store-research[/footnote]

There are multiple interesting versions of this approach. In one version, you would see recommendations ‘meant’ for others and know it was a recommendation based on their preferences. In another version, users would simply be exposed to a set of unmarked recommendations based on all their combined preferences.
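A minimal sketch of how a joint profile might work, assuming a latent-factor recommender in which each user and item has a preference vector (all names and data below are illustrative): the members' vectors are averaged and items are ranked against the combined profile.

```python
# A hypothetical 'joint profile' sketch: average member preference vectors,
# then rank items against the combined vector.
import numpy as np

item_vectors = np.random.default_rng(1).normal(size=(100, 16))  # 100 items, 16-d latent space
user_vectors = {
    "alex": np.random.default_rng(2).normal(size=16),
    "sam": np.random.default_rng(3).normal(size=16),
}

def joint_recommendations(members, k=5):
    """Rank items against the mean of the members' preference vectors."""
    profile = np.mean([user_vectors[m] for m in members], axis=0)
    scores = item_vectors @ profile
    return np.argsort(scores)[::-1][:k]

print(joint_recommendations(["alex", "sam"]))
```

Averaging is only the simplest aggregation strategy; alternatives such as least-misery (ranking by the minimum predicted score across members) could also be trialled.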

Another potential approach to pilot would be to create different recommendation systems that coexist, allowing users to choose which they want to use, or offering different ones at different times of day or when significant events happen (e.g. switching to a different recommendation system in the run-up to an election, or overriding recommendations with breaking news). Such an approach might invite audiences to play a more active part in the formulation of recommendations and open up opportunities for experimentation, which would need to be balanced against the additional operational costs it would introduce.

6. Expand public participation

Beyond transparency or individual user choice and control over the parameters of the recommendation systems already deployed, users and wider society could also have greater input during the initial design of the recommendation systems and in the subsequent evaluations and iterations.

This is particularly salient for public service media organisations, as unlike private companies, which are primarily accountable to their customers and shareholders, public service media organisations see themselves as having a universal obligation to wider society. Therefore, even those who are not direct consumers of content should have a say in how public service media recommendations are shaped.

User panels

One approach to this, suggested by Jonathan Stray, is to create user panels that provide informed, retrospective feedback about live recommendation systems.[footnote]Stray, J. (2021). ‘Beyond Engagement: Aligning Algorithmic Recommendations With Prosocial Goals’. Partnership on AI. Available at: https://www.partnershiponai.org/beyond-engagement-aligning-algorithmic-recommendations-with-prosocial-goals/[/footnote] These would involve paying users for detailed, longitudinal data about their experiences with the recommendation system. 

This could involve daily questions about their satisfaction with their recommendations, or monthly reviews where users are shown a summary of their recommendations and their interactions with them. They could be asked how happy they are with the recommendations, how well their interests are served and how informed they feel.

This approach could provide new, richer and more detailed metrics for developers to optimise the recommendation systems against, which would potentially be more aligned with the interests of the audience. It might also open up the ability to try new approaches to recommendation, such as reinforcement learning techniques that optimise for positive responses to daily and monthly surveys.

Co-design

A more radical approach would be to involve audience communities directly in the design of the recommendation system. This could involve bringing together representative groups of citizens, analogous to citizens’ assemblies, which would have direct input into and oversight of the creation of public service media recommendation systems, creating a third core pillar in the design process alongside editorial and development teams. This is an approach proposed by the Media Reform Coalition’s Manifesto for a People’s Media.[footnote]Grayson, D. (2021). Manifesto for a People’s Media. Media Reform Coalition. Available at: https://drive.google.com/file/u/1/d/1_6GeXiDR3DGh1sYjFI_hbgV9HfLWzhPi/view?usp=embed_facebook[/footnote]

These groups would allow citizens to ask questions of the editors and developers about how the system is intended to work, what kinds of data inform it and what alternative approaches exist (including not using recommendation systems at all). They could then set out their requirements for the system and iteratively provide feedback on versions of the system as it is developed, in the same way that editorial teams have done, for example by providing qualitative feedback on recommendations produced by different systems.

7. Standardise metadata

Each public service media organisation should have a central function that standardises the format, creation and maintenance of metadata across the organisation.

Inconsistent, poor-quality metadata was consistently highlighted as a barrier to developing recommendation systems in public service media, particularly to developing more novel approaches that go beyond user engagement and try to create diverse feeds of recommendations.

Institutionalising the collection of metadata and making access to it more transparent across each individual organisation is an important investment in public service media’s future capabilities.
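As a purely illustrative example of what a standardised record might look like (the fields below are assumptions, not an actual BBC or EBU schema), a central metadata function could validate content entries against a shared structure such as:

```python
# A hypothetical standardised content metadata record with basic validation.
from dataclasses import dataclass, field
from datetime import datetime

@dataclass
class ContentMetadata:
    content_id: str
    title: str
    brand: str                       # e.g. programme brand or series (assumed field)
    genres: list[str]
    language: str                    # ISO 639-1 code, e.g. "en"
    published: datetime
    available_until: datetime | None = None
    editorial_tags: list[str] = field(default_factory=list)

    def validate(self) -> None:
        """Reject records that would be unusable for recommendation."""
        if not self.genres:
            raise ValueError(f"{self.content_id}: at least one genre is required")
        if len(self.language) != 2:
            raise ValueError(f"{self.content_id}: language must be an ISO 639-1 code")
```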

We also think it’s worth exploring how much metadata can be standardised across European media organisations. The European Broadcasting Union (EBU)’s ‘A European Perspective’ project is already trialling bringing together content from across different European public service media organisations onto a single platform, underpinned by the EBU’s PEACH system for recommendations and the EuroVOX toolkit for automated language services. Further cross-border collaboration could be enabled by sharing best practices among member organisations.

8. Create shared recommendation system resources

Some public service media organisations have found it valuable to have access to recommendations-as-a-service provided by the European Broadcasting Union (EBU) through their PEACH platform. This reduces the upfront investment required to start using the recommendation system and provides a template for recommendations that have already been tested and improved upon by other public service media organisations.

One area identified as valuable for the future development of PEACH was greater flexibility and customisation. For example, some asked for the ability to incorporate different concepts of diversity into the system and control the relative weighting of diversity. Others would have found it valuable to be able to incorporate more information on the public service value of content into the recommendations directly.

We also heard from several interviewees that they would value a similar repository for evaluating recommendation systems on metrics valued by public service media, including libraries in common coding languages, e.g. Python, and a number of worked examples for measuring the quality of recommendations. The development of this could be led by the EBU or a single organisation like the BBC.

This would help systematise the quantification of public service values and collate case studies of how values are quantified in practice. It would be best developed as an open-source repository that others outside of public service media could learn from and draw on. This would:

  • lower costs, making investment easier to justify
  • reduce the technical burden, making it easier for newer and smaller teams to implement
  • point to how these metrics are used elsewhere, reducing the burden of proof and making alternative approaches appear less risky
  • provide a source of existing ideas, meaning teams spend less time either devising their own (which might be suboptimal, only for them to discover that later) or wading through the technical literature.

Future public service media recommendation systems projects, and responsible recommendation system development more broadly, could then more easily evaluate their system against more sophisticated metrics than just engagement.
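As one example of the kind of metric such a shared, open-source repository could include, the sketch below computes intra-list diversity: the average pairwise dissimilarity of a recommended slate based on genre tags. The tags and the choice of Jaccard similarity are illustrative assumptions.

```python
# A minimal intra-list diversity metric over genre-tag sets.
from itertools import combinations

def jaccard(a: set, b: set) -> float:
    return len(a & b) / len(a | b) if a | b else 0.0

def intra_list_diversity(slate: list[set]) -> float:
    """Mean pairwise (1 - Jaccard similarity) over all item pairs in the slate."""
    pairs = list(combinations(slate, 2))
    if not pairs:
        return 0.0
    return sum(1 - jaccard(a, b) for a, b in pairs) / len(pairs)

# Genre-tag sets for a recommended slate of three items (illustrative).
slate = [{"news", "politics"}, {"news", "economy"}, {"drama"}]
print(intra_list_diversity(slate))   # higher values mean a more varied slate
```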

9. Create and empower integrated teams

When developing and deploying recommendation systems, public service media organisations need to integrate editorial and development teams from the start. This ensures that the goals of the recommendation system are better aligned with the organisation’s goals as a whole and ensures the systems augment and complement existing editorial expertise.

An approach that we have seen applied successfully is having two project leads, one with an editorial background and one with a technical development background, who are jointly responsible for the project.

Public service media organisations could also consider adopting a combined product and content team. This can ensure that both editorial and development staff have a shared language and common context, which can reduce the burden of communication and help staff feel that they share a common purpose, rather than fostering competition between different teams.

Methodology

To investigate our research questions, we adopted two main methods:

  1. Literature review
  2. Semi-structured interviews

Our literature review surveyed current approaches to recommendation systems, the motivations and risks in using them, and existing approaches and challenges in evaluating them. We then focused on reviewing existing public information on the operation of recommendation systems across European public service media, and the existing theoretical work and case studies on the ethical implications of the use of those systems.

In order to situate the use of these systems, we also surveyed the history and context of public service media organisations, with a particular focus on previous technological innovations and attempts at measuring values.

We also undertook 29 semi-structured interviews: 11 with current and former BBC staff members (eight current and three former) across engineering, product and editorial; nine with current and former staff from other public service media organisations and the European Broadcasting Union; and nine with external experts from academia, civil society and regulators.

Partner information and acknowledgements

This work was undertaken with support from the Arts and Humanities Research Council (AHRC).

This report was co-authored by Elliot Jones, Catherine Miller and Silvia Milano, with substantive contributions from Andrew Strait.

We would like to thank the BBC for their partnership on this project, and in particular, the following for their support, feedback and cooperation throughout the project:

  • Miranda Marcus, Acting Head, BBC News Labs
  • Tristan Ferne, Lead Producer, BBC R&D
  • George Wright, Head of Internet Research and Future Services, BBC R&D
  • Rhia Jones, Lead R&D Engineer for Responsible Data-Driven Innovation

We would like to thank the following colleagues for taking the time to be interviewed for this project:

  • Alessandro Piscopo, Principal Data Scientist, BBC Datalab
  • Anna McGovern, Editorial Lead for Recommendations and Personalisation, BBC
  • Arno van Rijswijk, Head of Data & Personalization, & Sarah van der Land, Digital Innovation Advisor, Nederlandse Publieke Omroep
  • Ben Clark, Senior Research Engineer, Internet Research & Future Services, BBC Research & Development
  • Ben Fields, Lead Data Scientist, Digital Publishing, BBC
  • David Caswell, Executive Product Manager, BBC News Labs
  • David Graus, Lead Data Scientist, Randstad Groep Nederland
  • David Jones, Executive Product Manager, BBC Sounds
  • Debs Grayson, Media Reform Coalition
  • Dietmar Jannach, Professor, University of Klagenfurt
  • Eleanora Mazzoli, PhD Researcher, London School of Economics
  • Francesco Ricci, Professor of Computer Science, Free University of Bozen-Bolzano
  • Greg Detre, Chief Product & Technology Officer, Filtered and former Chief Data Scientist, Channel 4
  • Jannick Kirk Sørensen, Associate Professor in Digital Media, Aalborg University
  • Jonas Schlatterbeck, Head of Content ARD Online & Head of Programme Planning, ARD
  • Jonathan Stray, Visiting Scholar, Berkeley Center for Human-Compatible AI
  • Kate Goddard, Senior Product Manager, BBC Datalab
  • Koen Muylaert, Head of Data Platform, VRT
  • Matthias Thar, Bayerische Rundfunk
  • Myrna McGregor, BBC Lead, Responsible AI+ML
  • Natalie Fenton, Professor of Media and Communications, Goldsmiths, University of London
  • Nic Newman, Senior Research Associate, Reuters Institute for the Study of Journalism
  • Olle Zachrison, Deputy News Commissioner & Head of Digital News Strategy, Swedish Radio
  • Sébastien Noir, Head of Software, Technology and Innovation, European Broadcasting Union and Dmytro Petruk, Developer, European Broadcasting Union
  • Sophie Chalk, Policy Advisor, Voice of the Listener & Viewer
  • Uli Köppen, Head of AI + Automation Lab, Co-Lead BR Data, Bayerische Rundfunk


Understanding public attitudes towards artificial intelligence (AI), and how to involve people in decision-making about AI, is becoming ever-more urgent in the UK and internationally. As new technologies are developed and deployed, and governments move towards proposals for AI regulation, policymakers and industry practitioners are increasingly navigating complex trade-offs between opportunities, risks, benefits and harms.

Taking into account people’s perspectives and experiences in relation to AI – alongside expertise from policymakers and technology developers and deployers – is vital to ensure AI is aligned with societal values and needs, in ways that are legitimate, trustworthy and accountable.

As the UK Government and other jurisdictions consider AI governance and regulation, it is imperative that policymakers have a robust understanding of relevant public attitudes and how to involve people in decisions.

This rapid review is intended to support policymakers – in the context of the UK AI Safety Summit and afterwards – to build that understanding. It brings together a review of evidence about public attitudes towards AI that considers the question: ‘What do the public think about AI?’ In addition, it provides knowledge and methods to support policymakers to meaningfully involve the public in current and future decision-making around AI.

Introduction

Why is it important to understand what the public think about AI?

We are experiencing rapid development and deployment of AI technologies and heightened public discourse on their opportunities, benefits, risks and harms. This is accompanied by increasing interest in public engagement and participation in policy decision-making, described as a ‘participatory turn’ or ‘deliberative wave’.

However, there is some hesitation around the ability or will of policy professionals and governments to consider the outcomes of these processes meaningfully, or to embed them into policies. Amid accelerated technological development and efforts to develop and coordinate policy, public voices are still frequently overlooked or absent.

The UK’s global AI Safety Summit in November 2023 invites ‘international governments, leading AI companies and experts in research’ to discuss how coordinated global action can help to mitigate the risks of ‘frontier AI’.[1] Making AI safe requires ‘urgent public debate’.[2]These discussions must include meaningful involvement of people affected by AI technologies.

The Ada Lovelace Institute was founded on the principle that discussions and decisions about AI cannot be made legitimately without the views and experiences of those most impacted by the technologies. The evidence from the public presented in this review demonstrates that people have nuanced views, which change in relation to perceived risks, benefits, harms, contexts and uses.

In addition, our analysis of existing research shows some consistent views:

  • People have positive attitudes about some uses of AI (for example, in health and science development).
  • There are concerns about AI for decision-making that affects people’s lives (for example, eligibility for welfare benefits).
  • There is strong support for the protection of fundamental rights (for example, privacy).
  • There is a belief that regulation is needed.

The Ada Lovelace Institute’s recent policy reports Regulating AI in the UK[3] and Foundation models in the public sector[4] have made the case for public participation and civil society involvement in the regulation of AI and governance of foundation models. Listening to and engaging the public is vital not only to make AI safe, but also to make sure it works for individual people and wider society.

Why is public involvement necessary in AI decision-making?

This rapid review of existing research with different publics, predominantly in the UK, shows consistency across a range of studies as to what the public think about different uses of AI, as well as providing calls to action for policymakers. It draws important insights from existing evidence that can help inform just and equitable approaches to developing, deploying and regulating AI.

This evidence must be taken into account in decision-making about the distribution of emerging opportunities and benefits of AI – such as the capability of systems to develop vaccines, identify symptoms of diseases like cancers and help humans adapt to the realities of climate change. It should also be considered in decision-making to support governance of AI-driven technologies that are already in use today in ways that permeate the everyday lives of individuals and communities, including people’s jobs and the provision of public services like healthcare, education or welfare.

This evidence review demonstrates that listening to the public is vital in order for AI technologies and uses to be trustworthy. It also evidences a need for more extensive and deeper research on the many uses and impacts of AI across different publics, societies and jurisdictions. Public views point towards ways to harness the benefits and address the challenges of AI technologies, as well as to the desire for diverse groups in society to be involved in how decisions are made.

In summary, the evidence that follows presents an opportunity for policymakers to listen to and engage with the views of the public, so that policy can navigate effectively the complex and fast-moving world of AI with legitimacy, trustworthiness and accountability in decision-making processes.

What this rapid evidence review does and does not do

This review brings together research conducted with different publics by academics, researchers in public institutions, and private companies, assessed against the methodological rigour of each research study. It addresses the following research questions:

  • What does the existing evidence say about people’s views on AI?
  • What methods of public engagement can be used by policymakers to involve the public meaningfully in decisions on AI?

As a rapid evidence review, this is not intended to be a comprehensive and systematic literature review of all available research. However, we identify clear and consistent attitudes, drawn from a range of research methods that should guide policymakers’ decision-making at this significant time for AI governance.

More detail is provided in the ‘Methodology’ section.

How to read this review

…if you’re a policymaker or regulator concerned with AI technologies:

The first part of this review summarises themes identified in our analysis of evidence relating to people’s views on AI technologies. The headings in this section synthesise the findings into areas that relate to current policy needs.

In the second part of the report, we build on the findings to offer evidence-based solutions for how to meaningfully include the views of the public in decision-making processes. The insights come from this review of evidence alongside research into public participation.

The review aims to support policymakers in understanding more about people’s views on AI, about different kinds of public engagement and in finding ways to involve the public in decisions on AI uses and regulation.

…if you’re a developer or designer building AI-driven technologies, or a deployer or organisation using them or planning to incorporate them:

Read Findings 1 to 5 to understand people’s expectations, hopes and concerns for how AI technologies need to be designed and deployed.

Findings 6 and 7 will support understanding of how to include people’s views in the design and evaluation of technologies, to make them safer before deployment.

…if you’re a researcher, civil society organisation, public participation practitioner or member of the public interested in technology and society:

We hope this review will be a resource to take stock of people’s views on AI from evidence across a range of research studies and methods.

In addition to pointing out what the evidence shows so far, Findings 1 to 6 also indicate gaps and omissions, which are designed to support the identification of further research questions to answer through research or public engagement.

Clarifying terms

The public

Our societies are diverse in many ways, and historic imbalances of power mean that some individuals and groups are more represented than others in both data and technology use, and more exposed than others to the opportunities, benefits, risks or harms of different AI uses.

 

There are therefore many publics whose views matter in the creation and regulation of AI. In this report, we refer to ‘the public’ to distinguish citizens and residents from other stakeholders, including the private sector, policy professionals and civil society organisations. We intentionally use the singular form of ‘public’ as a plural (‘the public think’), to reinforce the implicit acknowledgement that society includes many publics with different levels of power and lived experiences.

Safety

While the UK’s AI Safety Summit of 2023 has been framed around ‘safety’, there is no consensus definition of this term, and there are many ways of thinking about risks and harms from AI. The idea of ‘safety’ is employed in other important domains – like medicines, air travel and food – to ensure that systems and technologies enjoy public trust. As AI increasingly forms a core part of our digital infrastructure, our concept of AI safety will need to be similarly broad.[5]

 

The evidence in this report was not necessarily or explicitly framed by questions about ‘safety’. It surfaces people’s views about the potential or perceived opportunities, benefits, risks and harms presented by different uses of AI. People’s lived experience and views on AI technologies are useful for understanding what safety might mean in this broader scope, and where policymakers’ attention – for example on national security – does not reflect diverse publics’ main concerns.

AI and AI systems

We use the UK Data Ethics Framework’s definition to understand AI systems, which they describe as technologies that ‘carry out tasks that are commonly thought to require human intelligence. [AI systems] deploy digital tools to find repetitive patterns in very large amounts of data and use them, in various ways, to perform tasks without the need for constant human supervision’.[6]

 

With this definition in mind, our analysis of attitudes to AI includes attitudes to data because data and data-driven technologies (like artificial intelligence and computer algorithms) are deeply intertwined, and AI technologies are underpinned by data collection, use, governance and deletion. In this review, we focus on research into public attitudes towards AI specifically, but draw on research about data more broadly where it is applicable, relevant and informative.

Expectations

Public attitudes research often describes public ‘expectations’. Where we have reported what the public ‘expect’ in this review, our interpretation of this term means what the public feel is required from AI practices and regulation. ‘Expectation’, in this sense, does not refer to what people predict will happen.

 

Summary of findings

What do the public think about AI?

  • 1.  Public attitudes research is consistent in showing what the public think about some aspects of AI, which the findings below identify. This evidence is an opportunity for policymakers to ensure the views of the public are included in next steps in policy and regulation.
  • 2. There isn’t one ‘AI’: the public have nuanced views and differentiate between benefits, opportunities, risks and harms of existing and potential uses of different technologies.
    • The public have nuanced views about different AI technologies.
    • Some concerns are associated with socio-demographic differences.
  • 3. The public welcome AI uses when they can make tasks efficient, accessible and supportive of public benefit, but they also have specific concerns about other uses and effects, especially where uses of AI replace human decision-making in ways that affect people’s lives.
    • The public recognise potential benefits of uses of AI relating to efficiency, accessibility and working for the public good.
    • The public are concerned about an overreliance on technology over professional judgement and human communication.
    • Public concerns relate to the impacts of AI uses on jobs, privacy and societal inequalities.
    • In relation to foundation models: existing evidence from the public indicates that they have concerns about uses beyond mechanical, low-risk analysis tasks, and around their impact on jobs.
  • 4. Regulation and the way forward: people have clear views on how to make AI work for people and society.
    • The evidence is consistent in showing a demand for regulation of data and AI that is independent and has ‘teeth’.
    • The public are less trusting of private industry developing and regulating AI-driven technologies than other stakeholders.
    • The public are concerned about ethics, privacy, equity, inclusiveness, representativeness and non-discrimination. The use of data-driven technologies should not exacerbate unequal social stratification or create a two-tiered society.
    • Explainability of AI-driven decisions is important to the public.
    • The public want to be able to address and appeal decisions determined by AI.
  • 5. People’s involvement: people want to have a meaningful say over decisions that affect their everyday lives.
    • The public want their views and experiences to be included in decision-making processes.
    • The public expect to see diversity in the views that are included and heard.

How can involving the public meaningfully in decision-making support safer AI?

  • 6. There are important gaps in research with underrepresented groups, those impacted by specific AI uses, and in research from different countries.
    • Different people and groups, such as young people or people from minoritised ethnic communities, have distinct views about AI.
    • Some people, groups and parts of the world are underrepresented in the evidence.

 

  • 7. There is a significant body of evidence that demonstrates ways to meaningfully involve the public in decision-making, but making this happen requires a commitment from decision-makers to embed participatory processes.
    • Public attitudes research, engagement and participation involve distinct methods that deliver different types of evidence and outcomes.
    • Complex or contested topics need careful and deep public engagement.
    • Deliberative and participatory engagement can provide informed and reasoned policy insights from diverse publics.
    • Using participation as a consultative or tick-box exercise risks the trustworthiness, legitimacy and effectiveness of decision-making.
    • Empirical practices, evidence and research on embedding participatory and deliberative approaches can offer solutions to policymakers.

 

Different research methods, and the evidence they produce

There are three principal types of evidence in this review:

  1. Representative surveys, which give useful, population-level insights but can be consultative (meaning participants have low agency) for those involved.
  2. Deliberative research, which enables informed and reasoned policy conclusions from groups reflective of a population (meaning a diverse group of members of the public).
  3. Co-designed research, which can embed people’s lived experiences into research design and outputs, and make power dynamics (meaning knowledge and agency) between researchers and participants more equitable.

 

Different methodologies surface different types of evidence. Table 1 in the Appendix summarises some of the strengths of different research methods included in this evidence review.

 

Most of the evidence in this review is from representative surveys (14 studies), followed by deliberative processes (nine processes) and qualitative interviews and focus groups (six studies). In addition, there is one study involving peer research. The smaller number of deliberative studies compared to quantitative research, alongside the evidence included in Finding 7, may indicate the need for more in-depth public engagement methods.

Detailed findings

What do the public think about AI?

Finding 1: Public attitudes research is consistent in showing what the public think about some aspects of AI, which the findings below identify.

 

This evidence is an opportunity for policymakers to ensure the views of the public are included in next steps in policy and regulation.

Our synthesis of evidence shows there is consistency in public attitudes to AI across studies using different methods.

These include positive attitudes about some uses of AI (for example, advancing science and some aspects of healthcare), concerns about AI making decisions that affect people’s lives (for example, assessing eligibility for welfare benefits), strong support for the protection of fundamental rights (for example, privacy) and a belief that regulation is needed.

The evidence is consistent in showing concerns with the impact of AI technologies in people’s everyday lives, especially when these technologies replace human judgement. This concern is particularly evident in decisions with substantial consequences on people’s lives, such as job recruitment and access to financial support; when AI technologies replace human compassion in contexts of care; or when they are used to make complex and moral judgements that require taking into account soft factors like trust or opportunity. People are also concerned about privacy and the normalisation of surveillance.

The evidence is consistent in showing a demand for public involvement and for diverse views to be meaningfully engaged in decision-making related to AI uses.

We develop these views in detail in the following findings and reference the studies that support them.

Finding 2: There isn’t one ‘AI’

 

The public have nuanced views and differentiate between benefits, opportunities, risks and harms of existing and potential uses of different technologies

The public have nuanced views about different AI technologies

  • The public see some uses of AI as clearly beneficial. This was an insight from the joint Ada Lovelace Institute and The Alan Turing Institute’s research report How do people feel about AI?, which asked about specific AI-driven technologies.[7] In the nationally representative survey of the British public, people identified 11 of the 17 technologies we asked about as either somewhat or very beneficial. The use of AI for detecting the risk of cancer was seen as beneficial by nine out of ten people.
  • The public see some uses of AI as concerning. The same survey found the public also felt other uses were more concerning than beneficial, like advanced robotics or targeted advertising. Uses in care were also viewed as concerning by around half of people, with 55% either somewhat or very concerned by virtual healthcare assistants, and 48% by robotic care assistants. In a separate qualitative study, members of the UK public suggested that ‘the use of care robots would be a sad reflection of a society that did not value care givers or older people’.[8]
  • Overall, the public can simultaneously perceive the benefits and the risks presented by most applications of AI. More importantly, the public identify concerns to be addressed across all technologies, even those seen as broadly beneficial, as found in How do people feel about AI? by the Ada Lovelace Institute and The Alan Turing Institute.[9] Similarly, a recent survey in the UK by the Office for National Statistics found that, when people were asked to rate whether AI would have a positive or negative impact on society, the most common response was in the middle of the scale, i.e. neutral.[10] In a recent qualitative study in the UK, USA and Germany, participants also ‘saw benefits and concerns in parallel: even if they had a concern about a particular AI use case, they could recognise the upsides, and vice versa’.[11] Other research, including both surveys and qualitative studies in the USA[12] [13] and Germany,[14] has also found mixed views depending on the application of AI.

This nuance in views, depending on the context in which a technology is used, is illustrated by one of the comments of a juror during the Citizens’ Biometrics Council:

‘Using it [biometric technology] for example to get your money out of the bank, is pretty uncontroversial. It’s when other people can use it to identify you in the street, for example the police using it for surveillance, that has another range of issues.’
– Juror, The Citizens’ Biometrics Council[15]

 Some concerns are associated with socio-demographic differences

  • Higher awareness and higher levels of education and information are associated with more concern about some types of technologies, and individual differences in education levels can exacerbate concerns. The 2023 survey of the British public How do people feel about AI? found that those who have a degree-level education and feel more informed about technology are less likely to think that technologies such as facial recognition, eligibility technologies and targeted advertising on social media are beneficial.[16] A prior BEIS Public Attitudes Tracker reported similar findings.[17]
  • In the USA, the Pew Research Center found in 2023 that ‘those who have heard a lot about AI are 16 points more likely now than they were in December 2022 to express greater concern than excitement about it.’[18] Similarly, existing evidence suggests that public concerns around data should not be dismissed as uninformed,[19] which runs counter to the assumption that the more people know about a technology, the more they will support it.

Finding 3: The public welcome AI uses that can make tasks efficient, accessible and supportive of public benefit

 

But they also have specific concerns about other uses and effects, especially where AI replaces human decision-making in ways that affect people’s lives.

The public recognise potential benefits of uses of AI relating to efficiency, accessibility and working for the public good

  • The public see the potential of AI-driven technologies in improving efficiency including speed, scale and cost-saving potential for some tasks and applications. They particularly welcome its use in mechanical tasks,[20] [21] in health, such as for early diagnosis, and in the scientific advancement of knowledge.[22] [23] [24] [25] For example, a public dialogue on health data by the NHS AI Lab found that the perceived benefits identified by participants included ‘increased precision, reliability, cost-effectiveness and time saving’ and that ‘through further discussion of case studies about different uses of health data in AI research, participants recognised additional benefits including improved efficiency and speed of diagnosis’.[26]
  • Improving accessibility is another potential perceived benefit of some AI uses, although other uses can also compromise it. For example, How do people feel about AI? by the Ada Lovelace Institute and The Alan Turing Institute found that accessibility was the most commonly selected benefit for robotic technologies that can make day-to-day activities easier for people.[27] These technologies included driverless cars and robotic vacuum cleaners. However, there is also a view that these benefits may be compromised due to digital divides and inequalities. For example, members of the Citizens’ Biometrics Council, who reconvened in November 2022 to consider the Information Commissioner’s Office (ICO)’s proposals for guidance on biometrics, raised concerns that while there is potential for biometrics to make services more accessible, an overreliance on poorly designed biometric technologies would create more barriers for people who are disabled or digitally excluded.[28]
  • For controversial uses of AI, such as certain uses of facial recognition or biometrics, there may be support when the public benefit is clear. The Citizens’ Biometrics Council that Ada convened in 2021 felt the use of biometrics was ‘more ok’ when it was in the interests of members of the public as a priority, such as in instances of public safety and health.[29] However, they concluded that the use of biometrics should not infringe people’s rights, such as the right to privacy. They also asked for safeguards related to regulation as described in Finding 4, such as independent oversight and transparency on how data is used, as well as addressing bias and discrimination or data management. The 2023 survey by the Ada Lovelace Institute and The Alan Turing Institute, How do people feel about AI?, found that speed was the main perceived benefit of facial recognition technologies, such as its use to unlock a phone, for policing and surveillance and at border control. But participants also raised concerns related to false accusations or accountability for mistakes.[30]

The public are concerned about an overreliance on technology over professional judgement and human communication

‘“Use data, use the tech to fix the problem.” I think that’s very indicative of where we’re at as a society at the moment […] I don’t think that’s a good modality for society. I don’t think we’re going down a good road with that.’
– Jury member, The rule of trust[31]

  • There are concerns in the evidence reviewed that an overreliance on data-driven systems will affect people’s agency and autonomy.[32] [33] [34] Relying on technology over professional judgement seems particularly concerning for people when AI is applied to eligibility, scoring or surveillance, because of the risk of discrimination and not being able to explain decisions that have high stakes, including those related to healthcare or jobs.[35] [36] [37] [38]
  • The nationally representative survey of the British public How do people feel about AI? found that not being able to account for individual circumstances was a concern related to this loss of agency. For example, almost two thirds (64%) of the British public were concerned that workplaces would rely too heavily on AI over professional judgement for recruitment.
  • Qualitative studies help to explain that this concern relates to a fear of losing autonomy, as well as to fairness in important decisions, even when people can see the benefits of some uses. For example, in a series of workshops conducted in the USA, a participant said: ‘[To] have your destiny, or your destination in life, based on mathematics or something that you don’t put in for yourself… to have everything that you worked and planned for based on something that’s totally out of your control, it seems a little harsh. Because it’s like, this is what you’re sent to do, and because of [an] algorithm, it sets you back from doing just that. It’s not fair.’[39]
  • Autonomy remains important, even when technologies are broadly seen as beneficial. Research by the Centre for Data Ethics and Innovation (CDEI) found that, even when the benefits of AI were broadly seen to outweigh the risks in terms of improving efficiency, the risks are more front-of-mind, with strong concern about societal reliance on AI and where this may leave individuals and their autonomy.[40]
  • There is a concern that algorithm-based decisions are not appropriate for making complex and moral judgements, and that they will generate ‘false confidence in the quality, reliability and fairness of outputs’.[41] [42] A study involving workshops in Finland, Germany, the UK and the USA gave as examples of these complex or moral judgements those that ‘moved beyond assessment of intangibles like soft factors, to actions like considering extenuating circumstances, granting leniency for catastrophic events in people’s lives, ‘giving people a chance’, or taking into account personal trust’.[43] A participant from Finland said: ‘I don’t believe an artificial intelligence can know whether I’m suitable for some job or not.’[44]
  • Research with the public also shows concerns that an overreliance on technology will result in a loss of compassion and the human touch in important services like healthcare.[45] [46] This concern is also raised in relation to technologies using foundation models: ‘Imagine yourself on that call. You need the personal touch for difficult conversations.’[47]

Concerns also relate to the impacts of uses of AI on jobs, privacy or societal inequalities

  • Public attitudes research also finds some concern about job loss or reduced job opportunities for some applications of AI. For example, in a recent survey of the British public, the loss of jobs was identified as a concern by 46% of participants in relation to the use of robotic care assistants and by 47% in relation to facial recognition at border control as this would replace border staff.[48] Fear of the replacement or loss of some professions is also echoed in research from other countries in Europe[49] [50] and from the USA.[51] [52] For example, survey results from 2023 found that nearly two fifths of American workers are worried that AI might make some or all of their job duties obsolete.[53]
  • The public care about privacy and how people’s data is used, especially for the use of AI in everyday technologies such as smart speakers or for targeted advertising in social media.[54] [55] For example, the survey How do people feel about AI? found that over half (57%) of participants are concerned that smart speakers will gather personal information that could be shared with third parties, and that 68% are concerned about this for targeted social media adverts.[56] Similarly, the 2023 survey by the Pew Research Center in the USA found that people’s concerns about privacy in everyday uses of AI are growing, and that this increase relates to a perceived lack of control over people’s own personal information.[57]
  • The public have also raised concerns about how some AI uses can be a threat to people’s rights, including the normalisation of surveillance.[58] Jurors in a deliberation on governance during pandemics were concerned about whether data collected during a public health crisis – in this case, the COVID-19 pandemic – could subsequently be used to surveil, profile or target particular groups of people. In addition, survey findings from March 2021 showed that minority ethnic communities in the UK were more concerned than white respondents about legal and ethical issues around vaccine passports.[59] In the workplace, whether in an office or working remotely, over a third (34%) of American workers were worried that their ‘employer uses technology to spy on them during work hours’, regardless of whether or not they report knowing they were being monitored at work.[60]
  • The public also care about the risk that data-driven technologies exacerbate inequalities and biases. Deliberative engagements ask for proportionality and a context-specific approach to the use of AI and data-driven technologies.[61] [62] For example, bias and justice were core themes raised by the Citizens’ Biometrics Council that Ada convened in 2021. The members of the jury made six recommendations to address bias, discrimination and accuracy issues, such as ensuring technologies are accurate before they are deployed, fixing them to remove bias and taking them through an Ethics Committee.[63]

‘There is a stigma attached to my ethnic background as a young Black male. Is that stigma going to be incorporated in the way technology is used? And do the people using the technologies hold that same stigma? It’s almost reinforcing the fact that people like me get stopped for no reason.’
– Jury member, The Citizens’ Biometrics Council[64]

Foundation models: existing evidence indicates that the public have concerns about uses beyond mechanical, low-risk analysis tasks, and about the impact of these models on jobs

The evidence from the public so far on foundation models[65] is consistent with attitudes to other applications of AI. People can see both benefits and disadvantages relating to these technologies, some of which overlap with attitudes towards other applications of AI, while others are specific to foundation models. However, evidence from the public about these technologies is limited, and more public participation is needed to better understand how the public feel foundation models should be developed, deployed and governed. The evidence below is from a recent qualitative study by the Centre for Data Ethics and Innovation (CDEI).[66]

  • People see the role of foundation models as potentially beneficial in assisting and augmenting mechanical, low-stakes human capabilities, rather than replacing them.[67] For example, participants in this study saw foundation models as potentially beneficial when they were doing data synthesis or analysis tasks. This could include assisting policymaking by synthesising population data or advancing scientific research by speeding up analysis or finding new patterns in the data, which were some of the potential uses presented to participants in the study.

 

‘This is what these models are good at [synthesising large amounts of population data]… You don’t need an emotional side to it – it’s just raw data.’
– Interviewee, Public perceptions of foundation models[68]

 

  • Similar concerns around job losses found in relation to other applications of AI were raised by participants in the UK in relation to technologies built on foundation models.[69] There was concern that the replacement of some tasks by technologies based on foundation models would also mean workers lose the critical skills to judge whether a foundation model was doing a job well.
  • Concerns around bias extend to technologies based on foundation models. Bias and transparency were front of mind: ‘[I want the Government to consider] transparency – we should be declaring where AI has been applied. And it’s about where the information is coming from, ensuring it’s as correct as it can be and mitigating bias as much as possible.’ There was a view that bias could be mitigated by ensuring that the data training these models is cleaned so that it is accurate and representative.[70]
  • There are additional concerns about trade-offs between accuracy and speed when using foundation models. The inaccuracy of foundation models is a key concern among members of the public. This inaccuracy would require checks that may compromise potential benefits such as speed, making the process less efficient. As one participant working in education said: ‘I don’t see how I feed the piece of [homework] into the model. I don’t know if in the time that I have to set it up and feed it the objectives and then review afterwards, whether I could have just done the marking myself?’[71]
  • People are also concerned by the inability of foundation models to provide emotional intelligence. The lack of emotional intelligence and inability to communicate like a human, including understanding non-verbal cues and communication in context, was another concern raised in the study from the Centre for Data Ethics and Innovation, which meant participants did not see technologies based on foundation models as useful in decision-making.[72]

 

‘The emotional side of things… I would worry a lot as people call because they have issues. You need that bit of emotional caring to make decisions. I would worry about the coldness of it all.’
– Interviewee, Public perceptions of foundation models[73]


Finding 4: Regulation and the way forward: People have clear views on how to make AI work for people and society.

 

The evidence is consistent in showing a demand for regulation of data and AI that is independent and has ‘teeth’.

  • The public demand regulation around data and AI.[74] [75] [76] [77] Within the specific application of AI systems in biometric technologies, the Citizens’ Biometrics Council felt an independent body is needed to bring governance and oversight together in an otherwise crowded ecosystem of different bodies working towards the same goals.[78] The Council felt that regulation should also be able to enforce penalties for breaches in the law that were proportionate to the severity of such breaches, surfacing a desire for regulation with ‘teeth’. The Ada Lovelace Institute’s three-year project looking at COVID-19 technologies highlighted that governance and accountability measures are important for building public trust in data-driven systems.[79]
  • The public want regulation to represent their best interests. Deliberative research from the NHS AI Lab found that: ‘Participants wanted to see that patients’ and the public’s best interests were at the heart of decision-making and that there was some level of independent oversight of decisions made.’[80] Members of the Ada Lovelace Institute’s citizens’ jury on data governance during a pandemic echoed this desire for an independent regulatory body that can hold data-driven technology to account, adding that they would value citizen representation within such a body.[81]
  • Independence is important. The nationally representative public attitudes survey How do people feel about AI? revealed that more people felt an independent regulator, rather than other bodies such as private companies or the Government, was best placed to ensure AI is used safely.[82] This may reflect differential relations of trust and trustworthiness between civil society and other stakeholders involved in data and AI, which we discuss in the next section.

The public are less trusting of private industry developing and regulating AI-driven technologies than they are of other stakeholders

  • Evidence from the UK and the USA finds that the public do not trust private companies as developers or regulators of AI-driven technologies, and instead hold higher trust in scientists and researchers or professionals and independent regulatory bodies, respectively.[83] [84] [85] [86] [87] For example, when asked how concerned they are with different stakeholders developing high-impact AI-driven technologies, such as systems that determine an individual’s eligibility for welfare benefits or their risk of developing cancer from a scan, survey results found that the public are most concerned by private companies being involved and least concerned by the involvement of researchers or universities.[88]
  • UK research also shows that the public do not trust private companies to act with safety or accountability in mind. The Centre for Data Ethics and Innovation’s public attitudes survey found that only 43% of people trusted big technology companies to take actions with data safely, effectively, transparently and with accountability, with this figure decreasing to 30% for social media companies specifically.[89]
  • The public are critical of the motivations of commercial organisations that develop and deploy AI systems in the public sector. Members of a public dialogue on data stewardship were sceptical of the involvement of commercial organisations in the use of health data.[90] Interviews with members of the UK public on data-driven healthcare technologies also revealed that many did not expect technology companies to act in anyone’s interest but their own.[91]

‘[On digital health services] I’m not sure that all of the information is kept just to making services better within the NHS. I think it’s used for [corporations] and large companies that do not have the patients’ best interests at heart, I don’t think.’

– Interviewee, Access Denied? Socioeconomic inequalities in digital health services [92]

The public are concerned about ethics, privacy, equity, inclusiveness, representativeness and non-discrimination, and about exacerbating unequal social stratification and creating a two-tiered society

  • The public support using and developing data-driven technologies when appropriate considerations and guardrails are in place. An earlier synthesis of public attitudes to data by the Ada Lovelace Institute shows support for the use of data-driven technologies when there is a clear benefit to society,[93] with public attitudes research into AI revealing broad positivity for applications of AI in areas like health, as described earlier in this report. Importantly, this positivity is paralleled by high expectations around ethics and responsibility to limit how and where these technologies can be used.[94] However, perceptions around innovation and regulation are not always at odds with each other. A USA participant from a qualitative study stated that ‘there can be a lot of innovation with guardrails’.[95]
  • There is a breadth of evidence highlighting that principles of equity, inclusion, fairness and transparency are important to the public:
    • The Ada Lovelace Institute’s deliberative research shows that the public believe equity, inclusiveness and non-discrimination need to be embedded into data governance during pandemics for governance to be considered trustworthy,[96] or before deploying biometric technologies.[97] The latter study highlighted that data-driven systems should not exacerbate societal inequalities or create a two-tiered society, with the public questioning the assumption that individuals have equal access to digital infrastructure and expressing concern around discriminatory consequences that may arise from applications of data-driven technology.[98]
    • Qualitative research in the UK found that members of the public feel that respecting privacy, transparency, fairness and accountability underpins good governance of AI.[99] Ethical principles such as fairness, privacy and security were valued highly in an online survey of German participants in the evaluation of the application of AI in making decisions around tax fraud.[100] These participants equally valued a range of ethical principles, highlighting the importance of taking a holistic approach to the development of AI-driven systems. Among children aged 7–11 years in Scotland, who took part in deliberative research, fairness was a key area of interest after being introduced to real-life examples of uses of AI.[101]
    • The public also emphasise the importance of considering the context within which AI-driven technologies are applied. Qualitative research in the UK found that in high-risk applications of AI, such as mental health chatbots or HMRC fraud detection services, individuals expect more information to be provided on how the system has been designed and tested than for lower-risk applications of AI, such as music streaming recommendation systems.[102] As mentioned earlier in this report, members of the Ada Lovelace Institute’s Citizens’ Biometrics Council similarly emphasised proportionality in the use of biometric technology across different contexts, with use in contexts that could enforce social control deemed inappropriate, while other uses around crime prevention elicited mixed perspectives.[103]
  • Creating a trustworthy data ecosystem is seen as crucial in avoiding resistance or backlash to data-driven technologies.[104] [105] Building data ecosystems or data-driven technologies that are trustworthy is likely to improve public acceptance of these technologies. However, a previous analysis of public attitudes to data suggests that aims to build trust can often place the burden on the public to be more trusting rather than demand more trustworthy practices from other stakeholders.[106] Members of a citizens’ jury on data governance highlighted that trust in data-driven technologies is contingent on the trustworthiness of all stakeholders involved in the design, deployment and monitoring of these technologies.[107] These stakeholders include the developers building technologies, the data governance frameworks in place to oversee these technologies and the institutions tasked with commissioning or deploying these technologies.
  • Listening to the public is important in establishing trustworthiness. Trustworthy practices can include better consultation with, listening to, and communicating with people, as suggested by UK interviewees when reflecting on UK central Government deployment of pandemic contact tracing apps.[108] These participants felt that mistrust of central Government was in part related to feeling as though the views of citizens and experts had been ignored. Finding 5 further details public attitudes around participation in data-driven ecosystems.

‘The systems themselves are quite exclusionary, you know, because I work with people with experiences of multiple disadvantages and they’ve been heavily, heavily excluded because they say they have complex needs, but what it is, is that the system is unwilling to flex to provide what those people need to access those services appropriately.’

– Interviewee, Access Denied? Socioeconomic inequalities in digital health services [109]

Explainability of AI-driven decisions is important to the public

  • For reasons relating to fairness and accountability, it is important to people to understand how AI-driven decisions are made, even if that reduces the accuracy of those decisions.[110] [111] [112] [113] [114] [115] The How do people feel about AI? survey of British public attitudes by the Ada Lovelace Institute and The Alan Turing Institute found that explainability was important because it helped with accountability and the need to consider individual differences in circumstance.[116] When balancing the accuracy of an AI-powered decision against an explanation of how that decision was made, or the possibility of humans making all decisions, most people in the survey preferred the latter two options. At the same time, a key concern across most AI technologies – such as virtual healthcare assistants and technologies that assess eligibility for welfare or loan repayment risk – was around accountability for mistakes if things go wrong, or the need to consider individual and contextual circumstances in automated decision-making.
  • Exposing bias and supporting personal agency is also linked to support for explainability. In scenarios where biases could impact decisions, such as in job application screening decisions, participants from a series of qualitative workshops highlighted that explanations could be a mechanism to provide oversight and expose discrimination, as well as to support personal agency by allowing individuals to contest decisions and advocate for themselves: ‘A few participants also worried that it would be difficult to escape from an inaccurate decision once it had been made, as decisions might be shared across institutions, leaving them essentially powerless and without recourse.’[117]
  • However, people make trade-offs between explainability and accuracy depending on the context. Research with the public identifies several criteria that shape when accuracy is favoured over explainability: the extent to which a decision is mechanical versus subjective, the gravity of the consequences of the decision, whether it is the only chance at a decision, and whether information can help the recipient take meaningful action.[118]
  • The type of information people want from explanations behind AI-driven technologies also varies depending on context. A qualitative study involving focus groups in the UK, USA and Germany found that transparency and explainability were important, and that the method for providing this transparency depended on the type of AI technology, use and potential negative impact: ‘For AI products used in healthcare or finance, they wanted information about data use, decision-making criteria and how to make an appeal. For AI-generated content, visual labels were more important.’[119]

The public want to be able to address and appeal decisions determined by AI

  • It is important for the public that there are options for redress when mistakes have been made using AI-driven technologies.[120] [121] When asked what would make them more comfortable with the use of AI, the second most commonly chosen option by the public in the How do people feel about AI? attitudes survey was ‘procedures in place to appeal AI decisions’, selected by 59% of people, with only ‘laws and regulation’ selected by more people (62%).[122] In line with the value of explanations in providing accountability and transparency as previously discussed, workshops with members of the general public across several countries also found that explanations accompanying AI-made decisions were seen as important, as they could support appeals to change decisions if mistakes were made.[123] For example, as part of a study commissioned by the Centre for Data Ethics and Innovation, participants were presented with a scenario in which AI was used to detect tax fraud. They concluded that they would want to understand what information, outside of the tax record, is used to identify someone’s profile as a risk. As the quote below shows, understanding the criteria was important to address a potential mistake with significant consequences:

‘I would like to know the criteria used that caused me to be flagged up [in tax fraud detection services using AI], so that I can make sure everything could be cleared up and clear my name.’
– Interviewee, AI Governance[124]

The public ask for agency, control and choice in involvement, as well as in processes of consent and opt-in for sharing data

  • The need for agency and control over data and how decisions are made was a recurrent theme in our rapid review of evidence. People are concerned that AI systems can take over people’s agency in high-stakes decisions that affect their lives. In the Ada Lovelace Institute’s and The Alan Turing Institute’s recent survey of the British public, people noted concerns about AI replacing professional judgements, not being able to account for individual circumstances and a lack of transparency and accountability in decision-making. For example, almost two thirds (64%) were concerned that workplaces would rely too heavily on AI for recruitment compared to professional judgements.[125]
  • The need for control is also mentioned in relation to consent. For example, the Ada Lovelace Institute’s previous review of evidence Who Cares What the Public Think? found that ‘people often want more specific, granular and accessible information about what data is collected, who it is used by, what it is used for and what rights data subjects have over that use.’[126] A juror from the Citizens’ Biometrics Council also referenced the importance of consent:

‘One of the things that really bugs me is this notion of consent: in reality [other] people determine how we give that consent, like you go into a space and by being there you’ve consented to this, this and this. So, consent is nothing when it’s determined how you provide it.’
– Jury member, The Citizens’ Biometrics Council[127]

  • Control also relates to privacy. Lack of privacy and control over the content people see in social media and the data that is extracted was also identified as a consistent concern in the recent survey of the British public conducted by the Ada Lovelace Institute and The Alan Turing Institute.[128] In this study 69% of people identified invasion of privacy as a concern around targeted consumer advertising and 50% were concerned about the security of their personal information.
  • Consent is particularly important in high-stakes uses of AI. Consent was also deemed important in a series of focus groups conducted in the UK, USA and Germany, especially ‘where the use of AI has more material consequences for someone affected, like a decision about a loan, participants thought that people deserved the right to consent every time’.[129] In the same study, participants noted consent is about informed choice, rather than just choosing yes or no.
  • The need for consent is ongoing and complicated by the pervasiveness of some technologies. Consent remained an issue for members of the Citizens’ Biometrics Council that the Information Commissioner’s Office (ICO) reconvened in November 2022. While some participants welcomed the inclusion of information on consent in the new guidance by the ICO, others remained concerned because of the increased pervasiveness of biometrics, which would make it more difficult for people to be able to consent.[130]
  • The demand for agency and control is also linked to demands for transparency in data-driven systems. For example, the citizens’ juries the Ada Lovelace Institute convened on health systems in 2022 found that ‘agency over personal data was seen as an extension of the need for transparency around data-driven systems. Where a person is individually affected by data, jurors felt it was important to have adequate choice and control over its use.’[131]

‘If we are giving up our data, we need to be able to have a control of that and be able to see what others are seeing about us. That’s a level of mutual respect that needs to be around personal data sharing.’
– Jury member, The Rule of Trust [132]

Finding 5: People’s involvement: people want to have a meaningful say over decisions that affect their everyday lives.

 

The public want their views and experiences to be included in decision-making processes.

  • There is a demand from participants in research for more meaningful involvement of the public and of lived experience in the development of, implementation of and policy decision-making on data-driven systems and AI. For example, in a public dialogue for the NHS AI Lab, participants ‘flagged that any decision-making approaches need to be inclusive, representative, and accessible to all’. The research showed that participants valued a range of expertise, including the lived experience of patients.[133]
  • The public want their views to be valued, not just heard.[134] In the Ada Lovelace Institute’s peer research study on digital health services, participants were concerned that they were not consulted or even informed about new digital health services.[135] The research from the NHS AI Lab also found that, at the very least, when involvement takes place, the public wants their views to be given the same consideration as the views of other stakeholders.[136] The evidence also shows expectation for inclusive engagement and multiple channels of participation.[137]

There needs to be diversity in the views that are included and heard

  • A diversity of views and public participation need to be part of legislative and oversight bodies and processes.[138] The Citizens’ Biometrics Council that the Ada Lovelace Institute convened in 2020 also suggested the need to include the public in a broad representative group of individuals charged with overseeing an ongoing framework for governance and a register on the use of biometric technologies.[139] Members of the Ada Lovelace Institute’s citizens’ jury on data governance during a pandemic advocated for public representation in any regulatory bodies overseeing AI-driven technologies.[140] Members of a public dialogue on data stewardship particularly stressed the importance of ensuring those that are likely to be affected by decisions are involved in the decision-making process.[141]

‘For me good governance might be a place where citizens […] have democratic parliament of technology, something to hold scrutiny.’
– Jury member, The Rule of Trust.[142]

  • This desire for involvement in decisions that affect them is felt even by children as young as 7–11 years old. Deliberative engagement with children in Scotland shows that they want agency over the data collected about them, and want to be consulted about the AI systems created with that data.[143] The young participants wanted to make sure that many children from different backgrounds would be consulted when data was gathered to create new systems, to ensure outcomes from these systems were equitable for all children.

‘We need to spend more time in a place to collect information about it and make sure we know what we are working with. We also need to talk to lots of different children at different ages.’
– Member of Children’s Parliament, Exploring Children’s Rights and AI[144]


Finding 6: There are important gaps in research with underrepresented groups, those impacted by specific AI uses, and in research from different countries.

 

Different people and groups, like young people or people from minoritised ethnic communities, have distinct views about AI.

Evidence points to age and other socio-demographic differences as factors related to varying public attitudes to AI.[145] [146] [147]

  • Young people have different views on some aspects of AI. For example, the survey of British public attitudes How do people feel about AI? showed that the belief that the companies developing AI technologies should be responsible for the safety of those technologies was more common among people aged 18–24 years old than in older age groups. This suggests that younger people have high expectations of private companies and some degree of trust in them carrying out their corporate responsibilities.[148]
  • Specific concerns around technology may also relate to some socio-demographic characteristics. Polling from the USA suggests worries around job losses due to AI are associated with age (workers under 58 are more concerned than those over 58) and ethnicity (people from Asian, Black and Hispanic backgrounds are more concerned than those from white backgrounds).[149] And although global engagement on AI is limited, the available evidence suggests that there may be wide geographical differences in feelings about AI and fairness, and trust in both the companies using AI, and in the AI systems, to be fair.[150] [151]

Some people, groups and parts of the world are underrepresented in the evidence

  • Some publics are underrepresented in some of the evidence.[152] [153]
    • Sample size, recruitment, the methods used for taking part in research and other factors can affect the quality of insights that research is able to represent across different publics. For example, the How do people feel about AI? survey of public attitudes is limited in its ability to represent the views of groups of people who are racially minoritised, such as Black or Asian populations, due to small sample sizes. This can be a methodological limitation of representative, quantitative research, and so is present in the research findings despite a recognition by researchers that these groups may be disproportionately affected by some of the technologies surveyed.[154] There is therefore a need for quantitative and qualitative research among those most impacted by some uses of AI and least represented in the evidence, especially marginalised or minoritised groups and younger age groups.
  • There is an overrepresentation of Western-centric views:
    • The existing evidence identified comes from English-speaking Western countries, and research is often conducted by ‘a small group of experts educated in Western Europe or North America’.[155] [156] This is also evident in the gaps of this rapid review, and the Ada Lovelace Institute recognises that, as a predominantly UK-based organisation, it might face barriers to discovering and analysing evidence emerging from across the world. In the context of global summits and discussions on global governance, and particularly recognising that the AI supply chain transcends the boundaries of nations and regions, there is a need for research and evidence that includes different contexts and political economies, where views and experiences may vary across AI uses.

Finding 7: There is a significant body of evidence that demonstrates ways to meaningfully involve the public in decision-making.

 

But making this happen requires a commitment from decision-makers to embed participatory processes.

As described in the findings above, the public want to be able to have a say in – and to have control over – decisions that impact their lives. They also think that members of the public should be involved in legislative and oversight processes. This section introduces some of the growing evidence on how to do this meaningfully.

Public attitudes research, engagement and participation involve distinct methods that deliver different evidence and outcomes 

Different methods of public engagement and participation produce different outcomes, and it is important to understand their relative strengths and limitations in order to use them effectively to inform policy (see Table 1).

Some methods are consultative, whereas others enable a deeper involvement. According to the International Association for Public Participation (IAP2) framework, methods can embed the public deeper into decision-making to increase the impact they have on those decisions.[157] This framework has been further developed in the Ada Lovelace Institute’s report Participatory data stewardship, which sets out the relationships between different kinds of participatory practices.[158]

Surveys are quantitative methods of collecting data that capture immediate attitudes influenced by discourse, social norms and varied levels of knowledge and experience.[159] [160] They produce responses predominantly by using closed questions that require direct responses. Analysis of survey results helps researchers and policymakers to understand the extent to which some views are held across populations, and to track changes over time. However, quantitative methods are less suited to answering the ‘why’ or ‘how’ questions. In addition, they do not allow for an informed and reasoned process. As others have pointed out: ‘surveys treat citizens as subjects of research rather than participants in the process of acquiring knowledge or making judgements.’[161]

Some qualitative studies can provide important insight into people’s views and lived experience, such as focus groups or interviews. However, there is a risk that participation remains at the consultative level, depending on how the research is designed and embedded in decision-making processes.

Public deliberation can enable deep insights and recommendations to inform policy through an informed, reasoned and deliberative process of engagement. Participants are usually randomly selected to reflect the diversity of a population or groups, in the context of a particular issue or question. They are provided with expert guidance and informed, balanced evidence, and given time to learn, understand and discuss. These processes can be widened through interconnected events to ensure civil society and the underrepresented or minoritised groups less likely to attend these deliberative processes are included in different and relevant ways.[162] There is a risk that the trust of participants in these processes is undermined if their contributions are not seriously considered and embedded in policies.

Complex or contested topics need careful and deep public engagement

We contend that there is both a role and a need for methods of participation that provide in-depth involvement. This is particularly important when what is at stake are not narrow technical questions but complex policy areas that permeate all aspects of people’s lives, as is the case with the many different uses of AI in society. The Ada Lovelace Institute’s Rethinking data report argued the following:

‘Through a broad range of participatory approaches – from citizens’ councils and juries that directly inform local and national data policy and regulation, to public representation on technology company governance boards – people are better represented, more supported and empowered to make data systems and infrastructures work for them, and policymakers are better informed about what people expect and desire from data, technologies and their uses.’[163]

Similar lessons have been learned from policymaking in climate. The Global Science Partnership finds that: ‘Through our experience delivering pilots worldwide as part of the Global Science Partnership, we found that climate policy making can be more effective and impactful when combining the expertise of policymakers, experts and citizens at an early stage in its development, rather than through consulting on draft proposals.’[164]

Other research has also argued that some AI uses, in particular those that risk civil and human rights, are in more need of successfully incorporating public participation. For example, a report by Data & Society finds that AI uses related to access to government services and benefits, retention of biometric or health data, surveillance or uses that bring new ethical challenges like generative AI or self-driving cars, require in-depth public engagement.[165]

Deliberative and participatory engagement can provide informed and reasoned policy insights from diverse publics

Evidence about participatory and deliberative approaches shows their potential for enabling rigorous engagement processes, in which publics who are reflective of the diversity of views in the population are exposed to a range of knowledge and expertise. According to democratic theorists, inclusive deliberation is a key mechanism to enable collective decision-making.[166]

Through a shared process of considered deliberation and reasoned judgement with others, deliberative publics are able to meaningfully understand different data-driven technologies and the impact they are having or can have on different groups.[167] [168]

Research on these processes shows that ‘deliberating citizens can and do influence policies’, and that they are being implemented in parliamentary contexts by civil society, private companies and international institutions.[169]

Using participation as a tick-box exercise risks the trustworthiness, legitimacy and effectiveness of decision-making

Evidence from public participation research identifies the risk of using participation to simply tick a box to demonstrate public engagement, or as a stamp of approval for a decision that has already been substantially made. For example, participants in a deliberative study by the NHS AI Lab discussed the need for public engagement to be meaningful and impactful, and considered how lived experience would impact decision-making processes alongside the agendas of other stakeholders.[170]

There is a need to engage with the public in in-depth processes that have a consequential influence on government policy.[171] Our Rethinking data report also referred to this risk:

‘In order to be successful, such initiatives need political will, support and buy-in, to ensure that their outcomes are acknowledged and adopted. Without this, participatory initiatives run the risk of ‘participation washing’, whereby public involvement is merely tokenistic.’[172]

Other lessons from public engagement in Colombia, Kenya and the Seychelles also point to the need for ‘deep engagement at all stages through the policymaking process’ to improve effectiveness, trust and transparency.[173]

Experiences and research on institutionalising participatory and deliberative approaches can offer solutions to policymakers

The use of participatory and deliberative approaches and the evidence of their impact are growing in the UK and many parts of the world[174] in what has been described as a ‘turn toward deliberative systems’[175] or ‘deliberative wave’.[176] [177] However, there is a need for policy professionals and governments to take the results from these processes seriously and embed them in policy.

Frameworks like the OECD’s ‘Institutionalising Public Deliberation’ provide a helpful summary of some of the ways in which this can happen, including examples like the Ostbelgien model, the city of Paris’ model or Bogota’s itinerant assembly.[178]

Ireland’s experience running deliberative processes that culminated in policy change,[179] or the experience of the London Borough of Newham with its standing assembly, offer other lessons.

At a global level, the Global Assembly on the Climate and Ecological Crisis held in 2021 serves as a precedent for what a global assembly or similar permanent citizens’ body on AI could look like, including civil society and underrepresented communities.[180] An independent evaluation found that the Global Assembly ‘established itself as a potential player in global climate governance, but it also spotlighted the challenges of influencing global climate governance on the institutional level.’[181] This insight shows how important it is for these processes to be connected to decision-making bodies if they are to be consequential.

Deliberative and participatory processes have been used for decades in many areas of policymaking, but their use by governments to involve the public in decisions on AI remains surprisingly unexplored:

‘Despite their promising potential to facilitate more effective policymaking and regulation, the role of public participation in data and technology-related policy and practice remains remarkably underexplored, if compared – for example – to public participation in city planning and urban law.’[182]

This review of existing literature demonstrates ways to operationalise or institutionalise the involvement of the public in legislative processes, and offers lessons on how to avoid these becoming merely consultative exercises. We do not claim that all these processes and examples have always been successful, and we point to evidence that a lack of commitment from governments to implement citizens’ recommendations is one of the reasons why they can fail.[183] We contend that there is currently a significant opportunity for governments to consider processes that can embed the participation of the public in meaningful and consequential ways – and that doing this will improve outcomes for people affected by technologies and for current and future societies.

 

Conclusions

This rapid review shows that public attitudes research is consistent about what the public think are the potential benefits of AI, what concerns them and how they think it should be regulated.

  • It is important for governments to listen to and act on this evidence, paying attention in particular to different AI uses and how they currently have impacts on people’s everyday lives. AI uses affecting decision-making around services and jobs, or affecting human and civil rights, require particular attention. The public do not see AI as just one thing and have nuanced views about its different uses, risks and impacts. AI uses in advancing science and improving health diagnosis are largely seen as positive, and so are its uses in tasks that can be made faster and more efficient. However, the public are concerned about relying on AI systems to make decisions that impact people’s lives, such as in job recruitment or accessing financial support, either through loans or welfare.
  • The public are also concerned with uses of AI that replace human judgement, communication and emotion, in aspects like care or decisions that need to account for context and personal circumstances.
  • There are also concerns about privacy, especially in relation to uses of AI in people’s everyday lives, like targeted advertising, robotic home assistants or surveillance.
  • There is emerging evidence that the public have equivalent concerns about the use of foundation models. Whereas they may be welcome when they facilitate or augment mechanical, low-risk tasks or speed up data analysis, the public are concerned about trading off accuracy for speed. They are also concerned about AI uses replacing human judgement or emotion, and about their potential to amplify bias and discrimination.

Policymakers should use evidence from public attitudes research to strengthen regulation and independent oversight of AI design, development, deployment and uses and to meaningfully engage with diverse publics in the process.

  • Evidence from the public shows a preference for independent regulation with ‘teeth’ that demands transparency and includes mechanisms for assessing risk before deployment of technologies, as well as for accountability and redress.
  • The public want to maintain agency and control over how data is used and for what purposes.
  • Inclusion and non-discrimination are important for people. There is a concern that both the design and uses of AI technologies will amplify exclusion, bias or discrimination, and the public want regulatory frameworks that prevent this.
  • Trust in data-driven systems is contingent on the trustworthiness of all stakeholders involved. The public find researchers and academics more trustworthy than the private sector. Engaging the public in the design, deployment, regulation and monitoring of these systems is also important to avoid entrenching resistance.

Policymakers should use diverse methods and approaches to engage with diverse publics with different views and experiences, and in different contexts. Engaging the public in participatory and deliberative processes to inform policy requires embedded, institutional commitment so that the engagement is consequential rather than tokenistic.

  • The research indicates differences in attitudes across demographics, including age and socio-economic background, and there is a need for more evidence from underrepresented groups and specific publics impacted by specific AI uses.
  • There is also a need for evidence from different contexts across the globe, especially considering that the AI supply chain transcends political jurisdictions.
  • The public want to have a say in decisions that affect their lives, and want spaces for themselves and representative bodies to be part of legislative and monitoring processes.
  • Different research and public participation approaches result in different outcomes. While some methods are best suited to consulting the public on a particular issue, others enable them to be involved in decision-making. Participatory and deliberative methods enable convening publics that are reflective of diversity in the population to offer informed and reasoned conclusions that can inform practice and policy.
  • Evidence from deliberative approaches shows ways for policymakers to meaningfully include the public in decision-making processes at a local, national, regional and global level, such as through citizens’ assemblies or juries and working with civil society. These processes need political will to be consequential.

Methodology

To conduct this rapid evidence review, we combined research with the public carried out by the Ada Lovelace Institute with studies by other research organisations, assessed against criteria that gave a high level of confidence in the robustness of the research.

We used keyword-based online searches to identify evidence about public attitudes in addition to our own research. We also assessed the quality and relevance of recent studies encountered or received through professional networks that we had not identified through our search. We incorporated these into the review when our assessment gave us confidence in their methodological rigour. A thematic analysis was conducted to categorise and identify recurrent themes. These themes have been developed and structured around a set of key findings that aim to speak directly to policy professionals.

The evidence resulting from this process is largely from the UK, complemented by other research done predominantly in other English-speaking countries. This is a limitation for a global conversation on AI, and more research from across the globe and diverse publics and contexts is needed.

We focus on research conducted within recent years. The oldest evidence included dates from 2014, and the vast majority has been published since 2018. We chose this focus to ensure findings are relevant, given that events of recent years have had a profound influence on public attitudes towards technology, such as the Cambridge Analytica scandal,[184] the growth and prominence of large technology companies in society, the impacts of the COVID-19 pandemic and the popularisation of large language models including ChatGPT and Bard.

Various methodologies are used in the research we have cited, from national online surveys to deliberative dialogues, qualitative focus groups and more. Each of these methods has different strengths and limitations, and the strengths of one approach can complement the limitations of another.

Table 1: Research methodologies and the evidence they surface

Research method: Representative surveys
Type of evidence: understanding the extent to which some views are held across some groups in the population; potential to track how views change over time or how they differ across groups in the population.
Level of citizen involvement:[185] Consultation

Research method: Deliberative processes like citizens’ juries or citizens’ assemblies
Type of evidence: participants reflective of the diversity in a population reach conclusions and policy recommendations based on an informed and reasoned process that considers pros and cons and different expertise and lived experiences.
Level of citizen involvement: Involve, collaborate or empower (depending on the extent to which the process is embedded in decision-making and recommendations from participants are consequential)

Research method: Qualitative research like focus groups or in-depth interviews
Type of evidence: in-depth understanding of the types of views that exist on a topic, either in a collective or individual setting, the contextual and socio-demographic reasons behind those views, and the trade-offs people make in their thinking about a topic.
Level of citizen involvement: Consultation

Research method: Co-designed research
Type of evidence: participants’ lived experience and knowledge are included in the research process from the start (including the problem that needs to be solved and how to approach the research), and power over how decisions are made is distributed.
Level of citizen involvement: Involve, collaborate or empower (depending on the extent to which power is shared across participants and researchers and the extent to which it has an impact on decision-making)

Acknowledgements

This report was co-authored by Dr Anna Colom, Roshni Modhvadia and Octavia Reeve.

We are grateful to the following colleagues for their review and comments on a draft of this paper:

  • Reema Patel, ESRC Digital Good Network Policy Lead
  • Ali Shah, Global Principal Director for Responsible AI at Accenture and advisory board member at the Ada Lovelace Institute
  • Professor Jack Stilgoe, Co-lead Policy and Public Engagement Strategy, Responsible AI UK

Bibliography

Ada Lovelace Institute, ‘The Citizens’ Biometrics Council. Recommendations and Findings of a Public Deliberation on Biometrics Technology, Policy and Governance’ (2021) <https://www.adalovelaceinstitute.org/project/citizens-biometrics-council/>

Ada Lovelace Institute, ‘The Citizens’ Biometrics Council’ (2021) <https://www.adalovelaceinstitute.org/report/citizens-biometrics-council/>

Ada Lovelace Institute, ‘Participatory Data Stewardship: A Framework for Involving People in the Use of Data’ (2021) <https://www.adalovelaceinstitute.org/report/participatory-data-stewardship/>

Ada Lovelace Institute, ‘Rethinking Data and Rebalancing Digital Power’ (2022) <https://www.adalovelaceinstitute.org/project/rethinking-data/>

Ada Lovelace Institute, ‘The Rule of Trust: Findings from Citizens’ Juries on the Good Governance of Data in Pandemics.’ (2022) <https://www.adalovelaceinstitute.org/wp-content/uploads/2022/07/The-rule-of-trust-Ada-Lovelace-Institute-July-2022.pdf>

Ada Lovelace Institute, ‘Who Cares What the Public Think?’ (2022) <https://www.adalovelaceinstitute.org/evidence-review/public-attitudes-data-regulation/>

Ada Lovelace Institute, ‘Access Denied? Socioeconomic Inequalities in Digital Health Services’ (2023) <https://www.adalovelaceinstitute.org/wp-content/uploads/2023/09/ADALOV1.pdf>

Ada Lovelace Institute, ‘Listening to the Public. Views from the Citizens’ Biometrics Council on the Information Commissioner’s Office’s Proposed Approach to Biometrics.’ (2023) <https://www.adalovelaceinstitute.org/report/listening-to-the-public/>

Ada Lovelace Institute, ‘Regulating AI in the UK’ (2023) <https://www.adalovelaceinstitute.org/report/regulating-ai-in-the-uk/> accessed 1 August 2023

Ada Lovelace Institute, ‘Foundation Models in the Public Sector: Key Considerations for Deploying Public-Sector Foundation Models’ (2023) Policy briefing <https://www.adalovelaceinstitute.org/policy-briefing/foundation-models-public-sector/>

Ada Lovelace Institute, ‘Lessons from the App Store’ (2023) <https://www.adalovelaceinstitute.org/wp-content/uploads/2023/06/Ada-Lovelace-Institute-Lessons-from-the-App-Store-June-2023.pdf> accessed 27 September 2023

Ada Lovelace Institute and Alan Turing Institute, ‘How Do People Feel about AI? A Nationally Representative Survey of Public Attitudes to Artificial Intelligence in Britain’ (2023) <https://www.adalovelaceinstitute.org/report/public-attitudes-ai/> accessed 6 June 2023

American Psychological Association, ‘2023 Work in America Survey: Artificial Intelligence, Monitoring Technology, and Psychological Well-Being’ (https://www.apa.org, 2023) <https://www.apa.org/pubs/reports/work-in-america/2023-work-america-ai-monitoring> accessed 26 September 2023

BEIS, ‘Public Attitudes to Science’ (Department for Business, Energy and Industrial Strategy/Kantar Public 2019) <https://www.kantar.com/uk-public-attitudes-to-science>

BEIS, ‘BEIS Public Attitudes Tracker: Artificial Intelligence Summer 2022, UK’ (Department for Business, Energy & Industrial Strategy 2022) <https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/1105175/BEIS_PAT_Summer_2022_Artificial_Intelligence.pdf>

BritainThinks and Centre for Data Ethics and Innovation, ‘AI Governance’ (2022) <https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/1146010/CDEI_AI_White_Paper_Final_report.pdf>

Budic M, ‘AI and Us: Ethical Concerns, Public Knowledge and Public Attitudes on Artificial Intelligence’, Proceedings of the 2022 AAAI/ACM Conference on AI, Ethics, and Society (ACM 2022) <https://dl.acm.org/doi/10.1145/3514094.3539518> accessed 22 August 2023

‘CDEI | AI Governance’ (BritainThinks 2022) <https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/1177293/Britainthinks_Report_-_CDEI_AI_Governance.pdf> accessed 22 August 2023

Central Digital & Data Office, ‘Data Ethics Framework’ (GOV.UK, 16 September 2020) <https://www.gov.uk/government/publications/data-ethics-framework> accessed 23 May 2023

Centre for Data Ethics and Innovation, ‘Public Attitudes to Data and AI: Tracker Survey (Wave 2)’ (2022) <https://www.gov.uk/government/publications/public-attitudes-to-data-and-ai-tracker-survey-wave-2>

Children’s Parliament, Scottish AI Alliance and The Alan Turing Institute, ‘Exploring Children’s Rights and AI. Stage 1 (Summary Report)’ (2023) <https://www.turing.ac.uk/sites/default/files/2023-05/exploring_childrens_rights_and_ai.pdf>

Cohen K and Doubleday R (eds), Future Directions for Citizen Science and Public Policy (Centre for Science and Policy 2021)

Curato N, Deliberative Mini-Publics: Core Design Features (Bristol University Press 2021)

Curato N and others, ‘Global Assembly on the Climate and Ecological Crisis: Evaluation Report’ [2023] <https://eprints.ncl.ac.uk> accessed 26 October 2023

Curato N and others, ‘Twelve Key Findings in Deliberative Democracy Research’ (2017) 146 Daedalus 28 <https://direct.mit.edu/daed/article/146/3/28-38/27148> accessed 6 August 2021

Davies M and Birtwistle M, ‘Seizing the “AI Moment”: Making a Success of the AI Safety Summit’ (7 September 2023) <https://www.adalovelaceinstitute.org/blog/ai-safety-summit/>

Doteveryone, ‘People, Power and Technology: The 2020 Digital Attitudes Report’ (2020) <https://doteveryone.org.uk/wp-content/uploads/2020/05/PPT-2020_Soft-Copy.pdf> accessed 21 September 2023

Farbrace E, Warren J and Murphy R, ‘Understanding AI Uptake and Sentiment among People and Businesses in the UK’ (Office for National Statistics 2023)

Farrell DM and others, ‘When Mini-Publics and Maxi-Publics Coincide: Ireland’s National Debate on Abortion’ [2020] Representation 1 <https://www.tandfonline.com/doi/full/10.1080/00344893.2020.1804441> accessed 19 July 2021

Gilman M, ‘Democratizing AI: Principles for Meaningful Public Participation’ (Data & Society 2023) <https://datasociety.net/wp-content/uploads/2023/09/DS_Democratizing-AI-Public-Participation-Brief_9.2023.pdf> accessed 5 October 2023

Global Assembly Team, ‘Report of the 2021 Global Assembly on the Climate and Ecological Crisis’ (2022) <http://globalassembly.org>

Global Science Partnership, ‘The Inclusive Policymaking Toolkit for Climate Action’ (2023) <https://www.globalsciencepartnership.com/_files/ugd/b63d52_8b6b397c52b14b46a46c1f70e04839e1.pdf> accessed 3 October 2023

Goldberg S and Bächtiger A, ‘Catching the “Deliberative Wave”? How (Disaffected) Citizens Assess Deliberative Citizen Forums’ (2023) 53 British Journal of Political Science 239 <https://www.cambridge.org/core/product/identifier/S0007123422000059/type/journal_article> accessed 8 September 2023

González F and others, ‘Global Reactions to the Cambridge Analytica Scandal: A Cross-Language Social Media Study’ [2019] WWW ’19: Companion Proceedings of The 2019 World Wide Web Conference 799

Grönlund K, Bächtiger A and Setälä M, Deliberative Mini-Publics: Involving Citizens in the Democratic Process (ECPR Press 2014)

Hadlington L and others, ‘The Use of Artificial Intelligence in a Military Context: Development of the Attitudes toward AI in Defense (AAID) Scale’ (2023) 14 Frontiers in Psychology 1164810 <https://www.frontiersin.org/articles/10.3389/fpsyg.2023.1164810/full> accessed 24 August 2023

IAP2, ‘IAP2 Spectrum of Public Participation’ <https://iap2.org.au/wp-content/uploads/2020/01/2018_IAP2_Spectrum.pdf>

Ipsos, ‘Global Views on AI 2023: How People across the World Feel about Artificial Intelligence and Expect It Will Impact Their Life’ (2023) <https://www.ipsos.com/sites/default/files/ct/news/documents/2023-07/Ipsos%20Global%20AI%202023%20Report%20-%20NZ%20Release%2019.07.2023.pdf> accessed 3 October 2023

Ipsos MORI, Open Data Institute and Imperial College Health Partners, ‘NHS AI Lab Public Dialogue on Data Stewardship’ (NHS AI Lab 2022) <https://www.ipsos.com/en-uk/understanding-how-public-feel-decisions-should-be-made-about-access-their-personal-health-data-ai>

Kieslich K, Keller B and Starke C, ‘Artificial Intelligence Ethics by Design. Evaluating Public Perception on the Importance of Ethical Design Principles of Artificial Intelligence’ (2022) 9 Big Data & Society 205395172210929 <https://journals.sagepub.com/doi/10.1177/20539517221092956?icid=int.sj-full-text.similar-articles.3#:~:text=The%20results%20suggest%20that%20accountability,systems%20is%20slightly%20less%20important.> accessed 22 August 2023

Kieslich K, Lünich M and Došenović P, ‘Ever Heard of Ethical AI? Investigating the Salience of Ethical AI Issues among the German Population’ [2023] International Journal of Human–Computer Interaction 1 <http://arxiv.org/abs/2207.14086> accessed 22 August 2023

Landemore H, Democratic Reason: Politics, Collective Intelligence, and the Rule of the Many (2017)

Lazar S and Nelson A, ‘AI Safety on Whose Terms?’ (2023) 381 Science 138 <https://www.science.org/doi/10.1126/science.adi8982> accessed 13 October 2023

‘Majority of Britons Support Vaccine Passports but Recognise Concerns in New Ipsos UK KnowledgePanel Poll’ (Ipsos, 31 March 2021) <https://www.ipsos.com/en-uk/majority-britons-support-vaccine-passports-recognise-concerns-new-ipsos-uk-knowledgepanel-poll> accessed 27 September 2023

Mellier C and Wilson R, ‘Getting Real About Citizens’ Assemblies: A New Theory of Change for Citizens’ Assemblies’ (European Democracy Hub: Research, 10 October 2023)

Milltown Partners and Clifford Chance, ‘Responsible AI in Practice: Public Expectations of Approaches to Developing and Deploying AI’ (2023) <https://www.cliffordchance.com/content/dam/cliffordchance/hub/TechGroup/responsible-ai-in-practice-report-2023.pdf>

Nussberger A-M and others, ‘Public Attitudes Value Interpretability but Prioritize Accuracy in Artificial Intelligence’ (2022) 13 Nature Communications 5821 <https://www.nature.com/articles/s41467-022-33417-3> accessed 8 June 2023

OECD, ‘Innovative Citizen Participation and New Democratic Institutions: Catching the Deliberative Wave’ (OECD 2021) <https://www.oecd-ilibrary.org/governance/innovative-citizen-participation-and-new-democratic-institutions_339306da-en> accessed 5 January 2022

OECD, ‘Institutionalising Public Deliberation’ (OECD) <https://www.oecd.org/governance/innovative-citizen-participation/icp-institutionalising%20deliberation.pdf>

Rainie L and others, ‘AI and Human Enhancement: Americans’ Openness Is Tempered by a Range of Concerns’ (Pew Research Center 2022) <https://www.pewresearch.org/internet/2022/03/17/how-americans-think-about-artificial-intelligence/>

Thinks Insights & Strategy and Centre for Data Ethics and Innovation, ‘Public Perceptions of Foundation Models’ (Centre for Data Ethics and Innovation 2023) <https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/1184584/Thinks_CDEI_Public_perceptions_of_foundation_models.pdf>

Tyson A and Kikuchi E, ‘Growing Public Concern about the Role of Artificial Intelligence in Daily Life’ (Pew Research Center 2023) <https://www.pewresearch.org/short-reads/2023/08/28/growing-public-concern-about-the-role-of-artificial-intelligence-in-daily-life/>

UK Government, ‘Iconic Bletchley Park to Host UK AI Safety Summit in Early November’ <https://www.gov.uk/government/news/iconic-bletchley-park-to-host-uk-ai-safety-summit-in-early-november>

van der Veer SN and others, ‘Trading off Accuracy and Explainability in AI Decision-Making: Findings from 2 Citizens’ Juries’ (2021) 28 Journal of the American Medical Informatics Association 2128 <https://academic.oup.com/jamia/article/28/10/2128/6333351> accessed 3 May 2023

Woodruff A and others, ‘A Qualitative Exploration of Perceptions of Algorithmic Fairness’, Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems (ACM 2018) <https://dl.acm.org/doi/10.1145/3173574.3174230> accessed 22 August 2023

Woodruff A and others, ‘“A Cold, Technical Decision-Maker”: Can AI Provide Explainability, Negotiability, and Humanity?’ (arXiv, 1 December 2020) <http://arxiv.org/abs/2012.00874> accessed 22 August 2023

Wright J and others, ‘Privacy, Agency and Trust in Human-AI Ecosystems: Interim Report (Short Version)’ (The Alan Turing Institute)

Zhang B and Dafoe A, ‘Artificial Intelligence: American Attitudes and Trends’ [2019] SSRN Electronic Journal <https://www.ssrn.com/abstract=3312874> accessed 22 August 2023


Footnotes

[1] UK Government, ‘Iconic Bletchley Park to Host UK AI Safety Summit in Early November’ <https://www.gov.uk/government/news/iconic-bletchley-park-to-host-uk-ai-safety-summit-in-early-november>.

[2] Seth Lazar and Alondra Nelson, ‘AI Safety on Whose Terms?’ (2023) 381 Science 138 <https://www.science.org/doi/10.1126/science.adi8982> accessed 13 October 2023.

[3] Ada Lovelace Institute, ‘Regulating AI in the UK’ (2023) <https://www.adalovelaceinstitute.org/report/regulating-ai-in-the-uk/> accessed 1 August 2023.

[4] Ada Lovelace Institute, ‘Foundation Models in the Public Sector: Key Considerations for Deploying Public-Sector Foundation Models’ (2023) Policy briefing <https://www.adalovelaceinstitute.org/policy-briefing/foundation-models-public-sector/>.

[5] Matt Davies and Michael Birtwistle, ‘Seizing the “AI Moment”: Making a Success of the AI Safety Summit’ (7 September 2023) <https://www.adalovelaceinstitute.org/blog/ai-safety-summit/>.

[6] Central Digital & Data Office, ‘Data Ethics Framework’ (GOV.UK, 16 September 2020) <https://www.gov.uk/government/publications/data-ethics-framework> accessed 23 May 2023.

[7] Ada Lovelace Institute and Alan Turing Institute, ‘How Do People Feel about AI? A Nationally Representative Survey of Public Attitudes to Artificial Intelligence in Britain’ (2023) <https://www.adalovelaceinstitute.org/report/public-attitudes-ai/> accessed 6 June 2023.

[8] James Wright and others, ‘Privacy, Agency and Trust in Human-AI Ecosystems: Interim Report (Short Version)’ (The Alan Turing Institute).

[9] Ada Lovelace Institute and Alan Turing Institute (n 7).

[10] Emily Farbrace, Jeni Warren and Rhian Murphy, ‘Understanding AI Uptake and Sentiment among People and Businesses in the UK’ (Office for National Statistics 2023).

[11] Milltown Partners and Clifford Chance, ‘Responsible AI in Practice: Public Expectations of Approaches to Developing and Deploying AI’ (2023) <https://www.cliffordchance.com/content/dam/cliffordchance/hub/TechGroup/responsible-ai-in-practice-report-2023.pdf>.

[12] Lee Rainie and others, ‘AI and Human Enhancement: Americans’ Openness Is Tempered by a Range of Concerns’ (Pew Research Center 2022) <https://www.pewresearch.org/internet/2022/03/17/how-americans-think-about-artificial-intelligence/>.

[13] Baobao Zhang and Allan Dafoe, ‘Artificial Intelligence: American Attitudes and Trends’ [2019] SSRN Electronic Journal <https://www.ssrn.com/abstract=3312874> accessed 22 August 2023.

[14] Kimon Kieslich, Marco Lünich and Pero Došenović, ‘Ever Heard of Ethical AI? Investigating the Salience of Ethical AI Issues among the German Population’ [2023] International Journal of Human–Computer Interaction 1 <http://arxiv.org/abs/2207.14086> accessed 22 August 2023.

[15] Ada Lovelace Institute, ‘The Citizens’ Biometrics Council. Recommendations and Findings of a Public Deliberation on Biometrics Technology, Policy and Governance’ (2021) <https://www.adalovelaceinstitute.org/project/citizens-biometrics-council/>.

[16] Ada Lovelace Institute and Alan Turing Institute (n 7).

[17] BEIS, ‘Public Attitudes to Science’ (Department for Business, Energy and Industrial Strategy/Kantar Public 2019) <https://www.kantar.com/uk-public-attitudes-to-science>.

[18] Alec Tyson and Emma Kikuchi, ‘Growing Public Concern about the Role of Artificial Intelligence in Daily Life’ (Pew Research Center 2023) <https://www.pewresearch.org/short-reads/2023/08/28/growing-public-concern-about-the-role-of-artificial-intelligence-in-daily-life/>.

[19] Ada Lovelace Institute, ‘Who Cares What the Public Think?’ (2022) <https://www.adalovelaceinstitute.org/evidence-review/public-attitudes-data-regulation/>.

[20] Thinks Insights & Strategy and Centre for Data Ethics and Innovation, ‘Public Perceptions of Foundation Models’ (Centre for Data Ethics and Innovation 2023) <https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/1184584/Thinks_CDEI_Public_perceptions_of_foundation_models.pdf>.

[21] Allison Woodruff and others, ‘“A Cold, Technical Decision-Maker”: Can AI Provide Explainability, Negotiability, and Humanity?’ (arXiv, 1 December 2020) <http://arxiv.org/abs/2012.00874> accessed 22 August 2023.

[22] Ada Lovelace Institute and Alan Turing Institute (n 7).

[23] Ipsos MORI, Open Data Institute and Imperial College Health Partners, ‘NHS AI Lab Public Dialogue on Data Stewardship’ (NHS AI Lab 2022) <https://www.ipsos.com/en-uk/understanding-how-public-feel-decisions-should-be-made-about-access-their-personal-health-data-ai>.

[24] BEIS (n 17).

[25] Woodruff and others (n 21).

[26] Ipsos MORI, Open Data Institute and Imperial College Health Partners (n 23).

[27] Ada Lovelace Institute and Alan Turing Institute (n 7).

[28] Ada Lovelace Institute, ‘Listening to the Public. Views from the Citizens’ Biometrics Council on the Information Commissioner’s Office’s Proposed Approach to Biometrics.’ (2023) <https://www.adalovelaceinstitute.org/report/listening-to-the-public/>.

[29] Ada Lovelace Institute, ‘The Citizens’ Biometrics Council. Recommendations and Findings of a Public Deliberation on Biometrics Technology, Policy and Governance’ (n 15).

[30] Ada Lovelace Institute and Alan Turing Institute (n 7).

[31] Ada Lovelace Institute, ‘The Rule of Trust: Findings from Citizens’ Juries on the Good Governance of Data in Pandemics.’ (2022) <https://www.adalovelaceinstitute.org/wp-content/uploads/2022/07/The-rule-of-trust-Ada-Lovelace-Institute-July-2022.pdf>.

[32] BritainThinks and Centre for Data Ethics and Innovation, ‘AI Governance’ (2022) <https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/1146010/CDEI_AI_White_Paper_Final_report.pdf>.

[33] Ada Lovelace Institute and Alan Turing Institute (n 7).

[34] Allison Woodruff and others, ‘A Qualitative Exploration of Perceptions of Algorithmic Fairness’, Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems (ACM 2018) <https://dl.acm.org/doi/10.1145/3173574.3174230> accessed 22 August 2023.

[35] Ada Lovelace Institute and Alan Turing Institute (n 7).

[36] Woodruff and others (n 21).

[37] Rainie and others (n 12).

[38] Ipsos MORI, Open Data Institute and Imperial College Health Partners (n 23).

[39] Woodruff and others (n 34).

[40] BritainThinks and Centre for Data Ethics and Innovation (n 32).

[41] Ada Lovelace Institute, ‘The Rule of Trust: Findings from Citizens’ Juries on the Good Governance of Data in Pandemics.’ (n 31).

[42] Woodruff and others (n 21).

[43] ibid.

[44] ibid.

[45] ibid.

[46] Ada Lovelace Institute, ‘Access Denied? Socioeconomic Inequalities in Digital Health Services’ (2023) <https://www.adalovelaceinstitute.org/wp-content/uploads/2023/09/ADALOV1.pdf>.

[47] Thinks Insights & Strategy and Centre for Data Ethics and Innovation (n 20).

[48] Ada Lovelace Institute and Alan Turing Institute (n 7).

[49] Marina Budic, ‘AI and Us: Ethical Concerns, Public Knowledge and Public Attitudes on Artificial Intelligence’, Proceedings of the 2022 AAAI/ACM Conference on AI, Ethics, and Society (ACM 2022) <https://dl.acm.org/doi/10.1145/3514094.3539518> accessed 22 August 2023.

[50] Kieslich, Lünich and Došenović (n 14).

[51] Rainie and others (n 12).

[52] American Psychological Association, ‘2023 Work in America Survey: Artificial Intelligence, Monitoring Technology, and Psychological Well-Being’ (https://www.apa.org, 2023) <https://www.apa.org/pubs/reports/work-in-america/2023-work-america-ai-monitoring> accessed 26 September 2023.

[53] ibid.

[54] BEIS (n 17).

[55] Tyson and Kikuchi (n 18).

[56] Ada Lovelace Institute and Alan Turing Institute (n 7).

[57] Tyson and Kikuchi (n 18).

[58] Ada Lovelace Institute, ‘The Rule of Trust: Findings from Citizens’ Juries on the Good Governance of Data in Pandemics.’ (n 31).

[59] ‘Majority of Britons Support Vaccine Passports but Recognise Concerns in New Ipsos UK KnowledgePanel Poll’ (Ipsos, 31 March 2021) <https://www.ipsos.com/en-uk/majority-britons-support-vaccine-passports-recognise-concerns-new-ipsos-uk-knowledgepanel-poll> accessed 27 September 2023.

[60] American Psychological Association (n 52).

[61] Ada Lovelace Institute, ‘The Rule of Trust: Findings from Citizens’ Juries on the Good Governance of Data in Pandemics.’ (n 31).

[62] Ada Lovelace Institute, ‘The Citizens’ Biometrics Council. Recommendations and Findings of a Public Deliberation on Biometrics Technology, Policy and Governance’ (n 15).

[63] ibid.

[64] ibid.

[65] ‘Explainer: What Is a Foundation Model?’ <https://www.adalovelaceinstitute.org/resource/foundation-models-explainer/> accessed 26 October 2023.

[66] Thinks Insights & Strategy and Centre for Data Ethics and Innovation (n 20).

[67] Thinks Insights & Strategy and Centre for Data Ethics and Innovation (n 20).

[68] Thinks Insights & Strategy and Centre for Data Ethics and Innovation (n 20).

[69] Thinks Insights & Strategy and Centre for Data Ethics and Innovation (n 20).

[70] Thinks Insights & Strategy and Centre for Data Ethics and Innovation (n 20).

[71] Thinks Insights & Strategy and Centre for Data Ethics and Innovation (n 20).

[72] ibid.

[73] Thinks Insights & Strategy and Centre for Data Ethics and Innovation (n 20).

[74] Ada Lovelace Institute, ‘Who Cares What the Public Think?’ (n 19).

[75] Ada Lovelace Institute, ‘The Rule of Trust: Findings from Citizens’ Juries on the Good Governance of Data in Pandemics.’ (n 31).

[76] Ada Lovelace Institute, ‘The Citizens’ Biometrics Council. Recommendations and Findings of a Public Deliberation on Biometrics Technology, Policy and Governance’ (n 15).

[77] Doteveryone, ‘People, Power and Technology: The 2020 Digital Attitudes Report’ (2020) <https://doteveryone.org.uk/wp-content/uploads/2020/05/PPT-2020_Soft-Copy.pdf> accessed 21 September 2023.

[78] Ada Lovelace Institute, ‘The Citizens’ Biometrics Council. Recommendations and Findings of a Public Deliberation on Biometrics Technology, Policy and Governance’ (n 15).

[79] Ada Lovelace Institute, ‘Lessons from the App Store’ <https://www.adalovelaceinstitute.org/wp-content/uploads/2023/06/Ada-Lovelace-Institute-Lessons-from-the-App-Store-June-2023.pdf> accessed 27 September 2023.

[80] Ipsos MORI, Open Data Institute and Imperial College Health Partners (n 23).

[81] Ada Lovelace Institute, ‘The Rule of Trust: Findings from Citizens’ Juries on the Good Governance of Data in Pandemics.’ (n 31).

[82] Ada Lovelace Institute and Alan Turing Institute (n 7).

[83] Ipsos MORI, Open Data Institute and Imperial College Health Partners (n 23).

[84] Ada Lovelace Institute and Alan Turing Institute (n 7).

[85] Centre for Data Ethics and Innovation, ‘Public Attitudes to Data and AI: Tracker Survey (Wave 2)’ (2022) <https://www.gov.uk/government/publications/public-attitudes-to-data-and-ai-tracker-survey-wave-2>.

[86] Zhang and Dafoe (n 13).

[87] Ipsos MORI, Open Data Institute and Imperial College Health Partners (n 23).

[88] Ada Lovelace Institute and Alan Turing Institute (n 7).

[89] Centre for Data Ethics and Innovation (n 84) 2.

[90] Ipsos MORI, Open Data Institute and Imperial College Health Partners (n 23).

[91] Wright and others (n 8).

[92] Ada Lovelace Institute, ‘Access Denied? Socioeconomic Inequalities in Digital Health Services’ (n 46).

[93] Ada Lovelace Institute, ‘Who Cares What the Public Think?’ (n 19).

[94] ibid.

[95] Milltown Partners and Clifford Chance (n 11).

[96] Ada Lovelace Institute, ‘The Rule of Trust: Findings from Citizens’ Juries on the Good Governance of Data in Pandemics.’ (n 31).

[97] Ada Lovelace Institute, ‘The Citizens’ Biometrics Council’ (Ada Lovelace Institute 2021) <https://www.adalovelaceinstitute.org/report/citizens-biometrics-council/>.

[98] Ada Lovelace Institute, ‘The Rule of Trust: Findings from Citizens’ Juries on the Good Governance of Data in Pandemics.’ (n 31).

[99] ‘CDEI | AI Governance’ (BritainThinks 2022) <https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/1177293/Britainthinks_Report_-_CDEI_AI_Governance.pdf> accessed 22 August 2023.

[100] Kimon Kieslich, Birte Keller and Christopher Starke, ‘Artificial Intelligence Ethics by Design. Evaluating Public Perception on the Importance of Ethical Design Principles of Artificial Intelligence’ (2022) 9 Big Data & Society 205395172210929 <https://journals.sagepub.com/doi/10.1177/20539517221092956?icid=int.sj-full-text.similar-articles.3#:~:text=The%20results%20suggest%20that%20accountability,systems%20is%20slightly%20less%20important.> accessed 22 August 2023.

[101] Children’s Parliament, Scottish AI Alliance and The Alan Turing Institute, ‘Exploring Children’s Rights and AI. Stage 1 (Summary Report)’ (2023) <https://www.turing.ac.uk/sites/default/files/2023-05/exploring_childrens_rights_and_ai.pdf>.

[102] ‘CDEI | AI Governance’ (n 98).

[103] Ada Lovelace Institute, ‘The Citizens’ Biometrics Council’ (n 96).

[104] Ada Lovelace Institute, ‘Who Cares What the Public Think?’ (n 19).

[105] Ada Lovelace Institute, ‘The Rule of Trust: Findings from Citizens’ Juries on the Good Governance of Data in Pandemics.’ (n 31).

[106] Ada Lovelace Institute, ‘Who Cares What the Public Think?’ (n 19).

[107] Ada Lovelace Institute, ‘The Rule of Trust: Findings from Citizens’ Juries on the Good Governance of Data in Pandemics.’ (n 31).

[108] Wright and others (n 8).

[109] Ada Lovelace Institute, ‘Access Denied? Socioeconomic Inequalities in Digital Health Services’ (n 46).

[110] Ada Lovelace Institute and Alan Turing Institute (n 7).

[111] Ada Lovelace Institute, ‘The Rule of Trust: Findings from Citizens’ Juries on the Good Governance of Data in Pandemics.’ (n 31).

[112] Sabine N van der Veer and others, ‘Trading off Accuracy and Explainability in AI Decision-Making: Findings from 2 Citizens’ Juries’ (2021) 28 Journal of the American Medical Informatics Association 2128 <https://academic.oup.com/jamia/article/28/10/2128/6333351> accessed 3 May 2023.

[113] Woodruff and others (n 21).

[114] Anne-Marie Nussberger and others, ‘Public Attitudes Value Interpretability but Prioritize Accuracy in Artificial Intelligence’ (2022) 13 Nature Communications 5821 <https://www.nature.com/articles/s41467-022-33417-3> accessed 8 June 2023.

[115] Woodruff and others (n 34).

[116] Ada Lovelace Institute and Alan Turing Institute (n 7).

[117] Woodruff and others (n 21).

[118] ibid.

[119] Milltown Partners and Clifford Chance (n 11).

[120]  Woodruff and others (n 21).

[121] Ada Lovelace Institute and Alan Turing Institute (n 7).

[122] ibid.

[123] Woodruff and others (n 21).

[124] BritainThinks and Centre for Data Ethics and Innovation (n 32).

[125] Ada Lovelace Institute and Alan Turing Institute (n 7).

[126] Ada Lovelace Institute, ‘Who Cares What the Public Think?’ (n 19).

[127] Ada Lovelace Institute, ‘The Citizens’ Biometrics Council. Recommendations and Findings of a Public Deliberation on Biometrics Technology, Policy and Governance’ (n 15).

[128] Ada Lovelace Institute and Alan Turing Institute (n 7).

[129] Milltown Partners and Clifford Chance (n 11).

[130] Ada Lovelace Institute, ‘Listening to the Public. Views from the Citizens’ Biometrics Council on the Information Commissioner’s Office’s Proposed Approach to Biometrics.’ (n 28).

[131] Ada Lovelace Institute, ‘The Rule of Trust: Findings from Citizens’ Juries on the Good Governance of Data in Pandemics.’ (n 31).

[132] ibid.

[133] Ipsos MORI, Open Data Institute and Imperial College Health Partners (n 23).

[134] Ipsos MORI, Open Data Institute and Imperial College Health Partners (n 23).

[135] Ada Lovelace Institute, ‘Access Denied? Socioeconomic Inequalities in Digital Health Services’ (n 46).

[136] Ipsos MORI, Open Data Institute and Imperial College Health Partners (n 23).

[137] ibid.

[138] Ada Lovelace Institute, ‘The Citizens’ Biometrics Council. Recommendations and Findings of a Public Deliberation on Biometrics Technology, Policy and Governance’ (n 15).

[139] Ada Lovelace Institute, ‘The Citizens’ Biometrics Council’ (n 96).

[140] Ada Lovelace Institute, ‘The Rule of Trust: Findings from Citizens’ Juries on the Good Governance of Data in Pandemics.’ (n 31).

[141] Ipsos MORI, Open Data Institute and Imperial College Health Partners (n 23).

[142] Ada Lovelace Institute, ‘The Rule of Trust: Findings from Citizens’ Juries on the Good Governance of Data in Pandemics.’ (n 31).

[143] Children’s Parliament, Scottish AI Alliance and The Alan Turing Institute (n 100).

[144] ibid.

[145] Ada Lovelace Institute and Alan Turing Institute (n 7).

[146] BEIS, ‘BEIS Public Attitudes Tracker: Artificial Intelligence Summer 2022, UK’ (Department for Business, Energy & Industrial Strategy 2022) <https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/1105175/BEIS_PAT_Summer_2022_Artificial_Intelligence.pdf>.

[147] Tyson and Kikuchi (n 18).

[148] Ada Lovelace Institute and Alan Turing Institute (n 7).

[149] American Psychological Association (n 52).

[150] Ipsos, ‘Global Views on AI 2023: How People across the World Feel about Artificial Intelligence and Expect It Will Impact Their Life’ (2023) <https://www.ipsos.com/sites/default/files/ct/news/documents/2023-07/Ipsos%20Global%20AI%202023%20Report%20-%20NZ%20Release%2019.07.2023.pdf> accessed 3 October 2023.

[151] Tyson and Kikuchi (n 18).

[152] Ada Lovelace Institute and Alan Turing Institute (n 7).

[153] Rainie and others (n 12).

[154] Ada Lovelace Institute and Alan Turing Institute (n 7).

[155] Wright and others (n 8).

[156] Woodruff and others (n 21).

[157] IAP2, ‘IAP2 Spectrum of Public Participation’ <https://iap2.org.au/wp-content/uploads/2020/01/2018_IAP2_Spectrum.pdf>.

[158] Ada Lovelace Institute, ‘Participatory Data Stewardship: A Framework for Involving People in the Use of Data’ (2021) <https://www.adalovelaceinstitute.org/report/participatory-data-stewardship/>.

[159] Lee Hadlington and others, ‘The Use of Artificial Intelligence in a Military Context: Development of the Attitudes toward AI in Defense (AAID) Scale’ (2023) 14 Frontiers in Psychology 1164810 <https://www.frontiersin.org/articles/10.3389/fpsyg.2023.1164810/full> accessed 24 August 2023.

[160] Katie Cohen and Robert Doubleday (eds), Future Directions for Citizen Science and Public Policy (Centre for Science and Policy 2021).

[161] ibid.

[162] Claire Mellier and Rich Wilson, ‘Getting Real About Citizens’ Assemblies: A New Theory of Change for Citizens’ Assemblies’ (European Democracy Hub: Research, 10 October 2023).

[163] Ada Lovelace Institute, ‘Rethinking Data and Rebalancing Digital Power’ (2022) <https://www.adalovelaceinstitute.org/project/rethinking-data/>.

[164] Global Science Partnership, ‘The Inclusive Policymaking Toolkit for Climate Action’ (2023) <https://www.globalsciencepartnership.com/_files/ugd/b63d52_8b6b397c52b14b46a46c1f70e04839e1.pdf> accessed 3 October 2023.

[165] Michele Gilman, ‘Democratizing AI: Principles for Meaningful Public Participation’ (Data & Society 2023) <https://datasociety.net/wp-content/uploads/2023/09/DS_Democratizing-AI-Public-Participation-Brief_9.2023.pdf> accessed 5 October 2023.

[166] Hélène Landemore, Democratic Reason: Politics, Collective Intelligence, and the Rule of the Many (2017).

[167] Nicole Curato, Deliberative Mini-Publics: Core Design Features (Bristol University Press 2021).

[168] OECD, ‘Institutionalising Public Deliberation’ (OECD) <https://www.oecd.org/governance/innovative-citizen-participation/icp-institutionalising%20deliberation.pdf>.

[169] Nicole Curato and others, ‘Twelve Key Findings in Deliberative Democracy Research’ (2017) 146 Daedalus 28 <https://direct.mit.edu/daed/article/146/3/28-38/27148> accessed 6 August 2021.

[170] Ipsos MORI, Open Data Institute and Imperial College Health Partners (n 23).

[171] Ada Lovelace Institute, ‘Rethinking Data and Rebalancing Digital Power’ (n 162).

[172] ibid.

[173] Global Science Partnership (n 163).

[174] Kimmo Grönlund, André Bächtiger and Maija Setälä, Deliberative Mini-Publics: Involving Citizens in the Democratic Process (ECPR Press 2014).

[175] Curato and others (n 168).

[176] OECD, ‘Innovative Citizen Participation and New Democratic Institutions: Catching the Deliberative Wave’ (OECD 2021) <https://www.oecd-ilibrary.org/governance/innovative-citizen-participation-and-new-democratic-institutions_339306da-en> accessed 5 January 2022.

[177] Saskia Goldberg and André Bächtiger, ‘Catching the “Deliberative Wave”? How (Disaffected) Citizens Assess Deliberative Citizen Forums’ (2023) 53 British Journal of Political Science 239 <https://www.cambridge.org/core/product/identifier/S0007123422000059/type/journal_article> accessed 8 September 2023.

[178] OECD (n 167).

[179] David M Farrell and others, ‘When Mini-Publics and Maxi-Publics Coincide: Ireland’s National Debate on Abortion’ [2020] Representation 1 <https://www.tandfonline.com/doi/full/10.1080/00344893.2020.1804441> accessed 19 July 2021.

[180] Global Assembly Team, ‘Report of the 2021 Global Assembly on the Climate and Ecological Crisis’ (2022) <http://globalassembly.org>.

[181] Nicole Curato and others, ‘Global Assembly on the Climate and Ecological Crisis: Evaluation Report’ (2023).

[182] Ada Lovelace Institute, ‘Rethinking Data and Rebalancing Digital Power’ (n 162).

[183] Mellier and Wilson (n 161).

[184] Felipe González and others, ‘Global Reactions to the Cambridge Analytica Scandal: A Cross-Language Social Media Study’ [2019] WWW ’19: Companion Proceedings of The 2019 World Wide Web Conference 799.

[185] Ada Lovelace Institute, ‘Participatory Data Stewardship: A Framework for Involving People in the Use of Data’ (n 157).


Image credit: Kira Allman


This report sets out how regulation can provide the clear, unambiguous rules that are necessary if the UK is to embrace AI on terms that will be beneficial for people and society.

Regulate to innovate provides evidence for how the UK might develop an approach to AI regulation that is in line with its ambition for innovation – as set out in the UK AI Strategy – as well as recommendations for the Office for AI’s forthcoming White Paper on the regulation and governance of AI.

Executive summary

In its 2021 National AI Strategy, the UK Government laid out its ambition to make the UK an ‘AI superpower’, bringing economic and societal benefits through innovation. Realising this goal has the potential to transform the UK’s society and economy over the coming decades, and promises significant economic and societal benefits. But the rapid development and proliferation of AI systems also poses significant risks.

As with other disruptive and emerging technologies,[footnote]Mazzucato, M. (2015). ‘From Market Fixing to Market-Creating: A New Framework for Economic Policy’, SSRN Electronic Journal. Available at: https://doi.org/10.2139/ssrn.2744593.[/footnote] creating a successful, safe and innovative AI-enabled economy will be dependent on the UK Government’s ability to establish the right approach to governing and regulating AI systems. And as the UK AI Council’s Roadmap, published in January 2021, states, ‘the UK will only feel the full benefits of AI if all parts of society have full confidence in the science and the technologies, and in the governance and regulation that enable them.’[footnote] AI Council. (2021). AI Roadmap. UK Government. January 2021. Available at: www.gov.uk/government/publications/ai-roadmap [accessed 11 October 2021].[/footnote]

The UK is well placed to develop the right regulatory conditions for AI to flourish, and to balance the economic and societal opportunities with associated risks,[footnote]Office for AI. (2021). National AI Strategy. UK Government. Available at: https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/1020402/National_AI_Strategy_-_PDF_version.pdf.[/footnote] but urgently needs to set out its approach to this vital, complex task.

However, articulating the right governance and regulatory environment for AI will not be easy.

By virtue of their ability to develop and operate independently of human control, and to make decisions with moral and legal consequences, AI systems present a uniform set of general regulatory and legal challenges concerning agency, causation, accountability and
control. At the same time, the specific regulatory questions posed by AI systems vary considerably across the different domains and industries in which they might be deployed.

Regulators must therefore be able to find ways of accounting consistently for the general properties of AI while also attending to the peculiarities of individual use cases and business models. While other states and economic blocs are already in the process of engaging with tough but unavoidable regulatory challenges through new draft legislation, the UK has still to commit to its regulatory approach to AI.

In September 2021, the Office for AI pledged to set out the Government’s position on AI regulation in a White Paper, to be published in early 2022. Over the course of 2021, the Ada Lovelace Institute convened a cross-disciplinary panel of experts to explore approaches to AI regulation, and inform the development of the Government’s position. Based on this, and Ada’s own research, this report sets out how the UK might develop its approach to AI regulation in line with its ambition for innovation. In this report we:

  1. explore some of the aims and objectives of AI regulation that might be considered alongside economic growth
  2. outline some of the challenges associated with regulating AI
  3. review the regulatory toolkit, and options for rules and system design, which address technologies, markets and use-specific issues
  4. identify and evaluate some of the different tools and approaches that might be used to overcome the challenges of AI regulation
  5. assess the institutional and legal conditions required for the effective regulation of AI
  6. raise outstanding questions that the UK Government will have to answer in setting out and realising its approach to AI regulation.

The report also identifies a series of conclusions for policymakers, as well as specific recommendations for the Office for AI’s White Paper on the regulation and governance of AI. To present a viable roadmap for the UK’s regulatory ecosystem, the White Paper will need to make clear commitments in three important areas:

  • The development of new, clear regulations for AI.
  • Improved regulatory capacity and coordination.
  • Improved transparency standards and accountability mechanisms.

The development of new, clear regulations for AI

We make the case for the UK Government to:

  • develop a clear description of AI systems that reflects its overall approach to AI regulation, and criteria for regulatory intervention
  • create a central function to oversee the development and implementation of AI-specific, domain-neutral statutory rules for AI systems that are rooted in legal
    and ethical principles
  • require individual regulators to develop sector-specific codes of practice for the regulation of AI.

Improved regulatory capacity and coordination

We argue that there is a need for:

  • expanded funding for regulators to help them deal with analytical and enforcement challenges posed by AI systems
  • expanded funding and support for regulatory experimentation and the development of anticipatory and participatory capacity within individual regulators
  • the development of formal structures for capacity sharing, coordination and intelligence sharing between regulators dealing with AI systems
  • consideration of what additional powers regulators may need to enable them to make use of a greater variety of regulatory mechanisms.

Improved transparency standards and accountability mechanisms

The impacts of AI systems may not always be visible to, or controllable by, policymakers and regulators alone. As such, regulation and regulatory intelligence gathering will have to be complemented by, and coordinated with, extra-regulatory mechanisms such as standards, investigative journalism and activism. We argue that the UK Government should consider:

  • using the UK’s influence over international standards to improve the transparency and auditability of AI systems
  • how best to maintain and strengthen laws and mechanisms to protect and enable journalists, academics, civil-society organisations, whistleblowers and citizen auditors to hold developers and deployers of AI systems to account.

Overall, this report finds that, far from being an impediment to innovation, effective, future-proof regulation will provide companies and developers with the space to experiment and take risks without being hampered by concerns about legal, reputational or ethical exposure.

Regulation is also necessary to give the public the confidence to embrace AI technologies, and to ensure continued access to foreign markets.

The report also highlights how regulation is an indispensable tool, alongside robust industry codes of practice and judicious public-funding and procurement decisions, to help navigate the narrow path between the risks and harms these technologies present.

We propose that the clear, unambiguous rules that regulation can provide are necessary if
the UK is to embrace AI on terms that will be beneficial in the long term.

To support this approach, we should resist the characterisation that regulation is the enemy of
innovation: modern, relevant, effective regulation will be the brakes that allow us to drive the UK’s AI vehicle successfully and safely into new and beneficial territories.

Finally, this research outlines the major questions and challenges that will need to be addressed in order to develop effective and proportionate AI regulation. In addition to supporting the UK Government’s thinking on how to become an ‘AI superpower’ in a manner that manages risk and results in broadly felt public benefit, we hope this report will contribute to live debates on AI regulation in Europe and the rest of the world.

How to read this report

This report is principally aimed at influencing the emerging policy discourse around the regulation of AI in the UK, and around the world.

  • In the introduction we argue that regulation represents the missing link in the UK’s overall AI strategy, and that addressing this gap will be critical to the UK’s plans to become an AI superpower.
  • Chapter 1 sets out the aims and objectives UK AI regulation should pursue, in addition to economic growth.
  • Chapter 2 reviews the generic regulatory toolkit, and sets out the different ways that regulatory rules and systems can be conceived and configured to deal with different kinds of problems, technologies and markets.
  • Chapters 3 and 4 review some of the specific challenges associated with regulating AI systems, and set out some of the tools and approaches that have the potential to help overcome or ameliorate these difficulties.
  • Chapter 5 articulates some general lessons for policymakers considering how to regulate AI in a UK context.
  • Chapter 6 sets out some specific recommendations for the Office for AI’s forthcoming White Paper on the regulation and governance of AI.

If you’re a UK policymaker thinking about how to regulate AI systems

We encourage you to read the recommendations at the end of this report, which set out some of the key pieces of guidance we hope the Office for AI will incorporate in their forthcoming White Paper.

If you’re from a regulatory body

Explore the mechanisms and approaches to regulating AI, set out in chapter 3, which may provide some ideas for how your organisation can hold these systems more accountable.

If you’re a policymaker from outside of the UK

Many of the considerations articulated in this report are, despite the UK framing, applicable to other national contexts. The considerations for regulating AI that are set out in chapters 1, 2 and 3 are universally applicable.

If you’re a developer of AI systems, or an AI academic

The introduction and the lessons for policymakers section set out why the UK needs to take a new approach to the regulation of AI.

A note on terminology: Throughout this report, we use ‘regulation’ to refer to the codified ‘hard’ rules and directives established by governments to control and govern a particular domain or technology. By contrast, we use the term ‘governance’ to refer to non-regulatory means by which a domain or technology might be controlled or influenced, such as norms, conventions, codes of practice and other ‘soft’ interventions.

 

The terms ex ante (before the event) and ex post (after the event) are used throughout this document. Here, ‘ex ante’ regulation typically refers to regulatory mechanisms intended to prevent or ameliorate future harms, whereas ‘ex post’ refers to mechanisms intended to remedy harms after the fact, or to provide redress.

Introduction

In its 2021 National AI Strategy, the UK Government outlines three core pillars for setting the country on a path towards becoming a global AI and science superpower. These are:[footnote]Office for AI. (2021). National AI strategy. UK Government. Available at: www.gov.uk/government/publications/national-ai-strategy.[/footnote]

  1. investing in the long-term needs of the AI ecosystem
  2. supporting the transition to an AI-enabled economy
  3. ensuring the UK gets the national and international governance of AI technologies right to encourage innovation, investment and protect the public and ‘fundamental values’.[footnote] The strategy uses, but does not define a range of terms related to values, including ‘fundamental values’, ‘our ethical values’, ‘our democratic values’, ‘UK values’, ‘fundamental UK values’ and ‘open society values’. It also refers to ‘values such as fairness, openness, liberty, security, democracy, rule of law and respect for human rights’[/footnote]

As part of its third pillar, the strategy states the Office for AI will set out a ‘national position on governing and regulating AI’ in a White Paper in early 2022. This report seeks to help the Office for AI develop this forthcoming strategy, setting out some of the key challenges associated
with the regulation of AI, different options for approaching the task and a series of concrete recommendations for the UK Government.

The publication of the new AI strategy represents an important articulation of the UK’s ambitions to cultivate and utilise the power of AI. It provides welcome detail on the Government’s proposed approach to AI investment, and their plans to increase the use of AI systems throughout different parts of the economy. Whether the widespread adoption of AI systems will increase economic growth remains to be seen, but it is a belief that underpins this Government’s strategy, and this paper does not seek to explore that assumption.[footnote]One challenge is whether increasing AI adoption may only serve to consolidate the power of a handful of US-based tech companies who use their resources to acquire AI-based start-ups. A 2019 UK Government review of digital competition found that ‘over the last 10 years the 5 largest firms have made over 400 acquisitions globally’. See Furman, J. (2019). Unlocking digital competition, Report of the Digital Competition Expert Panel. HM Treasury. Available at: www.gov.uk/government/publications/unlocking-digital-competition-report-of-the-digital-competition-expert-panel.[/footnote]

The strategy also highlights some areas that will require further policy thinking and development in the near future. The chapter ‘Governing AI effectively’ notes some of the challenges associated with governing and regulating AI systems that are top of mind for this Government and surveys some of the different regulatory approaches that could be taken, but remains agnostic on which might work best for the UK.

Instead, it asks whether the UK’s current approach to AI regulation is adequate, and commits to set out ‘the Government’s position on the risks and harms posed by AI technologies and our proposal to address them’ in a White Paper in early 2022. In making a commitment to set out the UK’s ‘national position on governing and regulating AI’, the Government has set itself an ambitious timetable for articulating how it intends to address one of the most important gaps in current UK AI policy.

This report explores how the UK’s National AI Strategy might address the regulation and governance of AI systems. It is informed by the Ada Lovelace Institute’s own research and analysis into mechanisms for regulating AI, as well as two expert workshops that the Institute convened in April and May 2021. These convenings brought together academics, public and civil servants, regulators and representatives from civil society organisations to discuss:

  1. How the UK’s regulatory and governance mechanisms may have to evolve and adapt in order to serve the needs and ambitions of the UK’s approach to AI.
  2. How Government policy can support the UK’s regulatory and governance mechanisms to undergo these changes.

The Government is already in the process of drawing up and consulting on plans for the future of UK data regulation and governance, much of which relates to the use of data for AI systems.[footnote]Department for Digital, Culture, Media & Sport. (2021). Data: A new direction. UK Government. Available at: https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/1016395/Data_Reform_Consultation_Document__Accessible_.pdf[/footnote] While relevant to AI, data-protection law does not holistically address the kinds of risks and impacts AI systems may present – and is not enough on its own to provide AI developers, users and the public with the clarity and protection they need to integrate these technologies into society with confidence.

Where work to establish a supporting ecosystem for AI is already underway, the Government has so far focused primarily on developing and setting out AI-governance measures, such as the creation of bodies like the Centre for Data Ethics and Innovation (CDEI), with less attention
and activity on specific approaches to the regulation of AI systems.[footnote] The first UK AI strategy (called the UK AI Sector Deal), published in 2017 and updated in 2019, makes relatively little mention of the role of regulation and governance. In discussing how to build trust in the adoption of AI and address its challenges, the strategy is limited to calls for the creation of the Centre for Data Ethics and Innovation to ‘ensure safe, ethical and ground-breaking innovation in AI and data-driven technologies.’ Though the CDEI, since its inception, has produced various helpful pieces of evidence and guidance on ethical best practice around AI (such as a review into bias in algorithmic decision-making and an adoption guide for privacy-enhancing technologies), thinking on how regulation, specifically, might support the responsible development and use of AI remains less advanced.[/footnote]

To move forward, the UK Government will have to answer fundamental questions on the regulation of AI systems in the forthcoming White Paper, including:

  • What should the goal of AI regulation be, and what kinds of regulatory tools and mechanisms can help achieve those objectives?
  • Do AI systems require bespoke regulation, or can the regulation of these systems be wrapped into existing sector-specific regulations, or a broader regulatory package for digital technologies?
  • Should regulating AI require the creation of a single AI regulator, or empower existing regulatory bodies with the capacity and resources to regulate these systems?
  • What kinds of governance practices work for AI systems, and how can regulation incentivise and empower these kinds of practices?
  • How can regulators best address some of the underlying root causes of the harms associated with AI systems?[footnote]Balayan, A., and Gürses, S., (2021). Beyond Debiasing: Regulating AI and its inequalities. European Digital Rights. Available at: https://edri.org/our-work/if-ai-is-the-problem-is-debiasing-the-solution.[/footnote]

For the UK’s AI industry it will be vital that the Government provides actionable answers to these questions. Creating a world-leading AI economy will require consistent and understandable rules, clear objectives and meaningful enforcement mechanisms.

Other world leaders in AI development are already establishing regulations around AI. In April 2021, the European Commission released a draft proposal for the regulation of AI (part of a suite of regulatory proposals for digital markets and services), which proposes a risk-based
model for establishing certain requirements on the sale and deployment of AI technologies.[footnote] European Commission. (2021). A Regulation of the European Parliament and of the Council Laying Down Harmonised Rules on Artifical Intelligence (Artificial Intelligence Act) and Amending Certain Union Legislative Acts. Available at: https://eur-lex.europa.eu/legal-content/EN/ALL/?uri=CELEX%3A52021PC0206 [accessed 4 October 2021].[/footnote] While this draft is still subject to extensive review, it has the potential to set a new global standard for AI regulation that other countries are likely to follow.

In August 2021, the Cyberspace Administration of China passed a set of draft regulations for algorithmic systems,[footnote] Cyberspace Administration of China (国家互联网信息办公室). (2021). Notice of the State Internet Information Office on the Regulations on the Management of Recommendations for Internet Information Service Algorithms (Draft for Solicitation of Comments). 27 August. Available at: www-cac-gov-cn.translate.goog/2021-08/27/c_1631652502874117.htm?_x_tr_sch=http&_x_tr_sl=zh-CN&_x_tr_tl=en&_x_tr_hl=en&_x_tr_pto=ajax,nv,elem [Accessed 16 September 2021].[/footnote] which includes requirements and standards for the design, use and kinds of data that algorithmic systems can use.[footnote]For an interesting analysis, see Schaefer, K. (2021). 27 August. Available at: https://twitter.com/kendraschaefer/status/1431134515242496002 [accessed 22 October 2021].[/footnote] The USA is taking a slower and more fragmented route to the regulation of AI, but is also heading towards establishing its own approach.[footnote] Since 2019, numerous government offices – including the White House’s Office of Science and Technology Policy, the National Institute of Standards and Technology, and the Department of Defence Innovation Board – have set out positions and principles for a national framework on AI.[/footnote]

Throughout 2021, the US Congress has introduced several pieces of federal AI governance and data-protection legislation, such as the Information Transparency and Personal Data Control Act, which would establish similar requirements to the EU GDPR.[footnote]US Congress. (2021). H.R.1816 – Information Transparency & Personal Data Control Act. Available at: www.congress.gov/bill/117th-congress/house-bill/1816/.[/footnote] In October 2021, the White House Office of Science and Technology Policy announced its intention to develop a ‘bill of rights’ to ‘clarify the rights and freedoms [that AI systems] should respect.’[footnote] Lander, E., and Nelson, A. (2021). ‘Americans need a bill of rights for an AI-powered world,’ Wired, 10 October. Available at: www.wired.com/story/opinion-bill-of-rights-artificial-intelligence [accessed 11 October 2021].[/footnote] Moreover, it is looking increasingly likely that geostrategic considerations will push the EU and the USA into closer regulatory proximity over the coming years, with European Commission President von der Leyen having recently pushed for the EU and the USA to start collaborating on the promotion and governance of AI systems.[footnote]In a November 2020 speech at the Council on Foreign Relations. See Branson, A. (2020). ‘European Commission woos US over AI agreement.’ Global Government Forum. Available at: www.globalgovernmentforum.com/european-commission-woosus-over-ai-agreemen[/footnote]

As the positions of the world’s most powerful states and economic blocs on the regulation of AI become clearer, more developed and potentially more aligned, it will be increasingly incumbent on the UK to set out its own plans, or risk getting left behind. Unless the UK carves out its own approach towards the regulation of AI, it risks playing catch-up with other nations, or having to default to approaches developed elsewhere that may not align with the Government’s particular strategic objectives. Moreover, if domestically produced AI systems do not align with regulatory standards adopted by other major trade blocs, this could have significant implications for companies operating in the UK’s domestic AI sector, who could find themselves excluded from non-UK markets.

As well as trade considerations, a clear regulatory strategy for AI will be essential to the UK Government’s stated ambitions to use AI to power economic growth, raise living standards and address pressing societal challenges like climate change. As the UK has learned from a variety of different industries, from its enduringly strong life-sciences sector,[footnote] Kent, C. (2019). ‘UK Healthcare Industry Analysis 2019: Why Britain Is a World Leader’. Pharmaceutical Technology. Available at: https://www.pharmaceutical-technology.com/sponsored/uk-healthcare-industry-analysis-2019/ [accessed 20 September 2021].[/footnote] to recent successes in fintech,[footnote]McLean, A., and Wood, I. (2015). ‘Do Regulators Hold the Key to FinTech Success?’, Financier Worldwide Available at: www.financierworldwide.com/do-regulators-hold-the-key-to-fintech-success [accessed 20 September 2021].[/footnote] a clear and robust regulatory framework is essential for the development and diffusion of new technologies and processes. A regulatory framework would ensure developers and deployers of AI systems know how to operate in accordance with the law and protect against the kinds of well-documented harms associated with these technologies,[footnote]McGregor, S. (2020). When AI Systems Fail: Introducing the AI Incident Database. Partnership on AI. Available at: https://partnershiponai.org/aiincidentdatabase.[/footnote] which can undermine public confidence in their development and use.

The need for clear and comprehensive AI regulation is pressing. AI is a complex, novel technology whose benefits are yet to be evenly distributed across society, while there is a growing body of evidence about the ways it can cause harm.[footnote]Pownall, C. (2021). AI, algorithmic and automation incidents and controversies. Available at: https://charliepownall.com/ai-algorithimic-incident-controversy-database.[/footnote] Across the world, AI systems are being increasingly used in high-stakes settings such as determining which job applicants are successful,[footnote]Dattner, B., Chamorro-Premuzic, T., Buchband, R., and Schettler, L. (2019). ‘The Legal and Ethical Implications of Using AI in Hiring’, Harvard Business Review, 25 April 2019. Available at: https://hbr.org/2019/04/the-legal-and-ethical-implications-of-using-ai-in-hiring [accessed 20 September 2021].[/footnote] what public benefits residents are eligible to claim,[footnote]Martinho-Truswell, E. (2018). ‘How AI Could Help the Public Sector’, Harvard Business Review, 26 January 2018. Available at: https://hbr.org/2018/01/how-ai-could-help-the-public-sector [accessed 20 September 2021].[/footnote] what kind of loan a prospective financial-services client can receive,[footnote]Faggella, D. (2020). ‘Artificial Intelligence Applications for Lending and Loan Management’, Emerj. Available at: https://emerj.com/ai-sector-overviews/artificial-intelligence-applications-lending-loan-management/ [accessed 20 September 2021].[/footnote] or what risk to society a person may potentially pose.[footnote]Tashea, J. (2017). ‘Courts Are Using AI to Sentence Criminals. That Must Stop Now’, Wired. Available at: www.wired.com/2017/04/courts-using-ai-sentence-criminals-must-stop-now/ [accessed 20 September 2021][/footnote] In many of these instances, AI systems have not yet been proven capable of addressing these kinds of tasks fairly or accurately; in others, they have not been properly integrated into the complex social environments in which they have been deployed.

But building such a regulatory framework for AI will not be easy. By virtue of their ability to develop and operate independently of human control, and to make decisions with moral and legal consequences, AI systems present a uniform set of regulatory and legal challenges concerning agency, causation, accountability and control.[footnote]Turner, J. (2018). Robot Rules: Regulating Artificial Intelligence. Palgrave Macmillan.[/footnote]

At the same time, the specific regulatory questions posed by AI systems vary considerably across the different domains and industries in which they might be deployed. Regulators must find ways of accounting consistently for the general properties of AI, while also attending to the
peculiarities of individual use-cases and business models.

In these contexts, AI systems raise unprecedented legal and regulatory questions, arising from their ability to automate morally significant decision-making processes in ways that can be difficult to predict, and from their capacity to develop and operate independently of human control.

AI systems are also frequently complex and opaque, and often fail to fall neatly within the contours of existing regulatory systems – they either straddle regulatory remits, or else fall through the gaps in between them. And they are developed for a variety of purposes in different domains, where their impacts, benefits and risks may vary considerably.

These features can make it extremely difficult for existing regulatory bodies to understand if, how and in what manner to intervene.

As a result of this ubiquity and complexity, there is no pre-existing regulatory framework – from finance, medicine, product safety, consumer regulation or elsewhere – that can readily be reworked into an overall, cross-cutting approach to UK AI regulation, nor any that looks capable of playing such a role without substantial modification. Instead, a coherent, effective and durable regulatory framework for AI will have to be developed from first principles, borrowing and adapting regulatory techniques, tools and ideas where they are relevant and developing new ones where necessary.

Difficulties posed by the intrinsic features of AI systems are compounded by the business practices of many of the companies that develop them. The developers of AI systems often do not sit neatly within any one geographic jurisdiction, and face few existing regulatory requirements to disclose details of how and where their systems operate. Moreover, the business models of many of the largest and most successful firms that develop AI systems tend towards market dominance, data agglomeration and user disempowerment.

All this makes the Office for AI’s task of using its forthcoming White Paper to set out the UK’s position on governing and regulating AI a substantial challenge. Even if the Office for AI limits itself to articulating a high-level direction of travel for AI regulation, doing so will involve adjudicating between competing values and visions of the UK’s relationship to AI, as well as between differing approaches to addressing the multiple regulatory challenges posed by the technology.

Over the course of 2021, the Ada Lovelace Institute has undertaken multiple research projects and convened expert conversations on many of the issues relevant to how the UK should approach the regulation of AI.

These included:

  • two expert workshops exploring the potential underlying goals of a regulatory system for AI in the UK, the different ways it might be designed, and the tools and mechanisms it would require
  • workshops considering the EU’s emerging approach to AI regulation
  • research on algorithmic accountability in the public sector and on transparency mechanisms for algorithmic decision-making systems.

Drawing on the insights generated, and on our own research and deliberation, this report sets out to answer the following questions on how the UK might go about developing its approach to the regulation of AI:

  1. What might the UK want to achieve with a regulatory framework for AI?
  2. What kinds of regulatory approaches and tools could support such outcomes?
  3. What are the institutional and legal conditions needed to enable them?

It is our hope that these considerations will be useful in informing the development of the Office for AI’s White Paper, as well as broader policy debates around AI regulation. The White Paper’s publication presents a critical opportunity to help ensure that regulation delivers on its promise to help the UK live up to its ambition of becoming an ‘AI superpower’, and that such a status delivers economic and societal benefits.

Expert workshops on the regulation of AI

 

In April and May 2021, the Ada Lovelace Institute (Ada) convened two expert workshops, bringing together academics, AI researchers, public and civil servants and civil-society organisations to explore how the UK Government should approach the regulation of AI. The insights gained from these workshops have, alongside Ada’s own research and deliberation, informed the discussions presented in this report.[footnote]Any references in this report to the views and insights of ‘expert participants’ are references to the discussions in the two workshops.[/footnote]

 

These discussions were initially framed around the approach of the UK’s National AI Strategy to AI regulation. In practice, they became broader dialogues about the UK’s relationship to AI, what the goals of Government policy regarding AI systems should be and the UK’s approach to their regulation.

 

  • Workshop one: Explored the underlying goals and aims of UK AI policy, particularly with regards to regulation and governance. A key aim here was to establish what long-term objectives, alongside economic growth, the UK should aspire to achieve through AI policy.
  • Workshop two: Concentrated on identifying the specific mechanisms and policy changes that would be needed for the realisation of a successful, joined-up approach to AI regulation. Participants were encouraged to consider the challenges associated with the different objectives of AI policy, as well as broader challenges associated with regulating AI. They then discussed what regulatory approaches, tools and techniques might be required to address them. Participants were also invited to consider whether the UK’s regulatory infrastructure itself may need to be adapted or supplemented.

 

The workshops were conducted under the Chatham House Rule. With the exception of presentations given by expert participants, none of the insights produced by these workshops are attributed specifically to individual people or organisations.

 

Expert participants are listed in full in the acknowledgements section at the end of the report.

 

Representatives from the Office for AI also attended the workshops as observers.

UK AI strategies and regulation

The UK Government’s thinking on the regulation of AI has developed significantly over the past five years. This box sets out some of the major milestones in the Government’s position on the regulation and governance of AI over this time, with the aim of putting the 2021 UK AI Strategy into the context of recent history.

2017-19 UK AI strategy

The original UK AI strategy (called the UK AI Sector Deal), published in 2017 and updated in 2019, makes relatively little mention of the role of regulation.[footnote]European Commission. United Kingdom AI Strategy Report. Available at: https://knowledge4policy.ec.europa.eu/ai-watch/united-kingdom-ai-strategy-report_en.[/footnote] In discussing how to build trust in the adoption of AI and address its challenges, the strategy is limited to calls for the creation of the Centre for Data Ethics and Innovation (CDEI) to ‘ensure safe, ethical and ground-breaking innovation in AI and data-driven technologies’. The report also calls for the creation of the Office for AI to help the UK Government implement this strategy. The UK Government has since created guidance on the ethical adoption of data-driven technologies and the mitigation of potential harms, including guidelines, developed jointly with the Alan Turing Institute, for ethical AI use in the public sector,[footnote]Leslie, D. (2019). Understanding artificial intelligence ethics and safety: A guide for the responsible design and implementation of AI systems in the public sector. The Alan Turing Institute. Available at: https://doi.org/10.5281/zenodo.3240529.[/footnote] a review into bias in algorithmic decision-making[footnote]Centre for Data Ethics and Innovation. (2020). Review into bias in algorithmic decision-making. Available at: www.gov.uk/government/publications/cdei-publishes-review-into-bias-in-algorithmic-decision-making.[/footnote] and an adoption guide for privacy-enhancing technologies.[footnote]Centre for Data Ethics and Innovation. (2021). Privacy Enhancing Technologies Adoption Guide. Available at: https://cdeiuk.github.io/ pets-adoption-guide[/footnote]

2021 UK AI roadmap

In January 2021, the AI Council, an independent expert committee that advises the Office for AI on the AI ecosystem and the implementation of the UK’s AI strategy, published a roadmap with 16 recommendations for how the UK can develop a revised national AI strategy.[footnote]AI Council. (2021). AI Roadmap. UK Government. Available at: www.gov.uk/government/publications/ai-roadmap.[/footnote]

The roadmap states that:

  • A revised AI strategy presents an important opportunity for the UK Government to develop a strategy for the regulation and governance of AI technologies produced and sold in the UK, with the goal of improving safety and public confidence in their use.
  • The UK must become ‘world-leading in the provision of responsible regulation and governance’.
  • Given the rapidly changing nature of AI’s development, the UK’s systems of governance must be ‘ready to respond and adapt more frequently than has typically been true of systems of governance in the past’.

The Council recommends ‘commissioning an independent entity to provide recommendations on the next steps in the evolution of governance mechanisms, including impact and risk assessments, best-practice principles, ethical processes and institutional mechanisms that will increase and sustain public trust’.

2021 Scottish AI strategy

Some parts of the UK have further articulated their approach to the regulation of AI. In March 2021, the Scottish Government released an AI strategy that includes five principles that ‘will guide the AI journey from concept to regulation and adoption to create a chain of trust throughout the entire process.’[footnote]Digital Scotland. (2021). Scotland’s AI Strategy: Trustworthy, Ethical and Inclusive. Available at: www.scotlandaistrategy.com.[/footnote] These principles draw on the Organisation for Economic Cooperation and Development’s (OECD’s) five complementary values-based principles for the responsible stewardship of trustworthy AI. These are:[footnote]Organisation for Economic Co-operation and Development. (2019). OECD Principles on Artificial Intelligence. Available at: www.oecd.org/going-digital/ai/principles.[/footnote]

  1. AI should benefit people and the planet by driving inclusive growth, sustainable development and wellbeing.
  2. AI systems should be designed in a way that respects the rule of law, human rights, democratic values and diversity, and they should include appropriate safeguards – for example, enabling human intervention where necessary – to ensure a fair and just society.
  3. There should be transparency and responsible disclosure around AI systems to ensure that people understand AI-based outcomes and can challenge them.
  4. AI systems must function in a robust, secure and safe way throughout their life cycles and potential risks should be continually assessed and managed.
  5. Organisations and individuals developing, deploying or operating AI systems should be held accountable for their proper functioning in line with the above principles.

The Scottish strategy also calls for the Government to ‘develop a plan to influence global AI
standards and regulations through international partnerships’.

2021 Digital Regulation Plan

In July 2021, the Department for Digital, Culture, Media and Sport (DCMS) released a policy paper outlining its thinking on the regulation of digital technologies, including AI.[footnote]Department for Digital, Culture, Media & Sport. (2021). Plan for Digital Regulation. UK Government. Available at: www.gov.uk/government/publications/digital-regulation-driving-growth-and-unlocking-innovation.[/footnote] The paper provides high-level considerations, including three principles that should guide future plans for the regulation of digital technologies. These are:

  1. Actively promote innovation: Regulation should ‘be designed to minimise unnecessary burdens on businesses’, be ‘outcomes-focused’, backed by clear evidence of harm, and consider the effects on innovation (a concept the paper does not define). The Government’s approach to regulation should also consider non-regulatory interventions like technical standards first.
  2. Achieve forward-looking and coherent outcomes: This section states that regulation should be coordinated across regulators to reduce undue burdens and avoid duplicating existing regulation. Regulation should take a ‘collaborative approach’ by working with businesses to test out new interventions and business models. Approaches to regulation should ‘address underlying drivers of harm rather than symptoms, in order to protect against future changes’.
  3. Exploit opportunities and address challenges in the international arena: Regulation should be interoperable with international regulations, and policymakers should ‘build in international considerations from the start’, including via the creation of international standards.

The Digital Regulation Plan includes several mechanisms for putting these principles into practice, including plans to increase regulatory coordination and cooperation, engagement in international forums, and plans to embed these principles across government. However, the policy paper stops short of providing specific recommendations, approaches or frameworks for the regulation of AI systems, offering only a broad set of the considerations currently foremost for the Government. It does not address the specific regulatory tools, mechanisms or approaches the UK should consider for AI, nor does it provide guidance on the overall approach the UK should take towards regulating these technologies.

2021 UK AI Strategy

Released in September 2021, the most recent UK AI Strategy sets out three pillars to lead the UK towards becoming an AI science superpower:

  • investing in the long-term needs of the AI ecosystem
  • supporting the transition to an AI-enabled economy
  • ensuring the UK gets the national and international governance of AI technologies right, to encourage innovation and investment and to protect the public and fundamental values.

Sections one and two of the strategy include plans to launch a National AI Research and Innovation (R&I) programme to align funding priorities across UK research councils, plans to publish a Defence AI Strategy articulating military uses of AI, and other measures to expand investment in the UK’s AI sector. The third pillar on governance includes plans to pilot an AI Standards Hub to coordinate UK engagement in AI standardisation globally, fund the Alan Turing Institute to update guidance on AI ethics and safety in the public sector, and increase the capacity of regulators to address the risks posed by AI systems. In discussing AI regulation, it makes reference to embedding values such as fairness, openness, liberty, security, democracy, the rule of law and respect for human rights.

Chapter 1: Goals of AI regulation

Recent policy debates around AI have emphasised cultivating and utilising the technology’s potential to contribute to economic growth. This focus is visible in the newly published AI strategy’s approach to regulation, which stresses the importance of ensuring that the regulatory system fosters public trust and a stable environment for businesses without unduly inhibiting AI innovation.

Although it is prominent in the current Government’s AI policy discussions, economic growth is just one of several underlying objectives for which the UK’s regulatory approach to AI could be
configured. As experts in our workshops pointed out, policymakers may also, for instance, want to stimulate the development of particular forms of AI, single out particular industries for disruption by the technology, or avoid particular consequences of the technology’s development and adoption.

Different underlying objectives will not necessarily be mutually exclusive, but prioritisation matters – choices about which to explicitly include and which to emphasise will have a significant effect on downstream policy choices. This is especially the case with regulation, where new regulatory institutions, approaches and tools will need to be chosen and coordinated with broader strategic goals in mind.

The first of the two expert workshops identified and debated desirable objectives for the regulation of AI in addition to economic growth – and explored what adopting these would mean, in concrete terms, for the UK’s regulatory system.[footnote] As set out above, the expert workshops considered the question of how the UK should approach the regulation of AI through the lens of the UK National AI Strategy, though the discussion quickly expanded to cover the UK’s regulatory approach to AI more generally.[/footnote]

A clear point of consensus among the workshop participants, and an important recommendation of this report, was that the Government’s approach to AI must not be focused exclusively on fostering economic growth, and must consider the unique properties of how AI systems are developed, procured and integrated.

Rather than concentrating exclusively on increasing the rate and extent of AI development and use, expert participants stressed that the Government’s approach to AI must also be attentive to the technology’s unique features, the particular ways it might manifest itself, and the specific effects it might have on the country’s economy, society and power structures.

The need to take account of the unique features of AI is a reason for developing a bespoke, codified regulatory approach to the technology – rather than accommodating it within a broader, technology-neutral industrial strategy. Perhaps more importantly, though, workshop participants were keen to highlight that many of AI’s most significant opportunities can only be utilised, and many of its risks can only be mitigated, with the help of an overarching Government strategy that sets out intentions for the use, regulation and governance of these systems. By attending to AI’s specific properties, it will be easier for Government to steer the beneficial development and use of AI to address societal challenges, and for the potential risks posed by the technology to be effectively managed.

In light of the specific challenges and opportunities AI poses, expert participants identified four additional objectives that might usefully be built into any AI strategy (outlined below). A common theme cutting across the discussion was that the UK should build in, as an objective, the protection and advancement of human rights and societally important values, such as agency, democracy, the rule of law, equality and privacy.


 

Objective 1: Ensure AI is used and developed in accordance with specific values and norms

A common refrain among participants was that the UK AI policy should articulate a set of high-level norms or ethical principles to govern the country’s desired relationship with AI systems. As several experts pointed out, other countries’ national AI strategies, including that of Scotland, have articulated a set of values.[footnote]Digital Scotland. (2021). Scotland’s AI Strategy: Trustworthy, Ethical and Inclusive. Available at: https://static1.squarespace.com/static/5dc00e9e32cd095744be7634/t/606430e006dc4a462a5fa1d4/1617178862157/Scotlands_AI_Strategy_Web_updated_single_page_aps.pdf [accessed 22 October 2021].[/footnote] The purpose of these principles would be to inform specific policy decisions in relation to AI, including the development of regulatory policy and sector-specific guidance and best practice.

The articulation of clear, universal and specific values in a prominent AI-policy document (such as an AI strategy) can help establish a common language and set of principles that could be referenced in future policy and public debates regarding AI. In this instance, the principles would set out how the Government should cultivate and direct the development of the technology, as well as how its use should be governed. They may also extend to the programming and decision-making architecture of AI systems themselves, setting out the values and priorities the UK public would want the developers and deployers of AI systems to uphold when putting them in operation.[footnote]Public opinion on these values and priorities would be determined empirically through, for instance, deliberative public engagement.[/footnote]

In its latest AI strategy, the UK Government makes brief references to several values, including fairness, openness, liberty, security, democracy, the rule of law and respect for human rights.[footnote]Office for AI. (2021). National AI strategy. UK Government. P. 50. Available at: www.gov.uk/government/publications/national-ai-strategy[/footnote] While the values and norms articulated by a national AI strategy would not themselves be able to adjudicate between competing interests and views on specific questions, they do create a framework for weighing and justifying particular courses of action. Medical ethics is a good example of the value of a common language and framework, as it provides medical practitioners with a toolkit to think about different value-laden decisions they might encounter in their practice.[footnote]British Medical Association. (n.d). Ethics. Available at: www.bma.org.uk/advice-and-support/ethics [accessed 20 September 2021].[/footnote] In the AI strategy, the values are not well defined enough to underpin this function, nor are they translated into clearly actionable steps to support their being upheld.

There are already a number of AI ethics principles developed by national and international organisations that the UK could draw from to further define and articulate its values for AI regulation.[footnote]Jobin, A., Ienca, M. & Vayena, E. (2019). ‘The global landscape of AI ethics guidelines’. Nature Machine Intelligence. 1.9 pp. 389–99. Available at: https://doi.org/10.1038/s42256-019-0088-2[/footnote] One example mentioned by expert participants is the Organisation for Economic Cooperation and Development’s (OECD’s) five complementary values-based principles for the responsible stewardship of AI,[footnote]Organisation for Economic Co-operation and Development. (2019). OECD Principles on Artificial Intelligence. Available at: www.oecd. org/going-digital/ai/principles/ [accessed 22 October 2021].[/footnote] which the Scottish AI strategy draws on heavily.[footnote]Digital Scotland. (2021). Scotland’s AI Strategy: Trustworthy, Ethical and Inclusive. Available at: https://static1.squarespace.com/static/5dc00e9e32cd095744be7634/t/606430e006dc4a462a5fa1d4/1617178862157/Scotlands_AI_Strategy_Web_updated_single_page_aps.pdf [accessed 22 October 2021].[/footnote]

Another idea raised by the expert participants was that UK AI policy (and industrial strategy more broadly) should aim to establish and support democratic, inclusive mechanisms for resolving value-laden policy and regulatory decisions. Here, expert participants suggested that deliberative public-engagement exercises, such as citizens’ assemblies and juries, could be used to set high-level values, or to inform particularly controversial, value-laden policy questions. In addition, participatory mechanisms should be embedded in the development and oversight of governance approaches to AI and data – a topic explored in a recent Ada Lovelace Institute report on participatory data stewardship.[footnote]Ada Lovelace Institute (2021). Participatory data stewardship. Available at: www.adalovelaceinstitute.org/report/participatory-data-stewardship [accessed 20 September 2021].[/footnote]

Expert participants noted that sustained public trust in AI will be vital, and the existence of such processes could be a useful means of ensuring that policy decisions regarding AI are aligned with public values.

However, it is important to note that while ‘building public trust’ in AI is a common and valuable objective surfaced in AI-policy debates, this framing also places the burden of responsibility onto the public to ‘be more trusting’, and does not necessarily address the root issue: the trustworthiness of AI systems.

Public participation in UK AI policy must therefore be recognised as effective not only at framing or refining existing policies in ways that the public will find more acceptable, but also at defining the fundamental values that underpin those policies. Without this, there is a significant risk that AI will not align with public hopes, needs and concerns, which will undermine trust and confidence.


Objective 2: Avoid or ameliorate specific risks and harms

Another commonly voiced view from workshop participants was that UK AI policy should be configured explicitly with a view to reducing, mitigating or completely avoiding particular harms and categories of harm associated with AI and its business models. In outlining the particular kinds of harm that AI policy – and particularly regulation – should aim to address, reference was made to the following:

  • harms to individuals and marginalised groups
  • distributional harms
  • harms to free, open societies.

Harms to individuals and marginalised groups

In discussing the potential harms to individuals and marginalised groups associated with AI, participants highlighted the fact that AI systems:

  • Can exhibit bias, with the result that individuals may experience AI systems treating them unfairly or drawing unfair inferences about them. Bias can take many forms, and be expressed in several different parts of the AI product development lifecycle – including ‘algorithmic’ bias in which an AI system’s outputs unfairly bias human judgement.[footnote]Selwyn, N. (2021). Deb Raji on what ‘algorithmic bias’ is (…and what it is not). Data Smart Schools. Available at: https://data-smart-schools.net/2021/04/02/deb-raji-on-what-algorithmic-bias-is-and-what-it-is-not[/footnote]
  • Are often more effective or more accurate for some groups than for others.[footnote]Buolamwini, J., Gebru, T. (2018). ‘Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification.’ Proceedings of the 1st Conference on Fairness, Accountability and Transparency, PMLR 81:77–91. Available at: https://proceedings.mlr.press/v81/buolamwini18a.html.[/footnote] This can lead to various kinds of harm, ranging from individuals having false inferences made about their identity or characteristics,[footnote]Hill, K. (2020). ‘Another Arrest, and Jail Time, Due to a Bad Facial Recognition Match.’ New York Times. Available at: www.nytimes.com/2020/12/29/technology/facial-recognition-misidentify-jail.html[/footnote] to individuals being denied or locked out of services due to the failure of AI systems to work for them.[footnote]Ledford, H. (2019). ‘Millions of black people affected by racial bias in health-care algorithms.’ Nature, 574.7780, pp. 608–9. Available at: www.nature.com/articles/d41586-019-03228-6[/footnote] (A short illustrative sketch after this list shows how such accuracy gaps can be surfaced.)
  • Tend to be optimised for particular outcomes.[footnote]Leslie, D. (2019). Understanding Artificial Intelligence Ethics and Safety: A Guide for the Responsible Design and Implementation of AI Systems in the Public Sector. The Alan Turing Institute. Available at: https://doi.org/10.5281/ZENODO.3240529.[/footnote] There is a tendency on the part of those developing AI systems to forget, or otherwise insufficiently consider, how the outcomes for which systems have been optimised might affect underrepresented groups within society.
  • Can cause, and often rely on, the violation of individual privacy rights.[footnote]Zuboff, S. (2019). The Age of Surveillance Capitalism: The Fight for a Human Future at the New Frontier of Power. London: Profile Books.[/footnote] A lack of privacy can impede an individual’s ability to interact with other people and organisations on equal terms and can cause individuals to change their behaviour.[footnote]Solove, D. J. (2011). Nothing to Hide: The False Tradeoff Between Privacy and Security. New Haven London: Yale University Press.[/footnote]
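As flagged in the second item above, the sketch below shows one way an evaluation disaggregated by group can surface a gap in error rates. It is a minimal illustration using entirely hypothetical records and group labels, not the method of any particular study or deployed system; real audits rely on far larger samples and a wider set of metrics.

```python
# Minimal, illustrative sketch with hypothetical data: comparing a classifier's
# accuracy across demographic groups to surface possible disparities.
from collections import defaultdict

# Hypothetical records: (group_label, true_label, predicted_label)
predictions = [
    ("group_a", 1, 1), ("group_a", 0, 0), ("group_a", 1, 1), ("group_a", 0, 1),
    ("group_b", 1, 0), ("group_b", 0, 0), ("group_b", 1, 0), ("group_b", 0, 1),
]

totals = defaultdict(int)
correct = defaultdict(int)
for group, truth, prediction in predictions:
    totals[group] += 1
    correct[group] += int(truth == prediction)

for group in sorted(totals):
    accuracy = correct[group] / totals[group]
    print(f"{group}: accuracy = {accuracy:.2f} over {totals[group]} cases")

# A persistent gap between groups is one signal of the kind of disparity
# described above; a real audit would also examine false-positive and
# false-negative rates, confidence intervals and sample composition.
```

Even this toy comparison illustrates why aggregate accuracy figures can conceal harms: a system that performs well ‘on average’ may still fail systematically for a particular group.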

Distributional harms

Many of the harms associated with AI systems relate to the capacity of AI and its associated business models to drive and exacerbate economic inequality. Workshop participants listed several specific kinds of distributional harm that AI systems can create:

  • The business models of leading AI companies tend towards monopolisation and concentration of market share. Because machine-learning algorithms base their outcomes on data, well-established AI companies that can collect proprietary datasets tend to have an advantage over newer companies, which can be self-perpetuating. In addition, the large amounts of data required to train some machine-learning algorithms present a high barrier of entry into the market, which can incentivise mergers, acquisitions and partnerships.[footnote]Furman, J., Coyle, D., Fletcher, A., McAuley, D., and Marsden, P. (2019). Unlocking digital competition, Report of the Digital Competition Expert Panel. HM Treasury. Available at: www.gov.uk/government/publications/unlocking-digital-competition-report-of-the-digital-competition-expert-panel.[/footnote] As several recent critiques have pointed out, addressing the harms of AI must look at the wider social, political and economic power underlying the development of these systems.[footnote]Balayan, A., Gürses, S. (2021). Beyond Debiasing: Regulating AI and its inequalities. European Digital Rights. Available at: https://edri.org/our-work/if-ai-is-the-problem-is-debiasing-the-solution[/footnote]
  • Labour’s declining share of GDP. Related to the tendency of AI-business models towards monopolisation, some economists have suggested that one reason for labour’s declining share of GDP in developed countries is that ‘superstar’ tech firms, which employ relatively few workers but produce significant dividends for investors, have come to represent an increasing share of overall economic activity.[footnote]Autor, D., Dorn, D., Katz, L., Patterson, C., and Van Reenen, J. (2020). ‘The Fall of the Labor Share and the Rise of Superstar Firms’, The Quarterly Journal of Economics, 135.2, 645–709. Available at: https://doi.org/10.1093/qje/qjaa004[/footnote]
  • Skills-biased technological change and automation. Expert participants also cited the potential for automation and skills-biased technological change driven by AI to lead to greater inequality. While it is contested whether the rise of AI will necessarily lead to greater economic inequality in the long term, economists have argued that the short-term disruption caused by the transition from one ‘techno-economic paradigm’ to a new one will lead to significant inequality unless policy responses are developed to counter these tendencies.[footnote]Perez, C. (2015). ‘Capitalism, Technology and a Green Global Golden Age: The Role of History in Helping to Shape the Future’, The Political Quarterly, 86 pp. 191–217. Available at: https://doi.org/10.1111/1467-923X.12240.[/footnote]
  • AI systems’ capacity to shift the balance of bargaining power between workers and employers, and to exacerbate inequalities between participants in markets. Finally, participants cited the ability of AI systems to undermine worker power and collective-bargaining capacity.[footnote]Institute for the Future of Work. (2021). The Amazonian Era: The gigification of work. Available at: https://www.ifow.org/publications/the-amazonian-era-the-gigification-of-work[/footnote] The use of AI systems to monitor and provide feedback on worker performance, and the application of AI to recruitment and pay-setting processes, are two means by which AI could tip the balance of power further towards employers rather than workers.[footnote]Partnership on AI. (2021). Redesigning AI for Shared Prosperity: an Agenda. Pp 23–24. Available at: https://partnershiponai.org/wp-content/uploads/2021/08/PAI-Redesigning-AI-for-Shared-Prosperity.pdf.[/footnote]

Harms to free, open societies

Our expert participants also pointed to the capacity of AI systems to undermine many of the necessary conditions for free, open and democratic societies. Here, participants cited:

  • The use of AI-driven systems to distort competitive political processes. AI systems that tailor content to individuals based on their data profile or behaviour (mostly through social media or search platforms) can be used to influence voter behaviour and the direction of democratic debates. This is recognised as problematic because
    access to these systems is likely to be unevenly distributed across the population and political groups, and because the opacity of content creation and sharing can undermine the democratic ideal of a commonly shared and accessible political discourse – as well as ideals about public debate being subject to public reason.[footnote]Quong, J. (2018). ‘Public Reason’ in Zalta, E. N. and Hammer, E. (eds) The Stanford Encyclopedia of Philosophy. Stanford: Center for the Study of Language and Information. Available at: www.scirp.org/reference/referencespapers.aspx?referenceid=2710060 [accessed 20 September 2021].[/footnote]
  • The use of AI-driven systems to undermine the health and competitiveness of markets. In the market sphere, AI-enabled functions such as real-time A/B testing,[footnote]Where two or more options are presented to users to determine which is more preferable.[/footnote] hypernudge,[footnote]Where an individual’s data and responses to stimuli are used to inform how choices are framed to them, with a view towards predisposing them towards particular choices. See: Yeung, K. (2017). ‘”Hypernudge”: Big Data as a Mode of Regulation by Design’, Information, Communication & Society, 20.1 pp.118–136. Available at: https://doi.org/10.1080/1369118X.2016.1186713.[/footnote] and personalised pricing and search[footnote]Where the prices or search results seen by a consumer are determined by their data profile. See: Competition and Markets Authority. (2018). Pricing algorithms: Economic working paper on the use of algorithms to facilitate collusion and personalised pricing, p. 63. Available at: https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/746353/Algorithms_econ_report.pdf.[/footnote] undermine the ability of consumers to choose freely between competing products in a market, and can significantly skew the balance of power between consumers and large companies. (A short illustrative sketch of personalised pricing follows this list.)
  • Surveillance, privacy and the right to freedom of expression and assembly. The ability of AI-driven systems to monitor and surveil citizens has the potential to create a powerful negative effect on citizens exercising their rights to free expression and discourse – negatively affecting the tenor of democracies.
  • The use of AI systems to police and control citizen behaviour. It was noted that many AI systems could be used for more coercive methods of controlling or influencing citizens. Participants cited ‘social-credit’ schemes, such as the one being implemented in China, as an example of the kind of AI system that seeks to manipulate or enforce certain forms of social behaviour without adequate democratic oversight or control.[footnote]The authors of this paper note that many of the claims about the efficacy and goals of the Chinese social-credit system have been exaggerated in the Western media. See Matsakis, L. (2019). ‘How the West Got China’s Social Credit System Wrong.’ Wired. Available at: www.wired.com/story/china-social-credit-score-system.[/footnote]
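As flagged in the second item above, the short sketch below illustrates the personalised-pricing mechanism in concrete terms: a simple, entirely hypothetical rule uses an inferred data profile to decide which price a user is shown. The profile fields, thresholds and multipliers are invented for illustration and do not describe any real system.

```python
# Illustrative sketch (hypothetical rules and data): how personalised pricing
# can use an inferred data profile to show different prices to different users.
from dataclasses import dataclass

@dataclass
class UserProfile:
    inferred_income_band: str     # e.g. "low", "medium", "high" (inferred, not declared)
    price_sensitivity: float      # 0.0 (insensitive) to 1.0 (highly sensitive)
    recently_compared_rivals: bool

BASE_PRICE = 100.0

def personalised_price(profile: UserProfile) -> float:
    """Return the price shown to this user under a simple, hypothetical policy."""
    price = BASE_PRICE
    if profile.inferred_income_band == "high":
        price *= 1.15             # mark up where willingness to pay is assumed higher
    if profile.price_sensitivity > 0.7 or profile.recently_compared_rivals:
        price *= 0.9              # discount users judged likely to walk away
    return round(price, 2)

print(personalised_price(UserProfile("high", 0.2, False)))  # 115.0
print(personalised_price(UserProfile("low", 0.9, True)))    # 90.0
```

Even this toy rule shows why such practices are difficult for consumers to detect or contest: the price each person sees depends on attributes they may not know have been inferred about them.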

Objective 3: Use AI to contribute to the solution of grand societal challenges

Another common view of workshop participants was that a country’s approach to AI regulation could be informed by its stated priorities and objectives for the use of AI in society. One of the common aims of many existing national AI strategies is to articulate how a country can leverage its AI ecosystem to develop solutions to, and means of addressing, substantial society-wide challenges facing individual nations – and indeed humanity – in coming decades.[footnote]Dutton, T. (2018). ‘An Overview of National AI Strategies’, Politics + AI. Available at: https://medium.com/politics-ai/an-overview-of-national-ai-strategies-2[/footnote]

Candidates for these challenges range from decarbonisation and dealing with the effects of climate change, to navigating potential economic displacement brought about by AI systems (and the broader context of the ‘fourth industrial revolution’), to finding ways to manage the difficulties, and make best use, of an ageing population – which is itself one of the UK’s 2017 Industrial Strategy grand challenges. Workshop participants also referred to the potential for AI to be deployed to address the long-term effects of the COVID-19 pandemic and its potential to ameliorate future public-health crises.

Workshop participants emphasised that the purpose of articulating grand societal challenges that AI can address was to provide an effective way to think about the coordination of different industrial-strategy levers, from R&D and regulatory policy to tax policy and public-sector procurement. This approach would sidestep the risk of a national AI strategy that calls for more AI for the sake of AI, or one that places too much hope in the potential of AI to bring positive societal change across all economic and societal sectors.

By articulating grand challenges that AI can address, the UK Government can help establish funding and research priorities for applications of AI that show high reward and proven efficacy. As an example, the French national AI strategy articulates several grand challenges as areas of focus for AI, including addressing the COVID-19 pandemic and fighting climate change.[footnote]European Commission. (2021). Knowledge for Policy: France AI Strategy Report. Available at: https://knowledge4policy.ec.europa.eu/ai-watch/france-ai-strategy-report_en[/footnote]

A reservation to consider with the societal-challenge approach is that it can absolve Government of articulating a sense of direction for the UK’s relationship to AI. Setting out that we want AI to be used to address particular problems, and how AI is to be supported and guided to develop in a manner conducive to their solution, does not provide any indication of the level of risk we are willing to tolerate, the kinds of applications of AI we may or may not want to encourage or permit (all else remaining equal), or how our industrial and regulatory policy should address difficult, values-based trade-offs.


Objective 4: Develop AI regulation as a sectoral strength

A fourth suggestion put forward by some workshop participants was that the UK should seek to develop AI regulation as a sectoral strength. There was limited agreement on what this goal might entail in practice, and whether it would be feasible.

Despite the UK’s strengths in academic AI research, most participants agreed that, because of existing market dynamics in the tech industry – in which a combination of mostly US and Chinese firms dominates the market – it will be very difficult for the UK to create the next industry powerhouse.

However, an idea that emerged in the first workshop was that the UK could potentially become world leading in flexible, innovative and ethical approaches to the regulation of AI. The UK Government has expressed explicit ambitions to lead the world in tech and data ethics since at
least 2018.[footnote]Kelion, L. (2018). ‘UK PM seeks ‘safe and ethical’ artificial intelligence.’ BBC News. 25 January. Available at: www.bbc.co.uk/news/technology-42810678.[/footnote] Workshop participants noted that the UK already has an established reputation for regulatory innovation, and that the country is potentially well placed to develop an approach to the regulation of AI that is compatible with EU standards, but more sophisticated and nuanced.

This idea received additional scrutiny in the second workshop, which saw a more sustained and critical discussion, detailed below, of what cultivating a niche in the regulation of AI might look like in practice, and of the benefits it might bring.

Why is leadership in AI regulation desirable?

Some participants challenged whether leadership in the regulation of AI would actually be desirable and, if so, what form it should take.

It was noted that, in some cases, a country that drives the regulatory agenda for a particular technology or science will be in a good position to attract greater levels of expertise and investment. For instance, the UK is a world leader in biomedical research and technology, in large part because it has a robust regulatory system that ensures high standards of accuracy and safety and commands public trust.[footnote]Calvert, M. J., Marston, E., Samuels, M., Cruz Rivera, S., Torlinska, B., Oliver, K., Denniston, A. K., and Hoare, S. (2019). ‘Advancing UK regulatory science and innovation in healthcare’, Journal of the Royal Society of Medicine. 114.1. pp. 5-11. Available at: https://doi.org/10.1177/0141076820961776[/footnote] It was cautioned, however, that the UK’s standing in the regulation of biomedical technology is the product of a combination of demanding standards, a pragmatic approach to the interpretation of those standards and a rigorously enforced institutional regime.

Some expert panellists suggested that, despite the fact that many regulatory rules have been set at an EU level, the UK has become a leader in the regulation of the life sciences because it combined those high ethical and legal standards with sufficient flexibility to enable genuine innovation – rather than because it relaxed regulatory standards.

The UK can’t compete on regulatory substance, but could compete on some aspects of regulatory procedure and approach

There was a degree of scepticism among expert panellists about whether the model that has enabled the UK to achieve leadership in the regulation of the biomedical-sciences industry would be replicable or would yield the same results in the context of AI regulation. In contrast to the biomedical sciences – where there are strict and clearly defined routes into practice – it is difficult for a regulator to understand and control actors developing and deploying AI systems. The scale and the immediacy of the impacts of AI technologies also tend to be far greater than in the biomedical sciences, as is the number of domains in which AI systems could potentially be deployed.

In addition to this, it was noted that the EU also has ambitions to become a global leader in the ethical regulation of AI, as demonstrated by the European Commission’s proposed AI regulations.[footnote]European Commission. (2021). A Regulation of the European Parliament and of the Council Laying down Harmonised Rules on Artificial Intelligence (Artificial Intelligence Act) and Amending Certain Union Legislative Acts. Available at: https://eur-lex.europa. eu/legal-content/EN/ALL/?uri=CELEX%3A52021PC0206 [accessed 4 October 2021].[/footnote] It is therefore unclear what the UK might leverage to position itself as a distinct leader, alongside a larger, geographically adjacent and more influential economic bloc with a good track record of exporting its regulatory standards, which also has ambitions to occupy this space. The EU’s proposal of a comprehensive AI regulation also means that the UK does not have a first-mover advantage when it comes to the regulation of AI.

Many participants of our workshops thought it was unlikely that the UK would be able to compete with the EU (or other large economic blocs) on regulatory substance, or the specific rules and regulations governing AI. Some workshop participants observed that the comparatively small size of the UK market would mean that approval from a UK regulatory
body is of less commercial value to an AI company than regulatory approval from the EU.

In terms of regulatory substance, some participants considered whether the UK could make itself attractive as a place to develop AI products by lowering regulatory standards, but other participants noted this would be undesirable and would go against the grain of the UK’s strengths in the flexible enforcement of exacting regulatory standards. Moreover, participants suggested that a ‘race to the bottom’ approach would be counter-productive, given the size of the UK market and the higher regulatory standards that are already developing elsewhere.
Adopting this approach could mean that UK-based AI developers would not be able to sell their services and products in regions with higher regulatory standards.

Despite the limited prospects for the UK leading the world in the development of regulatory standards for AI, some workshop participants argued that it may be possible for the UK to lead on the processes and procedures for regulating AI. The UK does have a good reputation for its regulatory processes and for regulatory-process innovation (as exemplified by regulatory sandboxes, a model that has been replicated by many other jurisdictions, including the EU).[footnote]Privacy & Information Security Law Blog. (2021). Regulatory Sandboxes are Gaining Traction with European Data Protection Authorities. Hunton Andrews Kurth. Available at: https://www.huntonprivacyblog.com/2021/02/25/regulatory-sandboxes-are-gaining-traction-with-european-data-protection-authorities[/footnote]

While sandboxes no longer represent a unique selling point for the UK, the UK may be able to make itself more attractive to AI firms by establishing a series of regulatory practices and norms aimed at ensuring that companies have better guidance and support in complying with regulations than they might receive elsewhere. These sorts of processes are particularly appealing to start-ups and small- to medium-sized enterprises (SMEs), which may struggle more than their larger counterparts to navigate and comply with regulatory processes.

A final caveat that several expert participants made was that, although more supportive regulatory processes might be enough to attract start-ups and early-stage AI ventures to the UK, keeping such companies in the UK as they grow will also require a supportive financial, legal and research-and-development ecosystem. While this report does not seek to answer the question of what this wider ecosystem should look like, it is clear that a regulatory framework, closely coordinated with policies to nurture and maintain these other enabling conditions, is a necessary condition for realising the Government’s stated ambition of developing a world-leading AI sector.

Chapter 2: Challenges for regulating AI systems

Given AI’s relative novelty, complexity and applicability across both domains and industries, the effective and consistent regulation of AI systems presents multiple challenges. This chapter details some of the most significant of these, as highlighted by our expert workshop
participants, and sets out additional analysis and explanation of these issues. The following chapter, ‘Tools, mechanisms and approaches for regulating AI’, details some ways these challenges might be dealt with or overcome. Additional details on some of the different considerations when designing and configuring regulatory systems, which may be a useful companion to these two chapters, can be found in the annex.

The table below maps the regulatory challenges identified with the relevant tools, mechanisms and approaches for overcoming them.

Regulatory challenges and relevant tools, mechanisms and approaches

The challenges are listed below, each paired with the potentially useful approaches, tools or mechanisms for addressing it.

  • AI regulation demands bespoke, cross-cutting rules: regulatory capacity building; regulatory coordination.
  • The incentive structures and power dynamics of AI-business models can run counter to regulatory goals and broader societal values: regulatory capacity building; regulatory coordination.
  • It can be difficult to regulate AI systems in a manner that is proportionate: risk-based regulation; professionalisation.
  • Many AI systems are complex and opaque: regulatory capacity building; algorithmic impact assessment; transparency requirements; inspection powers; external-oversight bodies; international standards; domestic standards (e.g. via procurement).
  • AI harms can be difficult to separate from the technology itself: moratoria and bans.

AI regulation demands bespoke, cross-cutting rules

Perhaps one of the biggest challenges presented by AI is that regulating it successfully is likely to require the development of new, domain-neutral laws and regulatory principles. There are several interconnected reasons for this, each discussed below, along with the challenges of developing such rules:

  1. AI presents novel challenges for existing legal and regulatory principles
  2. AI presents systemic challenges that require a coordinated response
  3. horizontal regulation will help avoid boundary disputes and aid industry-specific policy development
  4. effective, cross-cutting legal and regulatory principles won’t emerge organically
  5. the challenges of developing bespoke, horizontal rules for AI.

1. AI presents novel challenges for existing legal and regulatory principles

One argument for developing new laws and regulatory principles for AI is that those in existence are not fit for purpose.

AI has two features that present difficulties for contemporary legal principles. The first is its tendency to fully or partially automate moral decision-making processes in ways that can be opaque, difficult to explain and difficult to predict. The second is the capacity of AI systems
to develop and operate independently of human control. For these reasons, AI systems can challenge legal notions of agency and causation as the relationship between the behaviour of the technology and the actions of the user or developer can be unclear, and some AI systems
may change independently of human control and intervention.

While these principles have been unproblematically applied to legal questions concerning other emerging technologies, it is not clear that they will apply readily to those presented by AI. As barrister Jacob Turner explains, in contrast to AI systems, ‘a bicycle will not re-design
itself to become faster. A baseball bat will not independently decide to hit a ball or smash a window.’[footnote]Turner, J. (2018). Robot Rules: Regulating Artificial Intelligence. Palgrave Macmillan. P. 79.[/footnote]

2. AI presents systemic challenges that require a coordinated response

In addition to demanding new approaches to the legal principles of agency and causation, the effective regulation and governance of AI systems will require high levels of coordination.

As a powerful technology that can operate at scale and be applied in a wide range of different contexts, AI systems can manifest impacts at the level of the whole economy and the whole of society, rather than being confined to particular domains or sectors. Among policymakers
and industry professionals, AI is regularly compared to electricity, with claims that it can transform a wide range of different sectors.[footnote]Lynch, S. (2017). Andrew Ng: Why AI Is the New Electricity. Stanford Graduate School of Business. Available at: www.gsb.stanford.edu/insights/andrew-ng-why-ai-new-electricity.[/footnote] Whether or not this is hyperbole, the ambition to integrate AI systems across a wide variety of core services and applications raises risks of significant negative outcomes. If governments aspire to use regulation and other policy mechanisms to control the systemic impacts of AI, they will have to coordinate legal and regulatory responses to particular uses of AI. Developing a general set of principles to which all regulators must adhere when dealing with AI is a practical way of doing this.

3. Horizontal regulation will help avoid boundary disputes and aid industry-specific policy development

There are also practical arguments for developing cross-cutting legal and regulatory principles for AI. The gradual shift from narrow to general AI will mean that attempts to regulate the technology exclusively through the rules applied to individual domains and sectors will become increasingly impractical and difficult. A fully vertical or compartmentalised approach to the regulation of AI would be likely to lead to boundary disputes, with persistent questions about whether particular applications or kinds of AI fall under the remit of one regulator or another – or both, or neither.

4. Effective, cross-cutting legal and regulatory principles won’t emerge organically

Clear, cross-cutting legal and regulatory principles for AI will have to be set out in legislation, rather than developed through, and set out in, common law. Perhaps the most important reason for this is that setting out principles in statute makes it possible to protect against the potential harms of AI in advance (ex ante), rather than once things have gone wrong (ex post) – something a common law approach is incapable of doing. Given the potential gravity and scope of the sorts of harms AI is capable of producing, it would be very risky to wait until harms occur to develop legal and regulatory protections against them.

The Law Society’s evidence submission to the House of Commons Science and Technology Select Committee summarises some of the reasons to favour a statutory approach to regulating and governing AI:

‘One of the disadvantages of leaving it to the Courts to develop solutions through case law is that the common law only develops by applying legal principles after the event when something untoward has already happened. This can be very expensive and stressful for all those affected. Moreover, whether and how the law develops depends on which cases are pursued, whether they are pursued all the way to trial and appeal, and what arguments the parties’ lawyers choose to pursue. The statutory approach ensures that there is a framework in place that everyone can understand.’[footnote]The Law Society. (2016). Written evidence submitted by the Law Society (ROB0037). Available at: http://data.parliament.uk/writtenevidence/committeeevidence.svc/evidencedocument/science-and-technology-committee/robotics-and-artificial-intelligence/written/32616.html [accessed 20 September 2021].[/footnote]

5. The challenges of developing bespoke, horizontal rules for AI

The need to develop new, domain-neutral, AI-specific law raises several difficult questions for policymakers. Who should be responsible for developing these legal and regulatory principles? What values and priorities should these principles reflect? How can we ensure that those developing the principles have a good enough understanding of the ways AI can and might develop and impact on society?

It can be difficult to regulate AI systems in a manner that is proportionate

Given the range of applications and uses of AI, a critical challenge in developing an effective regulatory approach is ensuring that rules and standards are strong enough to capture potential harms, while not being unjustifiably onerous for more innocuous or lower-risk
uses of the technology.

The difficulties of developing proportionate regulatory responses to AI are compounded because, as with many emerging technologies, it can be difficult for a regulatory body to understand the potential harms of a particular AI system before that system has become widely deployed or used. However, waiting for harms to become clear and manifest before embarking on regulatory interventions can come with significant risks. One risk is that harms may transpire to be grave, and difficult to reverse or compensate for. Another is that, by the time the harms of an AI system have become clear, these systems may be so integrated into economic life that ex post regulation becomes very difficult.[footnote]Liebert, W., and Schmidt, J. C. (2010). ‘Collingridge’s Dilemma and Technoscience: An Attempt to Provide a Clarification from the Perspective of the Philosophy of Science’. Poiesis & Praxis, 7(1–2), pp. 55–71. Available at: https://doi.org/10.1007/s10202-010-0078-2.[/footnote]

The incentive structures and power dynamics created by AI business models can run counter to regulatory goals and broader societal values

Several expert participants also noted that an approach to regulation must acknowledge the current reality around the market and business dynamics for AI systems. As many powerful AI systems rely on access to large datasets, the business models of AI developers can be heavily
skewed towards accumulating proprietary data, which can incentivise both extractive data practices and restriction of access to that data.

Many large companies now provide AI ‘as a service’, raising the barrier to entry for new organisations seeking to develop their own independent AI capabilities.[footnote]Cobbe, J., and Singh, J. (2021). ‘Artificial Intelligence as a Service: Legal Responsibilities, Liabilities, and Policy Challenges’. SSRN Electronic Journal. Available at: https://ssrn.com/abstract=3824736 or http://dx.doi.org/10.2139/ssrn.3824736.[/footnote] In the absence of strong countervailing forces, this can create incentive structures for businesses, individuals and the public sector that are misaligned with the ultimate goals of regulators and the values of the public. Expert participants in workshops and follow-up discussions identified two of these possible perverse incentive structures: data dependency and the data subsidy.

Data dependency

The principle of universal public services under democratic control is undermined by the public sector’s incentives to rely on large, private companies for data analytics, or for access to data on service users. These services promise efficiency benefits, but threaten to disempower
the public-service provider, with the following results:

  • Public-service providers may feel incentivised to collect more data on their service users that they can use to inform AI services.
  • By relying on data analytics provided by private companies, public services give up control of important decisions to AI systems over which they have little oversight or power.
  • Public-service providers may feel increasingly unable to deliver services effectively without the help of private tech companies.

The data subsidy

The principle of consumer markets that provide choice, value and fair treatment is undermined by the public’s incentives to provide their data in exchange for free or cheaper services (the ‘data subsidy’). This can result in phenomena like personalised pricing and search, which undermine consumer bargaining power and de facto choice, and can lead to the exploitation of vulnerable groups.

Many AI systems are complex and opaque

Another significant difficulty in regulating AI concerns the complexity and opacity of many AI systems. In practice, it can be very difficult for a regulator to understand exactly how an AI system operates, whether there is the potential for it to cause harm, and whether it has done so. The difficulty in understanding AI systems poses serious challenges, and in looking for solutions, it is helpful to distinguish between some of the sources of these challenges, which may include:

  1. regulators’ technical capacity and resources
  2. the opacity of AI developers
  3. the opacity of AI systems themselves.

1. Regulators’ technical capacity and resources

Firstly, many expert participants, including some from regulatory agencies, noted that existing regulatory bodies struggle to regulate AI systems due to a lack of capacity and technical expertise.

There are over 90 regulatory agencies in the UK that enforce legislation in sectors like transportation, public utilities, financial services, telecommunications, health and social services and many others. As of 2016, the total annual expenditure on these regulatory agencies was around £4 billion – but not all regulators receive the same amount, with some, like the Competition and Markets Authority (CMA) or the Office of Communications (Ofcom), receiving far more than smaller regulators like the Equality and Human Rights Commission (EHRC).[footnote]National Audit Office. (2017). A short guide to regulation. UK Government. Available at: www.nao.org.uk/wp-content/uploads/2017/09/A-Short-Guide-to-Regulation.pdf[/footnote]

Some regulators, like the CMA and the Information Commissioner’s Office (ICO), already have in-house employees specialising in data science and AI techniques, reflecting the nature of the work they do and the kinds of organisations they regulate. But as AI systems become more widely used in various sectors of the UK economy, it becomes more urgent for regulators of all sizes to have access to the technical expertise required to evaluate and assess these systems, along with the powers necessary to investigate them.

This poses questions about how regulators might best build their capacity to understand and engage with AI systems, or secure access to this expertise consistently.[footnote]Ada Lovelace Institute (Forthcoming). Technical approaches for regulatory inspection of algorithmic systems in social media platforms. Available at: https://www.adalovelaceinstitute.org/report/technical-methods-regulatory-inspection.[/footnote]

2. The opacity of AI developers

Secondly, many of the difficulties regulators have in understanding AI systems result from the fact that much of the information required to do so is proprietary, and that AI developers and tech companies are often unwilling to share information that they see as integral to their business model. Indeed, many prominent developers of AI systems have cited intellectual property and trade secrets as reasons to actively disrupt or prevent attempts to audit or assess their systems.[footnote]Facebook, for example, has recently shut down independent attempts to monitor and assess their platform’s behaviour. See: Kayser-Bril, N. (2021). AlgorithmWatch forced to shut down Instagram monitoring project after threats from Facebook. Algorithm Watch. Available at: https://algorithmwatch.org/en/instagram-research-shut-down-by-facebook/, and Bobrowsky, M. (2021). ‘Facebook Disables Access for NYU Research Into Political-Ad Targeting’. Wall Street Journal. Available at: www.wsj.com/articles/facebook-cuts-off-access-for-nyu-research-into-political-ad-targeting-11628052204.[/footnote]

While some UK regulators do have powers to inspect AI systems where those systems are developed by regulated entities, inspection becomes much more difficult when those systems are provided by third parties. This issue poses questions about the powers regulators might need to require information from AI developers or users, along with standards of openness and transparency on the part of such groups.

3. The opacity of AI systems themselves

Finally, in some cases, there are also deeper issues concerning the ability of anyone, even the developers of an AI system, to understand the basis on which it may make decisions. The biggest of these is the fact that non-symbolic AI systems, which are the kind of AI responsible for some of the most impressive recent advances in the field, tend to operate as ‘black boxes’, whose decision-making sequences are difficult to parse. In some cases, certain types of AI systems may not be appropriate for deployment in settings where it is essential to be able to provide a contestable explanation.

These difficulties in understanding AI systems’ decision-making processes become especially problematic in cases where a regulator might be interested in protecting against ‘procedural’ harms, or ‘procedural injustices’. In these cases, a harm is recognised not because of the nature of the outcome, but because of the unfair or flawed means by which that outcome was produced.

While there are strong arguments to take these sorts of harms seriously, they can be very difficult to detect without understanding the means by which decisions have been made and the factors that have been taken into account. For instance, looking at who an automated credit-scoring system considers to be most and least creditworthy may not reveal any obvious unfairness – or at the very least will not provide sufficient evidence of procedural harm, as any discrepancies between different groups could theoretically have a legitimate explanation. It is only when considering how these decisions have been made, and whether the system has taken into account factors that should be irrelevant, that procedural unfairness can be identified or ruled out.
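To make this concrete, the sketch below illustrates one way an auditor with access to a model might probe for procedural unfairness: it checks whether a credit-scoring model’s decisions change when a factor that should be irrelevant is altered while everything else about an applicant is held constant. This is a minimal, hypothetical example – the model, feature names and threshold are assumptions for illustration, not a description of any real system or prescribed regulatory method.

```python
# Illustrative counterfactual check: does a supposedly irrelevant factor
# (here a fictional 'postcode_group' feature) change credit decisions when
# everything else about an applicant is held constant?
# All names, the model and the threshold are hypothetical placeholders.

from typing import Callable, Dict, List

Applicant = Dict[str, float]

def flip_rate(
    score_fn: Callable[[Applicant], float],   # assumed interface: applicant -> score
    applicants: List[Applicant],
    irrelevant_feature: str,
    alternative_values: List[float],
    approval_threshold: float = 0.5,
) -> float:
    """Share of applicants whose approve/deny decision changes when only the
    supposedly irrelevant feature is varied."""
    flipped = 0
    for applicant in applicants:
        original_decision = score_fn(applicant) >= approval_threshold
        for value in alternative_values:
            counterfactual = dict(applicant, **{irrelevant_feature: value})
            if (score_fn(counterfactual) >= approval_threshold) != original_decision:
                flipped += 1
                break
    return flipped / len(applicants)

def toy_model(a: Applicant) -> float:
    # Purely illustrative scoring rule.
    return 0.6 * a["income"] + 0.2 * a["postcode_group"]

applicants = [
    {"income": 0.9, "postcode_group": 1.0},
    {"income": 0.7, "postcode_group": 0.0},
]
print(flip_rate(toy_model, applicants, "postcode_group", [0.0, 1.0]))
```

A non-zero flip rate would not prove procedural unfairness on its own, but it would indicate that the supposedly irrelevant factor is influencing outcomes and warrants closer inspection.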

AI harms can be difficult to separate from the technology itself

The complexity of the ways that AI systems can and could be deployed means that there are likely to be some instances when regulators are unsure of their ability to effectively isolate potential harms from potential benefits.

These doubts may be caused by a lack of information or understanding of a particular application of AI. There will inevitably be some instances in which it is very difficult to understand exactly the level of risk posed by a particular form of the technology, and if and how the risks posed by it might be mitigated or controlled, without undermining the benefits of
the technology.

In other cases, these doubts may be informed by the nature of the application itself, or by considerations of the likely dynamics affecting its development. There may be instances where, due to the nature of the form or application of AI, it seems difficult to separate the harms it poses from its potential benefits. Regulators might also doubt whether particular high-risk forms or uses of AI can realistically be contained to a small set of heavily controlled uses. One reason for this is that the infrastructure and investment required to make limited deployments of a high-risk application possible create long-term pressure to use the technology more widely: the industry developing and providing the technology is incentivised to advocate for a greater variety of uses. Government and public bodies may also come under
pressure to expand the use of the technology to justify the cost of having acquired it.

Chapter 3: Tools, mechanisms and approaches for regulating AI systems

To address some of the challenges outlined in the previous section, our expert workshop participants identified a number of tools, mechanisms and approaches to regulation that could potentially be deployed as part of the Government’s efforts to effectively regulate AI systems at different stages of the AI lifecycle.

Some mechanisms can provide an ex ante pre-assessment of an AI system’s risk or impacts, while others provide ongoing monitoring obligations and ex post assessments of a system’s behaviour. It is important to understand that no single mechanism or approach will be sufficient to regulate AI effectively: regulators will need a variety of tools in their toolboxes to draw on as needed.

Many of the mechanisms described below follow the National Audit Office’s Principles of effective regulation,[footnote]National Audit Office. (2021). Principles of effective regulation. UK Government. Available at: www.nao.org.uk/wp-content/uploads/2021/05/Principles-of-effective-regulation-SOff-interactive-accessible.pdf[/footnote] which we believe may offer a useful guide for the Government’s forthcoming White Paper.


Regulatory infrastructure – capacity building and coordination

Capacity building and coordination

The 2021 UK AI Strategy acknowledges that regulatory capacity and coordination will be a major area of focus for the next few years. Our expert participants also proposed sustained and significant expansion of the regulatory system’s overall capacity and levels of coordination, to support successful management of AI systems.

If the UK’s regulators are to adjust to the scale and complexity of the challenges presented by AI, and control the practices of large, multinational tech companies effectively, they will need greater levels of expertise, greater resourcing and better systems of coordination.

Expert participants were keen to stress that calls for the expansion of regulatory capacity should not be limited to the cultivation of technical expertise in AI, but should also extend to better institutional understanding of legal principles, human-rights norms and ethics. Improving regulators’ ability to understand, interrogate, predict and navigate the ethical and legal challenges posed by AI systems is just as important as improving their ability to understand and scrutinise the workings of the systems themselves.[footnote]Yeung, K., Howes, A., and Pogrebna, G. (2020). ‘AI Governance by Human Rights-Centered Design, Deliberation, and Oversight: An End to Ethics Washing’, in Dubber, M. D., Pasquale, F., and Das, S. (eds) The Oxford Handbook of Ethics of AI. Oxford: Oxford University Press. pp. 75–106. Available at: https://doi.org/10.1093/oxfordhb/9780190067397.013.5.[/footnote]

Expert participants also emphasised some of the limitations of AI-ethics exercises and guidelines that are not backed up by hard regulation and the law[footnote]Whittlestone, J., Nyrup, R., Alexandrova, A., Dihal, K., and Cave, S. (2019). Ethical and societal implications of algorithms, data, artificial intelligence: A roadmap for research. London: Nuffield Foundation. Available at: www.nuffieldfoundation.org/sites/default/files/files/Ethical-and-Societal-Implications-of-Data-and-AI-report-Nuffield-Foundat.pdf.[/footnote] – and cited this as an important reason to embed ethical thinking within regulators specifically.

There are different models for allocating regulatory resources and for improving the system’s overall capacity, flexibility and cohesiveness. Any model will need:

  • a means to allocate additional resources efficiently, avoiding duplication of effort across regulators, and guarding against the possibility of gaps and weak spots in the regulatory ecosystem
  • a way for regulators to coordinate their responses to the applications of AI across their respective domains, and to ensure that their actions are in accordance with any cross-cutting regulatory principles or laws regarding AI
  • a way for regulators to share intelligence effectively and conduct horizon-scanning exercises jointly.

One model would be to have centralised regulatory capacity that individual regulators could draw upon. This could consist of AI experts and auditors, as well as funding available to support capacity building in individual regulators. A key advantage of a system of centralised regulatory capacity is that regulators could draw on expertise and resources as and when needed, but the system would have to be designed to ensure that individual regulators had sufficient expertise to understand when they needed to call in additional resources.

An alternative way of delivering centralised regulatory capacity is a model where experts on AI and related disciplines are distributed within individual regulators and circulate between them, reporting back cross-cutting intelligence and knowledge. This would build expert capacity and understanding of the effects AI is having on different sectors and parts of the regulatory system, helping to identify common trends and to strategise and coordinate potential responses.

Another method would be to have AI experts permanently embedded within individual regulators, enabling them to develop deep expertise of the particular regulatory challenges posed by AI in that domain. In this model experts would have to communicate and liaise across regulatory bodies to prevent siloed thinking.

Finally, a much-discussed means of improving regulatory capacity is the formation of a new, dedicated AI regulator. This regulatory body could potentially serve multiple functions, from setting general regulatory principles or domain-specific rules for AI regulation, to providing capacity and advice for individual regulators, to overseeing horizon-scanning exercises and coordinating regulatory responses to AI across the regulatory ecosystem.

Most expert participants did not feel that there would be much benefit from establishing an independent AI regulator for the purposes of setting and enforcing granular regulatory rules. All kinds of AI systems do raise some common and consistent questions – around accountability, fairness, the explainability of automated decisions, the relationship between machine and human agency, privacy and bias.

However, most expert participants agreed that regulatory processes and rules need to be specific to the domain in which AI is being deployed. Some participants acknowledged that there may be a need for an entity to develop and maintain a common set of principles and standards for the regulation of AI, and to ensure that individual regulators apply those principles consistently – by maintaining an overview of the coherence of all the regulatory rules governing AI, and by providing guidance for individual regulators on how to interpret the cross-industry regulatory principles.

None of the above models should be seen as mutually exclusive, nor as substitutes for more money and resources being given to all regulators to deal with AI. Creating pooled regulatory capacity that individual regulators can draw on need not, and should not, come at the expense of improving levels of expertise and analytic capacity within individual regulatory bodies.

With regard to regulatory coordination, several participants noted that existing models aimed at helping regulators work together on issues presented by AI systems should be continued and expanded. For example, the Digital Regulation Cooperation Forum brings together the CMA, ICO, Ofcom and the Financial Conduct Authority (FCA) to ‘ensure a greater level of cooperation given the unique challenges posed by regulation of online platforms’.[footnote]Digital Regulation Cooperation Forum. (2021). UK Government. Available at: www.gov.uk/government/collections/the-digital-regulation-cooperation-forum[/footnote]

Anticipatory capacity

If the regulatory system is to have a chance of addressing the potential harms posed by AI systems and business models effectively, it will need to better understand and anticipate those harms. The ability to anticipate AI harms is also fundamental to overcoming the difficulty
of designing effective ex ante rules to protect against harms that have not yet necessarily occurred on a large scale.

One promising approach to help regulators better understand and address the challenges posed by AI is ‘anticipatory regulation’, a set of techniques and principles intended to help regulators be more proactive, coordinated and democratic in their approach to emerging
technologies.[footnote]Armstrong, H., Gorst, C., Rae, J. (2019). Renewing Regulation: ‘anticipatory regulation’ in an age of disruption. Nesta. Available at: www.nesta.org.uk/report/renewing-regulation-anticipatory-regulation-in-an-age-of-disruption.[/footnote] These techniques include horizon-scanning and futures exercises, such as scenario mapping (especially as collaborations between regulators and other entities), along with iterative, collaborative approaches, such as regulatory sandboxes. They may also include
participatory-futures exercises like citizen juries that involve members of the public, particularly those from traditionally marginalised communities, to help anticipate potential scenarios.

There is already support for regulators to experiment with anticipatory techniques, such as that provided by the Regulators’ Pioneer Fund, and initiatives to embed horizon scanning and futures thinking into the regulatory system, such as the establishment of the Regulatory Horizons Council.[footnote]UK Government. (2021). Regulatory Horizons Council (RHC). Available at: www.gov.uk/government/groups/regulatory-horizons-council-rhc.[/footnote] However, for these techniques to become the norm among regulators, Government support for anticipatory methods will have to be more generous, provided by default and long term.

Workshop participants noted that harms posed by emerging technologies can be overlooked because policymakers lack understanding of how new technologies or services might affect
particular groups. Given this, some participants suggested that efforts to bring in a variety of perspectives to regulatory policymaking processes, via public-engagement exercises or through drives to improve the diversity of policymakers themselves, would have a positive
effect on the regulators’ capacity to anticipate and understand harms and unintended consequences of AI.[footnote]For some ideas on the kinds of participatory mechanisms policymakers could use, please read Ada Lovelace Institute. (2021). Participatory data stewardship. Available at: www.adalovelaceinstitute.org/report/participatory-data-stewardship.[/footnote]

Developing a healthy ecosystem of regulation and governance

Several participants in our workshops noted the need for the UK to adopt a regulatory approach to AI that enables an ‘ecosystem’ of governance and accountability that rewards and incentivises self-governance, and makes possible third-party, independent assessments and reviews of AI systems.

Given the capacity for AI technologies to be deployed in a range of settings and contexts, no single regulator may be capable of assessing an AI system for all kinds of harms and impacts. The Competition and Markets Authority, for example, seeks to address issues of competition and enable a healthy digital market. The Information Commissioner’s Office seeks to address issues of data protection and privacy, while the Equality and Human Rights Commission seeks to address fundamental human rights issues across the UK. AI systems can raise a variety of different risks which may fall under the remit of different regulatory bodies.

One major recommendation from workshop participants, and one evidenced in our research into assessment and auditing methods,[footnote]Ada Lovelace Institute and DataKind UK. (2020). Examining the Black Box: Tools for Assessing Algorithmic Systems. Available at: www.adalovelaceinstitute.org/report/examining-the-black-box-tools-for-assessing-algorithmic-systems/ [accessed 11 October 2021].[/footnote] is that successful regulatory frameworks enable an ecosystem of governance and accountability by empowering regulators, civil-society organisations, academics and members of the public to hold systems to account. The establishment of whistleblower laws, for example, can empower tech workers who identify inherent risks to come forward to a regulator.[footnote]Johnson, K. (2020). ‘From whistleblower laws to unions: How Google’s AI ethics meltdown could shape policy’. VentureBeat. Available at: https://venturebeat.com/2020/12/16/from-whistleblower-laws-to-unions-how-googles-ai-ethics-meltdown-could-shape-policy.[/footnote]

A regulatory framework might also enable greater access for civil-society organisations and academic labs – who are currently responsible for the majority of the audits and assessments that have identified alarming AI-system behaviour – to assess a system’s impacts and behaviour. A regulatory framework that empowers other actors in the ecosystem can help remove the burden from individual regulators to perform these assessments entirely on their own.


Regulatory approaches – risk-based approaches to regulating AI

In 2021, the European Commission released a draft risk-based framework to regulate AI systems that identifies what risk a system poses and assigns specific requirements for developers to meet based on that risk level.[footnote]European Commission. (2021). Regulation of the European Parliament and of the Council Laying down Harmonised Rules on Artificial Intelligence (Artificial Intelligence Act) and Amending Certain Union Legislative Acts. Available at: https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX:52021PC0206.[/footnote] Like the EU, the UK could consider adopting a risk-based approach to the regulation of AI systems, based on their impacts on society. Importantly, the levels of risk in the Commission’s proposed framework are not based on the underlying technological method used (for example, deep learning vs. reinforcement learning), but on the potential impact on ‘fundamental rights’.[footnote]Lum, K., and Chowdhury, R. (2021). ‘What is an “algorithm”? It depends whom you ask’. MIT Technology Review. Available at: www.technologyreview.com/2021/02/26/1020007/what-is-an-algorithm.[/footnote]

The EU model creates four tiers of risks posed by the use of AI in a particular context – unacceptable risk (uses that are banned), high, moderate and minimal risk. Each tier comes with specific requirements for developers of those systems to meet. High-risk systems, for
example, must undergo a self-conformity assessment and be listed on a European-wide public register.
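By way of illustration only, this tiered logic can be thought of as a simple mapping from risk level to obligations, as sketched below. The tier names follow the description above, but the obligations listed are simplified placeholders rather than the wording of the EU proposal.

```python
# Simplified, illustrative mapping from risk tier to developer obligations.
# Tier names follow the description above; the obligations listed are
# placeholders, not the wording of the EU proposal.

RISK_TIERS = {
    "unacceptable": {"permitted": False, "obligations": []},
    "high": {
        "permitted": True,
        "obligations": ["self-conformity assessment",
                        "entry on a public register"],
    },
    "moderate": {"permitted": True, "obligations": ["transparency to users"]},
    "minimal": {"permitted": True, "obligations": []},
}

def requirements_for(tier: str) -> list:
    """Return what a developer must do for a system judged to sit in a given tier."""
    entry = RISK_TIERS[tier]
    if not entry["permitted"]:
        return ["use is prohibited"]
    return entry["obligations"] or ["no additional obligations"]

print(requirements_for("high"))
```

The hard part, as the rest of this section discusses, is not representing the tiers but deciding which tier a given use of AI belongs in, and who makes that judgement.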

While the EU AI regulation states the protection of fundamental rights is a core objective, another clear aim of this regulation is to develop harmonised rules of AI regulation for all member states to adopt. The proposed regulation seeks to ensure a consistent approach across all member states, and so pre-empt and overrule the development of national regulation of AI systems. To achieve this, it relies heavily on EU-standards bodies to establish specific requirements for certain systems to meet based on their risk category. As several academics have noted, these standards bodies are often inaccessible to civil-society organisations, and may be poorly suited for the purposes of regulating AI.[footnote]Veale, M., and Zuiderveen Borgesius, F. (2021). ‘Demystifying the Draft EU Artificial Intelligence Act’. Computer Law Review International. 22 (4). Available at: https://osf.io/preprints/socarxiv/38p5f; Cath-Speth, C. (2021). Available at: https://twitter.com/c___cs/status/1412457639611600900.[/footnote]

A risk-based approach to regulating AI will ensure not all uses of AI are treated the same, which may help avoid unnecessary regulatory scrutiny and wasting of resources on uses of AI that are low risk.

However, risk-based systems of regulation come with their own challenges. One major challenge relates to the identification of risks.[footnote]Baldwin, R., and Black, J. (2016). Driving Priorities in Risk-Based Regulation: What’s the Problem? Journal of Law and Society. 43.4 pp. 565–95. Available at: https://onlinelibrary.wiley.com/doi/pdf/10.1111/jols.12003[/footnote] How should a regulatory system determine what qualifies as a high-risk, medium-risk or low-risk application of a technology? Who gets to make this judgement, and according to what framework of risk? Risks are social constructs, and what may present a risk to one individual in society may benefit another. To mitigate this, if the UK chooses a risk-based approach to regulating AI, it should include a framework for defining and assessing risk that includes a participatory process involving civil-society organisations and those who are likely to be affected by those systems.

Some AI systems are dynamic technologies that can be used in different contexts, so assessing the risk of a system in the abstract – an open-source facial-recognition API, for example – may miss the distinct risks it poses when deployed in different contexts or for different purposes. Identifying the presence of a face for a phone camera creates different risks than using the same system to build a surveillance apparatus for a law-enforcement body. This suggests that there may need to be different mechanisms for assessing the risk of an AI system and its impacts at different stages of its ‘lifecycle’.

Some of the mechanisms described below have the potential to help both developers and regulators assess the risk of a system in early research and development stages, while others may be useful for assessing the risk of a system after it has been procured or deployed.
Mechanisms like impact assessments or participatory methods of citizen engagement offer a promising pathway for the UK to develop an effective tier-based system of regulation that captures risk at different stages of an AI system’s lifecycle. However, more work is needed to determine the effectiveness of these mechanisms.


Regulatory tools and techniques

This section provides some examples of mechanisms and tools for the regulation of AI that our expert participants discussed, and draws heavily on a recent report documenting the ‘first wave’ of public-sector algorithm accountability mechanisms.[footnote]Ada Lovelace Institute, AI Now Institute and Open Government Partnership. (2021). Algorithmic Accountability for the Public Sector. Available at: www.opengovpartnership.org/documents/algorithmic-accountability-public-sector.[/footnote]

This section is not a holistic description of all the mechanisms that regulators might use – sandboxes, for example, are notably absent – but rather seeks to describe some less well-known existing and emerging mechanisms for AI systems, and provides some guidance for the UK Government when considering the forthcoming White Paper and in its forthcoming AI Assurance Roadmap.[footnote]In its 2021 National AI Strategy, the UK Government states the Centre for Data Ethics and Innovation will publish a roadmap for ‘AI Assurance’ which sets out a number of different governance mechanisms and roles for different actors to play in holding AI systems more accountable.[/footnote]

Algorithmic impact assessments (AIAs)

To assess the potential impacts of an AI system on people and society, regulators will need new powers to audit, assess and inspect such systems. As the Ada Lovelace Institute’s report Examining the Black Box notes, the auditing and assessment of AI systems can occur prior to a system’s deployment and after its deployment.[footnote]Ada Lovelace Institute and DataKind UK. (2020). Examining the Black Box: Tools for Assessing Algorithmic Systems. Available at: www.adalovelaceinstitute.org/report/examining-the-black-box-tools-for-assessing-algorithmic-systems.[/footnote]


Impact assessments have a lengthy history of use in other sectors to assess human rights, equalities, data protection, financial and environmental impacts of a policy or technology ex ante. Their purpose is to provide a mechanism for holding developers and procurers of a technology more accountable for its impacts, by enabling greater external scrutiny of its risks and benefits.


Some countries and developers have begun to use algorithmic impact assessments (AIAs) as a mechanism to explore the impacts of an AI system prior to its use. AIAs offer a way for developers or procurers of a technology to engage members of affected communities about what impacts they might foresee an AI system causing, and to document potential impacts. They can also provide developers of a technology with a standardised mechanism for reflecting on intended uses and design choices in the early stages, enabling better organisational practices that can maximise the benefits of a system and minimise its harms. For example, the Canadian Directive on Automated
Decision-Making is a public-sector initiative that requires federal public agencies
to conduct an AIA prior to the production of an AI system.[footnote]As of the date of this report, only two AIAs have been completed by Canadian federal agencies under this directive. Treasury Board of Canada Secretariat, Government of Canada. (2019). Directive on Automated Decision-Making. Available at: www.tbs-sct.gc.ca/pol/doc-eng.aspx?id=32592.[/footnote]


While there is no one-size-fits-all approach to conducting AIAs, recent research has identified ten constitutive elements to any AIA process that ensure meaningful accountability.[footnote]Moss, E., Watkins, E.A., Singh, R., Elish, M.C., and Metcalf, J. (2021). Assembling Accountability Through Algorithmic Impact Assessment. Data & Society Research Institute. Available at: http://datasociety.net/library/assembling-accountability.[/footnote] These include the establishment of a clear independent assessor, the public posting of the results of the AIA, and the establishment of clear methods of redress.

Auditing and regulatory inspection

While impact assessments offer a promising method for an ex ante assessment of an AI system’s impacts on people and society, auditing and regulatory inspection powers offer a related method to assess an AI system’s behaviour and impacts ex post and over time.

Regulatory inspections are used by regulators in other sectors to investigate potentially harmful behaviours. Financial regulatory inspections, for example, enable regulators to investigate the physical premises, documents, computers and systems of banks and other
financial institutions. Regulatory inspections of AI systems could involve the use of similar powers to assess a system’s performance and accuracy, along with its broader impacts on society.[footnote]Ada Lovelace Institute and DataKind UK. (2020). Examining the Black Box: Tools for Assessing Algorithmic Systems. Available at: www.adalovelaceinstitute.org/report/examining-the-black-box-tools-for-assessing-algorithmic-systems.[/footnote]

Conducting a meaningful regulatory inspection of an algorithmic system would require regulators to have powers to accumulate specific types of evidence, including information on:

  • Policies – company policies and documentation that identify the goals of the AI system, what it seeks to achieve, and where its potential weaknesses lie.
  • Processes – assessment of a company’s process for creating the system, including what methods they chose and what evaluation metrics they have applied.
  • Outcomes – the ability to assess the outcomes of these systems for a range of different users.[footnote]Ada Lovelace Institute and Reset. (2020). Inspecting algorithms in social media platforms. Available at: https://www.adalovelaceinstitute.org/report/inspecting-algorithms-in-social-media-platforms/[/footnote]

Regulatory inspections may make use of technical audits of an AI system’s performance or behaviour over a period of time. Technical-auditing methods can help to answer several kinds of questions relating to an AI system’s behaviour, such as whether a particular system is
producing biased outputs or what kind of content is being amplified to a particular user demographic by a social media platform.

In order to conduct technical audits of an AI system, regulators will need statutory powers granting them the ability to access, monitor and audit specific technical infrastructures, code and data underlying a platform or algorithmic system. It should be noted that most technical auditing of AI systems is currently undertaken by academic labs and civil-society organisations, such as the Gender Shades audit that identified racial and gender biases in several facial-recognition systems.[footnote]Buolamwini, J. and Gebru, T. (2018). Gender shades: intersectional accuracy disparities in commercial gender classification. In: Conference on Fairness, Accountability, and Transparency, 81, pp. 1–15. New York: PMLR. Available at: http://proceedings.mlr.press/v81/buolamwini18a/buolamwini18a.pdf[/footnote]
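As a purely illustrative sketch of the kind of question a technical audit might ask, the snippet below compares a system’s error rates across demographic groups using a set of logged predictions. The field names and records are invented for illustration; a real audit such as Gender Shades involves far more careful data collection, benchmarking and methodology.

```python
# Illustrative sketch: compare a system's error rates across demographic groups
# using logged predictions. The records are invented toy data; a real technical
# audit requires far more careful sampling and methodology.

from collections import defaultdict

# Each record: (demographic group, predicted label, true label) - hypothetical.
records = [
    ("group_a", 1, 1), ("group_a", 1, 1), ("group_a", 0, 1),
    ("group_b", 0, 1), ("group_b", 0, 1), ("group_b", 1, 1),
]

totals = defaultdict(int)
errors = defaultdict(int)
for group, predicted, actual in records:
    totals[group] += 1
    if predicted != actual:
        errors[group] += 1

for group in sorted(totals):
    rate = errors[group] / totals[group]
    print(f"{group}: error rate {rate:.0%} across {totals[group]} records")
```

Large disparities between groups, as in this toy output, are the kind of signal that would prompt a regulator or researcher to investigate further.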

Transparency requirements

Several expert participants noted that a major challenge in regulating AI systems is the lack of transparency about where these systems are being used in both the public and private sectors. Without disclosure of the existence of these systems, it is impossible for regulators, civil-society organisations or members of the public to understand what AI-based decisions are being made about them or how their data is being used.

This lack of transparency creates an inherent roadblock for regulators to assess the risk of certain systems effectively, and anticipate future risk down the line. A lack of transparency may also undermine public trust in institutions that use these systems, diminishing trust in government institutions and consumer confidence in UK businesses that use AI
systems. The public outcry over the 2020 Ofqual A-level algorithm was in response to the deployment of an algorithmic system that had insufficient public oversight.[footnote]Office for Statistics Regulation. (2021). Ensuring statistical models command public confidence. Available at: https://osr.statisticsauthority.gov.uk/publication/ensuring-statistical-models-command-public-confidence/.[/footnote]

External-oversight bodies

Another mechanism a UK regulatory framework might consider implementing is a wider adoption of external-oversight bodies that review the procurement or use of AI systems in particular contexts. West Midlands Police currently uses an external ethics committee – consisting of police officials, ethicists, technologists and members of the local community – to review requests from the force to procure AI-based technologies, such as live facial-recognition systems and algorithms designed to predict an individual’s likelihood of committing a crime.[footnote]West Midlands Police and Crime Commissioner (2021). Ethics Committee. Available at: www.westmidlands-pcc.gov.uk/ethics-committee/.[/footnote] While the committee’s decisions are non-binding, they are published on the West Midlands Police website.

External-oversight bodies can also serve the purpose of ensuring a more participatory form of public oversight of AI systems. By enabling members of an affected community to have a say in the procurement and use of these systems, external-oversight bodies can ensure the procurement, adoption and integration of AI-systems is carried out in accordance with democratic principles. Some attempts to create external-oversight bodies have been in bad faith, and these types of bodies must be given meaningful oversight and fair representation if they are to succeed.[footnote]Richardson, R. ed. (2019). Confronting Black Boxes: A Shadow Report of the New York City Automated Decision System Task Force. AI Now Institute. Available at: https://ainowinstitute.org/ads-shadowreport-2019.html.[/footnote]

Standards

In addition to laws and regulatory rules, standards for AI systems, products and services have the potential to form an important component of the overall governance of the technology.

One notable potential use of standards is around improving the transparency and explainability of AI systems. Regulators could develop standards, or standards for tools, to ensure data provenance (knowing where data came from), reproducibility (being able to recreate a given result) and data versioning (saving snapshot copies of the AI in specific states with a view to recording which input led to which output).
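A minimal sketch of what such record-keeping could look like in practice is shown below: each prediction is logged together with a model-version identifier and a hash of its input, so that a given output can later be traced back to the exact model state and data that produced it. The structure and field names are assumptions made for illustration, not a proposed or existing standard.

```python
# Minimal sketch of provenance and versioning record-keeping: each prediction is
# logged with a model-version identifier and a hash of its input, so an output
# can be traced back to the model state and data that produced it.
# Field names and structure are illustrative assumptions, not a standard.

import hashlib
import json
from datetime import datetime, timezone

def provenance_record(model_version: str, input_data: dict, output) -> dict:
    """Build an audit-trail entry linking an input, a model version and an output."""
    canonical_input = json.dumps(input_data, sort_keys=True)
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "model_version": model_version,  # e.g. a git commit hash or registry tag
        "input_sha256": hashlib.sha256(canonical_input.encode()).hexdigest(),
        "output": output,
    }

# Example usage with a toy input and output:
record = provenance_record("loan-model-v3", {"income": 42000}, {"score": 0.73})
print(json.dumps(record, indent=2))
```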

At an international level, the UK AI Strategy states that the UK must get more engaged in international standard-setting initiatives,[footnote]Office for AI. (2021). National AI Strategy. UK Government. Available at: https://www.gov.uk/government/publications/national-ai-strategy[/footnote] a conclusion that many expert participants also agreed with. The UK already exerts considerable influence over international standards on AI, but can and should aspire to do so more systematically.

At a domestic level, the UK could enforce specific standards of practice around the development, use and procurement of AI systems by public authorities. The UK Government has developed several non-binding guidelines around the development and use of data-driven technologies, including the UK’s Data Ethics Framework that guides responsible data
use by public-sector organisations.[footnote]Central Digital and Data Office. (2018). Data Ethics Framework. UK Government. Available at: www.gov.uk/government/publications/data-ethics-framework/data-ethics-framework.[/footnote] Guidelines and principles like these can help developers of AI systems identify what kinds of approaches and practices they should use that can help mitigate harms and maximise benefits. While these guidelines are currently voluntary, and are largely focused on the public sector, the UK could consider codifying them into mandatory requirements for both public- and private-sector organisations.

A related mechanism is the development of standardised public procurement requirements that mandate developers of AI systems undertake certain practices. The line between public and private development of AI systems is often blurry, and in many instances public-sector organisations procure AI systems from private providers who maintain and support the system. Local authorities in the UK often procure AI systems from private developers, including for many high-stakes settings like decisions around border control and the allocation of state benefits.[footnote]BBC News. (2020). ‘Home Office Drops “racist” Algorithm from Visa Decisions’. 4 August. Available at: www.bbc.com/news/technology-53650758; BBC News. (2021). ‘Council Algorithms Mass Profile Millions, Campaigners Say’. 20 July. Available at: www.bbc.com/news/uk-57869647.[/footnote]

Procurement agreements are a crucial pressure point at which public agencies can place requirements on a developer around data governance, privacy and impact assessment. The City of Amsterdam created standardised language for this purpose in 2020. Called the ‘Standard Clauses for Municipalities for Fair Use of Algorithmic Systems’, this language places certain conditions on the procurement of data-driven systems, including that the underlying data quality of a system is assessed and checked.[footnote]Municipality of Amsterdam. (2020). Standard Clauses for Municipalities for Fair Use of Algorithmic Systems. Gemeente Amsterdam. Available at: www.amsterdam.nl/innovatie/[/footnote] The UK might therefore consider regulations that codify and enforce public-procurement criteria.

Despite the importance of standards in any regulatory regime for AI, they have several important limitations when it comes to addressing the challenges posed by AI systems. First, standards tend to be developed through consensus, and are often developed at an international level. As such, they can take a very long time to develop and modify. A flexible regulatory system capable of dealing with issues that arise quickly or unexpectedly should therefore avoid overreliance on standards, and will need other means of addressing important issues in the short term.

Moreover, standards are not especially well suited to dealing with considerations of important and commonly held values such as agency, democracy, the rule of law, equality and privacy. Instead, they are typically used to moderate the safety, quality and security of products. While setting standards on AI transparency and reporting could be instrumental in enabling regulators to understand the ethical impacts of AI systems, the qualitative nature of broader, values-based considerations could make standards poorly suited to addressing such questions directly.

It will therefore be important to avoid overreliance on standards, instead seeing them as a necessary but insufficient component of a convincing regulatory response to the challenges posed by AI.

The UK’s regulatory system will need to get the balance between standards and rules right, and will need to be capable of dealing with the ethical and societal questions posed by AI as well as questions of safety, quality, security and consumer protection. Equally,
it will be important for the regulatory system to have mechanisms to respond to both short- and long-term problems presented by AI systems.

Though standards do have the potential to improve transparency and explainability, some participants in our expert workshops noted that the opaque nature of some AI systems places hard limits on the pursuit of transparency and explainability, regardless of the mechanism used
to pursue these goals. Given this, it was suggested that the regulatory system should place more emphasis on methods that sidestep the problem of explainability, looking at the outcomes of AI systems, rather than the processes by which those outcomes are achieved.[footnote]This is a relatively common approach taken by regulators currently, who understandably do not want to, or feel under-qualified to get into the business of auditing code. A difficulty with this approach is that the opacity of AI systems can make it difficult to predict and assess the outcomes of their use in advance. As a result, ‘outcomes-based’ approaches to regulating AI need to be grounded in clear accountability for AI decisions, rather than attempts to configure AI systems to produce more desirable outcomes.[/footnote]

A final caveat concerning standards is that standard setting is also currently heavily guided and influenced by industry groups, with the result that standards tend to be developed with a particular set of concerns in mind.

Standards could potentially be a more useful complement to other regulatory and governance activity were their development to be influenced by a broader array of actors, including civil-society groups, representatives of communities particularly affected by AI, academics and regulators themselves. Should the UK become more actively involved in standard setting for AI systems, this would present a good opportunity to bring a greater diversity of voices and groups to the table.

Professionalisation

Another suggested mechanism by which the UK regulatory system could seek to address the risks and harms posed by AI systems was the pioneering of an ethical-certification and training framework for those designing and developing AI systems. Establishing professional standards could offer a way for regulators to enforce and incentivise particular governance practices, giving them more enforcement ‘teeth’.

There are several important differences between AI as a sector and domain of practice, and some of the sectors where training and professional accreditation have proven the most successful, such as medicine and the law. These professionalised fields have a very specific
domain of practice, the boundaries of which are clear and therefore easy to police. There are also strong and well established social, economic and legal sanctions for acting contrary to a professional code of practice.

Some expert panellists argued that there is potentially a greater degree of tension between the business models for AI development and the potential contents of an ethical certification for AI developers. Some expert participants noted that the objections to certain AI systems lie not in how they are produced but in their fundamental business model, which may rely on practices like the mass collection of personal data or the development of mass-surveillance systems that some may see as objectionable. This raises questions about the scope and limits of professionalised codes of practice and how far they might be able to help.

Another common concept when discussing the professionalisation of the AI industry is that of fiduciary duties, which oblige professionals to act solely in the best interest of a client who has placed trust and dependence in them. However, some expert participants pointed out that though this model works well in industries like law and finance, it is less readily applicable to data-driven innovation and AI, where it is not the client of the professional who is vulnerable, but the end consumer or subject of the product being developed. The professional culture of ethics exemplified by the fiduciary duty exists within the context of a particular, trusting relationship between professional and client which is not mirrored in most AI business models.

Moratoria and bans

In response to worries about instances in which it may be impossible for regulators to assure themselves that they can successfully manage the harms posed by high-risk applications of AI, it may be desirable for the UK to refrain entirely from the development or deployment of
particular kinds of AI technology, either indefinitely or until such a time as risks and potential mitigations are better understood.

Facial recognition was cited by our expert workshop participants as an example of a technology that, in some forms, could pose sufficiently grave risks to an open and free society as to warrant being banned outright – or at the very least, being subjected to a moratorium. Other countries, including Morocco, have put in place temporary moratoria on the use of these kinds of systems until adequate legal frameworks can be established.[footnote]National Control Commission for the Protection of Personal Data. (2020). Press release accompanying the publication of deliberation No. D-97-2020 of 26/03/2020 (in French). Available at: https://www.cndp.ma/fr/presse-et-media/communique-de-presse/661-communique-de-presse-du-30-03-2020.html [accessed 22 October 2021].[/footnote] Similar bans on municipal uses of facial recognition exist in the US cities of Portland and San Francisco, though these have attracted some criticism around their scope and effectiveness.[footnote]Simonite, T., and Barber, G. (2019). ‘It’s Hard to Ban Facial Recognition Tech in the iPhone Era’. Wired. Available at: www.wired.com/story/hard-ban-facial-recognition-tech-iphone.[/footnote]

One challenge with establishing bans and moratoria for certain technological uses is the necessity of developing a process for assessing the risks and benefits of these technologies, and endowing a regulator with the power to enact these restrictions. Currently, the UK has not endowed any regulator with explicit powers to impose such bans on AI systems, nor with the capacity to develop a framework for assessing the contexts in which certain uses of a technology would warrant a ban or moratorium. If the UK is to consider this mechanism, one initial step would be to develop a framework for identifying the kinds of systems that may pose an unacceptable level of risk.

Another worry expressed by some expert participants was whether bans and moratoria could end up destroying the UK’s own research and commercial capacity in a particular emerging technological field. Would a ban on facial-recognition systems, for example, be overly broad and risk creating a chilling effect on potential positive uses of the underlying technology?

Other expert participants were far less concerned with this possibility, and argued that bans and moratoria should focus on specific uses and outcomes of a technology rather than its underlying technique. A temporary moratorium could be restricted to specific high-risk applications that require additional assessment of their effectiveness and impact, such as the use of live facial recognition in law-enforcement settings. In the UK, questions about bans and moratoria on live facial recognition have so far been dealt with through court challenges, such as the recent decision on South Wales Police’s use of the technology.[footnote]Courts and Tribunals Judiciary. (2020). R (on the application of Edward Bridges) v. The Chief Constable of South Wales Police and the Secretary of State for the Home Department. Case No: C1/2019/2670. Available at: https://www.judiciary.uk/wp-content/uploads/2020/08/R-Bridges-v-CC-South-Wales-ors-Judgment.pdf [accessed 22 October 2021].[/footnote]

Chapter 4: Considerations for policymakers

This section sets out some general considerations for policymakers, synthesised from our expert workshops and the Ada Lovelace Institute’s own research and deliberations. These are not intended to be concrete policy recommendations (see chapter 5), but are general
lessons about the parameters within which the Government’s approach to AI regulation and governance will need to be developed, and the issues that need to be addressed with the current regulatory system.

In summary, policymakers should consider the following:

  1. Government ambitions for AI will depend on the stability and certainty provided by robust, AI-specific regulation and law.
  2. High regulatory standards and innovative, flexible regulatory processes will be critical to supporting AI innovation and use.
  3. A critical challenge with regulating AI systems is that risks can arise at various stages of an AI system’s development and deployment.
  4. The UK’s approach to regulation could involve a combination of a unified approach to the governance of AI, with new, cross-cutting rules set out in statute, and sectoral approaches to regulation.
  5. Substantial regulatory capacity building will be unavoidable.
  6. Promising regulatory approaches and tools will need to be refined and embedded into regulatory systems and structures.
  7. New tools need to be ‘designed into’ the regulatory system.

1. Government ambitions for AI will depend on the stability and certainty provided by robust, AI-specific regulation and law

One of the clearest conclusions to be drawn from the considerations in the previous two sections is that, done properly, AI regulation is a prerequisite for, rather than an impediment to, the development of a flourishing UK AI ecosystem.

Government ambitions to establish the UK as a ‘science superpower’ and use emerging technologies such as AI to drive broadly felt, geographically balanced economic growth will rely on the ability of the UK’s regulatory system to provide stability, certainty and continued
market access for innovators and businesses, and accountability and protection from harms for consumers and the public.

In particular, without the confidence, guidance and support provided by a robust regulatory system for AI, companies and organisations developing AI or looking to exploit its potential will have to grapple with the legal and ethical ramifications of systems on their own. As AI
systems become more complex and capable – and as a greater variety of entities look to develop or make use of them – the existence of clear regulatory rules and a well-resourced regulatory ecosystem will become increasingly important in de-risking the development and use of AI, helping to ensure that it is not just large incumbents that are able to work with the technology.

Critically, the Government’s approach to the governance and regulation of AI needs to be attentive to the specific features and potential impacts of the technology. Rather than concentrating exclusively on increasing the rate and extent of AI development and diffusion, the UK’s approach to AI regulation must also be attentive to the particular ways the technology
might manifest itself, and the specific effects it stands to have on the country’s economy, society and power structures.

In particular, a strategy for AI regulation needs to be designed with the protection and advancement of important and commonly held values, such as agency, human rights, democracy, the rule of law, equality and privacy, in mind. The UK’s AI Strategy already makes reference to some of these values, but a strategy for regulation must provide greater clarity
on how these should apply to the governance of AI systems.

2. High regulatory standards and innovative, flexible regulatory processes will be critical to supporting AI innovation and use

In practice, creating the stability, certainty and continued market access needed to cultivate AI as a UK strength will require the Government to commit to developing and maintaining high, flexible regulatory standards for AI.

As observed by our workshop panellists, there is limited scope for the UK to develop more permissive regulatory standards than its close allies and neighbours, such as the USA and the European Union. Notably, as well as undermining public confidence in a novel and powerful
technology, aspiring to regulatory standards that are lower than those of the European Union would deprive UK-based AI developers of the ability to export their products and services not only to the EU, but to other countries likely to adopt or closely align with the bloc’s regulatory model.

There are, nonetheless, significant opportunities for the UK to do AI regulation differently to, and more effectively than, other countries. While the UK will need to align with its allies on regulatory standards, the UK is in a good position to develop more flexible, resilient and
effective regulatory processes. The UK has an excellent reputation and track record in regulatory innovation, and the use of flexible, pragmatic approaches to monitoring and enforcement. This expertise, which has in part contributed to British successes in fields such as bioscience and fintech, should be leveraged to produce a regulatory ecosystem that supports and empowers businesses and innovators to develop and exploit the potential of AI.

3. A critical challenge with regulating AI systems is that risks can arise at various stages of an AI system’s development and deployment

Unlike most other technologies, AI systems can raise different kinds of risks at different stages of a system’s development and deployment. The same AI system applied in one setting (such as a facial scan for authenticating entry to a private warehouse) can raise significantly
different risks when applied in another (such as authenticating entry to public transport). Similarly, some AI systems are dynamic, and their impacts can change drastically when fed new kinds of data or when deployed in a different context. An ex ante test of a system’s behaviour in ‘lab’ settings may therefore not provide an accurate assessment of that system’s actual impacts when deployed ‘in the wild’.

Many of the proposed models for regulating AI focus either on ex ante assessments that classify an AI system’s risk, or ex post findings of harm in a court of law. One option the UK might consider is an approach to AI regulation that includes regulatory attention at all stages of an AI system’s development and deployment. This may, for example, involve using ex ante algorithmic impact assessments (AIAs) of a system’s risks and benefits pre-deployment, along with post-deployment audits of that system’s behaviour.
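As a purely illustrative sketch of what regulatory attention across the lifecycle might involve, the hypothetical fragment below pairs an ex ante impact-assessment gate with an ex post audit trigger. All field names and thresholds are assumptions made for illustration, not elements of any proposed framework.

```python
from dataclasses import dataclass, field

@dataclass
class AISystemRecord:
    """Hypothetical record a regulator might hold for a single AI system."""
    name: str
    deployment_context: str
    ex_ante_assessment: dict = field(default_factory=dict)  # findings of a pre-deployment AIA
    ex_post_audits: list = field(default_factory=list)      # findings of post-deployment audits

def approved_for_deployment(record: AISystemRecord, risk_ceiling: float = 0.5) -> bool:
    # Ex ante gate: approve only if the assessed risk score stays under an
    # illustrative ceiling (0.5 is an assumption, not a proposed standard).
    return record.ex_ante_assessment.get("risk_score", 1.0) <= risk_ceiling

def needs_reassessment(record: AISystemRecord) -> bool:
    # Ex post gate: any audit observing more harm than the impact assessment
    # anticipated triggers a fresh review of the system.
    anticipated = record.ex_ante_assessment.get("anticipated_harm", 0.0)
    return any(a.get("observed_harm", 0.0) > anticipated for a in record.ex_post_audits)
```

The point of the sketch is simply that the same record accompanies a system through both stages, so that what is observed in deployment can be compared against what was anticipated before deployment.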

If the UK chooses to follow this model, it will have to provide regulators with the necessary powers and capacity to undertake these kinds of holistic regulatory assessments. The UK may also consider delegating some of these responsibilities to independent third parties, such as
algorithmic-auditing firms.

4. The UK’s approach to regulation could involve a combination of a unified approach to the governance of AI, with new, cross-cutting rules set out in statute, and sectoral approaches to regulation

A common question raised by our expert participants was whether the UK should adopt a unified approach to regulating AI systems, involving a central function that oversees all AI systems, or whether regulation should be left to individual regulators who approach these issues on a sectoral or case-by-case basis.

One approach the UK Government could pursue is a combination of the two. While individual regulators can and should develop domain- and sector-specific regulatory rules for AI, there is also a need for a more general, overarching set of rules, which outline if and under what
circumstances the use of AI is permissible. The existence of such general rules is a prerequisite for a coherent, coordinated regulatory and legal response to the challenges posed by AI.

If they are to provide the stability, predictability and confidence needed for the UK to get the most out of AI, these new, AI-specific regulatory rules will probably have to be developed and set out in statute.

The unique capacity of AI systems to develop and change independently of human control and intervention means that existing legal and regulatory rules are likely to prove inadequate. While the UK's common-law system may develop to accommodate some of these features, this will only happen slowly (if it happens at all) and there is no guarantee that the resulting rules will be clear or amount to a coherent response to the technology.

5. Substantial regulatory capacity building will be unavoidable

The successful management of AI will require a sustained and significant expansion of the regulatory system’s overall capacity and levels of coordination.

There are several viable options for how to organise and allocate additional regulatory capacity, and to improve the ability of regulators to develop sector-specific regulatory rules that amount to a coherent whole. Regardless of the specific institutional arrangements, any
capacity building and coordination efforts must ensure that:

  1. additional resources can be allocated without too much duplication of effort, and that gaps and blind spots in the regulatory system are avoided
  2. regulators are able to understand how their responses to AI within their specific domains contribute to the broader regulatory environment, and are provided with clear guidance on how their policies can be configured to complement those of other regulators
  3. regulators are able to easily share intelligence and jointly conduct horizon-scanning exercises.

6. Promising regulatory approaches and tools will need to be refined and embedded into regulatory systems and structures

There are a number of tools and mechanisms, many of them pioneered by UK entities, that already exist or are currently being developed and that could enable regulators to rise effectively to the challenges presented by AI.

These include tools of so-called ‘anticipatory regulation’, such as regulatory sandboxes, regulatory labs and coordinated horizon-scanning and foresight techniques, as well as deliberative mechanisms for better understanding informed public opinion and values regarding emerging technologies, such as deliberative polling, citizens’ juries and assemblies.

Some of these tools are still emerging and should be tested further to determine their value, such as the use of transparency registers to disclose where AI systems are in operation, or algorithmic impact assessments to provide an ex ante assessment of an AI system’s benefits
and harms. While many of the above tools have the potential to prove invaluable in helping regulators and lawmakers rise to the challenges presented by AI, many are still nascent or have only been used in limited circumstances. Moreover, many of the tools needed to help regulators address the challenges posed by AI do not yet exist.

To ensure that regulators have the tools they require, there needs to be a substantial, long-term commitment to supporting regulatory innovation and experimentation, and to supporting the diffusion of the most mature, proven techniques throughout the regulatory ecosystem. This ongoing experimentation will be crucial to ensure that the regulatory system does not become overly dependent on particular kinds of regulatory interventions, but instead has a toolkit that allows it to respond quickly to emerging harms and dangers, as well as being able to develop
more nuanced and durable rules and standards in the longer term.

7. New tools need to be ‘designed into’ the regulatory system

As well as helping cultivate and refine new regulatory tools and techniques, further work is required to understand how regulatory structures and processes might be configured to best enable them.

This is particularly true of anticipatory and participatory mechanisms. The value of techniques like sandboxing, horizon scanning and citizens’ juries is unlikely to be fully realised unless the insight gained from these activities is systematically reflected in the development and
enforcement of broader regulatory rules.

A good example of how these tools are likely to be most useful if ‘designed into’ regulatory systems and processes is provided by risk-based regulation. Given the variety of applications of AI systems, the UK may choose to follow the approach of the European draft AI regulation and adopt some form of risk-based regulation to prevent gross over- or under-regulation of AI systems. However, if such an approach is to avoid creating gaps in the regulatory system, in which harmful practices escape appropriate levels of regulatory scrutiny, the system’s ability to
make and review judgements about which risk categories different AI systems should fall into will need to be improved.

One element of this will be using anticipatory mechanisms to help predict harms and unintended consequences that could arise from different uses of AI. Participatory mechanisms that involve regulators working closely with local-community organisations, members of
the public and civil society may also help regulators identify and assess risks to particular groups.

Perhaps the bigger challenge, though, will be to design processes by which the risk tiers into which different kinds of systems fall are regularly reviewed and updated, so that AI systems whose risk profiles may change over time do not end up being over- or under-regulated.

Chapter 5: Open questions for Government

This section sets out a series of open questions that we believe the White Paper on AI regulation and governance should respond to, before making a series of more specific recommendations about things that we believe it should commit to.

We acknowledge that these open questions touch on complex issues that cannot be easily answered. In the coming months, we encourage the Office for AI to engage closely with members of the public, academia, civil society and regulators to further develop these ideas.

Open questions

AI systems present a set of common, novel regulatory challenges, which may manifest differently in different domains, and which demand holistic solutions. A coherent regulatory response to AI systems therefore requires a combination of general, cross-cutting regulatory
rules and sector-specific regulations, tailored to particular uses of AI.

Finding the right balance between these two will depend on how the UK chooses to answer several open questions relating to the regulation of AI. A more detailed discussion around some of these questions, along with other considerations when designing and configuring regulatory systems, can be found in the annex.

What to regulate?

First, the UK Government must determine what kinds of AI systems it seeks to regulate, and what definition it will use to classify AI systems appropriately. Some possible options include:

  1. Regulating all AI systems equally. Anything classified as an ‘AI system’ must follow common rules. This may require the UK to choose a more precise definition of ‘AI system’ to ensure particular kinds of systems (such as those used to augment or complement human decision-making) are included. This may prove resource-intensive both for regulators and for new entrants seeking to build AI, but this approach could ensure no potentially harmful system avoids oversight.
  2. Regulating higher-risk systems. This would involve creating risk tiers and regulating ‘higher-risk’ systems more intensely than lower-risk ones, and could involve the UK adopting a broader and more encompassing definition of AI systems. A challenge with risk-based approaches to regulation comes in identifying and assessing the level of risk, particularly when risks for some members of society may be benefits for others. The UK could consider assigning risk tiers in a number of ways (a hypothetical sketch of how such tiering might be encoded follows this list), including:
    1. Enumerating certain domains (such as credit scoring, or public services) that are inherently higher risk.[footnote]This is the approach the EU’s Draft AI Regulation takes. See Annex III of the European Commission. (2021). Proposal for a Regulation of the European Parliament and of the Council laying down harmonised rules on artificial intelligence (Artificial Intelligence Act) and amending certain Union legislative acts (COM(2021) 206 final).[/footnote] This approach could be easily bypassed by a developer seeking to classify their system in a different domain, and it may not capture ‘off-label’ uses of a system that could have harmful effects.
    2. Enumerating certain uses (such as facial-recognition systems that identify people in public places) as higher risk. This approach could also be easily bypassed by a developer who reclassifies the use of their system, and would require constant updating of new high-risk uses and a process for determining that risk.
    3. Enumerating certain criteria for assigning higher risk. These could include ex ante assessments of the foreseeable risk of a system’s intended and reasonably likely uses, along with ex post assessments of a system’s actual harms over time.
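To make the tiering options above concrete, here is a minimal, hypothetical sketch; the listed domains, uses and numerical thresholds are invented for illustration and are not proposals.

```python
# All domains, uses and thresholds below are invented for illustration only.
HIGH_RISK_DOMAINS = {"credit scoring", "public services"}             # option 1: enumerated domains
HIGH_RISK_USES = {"facial recognition identifying people in public"}  # option 2: enumerated uses

def assign_risk_tier(domain: str, use: str,
                     foreseeable_risk: float, observed_harm: float) -> str:
    """Combine the three options: enumerated domains, enumerated uses, and
    criteria based on ex ante and ex post assessments (option 3)."""
    if domain in HIGH_RISK_DOMAINS or use in HIGH_RISK_USES:
        return "higher-risk"
    if foreseeable_risk > 0.7 or observed_harm > 0.3:  # illustrative criteria thresholds
        return "higher-risk"
    return "lower-risk"
```

In practice, any such lists and thresholds would themselves need the regular review discussed in the previous chapter, so that systems whose risk profiles change over time are re-tiered accordingly.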

Who to regulate?

The UK Government must similarly choose who is the focus of AI regulation. This could include any of the following actors, with different obligations and requirements applying to each one (an illustrative sketch of these actor categories follows the list):

  1. Developers: Those who create a system. Regulatory rules that enforce ex ante requirements about a system’s design, intended use or oversight could be enforced against this group.
  2. Adapters: A sub-developer who creates an AI system based on building blocks provided by other developers. For example, a developer who uses the Google Cloud ML service, which provides machine-learning models for developers to use, could be classified as an adapter. Similarly, a developer who utilises ‘foundation’ models like OpenAI’s GPT-3 to train their model could be classified as an adapter.[footnote]For a discussion about the opportunities and risks of ‘foundation models,’ see Bommasani, R., et al. (2021). On the opportunities and risks of foundation models. Stanford FSI. Available at: https://fsi.stanford.edu/publication/opportunities-and-risks-foundation-models[/footnote]
  3. Deployers: The person who is responsible for putting a system into practice. While a deployer may have procured this system from a developer, they may not have access to the source code or data of that system.[footnote]The EU’s Draft AI regulation attempts to distinguish between developers and ‘users,’ a term that can be confused with those who are subject to an AI system’s decisions. See Smuha, N. et al. (2021). How the EU Can Achieve Legally Trustworthy AI: A Response to the European Commission’s Proposal for an Artificial Intelligence Act. Available at SSRN: https://ssrn.com/abstract=3899991 or http://dx.doi.org/10.2139/ssrn.3899991.[/footnote]
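The three actor categories above could, purely for illustration, be modelled as follows; the obligations attached to each actor are assumptions made for the sake of the sketch, not requirements drawn from any existing or proposed regime.

```python
from enum import Enum

class Actor(Enum):
    DEVELOPER = "developer"  # creates an AI system
    ADAPTER = "adapter"      # builds on components such as cloud ML services or foundation models
    DEPLOYER = "deployer"    # puts a system into practice, possibly without source-code access

# Hypothetical mapping of actors to obligations; the entries are illustrative
# assumptions, not obligations drawn from any existing or proposed regulation.
OBLIGATIONS = {
    Actor.DEVELOPER: ["ex ante design and documentation requirements", "statement of intended use"],
    Actor.ADAPTER:   ["disclosure of upstream components used", "assessment of the changed risk profile"],
    Actor.DEPLOYER:  ["impact assessment in the context of use", "post-deployment monitoring and reporting"],
}
```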

How and when to regulate?

Part of the challenge with regulating AI systems is that risks and harms may arise at different stages of a product’s lifecycle. Addressing this challenge requires a combination of both ex ante and ex post regulatory interventions. Some options the UK Government could consider include:

  1. Ex ante criteria that all AI systems must meet. These could include both technical requirements around the quality of datasets an AI system is trained on, and governance requirements, including documentation standards (such as the use of model cards) and bias assessments. A regulatory system could ensure developers of an AI system meet these requirements through either:
    1. Self-certification: A developer self-certifies they are meeting these requirements. This raises a risk of certification becoming a checkbox exercise that is easily gameable.[footnote]The EU’s proposed regulations follow this same approach.[/footnote]
    2. Third-party certification: The UK Government could require developers to obtain a certification from a third-party, either a regulator or Government-approved independent certifier. This could enable more independent certification, but may become a barrier for smaller firms.
  2. Ex ante sectoral codes of practice. Certain sectors may choose to implement additional criteria on an AI system before it enters the market. This may be essential for certain sectors like healthcare that require additional checks for patient safety and operability of a system. This could include checks about how well a system has
    been integrated into a particular environment, or checks on how a system is behaving in a sandbox environment.
  3. Ex post auditing and inspection requirements. Regulators could evaluate the actual impacts and risks of a system post-deployment by inspecting and auditing its behaviour. This may require expanding on existing multi-regulator coordination efforts like the Digital Regulation Cooperation Forum to identify gaps and share
    information, and to create longitudinal studies on the risk and behaviour of an AI system over time.
  4. Novel forms of redress. This could include the creation of an ombudsman or a form of consumer champion to receive and raise complaints about an AI system on behalf of people and society, and to ensure the appropriate regulator has dealt with them.

Chapter 6: Recommendations for the Government’s White Paper on AI regulation

With the above open questions in mind, we recommend the Government focuses on taking action in the following three areas in their forthcoming White Paper on AI regulation:

  1. The development of new, clear regulations for AI.
  2. Improved regulatory capacity and coordination.
  3. Improving transparency standards and accountability mechanisms.

1. The development of new, clear regulations for AI

Recommendation 1:

The Government should establish a clear definition of AI systems that matches their overall approach towards regulation.

How broad and encompassing this definition may be will depend on what kind of regulatory approach the Government chooses (for example, risk-based vs all-encompassing), what criteria the Government chooses to trigger intervention (such as systems they classify as ‘high risk’ vs ‘low risk’) and which actors the Government chooses to target regulation at (such as the developers of AI or the deployers).

  • In their White Paper, the Government should explore the possibility of combining sectoral and risk-based approaches, and should commit to engaging with civil society on these questions.
  • The Government should commit to ensuring the definition and approach to AI they choose will be subject to parliamentary scrutiny.

Recommendation 2:

Government should consider creating a central function to oversee the development and implementation of AI-specific, domain-neutral statutory rules for AI systems. These rules should be subject to regular parliamentary scrutiny.

These domain-neutral statutory rules could:

  • set out consistent ways for regulators to approach common challenges posed by AI systems (such as accountability for automated decision-making, the encoding of contestable, value-laden judgements into AI systems, AI bias, the appropriate place for human oversight and challenge of AI systems, the problems associated with understanding, trusting and making important choices on the basis of opaque AI decision-making processes). The proposed approaches should be rooted in legal concepts and ethical values such as fairness, liberty, agency, human rights, democracy and the rule of law.

The specific understanding of these concepts and values should be informed not just by the existing discourse on AI ethics, but also by engagement with the public. The Government should commit to co-developing these rules with members of the public, civil society
and academia. These rules should:

  • include and set out a requirement for, and a mechanism by which, the central function must regularly revisit the definition of AI, the criteria for regulatory intervention and the domain-neutral rules themselves. The central function should be required to provide an annual report to Parliament on the status and operation of these rules.
  • provide a means of requiring individual regulators to attend to, and address the systemic, long-term impacts of AI systems. While the regulatory system as a whole is a potentially critical lever in addressing them, many of the most significant impacts of AI systems – such as how they affect democracies and alter the balance of power
    between different groups in society – are not covered by the narrow, domain-bounded remits of individual regulators. The provision of domain-neutral rules for AI regulation would be one way to require and mandate individual regulators to make regulatory decisions with a view to addressing these larger, more systemic issues – and could be a way of guiding regulators to do so in a coordinated manner.
  • provide a means for regulators to address all stages of an AI system’s lifecycle, from research to product development to procurement and post-deployment. This would require regulators to use ex ante regulatory mechanisms (such as impact assessments) to assess the potential impacts of an AI system on people and society, along with ex post mechanisms (such as regulatory inspections and audits) to determine the actual impact of an AI system’s behaviour on people and society. Regulators could also be required to use anticipatory methods to assess the potential future risks posed by AI systems in different contexts.
  • be intended to supplement, rather than replace, existing laws governing AI systems. These rules should complement existing health and safety, consumer protection, human rights and data-protection regulations and law.

In addition to developing and updating the domain-neutral rules, the central function could be responsible for:

  • leading cross-regulatory coordination on the regulation of AI systems, along with cross-regulatory horizon-scanning and foresight exercises to provide intelligence on potential harms and challenges posed by AI systems that may require regulatory responses
  • monitoring common challenges with regulating AI and, where there is evidence of problems that require new legislation, making recommendations to Parliament to address gaps in the law.

Recommendation 3:

Government should consider requiring regulators to develop sector-specific codes of practice for the regulation of AI.

These sector-specific codes of practice would:

  • lay out a regulator’s approach to setting and enforcing regulatory rules covering AI systems in particular contexts or domains, as well as the general regulatory requirements placed on developers, adapters and deployers of those systems
  • be developed and maintained by individual regulators, who are best placed to understand the particular ways in which AI systems are deployed in regulatory domains, the risks involved in those deployments, their current and future impacts, and the practicality of different regulatory interventions
  • be subject to regular review to ensure that they keep pace with developments in AI technologies and business models.

Potential synergy between recommendations 2 and 3

While recommendations 2 and 3 could individually each bring benefits to the regulatory system’s capacity to deal with the challenges posed by AI, we believe that they would be most beneficial if implemented together, enabling a system in which cross-cutting regulatory rules inform and work in tandem with sector-specific codes of practice.

Below we illustrate one potential way that the central function, domain-neutral statutory rules and sector-specific codes of practice could be combined to improve the coordination and responsiveness of the regulatory system with regards to AI systems.

A potential model for horizontal and vertical regulation of AI

 

On this model (sketched schematically in the code example after the list below):

  • The central function would create domain-neutral statutory rules.
  • Individual regulators would be required to take the domain-neutral statutory rules into account when developing and updating the sector-specific codes of practice. These sector-specific codes of practice would apply the domain-neutral statutory rules to specific kinds of AI systems, or the use of those systems in specific contexts. These codes of practice should include enforcement mechanisms that address all stages of an AI system’s lifecycle, including ex ante assessments like impact assessments and ex post audits of a system’s behaviour.
  • Careful adherence to the domain-neutral statutory rules when developing the sector-specific codes of practice would help ensure that the multiple different AI codes of practice, developed across different regulators, all approached AI regulation with the same high-level goals in mind.
  • The central function would have a duty to advise and work with individual regulators on how best to interpret the domain-neutral statutory rules when developing sector-specific codes of practice.
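As a purely schematic sketch of how this model might hang together, the fragment below represents domain-neutral rules and sector-specific codes of practice as simple mappings, with a coverage check of the kind a central function could use when advising regulators. All rule names, sectors and wording are invented for illustration only.

```python
# Schematic only: rule identifiers, sectors and wording are invented for illustration.
DOMAIN_NEUTRAL_RULES = {
    "DN1": "accountability for automated decision-making",
    "DN2": "human oversight and contestability",
    "DN3": "lifecycle assessment: ex ante impact assessment plus ex post audit",
}

SECTOR_CODES = {
    "healthcare": {"DN1": "clinical accountability procedures",
                   "DN2": "clinician override requirements",
                   "DN3": "pre-market assessment plus in-service audits"},
    "finance":    {"DN1": "model governance and sign-off",
                   "DN2": "customer contest and appeal routes"},
}

def uncovered_rules(sector: str) -> list:
    """Flag domain-neutral rules that a sector-specific code has not yet applied."""
    code = SECTOR_CODES.get(sector, {})
    return [rule_id for rule_id in DOMAIN_NEUTRAL_RULES if rule_id not in code]

# e.g. uncovered_rules("finance") returns ["DN3"]
```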

2. Improved regulatory capacity and coordination

AI systems are often complex, opaque and straddle regulatory remits. For the regulatory system to be able to deal with these challenges, significant improvements will need to be made to regulatory capacity (both at the level of individual regulators and of the whole regulatory system) and to coordination and knowledge sharing between regulators.

Recommendation 4:

Government should consider expanded funding for regulators to deal with analytical and enforcement challenges posed by AI systems. This funding will support building regulator capacity and coordination.

Recommendation 5:

Government should consider expanded funding and support for regulatory experimentation, and the development of anticipatory and participatory capacity within individual regulators. This will involve bringing in new forms of public engagement and futures expertise.

Recommendation 6:

Government should consider developing formal structures for capacity sharing, coordination and intelligence sharing between regulators dealing with AI systems.

These structures could include a combination of several different models, including centralised resources of AI knowledge, experts rotating between regulators and the expansion of existing cross-regulator forums like the Digital Regulation Cooperation Forum.

Recommendation 7:

Government should consider granting regulators the powers needed to enable them to make use of a greater variety of regulatory mechanisms.

These include providing statutory powers for regulators to engage in regulatory inspections of different kinds of AI systems. The Government should commission a review of the powers different regulators will need to conduct ex ante and ex post assessments of an AI system before, during, and after its deployment.


3. Improving transparency standards and accountability mechanisms

The impacts of AI systems may not always be visible to, or controllable by, policymakers and regulators alone. This means that regulation and regulatory intelligence gathering will need to be complemented by and coordinated with extra-regulatory mechanisms, such as standards,
investigative journalism and activism.

Recommendation 8:

Government should consider how best to use the UK’s influence over international standards to improve the transparency and auditability of AI systems.

While these are not a silver bullet, they can help ensure the UK’s approach to regulation and governance remains interoperable with approaches in other regions.

Recommendation 9:

Government should consider how best to maintain and strengthen laws and mechanisms to protect and enable journalists, academics, civil-society organisations, whistleblowers and citizen auditors to hold developers and deployers of AI systems to account.

This could include passing novel legislation to require the disclosure of AI systems when in use, or requirements for AI developers to disclose data around systems’ performance and behaviour.

Annex: The anatomy of regulatory rules and systems, and how these apply to AI

To explore how the UK’s regulatory system might adapt to meet the needs of the Government’s ambitions for AI, it is useful to consider ways in which regulatory systems (and sets of regulatory rules) can vary.

This section sets out some important variables in the design of regulatory systems, and how they might apply specifically to the regulation of AI. It is adapted from a presentation given at the second of the expert workshops, by Professor Julia Black, who has written
extensively on this topic.[footnote]Black, J., and Murray, A. D. (2019). ‘Regulating AI and machine learning: setting the regulatory agenda’. European Journal of Law and Technology, 10 (3). Available at: http://eprints.lse.ac.uk/102953/4/722_3282_1_PB.pdf[/footnote]

The following section addresses the challenges some of these variables may pose for the regulation of AI.

Why to regulate: The underlying aims of regulation

Regulatory systems can vary in terms of their underlying aims. Regulatory systems may have distinct, narrowly defined aims (such as maximising choice and value for consumers within a particular market, or ensuring a specific level of safety for a particular category of product), and may also have been driven by different broader objectives.

In the context of the regulation of AI, some of the broader values that could be taken into consideration by a regulatory system might include economic growth, the preservation of privacy, the avoidance of concentrations of market power and distributional equality.

When to regulate: The timing of regulatory interventions

A second important variable in the design of a regulatory system concerns the stage at which regulatory interventions take place. Here, there are three mutually compatible options:

Before: A regulator can choose to intervene prior to a product or service entering a market, or prior to it receiving regulatory approval. In the context of AI, examples of ex ante regulation might include pre-market entry requirements, such as audits and assessments of AI systems by regulators to ascertain the levels of accuracy and bias.[footnote]Ada Lovelace Institute and DataKind UK. (2020). Examining the Black Box: Tools for Assessing Algorithmic Systems Available at: www.adalovelaceinstitute.org/report/examining-the-black-box-tools-for-assessing-algorithmic-systems.[/footnote] It might also include bans on specific uses of AI in particular, high-risk settings.

During: A regulator can also intervene during the course of the operation of a business model, product or service. Here, the regulator stipulates requirements that the product must meet during the course of its operation. Typically, this type of intervention will require some form of inspection regime to ensure ongoing compliance with the regulator’s requirements. In the context of AI, it might involve establishing mechanisms by which regulators can inspect algorithmic systems, or requirements for AI developers to disclose information on the performance of their systems – either publicly or to the regulator.

After: A regulator can intervene retrospectively to remedy harms, or address breaches of regulatory rules and norms. Retrospective regulation can take the form of public enforcement, undertaken by regulators with statutory enforcement powers, or private-sector enforcement pursued via contract, tort and public-law remedies. An AI-related example might be regulators having the power to issue fines to developers or users of AI systems for breaches of regulatory rules, or as redress for particular harms done to individuals or groups resulting from failure to comply with regulation.

What to regulate: Targets of regulatory interventions

A third important variable concerns the targets of regulatory interventions. Here, regulators and regulatory systems can be configured to concentrate on any of the following:

Conduct and behaviour: One of the most common forms of intervention involves regulating the conduct or behaviour of a particular actor or actors. On the one hand, regulation of conduct can be directed at suppliers of goods, products or services, and often involves stipulating:

  1. rules for how firms should conduct business,
  2. requirements to provide information or guidance to consumers, or
  3. responsibilities that must be borne by particular individuals.

Regulation of conduct can also be directed towards consumers, however. Attempts to regulate consumer behaviour typically involve the provision of information or guidance to help consumers better navigate markets. This kind of regulation may also involve manipulation of the way that consumers’ choices are presented and framed, known as ‘choice architecture’, with a view towards ‘nudging’ consumers to make particular choices.

Systems and processes: A second target of regulation is the systems and processes followed by companies and organisations. Regulators may look to dictate aspects of business processes and management systems, or else introduce new processes that companies and organisations have to follow, such as health-and-safety checks and procedures. Regulators may also target technical and scientific processes: for example, the UK Human Fertilisation and Embryology Authority addresses the scientific processes that can be adopted for human fertilisation.

Market structure: A third target of regulation is the overall dynamics and structure of a market, with the aim of addressing current or potential market failures. Regulation of market structure may be aimed at preventing monopolies or other forms of anti-competitive behaviours or structures, or at more specific goals, such as avoiding moral hazard or containing the impact of the collapse of particular companies or sectors. These can be achieved through competition law or through the imposition of sector-specific rules.

Technological infrastructure should be a key concern for regulators of AI, particularly given that the majority of AI systems and ‘cloud’ services are going to be built on, and dependent on, physical infrastructure provided by big tech. Regulators will want to consider control of the infrastructure necessary for the functioning of AI (and digital technologies more
generally), as well as the competition implications of this trend.

It is worth noting that the early 2020s is likely to be a time of significant change in approaches to competition law – particularly in relation to the tech industry. In the USA, the Biden administration has shown greater willingness than any of its recent predecessors to reform competition law, though the extent and direction of any changes remains unclear.[footnote]Bietti, E. (2021). ‘Is the Goal of Antitrust Enforcement a Competitive Digital Economy or a Different Digital Ecosystem?’ Ada Lovelace Institute. Available at: www.adalovelaceinstitute.org/blog/antitrust-enforcement-competitive-digital-economy-digital-ecosystem/ [accessed 20 September 2021][/footnote] In the EU, the Digital Markets Act[footnote]Tambiama, M. (2021). Digital Markets Act – Briefing, May 2021, p. 12. Available at: www.europarl.europa.eu/RegData/etudes/BRIE/2021/690589/EPRS_BRI(2021)690589_EN.pdf[/footnote] is set to change the regulatory landscape dramatically. For a UK Government eager to stimulate and develop the UK tech sector, getting the UK regulatory system’s
approach to competition law right will be imperative to success.

Calculative methods: A particularly important target of regulation in the context of AI is calculative and decision-making models. These can range from simple mathematical models that set the prices of consumer products, to more complex algorithms used to rate a person’s creditworthiness, or the artificial neural networks used to power self-driving vehicles.

Regulation of calculative methods can be undertaken by directly stipulating the requirements for the model (for instance stating that a decision-making model should have a particular accuracy threshold), or else by regulating the nature of the calculative or decision-making models themselves. For instance, in finance, a regulator might stipulate the means by which a bank calculates its liabilities – the cash reserves it must set aside as contingency.
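For illustration only, directly stipulating a requirement on a decision-making model could amount to a simple compliance check of the kind sketched below; the accuracy metric and the 0.9 threshold are assumptions rather than values drawn from any regulatory text.

```python
def meets_stipulated_accuracy(predictions, ground_truth, threshold: float = 0.9) -> bool:
    """Check a decision-making model's accuracy against a regulator-stipulated
    threshold; the 0.9 figure is purely illustrative."""
    correct = sum(1 for p, t in zip(predictions, ground_truth) if p == t)
    return correct / len(ground_truth) >= threshold
```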

How widely to regulate: The scope of regulatory intervention

An important related variable that is particularly salient in the context of a general-purpose technology like AI is the scope of regulation. Here, it is useful to distinguish between:

  1. The scope of the aims of regulation: On the one hand, a regulatory intervention might aim for the use of AI in a particular context to avoid localised harms, and for the use of AI in a particular domain to be consistent with the functioning of that domain. On the other, individual regulators might also be concerned with how the use of AI in their particular enforcement domain affects other domains, or how the sum of all regulatory rules concerning AI across different industries or domains affects the technology’s overall impact on society and the economy.
  2. The institutional scope of regulation: Closely related is the question of the extent to which regulators and other institutions see and develop regulatory rules as part of a coherent whole, or whether they operate separately.
  3. The geographical scope of regulation: Is regulation set at a national or a supranational level?

As a general rule, regulation with a narrow scope is easier for individual regulators to design and enforce, as it provides regulatory policy development and evaluation with fewer variables and avoids difficult coordination problems. Despite these advantages, narrow approaches to regulation have significant drawbacks, which are of particular relevance to a general-purpose technology like AI, and which may make more holistic, integrated approaches worth considering despite their difficulties:

  • Regulatory systems that focus on addressing narrowly defined issues can often be blind to issues that are only visible in the aggregate.
  • Regulatory systems characterised by regulators with narrow areas of interest are more prone to blind spots in between domains of regulation.
  • The existence of regulators and regulatory regimes with narrow geographical or market scope can increase the risks of arbitrage (where multinational firms exploit the regulatory differences between markets to circumvent regulation).

How to regulate: Modes of regulatory intervention, and tools and techniques

A final variable is the tools, approaches and techniques used by a regulator or regulatory system.

The different mechanisms by which regulators can achieve their objectives can be divided up into the following categories:

  • norms
  • numbers
  • incentives and sanctions
  • regulatory approach
  • trust and legitimacy.

Norms

Perhaps the most common means of regulating is by setting norms. Regulatory norms can take the form of specific rules, or of more general principles. The latter can be focused either on the outcomes the regulated entity should produce, or on the nature of the processes or procedures undertaken. In terms of scope, norms can be specific to particular firms or industries, or can be cross-sectoral or even cross-jurisdictional.

While norms do tend to require enforcement, there are many cases where norms are voluntarily adhered to, or where norms create a degree of self-regulation on the part of regulated entities. In the context of AI, regulatory policy (and AI policy more generally) may attempt to encourage norms of data stewardship, greater use of principles of data minimisation and privacy-by-design, and transparency about when and how AI systems are used. In some cases, however, the nature of the incentive structures and business models for tech companies will place hard limits on the efficacy of reliance on norms. (For instance,
corporations’ incentives to maximise profits and to increase shareholder value in the short term may outweigh considerations about adherence to specific norms).

Numbers

Another important means of regulatory intervention is by stipulating prices for products in a market, or by stipulating some of the numerical inputs to calculative models. For instance, if a company uses a scorecard methodology to make a particular, significant decision, a regulator might decide to stipulate the confidence threshold.
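A minimal sketch of what stipulating a numerical input might look like in practice, assuming a hypothetical scorecard model in which the decision threshold is the regulator-stipulated number (all names and values are illustrative):

```python
def scorecard_decision(feature_scores: dict, weights: dict, decision_threshold: float) -> bool:
    """Hypothetical scorecard: a weighted sum of feature scores is compared against a
    threshold which, in this sketch, is the numerical input a regulator stipulates."""
    total = sum(weights.get(name, 0.0) * score for name, score in feature_scores.items())
    return total >= decision_threshold
```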

These kinds of mechanisms may be indirectly relevant to AI systems used to set prices within markets, and could be directly relevant for symbolic AI systems, where particular numerical inputs can have a significant and clear effect on outputs. However, recent literature on competition law and large technology companies highlights that a fixation on price misses other forms of competition concern.[footnote]Khan, L. (2017). ‘Amazon’s Antitrust Paradox’. Yale Law Journal. Volume 126, No. 3. Available at: www.yalelawjournal.org/note/amazons-antitrust-paradox.[/footnote]

Incentives and sanctions

Regulators can also provide incentives or impose penalties to change the behaviours of actors within a market. These might be pegged to particular market outcomes (such as average prices or levels of consumer satisfaction), to specific conduct (such as the violation of regulatory rules or principles) or to the occurrence of specific harms. Penalties can take the form of fines, requirements to compensate injured parties, the withdrawal of professional licenses or, in extreme cases, criminal sanctions. A prime example of the use of sanctions in tech regulation is provided by the EU’s General Data Protection Regulation, which imposes significant fines on companies for non-compliance.[footnote]General Data Protection Regulation. Available at: https://gdpr.eu/fines.[/footnote]

Regulatory approach

Finally, there are various questions of regulatory approach. Differences in regulatory approach might include whether a regulatory regime is:

  • Anticipatory, whereby the regulator attempts to understand potential harms or market failures before they emerge, and to address them before they become too severe, or reactive, whereby regulators respond to issues once harms or other problems are clearly manifest. In the realm of technology regulation, anticipatory approaches are perhaps the best answer to the ‘Collingridge dilemma’: when
    new technologies and business models do present clear harms that require regulation, these often only become apparent to regulators well after they have become commonplace. By this time, the innovations in question have often become so integrated into economic life that post hoc regulation is extremely difficult.[footnote]Liebert, W., and Schmidt, J. C. (2010)[/footnote] However, anticipatory approaches tend to have to err on the side of caution, potentially leading to a greater degree of overregulation than reactive approaches – which can operate with a fuller understanding of the harms and benefits of new technologies and business models.
  • Compliance based, where a regulator works with regulated entities to help them comply with rules and principles, or deterrence based, where regulatory sanctions provide the main mechanisms by which to incentivise adherence. This difference also tends to be more pronounced in the context of emerging technologies, where there is less certainty regarding what is and isn’t allowed under regulatory rules.
  • Standardised, where all regulated products and services are treated the same, or risk based, whereby regulators monitor and restrict different products and services to differing degrees, depending on judgements of the severity or likelihood of potential harms from regulatory failure.[footnote]In determining how to calibrate a regulatory response to a product or technology to the level of risk it presents, two of the most important factors are 1) If and to what extent the harms it could cause are reversible or compensatable; and 2) whether the harms done are contained in scope, or broader and more systemic.[/footnote] By creating different levels of regulatory requirements, the rules created by risk-based systems can be less
    onerous for innovators and businesses, but also depend on current (and potentially incorrect) judgements about the potential levels of risk and harms associated with different technologies or business models. Risk-based approaches come with the danger of creating gaps in the regulatory system, in which harmful practices or technologies can escape an appropriate level of regulatory scrutiny.

Trust and legitimacy

Different groups will require different things from a regulator or regulatory system in order for the system to be seen as trustworthy and legitimate. These include:

  • Expertise: A regulator needs to have, and be able to demonstrate, a sufficient level of understanding of the subject matter it is regulating. This is particularly important in industries or areas where asymmetries of information are common, such as AI. While relevant technical expertise is a necessity for regulators, in many contexts (and especially that of AI regulation) understanding the dynamics
    of sociotechnical systems and their effects on people and society will also be essential.
  • Normative values: It is also important for a regulator to take into account societal values when developing and enforcing regulatory policy. For example, in relation to AI, it will be important for questions about privacy, distributional justice or procedural fairness to be reflected in a regulator’s actions, alongside considerations of efficiency, safety and security.
  • Constitutional, democratic and participatory values: A final important set of factors affecting the legitimacy and trustworthiness of a regulator concern whether a regulator’s ways of working are transparent, accountable and open to democratic participation and input. Ensuring a regulator is open to meaningful participation can
    often be difficult, depending on its legal and practical ability to make decisions differently in response to participatory interventions, and on the accessibility of the decisions being made.

Acknowledgements

We are grateful to the expert panellists who took part in our workshops in April and May 2021, the findings of which helped inform much of this report. Those involved in these workshops are listed below.

Workshop participant Affiliation
Ghazi Ahamat Centre for Data Ethics & Innovation
Mhairi Aitken Alan Turing Institute
Haydn Belfield Centre for the Study of Existential Risk
Elettra Bietti Berkman Klein Center for Internet and Society
Reuben Binns University of Oxford
Kate Brand Competition and Markets Authority
Lina Dencik Data Justice Lab, Cardiff University
George Dibb Institute for Public Policy Research
Mark Durkee Centre for Data Ethics & Innovation
Alex Georgiades UK Civil Aviation Authority
Mohammed Gharbawi Bank of England
Emre Kazim University College London
Paddy Leerssen University of Amsterdam
Samantha McGregor Arts and Humanities Research Council
Seán ÓhÉigeartaigh Leverhulme Centre for the Future of Intelligence & Centre for the Study of Existential Risk
Lee Pope Department for Digital, Culture, Media and Sport
Mona Sloane New York University, Center for Responsible AI
Anna Thomas Institute for the Future of Work
Helen Toner Center for Security and Emerging Technology
Salomé Viljoen Columbia Law School
Karen Yeung Birmingham Law School & School of Computer Science

We are also grateful to those who, in addition to participating in the workshops, provided comments at different stages of this report and whose thinking, ideas and writing we have drawn on heavily, in particular: Professor Julia Black, London School of Economics; Jacob Turner, barrister at Falcon Chambers; and Professor Lilian Edwards, University of Newcastle.


 

This report was lead-authored by Harry Farmer, with substantive contributions from Andrew Strait and Imogen Parker.

Preferred citation: Ada Lovelace Institute. (2021). Regulate to innovate. Available at: https://www.adalovelaceinstitute.org/report/regulate-innovate/



The three legal mechanisms discussed in the report are data trusts, data cooperatives and corporate and contractual models, which can all be powerful mechanisms in the data-governance toolbox.

The report is a joint publication with the AI Council and endorsed by the ODI, the City of London Law Society and the Data Trusts Initiative.

Executive summary

Organisations, governments and citizen-driven initiatives around the world aspire to use data to tackle major societal and economic problems, such as combating the
COVID-19 pandemic. Realising the potential of data for social good is not an easy task, and from the outset efforts must be made to develop methods for the responsible
management of data on behalf of individuals and groups.

Widespread misuse of personal data, exemplified by repeated high-profile data breaches and sharing scandals, has resulted in ‘tenuous’ public trust[footnote]Centre for Data Ethics and Innovation (2020). Addressing trust in public sector data use. [online] GOV.UK. Available at: www.gov.uk/government/publications/cdei-publishes-its-first-report-on-public-sector-data-sharing/addressing-trust-in-public-sector-data-use [Accessed 18 Feb. 2021].[/footnote] in public and private-sector data sharing. Concentration of power and market dominance, based on extractive data practices from a few technological players, both entrench public concern about data use and impede data sharing and access in the public interest. The lack of transparency and scrutiny around public-private partnerships adds additional layers of concern when it comes to how data is used.[footnote]In 2020, in partnership with Understanding Patient Data at the Wellcome Trust, the Ada Lovelace Institute convened patient roundtables and citizen juries across the UK and commissioned a nationally representative survey of 2,095 people. The findings show that 82% of people expect the NHS to publish information about data access partnerships; 63% of people are unaware that the NHS gives third parties access to data; 75% of people believe the public should be involved in decisions about how NHS data is used. The two reports that underpin this research are available at: https://understandingpatientdata.org.uk/news/accountability-transparencyand- public-participation-must-be-established-third-party-use-nhs [Accessed 18 Feb. 2021].[/footnote] Part of these concerns comes from the fact that what individuals might consider to be ‘good’ is different to how those who process data may define it, especially if individuals have no say in that definition.

The challenges of the twenty-first century demand new data governance models for collectives, governments and organisations that allow data to be shared for individual and public benefit in a responsible way, while managing the harms that may emerge.

This work explores the legal mechanisms that could help to facilitate responsible data stewardship. It offers opportunities for shifting power imbalances through breaking data silos and allowing different levels of participatory data governance,[footnote]For a more detailed discussion on participatory governance see the Ada Lovelace Institute’s forthcoming report on Exploring participatory mechanisms for data stewardship (March 2021).[/footnote] and for enabling the responsible management of data in data-sharing initiatives by individuals, organisations and governments wanting to achieve societal, economic and environmental goals.

This report focuses on personal data management, as the most common type of data stewarded today in alternative data governance models.[footnote]See ‘Annex 2: Graphical Representation’ in Manohar, S., Kapoor, A. and Ramesh, A. (2020). Data Stewardship – A Taxonomy. [online] The Data Economy Lab. Available at: https://thedataeconomylab.com/2020/06/24/data-stewardship-a-taxonomy/ [Accessed 18 Feb. 2021].[/footnote] It points out where mechanisms are suited for non-personal data management and sees this area as requiring future exploration. The jurisdictional focus is mainly on UK law, however this report also introduces a section on EU legislative developments on data sharing and, where appropriate, indicates similarities with civil law systems (for example, fiduciary obligations resembling trust law mechanisms).

Produced by a working group of legal, technical and policy experts, this report describes three legal mechanisms which could help collectives, organisations and governments create flexible governance responses to different elements of today’s data governance challenges.
These may, for example, empower data subjects to more easily control decisions made about their data by setting clear boundaries on data use, assist in promoting desirable uses, increase confidence among organisations to share data or inject a new democratic element into data policy.

Data trusts,[footnote]For the purposes of this report, data trusts are regarded as underpinned by UK trust law.[/footnote] data cooperatives and corporate and contractual mechanisms can all be powerful mechanisms in the data-governance toolbox. There’s no one-size-fits-all
solution and choosing the type of governance mechanism will depend on a number of factors.

Some of the most important factors are purpose and benefits. Coming together around an agreed purpose is the critical starting point, and one which will subsequently determine the benefits and drive the nature of the relationship between the actors involved in a data-sharing initiative. These actors may include individuals, organisations and governments, although
data-sharing structures do not necessarily need to include all actors mentioned.

The legal mechanisms presented in this report aim to facilitate this relationship; however, the broader range of collective action and coordination mechanisms to address data challenges also needs to be assessed on a case-by-case basis. The three mechanisms described here are meant to provide an indication as to the types of approaches, conditions and legal tools that can be employed to solve questions around responsible data sharing and governance.

To demonstrate briefly how purpose can be linked to the choice of legal tools:

Data trusts create a vehicle for individuals to state their aspirations for data use and mandate a trustee to pursue these aspirations.[footnote]Delacroix, S. and Lawrence, N.D. (2019). Bottom-up data Trusts: disturbing the “one size fits all” approach to data governance. International Data Privacy Law, [online] 9(4). Available at: https://academic.oup.com/idpl/article/9/4/236/5579842 [Accessed 6 Nov. 2019].[/footnote] Data trusts can be built with a highly participatory structure in mind, requiring systematic input from the individuals that set up the data trust. It’s also possible to build data trusts with the intention of delegating to the data trustee the responsibility to determine what type of data processing is in the beneficiaries’ interest.

The distinctive elements of this model are the role of the trustee, who bears a fiduciary duty in exercising data rights (or the beneficial interest in those rights) on behalf of the beneficiaries, and the role of the overseeing court in providing additional safeguards. Therefore, data trusts might work better in contexts where individuals and groups wish to define the terms of data use by creating a new institution (a trust) to steward data on their behalf, by representing them in negotiations about data use.

Data cooperatives can be considered when individuals want to voluntarily pool data resources and repurpose the data in the interests of those it represents. Therefore, data cooperatives could be the go-to governance mechanism when relationships are formed between peers or like-minded people who join forces to collectively steward their data and create one voice in relation to a company or institution.

Corporate and contractual mechanisms can be used to design an ecosystem of trust in situations where a group of organisations see benefits in sharing data under mutually agreed terms and in a controlled way. This means these mechanisms might be better suited for creating data-sharing relationships between organisations. The involvement of an independent data steward is envisaged as a means of creating a trusted environment for stakeholders to feel comfortable sharing data with other parties whom they may not know, or with whom they have not had an opportunity to develop a relationship of trust.

This report captures the leading thinking on an emerging and timely issue of research and inquiry: how we can give tangible effect to the ideal of data stewardship, that is, the trustworthy and responsible use and management of data.

Promoting and realising the responsible use of data is the primary objective of the Legal Mechanisms for Data Stewardship working group and the Ada Lovelace Institute, who produced this report, and who view this approach as critical to protecting the data rights of individuals and communities, and unlocking the benefits of data in a way that’s fair, equitable and focused on social benefit.

Chapter 1: Data trusts

How data trusts work

Equity as a tool for establishing rights and remedies

Trust law has ancient roots, with the fiduciary responsibilities that sit at its core being traceable to practices established in Roman law. In the UK, the idea of a ‘trust’ as an entity has its origins in medieval England: with many landowners leaving England to fight in the Crusades, systems were needed to manage their estates in their absence.

Arrangements emerged through which Crusaders would transfer ownership of their estate to another individual, who would be responsible for managing their land and fulfilling any feudal responsibilities until their return. However, returning Crusaders often found themselves in disputes with their ‘caretaker’ landowners about land ownership. These disputes were referred to the Courts of Chancery to decide on an appropriate – equitable – remedy. These courts consistently recognised the claims of the returning Crusaders, creating the concepts of a ‘beneficiary’, ‘trustee’ and ‘trust’ to define a relationship in which one party would manage certain assets for the benefit of another – the establishment of a trust.

While the practices associated with trust law have changed over time, their core components have remained consistent: a trust is a legal relationship between at least two parties, in which one party (the trustee) manages the rights associated with an asset for the benefit of another (the beneficiary).[footnote]Chambers, R. (2010). Distrust: Our Fear of Trusts in the Commercial World. Current Legal Problems, [online] 63(1), pp.631–652. Available at: https://academic.oup.com/clp/article-abstract/63/1/631/379107 [Accessed 18 Feb. 2021].[/footnote] Almost any right can be held in trust, so long as the trust meets three conditions:

  1. there is a clear intention to establish a trust
  2. the subject matter or property of the trust is defined
  3. the beneficiaries of the trust are specified (including as a conceptual
    category rather than nominally).

In the centuries that followed their emergence, the Courts of Chancery have played an important role in settling claims over rights and creating remedies where these rights have been infringed. Core to the operation of these courts is the concept of equity – that disputes should be settled in a way that is fair and just. In centring this concept in their jurisprudence, they have found or clarified new rights or responsibilities that might not be directly codified in Common Law, but which can be adjudicated according to legal principles of fairness. This has enabled the courts to develop flexible and innovative responses in situations where there may be gaps in Common Law, or where the strict definitions of the Common Law are ill-equipped to manage new social practices.

It is this ability to flex and adapt over time that has ensured the longevity of trusts and trust law as a governance tool, and it is these characteristics that have attracted interest in current debates about data governance.

Why data trusts?

Today’s data environment is characterised by structural power imbalances. Those with access to large pools of data – often data about individuals – can leverage the value of aggregated data to create products and services that are foundational to many daily activities.

While offering many benefits, these patterns of data use can create new forms of vulnerability for individuals or groups. Recent years have brought examples of how new uses of data can, for example, create sensitive data about individuals by combining datasets that individually seemed innocuous, or use data to target individuals online in ways that might lead to discrimination or social division.

Today, individuals’ rights over data are typically managed through service agreements or other consent-based models of interaction between individuals and organisations. However, as patterns of data collection and use evolve, the weaknesses associated with these processes are becoming clearer. This has prompted re-examination of consent as a foundation for data exchange and of the long-term risks associated with complex patterns of data use.

The limitations of consent as a model for data governance have already been well-characterised. Many terms and conditions are lengthy and difficult to understand, and individuals might not have the ability, knowledge or time to adequately review data access agreements; for many, interest in consent and control is sparked only after they have become aware of data misuse; and the processes for an individual to enact their data rights – or receive redress for data misuse – can be lengthy and inaccessible.[footnote]British Academy, techUK and Royal Society (2018). Data ownership, rights and controls: seminar report. [online] The British Academy. Available at: www.thebritishacademy.ac.uk/publications/data-ownership-rights-controls-seminar-report [Accessed 18 Feb. 2021].[/footnote]

Moreover, as interactions in the workplace, at home or with public services are increasingly shaped by digital technologies, there is pressure on individuals to ‘opt in’ to data exchanges, if they are to be able to participate in society. This reliance on digital interactions exacerbates power imbalances in the governance system.

Approaches to data governance that concentrate on single instances of data exchange also struggle to account for the pervasiveness of data use, much of this data being created as a result of a digital environment in which individuals ‘leak’ data during their daily activities. In many cases, vulnerabilities arising from data use come not from a single act of data processing, but from an accumulation of data uses that may have been innocuous individually, but that together form systems that shape the choices individuals make in their daily lives – from the news they read to the jobs adverts they see. Even if each single data exchange is underpinned by a consent-based interaction, this cumulative effect – and the long-term risks it can create – is something that existing policy frameworks are not well-placed to manage.[footnote]Delacroix, S. and Lawrence, N. D. (2019) ‘Bottom-up data Trusts.’[/footnote]

Nevertheless, it needs to be pointed out that the foundational elements of the GDPR that govern data processing are principles such as data protection by design and by default, and mechanisms such as data protection impact assessments (DPIAs), which are designed to help preempt potential risks as early as possible. These are legal obligations and a prerequisite step before individuals are asked for consent.[footnote]Jasmontaite, L., Kamara, I., Zanfir-Fortuna, G. and Leucci, S. (2018). Data Protection by Design and by Default: Framing Guiding Principles into Legal Obligations in the GDPR [online] European Data Protection Law Review, 4(2), pp.168–189. Available at: https://doi.org/10.21552/edpl/2018/2/7. [Accessed 18 Feb. 2021].[/footnote] Therefore, it is important to highlight the broader compliance failures as well as the limitations of the consent mechanism which play a significant role in creating imbalances of power and potential harm.

The imbalances of power or ability of individuals and groups to act in ways that define their own future create a data environment that is in some ways akin to the feudal system which fostered the development of trust law. Powerful actors are able to make decisions that affect individuals, and – even if those actors are expected to act with a duty of care for individual rights and interests – individuals have limited ability to challenge these structures.

There are also limited mechanisms allowing individuals who want to share data for public benefit to do so via a structure that warrants trust. In areas where significant public benefit is at stake, individuals and communities may wish to take a view on how data is used, or press for action to use data to tackle major societal challenges. At present, the vehicles for the public to have such a voice are limited.

For the purposes of this report, trust law is explored as a new form of governance that can achieve goals such as:

  • increasing an individual’s ability to exercise the rights they currently have in law
  • redistributing power in the digital environment in ways that support individuals and groups to proactively define terms of data use
  • supporting data use in ways that reflect shifting understandings of social value and changing technological capabilities.

The opportunities for commercial or not-for-profit organisations – those focused on product or research development, or those seeking to meet a high standard of ethical obligations with respect to their customers’ data (and to empower these customers not only to make active choices about data management, but also to benefit from insights derived from this data) – are briefly discussed in the section on ‘Opportunities for organisations to engage with data trusts’.

What is a data trust?

A data trust is a proposed mechanism for individuals to take the data rights that are set out in law (or the beneficial interest in those rights) and pool these into an organisation – a trust – in which trustees would exercise the data rights conferred by the law on behalf of the trust’s beneficiaries.

Public debates about data use often centre around key questions such as who has access to data about us and how it is used. Data trusts would enable individuals and groups to more effectively influence the answers to these questions, by creating a vehicle for individuals to state their aspirations for data use and mandate a trustee to pursue these aspirations. By connecting the aspiration to share data to structures that protect individual rights, data trusts could provide alternative forms of ‘weak’ democracy, or new mechanisms for holding those in power to account.

The purposes for which data should be used, or data rights exercised, would be specified in the trust’s founding documents, and these purposes would be the foundation for any decision about how the trust would manage its assets. Mechanisms for deliberation or consultation with beneficiaries could also be built into a trust’s founding charter, with the form and function of those mechanisms depending on the objectives and intentions of the parties creating the trust.

Trustees and their fiduciary duties

Trustees play a crucial role in the success of such a trust. Data trustees will be tasked with stewarding the assets managed in a trust on behalf of its beneficiaries. In a ‘bottom-up’ data trust,[footnote]Delacroix, S. and Lawrence, N. D. (2019) ‘Bottom-up data Trusts’.[/footnote] the beneficiaries will be the data subjects (whose interests may include research facilitation, etc.). Data trustees will have a fiduciary responsibility to exercise (or leverage the beneficial interest inherent in) the beneficiaries’ data rights. Data trustees may seek to further the interests of the data subjects by entering into data-sharing agreements on their behalf, monitoring compliance with those agreements or negotiating better terms with service providers.

By leveraging the negotiating power inherent in pooled data rights, the data trustee would become a more powerful voice in contract negotiations, and be better placed to achieve favourable terms of data use than any single individual. In so doing, the role of the data trustee would be to empower the beneficiaries, widening their choices about data use beyond the ‘accept or walk away’ dichotomy presented by current governance structures. This role would require a high level of skill and knowledge, and support for a cohort of data trustees would be needed to ensure they can fulfil their responsibilities.

Core to the rationale for using trust law as a vehicle for data governance is the fiduciary duty it creates. Trustees are required to act with undivided loyalty and dedication to the interests and aspirations of the beneficiaries.[footnote]Ibid.[/footnote] The strong safeguards this provides can create a foundation for data governance that gives data subjects confidence that their data rights are being managed with care.

Adding to these fiduciary duties, the law of equity provides a framework for accountability. If they do not adhere to the constitutional terms of a trust, trustees can be held to account for their actions by the trust’s beneficiaries (or the overseeing Court acting on their behalf) or an independent regulator. Not only is a Court’s equitable jurisdiction to supervise, and intervene if necessary, not easily replicable within a contractual or corporate framework, but the importance of the fact that equity relies on ex-post moral standards and emphasises good faith cannot be overstated.

The flexibility offered by trusts also offers benefits in creating a governance system that is able to adapt to shifting patterns of data use. A range of subject matters or application areas could form the basis of a trust, allowing trusts to be established according to need: trusts would therefore allow co-evolution of patterns of data use and regulation.

In conditions of change or uncertainty around data use, this flexibility offers the ability to act now to promote some types of data use, while creating space to change practices in the future.

A further advantage of trust law is its ability to enable collective action while providing institutional safeguards that are commensurate with the vulnerabilities at stake. It is possible to imagine situations in which individuals might group together on the basis of shared values or attitudes to risk, and seek to use this shared understanding to promote data use. In coming together to define the terms of a trust, individuals would be able to express their agency and influence data use by defining their vision. The beneficiaries’ interest can be expressed in more restrictive or prudential terms, or may include a broader purpose such as the furthering of research or influencing patterns of data use. Current legal frameworks offer few opportunities to enable group action in this way.

The relationship between data rights and trusts

Almost any right or asset can be placed in trust. Trusts have already been established for rights relating to intellectual property and contracts, alongside a range of different types of property, including digital assets, and have proven themselves to be flexible in adapting to different types of asset across the centuries.[footnote]McFarlane, B. (2019). Data Trusts and Defining Property. [online] Oxford Law Faculty. Available at: www.law.ox.ac.uk/research-and-subject-groups/property-law/blog/2019/10/data-trusts-and-defining-property [Accessed 18 Feb. 2021].[/footnote]

Understanding what data rights can be placed in trust, when those rights arise and how a trust can manage those rights will be crucial in creating a data trust. Further work will be required to analyse the sorts of powers that a trustee tasked with stewarding those rights might be able to wield, and the advantages that might accrue to the trust’s beneficiaries as a result.

In the case of data about individuals, the GDPR confers individual rights in respect of data use, which could in principle be held in trust. These include ‘positive’ rights such as portability, access and erasure that would appear to be well-suited to being managed via a trust.

The development of data trusts will require further clarity on how these rights can be exercised. There is already active work on the extent to which (and the conditions according to which) those positive rights may be mandated to another party, such as a trustee, to act on behalf of an individual. Opinions on the issue differ among GDPR experts, and the publication of the European Commission’s draft Data Governance Act raises new questions about how and whether data rights might be delegated to a trust. The feasibility of data trusts, however, does not hinge on a positive answer to this delegability question, since trust law offers a potential workaround that does not require any transfer of rights.[footnote]Prof. McFarlane puts forward this potential workaround in a conversation with Paul Nemitz and Sylvie Delacroix. See Data Trusts Initiative (2021) Understanding the Data Governance Act: in conversation with Sylvie Delacroix, Ben McFarlane and Paul Nemitz.[/footnote]

As trusts develop, they will also encounter new questions about the limitations of existing rights and what happens when different rights interact.[footnote]For further discussion of this and other issues in the development of data trusts, see: Data Trusts Initiative (2020b). Data Trusts: from theory to practice, working paper 1 [online] Data Trusts Initiative. Available at: https://static1.squarespace.com/static/5e3b09f0b754a35dcb4111ce/t/5fdb21f9537b3a6ff2315429/1608196603713/Working+Paper+1+-+data+trusts+-+from+theory+to+practice.pdf [Accessed 18 Feb. 2021].[/footnote] For example, organisations can analyse aggregated datasets and create profiles of individuals, generating inferences about their likely preferences or behaviours. These profiles – created as a result of data analysis and modelling – would typically be considered the intellectual property of the entity that conducted the analysis or modelling. While input data might relate to individuals, once aggregated and anonymised to a certain extent, it would no longer be considered personal data under the GDPR. However, if inferences are classified as personal data within the scope of the GDPR, individual data-protection rights should apply. Nevertheless, as some authors have explained, the ability to exercise data rights over inferences classified as personal data remains limited and, particularly in the case of data portability, could give rise to tensions with trade secrets and intellectual property.[footnote]Wachter, S. and Mittelstadt, B. (2018). A Right to Reasonable Inferences: Re-Thinking Data Protection Law in the Age of Big Data and AI. [online] papers.ssrn.com. Available at: https://papers.ssrn.com/abstract=3248829 [Accessed 18 Feb. 2021].[/footnote]

An example helps illustrate the challenges at stake: in the context of education technologies, data provided by a student – from homework to online test responses – would be portable under the rights set out in the GDPR, but model-generated inferences about what learning methods would be most effective for that student could be considered as the intellectual property of the training provider. The establishment of a trust to govern the use of pupil data (just like any other ‘bottom-up’ data trust) could help shed light on those necessarily contested borders between intellectual property (IP) rights – that arise from creative input in developing the models that produce individual profiles – and personal data rights.

There will never be a one-size-fits-all answer on where to draw these boundaries between IP and personal data.[footnote]A broader discussion could be around whether drawing boundaries is the right approach or whether we might need a different regime for inferences.[/footnote] Instead, what is needed is a mechanism for negotiating these borders between parties involved in data use. In such cases, data trustees could have a crucial public advocacy function in negotiations about the extent to which such inferences fall within the scope of portability provisions.

Examining the data rights that might be placed in trust points to important differences between the use of trusts as a data governance tool and their traditional application.

Typically, assets placed in trust have value at the time the trust is created. In contrast, modern data practices mean that data acquires value in aggregate – it is the bringing together of data rights in a trust that gives trustees power to influence negotiations about data use that would elude any individual. Whereas property is typically placed in trust to manage its value, data (or data rights) would be placed in trust in part to create value.

Another difference can be found in the ease with which assets can typically be removed from a trust. Central to the data trust proposition is that individuals would be able to move their data rights between trusts, within an ecosystem of trust entities that provides a choice between different types of data use.

The ecosystem of data trusts that would enable individuals to make choices between different approaches to data use and management presupposes the ability to switch from one trust to another relatively easily, probably more easily than in traditional trusts.

These differences need not present a barrier to the development of data trusts. The history of trusts demonstrates the flexibility of this branch of law, and trusts can have a range of properties or ways of working that are designed to match the intent of their creators.

Alternatives to trust law

Fiduciary duties akin to those owed by trustees to beneficiaries can be created by other legal models. For example, contractual frameworks or principal-agent relationships can create duties between parties, with strong consequences if those duties are not fulfilled. Regulators can also perform a function similar to fiduciary responsibilities, for example in cases where imbalances of market power might have detrimental impacts on consumers. However, each approach has its limitations. For example:

  • Contracts allow data to be used for a specified purpose. Coupled with an audit function, these can ensure that data is used in line with individual wishes, and – at least for simple data transactions – contracts would require less energy to establish than a trust. However, effective auditing relies on the ability to draw a line from the intention of those entering a contract to the wording of the contract and then to its implementation. Given the complexity of patterns of data use – and the fact that many instances of undesirable data use arise from multiple inconsequential transactions – this function may be difficult to achieve. Due to their obligation of undivided loyalty, a trustee may be better placed and motivated to map intent to use and to understand potential pitfalls arising from the interactions between data transactions.
  • Agents can be tasked with acting on behalf of an individual, taking a fiduciary responsibility in doing so. However, the interaction between an individual and their agent does not accommodate as easily the collective dimension enabled by the establishment of a trust, and it is in this collective dimension that the ability to disrupt digital power relationships lies. Another issue associated with the use of agents is accountability. Structures would be needed to ensure that agents could be held accountable by individuals, if they failed in their responsibilities. In comparison, under trust law, the Courts of Chancery (and the associated institutional safeguards) present a much stronger accountability regime.

Many jurisdictions do not have an equivalent to trust law. However, they may have mechanisms that could fulfil similar functions. For example, while Germany does not operate a trust law framework, some institutions have fiduciary responsibilities built into their very structure, with institutions such as Sparkassen, banks that operate on a cooperative and not-for-profit basis, taking on a fiduciary responsibility for their customers. Studying such mechanisms might uncover ways of delivering the key functions of trust law – stewarding the rights associated with data and delivering benefits for individuals, communities and society with strong safeguards against abuse.

Developing data trusts

Recent decades have brought radical changes in patterns of data collection and use, and the coming years will likely see further changes, many of which would be difficult to predict today. In this context, society will need a range of governance tools to anticipate and respond to emerging digital opportunities and challenges. In conditions of uncertainty, trusts offer a way of responding to emerging governance challenges, without requiring legislative intervention that can take time to produce (and is more difficult to adapt once in place).

Trusts occupy a special place in the UK’s legal system, and the skills and experience of the UK’s legal community in their development and use mean it is well placed to lead the development of data trusts. The next wave in the development of these governance mechanisms will require further efforts to analyse the assets that would be held by a data trust, to investigate the powers that trustees may hold as a result, and to consider the different forms of benefit that may follow. Those seeking to capture this opportunity will need to:

  • clarify the limits of existing data rights
  • identify lessons from other jurisdictions in the use of fiduciary responsibilities to underpin data governance
  • support pilot projects that assess the feasibility of creating data trusts as a framework for data governance in areas of real-world need.

Problems and opportunities addressed by data trusts

Data trusts have the potential to address some of the digital challenges we face and could help individuals better position themselves in relation to different organisations, offering new mechanisms for channelling choices about how their data is used.

While organisations could also form data trusts, this section deals only with data trusts where the beneficiaries are individuals (data subjects). Also, while trusts could manage rights over non-personal data, this section takes as its starting point the opportunities arising from individuals delegating their rights (or the beneficial interest therein) over personal data. In contexts where non-personal data is managed, the practical challenges in distinguishing personal from non-personal data need to be acknowledged, and it remains to be seen how managing mixed datasets would influence the structure and running of a data trust.

There are a number of issues that might arise from setting up a data trust that aims to balance the asymmetries between those who have less power and are more vulnerable (individuals or data subjects) and those who are in a more favoured position (organisations or data controllers). This section briefly presents a number of caveats in relation to data trusts and the ecosystem they create; however, it should be noted that information asymmetries could also exist between individuals and trusts, not only between individuals and organisations.[footnote]For a more detailed discussion on caveats and shortcomings see O’Hara, K. (2020) ‘Data Trusts’. For further discussion regarding the development of data trusts see: Data Trusts Initiative (2020) Data Trusts: from theory to practice, working paper 1.[/footnote]

1. Purpose of the trust and consent

Trusts are usually established for defined purposes set out in a constitutional document. The data subjects will either come together to define their vision of the purposes of data use, or will need to join an established data trust, in which case they should be well informed about the purposes of the trust and how data or data rights are handled. In either case, it is of the utmost importance that those joining a data trust can do so in full awareness of the trust’s terms and aims.

This raises important ‘enhanced consent’ questions: what mechanisms, if any, are available to data trustees to ensure informed and meaningful consent is achieved? Will the lack of mechanisms for deliberation or consultation with beneficiaries involve liability for the trustees? What would the trustee role be in a participatory structure (active or purely managerial)? Might data trustees for instance draw upon the significant body of work in medical ethics to delineate best practice in this respect?

This set of questions is related to the issues raised in the next section, regarding the status, oversight and required qualifications of data trustees. Important questions arise around how expertise is attracted to this position when, as we will see below, the challenges for remunerating this role and the responsibilities and liabilities of trustees are significant.

2. The role of the trustee

The trustee will be in charge of managing the relationship between the trust’s beneficiaries and the organisations the trust interacts with. Trustees will have a duty of undivided loyalty to the beneficiaries (understood here as the data subjects whose data rights they manage) and they would be responsible for skilfully negotiating the terms of use or access to the beneficiaries’ data. They could also be held responsible if terms are less than satisfactory or if beneficiaries find fault with their actions (in which case the burden of proof is reversed, and it is for the data trustee to demonstrate that they have acted with undivided loyalty).

There are open questions as to whether and how beneficiaries will be able to monitor the trustees’ judgement and behaviour, and how beneficiaries will be able to identify fault when complex data transactions are involved. Further complexity is added if an ecosystem of data trusts develops in which one person’s data is spread across several trusts.

At the same time, given increasing concerns about the combination of different datasets, consider a scenario in which one data trust manages a particular dataset about its beneficiaries, another trust manages a different dataset, and the combination of the two datasets could result in harm. Should there be mechanisms for trusts to cooperate in preventing such harms? Or would trustees simply inform beneficiaries of potential dangers and ask them to sign a liability waiver?

If and when a data trust relies on a centralised model (rather than a decentralised one, whereby the data remains wherever it is and the data trustee merely leverages the data rights to negotiate access, etc.), one of the trustees’ central responsibilities will be to ensure the privacy and security of the beneficiaries’ data. Such a task would involve a high degree of risk and complexity (hence the likely preference for decentralised models).

It is unclear what type of technical tools or interfaces will be needed for trustees to access credentials in a secure way, for example, and who will make these significant investments in the technical layer. Potential inspiration could come from the new Open Banking ecosystem, where data sharing is enabled by secure Application Programming Interfaces (APIs) that rely on the banks’ own authentication methods, so that third-party providers do not have to access users’ credentials.
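
To make the Open Banking analogy concrete, the sketch below (in Python, using the requests library) illustrates how a trustee service could obtain delegated, scoped access to a beneficiary’s data through an OAuth-style authorisation flow of the kind that underpins Open Banking APIs, without the trustee ever handling the beneficiary’s login credentials. It is a minimal sketch only: the endpoints, client identifiers and scopes are hypothetical illustrations, not any real provider’s API.

import os
import requests

# Hypothetical endpoints of a data holder (for example, a bank or an education platform).
AUTH_SERVER = "https://data-holder.example/oauth2"
API_SERVER = "https://data-holder.example/api/v1"


def exchange_code_for_token(authorisation_code: str) -> str:
    """Swap a short-lived authorisation code, issued after the beneficiary approves
    the trustee's request in the data holder's own interface, for a scoped access
    token. The beneficiary's password never reaches the trustee."""
    response = requests.post(
        f"{AUTH_SERVER}/token",
        data={
            "grant_type": "authorization_code",
            "code": authorisation_code,
            "client_id": "data-trustee-service",                    # hypothetical client
            "client_secret": os.environ["TRUSTEE_CLIENT_SECRET"],   # held server-side only
            "redirect_uri": "https://trustee.example/callback",
        },
        timeout=10,
    )
    response.raise_for_status()
    return response.json()["access_token"]


def fetch_permitted_data(access_token: str) -> dict:
    """Read only what the token's scope allows, mirroring the boundaries on data
    use agreed with the beneficiaries."""
    response = requests.get(
        f"{API_SERVER}/records",
        headers={"Authorization": f"Bearer {access_token}"},
        timeout=10,
    )
    response.raise_for_status()
    return response.json()

The design choice mirrors the delegation at the heart of the data trust proposal: the beneficiary grants the trustee a revocable, narrowly scoped permission to act on their behalf, rather than handing over credentials or raw copies of data.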

Managing such demanding responsibilities raises questions about the triggers, incentives and training required for trustees to take on such a complex role. Should there be formal training and entry requirements? Could data trustees eventually constitute a new type of profession, which could give rise to a ‘local’ and potentially more nimble layer of professional regulation (on top of court oversight and potential legislative interventions), not unlike the multilayered regulatory structure that governs medical practice today?

3. Incentives and sustainability of data trusts

The data trust ecosystem model suggests the importance of competition between trusts for members, yet at this stage it is not clear how enough competition between trusts will emerge. At the same time, it is presumed that a data trust would work best when it operates on behalf of a large number of people, as this gives the data trust a position of bargaining power in relation to different organisations such as companies and public institutions. Will this create a dependence on network effects, and how can the negative implications be addressed?

Moreover, there are questions related to the funding model and incentive structure underlying the sustainability of data trusts. What will attract individuals to a data trust? For example, if the beneficiaries’ concern is to restrict data use and protect their data, will the trust be able to generate an income stream, or will it rely on funding from other sources (e.g. from beneficiaries, philanthropists, etc.)? At the same time, if potential income streams are maximised depending on how the data is used, what are the implications for privacy and data protection?

In addition, what happens when individuals are simply unaware or uninterested in joining a data trust? Might they be allocated to a publicly funded data trust, on the basis of arguments similar to those that were relied on when making pension contributions compulsory? If so, what would constitute adequate oversight mechanisms?

When individuals are interested in joining a data trust, will they be lured by the promise of streamlining their daily interaction with data-reliant service providers, effectively relying on data trusts as a lifestyle, paid-for intermediary service providing peace of mind when it comes to safeguarding personal data? Will individuals be motivated to join a data trust in order to contribute to the common good in a way that does not entail long-term data risks? Will there be monetary incentives for people joining a data trust (whereby individuals would obtain monetary compensation in exchange for providing data)? Should some incentives structures – such as monetary rewards – be controlled and regulated, or in some cases altogether banned?

There are a number of possible funding models for data trusts:

  • privately funded
  • publicly funded
  • charging a fee or subscription from data trust beneficiaries (the individuals or data subjects) in return for streamlining and/or safeguarding their data interactions
  • charging a fee or subscription from those who use the data (organisations)
  • charging individuals for related services
  • a combination of the above.

The different funding options will have implications both for sustainability and for the larger data ecosystem. If the trust needs to generate revenue by charging for access to the data it stewards or for related services, the focus might start to gravitate towards the viability and performance of the trust. A trust’s performance will correlate with the demand side (organisations using the trust’s beneficiaries’ data), with how many people join the trust (potentially reinforcing network effects) and with how well it competes against other data trusts. Will these interdependencies diminish the data trusts’ role as a rebalancing tool for adjusting asymmetries of power and consolidating the position of the disadvantaged?

At the same time, if the data trust operates on a model where the beneficiaries are charged for the service, much depends on how that service is understood. If the focus is on monetary rewards, and these are not regulated, the expectations of a return from the data trust will increase, affecting the dynamics of the relationships. For example, if the data trust’s funding model implies that companies pay back a share of the profit made on the data they use, those companies will have to make a number of decisions regarding their profitability and viability in the market. Will this reinforce some of the business models that are heavily criticised today, such as the dominant advertising-based model?

In the case of publicly funded data trusts, public oversight mechanisms and institutions will need to be developed. At the moment, it is unclear who will be responsible for ensuring funds are transparently allocated based on input from individuals, communities and data-sharing needs. The currently low levels of data awareness also raise concerns about how to build genuine and adequate engagement mechanisms. Further, the impact, benefit, results or added value created by the data trust will need to be demonstrated. This calls for transparency and accountability mechanisms that are specific to publicly funded data trusts, grafted on top of existing fiduciary duties (and Court oversight mechanisms).

4. Opportunities for organisations to engage with data trusts

Data trusts could offer opportunities for commercial or not-for-profit organisations in a variety of ways. Some of the benefits have been briefly mentioned in the introductory section, pointing to reputational benefits, legal compliance and future-proofing data governance practices. In this respect, one may imagine a scenario whereby large corporate entities (such as banks for instance) are keen to go beyond mere regulatory compliance by sponsoring a data trust in a bid to show how seriously they take their ethical responsibilities when it comes to personal data.

Such a ‘sponsored data trust’ would be strictly separate from the bank itself (absence of conflict of interest would have to be very clear). It could be flagged as enabling the bank’s clients to ‘take the reins’ of their data and benefit from insights derived from this data. All the data that would normally be collected directly by the bank would only be so collected on the basis of terms and conditions negotiated by the data trustee on behalf of the trust’s beneficiaries. The trustee could also negotiate similar terms (or negotiate to revise terms of existing individual agreements) with other corporate entities (supermarkets for instance).

Other potential benefits for corporate and research bodies relate to the trusts’ ability to enable access to potentially better-quality data that fits organisations’ needs and enables a more agile use of data. This reduces overhead and provides greater peace of mind, based on the trustees’ fiduciary responsibility to the data subjects. A trustee would be able to spot and prevent potential harms, thereby reducing liability issues for organisations that could otherwise have arisen from engaging with individual data subjects directly. At the same time, trusts offer a way of responding to emerging governance challenges without requiring legislative intervention, which can take time to produce (and is more difficult to adapt once in place). A broader discussion about opportunities for commercial or not-for-profit organisations could be considered for a future report.

Mock case study: Greenfields High School

Greenfields High School is using an educational platform to deliver teaching materials, with homework being assigned by online tools that track student learning progress, for example recording test scores. The data collected is used to tailor learning plans, with the aim of improving student performance.

Students, parents, teachers and school leadership have a range of interests and concerns when it comes to these tools:

  • Students wish to understand what data is collected about them, how it is used and for how long it is kept. Parents want assurances about how their children’s data is used, stored, and processed.
  • Parents, teachers and school leadership wish to compare the school’s performance against that of other schools, by sharing some types of data.
  • The school wants to keep records of educational data for all pupils for a number of years to track progress. It also wants to be able to compare the effectiveness of different learning platforms.
  • The company providing the learning platform requires access to the data to improve its products and services.

How would a data trust work?

A data trust is set up, pooling together the rights pupils and parents have over the personal data they share with the education platform provider. It tasks a data trustee with exercising those rights, with the aim of negotiating the terms of service within the limits, and for the benefit, established by the school, parents and pupils. It also aims to maximise the school’s ability to evaluate different types of tools (and possibly pool this data with other schools), within an agreed scope of data use that maintains the pupils’ and parents’ confidence that they are minimising the risks associated with data sharing.

 

The trust will be able to leverage its members’ rights to data portability and/or access (under the GDPR) when the school discusses onward terms of data use with the educational platform service provider.

 

The data trust includes several schools that have joined a group with a common interest in a certain educational approach. This group is overseen by a board, and one of the people sitting on that board is appointed as data trustee.
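
As a purely illustrative sketch of what an ‘agreed scope of data use’ might look like in this example, the Python fragment below models the trust’s founding terms as a simple data structure that a trustee (or software supporting the trustee) could check proposed data-sharing requests against. The field names, purposes and requests are hypothetical; in practice the founding terms would be legal documents rather than code.

from dataclasses import dataclass, field


@dataclass
class FoundingTerms:
    """Simplified stand-in for the purposes agreed by the school, parents and pupils."""
    permitted_purposes: set                  # uses the beneficiaries have endorsed
    prohibited_recipients: set = field(default_factory=set)
    retention_years: int = 5                 # illustrative retention period for pupil records


@dataclass
class DataUseRequest:
    requester: str
    purpose: str


def within_agreed_scope(terms: FoundingTerms, request: DataUseRequest) -> bool:
    """The founding terms set the hard boundaries that any negotiation by the
    trustee must respect; a real trustee would weigh far more than this."""
    if request.requester in terms.prohibited_recipients:
        return False
    return request.purpose in terms.permitted_purposes


terms = FoundingTerms(
    permitted_purposes={"tailor-learning-plans", "compare-platform-effectiveness"},
    prohibited_recipients={"advertising-broker"},
)

print(within_agreed_scope(terms, DataUseRequest("platform-provider", "tailor-learning-plans")))   # True
print(within_agreed_scope(terms, DataUseRequest("advertising-broker", "targeted-marketing")))     # False

Even in this toy form, the example shows why the founding documents matter: they give the trustee (and the overseeing court) an explicit reference point for deciding whether a proposed use falls inside or outside the beneficiaries’ mandate.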

Chapter 2: Data cooperatives

How data cooperatives work

Why data cooperatives?

The cooperative approach is attractive in situations where there is a desire to give members an equal stake in the organisation they establish and an equal say in its management, as for example with traditional mutuals – businesses owned by and run for the benefit of their members – which are common in financial services, such as building societies. As the business is owned and run by its members, the cooperative approach can be seen as a solution to a growing sense of powerlessness people feel over businesses and the economy.[footnote]See Co-operatives UK (n.d.). Understanding co‑ops. [online] uk.coop. Available at: www.uk.coop/understanding-co-ops [Accessed 18 Feb. 2021].[/footnote]

The cooperative approach in the context of data stewardship can be explored in examples where groups have voluntarily pooled data resources in a commonly owned enterprise, and where the stewardship of that data is a joint responsibility of the common owners. The aim of such enterprises is often to give members of the cooperative more control over their data and repurpose the data in the interests of those represented in it, as opposed to the erection of defensive restrictions around the use of data to prevent activities that conflict with the interests of data subjects (especially but not exclusively with respect to activities that threaten to breach their privacy). In other words, cooperatives tend to have a positive rather than a negative agenda, to achieve some goal held commonly by members, rather than to avoid some outcome resisted by them.

This chapter looks at some examples of data cooperatives, the problems and opportunities they address and patterns of data stewardship. It explores the structure and characteristics of cooperatives and provides a summary of the challenges presented by the cooperative model, together with descriptions of alternative approaches.

What is a cooperative?

A cooperative typically forms around a group that perceives itself as having collective interests, which it would be better to pursue jointly than individually. This may be because they have more bargaining power as a collective, because some kind of network effect means the value for all increases if resources are pooled, or simply because the members of the cooperative do not want to cede control of the assets to those outside the group. Cooperatives are typically formed to create benefits for members or to supply a need that was not being catered for by the market.

The International Cooperative Alliance or ICA[footnote]The ICA is the global federation of co-operative enterprises. More information available at International Cooperative Alliance (2019). Home. [online] ica.coop. Available at: www.ica.coop/en [Accessed 18 Feb. 2021].[/footnote] is the global steward of the Statement on the Cooperative Identity, which defines a cooperative as an ‘autonomous association of persons united voluntarily to meet their common economic, social, and cultural needs and aspirations through a jointly-owned and democratically controlled enterprise.’

According to the ICA there are an estimated three million cooperatives operating around the world,[footnote]International Cooperative Alliance (2019a). Facts and figures. [online] ica.coop. Available at: www.ica.coop/en/cooperatives/facts-and-figures [Accessed 18 Feb. 2021].[/footnote] established to realise a vast array of economic, social and cultural needs and aspirations. Examples include:

  • Consumer cooperatives, which provide goods and services to their members/owners, and so serve the community of users. They value service and low price above profit, as well as being close to their customers. They might produce goods such as utilities, insurance or food, or services such as childcare.[footnote]More information available at: Consumer Federation of America (n.d.). Consumer Cooperatives. [online] Consumer Federation of America. Available at: https://consumerfed.org/consumer-cooperatives [Accessed 18 Feb. 2021].[/footnote] They might be ‘buyers’ clubs’, intended to enable the amalgamation of buyers’ power in order to reduce prices. Credit unions are also examples of consumer cooperatives, which mutualise loans based on social knowledge of local conditions and members’ needs, and are owned by the members and therefore able to devote more capital to members’ services rather than profits for external owners.[footnote]More information available at: Find Your Credit Union (n.d.). About Credit Unions. [online] Find Your Credit Union. Available at: www.findyourcreditunion.co.uk/about-credit-unions [Accessed 18 Feb. 2021].[/footnote]
  • Housing cooperatives take on a range of forms, from shared ownership of the entire asset to management of the leasehold or managing tenants’ participation in decision-making.
  • Worker cooperatives, where the entity is owned and controlled by employees.
  • Agricultural cooperatives, which might be concerned with marketing, supply of goods or sharing of machinery on behalf of members. Many agricultural cooperatives in the US are of significant size: the largest, for example, had revenues of $32 billion in 2019.[footnote]Morning AgClips (2021). A snapshot of the top 100 agricultural cooperatives. [online] morningagclips.com. Available at: www.morningagclips.com/a-snapshot-of-the-top-100-agricultural-cooperatives [Accessed 18 Feb. 2021].[/footnote] These cooperatives are formed to address the imbalance of market power between small producers and large distributors or buyers – power asymmetries that are also experienced by individuals in the data ecosystem.

The estimated three million cooperatives subscribe to a series of cooperative values and principles.[footnote]More information available at: www.ica.coop/en/cooperatives/cooperative-identity and International Cooperative Alliance (2017) The Guidance Notes on the Cooperative Principles. Available at: www.ica.coop/en/media/library/research-and-reviews/guidance-notes-cooperative-principles [Accessed 18 Feb. 2021].[/footnote] Values typically include self-help, self-responsibility, democracy, equality, equity, solidarity, honesty and transparency, social responsibility and an ethics of care.[footnote]For example, there have been a number of experiments in using cooperative forms to manage data equitably, especially in the area of healthcare. See Blasimme, A., Vayena, E. and Hafen, E. (2018). Democratizing Health Research Through Data Cooperatives. Philosophy & Technology, [online] 31(3), pp.473–479. Available at: https://doi.org/10.1007/s13347-018-0320-8 [Accessed 18 Feb. 2021]; Hafen, E. (2019). Personal Data Cooperatives – A New Data Governance Framework for Data Donations and Precision Health. Philosophical Studies Series, pp.141–149. Available at: https://doi.org/10.1007/978-3-030-04363-6_9 [Accessed 18 Feb. 2021].[/footnote] Fundamental cooperative characteristics include: voluntary and open membership, democratic member control (one member, one vote), member benefit and economic participation (with surpluses shared on an equitable basis), and autonomy and independence.[footnote]See International Cooperative Alliance, Facts and figures and Cooperatives UK (2017). Simply Legal. [online] Available at: www.uk.coop/sites/default/files/2020-10/simply-legal-final-september-2017.pdf [Accessed 18 Feb. 2021].[/footnote]

Cooperatives in the UK: characteristics and legal structures

According to Co-operatives UK[footnote]Co-operatives UK is a network for thousands of co-operative businesses with a mission to grow the co-operative economy. More information available at: www.uk.coop/about [Accessed 18 Feb. 2021].[/footnote] there are more than 7,000 independent cooperatives in the UK, operating in all parts of the economy and collectively contributing £38.2 billion to the British economy.[footnote]See Co-operatives UK (2021), Understanding co‑ops. [online]. Available at: www.uk.coop/about/what-co-operative [Accessed 18 Feb. 2021].[/footnote]

UK law does not provide a precise definition of a cooperative, nor is there a prescribed legal form that a cooperative must take. According to Co-operatives UK, a cooperative in the UK can generally be taken to be any organisation that meets the ICA’s definition of a cooperative and espouses the cooperative values and principles set out in the Statement on the Cooperative Identity.[footnote]Co-operatives UK (2017) Simply Legal.[/footnote] This status can be implemented via many different unincorporated and incorporated legal forms. Deciding which one is best will depend on a number of case-specific factors, including the level of liability members are willing to expose themselves to, and the way members want the cooperative to be governed.

A possible, and seemingly obvious, choice of legal form is registering as a cooperative society under the Co-operative and Community Benefit Societies Act 2014.[footnote]See: Co-operative and Community Benefit Societies Act 2014. [online] Available at: www.legislation.gov.uk/ukpga/2014/14/contents [Accessed 18 Feb. 2021].[/footnote] This Act consolidated a range of prior legislation and helped to clarify the legal form for cooperative societies in the UK (different rules apply for registration of a credit union under the Credit Unions Act 1979). Subsequent guidance from the Financial Conduct Authority (FCA) on registration, and from the Charity Commission on share capital withdrawal allowances, has further clarified and codified the regulatory regime for cooperative societies. In particular, to register as a cooperative society under the Act, the society must be a ‘bona fide co-operative society’. The Act, however, does not precisely define what counts as a bona fide co-operative society. In its guidance, the FCA adopts the definition in the ICA’s Statement on the Cooperative Identity and considers the condition for registration to be met where the society puts the values from the ICA’s Statement into practice through the principles set out in the Statement.[footnote]See Financial Conduct Authority (2015) Guidance on the FCA’s registration function under the Co-operative and Community Benefit Societies Act 2014, Finalised guidance 15/12 [online]. Available at: www.fca.org.uk/publication/finalised-guidance/fg15-12.pdf[/footnote]

The cooperative society form is widely used by all types of cooperatives. Registration under the 2014 Act imposes a level of governance through a society’s rules and a level of transparency through certain reporting requirements that has some common ground with Companies Acts requirements for other types of organisations.

However, as noted above, this is not the only legal form available for a cooperative, and alternative legal forms that can be used include a private company limited by shares and a private company limited by guarantee. For a more detailed exploration of the options Co-operatives UK has published guidance,[footnote]Co-operatives UK (2017) Simply Legal.[/footnote] and has a ‘Select-a-Structure’ tool on its website.[footnote]See Co-operatives UK (2018), Support for your co‑op. [online]. Available at: www.uk.coop/developing-co-ops/select-structure-tool [Accessed 18 Feb. 2021].[/footnote]

Cooperatives and data stewardship

For the purposes of this report we see data cooperatives as cooperative organisations (whatever their legal form) that have as their main purpose the stewardship of data for the benefit of their members, who are seen as individuals (or data subjects).[footnote]Depending on the type of cooperative, members of a cooperative can also be SMEs, enterprises, different types of individuals or groups or a combination of these. For more information see Co-operatives UK (2018), Types of Co-ops. [online]. Available at: www.uk.coop/understanding-co-ops/what-co-op/types-co-ops [Accessed 18 Feb. 2021].[/footnote] This is in contrast to stewardship of data primarily or exclusively for the benefit of the community at large.
Under the Co-operative and Community Benefit Societies Act 2014, if the emphasis is to benefit a wider community then the appropriate legal form would be a community benefit society.

As with cooperative societies, other legal forms could also be used to achieve the same aims, and deciding which is best will depend on a number of case-specific factors. However, that is not to say that a cooperative whose aim is to benefit its members might not also benefit wider society – we will see examples later (e.g. Salus Coop) where members’ benefits are also intended to benefit wider society. Indeed, where members see the wider benefits as their own priorities (as with philanthropic giving), the distinction between members’ benefits and social benefits may be hard to discern.

In a data cooperative, those responsible for stewarding the data act in the context of the collective interests of the members and – depending on how the cooperative is governed – may have to advance the interests of all members at once, and/or achieve consensus over whether an action is allowed.

The stewardship of data may be (and with increasing tech adoption is increasingly likely to be) a secondary function to the main purpose of a cooperative. For example, if the cooperative is enabled by technology, such as through the use of a social media platform, then it will routinely produce data that it may be able to capture. If so, this data might be of use to the cooperative’s own operations in future. Some of these groups have been described as social machines.[footnote]Shadbolt, N., O’Hara, K., De Roure, D. and Hall, W. (2019). The Theory and Practice of Social Machines. Lecture Notes in Social Networks. Cham: Springer International Publishing. Available at: https://www.springer.com/gp/book/9783030108885[/footnote]

Examples of areas where valuable data may be produced include medical applications; interest groups, such as religious or political groups; fitness, wellbeing and self-help groups (particularly the quantified-self movement); and gaming groups. While questions around the management and use of data produced by cooperatives through their ordinary business will become increasingly important (as with other types of organisations that produce data as part of their business), this is not our focus here.

Data cooperatives versus data commons

 

In their collaborative, consensual form, data cooperatives are similar to data commons. A commons is a collective set of resources that may be: owned by no one; jointly owned but indivisible; or owned by an individual with others nevertheless having rights to usage (as with some types of common land). Management of a commons is typically informal, via agreed institutions and social norms.[footnote]For a richer discussion on governing the commons see Ostrom, E. (2015). Governing the Commons. Cambridge: Cambridge University Press.[/footnote]

 

The distinction between commons and cooperatives is blurred; one possible marker is that a commons is an arrangement where the common resource is undivided, and the stakeholders all have equal rights, whereas in a cooperative, the resources may have been owned by the members and brought into the cooperative. The cooperative therefore grows or shrinks as resources are brought in or out as members join or leave, whereas the commons changes organically, and its stakeholders use but do not contribute directly to the resources.

 

In the case of data, the cooperative model would imply that data was brought to and withdrawn from the cooperative as members joined and left. A data commons implies a body of data whose growth or decline would be independent of the identity and number of stakeholders.

 

The governance of commons can provide sustainable support for public goods,[footnote]Ostrom, E. (2015) Governing the Commons. Available at: https://doi.org/10.1017/CBO9781316423936[/footnote] and data commons are often written and theorised about.[footnote]Grossman, R. (2018). A Proposed End-To-End Principle for Data Commons. [online] Medium. Available at: https://medium.com/@rgrossman1/a-proposed-end-to-end-principle-for-data-commons-5872f2fa8a47 [Accessed 18 Feb. 2021].[/footnote] However, as this report is focused on existing examples of practice, it is difficult to identify actual paradigms of data commons (either intended as such, or merely as institutions whose governance happens to meet Ostrom’s principles).[footnote]See Ada Lovelace Institute (2020). Exploring principles for data stewardship. [online] www.adalovelaceinstitute.org. Available at: www.adalovelaceinstitute.org/project/exploring-principles-for-data-stewardship [Accessed 18 Feb. 2021] and Ostrom, E. (2015) Governing the Commons.[/footnote] Hence, while data commons may be an exciting way forward, and while there are some domains where a commons approach might be appropriate (such as OpenStreetMap and Wikidata), the prospects of their emergence from the complex legal position surrounding data at the time of writing are not strong, so they will not be discussed further in this report.

Examples of cooperatives as stewards of data

For the purpose of this report, data cooperatives are seen as cooperative organisations (irrespective of their legal form) that have as their main purpose the stewardship of data for the benefit of their members. This section focuses on examples from the data cooperative space, sharing remarks on governance, approach to data rights and sustainability. Although they take different legal forms (particularly as they are not all UK-based projects), all are working along broadly cooperative principles.

1. Salus Coop

Salus Coop is a non-profit data cooperative for health research (referring not only to health data, but also lifestyle-related data more broadly, such as data that captures the number of steps a person takes in a day), founded in Barcelona by members of the public in September 2017. It set out to create a citizen-driven model of collaborative governance and management of health data ‘to legitimize citizens’ rights to control their own health records while facilitating data sharing to accelerate research innovation in healthcare’.[footnote]See Salus Coop (n.d.). Home. [online] SalusCoop. Available at: www.saluscoop.org [Accessed 18 Feb. 2021].[/footnote]

Governance: Salus has developed a ‘common good data license for health research’ together with citizens through a crowd-design mechanism,[footnote]More information available at: Salus Coop (2020). TRIEM: Let’s choose a better future for our data. [online] SalusCoop. Available at: www.saluscoop.org/proyectos/triem [Accessed 18 Feb. 2021].[/footnote] which it describes as the first health data-sharing license. The Salus CG license applies to data that members donate and specifies the conditions that any research projects seeking to use the member data must adhere to.[footnote]The terms of the licences are available at Salus Coop (2020). Licencia. [online]. Available at: www.saluscoop.org/licencia [Accessed 18 Feb. 2021].[/footnote] The conditions are:

  • health only: the data will only be used for biomedical research activities and health and/or social studies
  • non-commercial: research projects will be promoted by entities of general interest, such as public institutions, universities and foundations
  • shared results: all research results will be accessible at no cost
  • maximum privacy: all data will be anonymised and de-identified before any use
  • total control: members can cancel or change the conditions of access to their data at any time.
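
Conditions of this kind lend themselves to a machine-readable representation that a data platform could check automatically before granting researchers access. The sketch below is purely illustrative and is not Salus’s actual implementation; the field names and the shape of the access request are assumptions.

```python
# Illustrative only: a hypothetical machine-readable encoding of licence
# conditions like those above, plus a simple access check.
# Field names and the structure of `request` are assumptions, not Salus's API.

LICENCE_CONDITIONS = {
    "permitted_purposes": {"biomedical_research", "health_study", "social_study"},
    "permitted_entities": {"public_institution", "university", "foundation"},
    "results_must_be_open_access": True,
    "data_must_be_anonymised": True,
    "member_can_withdraw_at_any_time": True,
}

def request_complies(request: dict) -> bool:
    """Return True if a research access request satisfies every licence condition."""
    return (
        request.get("purpose") in LICENCE_CONDITIONS["permitted_purposes"]
        and request.get("entity_type") in LICENCE_CONDITIONS["permitted_entities"]
        and request.get("results_open_access") is True
        and request.get("data_anonymised") is True
    )

# Example: a university health study using anonymised data, with open results.
print(request_complies({
    "purpose": "health_study",
    "entity_type": "university",
    "results_open_access": True,
    "data_anonymised": True,
}))  # True
```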

Data rights: Individual members will have access to the data they’ve donated, but Salus will only permit third-party access to anonymised data. Salus describes itself as committed to ensuring, and requires researchers interacting with the data to ensure, that: individuals have the right to know under what conditions the data they’ve contributed will be used, for what uses, by which institutions, for how long and with what levels of anonymisation; individuals have the right to obtain the results of studies carried out with the use of data they’ve contributed openly and at no cost; and any technological architecture used allows individuals to know about and manage any data they contribute.

Note therefore that Salus meets the definition of a data cooperative, as it provides clear and specified benefits for its members – specifically a set of powers, rights and constraints over the use of their personal health data – in such a way as to also benefit the wider community by providing data for health research. Some of these powers and rights would be provided by GDPR, but Salus is committed to providing them to its members in a transparent and usable way.

Sustainability of the cooperative: Salus has run small-scale studies since 2016, and promotes itself as being able to generate ‘better’ data for research (compared, for example, with surveys), creating ‘new’ datasets (such as heartbeat data generated through consumer wearables) and ‘more’ data than other approaches. However, the cooperative’s approach to sustainability is unclear. In June 2021 it aims to publicly launch CO3 (Cooperative COVID Cohort), a project stream to help COVID-19 research,[footnote]More information available at: Salus Coop (2020). Co3. [online]. Available at: www.saluscoop.org/proyectos/co3 [Accessed 18 Feb. 2021].[/footnote] and it plans to sustain itself by capturing a fraction of the value generated by providing data to researchers.

2. Driver’s Seat

Driver’s Seat Cooperative LCA (‘Driver’s Seat’)[footnote]See Driver’s Seat Cooperative (n.d). Home. [online]. Available at: www.driversseat.co [Accessed 18 Feb. 2021].[/footnote] is a driver-owned cooperative incorporated in the USA in 2019, with ambitions to help unionise or collectivise the gig economy. It helps gig-economy workers gain access to work-related smartphone data and get insight from it:

it is ‘committed to data democracy … [and] empowering gig workers and local governments
to make informed decisions with insights from their rideshare data.’

The Driver’s Seat app, available only in the US, allows on-demand drivers to track the data they generate, and share it with the cooperative, which can then aggregate and analyse it to produce wider insights. These are fed back to members, enabling them to optimise their incomes. Driver’s Seat Cooperative also collects and sells mobility insights to city agencies to enable them to make better transportation-planning decisions. According to the website, when ‘the Driver’s Seat Cooperative profits from insight sales, driver-owners receive dividends and share the wealth’.
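
As a rough illustration of the kind of aggregation described above, the sketch below pools hypothetical trip records shared by members and computes a simple earnings insight that could be fed back to each driver. It is not Driver’s Seat’s actual code or data model; the record fields and the hourly-rate metric are assumptions.

```python
# Hypothetical sketch of pooling members' trip data and returning an insight.
# The record fields and the hourly-rate metric are assumptions for illustration.
from collections import defaultdict

trips = [
    {"driver": "A", "platform": "rideshare_x", "hours": 2.0, "earnings": 38.0},
    {"driver": "B", "platform": "rideshare_x", "hours": 1.5, "earnings": 21.0},
    {"driver": "A", "platform": "rideshare_y", "hours": 3.0, "earnings": 66.0},
]

def hourly_rates_by_platform(records):
    """Aggregate earnings and hours per platform, then compute an average hourly rate."""
    totals = defaultdict(lambda: {"hours": 0.0, "earnings": 0.0})
    for r in records:
        totals[r["platform"]]["hours"] += r["hours"]
        totals[r["platform"]]["earnings"] += r["earnings"]
    return {p: t["earnings"] / t["hours"] for p, t in totals.items() if t["hours"] > 0}

# Fed back to members, such comparisons could help a driver decide where to work.
print(hourly_rates_by_platform(trips))  # {'rideshare_x': ~16.86, 'rideshare_y': 22.0}
```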

One issue here, unexplored on the website, is that in the ride-hailing market, in geographically limited areas, drivers may indeed have common interests, but they are also in competition with each other for rides. Access to data could also open up job allocation to scrutiny, something that concerns drivers in the UK, where a recent complaint against Uber has been brought by drivers who want to see how algorithms are used to determine their work, on the basis that this could be allowing discriminatory or unfair practices to go unchecked.[footnote]See OpenCorporates (2021). Salus Coop. [online] opencorporates.com. Available at: https://opencorporates.com/companies/us_co/20191545590 [Accessed 18 Feb. 2021].[/footnote]

Governance: Driver’s Seat Cooperative is an LCA or Limited Cooperative Association in the US, so will be governed by the legislation and rules associated with this type of entity. It is not obvious from the website what the terms and conditions are for becoming a member of the cooperative and how it is democratically controlled.

Data rights: Driver’s Seat is headquartered outside the jurisdiction of the GDPR. A detailed privacy notice sets out how Driver’s Seat collects and processes personal data from its platform, which includes its website and the Driver’s Seat app.[footnote]See Driver’s Seat Cooperative (2020). Privacy notice [online]. Available at: www.driversseat.co/privacy [Accessed 18 Feb. 2021].[/footnote] By accessing or using the platform the user consents to the collection and processing of personal data according to this notice.

Sustainability of the cooperative: Driver’s Seat is a very new cooperative and a graduate of the 2019 cohort of the start.coop accelerator programme in the US.[footnote]See Start.Coop (2019), Cohort report 2019. [online] Available at: https://start.coop/wp-content/uploads/2019/12/Start.coop_2019Report.pdf [Accessed 18 Feb. 2021].[/footnote] PitchBook reports that it secured $300k angel investment in August 2020.[footnote]See PitchBook (n.d.), Driver’s Seat Cooperative Company Profile: Valuation & Investors. [online] Available at: https://pitchbook.com/profiles/company/251012-17 [Accessed 18 Feb. 2021].[/footnote] According to its website, Driver’s Seat sells mobility insights to city agencies, which is doubtless at least part of its plan for long-term sustainability. It is not obvious from the website if there is any further investment requirement from the driver-owners of the cooperative above and beyond sharing their data. The app itself is free.

3. The Good Data (now dissolved)

The Good Data Cooperative Limited (‘The Good Data’)[footnote]More information available at: TheGoodData (n.d). Home. [online]. Available at: www.thegooddata.org [Accessed 18 Feb. 2021].[/footnote] was a cooperative registered in the UK that developed technology
to collect, pool, anonymise (where possible) and sell members’ internet browsing data on their own terms, to correct the power imbalance between individuals and platforms (selling ‘on fair terms’).[footnote]For more information see: Nesta (n.d.). The Good Data. [online] Nesta. Available at: www.nesta.org.uk/feature/me-my-data-and-i/thegood-data/ [Accessed 18 Feb. 2021].[/footnote] Members participated in The Good Data by donating their browsing data through this technology, so that the cooperative could trade with it anonymously, raising funds to cover costs and fund charities.[footnote]See Partial Amendment to Rules dated 18 July 2017, filed at the FCA: https://mutuals.fca.org.uk/Search/Society/26166 [Accessed 18 Feb. 2021].[/footnote]

As with Salus Coop, The Good Data provided benefits for members while simultaneously promising potential benefits for the wider community (and indeed many of those wider benefits would also be reasons for members to join).

Governance: The Good Data was registered as a cooperative society under the Co-operative and Community Benefit Societies Act 2014, and accordingly was subject to the requirements of that Act and had to be governed according to its rules filed with the FCA. The Good Data determined which data consumers should receive the data, and made decisions about what to sell and how far to anonymise on a case-by-case basis. It declined to collect data from ‘sensitive’ browsing behaviour, which included looking at ‘explicit’ websites, as well as health-related and political sites.[footnote]For more information see Nesta (n.d.). The Good Data.[/footnote] According to The Good Data’s last annual return filed at the FCA,[footnote]See Annual Return and Accounts dated 31 December 2018 filed at the FCA: https://mutuals.fca.org.uk/Search/Society/26166 [Accessed 18 Feb. 2021].[/footnote] The Good Data had three directors. Members had online access to all relevant information and, based on that, could present ideas or comments in the online collaboration platform at any time. Members could also participate in improving existing services, and an Annual General Meeting was held once a year.

Data rights: It is hard to say what rights were invoked here. If the data was fully anonymised, it was no longer personal data under the GDPR. If, however, the data could still be re-identified or attributed to an individual, then it was merely pseudonymised (and thus still personal data).
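
To make the distinction concrete, the generic sketch below shows a common pseudonymisation step: replacing a direct identifier with a keyed hash. Because the mapping can be re-linked by whoever holds the key (or other additional information), the output remains personal data under the GDPR, whereas anonymisation requires that individuals can no longer be identified by means reasonably likely to be used. This is an illustration only, not a description of The Good Data’s actual processing; the field names and key handling are assumptions.

```python
# Generic illustration of pseudonymisation (not The Good Data's actual pipeline).
# A keyed hash replaces the direct identifier; the record is pseudonymised, not
# anonymised, because anyone holding the key can re-link it to the member.
import hashlib
import hmac

SECRET_KEY = b"held-separately-by-the-data-controller"  # assumption for illustration

def pseudonymise(record: dict) -> dict:
    """Replace the direct identifier with a keyed hash; other fields are untouched."""
    token = hmac.new(SECRET_KEY, record["member_id"].encode(), hashlib.sha256).hexdigest()
    out = dict(record)
    out["member_id"] = token  # still personal data under the GDPR
    return out

print(pseudonymise({"member_id": "member-0042", "visited_domain": "example.org"}))
```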

Sustainability of the cooperative: Revenue was generated from the sale of anonymised data to data brokers and other advertising platforms, and the profits were redistributed to maintain the system and to fund social lending in developing countries. Decisions about the latter were determined by cooperative members. However, the model proved not to be sustainable, as its website announces the dissolution of the cooperative: ‘we thought that the best way to achieve our vision was by setting up a collaborative and not for profit initiative. But we failed to pass through the message and to attract enough members.’ The Request to Cancel filed at the FCA[footnote]See Request to Cancel dated 6 September 2019 filed at the FCA: https://mutuals.fca.org.uk/Search/Society/26166 [Accessed 18 Feb. 2021].[/footnote]
also indicated that this was due to Google rejecting, from its Chrome Web Store, The Good Data’s technology that was intended to allow members to gain ownership of their browsing data, and to the cooperative being unable to build a new platform to pursue this objective given the required technical complexity and a lack of sufficient human and financial resources.

Created with similar intentions, Streamr[footnote]Konings, R. (2019). Join a data union with the Swash browser plugin. [online] Medium. Available at: https://medium.com/streamrblog/join-a-data-union-with-the-surf-streamr-browser-plugin-d9050d2d9332 [Accessed 18 Feb. 2021].[/footnote] advocates for the concept of ‘data unions’ and seeks to create financial value for individuals by building aggregate collections of data, including web-browser data – it is unclear whether this effort will prove more sustainable than The Good Data.

Problems and opportunities addressed by data cooperatives

From the examples surveyed above, data cooperatives appear mostly concerned with personal data (as opposed to non-personal data) and, in general, are directed towards giving members more control over data they generate, which in turn can be used to address existing problems (including social problems) or open up new opportunities. This is very much in line with the purpose of the cooperative model generally. For example, Salus Coop allows members to control the use of their health data, while opening up new opportunities for health research. The Good Data was aimed at giving data subjects more control and bargaining power with respect to data platforms, to get a better division of the economic benefits. Unionising initiatives, such as Driver’s Seat, have focused largely but not exclusively on the gig economy, and using data to empower workers and enable them to optimise their incomes and working practices.

Many data cooperatives seek to repurpose existing data at the discretion of groups of people, to create new cooperatively governed data assets. In this respect, they tend to pursue a positive agenda that uses data as a resource. For example, Driver’s Seat brings in data from sources such as rideshare platforms and sells mobility insights based on this data, sharing profits among members. The Good Data’s business model was to trade anonymised internet browsing data. Some data cooperatives do also seek to refactor the relationship between organisations that hold data and individuals who have an interest in it. The Good Data’s technology to collect internet browsing data was also designed to give members using it more privacy by blocking data trackers.

See also RadicalxChange’s proposal in Annex 3, which contains elements of all three legal mechanisms presented in this report. Described as a conceptual model, it would shake up the status quo even more by making corporate access to data subjects’ data the cooperative decision of a Data Coalition.

Although privacy is usually a feature they respect, it is hard to find data cooperatives that treat preserving privacy as a first priority by limiting the data that is collected and processed. Indeed, this is rather a negative aim, constraining the use of data, rather than pursuing a positive agenda and opening up a new purpose for the data.

More often, data anonymisation techniques and privacy-preserving technologies are referred to; however, these areas require research and investment,[footnote]Royal Society (2019). Protecting privacy in practice: The current use, development and limits of Privacy Enhancing Technologies in data analysis. [online] Royal Society. Available at: https://royalsociety.org/-/media/policy/projects/privacy-enhancing-technologies/privacy-enhancing-technologies-report.pdf?la=en-GB&hash=862C5DE7C8421CD36C105CAE8F812BD0 [Accessed 18 Feb. 2021].[/footnote] especially given the legal uncertainty as to what it takes for companies to anonymise data in the light of the GDPR, and the complexity of the task of anonymisation itself, which requires a thorough understanding of the environment in which the data is held.[footnote]For a more detailed discussion see: UK Anonymisation Network (2020). Anonymisation Decision-Making Framework. [online] Available at: https://ukanon.net/framework [Accessed 18 Feb. 2021].[/footnote]
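
One reason anonymisation is harder than it looks is that combinations of apparently innocuous attributes (quasi-identifiers) can single people out. A standard, though partial, safeguard is k-anonymity, which requires every released combination of quasi-identifier values to be shared by at least k records. The minimal, generic check below illustrates the idea; the column names and threshold are assumptions, and passing such a check is not by itself sufficient for anonymisation in the GDPR sense.

```python
# Minimal, generic k-anonymity check over chosen quasi-identifier columns.
# Passing this check alone does not make a dataset anonymous in the GDPR sense.
from collections import Counter

def is_k_anonymous(rows, quasi_identifiers, k=5):
    """True if every combination of quasi-identifier values occurs at least k times."""
    counts = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return all(c >= k for c in counts.values())

rows = [
    {"age_band": "30-39", "postcode_area": "EX4", "condition": "asthma"},
    {"age_band": "30-39", "postcode_area": "EX4", "condition": "diabetes"},
    {"age_band": "40-49", "postcode_area": "SW1", "condition": "asthma"},
]
print(is_k_anonymous(rows, ["age_band", "postcode_area"], k=2))  # False: the SW1 group has only 1 row
```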

The examples we have surveyed could be said to recognise the balance between 1) complete privacy, 2) the potential benefits to the individual from collecting and processing personal data and communicating the insights to the individual, and 3) the ability of those individuals to then better influence the market and receive a better division of the economic benefits (e.g. through selling the data and/or insights).

Challenges

The cooperative approach appeals to a sense of data democracy, participation and fair dealing that may inform and shape the structuring of any data-sharing platform but, in themselves, cooperatives face a number of challenges:

1. Uptake

While the examples we have analysed represent experimentation around data cooperatives, there does not appear to be significant uptake and use of them, and there is little evidence that they will scale to steward significant amounts of data within a particular geography or domain. This is perhaps unsurprising, given a number of challenges to uptake, as cooperatives require motivated individuals to come together and actively participate by:

  • recognising the significance of the problem a cooperative is trying to solve (resonance challenge)
  • being interested enough to find or engage with a data cooperative as a means to solve the problem (mobilisation challenge)
  • trusting a particular cooperative and its governance as the best place to steward data (trust challenge)
  • being data literate enough to understand the implications of different access permissions, and/or willing to devote time and effort to managing the process. Because cooperatives presume a role for voluntary members and rely on positive action to function, this is more likely to work in circumstances where all participants
    are suitably motivated and willing to consent to the terms of participation (capacity challenge).

The examples surveyed offer some insights into how these elements of the uptake challenge could be met. A strong common incentive could be enough to meet the mobilisation challenge through bottom-up attempts to create data cooperatives. For example, Driver’s Seat could draw on gig-economy workers’ interest in, and perceived injustice of, their working conditions and pay to build an important worker-owned and controlled data asset. If endorsed or even delivered by trusted institutions such as labour unions, this could further enhance uptake.

Other examples, such as The Good Data, were aiming to mobilise people around the concept of correcting a power imbalance between individuals and platforms. In a similar vein, the aim of the RadicalxChange model (discussed further in Annex 3) works at the level of power imbalance, with an added requirement for legislative change to make their data coalitions possible and reduce the market failure of data.[footnote]In RadicalxChange’s view, data fails because most of the information we have at our disposal (about ourselves and others) is largely the same as information others have at their disposal. The price is dragged down to zero as buyers can always find a cheaper seller for the same data. However, data’s combined value, which is higher than zero, is almost entirely captured by the (well-capitalised) parties that have capacity to combine data and extract insights. Because of this market failure, which is peculiar to data, RadicalxChange believes that top-down intervention is needed to make bottom-up organisation possible through Data Coalitions. Through the right type of legislation, the problem of buy-in for joining data coalitions would be removed, because joining would be costless or virtually costless and immediately advantageous or remunerative. RadicalxChange is discussed as a conceptual model in Annex 3.[/footnote]

Such a top-down approach could create challenges not too far removed from the issues that many data cooperatives seek to address, such as the default sharing and processing options that the data would be subject to, and the ability of people to opt out or switch. Relying on individual buy-in for success may never move the needle without more of a purpose or affiliation to coalesce around. Changing the world for the better is more abstract and often less motivating than changing one’s particular corner of it for one’s (and others’) benefit.

These uptake challenges are not unique to cooperatives and are experienced by many other data-stewardship approaches that focus on empowering individuals in relation to their personal data. However, potentially, the features of a cooperative approach to data stewardship could themselves hinder the uptake and scalability of a data cooperative
initiative. These are discussed next.

2. Scale

There are additional features of cooperatives that may make this approach unsuitable for large-scale data-stewardship initiatives:

a. Democratic control and shared ownership

The cooperative model presumes shared ownership. The implied level of commitment may be an asset to the organisation, but may similarly make it hard for the model to scale if everyone wants their say.

The cooperative model also favours democratic control. Depending on how the cooperative is established and governed, the democratic control of cooperatives could be too high a burden for all but the most motivated individuals, limiting its ability to scale. Alternatively, where a cooperative has managed to scale, this approach could become too unwieldy for a cooperative to effectively carry out its business in a nimble and timely fashion.

Democracy and ownership also need to be balanced by a constitution. It may aim for equal say for members (one member, one vote), or alternatively it may skew democratic powers toward those members with more of a commitment (e.g. based on the amount of data donated).
Questions need to be resolved about what members vote for – particular policies, or simply for an executive board. Can the latter restriction, which will lead to more efficient decision-making, still enable individual members to feel the commitment to the cause that is needed to meet
the mobilisation challenge? If, on the other hand, members’ votes feed directly into policy, can the cooperative sustain sufficient policy coherence to meet the trust challenge?
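
As a purely hypothetical illustration of this design choice (not drawn from any of the cooperatives surveyed), the sketch below contrasts one-member-one-vote with a scheme that weights votes by the amount of data a member has contributed; the member names and contribution figures are invented.

```python
# Hypothetical comparison of two voting rules a data cooperative's constitution
# might adopt; names and contribution figures are invented for illustration.

members = {"ana": 120, "bo": 45, "chen": 900}  # e.g. number of records contributed

def one_member_one_vote(members):
    """Equal say regardless of contribution."""
    return {name: 1.0 for name in members}

def contribution_weighted(members):
    """Voting power proportional to the share of data contributed."""
    total = sum(members.values())
    return {name: contrib / total for name, contrib in members.items()}

print(one_member_one_vote(members))    # {'ana': 1.0, 'bo': 1.0, 'chen': 1.0}
print(contribution_weighted(members))  # skewed heavily toward the largest contributor
```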

b. Rights, accountability and governance

To establish and enforce rights and obligations, a cooperative needs to be able to use additional contractual or corporate mechanisms, and this requires members to engage and understand their rights and obligations. This is particularly important where data is concerned, given legal duties under legislation such as the Data Protection Act 2018, which implements the GDPR in the UK.

Cooperatives can create a large audience of members who can demand accountability, and these members may themselves be exposed to personal liability, with associated challenges in managing a potential proliferation of claims and a fear of unjust proceedings.

Cooperatives may establish high levels of fiduciary responsibility but do not inherently determine particular governance standards or establish clear management delegation and discretion. Registration under the Co-operative and Community Benefit Societies Act 2014
imposes a level of governance that partially echoes the greater body of legislation applicable to registered companies under the Companies Acts. Registration as a company under the Companies Acts will import a broader array of governance provisions.

With respect to data, governance is a particularly sensitive requirement, especially as a cooperative scales. If a cooperative ended up holding a large quantity of data, this may become extremely valuable as network effects kick in. The cooperative would certainly need a level of professionalism in its administration to prosper, especially if its mission required it to negotiate with large data consumers, such as social networks. Moreover, the overarching governance of the administrators of the cooperative would need to be addressed. For example, there could be a data cooperative board with each individual having ownership shares in the cooperative based on the data contributed (which in turn would need a quasi-contractual model to define the role of the board and its governance role regarding data use).

Failure of governance may also leave troves of data vulnerable, if the proper steps have not been taken. In one recent incident, a retail cooperative venture in Canada called Mountain Equipment Co-Op was sold to an American private-equity company from underneath its five million members, after years of poor financial performance (losing CAD$11 million in 2019), with the COVID-19 pandemic as the last straw. The board felt that the sale was the only alternative to liquidation, although the decision was likely to be challenged in court.[footnote]Cecco, L. (2020). Members of Canada’s largest retail co-op seek to block sale to US private equity fund. [online] The Guardian. Available at: www.theguardian.com/world/2020/sep/22/canada-mountain-equipment-co-op-members-bid-block-sale-us-firm [Accessed 18 Feb. 2021].[/footnote] This case throws up data issues specifically – does the buyer get access to data
about the members, for example? But the main point is that a data cooperative managing a large datastore effectively and securely might well have to endure significant costs (e.g. for security), and will need a commensurate income.

If that income could not be secured, could the cooperative members prevent the sale of the cooperative – and therefore the data – to a predator? Under UK law, the assets of a cooperative should be transferred, at least in some circumstances, to a ‘similar’ body or organisation with similar values if and when it is wound up. Sometimes even an asset lock can be involved under Community Interest Company law. The extent of legal restraint on the disposal of the assets of the cooperative will of course depend on how it is defined and incorporated, and the sensitivity of the data should be reflected in the care with which the fate of the data is constrained. There may be legal protections, but it is still worth pointing out that the very existence of the data cooperative, as a single point of access to the data, may represent a long-term vulnerability.

c. Financial sustainability

Cooperatives do not easily lend themselves to development funding other than grant aid or pure philanthropy. In combination with the mobilisation challenge, this suggests financial sustainability is likely to be a significant issue.

One problem this creates for many cooperatives is that they have to fall back on internally generated resources (i.e. donated by the members). Without a substantial and sustainable income, a cooperative will find it difficult to recruit capable managers and administrators, and so will be forced to form committees selected from the membership. Without capable managers, a cooperative will be less able to generate income and manage resources effectively, and, for example, will be less able to raise external capital because of a low rate of expected return.

These factors constrain the scope for a cooperative to mature and operate in a commercial environment when compared with other models.

Mechanisms to address the challenges

The cooperative structure has longstanding heritage and diverse application, as demonstrated by the examples we have analysed, and ready appeal because of the inherent assumptions of common economic, social and cultural purpose. It is a natural mechanism by which an enterprise can be owned by people with a common purpose and managed for the benefit of those who supply and use shared services.

Recognising the challenges identified above that are inherent in a cooperative structure, we observe that cooperatives often rely on contract or incorporation to establish rights, obligations and governance, and either route might be selected as the preferred form while still
seeking to capture some of the essence of a cooperative through stated purpose, rights, obligations and oversight. However, neither is perfect – or, put another way, each, by diluting the cooperative ideal, may reintroduce some of the challenges that the cooperative model
was designed to address. These mechanisms are:

  • The contractual model, where all rules for the operation of the data platform should be set down in bilateral (or multilateral) agreements between data providers and data users. This, when combined with the fact that each party would need to take action on its own behalf to enforce the terms of these agreements against any counterparties,
    imposes a burden on participants to negotiate agreements and encourages participants to negotiate specific terms. It therefore has limited utility and is restricted to relatively limited groups of participants of similar sophistication, and may be vulnerable to the mobilisation or the capacity challenge.
  • The corporate model, often adopted in the form of a company limited by guarantee to underpin a cooperative, to achieve what a contractual model offers with additional flexibility, scalability and stability that is lacking from that model. This model may run into the trust challenge, however. In conceptual terms, data providers are being
    asked to give up a degree of control over the data they are providing in return for the inherent flexibility, scalability and stability of the structure. They will only do so if they feel they can trust the structure or organisation that has been set up to effect this, which can be offered via a combination of clear stated purpose of the institution, the
    reporting and accountability obligations of its board and an additional layer of oversight by a guarantor constituted to reflect the character of participants and charged with a duty to review and enforce due performance by the board. In time that might be supplemented by a suitably constituted, Government-sponsored regulator.

Although there is currently no obstacle in the way of data cooperatives – the law is in place, the cooperative model well-established – we can see a number of challenges to uptake, growth, governance and sustainability. The problem is rendered doubly hard by the fact that some of the challenges pull in different directions. For instance, the capacity challenge might be met by a division of labour, hiving off certain decision-making and executive functions, but then this might lead to the emergence of the trust challenge as the board’s decisions come under scrutiny. Failure to meet the mobilisation challenge could result in the members being as alienated from the stewardship of their data by the data cooperative as they were by other more remote corporate structures, but addressing the mobilisation challenge might lead to an
engaged set of members developing hard-to-meet expectations about the level of involvement they could aspire to, consistent with streamlined decision-making.

 

Mock case study: Greenfields High School

 

Greenfields High School and other educational facilities are interested in coordinating educational programmes to meet the needs of their learners and communities in a way that complements and strengthens school programmes. All educational institutions use online educational tools to tailor learning plans for improving student performance and see a real opportunity to better serve their community through data sharing.

 

Greenfields High School proposes that the other educational boards convene to explore the idea of pooling resources to achieve these goals. They all have a shared interest in working together to gain better insights as to how they might improve educational outcomes for their community members.

 

In an act of good governance, educational facilities consult with their students, parents and teachers, and together they develop the rules and governance of the cooperative:

  • Members of the community vote on the collaborative agreement between
    educational facilities and decide what data can be shared and for what
    purposes. The agreement is transparent about what data is collected, stored,
    processed and how it is used.
  • The schools gain better understanding of the effectiveness of online tools
    and educational plans throughout the learning cycle.
  • Where educational programmes are developed for the community based
    on analysed data, members also decide on the price thresholds for such
    educational services.

How would a data cooperative work?

 

A data cooperative is set up, pooling the data that educational facilities hold from their use of digital technologies. Schools advance their aims by comparing performance and understanding which digital tools are more effective. Students have a direct say in how their data is used and decide on the management and organisation of the cooperative.

Chapter 3: Corporate and contractual mechanisms

How corporate and contractual mechanisms work

Corporate and contractual mechanisms can create an ecosystem of trust where those involved:

  • establish a common purpose
  • share data on a controlled basis
  • agree on structure (corporate or contractual).

Why corporate and contractual mechanisms?

Corporate and contractual mechanisms can facilitate data sharing between parties for a defined set of aims or an agreed purpose. For the purposes of this report, it is envisaged that the overall purpose of a new data model will be to achieve more than mere data sharing, and
data stewardship can be used to generate trust between all the parties and help overcome relevant contextual barriers. The core purpose for data sharing will be wider than just the benefit gained by those who make use of data.

The role of the data model we envisage therefore includes:

  • enabling data to be shared effectively and on a sustainable basis
  • being for the benefit of those sharing the data, and for wider public benefit
  • ensuring the interests of those with legal rights over the data are protected
  • ensuring data is used ethically and in accordance with the rules of the institution
  • ensuring data is managed safely and securely.

How to establish the right approach?

The involvement of an independent data steward is envisaged as a means of creating a trusted environment for stakeholders to feel comfortable sharing data with other parties who they may
not necessarily know, or with whom they have not had an opportunity to develop a relationship of trust.

Incentives for allowing greater access to data and for making best use of internal data will vary according to an individual organisation’s circumstances and sector. While increased efficiency, data insights, improved decision making, new products and services and getting value from data are potential drivers, there are also a number of challenges to sharing data:

  • operating in highly competitive or regulated sectors, and concerns about undermining value in IP and confidential information
  • a fear of being shown up as having poor-quality or limited data sets
  • a fear of breaching commercial confidentiality, competition rules or GDPR
  • a lack of knowledge of business models to support data sharing – access to examples, lessons learned and data sharing terms can help others feel able to share
  • a lack of understanding of the potential benefits
  • not knowing where to find the data or limited technical resource to implement (e.g. to extract the data and transform it into appropriate formats for ingestion into a data-sharing platform)
  • fear of security and cybersecurity risks.

All these challenges can lead to inertia and lack of motivation.

Where a group of stakeholders see benefits in coming together to share data, they will still need to be confident that this is done in a way that maintains a fair equilibrium between them, and that no single stakeholder will dominate decisions regarding the management and sharing of data. In order to establish and maintain the confidence of the stakeholders, they should all be fully engaged in determining what legal mechanism should be established. One or two stakeholders deciding and simply imposing a structure on other stakeholders is unlikely to engender a sense of trust, confidence and common purpose.

It is for this reason that we recommend the following approach.

1. Establish a clearly defined purpose

Establishing a clearly defined purpose is the essential starting point for stakeholders. Not only will a compelling statement of purpose engender trust among stakeholders, but it will also provide the ultimate measure against which governance bodies and stakeholders can check to ensure that the data-sharing venture remains true to its purpose. A clearly defined purpose can also help in assessing compliance with certain principles of the GDPR and other data-related regulations, including ePrivacy,[footnote]Directive 2002/58/EC of the European Parliament and of the Council of 12 July 2002 concerning the processing of personal data and the protection of privacy in the electronic communications sector (Directive on privacy and electronic communications). Available at: https://eur-lex.europa.eu/legal-content/EN/ALL/?uri=CELEX%3A32002L0058 [Accessed 18 Feb. 2021].[/footnote] or Payment Services Directive 2,[footnote]Directive 2015/2366 of the European Parliament and of the Council of 25 November 2015 on payment services in the internal market. Available at: https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX:32015L2366 [Accessed 18 Feb. 2021].[/footnote] which are often tested against the threshold of whether a data-processing activity, or a way in which it is carried out, is ‘necessary’ for a particular purpose or objective.

Any statement of purpose will need to be underpinned by agreement on:

  • the types of data of which the data-sharing venture will take custody or facilitate sharing
  • the nature of the persons or organisations who will be permitted access to that data
  • the purpose for which they will be permitted access to that data
  • the data-stewardship model and governance arrangements for overseeing the structure and processing, including to enforce compliance with its terms and facilitate the exercise of rights by individuals, and to ensure that data providers and data users have adequate remedies if compliance fails.

2. Data provider considerations

The data-sharing model has to be an attractive proposition for the intended data providers, with clear value and benefit, and without unacceptable risk. There will need to be strong and transparent governance to engender the level of confidence required to encourage data sharing. This will include confidence in not only the data provider’s ability to share the data with the data-sharing venture without incurring regulatory risk or civil liability, but also in its ability to recoup losses from the data-sharing set up or from relevant data users if the governance fails and this results in a liability for the data provider. Other considerations for governance could be related to managing intellectual property rights and control over products developed based on the data shared.

3. Data user considerations

As with the data providers, the data-sharing model must be an attractive proposition for intended data users. The data will need to be of sufficient quality (including accuracy, reliability, currency and interoperability) and not too expensive, for the data users to want to participate. Data users will also require adequate protection against unlawful use of data. For example, in relation to personal data, data users will typically have no visibility of the origins of the datasets and the degree of transparency (or lack of it) provided to the underlying data subjects. They will also be relying on the data providers’ compliance with the governance model to ensure that use of the contributed datasets will not be a breach of third-party confidentiality or IP rights.

4. Data steward considerations

The data steward’s role is to make decisions and grant access to data providers’ data to approved data users in accordance with the purpose and rules of the data-sharing model. The steward may take on additional responsibilities such as due diligence on data providers and users, and enforcement of the purpose of the data-sharing model; however, the way in which the model is funded and structured[footnote]Article 11 and Recital 25 of the draft Data Governance Act include requirements for data-sharing services to be placed in a separate legal entity. This is required both in business-to-business data sharing as well as in business-to-consumer contexts where separation between data provision, intermediation and use needs to be provided. The text does not distinguish between closed or open groups.[/footnote] will impact on the extent of any such duties and who is practically responsible for performing them.

The responsibilities taken on by the data steward will affect the data providers’ and data users’ assessment of overall risk, and the development of trust in the relationships.

5. Relationship/legal personality

The formal relationship between the parties will depend on the previous steps and the project structure that the stakeholders are comfortable with, based on the relevant risk, economic, regulatory and commercial considerations. Where there is no distinct legal personality, the
relationship may be governed by a series of contracts between the data providers, users and data steward – whether bilateral or a contract club with multiple parties. Where there is a legal personality, there will be the documents establishing the relevant legal entity, most likely alongside a series of contracts.

6. The rules

The rules of the data-sharing model will form part of the corporate and/or contractual relationship between stakeholders. This is discussed in more detail below in the ‘Mapping data protection requirements onto a data sharing venture’ section and in Annex 1 on ‘Existing mechanisms for supporting data stewardship’ when discussing regulatory mechanisms.

What is the appropriate legal structure?

As outlined, the aim is to design an ecosystem of trust. The data stewardship model will sit at the heart of this ecosystem. In this section we address two broad possibilities as to the legal form this should take:

  • a contractual model: this would involve a standardised form of data sharing
    agreement without the establishment of any form of additional legal structure or personality
  • a corporate model: this would involve the establishment of a company or other legal person, which would be responsible for various tasks relating to the provision of access to and use of data. The documents of incorporation would be supplemented by contractual arrangements.

In the contractual model, all of the rules for the operation of the data venture would need to be set down (and repeated) in a series of bilateral (or multilateral) agreements between data providers and data users. This, when combined with the fact that each party would need to take action on its own behalf to enforce the terms of that agreement against any counterparties, makes it likely that providers of data will only be willing to provide access to data on highly specific terms.

Where the aims of the stakeholders will require significant flexibility and scalability, a simple contractual model may not be the most appropriate. For example, a contractual model does not easily accommodate dedicated resources that may be required to govern and administer a growing data-sharing establishment (such as full-time employees, for which an employing entity is required). Also, an independent entity may find it easier to vary the rules of participation, or make other changes for the benefit of all, as the model evolves or laws change. A multilateral contractual arrangement, by contrast, may require protracted negotiation amongst the various stakeholders, who each bring their own commercial objectives to the discussion.

In the corporate model, there is a degree of flexibility and scalability that is lacking from the contractual model. This model requires a greater degree of trust on the part of stakeholders, however. In conceptual terms, data providers are being asked to give up a degree of control over the data they are providing – presumably in return for some incentive or reward. They will only do so if they feel they can trust the structure or organisation that has been set up to effect this.

We consider three forms of company here: a company limited by shares, a company limited by guarantee (a CLG) and a community interest company (a CIC).

Whichever form is chosen, the company in question would operate as the data-platform owner and manager, and would enter into contractual arrangements with providers of data and proposed users.

The contractual terms would allow for:

  • required investment in the company to fund infrastructure requirements such as platform development and maintenance – this could be by way of non-returnable capital contribution or loan from either the data provider or data users as circumstances merit
  • required returns on supply of data
  • required charges for use of the data
  • other contractual rights and obligations specific to the circumstances including access to and usage of data.

Returns and charges could be related to commercial exploitation or fixed. Also, depending on the nature of the venture, data users may be obliged to share insights gained from access to the data with the venture so that it can be shared with other data users (e.g. see the Biobank example below). The contract terms would dictate all required obligations and liabilities between the contracting parties. Bear in mind that the structure of a data-sharing venture could be adapted over time. For example, at the outset, the stakeholders may not be in a position to finance the establishment and resourcing of a corporate entity, or it may not be seen as appropriate to a data-sharing trial. As the venture scales, however, the stakeholders may determine that a corporate structure should be implemented.

1. Choice of corporate form

One of the key questions that will determine the appropriate form of company is whether the data-sharing venture is intended to be able to make a profit other than for the benefit of its own business – i.e. whether profits are required to be applied to the furtherance of its business, or whether surplus profits may be distributed as dividends to the data-sharing venture’s shareholders.

CLGs are not usually used as a vehicle for a profit-making enterprise, and a CLG’s articles of association will often (but not always) prohibit or restrict the making of distributions to members. Any profits made by a CLG will generally be applied to a not-for-profit cause such as the data-sharing venture’s purpose.

A CLG may be the most appropriate vehicle where it is not envisaged that any profit or surplus generated will be distributed to its members, and where it is not envisaged that the institution will seek to raise debt or equity finance. In this case, activities will need to be financed by other means, such as revenue generated from its own activities (including the provision of data services) or third-party funding. If the focus changes over time to encompass more commercial activities, then establishing a trading subsidiary company limited by shares could also be considered.

It should be borne in mind that a CLG (unlike a company limited by shares) does not have share capital that it is able to show on its balance sheet. This often makes it more difficult for a CLG to raise external debt finance. The alternative possibility available to companies limited
by shares, of investment by way of equity finance, is precluded here because of the structure of the CLG. Because of these difficulties, it is worth drawing attention to CICs as a further alternative corporate vehicle.

A CIC is a limited-liability company that has been formed specifically for the purpose of carrying on a business for social purposes, or to benefit a community. Although it is a profit-making enterprise, its profits are largely applied to its community purpose rather than for private gain. This is achieved by way of a cap on any movements of value from the CIC to its shareholders or members (such as by way of dividends).

This model allows shareholders to share in some of the profit, while ensuring that the CIC continues to pursue its community purpose. CICs are regulated by the Office of the Regulator of CICs (the CIC Regulator), and are required to file a community interest statement at Companies House, which is also scrutinised by the CIC Regulator. The CIC’s share
capital would appear on its balance sheet, thus increasing its ability to raise external finance.

If surpluses generated by its activities (including the provision of data services) are to be applied to its business, and its financing arrangements are secure, then a CLG will likely assist in gaining traction with those stakeholders who believe that the independence of the data trust would be compromised by virtue of its ability to pay dividends to shareholders. The structure of a company limited by guarantee provides a well-established framework of governance and liability management, and avoids the risk of exposure to a proliferation of liabilities that exists
in shareholding and trust environments.

A guarantor, which could be a non-government organisation (NGO) or other suitably established and populated body, could be appointed to monitor compliance and governance. This could address the requirement for oversight in a way that is specific to the requirements of the platform and data supplier, and to subjects not easily undertaken by other pre-established bodies, such as the Charity Commission or Regulator of Community Interest Companies, neither of which is specifically equipped to perform this function.

2. Governance and rules

The agreed purpose for the data-sharing venture will drive the overall governance of the data arrangement and its objectives, the rules for its operation and the parameters for all data-sharing agreements entered into. That purpose and those objectives should be reflected (including, where appropriate, as binding obligations) in its governance framework,
rules and the contractual framework governing the provision and use of data.

While governance and rules are not necessarily made public documents, the greater the degree of transparency as to the data venture’s operations, the greater the level of confidence that stakeholders and the wider public will be likely to feel in its functioning. Strong and transparent governance is a critical factor in establishing trust to encourage data
sharing. The rules and governance framework will underpin the purpose. Confidence that strong governance will ensure strict compliance with the rules of the venture, and that any failings will be acted upon, is critical.

There needs to be confidence that the interests of all key stakeholders are represented. In a corporate model, there are a number of means of achieving this that may include board representation and/or a mix of decision-making and advisory committees representing the various interest groups. Boards and committees that are made up of trusted, respected independent members will also help engender confidence.

Depending on the circumstances and scale of the data-sharing venture, as well as an overall Governance Board, there may be an Operations Committee, a Funding Risk Advisory Committee, an Ethics Committee, a Technical Committee and a Data Committee. Alternatively, committees might be set up to represent different groups of stakeholders e.g. data providers, data users and data subjects.

With the contractual model, it would also be possible to constitute an unincorporated governance body, such as a board that comprises representatives of the stakeholders, together with some independent members who have relevant expertise. However, one can foresee potential practical difficulties with governance bodies that are more ad hoc and decentralised, including generating sufficient trust for data providers and users to submit to the jurisdiction of the body via the contractual arrangements.

3. Documentation

The documentation will need to cover the constituent parts that make up the data-sharing venture and also, if the contractual model is adopted, how these will be constituted from among the stakeholders. Participants will need to sign up to the rules of the venture, either as a stand-alone document, or by incorporation into the operational agreements, such as a data-provision agreement or data-use agreement, or the articles of a corporate vehicle. The exact contracting arrangements will be bespoke to the specific arrangement. If the venture is intended to enable additional participants to join, there will also need to be robust
arrangements (e.g. through accession agreements) to avoid re-execution of multilateral arrangements for each new joiner.

The common agreement could prescribe, in broad terms: the nature of the data that will be collected; the identity or class of the persons or organisations with whom it will be shared; and the uses to which such persons or organisations will be entitled to put that data.
It can also address leaver/joiner arrangements,[footnote]In order to improve the chances of participation, and where technically feasible, the exit arrangements for leavers should focus on the ability of a participant to leave the venture and remove their data. This respects the data sovereignty of the participant and enables them to remain in control of data, particularly important for personal data as participants will be conscious of their obligations under GDPR.[/footnote] due diligence, terms that underpin certain values or principles – for example the five data-access ‘control dimensions’ commonly referred to as the ‘Five Safes’[footnote]The ‘Five Safes’ comprise: safe projects, safe people, safe data, safe settings and safe outputs. Ritchie, F. (2017). The “Five Safes”: a framework for planning, designing and evaluating data access solutions. [online] Zenodo. Available at: https://zenodo.org/record/897821 [Accessed 18 Feb. 2021].[/footnote] or, in the context of personal data, the core principles contained in Article 5 of the GDPR – as well as change approval, the financial model for the operation of the club, dispute resolution and so on.

As mentioned above, the framework documents would need to cover the purpose of the venture and the type(s) of data in issue, along with the identity of persons or entities, or types of those that may be granted access, and the use to which they may put that data.

In addition, the documents will need to cover other important areas, such as:

  • technical architecture
  • interoperability
  • decision-making roles
  • the obligations of each participant and how any monitoring or audit of data use, particularly in respect of personal data, will take place
  • information security.

There will inevitably be other areas that the rules should also cover.

Key legal considerations include: data protection and privacy law; regulatory obligations or restrictions; commercial confidentiality; intellectual property rights; careful consideration of liability flows (particularly important if personal data is in issue); competition law; and external contractual obligations. As will be seen from some of the examples detailed in the section below, such as iSHARE, it is possible to use existing standard documents to cover some of the key issues, rather than developing everything from scratch. For example, existing open-source licences could be used to protect the intellectual property rights of the data providers and control data usage, bolstered by data-sharing arrangements specific to the venture.

As regards the nature of the data and its use in specific circumstances, the data providers may want to share data on a segregated and controlled basis. This means there will not be access to overall aggregated data, but there may be layered access or access to a limited number of aggregated datasets to reflect any restrictions on sharing of some data (e.g. certain data only to be shared with certain users or shared for specific insights/activities). In some instances there may be agreement to pool datasets between parties. The following requirements may be set (a simplified sketch of how such controlled aggregation might work follows this list):

  • each contributor would provide raw data/datasets that include but are not limited to personal data, and that data could include normal personal data as well as special category/sensitive personal data
  • no contributor would see all the raw data provided by the other contributors[footnote]As part of the stewardship model, one of the protections should be that only the data needed for an activity is accessed by other participants/stakeholders.[/footnote]
  • each contributor would want to be able to analyse, and to derive data and insights from aggregate datasets, without being able to identify individuals or confidential data in the datasets
  • individuals whose data is shared in this way would have the usual direct rights under data protection law in relation to the processing of their personal data.
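To make these requirements more concrete, below is a minimal, hypothetical Python sketch of how a data steward might hold each contributor’s raw data privately and answer only aggregate queries, suppressing small groups. The class, field names and threshold are invented for illustration and are not drawn from any of the ventures described in this report.

```python
from collections import defaultdict
from statistics import mean

MIN_GROUP_SIZE = 5  # example threshold: suppress aggregates over fewer records

class DataSteward:
    """Holds each contributor's raw records privately and answers only
    aggregate queries across the pooled data (illustrative sketch)."""

    def __init__(self):
        self._raw = defaultdict(list)  # contributor id -> private records

    def contribute(self, contributor_id, records):
        # Raw records are accepted but never exposed to other contributors.
        self._raw[contributor_id].extend(records)

    def aggregate(self, field, group_by):
        # Pool all contributors' records, then return group means,
        # suppressing any group smaller than MIN_GROUP_SIZE.
        groups = defaultdict(list)
        for records in self._raw.values():
            for record in records:
                groups[record[group_by]].append(record[field])
        return {key: round(mean(values), 2)
                for key, values in groups.items()
                if len(values) >= MIN_GROUP_SIZE}

# Example with made-up data: the small 'south' group is suppressed.
steward = DataSteward()
steward.contribute("provider_a", [{"region": "north", "wait_days": 12}] * 6)
steward.contribute("provider_b", [{"region": "south", "wait_days": 9}] * 3)
print(steward.aggregate("wait_days", group_by="region"))  # {'north': 12}
```

Small-count suppression is only one of many possible safeguards; in practice it would sit alongside the contractual controls, vetting and audit arrangements described in this chapter.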

Mapping data protection requirements onto a data-sharing venture

Where the data-sharing venture will involve processing of personal data, it will of course be necessary for all data providers, users and others processing personal data to comply with the GDPR (see Annex 1 for some of the key GDPR considerations). Depending on the nature of the legal structure, there will be contractual terms and also potentially a Charter/Code of Conduct or Rulebook setting out the obligations of the data providers and data users, including those relating to the GDPR. In some sectors, these may incorporate internationally recognised standards for data sharing by reference, rather than completely reinventing the wheel.[footnote]An example is the Rules of Participation used by Health Data Research UK (HDR UK). Organisations requesting data access from one of the hubs set up through HDR UK (including the INSIGHT hub) are required to commit to these rules, which reference published standards. See Health Data Research UK (2020). Digital Innovation Hub Programme Prospectus Appendix: Principles For Participation. [online]. Available at: www.hdruk.ac.uk/wp-content/uploads/2019/07/Digital-Innovation-Hub-Programme-Prospectus-Appendix-Principles-for-Participation.pdf [Accessed 18 Feb. 2021].[/footnote]

It will be necessary for each stakeholder who processes data (whether they are a data controller, joint data controller or data processor) to ensure they are compliant with GDPR requirements. This will be determined by the individual circumstances and a particular stakeholder may well be a data controller in some regards and a joint data controller
in others. Similarly, a stakeholder may be a data controller as regards some processing and a data processor in relation to others.

Privacy-enhancing technologies (PETs) are increasingly being advocated as a means to help ensure regulatory compliance and, more generally, the protection of commercially confidential information. Examples include technologies facilitating pseudonymisation, access control and encryption of data (in transit and at rest), as well as more sophisticated PETs such as differential privacy and homomorphic encryption. This is a developing area, with some mature market offerings and others still undergoing significant development.
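As an illustration of two of the PETs mentioned above, the sketch below shows keyed pseudonymisation and a differentially private count in Python. The secret key and epsilon value are placeholders chosen for the example, not recommendations, and real deployments would rely on vetted libraries and proper key management rather than this simplified code.

```python
import hashlib
import hmac
import random

SECRET_KEY = b"replace-with-a-managed-secret"  # placeholder key for illustration

def pseudonymise(identifier: str) -> str:
    """Replace a direct identifier with a keyed hash (HMAC-SHA256), so records
    can still be linked consistently without exposing the identifier itself."""
    return hmac.new(SECRET_KEY, identifier.encode(), hashlib.sha256).hexdigest()

def laplace_noise(scale: float) -> float:
    """Laplace(0, scale) noise drawn as the difference of two exponential draws."""
    return random.expovariate(1 / scale) - random.expovariate(1 / scale)

def dp_count(true_count: int, epsilon: float = 1.0) -> float:
    """Release a count with differential privacy: a counting query has
    sensitivity 1, so Laplace noise with scale 1/epsilon is sufficient."""
    return true_count + laplace_noise(1.0 / epsilon)

print(pseudonymise("patient-12345"))  # stable pseudonym usable for linkage
print(dp_count(128, epsilon=0.5))     # noisy count; differs on each run
```

Note that pseudonymised data generally remains personal data under the GDPR, so techniques like these complement, rather than replace, the contractual and governance safeguards discussed above.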

Examples of data-sharing initiatives with elements of data stewardship

1. The Data Sharing Coalition

The Data Sharing Coalition is an international initiative started in January 2020, after the Dutch Ministry of Economic Affairs and Climate Policy invited the market to seek cooperation in pursuit of cross-sectoral data-sharing.[footnote]See The Data Sharing Coalition (n.d.) Home. [online]. Available at: https://datasharingcoalition.eu.[/footnote] It ‘builds on existing data-sharing initiatives to enable data sharing across domains. By enabling multilateral interoperability between existing and future data-sharing initiatives with data sovereignty as a core principle, parties from different sectors and domains can easily share data with each other, unlocking significant economic and societal value.’

It aims to foster collaboration between a wider range of stakeholders, providing a platform for structured exchange of knowledge in the data-sharing coalition community.[footnote]The Data Sharing Coalition published an exploration on standards and agreements for enabling data sharing. See Data Sharing Coalition (2021). Harmonisation Canvas [online]. Available at: https://datasharingcoalition.eu/app/uploads/2021/02/210205-harmonisation-canvas-v05-1.pdf[/footnote] It plans to explore and document generic data-sharing agreements which it will capture in a Trust Framework governed by the Coalition. It will support the development of existing and new data-sharing initiatives, including around technical standards, data semantics, legal agreements, and trustworthy and reusable digital identities.

Principles

The Data Sharing Coalition has six core principles:

  1. Be open and inclusive: any interested party is welcome to participate in the Data Sharing Coalition.
  2. Deliver practical results: the Data Sharing Coalition will deliver functional frameworks and facilities that provide true value for all stakeholders of the data economy and that will help them accelerate in their data sharing context.
  3. Promote data sovereignty: the Data Sharing Coalition aims to enable the entitled party(ies) to control their data by including this as a requirement in the use cases and frameworks.
  4. Leverage existing building blocks: all Data Sharing Coalition frameworks and facilities will incorporate international open standards, technology and other existing facilities where possible.
  5. Utilise collective governance: all frameworks and facilities produced by the Data Sharing Coalition will be governed in a transparent, consensus-driven manner by a collective of all Data Sharing Coalition participants.
  6. Be ethical, societal and compliant: all activities of the Data Sharing Coalition are in line with societal values and compliant with relevant legislation.

Approach

It has two initial use cases:

  • green mortgages for investment in energy-saving measures
  • improving risk management for shipment insurance.

Members

The Data Sharing Coalition currently has around 30 participants, including: Connect2Trust, Dexes, ECP, Equinix, FOCWA, Fortierra, GO FAIR, HDN, International Data Spaces Association, iSHARE, KPN, Maas-Lab, MedMij, Nederlandse AI Coalitie, NEN, Netbeheer Nederland, Nexus, NOAB, Ockto, Roseman Labs, SAE ITC, SBR, SURF, Sustainable Rescue, TanQyou, Techniek Nederland, Thuiswinkel.org, Universiteit van Amsterdam, UNSense, Verbond van Verzekeraars and Visma Connect.

2. iSHARE

iSHARE is a Dutch Transport and Logistics Trust Framework for data sharing and was developed as part of the Government-backed Data Sharing Coalition.[footnote]See Support Centre for Data Sharing (2020). iSHARE: Sharing Dutch transport and logistics data. [online] Support Centre for Data Sharing. Available at: https://eudatasharing.eu/examples/ishare-sharing-dutch-transport-and-logistics-data [Accessed 18 Feb. 2021].[/footnote]

It is a decentralised model, where parties maintain control of what data will be shared, with whom and on what conditions/for what purpose. iSHARE is not a platform but a framework: INNOPAY co-created it with about 20 organisations (customs, ports, logistics providers, etc.). The framework holds only the list of participants and the fact that they have agreed to, and demonstrated conformance with, its operational, technical and legal specifications; it therefore deals with identification, authentication and access. The idea is that an accession agreement removes the need for separate bilateral agreements.

It doesn’t appear to involve any data stewardship in the sense of a trusted third party being given control of what data is shared, for what purpose and with whom.

iSHARE is also trying to facilitate information on, and access to, various agreement terms to choose from. The website has a 50-page document setting out typical agreement terms for data sharing and then links to 10–15 sets of licences, with a table for each one setting out which of those typical terms that particular licence covers.[footnote]Support Centre for Data Sharing (2019). Report on collected model contract terms. [online]. Available at: https://eudatasharing.eu/sites/default/files/2019-10/EN_Report%20on%20Model%20Contract%20Terms.pdf [Accessed 18 Feb. 2021].[/footnote] The aim is to have 50 sets of terms during 2020. The licence agreements currently include Creative Commons, Google API Licence, Montreal, ONS, Open Banking, NIMHDA, Apache, CDLA (copyleft, Linux Foundation), Open Database Copyleft, Swedish API Open Source, Microsoft Data Use Agreement and Norwegian Open Data. About 20 organisations are currently participants.

3. Amsterdam Data Exchange

AMDEX was initiated by the Amsterdam Economic Board and was backed by Amsterdam Science Park and Amsterdam Data Science.[footnote]For more information see Amsterdam Smart City (2020). Amsterdam Data Exchange [online]. Available at: https://amsterdamsmartcity.com/updates/project/amsterdam-data-exchange-amdex [Accessed 18 Feb. 2021].[/footnote] The project is supported by the City of Amsterdam.

Vision

‘The Amsterdam Data Exchange (in short: Amdex) aims to provide broad access to data for researchers, companies and private individuals. Inspired by the Open Science Cloud of the European Commission, the project is intended to connect with similar projects across Europe.
And eventually even become part of a global movement to share data more easily.’

Amdex’s CTO, Ger Baron, is quoted as follows: ‘Since 2011, the municipality have had an open data policy. Municipal data is from the community and must therefore be available to everyone, unless privacy is at stake. In recent years we have learned to open up data in different ways… We want to share data, but under the right conditions. This requires a transparent data market which is exactly what the Amsterdam Data Exchange can offer.’

The data owner decides which data can be shared, with whom and under what conditions. Amdex is building a ‘market model in which everyone is able to consult and use data in a transparent, familiar manner.’[footnote]Ibid.[/footnote]

4. INSIGHT: The Health Data Research Hub for Eye Health

INSIGHT is a collaboration between University Hospitals Birmingham NHS Foundation Trust (lead institution), Moorfields Eye Hospital NHS Foundation Trust, the University of Birmingham, Roche, Google and Action Against AMD.

INSIGHT’s objective is to make anonymised, large-scale data, initially from Moorfields Eye Hospital and University Hospitals Birmingham, available for patient-focused research to develop new insights in disease detection, diagnosis, treatments and personalised healthcare.

Access to the datasets curated by INSIGHT is through the Health Data Research Innovation Gateway. Applications to access the data will be reviewed by INSIGHT’s Chief Data Officer and then passed to the Data Trust Advisory Board (Data TAB). The Data TAB is formed of members of the public, patients and other stakeholders joining in a private capacity.
Applications will be accepted or rejected in a transparent manner and applicants will need to sign strict licensing agreements that prioritise data security and patient benefit.

Currently the governance of INSIGHT is managed through the Advisory Board but, as indicated at the recent ODI Data Institutions event, it is anticipated that a company limited by guarantee may be created.

5. Nallian for Cargo

Nallian is a common infrastructure for data sharing between commercial sectors.[footnote]For more information see Nallian (2020). Home. [online] Available at: www.nallian.com [Accessed 18 Feb. 2021].[/footnote] Nallian for Air Cargo is a set of applications built on top of Nallian’s Open Data Sharing Platform. The platform allows all stakeholders of a cargo community to connect and share relevant data across their processes, resulting in de-duplication and a single version of the truth for the benefit of airport operators, ground handlers, freight forwarders, shippers, etc. Each data source stays in control of who sees which parts of its data and for which purpose. Example communities include Heathrow, Brussels and Luxembourg (e.g. Heathrow Cargo Cloud).[footnote]For more information see Heathrow (2020). Cargo. [online] Available at: www.heathrow.com/company/cargo [Accessed 18 Feb. 2021].[/footnote]

6. Pistoia Alliance

The Pistoia Alliance’s mission is to lower barriers to R&D innovation by providing a legal framework to enable straightforward and secure pre-competitive collaboration.[footnote]For more information see Pistoia Alliance (2020). About. [online]. Available at: www.pistoiaalliance.org/membership/about [Accessed 18 Feb. 2021].[/footnote] The Alliance is a global, not-for-profit members’ organisation conceived in 2007 and incorporated in 2009 by representatives of AstraZeneca, GSK, Novartis and Pfizer, who met at a conference in Pistoia, Italy.

The Pistoia Alliance’s projects help to overcome common obstacles to innovation and to transform R&D – whether identifying the root causes of inefficiencies, working with regulators to adopt new standards, or helping researchers implement AI effectively. There are currently more than 100 member companies – ranging from global organisations, to medium enterprises, to start-ups, to individuals – collaborating as equals on projects that generate value for the worldwide life sciences community.

7. Biobanks

Biobanks collect biological samples and associated data for medical-scientific research and diagnostic purposes and organise these in a systematic way for use by others.[footnote]For more information see UK Biobank (2020). Home. [online]. Available at: www.ukbiobank.ac.uk [Accessed 18 Feb. 2021].[/footnote] The UK Biobank is a registered charity that had initial funding of circa £62 million. Its aim is to improve the prevention, diagnosis and treatment of a wide range of serious and life-threatening illnesses such as cancer, heart disease and dementia.

UK Biobank was established by the Wellcome Trust medical charity, Medical Research Council, Department of Health, Scottish Government and the Northwest Regional Development Agency. It has also had funding from relevant charities. UK Biobank is supported by the National Health Service (NHS). Researchers apply to access its resources. The resource
is available to all bona fide researchers for all types of health-related research that is in the public interest. Researchers submit an application explaining what data they would like access to and for what purpose. The website provides summaries of funded research and academic papers.

Researchers have to pay for access to the resource on a cost-recovery basis for their proposed research, with a fixed charge for initiating the application review process and a variable charge depending on how many samples, tests and/or data are required for the research project.

  • UK Biobank remains the owner of the database and samples, but will have no claim over any inventions that are developed by researchers using the resource (unless they are used to restrict health-related research or access to health-care unreasonably).
  • Researchers granted access to the resource are required to publish their findings and return their results to UK Biobank so that they are available for other researchers to use for health-related research that is in the public interest.

The personal information of those joining the UK Biobank is held in strict confidence, so that identifiable information about them will not be available to anyone outside of UK Biobank. Identifying information is retained by UK Biobank to allow it to make contact with participants when required and to link with their health-related records. The level of access that is allowed to staff within UK Biobank is controlled by unique usernames and passwords, and restricted on the basis of their need to carry out particular duties.

8. Higher Education Statistics Agency

The Higher Education Statistics Agency (HESA) is the body responsible for collecting and publishing detailed statistical information about the UK’s higher education sector.[footnote]For more information see HESA (2020). About. [online] Available at: www.hesa.ac.uk/about[/footnote] It acts as a trusted steward of data that is made available and used by public-sector bodies including universities, public-funding bodies and the new Office for Students.

HESA was set up by agreement between funding councils, higher education providers and Government departments. It is a charitable company operating under a statutory framework and it is a recognised data source for ‘statistical information on all aspects of UK higher
education’.[footnote]HESA (2017). HE representatives comment on consultation on designated data body [online] hesa.ac.uk. Available at: www.hesa.ac.uk/news/19-10-2017/consultation-designated-data-body [Accessed 18 Feb. 2021].[/footnote] It was confirmed as a designated data body (DDB) for Higher Education in England in 2018.[footnote]See HESA (2020). Designated Data Body. [online]. Available at: www.hesa.ac.uk/about/what-we-do/designated-data-body [Accessed 18 Feb. 2021].[/footnote]

HESA collects, assures and disseminates higher education data on behalf of specific public bodies, e.g. the Department for Business, Energy and Industrial Strategy (BEIS), the Department for Education (DfE), the Office for Students (OfS), UK Research & Innovation (UKRI) and their counterparts in the rest of the UK. As DDB, it compiles appropriate information about higher education providers and courses and makes this available to the OfS, UKRI and the Secretary of State for Education. It consults providers, students and graduate employers on the information it publishes. The OfS holds HESA to account, reporting on its performance every three years.

HESA provides a trusted source of information, supporting better decision making, and promoting public trust in higher education. In addition, it is driven by the wider public purpose of advancing higher education in the UK.

It deploys statistical and open-data techniques to transform and present higher education data. It looks to develop low-cost techniques to improve quality and efficiency of data collection, and aims to ensure as much data as possible is open and accessible to all.

HESA may charge cost-based fees, operating on a subscription basis.

9. Safe Havens Scotland NHS Trusts for Patient Data

Safe Havens were developed in line with the Scottish Health Informatics Programme (SHIP), a blueprint that outlined a programme for a Scotland-wide research platform for the collation, management, dissemination and analysis of anonymised Electronic Patient Records (EPRs).[footnote]Scottish Government (2015). Charter for Safe Havens in Scotland: Handling Unconsented Data from National Health Service Patient Records to Support Research and Statistics. [online] www.gov.scot. Available at: www.gov.scot/publications/charter-safe-havens-scotland-handling-unconsented-data-national-health-service-patient-records-support-research-statistics/pages/3 [Accessed 18 Feb. 2021].[/footnote] The agreed principles and standards to which the Safe Havens are required to operate are set out in the Safe Haven Charter. They aim to obtain funding for research from grants.

The Safe Havens provide a virtual environment for researchers to securely analyse data without the data leaving the environment. Their data repositories provide secure handling and linking of data from multiple sources for research projects. They also provide research support, bringing together teams around health data science. The research coordinators provide support to researchers navigating the data requirements, permissions landscape and provide a mechanism to share the lessons from one project to the next. Users are researchers who are vetted and approved. Data is never released, and personal data cannot be sold. Together, the National Safe Haven within Scottish Informatics Linkage Collaboration (SILC)[footnote]For more information see Data Linkage Scotland (2020). Home. [online ] Available at: www.datalinkagescotland.co.uk [Accessed 18 Feb. 2021].[/footnote] and the four NHS Research Scotland (NRS) Safe Havens have formed a federated network of Safe Havens in order to work collaboratively to support health informatics research across Scotland.

All the Safe Havens have individual responsibility to operate at all times in full compliance with all relevant codes of practice, legislation, statutory orders and in accordance with current good professional practice. Each Safe Haven may also work independently to provide advice and assistance to researchers as well as secure environments, to enable health informatics research on the pseudonymised research datasets they create. The charter and the network facilitate collaboration between the Safe Havens by ensuring that they all work to the same principles and standards.

Problems and opportunities addressed by corporate and contractual mechanisms

Many organisations have started to explore data sharing via the use of contracts, and this model is already used in practice. The complexity of the governance model will vary depending on whether the relationships involved are one-to-one or multi-party data-sharing arrangements, and whether there is a single use case or multiple uses for the same type of purpose. Where tools such as machine learning or AI become part of the agreement, further consideration is needed in defining the architecture of the legal mechanisms involved.

Multi-party and multi-use scenarios using corporate and contractual mechanisms will need to ensure an independent governance body is able to function within the structure. The role of the specific parties involved in the data ecosystem, their responsibilities, qualifications and potential competing interests will need to be considered and balanced. A difficult question emerges where the stewardship entity is absent. In this scenario, who would be the data steward that a contract could be entered into with? For example, an oversight committee composed of representatives of data users and providers could be established, but this would not be a legal entity with an ability to contract.

Other requirements that will need thoughtful consideration, as mentioned throughout this chapter, relate to the privacy and security of the data, retention and deletion policies, restrictions on use and onward transfers, and rules on publication of results or research.

To conclude, a series of steps need to be walked through with stakeholders to reach an agreed decision about the model to be employed. Concrete use cases are more likely to generate tangible and efficient mechanisms for the sharing of data than vague overarching statements of general purpose. The key element here is stakeholder engagement: the more engagement that can be encouraged at the design stage – in terms of purpose, structure and governance – the more likely it is that a data-sharing venture will succeed.

Case study: The Social Data Foundation

Brief overview

The Social Data Foundation[footnote]Boniface, M., Carmichael, L., Hall, W., Pickering, B., Stalla-Bourdillon, S. and Taylor, S. (2020). A Blueprint for a Social Data Foundation: Accelerating Trustworthy and Collaborative Data Sharing for Health and Social Care Transformation. [online] Available at: https://southampton.ac.uk/~assets/doc/wsi/WSI%20white%20paper%204%20social%20data%20foundations.pdf [Accessed 18 Feb. 2021].[/footnote] aims to improve health and social care by accelerating access to linked data from citizens, local authorities and healthcare providers, through the creation of an innovative, trustworthy and scalable data-driven health and social-care ecosystem overseen by independent data stewards (i.e. the Independent Guardian).[footnote]The Independent Guardian is defined as follows: ‘A team of experts in data governance, who are independent from the Social Data Foundation Board and oversee the administration of the Social Data Foundation to ensure it achieves its purposes in accordance with its rulebook i.e. that all data related activities realise the highest standards of excellence for data governance. In particular, the Independent Guardian shall (i) help set up a risk-based framework for data sharing, (ii) assess the use cases in accordance with this risk-based framework and (iii) audit and monitor day-to-day all data-related activities, including data access, citizen participation and engagement.’ See Boniface, M. et al. (2020) A Blueprint for a Social Data Foundation.[/footnote] This new data institution takes a socio-technical approach to governing collaborative and trustworthy data linkage, and endeavours to support multi-party data sharing while respecting societal values endorsed by the community. Members of the Social Data Foundation will include Southampton City Council, University Hospital Southampton NHS Foundation Trust and the University of Southampton. Flexible membership is envisaged in order to allow other organisations to join and the institution to grow.

Governance

A key strength of the Social Data Foundation lies in its socio-technical approach to data governance, which necessitates a high level of interdisciplinarity and strong stakeholder engagement from the outset (i.e. from the initial stages of design and development). This initiative therefore brings together a multi-disciplinary team of clinical and social-care practitioners alongside experts in data governance, health data science and security, drawn from ethics, law, technology and innovation, web science and digital health.

The Social Data Foundation builds on the data foundations governance framework[footnote]Stalla-Bourdillon, S., Wintour, A. and Carmichael, L. (2019). Building Trust Through Data Foundations: A Call for a Data Governance Model to Support Trustworthy Data Sharing. [online] Available at: https://cdn.southampton.ac.uk/assets/imported/transforms/content-block/UsefulDownloads_Download/69C60B6AAC8C4404BB179EAFB71942C0/White%20Paper%202.pdf [Accessed 18 Feb. 2021]. The Social Data Foundation is an example of a functional data foundation – for more information see: Stalla-Bourdillon, S., Carmichael, L. and Wintour, A. (Forthcoming). Fostering trustworthy data sharing: Establishing data foundations in practice. Data & Policy; Stalla-Bourdillon, S., Carmichael, L. and Wintour, A. (2020, September). Fostering Trustworthy Data Sharing: Establishing Data Foundations in Practice. Data for Policy Conference 2020. Available at: http://doi.org/10.5281/zenodo.3967690 [Accessed 18 Feb. 2021].[/footnote] developed by the Web Science Institute at the University of Southampton (UK) and Lapin Ltd (Jersey), which includes robust governance mechanisms together with strong citizen representation. Foundations law is a source of inspiration for the data foundations governance framework. Two jurisdictions of particular interest are the Bailiwicks of Jersey and Guernsey (the Channel Islands), where the role of the guardian is a requirement peculiar to these types of structure and, in a data governance model, gives rise to independent data stewardship.[footnote]Note that all foundations incorporated under Jersey foundations law must have a guardian.[/footnote]

Data rights

The Social Data Foundation will not only empower citizens to co-create and participate in health and social care systems transformation, but also to exercise their data-related rights. As a trusted third party intermediary (TTPI) that facilitates shared data-analysis projects, the Social Data Foundation will provide a centralised hub for citizens and their data-related requests in relation to a wide range of data (re)usage activities. Agreements will govern relationships between all stakeholders.

The Social Data Foundation will promote adequate data protection and security, and will carry out a risk assessment for each shared data analysis project before any data is shared. Data providers will only share de-identified data as part of the Social Data Foundation. Each of the parties will undertake not to seek to reverse or circumvent any such de-identification of data. Where the Social Data Foundation provides a dynamic linking service[footnote]A dynamic linking service is understood as one where two or more sources of health and social care data are brought together on demand, according to the specific parameters of an authorised data user’s query, where the risk of re-identification is evaluated both before and after data linkage and mitigated through assurance processes facilitated by the Data Foundation.[/footnote] for authorised data users, and data at rest remains within data providers’ premises, citizens are better empowered to exercise their rights over data linkage activities and to oppose, restrict or end their participation in the processing activities.
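To illustrate the dynamic linking idea described above (and not the Foundation’s actual design), the following is a minimal, hypothetical Python sketch in which two de-identified datasets are joined per query on a shared pseudonym and only a count is released, with a crude small-count check standing in for the re-identification risk assessment. The field names and threshold are invented for the example.

```python
MIN_RESULT_SIZE = 10  # illustrative threshold for refusing risky releases

def linked_count(health_records, social_care_records, predicate):
    """Link two de-identified datasets on 'pseudonym' for a single query and
    return only the count of linked records satisfying the analyst's predicate;
    row-level data never leaves the environment (illustrative sketch)."""
    care_by_pseudonym = {r["pseudonym"]: r for r in social_care_records}
    matches = sum(
        1
        for h in health_records
        if h["pseudonym"] in care_by_pseudonym
        and predicate({**h, **care_by_pseudonym[h["pseudonym"]]})
    )
    # Crude risk check: small non-zero counts could single people out.
    if 0 < matches < MIN_RESULT_SIZE:
        raise PermissionError("Count too small to release: re-identification risk")
    return matches
```

In a real safe-haven setting the risk assessment would be far richer than a single threshold, and would be combined with the vetting, audit and assurance processes described elsewhere in this report.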

Case study: Emergent Alliance

Brief overview

The Emergent Alliance initiative was launched in April 2020 with the aim of aiding societal recovery post COVID-19.[footnote]For more information see Emergent Alliance (n.d). Home. [online]. Available at: https://emergentalliance.org [Accessed 18 Feb. 2021].[/footnote] Its objectives are to use data in order to accelerate global economic recovery in response to the outbreak, to make datasets available in the public domain and to develop secure data-sharing systems and infrastructure.[footnote]See Emergent Alliance (2020). Articles of Incorporation, p. 16. Available at: https://find-and-update.company-information.service.gov.uk/company/12562913/filing-history [Accessed 18 Feb. 2021].[/footnote]

The Emergent Alliance operates as a not-for-profit voluntary community made up of corporations, individuals, NGOs and government bodies that ‘contribute knowledge, expertise, data, and resources to inform decision making on regional and global economic challenges to aid societal recovery.’[footnote]See Emergent Alliance (n.d.). Frequently Asked Questions. Available at: https://emergentalliance.org/?page_id=440 [Accessed 18 Feb. 2021][/footnote]

Different roles exist within this community. Data contributors (either members of the alliance or participants in the community) make agreed datasets available in the public domain. Data scientists interpret or model the data, with resources coming from members or crowd-sourced from partners. Individuals or organisations may also bring domain-based problems to the alliance, or respond to them, contributing datasets, data science or technical resources.

Governance

This case study is based on information from September 2020, and the Emergent Alliance’s legal structure has progressed significantly since then. Initially, the governance structure was operating on the basis of Articles of Association, and using ‘letters of intent’ from members to govern the alliance.[footnote]See Emergent Alliance (n.d), Statement of Intent. Available at: https://emergentalliance.org/?page_id=452 [Accessed 18 Feb. 2021].[/footnote] Two directors were appointed, and the structure was designed to allow different committees to be formed in order to carry out the set objectives.

Mock case study: Greenfields High School

Greenfields High School is increasingly using digital technologies to deliver teaching materials and improve educational processes. It uses different service providers, which are used by other schools as well. On the one hand, Greenfields High School is interested in comparing its performance with other schools, and in gaining access to data and insights from its service providers. On the other hand, it is interested in learning from the other schools’ experience, and in sharing data to understand the effectiveness of different learning tools and methods.

Greenfields High School is not the only one in this situation. Other schools using online tools are interested in the same goal: to get better insights from the different service providers, to compare performance and to learn from other schools about which tools are most effective for delivering better educational outcomes. They all need data from the different service providers, and from each other, to reach these goals, which ultimately serve the wider public benefit of improving education. Greenfields High School proposes to the other schools’ leadership boards that they convene and explore the idea of working together. They also invite their service providers and start discussing a data-sharing agreement that enables a trustworthy environment where each party feels confident to share data with the others.

An independent data steward is appointed in order to ensure the proper management of data and to oversee who gets to access what type of data and under which conditions. The data-governance framework also takes into account the rights and interests of students, parents and teachers. The agreement establishes rules for:
• schools to safely and reliably exchange relevant data among themselves, to compare their performance against that of other schools, by sharing some types of data
• schools to share data, to understand the effectiveness of different learning tools and methods for different educational cycles by comparing student progress (schools keep records of educational data for all pupils for a number of years to track progress)
• a transparent agreement about what data is collected, stored and processed and how it is used, including rules for safeguarding students’ and parents’ rights and interests.

How would contractual mechanisms work?

Data-sharing agreements are set up with a very clear purpose in mind, and the rules and documents could be made public to increase transparency.

An independent data steward is appointed and oversees data management. The governance framework contains provisions around who will be permitted access to data, for what purpose and under what circumstances (a simplified sketch of how such rules might be encoded follows below). The governance arrangements will include mechanisms for enforcing compliance and ensuring that data users have adequate remedies if compliance fails.

The stakeholders could establish a company limited by guarantee (CLG) to fulfil these roles, with its members being participating schools – both state and private, academies, further education bodies and data providers.
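By way of illustration only, the governance provisions just described (who may be permitted access to which data, for what purpose and under what circumstances) could be captured in a machine-readable policy that the data steward applies to each request. The roles, purposes and data categories below are hypothetical and are not drawn from the mock case study’s agreement.

```python
# Hypothetical sketch: the agreement's access rules expressed as data that a
# steward can check for each request. Roles, purposes and data categories are
# invented for illustration; a real agreement would define its own.

ACCESS_RULES = {
    # data category -> {role: permitted purposes}
    "attainment_summary": {"school": {"benchmarking"}},
    "tool_usage_stats": {
        "school": {"tool_evaluation"},
        "service_provider": {"service_improvement"},
    },
    "pupil_level_records": {},  # never shared outside the originating school
}

def request_allowed(role: str, purpose: str, data_category: str) -> bool:
    """Return True only if the agreement permits this role to use this
    category of data for this purpose."""
    permitted = ACCESS_RULES.get(data_category, {})
    return purpose in permitted.get(role, set())

# Example checks
assert request_allowed("school", "benchmarking", "attainment_summary")
assert not request_allowed("service_provider", "marketing", "tool_usage_stats")
```

Encoding the rules in this way does not replace the contract: it simply makes the agreed permissions auditable and consistently enforceable by the steward.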

Final remarks and next steps

This report makes a first attempt to answer the question of how legal mechanisms can help enable trustworthy data use and promote responsible data stewardship. Trustworthy and responsible data use are seen as key to protecting the data rights of individuals and communities, increasing organisations’ confidence in data sharing and unlocking the benefits of data in a way that is fair, equitable and focused on societal benefit.

The legal mechanisms suggested in this report may offer support for encouraging fair and trusted data sharing where individuals and organisations retain control over the use of their data for their own benefit, and often for wider societal good. At the same time, it is important to highlight that responsible data stewardship should not be equated in all circumstances with data sharing, and that responsible data use may sometimes necessitate a decision not to share data.

Responsible data use also means robust data-governance architectures that allow for a participatory element in taking decisions about data. It remains to be seen whether the demand for transformation of data practices will be driven bottom-up, top-down or by a mixture of both. The mechanisms presented here may form part of the triggers that increase the confidence of individuals to hand over the management of their data, and of organisations to break down data silos and encourage beneficial uses.

As experience in the digital-platform economy demonstrates, the commodification of data use may ultimately undermine individual or societal interests. For this reason, careful consideration is needed of whether introducing financial incentives to encourage people to join a data trust or a data cooperative would risk creating an even greater dependency on how efficiently data is exploited, as the economic performance of the company would translate directly into the financial rewards those individuals receive.[footnote]For a more detailed description of this failure mode and others see Porcaro, K. (2020). Failure Modes for Data Stewardship. [online] Mozilla Insights. Available at: https://drive.google.com/file/d/1twxDGIBYz0TyM3yHDgA8qyf16Ltkk4V7/view [Accessed 18 Feb. 2021].[/footnote]

Extractive data practices have proven successful in maximising the economic performance of some of the big technology companies on the market, even though these problematic business models are criticised today. Open questions therefore remain around the incentive models for establishing the structures presented in this report, and the extent to which such incentives can be considered empowering and truly capable of driving the transformation of data practices.

Importantly, in considering these alternative mechanisms, the benefits they generate as institutions – rather than as relationships between parties – are vital. As digital technologies advance and patterns of data use shift, the rules and principles on which civic institutions are founded can act as a stabilising force for collective good. Further exploration is needed of what democratic accountability would look like for more effective control, compared with the type of control contractual interactions offer.

Remaining challenges

A number of challenges and difficult questions have been pointed out throughout the report, and more issues will arise from the digital challenges that we face today. For example, while the different mechanisms presented here imply structures that offer considerable flexibility, further questions remain regarding how they are able to respond in the context of the new Internet of Things ecosystem, where data sharing is part of everyday life, in real time.

At the same time, the same type of mechanism may be seen as the solution to distinct problems. For example, some groups might be interested in increasing the amount of data gathered, while others’ interests may centre on increasing, or decreasing, the amount of data shared.[footnote]See O’Hara, K. (2020). Data Trusts.[/footnote] If the same mechanism is used to respond to such different objectives, what are the potential tensions and how can they be addressed?

Moreover, there is the question of dealing with potential conflicts arising between trusts, cooperatives, and corporate and contractual models. These models will control overlapping data, and this could create tensions between structures of the same type (for example between different data trusts), as well as between different types of structure (for example a data trust in a rivalrous relationship with a data cooperative whose interests are opposed).

These models should not be seen as self-contained containers, and important questions arise from interactions between the different types of structure presented. For example, what types of intervention will be needed in order to address potential conflicts between the different structures? How will data rights be enforced when datasets are potentially combined across such structures?

This leads to questions around identifying ways in which more granular mechanisms for data protection can be built in, and how to strengthen existing regulation. The structures presented here are not meant as enclaves of protection, so a strong underlying data protection layer is essential for preventing harm and achieving responsible outcomes.

There is also an important conversation to be had about how legal mechanisms and other types of mechanism, such as technical ones (for example data passports and others briefly described in Annex 1), might interact with or reinforce data stewardship.

Other difficult questions that need further research and consideration include:
• How will different privacy standards apply in certain situations, for instance if the data is stored by a merchant located outside of the UK (or the EU)?
• How can the challenges related to ensuring the independence of different governance boards be addressed?
• What are the limitations of each legal mechanism presented? For example, in a contractual model where a stewardship entity is absent, who would be the data steward that a contract could be entered into with? (An oversight committee composed of representatives of data users and providers could be established, but this would not be a legal entity with an ability to contract.)
• What are the implications for the transferability or mandatability of GDPR rights in light of the Data Governance Act?
• Would a certification scheme similar to B Corps provide value for certifying data stewardship structures?[footnote]B Corps are companies balancing profit with societal outcomes, which receive a certification based on social and environmental performance, public transparency and accountability. For more information see B Corporation (n.d.) About B Corps. [online]. Available at: https://bcorporation.net/about-b-corps [Accessed 18 Feb. 2021].[/footnote]
• Could these models be used for handling other types of assets?

On a broader scale, in the context of data sovereignty or data nationalism, where increasing numbers of countries insist that the personal data of their nationals be stored on servers in that jurisdiction, the demands of data governance are likely to increase going forward. If data contexts involve data from nationals of more than one jurisdiction, managing data across jurisdictions would involve complex administration requiring sufficient income to support it.

Notwithstanding the aim to facilitate trusted data sharing that results in wider societal, economic and environmental benefit, there remains the broader societal question of what we want societies to do with data, and towards which positive ambitions we are aspiring in practice.

Next steps

As observed from the list of case studies, some of the legal mechanisms are in existence and available for immediate operation. Important lessons can be drawn from these examples, but there remains an overarching need for more testing, development, investment and knowledge building. Other mechanisms, such as data trusts, represent a novel and unexplored model in practice and require piloting and better understanding.

Next steps would involve practical implementation of each approach, research and trialling, and developing guidance for practitioners. Challenges created by the global public health emergency caused by COVID-19, as well as developments on the geopolitical side (such as the UK leaving the European Union and new trade agreements being discussed) and on the technological side (for example new data sources and new ways of processing data), trigger the need for robust data-sharing structures where data is stewarded responsibly.

This creates an opportunity for the UK to take the lead in shaping the emerging data-sharing ecosystem by investing in alternative approaches to data governance. The mechanisms presented in this report offer a starting point for consolidating responsible and trustworthy data management, and a way towards establishing best practices and innovative approaches that can be used as reference points more globally.


Annexes

Annex 1

The legal mechanisms presented in this report support organisational solutions to collective action problems with data, and can be complemented by norms and rules for data stewardship and technology.

Examples of these complementarities include regulatory mechanisms, like the General Data Protection Regulation and the European Commission’s proposed Data Governance Act (which envisions data-sharing intermediaries and mechanisms for ‘data for the common good’ or data altruism).

By way of illustration, some of the key GDPR considerations for data providers that apply across all the legal mechanisms described in this report include:
1. ensuring that the data sharing is lawful and fair which, in addition to not being in breach of other laws, will include establishing a lawful basis under GDPR, such as:
a. the ‘legitimate interests’ basis, which requires the data provider to satisfy itself, via a three-part test and documented Legitimate Interests Assessment, that the data sharing is necessary to achieve legitimate interests of the data provider or a third party and that these interests are not overridden by the rights and interests of the data subjects; or
b. that the data provider has the consent of the data subjects to share the data, which may be impractical or difficult to achieve, particularly for legacy data; and, to the extent that the data is ‘special category data’ (such as health data), whether one of the limited conditions for sharing such data is satisfied, e.g. necessary for scientific research;
2. whether the principle of transparency has been satisfied in terms of informing data subjects of the specific disclosure of their data to, and use of their data by, the data-sharing venture;
3. whether processing of the data by the venture is incompatible with the original purposes for which the data provider collected and processed the data and thereby in breach of GDPR’s ‘purpose limitation’ principle;
4. ensuring that the shared data is limited to what is necessary for the purposes for which the venture will process it (the ‘data minimisation’ principle);
5. ensuring that the data is accurate and where necessary kept up to date (‘accuracy’);
6. ensuring that the data will not be retained in a form that permits identification of the data subjects for any longer than necessary;
7. conducting due diligence on the data security measures established to protect data contributed to the venture;
8. ensuring that there is a mechanism in place enabling data subjects to exercise their rights of data access, rectification, erasure, portability and the right to object, including the right not to be subject to automated decision-making (‘rights’);
9. identifying any cross-border transfers of the data, or remote access to the data from outside the UK, and ensuring that such transfers or access are conducted in compliance with one of the mechanisms under GDPR; and
10. ensuring that all accountability requirements under GDPR are satisfied where appropriate, including Data Protection by Design and Default, Data Protection Impact Assessments, an Appropriate Policy Document, a Record of Processing Activities and mandatory contractual requirements.[footnote]The Information Commissioner’s Office (ICO) published a draft Data Sharing Code of Practice that covers many of the above requirements, including expectations in terms of data sharing agreements. See Information Commissioner’s Office (2020). ICO publishes new Data Sharing Code of Practice. [online] Available at: https://ico.org.uk/about-the-ico/news-and-events/news-and-blogs/2020/12/ico-publishes-new-data-sharing-code-of-practice[/footnote]

Other complementarities could be technical mechanisms, such as Decidim, a digital platform for citizen participation[footnote]For more information see https://decidim.org[/footnote] – mechanisms that are also being explored as part of the Open Data Institute programme[footnote]See Thereaux, O. and Hill, T. (2020). Understanding the common technical infrastructure of shared and open data. [online] theodi.org. Available at: https://theodi.org/article/understanding-the-common-technical-infrastructure-of-shared-and-open-data [Accessed 18 Feb. 2021][/footnote] – or the Alan Turing Institute’s framework on Data safe havens in the cloud,[footnote]Alan Turing Institute (n.d.). Data safe havens in the cloud. [online] The Alan Turing Institute. Available at: www.turing.ac.uk/research/research-projects/data-safe-havens-cloud [Accessed 18 Feb. 2021].[/footnote] and the UK Anonymisation Network (UKAN) methodology for Data Situation Audits, part of the Anonymisation Decision-Making Framework.[footnote]See UK Anonymisation Network (UKAN), Anonymisation Decision-Making Framework.[/footnote] Together, these are the building blocks of a trustworthy institutional regime for data governance that could unlock the value of data.

There are also governance mechanisms that are starting to show what might work. For example, the participatory data governance mechanisms deployed in Genomics England[footnote]For more information see: Genomics England (n.d.) Home. [online]. Available at: www.genomicsengland.co.uk[/footnote] or The Good Data[footnote]For more information see: TheGoodData (2020). Home. Available at: www.thegooddata.org[/footnote] mean that members can participate in the decision-making process and realise the potential of good data stewardship. Furthermore, work highlighted by researchers such as Salomé Viljoen and research institutes such as the Bennett Institute for Public Policy shows there are also institutional mechanisms which can be used to improve the stewardship of data.[footnote]Viljoen, S. (2020). Democratic Data: A Relational Theory For Data Governance. [online] Available at: https://doi.org/10.2139/ssrn.3727562 [Accessed 18 Feb. 2021]; Coyle, D. et al. (2020) Valuing data.[/footnote] The rules in place, the choice of collaboration and how this translates into contractual terms constitute the ‘institutional framework’ within which organisational forms operate. This report speaks to the possibilities of how these organisational structures, and the associations between them, might take shape.

Other complementarities could be codes of practice or ethical codes, together with social arrangements that create pressure to abide by the rules (e.g. being excluded from the group and denied access to the data). For example, aside from contractual terms, different legal structures might also have a rulebook or code of conduct that sets out the obligations of the data providers and data users, including those relating to GDPR. This could form a formal code of conduct under GDPR. The UK Information Commissioner’s Office (ICO) is keen to incentivise such codes. If such a code was created in compliance with the GDPR and approved by the ICO, there is the potential to create a standard-form Rulebook that could be used by other similar data models.

There are, however, certain requirements that would need to be complied with. The code must have a clear purpose and scope. It would have to be prepared and submitted by a body representative of the categories of data controllers and data processors involved. The code would need to meet the particular needs of the sector or processing activities and address a clearly identified problem. It would need to facilitate the application of GDPR and be tailored to the sector – in other words, add value through clear, specific solutions and go beyond mere compliance with the law. Any amendments would need to be approved by the ICO.

It is also important to note the ICO’s efforts in establishing regulatory sandboxes to enable companies to test new data innovations and technologies – including data-sharing projects – in a safe and controlled environment, while receiving privacy and regulatory guidance. Such regulatory sandboxes provide an interesting tool to promote data sharing for the benefit of individuals and society, while minimising risks to people’s privacy, security and human rights.

Annex 2: EU data economy regulation

Background information

Between 1960 and 1980, public concerns around automation increased around the world. In Europe, member states were facing challenges around computerisation, predominantly in public administration, and started adopting different data-protection rules. The first efforts to harmonise data-protection rules followed, leading to the adoption of Directive 95/46/EC (the Data Protection Directive) on personal data protection, which entered into force in 1995.[footnote]Directive 95/46/EC of the European Parliament and of the Council of 24 October 1995 on the protection of individuals with regard to the processing of personal data and on the free movement of such data. Available at: https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=celex%3A31995L0046[/footnote]

The two main objectives of the Data Protection Directive were to protect fundamental rights and freedoms of individuals, and to focus on the free movement of personal information as an important component of the internal market. Therefore, the adoption of European data protection legislation is rooted in the internal market and integration efforts.

With the consolidation of individual rights in the EU in the Charter of Fundamental Rights, which entered into force in 2009, the right to personal data protection was recognised as a right distinct from the right to privacy. The right to data protection is enshrined in Article 8 of the Charter of Fundamental Rights of the European Union (the Charter) and in Article 16 of the Treaty on the Functioning of the European Union (TFEU). Thus, the EU’s competence to enact the Data Protection Directive was an internal market one.

In 2015, building on early harmonisation and integration efforts, the European Commission adopted the Digital Single Market (DSM) Strategy, which set the goal to develop a European data economy.[footnote]European Commission (2015). A Digital Single Market Strategy for Europe. [online]. Available at: https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=COM%3A2015%3A192%3AFIN[/footnote] This means creating a common market across member states that eliminates impediments to transnational online activity in order to foster competition, investments and innovation:

‘A Digital Single Market is one in which the free movement of goods, persons, services and capital is ensured and where individuals and businesses can seamlessly access and exercise online activities under conditions of fair competition, and a high level of consumer and personal data protection, irrespective of their nationality or place of residence.’

The Digital Agenda talks about better access to online goods and services, high-speed, secure and trustworthy infrastructures and investment in cloud computing and big data.[footnote]Ibid.[/footnote] For these purposes a number of regulatory interventions were proposed, such as consumer protection laws, the reform of the telecommunications framework, a review of the privacy and data protection in electronic communications law, and new rules for ensuring the free flow of data.

In 2018, the General Data Protection Regulation entered into force after a two-year transition period.[footnote] Regulation 2016/679 of the European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data (General Data Protection Regulation). Available at: https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX%3A32016R0679[/footnote] The regulation updates data-protection measures while maintaining the same two goals as the 1995 Data Protection Directive: strengthening individual rights and enabling the free flow of data in the EU internal market.

Another relevant regulation adopted in 2018 was the Regulation on the free flow of non-personal data. It aims to ensure that data processing increases productivity, creates new opportunities and supports the development of the data economy in the Union.[footnote] Recital 2 of Regulation 2018/1807 on a framework for the free flow of non-personal data in the European Union. Available at: https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX:32018R1807[/footnote] It aims to achieve these goals by prohibiting data-localisation requirements in member states (except on national-security grounds) and by countering vendor lock-in practices in the private sector. It also includes rules supporting data portability and interoperability as a way to ensure data mobility within the EU, increase competition and foster innovation. The Regulation is intended to deal only with anonymised and aggregate datasets, such as those used for big-data analytics, farming-related data and industrial production data (e.g. data on maintenance for industrial machines).

On 19 February 2020, the European Commission published the EU Data Strategy,[footnote] See European Commission (2020). A European strategy for data.[/footnote] along with a white paper on artificial intelligence[footnote] European Commission (2020c). On Artificial Intelligence – A European approach to excellence and trust. [online] Available at: https://ec.europa.eu/info/sites/info/files/commission-white-paper-artificial-intelligence-feb2020_en.pdf [Accessed 18 Feb. 2021][/footnote] and a communication on shaping Europe’s digital future.[footnote] European Commission (2020e). Shaping Europe’s Digital Future. [online]. Available at: https://ec.europa.eu/info/sites/info/files/communication-shaping-europes-digital-future-feb2020_en_4.pdf [Accessed 18 Feb. 2021].[/footnote] The European Commission supports a ‘human centric approach’ to technological development and the creation of ‘EU-wide common, interoperable data spaces […] overcoming legal and technical barriers to data sharing across organisations.’[footnote] European Commission (2020). A European strategy for data.[/footnote]

Annex 3: RadicalxChange’s Data Coalitions

This is a conceptual model that incorporates elements of all three legal mechanisms presented in this report.

The RadicalxChange Foundation is a non-profit ‘envisioning institutions that preserve democratic values in a rapidly-changing technological landscape’,[footnote]Posner, E. A. and Weyl, E. G. (2018). Radical Markets: Uprooting Capitalism and Democracy for a Just Society. Princeton University Press. Available at: https://doi.org/10.2307/j.ctvc77c4f[/footnote] premised on the idea that data is essentially associated with groups, not individuals. If value comes from network effects, they ask, who owns the network? Social graphs of individuals necessarily contain information about a network of others; most records, such as emails and calendar entries, also refer to others; and any data about one individual may be used to create a predictive profile of others. On this account, when it comes to correcting imbalances and asymmetries of power, privacy is a red herring.[footnote]See RadicalxChange Foundation’s Data Freedom Act. Available at: www.radicalxchange.org/kiosk/papers/data-freedom-act.pdf [Accessed 18 Feb. 2021].[/footnote]

To that end, RadicalxChange proposes data coalitions, which are fiduciaries for their members, but would require legislation, new regulation and an oversight board (in the US context). The problem they are meant to solve is that data subjects have less bargaining power with data consumers because the data they supply overlaps in content with that of other individuals. A data coalition would in effect bargain for all its members, aggregating and thereby increasing their influence. In this respect, they are intended to play a similar role to bottom-up data trusts.[footnote]Delacroix, S. and Lawrence, N. D. (2019). ‘Bottom-up data Trusts’[/footnote]

Governance: RadicalxChange envisages a Data Relations Board created by legislation with quasi-judicial powers to administer the area. A data coalition would legally interpose between individuals and data consumers to negotiate terms of use, privacy policies, etc. Governance would be democratic through the membership. Decisions would have to be binding on all members.

Data rights: To become a member, individuals would assign exclusive rights to use (some of) their data to the coalition (e.g. assigning exclusive rights to all their browsing data). The coalition would then negotiate with data consumers for the use of the data. The coalition’s rights to data would be defined contractually, and the board would ensure that the relevant data could not be collected by another entity, except through the coalition. Rights to the use of data could never be transferred permanently to a data consumer. Members could leave, and take their data with them, perhaps to an alternative coalition.

The outcome of a successful initiative would be not unlike the ambitions of the UK Government’s Smart Data Initiative.[footnote]UK Department for Business, Energy and Industrial Strategy (2019). Smart Data: Putting consumers in control of their data and enabling innovation. [online] Gov.uk. Available at: https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/808272/Smart-Data-Consultation.pdf [Accessed 18 Feb. 2021].[/footnote]

Sustainability of the initiative: Given the legal framework the idea requires, it would be sustainable if there were enough business to support a coalition. The proposed business model is that the coalition makes money from the data and passes a proportion of the profits on to its members. The model is, however, still on the drawing board and presumes an objective to share profits with members proportionally. The legal framework itself is unlikely to emerge in the near term.


This report was authored by Valentina Pavel.

Preferred citation: Ada Lovelace Institute. (2021). Exploring legal mechanisms for data stewardship. Available at: https://www.adalovelaceinstitute.org/report/legal-mechanisms-data-stewardship/

Image credit: Jirsak



A report proposing a ‘framework for participatory data stewardship’, which rejects practices of data collection, storage, sharing and use in ways that are opaque or seek to manipulate people, in favour of practices that empower people to help inform, shape and – in some instances – govern their own data.

Executive summary

Well-managed data can support organisations, researchers, governments and corporations to conduct lifesaving health research, reduce environmental harms and produce societal value for individuals and communities. But these benefits are often overshadowed by harms, as current practices in data collection, storage, sharing and use have led to high-profile misuses of personal data, data breaches and sharing scandals.

These range from the backlash to Care.Data,[1] to the response to Cambridge Analytica and Facebook’s collection and use of data for political advertising.[2] These cumulative scandals have resulted in ‘tenuous’ public trust in data sharing,[3] which entrenches public concern about data and impedes its use in the public interest. To reverse this trend, what is needed is increased legitimacy, and increased trustworthiness, of data and AI use.

This report proposes a ‘framework for participatory data stewardship’, which rejects practices of data collection, storage, sharing and use in ways that are opaque or seek to manipulate people, in favour of practices that empower people to help inform, shape and – in some instances – govern their own data.

As a critical component of good data governance, it proposes data stewardship as the responsible use, collection and management of data in a participatory and rights-preserving way, informed by values and engaging with questions of fairness.

Drawing extensively from Sherry Arnstein’s ‘ladder of citizen participation’[4] and its more recent adaptation into a spectrum,[5] this new framework is based on an analysis of over 100 case studies of different methods of participatory data stewardship.[6] It demonstrates ways that people can gain increasing levels of control and agency over their data – from being informed about what is happening to data about themselves, through to being empowered to take responsibility for exercising and actively managing decisions about data governance.

Throughout this report, we explore – using case studies and accompanying commentary – a range of mechanisms for achieving participatory decision-making around the design, development and use of data-driven systems and data-governance frameworks. This report provides evidence that involving people in the way data is used can support greater social and economic equity, and rebalance asymmetries of power.[7]

It also highlights how examining different mechanisms of participatory data stewardship can help businesses, developers and policymakers to better understand which rights to enshrine, in order to contribute towards the increased legitimacy of – and public confidence in – the use of data and AI that works for people and society.

Focusing on participatory approaches to data stewardship, this report provides a complementary perspective to Ada’s joint publication with the AI Council, Exploring legal mechanisms for data stewardship, which explores three legal mechanisms that could help facilitate responsible data stewardship.[8]

We do not propose participatory approaches as an alternative to legal and rights-based approaches, but rather as a set of complementary mechanisms to ensure public confidence and trust in appropriate uses of data, and – in some cases – to help shape the future of rights-based approaches, governance and regulation.

Foreword

Companies, governments and civil-society organisations around the world are looking at ways to innovate with data in the hope of improving people’s lives through evidence-based research and better services. Innovation can move quickly, but it is important that people are given opportunities to shape it. This is especially true for areas where public views are not clearly settled.

This new report from the independent research institute and deliberative body the Ada Lovelace Institute provides practical examples of how to engage people in the governance of data through participatory data stewardship. The report shows that there are choices about when and how to bring people into the process – from data collection, to linkage, to data analysis.

One of the most frustrating experiences for people is when they are told they will have the power to shape something, but find in fact that consultation is very limited. To mitigate this, the report helpfully distinguishes between different kinds of involvement, with a spectrum ranging from ‘inform’ all the way to ‘empower’.

As a society we are seeking to chart a way between a data ‘free for all’, where people feel powerless about how data is being used, and a situation where opportunities for beneficial innovation and research are lost because data is not shared and used effectively. Involving people in data stewardship is an accountability mechanism that can build trustworthiness, which in turn supports innovation.

Participatory data stewardship provides a practical framework and case studies, to demonstrate how citizens can participate in shaping the way that data is being used. I hope businesses, policymakers and leaders of organisations will take inspiration from it, and generate a new set of use cases that we can continue to share in the future.

Data innovation creates new opportunities and challenges that can take us beyond agreed social conventions. To make the most of the opportunities it is therefore imperative that people’s voices are heard to shape how we use data.

Hetan Shah
Vice-Chair, Ada Lovelace Institute
Chief Executive, The British Academy

Introduction

To understand how participatory data stewardship can bring unique benefits to data governance, we need to understand what it is and how it can be used.

Why participatory data stewardship matters

Organisations, governments and citizen-led initiatives around the world that aspire to use data to tackle major societal and economic problems (such as the COVID-19 pandemic) face significant ethical and practical challenges, alongside ‘tenuous’ public trust in public- and private-sector data sharing.[9] To overcome these challenges, we will need to create robust mechanisms for data governance, and participatory data stewardship has a distinct role to play in their development.

Traditionally, data governance refers to the framework used to define who has authority and control over data and how that data may be used.[10] It commonly includes questions around data architecture, security, documentation, retention, and access. Many organisations that use data or build data-driven systems implement a data-governance framework to guide their decision-making.

Institutions including data cooperatives, data trusts, data-donation systems and trusted research environments are designed to govern the use of beneficiaries’ data. Currently, private-sector data-governance approaches often do not address the concerns of the beneficiaries of data, and do not encourage those using that data to consider how their choices can best support the needs of those who will be affected by their decisions.

The report provides evidence that involving people (‘beneficiaries’) in the design, development and deployment of data governance frameworks can help create the checks and balances that engender greater societal and economic equity, can help to rebalance asymmetries of power, and can contribute towards increased public confidence in the use of data.[11]

The term ‘beneficiaries’ includes ‘data subjects’, who have a direct relationship with the data in question as specified in the GDPR,[12] and also encompasses those impacted by the use of data (e.g. workers, underrepresented and excluded groups) even if they are not themselves data subjects. In other words, we use the term ‘beneficiary’ to encompass anyone who might be affected by the use of data, beyond simply the data subjects – anyone who has the potential to benefit from participatory data stewardship. This helps to move beyond a compliance-based approach to a model that is underpinned by social licence.

Beneficiaries can include:

  • the data subjects – the people to whom the data directly relates, for instance, when processing biometrics data
  • people within the wider public – for instance, those who might have an interest in how data is governed or used ethically, as well as those who might have lived experience of an issue or disadvantage
  • people at risk of being oversurveilled, underrepresented or missing from the data itself, e.g. migrant populations, members of Indigenous communities, people from racialised minority groups, people with mental health conditions and transgender people
  • stakeholders working in technology or related organisations, or members of a global supply chain, e.g. those engaged in collecting and processing data, or who have an interest in data to secure their own collective workplace rights.

In addition, involving beneficiaries can encourage responsible innovation and improve data quality, as the beneficial feedback loop below illustrates.

Figure 1: Effective participatory approaches generate a beneficial feedback loop

We outline the benefits of effective participation in the design, development and use of data and data-governance frameworks later in the report.

What do we mean by ‘stewardship’?

Stewardship is a concept that embodies the responsible planning and management of common resources.[13] To apply this concept of stewardship to data, we must first recognise that data is not a ‘resource’ in the same way that forests or fisheries are finite but renewable resources. Rather, it is a common good in which everyone has a stake, and where the interests of beneficiaries should be at the heart of every conversation.

The Ada Lovelace Institute has developed the following working definition of data stewardship:

‘The responsible use, collection and management of data in a participatory and rights-preserving way.’[14]

We understand data stewardship as key to protecting the data rights of individuals and communities, within a robust system of data governance, and to unlocking the benefits of data in a way that’s fair, equitable and focused on societal benefit. We contend that the principles and values that underpin stewardship can help to realise aspects of trustworthy and responsible data collection and use.

In this report, the term ‘data steward’ is used to describe the role of the individuals and organisations processing and using data on behalf of beneficiaries. In particular, stewards are responsible for involving the people who have a stake in and are impacted by data use and processing. That involvement is based on a relationship of trust and a social mandate to use data (often described in the legal context as a trust-based ‘fiduciary’ relationship – where the data steward has a responsibility to put people and society’s interests ahead of their own individual or organisational interests).

Data stewardship can operate throughout the data lifecycle (see figure 2 below). In the age of ‘datafication’,[15] data stewardship operates not only in relation to data collection, processing, organising, analysis and use, but also in the design and development of data-driven systems, including those that aim to predict, prescribe, observe, evaluate and sometimes influence human and societal behaviour. This makes data stewardship a complex task, but also points to its potential to engender better outcomes for people and society.

Figure 2: The data lifecycle

Introducing a framework for participatory data stewardship

Participation in its most fundamental sense is the involvement of people in influencing the decisions that affect their lives. In the technology-mediated environment that many of us currently inhabit, the ways data is used can preclude meaningful participation.

The Ofqual algorithm used to predict A-level exam results in the UK, and the cookie notices and terms and conditions that ‘nudge’ people towards uninformed consent at the expense of individual data rights, demonstrate how this can disempower people.[16]

In response to these conditions, we have developed a framework for participatory data stewardship (figure 3 below) to demonstrate how it is possible to empower beneficiaries to affect the design, development and use of data-driven systems.

The framework is based on Sherry Arnstein’s ‘ladder of citizen participation’,[17] which illustrates that there are different ways to enable participation (figure 5 in appendix), and its more recent adaptation into a spectrum of power,[18] which represents the possible outcomes of informing, consulting, involving, collaborating and empowering people (figure 6 in appendix).

Figure 3: Framework for participation in data stewardship

We propose this framework to support thinking about how different modes of participation in and about data governance can enable beneficiaries to have increasing power and agency in the decisions made about their (and by extension others’) data. Moving through the spectrum, the level of power afforded to beneficiaries increases from being the recipients of ‘information’ through to being ‘empowered’ to act with agency.

The framework first shows how participatory data stewardship mechanisms can seek to achieve meaningful transparency, responding to people’s rights to be informed about what is happening or can happen with data about them.

Next, the framework describes mechanisms and processes that can build towards understanding and responding to people’s views (consult and involve) in decision-making about data.

Ultimately, the vision is to realise conditions where people can collaborate actively with designers, developers and deployers of data-driven systems and (to the extent that is possible) are empowered. When this happens, beneficiaries’ perspectives form a central part of the design of data governance, which then builds confidence and capacity for people to continue to participate in the data-governance process.

The next section of the report outlines the range of participatory mechanisms at data stewards’ disposal.

It is important first to note that while participatory approaches may take different forms or have different intended outcomes, they are most often not mutually exclusive, and can in fact often complement each other. There is no single ‘right’ way to do participation, and effective participation is not a ‘one-off’ solution or mechanism.

The complex issues raised by data governance can’t be solved by a ‘one-size-fits-all’ or an ‘off-the-shelf’ approach. Beneficiaries can participate at different stages in the data lifecycle – from the collection, storage, cleaning and processing of data to its use and deployment – and there are different types, approaches, methods or means of participation that afford very different levels of power.

Figure 4: Purposes of different participatory mechanisms

Mechanisms for participatory data stewardship

There are a wide range of different participatory mechanisms, methods and activities that can be used to support better data governance, depending on purpose and context.

These are outlined in detail below. Each section of this report explains how the mechanism works and explores the benefits, complexities and challenges through real-world case studies, to help understand how these models operate in practice. These models form an indicative but not exhaustive list, which we expect to iterate and evolve over time.

The various mechanisms and accompanying case studies are also illustrative of the creative potential of participation, and the range of approaches and tools at data stewards’ disposal in thinking about how best to involve people in the use of data.

Inform

‘Informing’ people about data use and governance involves a one-way flow of information from those who use, gather, deploy and analyse data, to data subjects or ‘beneficiaries’. This information flow can be direct or indirect (through an intermediary, such as a data trust).

Meaningful transparency and explainability

Transparency and explainability are distinct mechanisms that can contribute to informing beneficiaries meaningfully as to how data is used. Transparency provides people with the necessary information and tools to be able to assess the use of data-driven systems and how their data is managed. Features of transparency include openness and clarity about the purpose, functions and effectiveness of these systems and frameworks. Transparency is enshrined in the GDPR as the ‘right to an explanation’ under Articles 13 and 14, and as the data subject’s ‘right of access’ under Article 15, including in relation to solely automated decision-making.[19]

An alternative way of expressing this one-way flow of information is as ‘explainability’. In the context of AI decision-making, explainability can be understood as the extent to which the internal mechanics of a machine learning or deep-learning system can be explained or made understandable in human terms. Enabling ‘explainability’ of complex uses of data (such as algorithmic or data-driven systems) has developed into its own established field of research, with contributions from the Alan Turing Institute and the Information Commissioner’s Office among many others.[20]

However, to improve outcomes for beneficiaries, transparency and explainability must move beyond simply informing people about how their own individual data has been used, or how a data-driven system has affected a decision about them, and towards the data beneficiary being able to influence the outcome of the data use through their increased understanding. In other words, it must be meaningful, and there must be a recognition that the rights of transparency and explainability can extend beyond the individual data subject.

What is meaningful transparency and explainability?

What meaningful transparency and explainability look like will depend on the type of data used, the intent and purpose for its use, and the intended audience of the explanation.

Some researchers have distinguished between model-centric explanations (an explanation of the AI model itself) for general information-sharing and broader accountability purposes; and more subject-centric explanations (explanations of how a particular decision has impacted on a particular individual, or on certain groups).[21]

Subject-centric explanations are usually the first step towards increased accountability. The UK’s Information Commissioner’s Office (ICO) has identified six main types of explanation:[22]

  • Rationale: the reasons that led to a decision, delivered in an accessible and non-technical way.
  • Responsibility: who is involved in the development, management and implementation of an AI system, and who to contact for a human review of a decision.
  • Data: what data has been used in a particular decision and how.
  • Fairness: steps taken across the design and implementation of an AI system to ensure that the decisions it supports are generally unbiased and fair, and whether or not an individual has been treated equitably.
  • Safety and performance: steps taken across the design and implementation of an AI system to maximise the accuracy, reliability, security and robustness of its decisions and behaviours.
  • Impact: steps taken across the design and implementation of an AI system to consider and monitor the impacts that the use of an AI system and its decisions has or may have on an individual, and on wider society.

In addition to explainability, meaningful transparency refers to initiatives that seek to make information about an AI system both visible and interpretable to audiences who may have different levels of competency and understanding of technical systems.[23] These initiatives identify clear ‘targets’ of transparency, such as an algorithm, the ‘assemblage of human and non-human actors’ around algorithmic systems, or the governance regime that oversees the management and use of that system.[24]

As our article Meaningful transparency and (in)visible algorithms demonstrates, meaningful transparency initiatives are ones that ‘build publicness’ by aligning AI systems with values traditionally found in the public sector, such as due process, accountability and welfare provision.[25] These initiatives provide people with the necessary tools and information to assess and interact with AI systems.

One example of meaningful transparency in practice is the use of ‘algorithm registers’ implemented by the cities of Amsterdam and Helsinki. These registers list the data-driven systems put into use by these cities, and provide different levels of information aimed at audiences with differing competencies and backgrounds. The registers not only disclose key facts about the use of data-driven systems, but also enable members of the public to perform their own independent ‘armchair audit’ into how these systems may affect them.
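To make the ‘layered’ idea more concrete, the sketch below shows what a single, simplified register entry might look like if represented as structured data. This is purely illustrative: the field names and the example system are hypothetical and do not reproduce the actual Helsinki or Amsterdam register schemas.

# A minimal, hypothetical sketch (Python) of a layered algorithm-register entry.
# Field names and the example system are invented for illustration only.
register_entry = {
    "system_name": "Example service-allocation algorithm",          # hypothetical system
    "public_summary": "Plain-language description for any reader",  # first, non-technical layer
    "purpose": "What the city uses the system for",
    "data_sources": ["List of datasets the system draws on"],
    "human_oversight": "How and when a person reviews the system's outputs",
    "technical_details": "Deeper layer for specialist readers (model type, performance)",
    "contact": "Who to ask for a human review or more information",
}

# An 'armchair auditor' might start from the public summary and drill down
# into the technical layer only if they want more detail.
print(register_entry["public_summary"])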

Other examples of emerging transparency mechanisms include model cards, which aim to document key details about an AI system’s intended uses, features and performance,[26] and data sheets, a similar mechanism that aims to summarise essential details about the collection, features and intended uses of a dataset.[27] To date, data sheets and model cards have been used primarily by AI researchers and companies as a means of transferring information about a dataset or model between teams within or outside an organisation. While they are not currently intended to provide information to members of the public, they could be repurposed.
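As a rough illustration of the kind of information a model card might capture, the sketch below uses a small Python dataclass. The fields are indicative only – they loosely echo commonly cited model-card headings rather than any formal specification – and the example values are invented.

# A minimal, hypothetical sketch (Python) of a model card as structured documentation.
# Fields loosely echo commonly cited model-card headings; values are invented.
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class ModelCard:
    model_name: str
    intended_uses: List[str]              # what the system is designed to do
    out_of_scope_uses: List[str]          # uses the developers advise against
    training_data_summary: str            # where the training data came from
    evaluation_metrics: Dict[str, float]  # headline performance figures
    known_limitations: List[str]          # caveats, failure modes, affected groups

card = ModelCard(
    model_name="Illustrative risk-scoring model",
    intended_uses=["Prioritising casework for human review"],
    out_of_scope_uses=["Fully automated decisions about individuals"],
    training_data_summary="Hypothetical administrative records, 2015-2020",
    evaluation_metrics={"accuracy": 0.87, "false_positive_rate": 0.06},
    known_limitations=["Performance not evaluated for under-represented groups"],
)
print(card.model_name, card.evaluation_metrics)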

Case study: Enabling ‘armchair audit’ through open AI and algorithm registers (Helsinki and Amsterdam), administered by Saidot, 2020[28]

 

Overview:

Helsinki and Amsterdam were among the first cities to announce open artificial intelligence (AI) and algorithm registers. These were founded on the premise that the use of AI in public services should adhere to the same principles of transparency, security and openness as other city activities, such as public bodies’ approaches to procurement.

 

The infrastructure for the open AI registers in both countries is administered by the Finnish commercial organisation Saidot, which is aiming to build a scalable and adaptable register platform to collect information on algorithmic systems and share it flexibly with a wide range of stakeholder groups.

 

Participatory implications:

These AI and algorithm registers are openly available, and any individual is able to check them. This illuminates the potential for ‘armchair auditing’ of public-service algorithmic systems. Accessing a register reveals which systems are reported on, what information is provided about each, and the specific applications and contexts in which algorithmic systems and AI are being used.

 

Saidot conducted research and interviews with clients and stakeholders to develop a model that serves the wider public, meaning it’s accessible not only to tech experts but also to those who know less about technology, or are less interested in it. The approach adopts a layered model, enabling any individual to find and discover more information based on their level of knowledge and interest.

Complexities and critiques

Given the relatively recent emergence of these approaches, more needs to be understood about the effectiveness of these systems in enabling ‘meaningful transparency’ in the way defined above. Do these registers, for example, genuinely enable non-specialist people to become armchair algorithmic auditors?  And are they complemented by the implementation of, and access to, independent audits, which are also important transparency mechanisms?[29]

To develop meaningful transparency and explainability of data-driven and AI systems, it is necessary to enable beneficiaries to gain insights into the decision-making processes that determine their goals, the impact of their outcomes, how they interact with social structures and what power relations they engender through continued use. This may sound straightforward, but these are not static pieces of information that can be easily captured; they require ongoing investigation using a multitude of sources.

Lack of access to complete, contextual information about complex uses of data is often due to information being spread across different organisational structures and under different practices of documentation.  Any information architecture set up to promote transparency must contend with the way data-governance frameworks relate to the real world, tracking the shifting influences of technology, governance and economics, and the public and private actors embedded within them. This requires an articulation of how to access information as well as what information to access. Answering the ‘how’ question of transparency will require addressing the conditions around who executes it, with what motives and purposes.[30]

Some critics have argued that relying exclusively on transparency and explainability as mechanisms for engaging and involving people in data governance risks creating an illusion of control that does not exist in reality. This might distract from addressing some more harmful data practices that engender unfairness or discriminatory outcomes.[31]

Others have highlighted that increased transparency about data use and management forces policymakers to be more explicit and transparent about the trade-offs they are choosing to make. In this context, it has been argued that explainability doesn’t just pose a technical challenge, but also a policy challenge about which priorities, objectives and trade-offs to choose.[32]

Rethinking and reframing the way we communicate about data

Another core component informing the way we collectively conceptualise, imagine and interact with data is the set of narratives we build to describe how data operates in societies.

Narratives, fictional or not, have a profound effect on shaping our collective understandings of all aspects of our world, from the political or economic to the technological. They rely on a range of devices: news articles, visual images, political rhetoric and influential people all contribute to the prevalence of a particular narrative in the public consciousness.

Central to many narratives is the use of metaphors, and that’s particularly true when it comes to technology and data, where the abstract nature of many concepts means that metaphors are used to conceptualise and understand processes and practices.[33]

Technologists and policymakers frequently reach for analogies and metaphors, in efforts to communicate with diverse publics about data. Sociologists Puschmann and Burgess have explored how metaphors are central to making sense of the often abstract concept of data.[34] If we are to move to more participatory mechanisms for data stewardship and governance, we will need to understand and unpick how these specific metaphors and narratives shape data practices.

Why narratives about data matter

How we talk about data influences the way we think about it. How we think about data influences how we design data policies and technologies. Current narratives create policies and technologies that too often minimise the potential societal benefit of data while facilitating commercial incentives. To reimagine narratives, we need to understand and challenge the metaphors and framings that are already prominent.

Data has been called everything from ‘the new oil’ to water or radioactive waste.[35] Metaphors like these are used to make the concept of data more tangible, more open and more accessible to the public, but they ascribe qualities to data that aren’t necessarily inherent, and so can have the effect of mystifying data or creating disempowering perceptions of it. Metaphors can frame issues in particular ways, highlighting certain qualities while obscuring others, and this can shape perspectives and beliefs according to the intentions of those setting the narrative.

Common understandings created through metaphor are all too often misunderstandings. For example, as ‘oil’, data’s economic value is brought into focus, but its non-rivalrous quality (the fact that it can be used simultaneously by different people for different purposes) is hidden.[36] These misunderstandings in the ways we describe data can reveal much about the contemporary social and cultural meaning we give to it.[37] For example, when economic value is highlighted and common good is obscured, the market motivations to make ‘commodity’ the meaning of data are revealed.

Such narratives also suggest that practices about data are, like forces of nature, unfixed and unchangeable – rather than social practices undertaken by people and power holders. And it’s important to note that ‘data’ itself is not the source of value – but rather actors’ positioning and abilities to make use of data is what brings value.

But focusing on one single metaphor and its pitfalls reveals only part of the power of narratives. Researching the metaphors that exist across a range of prominent data narratives, Puschmann and Burgess identify two interconnected metaphorical frames for data: data as a natural force to resist or control; and data as a resource to be consumed. As a natural force, data ‘exists’ in the world – like water, gravity or forests – just waiting to be discovered and harnessed by humans. As a resource to be consumed, data is conceptualised as a commodity that can be owned and sold, then used – just like oil or coal.

These two metaphorical frames promote the idea that data is a value-neutral commodity, and so shape data practices to be extractive and promote commercial and competitive incentives. Many other sociocultural meanings exist, or could exist, manifested through metaphor. Recognising and understanding them allows us to interpret these sociocultural meanings, and address how they shape data policies, processes and practices.

Case study: Making sense of the societal value of data by reframing ‘big-data mining’ and ‘data as oil’ narratives

 

Overview:

One of the major metaphorical frames that embodies data as a natural force or resource to be consumed is that of ‘big-data mining’, a narrative that emerged in a range of corporate literature in the 1990s. This framed big data as a new ‘gold mine’ waiting to be tapped, from which companies could gain wealth if they knew how to extract it.[38]

 

Positioning ‘data’ as a resource from which insight can be ‘mined’ perpetuates the conceptual model that economic value can be extracted from data. It leverages the culturally embedded analogy of gold ore mined from the earth (for example, in the American Midwest) delivering untold wealth for colonialist settlers, and it ignores the unequal power dynamics of the extractive practices that underpin these narratives.

 

Participatory implications:

The reality is that data does not exist in the natural world, lying dormant until it is discovered and its value extracted, and the focus this metaphor places on wealth value obscures the societal value data can have if stewarded well. To overcome these framings it’s necessary for those affected by data to participate in rethinking and reframing how data is conceptualised, to bring data narratives (and therefore practices) in greater alignment with public expectations and values.

 

Following existing metaphors like oil and mining, the natural world remains a good conceptual foundation to develop alternative metaphors. One option is to see data as a greenhouse gas, where its uses create harmful byproducts that must be limited.[39] Other metaphors might recognise that oil, gold and other natural commodities are not perceived and treated in the same way as forests, rivers and sunlight. These more ‘ecological’ metaphors have been posed as alternatives that could promote concepts and practices where data is stewarded more sustainably and less extractively.

Complexities and critiques – why reframing doesn’t go far enough

Reframing old narratives, and creating new narratives is no easy task. Narratives are slow to develop and permeate, and require critical mass and collective understanding to gain the traction needed to change our deeply conditioned mental models of the world. Introducing new metaphors is challenging when so many, often-contradictory, ones already exist. Moreover, creating new metaphors from the ‘top down’ would continue to represent asymmetries of power by perpetuating a model where data stewards – governments, corporations and other institutions – continue to impose their world view on data subjects, especially those already marginalised. Rethinking the narratives and metaphors that shape how data is conceptualised must therefore be part of a participatory process.

Consult

Understanding public attitudes to different uses of data

A range of data stewardship mechanisms build consultation into their design and development. Consultation can take place with individuals, groups, networks or communities, to enable people to voice their concerns, thoughts and perspectives. Consultation activities can take a range of forms but often involve the use of quantitative and qualitative methods including public-opinion ‘attitude’ surveys, neighbourhood meetings and public hearings.

While the principles of user-led design aim to understand a narrower set of perspectives – those of the intended user of the data, rather than the views of beneficiaries or the public – they can contribute to consultation’s aim by ensuring designers and developers of data-driven systems understand people’s aspirations and concerns.

What is meaningful consultation?

The process of seeking to understand people’s attitudes to different uses of, and approaches to, data – commonly referred to as consultation – has undergone a shift in acceptance. It was dismissed by Arnstein as a tokenistic activity, because it carried with it no meaningful promise of effecting power shifts or changes.[40] However, the Gunning Principles – arising from two landmark British court cases, Brent London Borough ex parte Gunning (1989) and Coughlan (2001) – sought to distinguish ineffectual consultation of the type Arnstein had in mind from effective consultation, by setting out guiding principles.

The Gunning Principles of Consultation:[41]

  • Consultations should be undertaken when proposals are still at a formative stage: Bodies cannot consult on a decision already made. Organisations need to have an open mind during a consultation and a willingness to listen to alternatives to any proposals they have put forward.
  • There must be sufficient information to permit ‘intelligent consideration’ by the people who have been consulted: People involved in the consultation need to have enough information to make an intelligent choice and input in the process.
  • Adequate time for consideration and response: Timing is crucial – participants must have sufficient time and a supportive environment to consider issues and make an informed response.
  • Responses must be conscientiously taken into account: Bodies must be able to demonstrate (with reference to evidence) that they have listened to and fully considered responses from those they engaged prior to making any final decision. They must also explain why certain responses were taken into consideration and others were not.

In the United Kingdom, the Gunning Principles have formed a strong legal foundation for assessing the legitimacy of public consultations, and are frequently referred to as a legal basis in judicial review decisions (judicial review being the process by which the UK courts review the lawfulness of decisions made by public bodies). They also act as a valuable mechanism through which to understand how impactful and meaningful a consultation process is likely to have been.

In addition to the principles above, a range of consultation-based approaches have sought to acknowledge significant asymmetries of power in relation to the perspectives of people at risk of being disproportionately impacted by the use of data, and/or who are at risk of being excluded or underrepresented by data-driven systems. This is especially important in a context where ‘publics’ themselves are diverse and not equal.

The use of lived-experience panels (groups that work to shape an initiative, drawing on their own personal experience of an issue such as racial or social injustice) seeks to ensure structural inequalities and dynamics of power and privilege are considered in the design of data-driven systems and governance frameworks. These panels can form an important part of good and effective consultation. Examples include the Ada Lovelace Institute’s approach to understanding issues of power and privilege affecting people from minority-ethnic backgrounds, on issues of access and gender diversity;[42] and the Wellcome Trust’s emphasis on lived-experience panels, engaging people with mental health conditions to inform the establishment of a global mental-health databank.[43]

Globally, much has been made of the importance of co-creation and of enabling agency over data in ways that are led and co-designed with a range of Indigenous communities, rather than designed without them and imposed on them,[44] and these initiatives draw on similar insights. Good practice is emerging in New Zealand and Canada in this context that demonstrates how to move away from the invisibility of Indigenous peoples in data systems towards a ‘people-and-purpose orientation to data governance’ that promotes active involvement and co-creation of systems in ways that are culturally and societally relevant and sensitive.[45]

Case study: Community engagement and consultation by CityVerve (Manchester), multi-sector consortium led by Manchester City Council, 2015[46]

 

Overview:

The CityVerve project was a Smart City demonstrator, funded by the UK Government and Innovate UK, in the UK city of Manchester. It was a two-year project to develop and test new Internet of Things (IoT) services at scale, adding sensors to equipment throughout the city that collected and shared data across a network. It combined health and social-care data, environment data and transport data to analyse energy use and monitor air quality in cities, and to provide integrated and personalised healthcare feedback to individuals and clinicians about people’s symptoms.

 

A key aim of CityVerve was to build a ‘platform of platforms’ to gather and collate data across Manchester about the needs of a UK city. This was built by telecoms company Cisco, and the data was stored by BT’s cloud-based ‘Internet of Things Datahub’.

 

Participatory implications:

The CityVerve team used a creative approach to consultation, which attracted 1,000 workshop attendees. The team reached a further 11,300 people through art and performance, commissioning local artist Naho Matsuda to bring the project to life through an innovative digital installation.

The team adopted user-design approaches and developed a local community platform, built around interests and activities relating to the IoT, and also ran community forums to provide people with an introduction to networked data. One of the project partners, Future Everything, led human-centred design workshops with participants to discuss and explore IoT technology pilots and to design use cases for the data being collected – highlighting the extent to which ‘consultation’ can be creative and innovative, helping to inform and build the capacity of those who participate.

Complexities and critiques

A critique of consultative and user-design approaches is that, too often, the feedback loop between people’s input and the decisions made about design and implementation fails to operate effectively or clearly, negating participants’ ability to have or exercise ‘real power’. When this happens, consultation can serve to legitimate decisions that would have been made regardless, or can be designed in a manner that is ‘leading’, excludes marginalised perspectives or fails to capture the full complexity and nuance of an issue. Critics also express concern about the process misleading participants into thinking they have greater power and agency over the terms and conditions of the data-governance initiative than they actually have, creating a potential misalignment between expectations and the reality of the consultation’s likely impact.

The controversy around Google affiliate Sidewalk Labs’ efforts to develop Toronto’s waterfront is a good example of this.[47] The now-abandoned proposal to deploy a civic data trust as a participatory mechanism proved insufficient to reassure members of the public of the acceptability of the partnership between Sidewalk Labs and Waterfront Toronto. Instead of assuming that the partnership itself was acceptable and proposing engagement within that context, Waterfront Toronto could have considered engaging publics on their expectations prior to entering into it – and then sought to embed consultation within the partnership.

Involve

Beneficiaries advise through deliberation

The process of ‘involvement’ positions beneficiaries in an advisory role, which helps inform early-stage decision-making by data stewards and fiduciaries. These initiatives convene non-specialist beneficiaries alongside specialists and stakeholders, with a view to informing key moments in the public-policy landscape. They seek to advise and better inform government and regulatory bodies on the conditions for the acceptability of uses of AI and data (particularly where the use of data is contested or controversial, and where clear regulatory norms have yet to be established).

Often, enabling an advisory role for beneficiaries in shaping approaches to data governance draws on the values, norms and principles of deliberative democracy. Through the established methodologies of deliberation, beneficiaries and key stakeholders are provided with access to privileged information and equipped at an early stage with sufficient time and information to play a central role in shaping approaches to data governance.[48]

Doing this well can take time (through a long-form process, convening and re-convening people over weekends, weeks or months at a time), considerable resources (human and financial – to support people to contribute through the process), and the skills to make complex issues transparent and understandable.

This is particularly important given that questions about how data is used and governed can be deeply societally relevant, but also opaque for non-specialists unless considerable time and effort is made to address questions around how uses of data-driven systems interact with the complex policy contexts in which they take place.  For instance, in a deliberative process seeking advice on the parameters of health-data sharing, beneficiaries will need to be informed about the use of data, the proposal for a particular data governance mechanism, and also about the implications of its use for health-data outcomes and the impact on patients.[49]

There has been a proliferation of one-off initiatives in this space, which include:

  • Understanding Patient Data’s citizen juries on commercial health data partnerships, in partnership with the Ada Lovelace Institute[50]
  • US-based TheGovLab’s Data Assembly[51]
  • the Royal Society of Arts (RSA) citizen juries in partnership with DeepMind (an AI company owned by Google) on AI explainability[52]
  • the Information Commissioner’s Office and Alan Turing Institute’s Explainable AI citizen juries[53]
  • the Ada Lovelace Institute’s Citizens’ Biometrics Council, with input from stakeholders such as the Information Commissioner’s Office and the Biometrics Commissioner[54]
  • a range of UK Research and Innovation (UKRI)-funded initiatives under the auspices of the long-established public-dialogue programme Sciencewise, including a public dialogue on location data with the Ada Lovelace Institute.[55]

In addition to and beyond the one-off initiatives, calls have grown for the ‘institutionalisation’ of participatory and deliberative approaches to data and AI governance. A recent OECD report, Catching the Deliberative Wave, defines ‘institutionalisation’ in two ways. The first is ‘incorporating deliberative activities into the rules of public decision-making structures and governance arrangements in a way that is legally constituted so as to establish a basic legal or regulatory framework to ensure continuity regardless of political change’. The second is ‘regular and repeated processes that are maintained and sanctioned by social norms, which are important for ensuring that new institutions are aligned with societal values’.[56]

Media commentators on data and AI have proposed that a ‘council of citizens’ should form part of the basis of the institutional decision-making structure that should be enabled to help regulate algorithmic decision-making.[57] While these kinds of approaches may be relatively novel in the field of AI and data governance specifically, they have precedents in the assessment of emerging technologies more broadly.

Case study: Institutionalised public deliberation, the Danish Board of Technology, 1986–2011[58]

 

Overview:

One of the highest-profile forms of institutionalised participatory technology assessment was the deliberative consensus conference model implemented by the Danish Board of Technology, which has now been disbanded. The Danish Board of Technology had a statutory duty to inform citizens and politicians about the implications of new technologies; it received an annual subsidy and delivered an annual report to the Danish Parliamentary Committee on Science and Technology. It was abolished by law in 2011, and its successor is the Danish Board of Technology Foundation.

 

The model combined the knowledge of technology experts with the perspectives of non-specialists. The experts helped inform citizen-led reports that summarised the citizens’ agreement or disagreement on how a technology should be developed, its potential risks, future potential applications and appropriate mechanisms through which the effects of the technology on society might be measured. The citizens then shared their consensus reports with the Danish Parliament and the media, which resulted in wide debate and reporting on the findings.

 

The Danish Parliament received reports directly from consensus conferences on topics as contentious as food irradiation and genetic modification. Deliberators in both instances proposed that government funding should not be spent on those technologies. These recommendations were subsequently endorsed by the Danish Parliament.[59]

 

Participatory implications:

The consensus conference as a method of technology assessment was introduced in many western European countries, the USA and Japan during the 1990s. This provided a route for non-specialists to contribute to the participatory governance of technologies in an advisory capacity, while simultaneously ensuring that their perspectives reached power-holders, including parliamentarians.

 

While the Danish Board of Technology model no longer operates on a statutory footing, due to reductions in public funding, it has a continuing legacy in shaping policy on emerging technology through the non-profit foundation that succeeded it. As a case study, it highlights the potential policy and cultural institutionalisation of participatory advice and input in influencing policy decisions about data governance.

Complexities and critiques

Critics of public deliberation point to the cost, resources and time involved, which can make it difficult to embed public perspectives in a fast-moving policy, regulatory and development field such as technology.

Other critiques note that requirements expecting citizens to deliberate over an extended period may impose structural constraints on who is empowered or able to participate. This is a critique identified in a UK Government report on the use of public engagement for technological innovation, for instance.[60]

Not all data governance questions and issues will demand this level of engagement – on questions where there is a high level of established public support and mandate, meaningful transparency and consultation-based approaches may be adequate.

There are organisations working on technology issues that have sought to adapt the traditional deliberation model to incorporate the potential for a more rapid, iterative and online approach to deliberation, as well as to complement it with approaches to engagement that consider and engage with the lived experiences of underrepresented groups and communities. Examples include the rapid, online Sounding Board model prototyped in 2017 by UK Government-funded programme Sciencewise,[61] with a view to informing rapidly moving policy cycles, and a recent prototype undertaken during the COVID-19 pandemic crisis by the Ada Lovelace Institute in partnership with engagement agencies Traverse, Involve and Bang The Table.[62]

The Royal Society of Arts (RSA)’s landmark work on building a public culture of economics is an example of how deliberative processes have been complemented with community engagement and broader outreach work with underrepresented communities.[63] This initiative complemented a deliberative Citizens’ Economic Council with an economic inclusion community outreach programme (‘On the Road’) that worked in locations identified as high on the index of multiple deprivation.[64] 

Embedding public deliberation thoughtfully and effectively requires time and resourcing (often over months rather than weeks), and this can be in tension with the imperative to work rapidly in developing and designing data-driven systems, or in developing policy and regulation to govern the use of these systems. An example of when urgent decision-making demands a more iterative and agile approach to assembling data infrastructures is the necessarily rapid response to the COVID-19 pandemic. In these instances, deliberative exercises may not always appear to be expedient or proportionate – but might on balance be valuable and save energy and effort if a particular use of data is likely to generate significant societal concern and disquiet.

Collaborate

In the context of public involvement in data and AI, collaboration can be understood as enabling people to negotiate and engage in trade-offs with power-holders and those governing data about specific aspects of decision-making.

Deliberation embedded across data access, sharing and governance frameworks

While deliberation can be exclusively advisory, when embedded in a data access, sharing or governance framework, it can also have potential to navigate tensions and trade-offs. There has been increased interest in the question of whether (and to what extent) it is feasible to embed deliberation as part of data-sharing, databank, or data-trust models, through the use of ‘bottom-up data trusts’ and ‘civic data trusts’ for instance.[65]

There are two aspects to this potential. Firstly, deliberative approaches focus on understanding societal benefits and harms, and so can enable data beneficiaries to participate as collectives rather than as individuals (who are rarely in a position to negotiate or engage in trade-offs relating to their data rights).

These approaches have the potential to enable collective consent around particular uses of data, including by engaging those most likely to be directly impacted – in turn helping to address a central challenge of relying exclusively on individual consent. Individual consent-based models have been critiqued for failing to encourage citizens to consider the benefits of data use for wider society, and for being designed in ways that can feel coercive or manipulative rather than genuinely seeking informed consent.

Secondly, unlike other mechanisms, deliberative approaches have the potential to be embedded at different points of the data lifecycle – shaping collection, access, use and analysis of data.

The Ada Lovelace Institute has not identified any pilots that have been successful, to date, in embedding deliberative approaches specifically in the governance of data-sharing partnerships. However, there is potential for developing future pilots drawing on and iterating from models in related areas, as the following case study – a pilot project initiated by the Wellcome Trust – demonstrates.

Case study: Global mental health databank, the Wellcome Trust with Sage Bionetworks, 2020[66]

 

Overview:

The Wellcome Trust has funded a pilot project with young people across India, South Africa and the United Kingdom with a view to co-designing, building and testing a mental-health databank that holds information about mental and physical health among people aged 17 to 24. The goal is to create a databank that can be used in future to improve how mental health problems like anxiety and depression are treated and prevented for young people living in high- and low-income countries around the world. The Wellcome Trust expects the databank to help researchers, scientists and young people answer the questions: ‘What kinds of treatments and prevention activities for anxiety and depression really work, who do they work for, and why do they work among young people across different settings?’

 

Participatory implications:

This initiative aims to test participatory approaches to involving young people, including lived-experience panels and citizen deliberation, working directly with lived-experience advisers and young people experiencing mental health conditions throughout the process of creating the databank.

 

The Wellcome Trust’s mental-health databank pilot signals the likely direction of travel for many participatory approaches to data governance. Although it remains to be seen what its effect or impact will be, its ambitions to embed deliberative democracy approaches into its design, and involve young people experiencing mental health conditions in the process, are likely to be replicable.[67]

Complexities and critiques

There are currently few examples of participatory approaches to govern access and use of data, and therefore few studies of the effectiveness of these methods in practice. Early insights suggest that this approach may not be appropriate for all data-sharing initiatives, but may be particularly appropriate for those that require a thoughtful, measured and collaborative approach, usually involving the collection and use of data that is highly sensitive (as with the Wellcome Trust initiative, which relates to mental-health data).

A recent report from public participation thinktank Involve finds that there are three potential stages at which deliberative participation could be feasible in the design of data-sharing or data-access initiatives (what we describe in this report as a data-governance framework), which warrant further experimentation, testing and evaluation. These are:[68]

  1. Scoping – aligning purpose and values of the data-sharing or data-access initiative, prior to its establishment.
  2. Co-design – developing criteria, framework or principles to ensure its decision-making process meets the needs and expectations of wider stakeholders and the public, and developing policy on the distribution of value (i.e. who benefits from the use of the data, and whether there is public or wider societal value that comes from it).
  3. Evaluation – deploying participatory approaches to ensure that the intended impact of the access initiative has been met, that outcomes and potential have been maximised, and to ensure adequate accountability and scrutiny around the claims for data access and use. This reflection on how the initiative has worked can be continual as well as retrospective.

Data-donation and data-altruism models

Another example of a collaborative approach to data governance is the model of ‘data donation’, which is premised on the expectation that those who contribute or donate their data will see certain terms and conditions honoured around the process of donation. Usually, these terms revolve around a clear articulation of ‘public benefit’ or wider societal gain.

In contrast to more ‘extractivist’ approaches to data governance, where approaches to gathering and mining data about people can take place surreptitiously, data-donation mechanisms are a route through which individuals can explicitly agree to share their data for wider societal and collective benefits under clear terms and conditions.[69]

In data donation, an individual actively chooses to contribute their data for the purposes of wider societal gains, such as medical research. It is a non-transactional process: donors do not expect an immediate, direct benefit in return for donating. If data donors and volunteers are not satisfied that their expectations are being met – for example, if an AI or machine-learning process fails to meet them – they have the right to ‘withdraw’ or delete their data, or otherwise to take action, and many contributors may choose to do so during the lifespan of their participation.

Data-donation initiatives can be part of a research study, where data subjects voluntarily contribute their own personal data for a specific purpose. Another model, which is increasingly gaining popularity, is for data subjects to share data that has already been generated for a different purpose: for example, people who use wearable devices to track their own activities may choose to share that data with a third party.

Emerging evidence suggests that the opportunity to share data for pro-societal purposes is both a strong motivator for donors and a key basis for public confidence in the approach. A recent University of Bristol research study into the psychology of data donation found that the strongest predictor of the decision to donate data was the desire to serve society, and that knowing the consequences and potential benefits of donating data was a critical factor influencing people’s decisions to participate. Here, data governance becomes a collaborative endeavour, where the legitimation, consent and pro-societal motives of data donors become central to both the viability and the effectiveness of the approach.

A good example of this type of initiative is UK Biobank, a large-scale biomedical database that provides accredited researchers and industry organisations with access to medical and genetic data generated by 500,000 volunteer participants, specifically for the purpose of improving health outcomes.

Case study: Large-scale data donation for research, UK Biobank, 2006–present[70]

 

Overview:

UK Biobank holds blood, urine and saliva samples from 500,000 volunteers whose health has been tracked over the past decade, as part of a large-scale ‘data donation’ research initiative. This has enabled it to gather longitudinal data about a large sample of the population, helping to answer questions about how diseases such as cancer, stroke and dementia develop. It has also formed the basis for intervening in response to COVID-19, with data about positive coronavirus test results and GP/hospital appointments added, and 20,000 contributors sharing blood samples for pandemic response purposes.[71] A range of third-party organisations, including those in academia and industry, can apply for different layered levels of access, paying a subscription fee to access the data.

 

Participatory implications:

UK Biobank is a data-donation model that has monitoring of data access and use in place through internal and external auditing mechanisms. It is a national resource for health research, with the aim of improving the prevention, diagnosis and treatment of a wide range of serious and life-threatening illnesses. It operates on the basis that contributors ‘opt in’ to sharing their data for research purposes.

 

Data donors are free to opt out and stop sharing their data at any point without needing to provide a reason. If someone opts out, UK Biobank will no longer contact the participant or obtain further information, and any information and samples collected previously will no longer be available to researchers. UK Biobank will also destroy samples (although it may not be possible to trace all distributed sample remnants) and will continue to hold information only for archival audit purposes.
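
To make the opt-out flow above concrete, the sketch below models it as a simple state change on a participant’s record. It is a purely illustrative Python sketch under our own assumptions – the class, field and function names are hypothetical and do not describe UK Biobank’s actual systems.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class ParticipantRecord:
    """Illustrative record for a data-donation participant (hypothetical schema)."""
    participant_id: str
    available_to_researchers: bool = True
    contact_permitted: bool = True
    samples_destroyed: bool = False
    audit_log: list = field(default_factory=list)


def process_opt_out(record: ParticipantRecord, reason: str = "not given") -> None:
    """Apply the opt-out rules described above: stop contact, withdraw data
    from research access, destroy samples and keep only an audit trail."""
    record.contact_permitted = False          # no further contact or data collection
    record.available_to_researchers = False   # previously collected data withdrawn from researchers
    record.samples_destroyed = True           # samples destroyed (distributed remnants may be untraceable)
    record.audit_log.append({                 # information retained only for archival audit purposes
        "event": "opt_out",
        "timestamp": datetime.utcnow().isoformat(),
        "reason": reason,                     # donors do not need to provide a reason
    })


# Example: a participant withdraws without giving a reason.
record = ParticipantRecord(participant_id="P-0001")
process_opt_out(record)
assert not record.available_to_researchers and not record.contact_permitted
```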

Complexities and critiques

While data donation has the potential to support longitudinal, evidence-based research, a major critique and challenge of this mechanism has been the extent to which self-selection can result in the underrepresentation of marginalised groups and communities. Research demonstrates that there is a consistent trend of underrepresentation of minority populations in biobanks, which undermines their value.[71]

A number of dynamics contribute towards this, including the lack of financial support for ‘data donors’, and assumptions that, for instance, all donors will have the resources needed to contribute, or a confirmed place of residence. Structural inequalities compound this underrepresentation – for instance, differential levels of trust in data-driven systems, in uses of data and in researchers.

Data from self-selecting biobank models, while useful, therefore risks perpetuating unequal outcomes when used to inform social policy, or excluding underrepresented groups and communities from data-driven health policy. This persistent exclusion of underrepresented groups and communities from datasets is often described as the ‘missing data’ problem. Missing data can entrench and perpetuate inadvertent bias and discrimination by failing to identify differential impacts.[72]

Other challenges include a lack of clarity for data donors about exactly how their data has been used, and about how to ‘opt out’ of data donation. Data donation’s incentive structures also warrant critical scrutiny – there is a risk of creating coercive or perverse incentives in corporate environments (for instance, private-sector providers such as insurers requesting ‘donations’ of wearable data in exchange for lower premiums or annual fees).

Another potential issue is a lack of clarity about the exact terms on which data is ‘donated’: some people feel that they are expected to share their data in order to access a healthcare service when in reality there is no such expectation, or have limited awareness of their rights to opt out of data sharing. This is a particular challenge with ‘opt-out’, rather than ‘opt-in’, models of data donation (presently widespread practice in the UK National Health Service), as recent public debate about the UK’s new centralised data-sharing infrastructure and the opt-out mechanisms under the General Practice Data for Planning and Research (GPDPR) proposals has highlighted.[73]

Empower

Beneficiaries actively make decisions about data governance

Empowering data beneficiaries enables them to exercise full managerial power and agency, and to take responsibility for actively making and managing decisions about data governance – specifically, how data is accessed, shared, governed and used. In this model, power shifts away from the data steward towards the data beneficiary, who makes the decisions, advised where necessary by appropriate specialist expertise.

Examples of these approaches are relatively rare but increasingly emerging, and include the following:

  • shared control and ownership of data (through, for instance, data cooperatives)[74]
  • electoral mechanisms for beneficiary involvement (such as voting on boards)
  • setting terms and conditions for licensing and data access
  • shaping the rules of the data-governance framework.

The following case study of the Salus Cooperative (known as Salus Coop), based in Spain, illustrates how beneficiaries have been enabled to make decisions actively about the governance of their, and others’, data – through corporate governance, but also through processes such as setting license terms.

Case study: Citizen-driven, collaborative data management and governance, Salus Coop (Spain), 2017–present[75]

 

Overview:

Salus Coop is a non-profit data cooperative for health research (meaning in this context not only health data, but also lifestyle-related data that has health indicators, such as number of steps taken), founded in Barcelona by members of the public in September 2017. It set out to create a citizen-driven model of collaborative governance and management of health data ‘to legitimize citizens’ rights to control their own health records while facilitating data sharing to accelerate research innovation in healthcare’.

 

Salus meets the definition of a data cooperative, as it provides clear and specified benefits for its members – specifically a set of powers, rights and constraints over the use of their personal health data – in a way that also benefits the wider community by providing data for health research. Some of these powers and rights would be provided by enforcement of the GDPR, but Salus is committed to providing them to its members in a transparent and usable fashion.

 

Participatory implications:

Together with citizens, Salus has developed a ‘common good data license for health research’ through a crowd-design mechanism, which it describes as the first health data-sharing license in the world. The Salus common-good license applies to data that members donate and specifies the conditions that any research projects seeking to use the member data must adhere to. The conditions are:

  • Health only: The data will only be used for health-related (i.e. treatment of chronic disease) research.
  • Non-commercial: Research projects will be promoted by entities of general interest such as public institutions, universities and foundations only.
  • Shared results: All research results will be accessible at no cost.
  • Maximum privacy: All data will be anonymised and de-identified before any use.
  • Total control: Data donors can cancel or change the conditions of access to their data at any time.

 

Salus describes itself as committed to supporting data donors’ rights and ensuring they are upheld, and requires researchers interacting with the data to ensure that (see the illustrative sketch after this list):

 

  • individuals have the right to know under what conditions the data they’ve contributed will be used, for what uses, by which institutions, for how long and with what levels of anonymisation
  • individuals have the right to obtain, openly and at no cost, the results of studies carried out using the data they’ve contributed
  • any technological architecture used allows individuals to know about and manage any data they contribute.
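
As a simple illustration of how these licence terms and donor rights could be made machine-readable, the following Python sketch encodes the five common-good conditions as a checklist and tests a hypothetical research proposal against them. The field names and the check are our own assumptions for illustration – they are not Salus Coop’s actual implementation.

```python
# A hypothetical, machine-readable rendering of the Salus common-good licence
# conditions listed above. Field names and the check itself are illustrative
# assumptions, not Salus Coop's actual implementation.

SALUS_LICENCE_CONDITIONS = {
    "health_related_purpose": True,       # Health only
    "non_commercial_entity": True,        # public institutions, universities, foundations
    "results_shared_at_no_cost": True,    # Shared results
    "data_anonymised": True,              # Maximum privacy
    "donor_can_withdraw_any_time": True,  # Total control
}


def project_complies(project: dict) -> bool:
    """Return True only if a proposed research project satisfies every
    licence condition that the data donors have set."""
    return all(project.get(condition, False) == required
               for condition, required in SALUS_LICENCE_CONDITIONS.items())


# Example: a commercial project is refused access under the licence.
proposal = {
    "health_related_purpose": True,
    "non_commercial_entity": False,   # a commercial applicant
    "results_shared_at_no_cost": True,
    "data_anonymised": True,
    "donor_can_withdraw_any_time": True,
}
print(project_complies(proposal))  # False
```

In practice, a cooperative would combine a check like this with the human governance processes described above; the sketch only shows how donor-set conditions could gate individual access requests.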

Complexities and critiques

Models such as data cooperatives can provide security for beneficiaries and data subjects that supports data sharing for beneficial purposes. However, because they expect or engender greater levels of active participation in managing and shaping a data-sharing regime or process, they risk excluding those who may wish to participate actively but find the costs onerous. For example, potential data donors may feel they lack the time, knowledge of the regulatory landscape, or financial and social capital.

This means that these approaches are rarely appropriate for all beneficiaries and data subjects, and so cannot claim to be fully inclusive or representative. Data cooperatives can, therefore, struggle to generate the scale and level of participation from data subjects that they might hope for, but they can nevertheless help broaden out the range of intelligence informing data governance.

Another critique relates to the financial sustainability of these approaches and models. There is limited financing available that would absorb the project and start-up costs associated with data cooperative models, especially where they need to meet regulatory requirements and constraints. This can present a financial and a regulatory burden that is a barrier to setting up a data cooperative. In contrast to shareholder-controlled companies, cooperatives cannot give equity to investors, as they are owned by, and give return on investment to, their respective members. Therefore, cooperatives (and governance models similar to cooperatives) require significant, external financial support from governments, foundations and research grants if they are to succeed.

Who to involve when developing participatory mechanisms

What all these different methods and approaches have in common is that they seek to involve beneficiaries in the use, development and design of data and AI, and that they involve some element of sharing power with those beneficiaries in the way that data-driven systems and governance frameworks themselves are designed.

When designing a participatory process to meet a specific objective, the choice about who to involve matters as much as which types of involvement or participation mechanisms are used. These choices will be dependent on context, but the range of actors who can be defined as beneficiaries is broad, and extends beyond those designing and deploying data-driven systems and governance frameworks, to those affected by them.

When participatory mechanisms are introduced, key questions developers of data-driven systems and governance frameworks should answer about who to involve (who their beneficiaries are) will be:

  1. Who has a stake in the outcomes that emerge?
  2. Who is most likely to be directly affected and impacted, either benefiting or being adversely impacted?
  3. Who is most likely to be overrepresented and/or underrepresented in the data?

These three key questions are informed by a recognition that the data steward’s responsibility is not just to manage data effectively, but also to recognise that data often relates, directly or indirectly, to people (beneficiaries). As well as recognising the rights of data subjects, and the potential benefits and harms of data use for beneficiaries, data stewards need to understand that when data omits or excludes people, it can have harmful consequences – for instance, by discriminating against some people or underrepresenting their interests and concerns. This means that participation can be as much about including or involving those who do not have a direct relationship with the data as assembled, as those who do.

Benefits of effective participation for the design, development and use of data-driven systems

Early and continuous engagement with beneficiaries and those most likely to be affected by the use and deployment of data-driven systems can help inform decisions about the design of those systems in ways that create better outcomes for those designing, developing and deploying data-driven systems and governance frameworks, as well as for people and society.[76]

Beneficial outcomes for designers, developers and deployers of data-driven systems and governance frameworks

Because participatory approaches encourage interaction with a range of views, perspectives, norms and lived experiences that might not otherwise be encountered in the development and design of data-driven systems and governance frameworks, they reduce the risks of groupthink, unconscious bias and misalignment between intended and actual outcomes. Benefits of effective participation for designers and developers include:[77]

  • Better understanding of ethical and practical concerns and issues: Enabling developers and designers to better understand ethical concerns and challenges from the public, and better understand public perspectives, values, trade-offs and choices.[78] Participatory data stewardship can also inform and affect the quality of data embedded within a system.
  • More considered design of systems and frameworks informed by diverse thinking: Improved decision-making about development and design that reflects and has taken account of a diversity of experiences and perspectives.[79]
  • The anticipation and management of risk in the development and design of systems or frameworks: The ability to manage risk in development and design, particularly those systems or frameworks that are complex and controversial because of sensitive data, circumventing and addressing the risk of ‘techlash’. Participation can also reduce the long-term costs for technology developers and designers.[80]
  • Higher-quality data-governance frameworks: Across corporate data governance, there is often a lack of documentation or knowledge about the context in which a particular dataset was collected – for example, what level of consent was obtained or what uses the data subject intended. When data generation is opaque and invisible to the data subject, its legitimacy as a data source is frequently either assumed through terms and conditions, or ignored entirely. This can lead to downstream violations of the contextual integrity of that data. Embedding participatory data stewardship involves shifting institutional incentives in corporate practice to prioritise data quality over data quantity (a ‘less is more’ ethos), with the benefit of clearer, higher-quality and fit-for-purpose datasets (see the sketch after this list).
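
The documentation gap described in the final point above is often tackled in practice with ‘datasheet’-style records attached to a dataset. The following minimal Python sketch shows the kind of provenance metadata a participatory process could ask a data steward to maintain; the fields and the crude fitness check are hypothetical illustrations, not a prescribed standard.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DatasetProvenance:
    """A minimal, hypothetical 'datasheet' recording the context of collection
    that participatory data stewardship would make explicit."""
    dataset_name: str
    collection_context: str          # how and where the data was gathered
    consent_basis: str               # e.g. "opt-in research consent", "terms of service"
    intended_uses: List[str] = field(default_factory=list)
    prohibited_uses: List[str] = field(default_factory=list)
    groups_underrepresented: List[str] = field(default_factory=list)


def fit_for_purpose(sheet: DatasetProvenance, proposed_use: str) -> bool:
    """A crude check: a proposed use must be explicitly intended and not prohibited."""
    return proposed_use in sheet.intended_uses and proposed_use not in sheet.prohibited_uses


sheet = DatasetProvenance(
    dataset_name="wearable-activity-donations",
    collection_context="voluntary donation via a research app",
    consent_basis="opt-in research consent",
    intended_uses=["non-commercial health research"],
    prohibited_uses=["insurance pricing"],
)
print(fit_for_purpose(sheet, "insurance pricing"))  # False
```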

Beneficial outcomes for trustworthy data governance

Participatory approaches to data governance also play a central role in shaping technology’s legitimacy, where legitimacy is defined as the ‘reservoir of support that allows governments to deliver positive outcomes for people’.[81] In the case of data, the principle of legitimacy extends beyond public bodies such as governments and regulators to those designing, developing and implementing data-governance frameworks – this is developers’ and deployers’ ‘social license to build’. All of these public bodies, companies and individuals can use data to deliver positive outcomes for people, but only with public support.

We therefore redefine legitimacy in the context of data as the broad base of public support that allows companies, designers, public servants and others to use data, and to design and develop data-governance frameworks. The absence of this legitimacy has been illustrated by a series of public controversies since 2018. Examples include the Cambridge Analytica/Facebook scandal, in which political microtargeting was misused,[82] and the scandal that emerged when the NHS shared sensitive patient data with the private research company DeepMind, with limited oversight or constraints.[83] More recently, governments’ efforts to implement and deploy digital contact tracing across the world have also met with considerable public debate and scrutiny.[84]

Participatory approaches to data governance can help engender some societally beneficial outcomes, but we do not propose that they replace legal and rights-based approaches, such as those embedded in the General Data Protection Regulation (GDPR) or across broader data protection and governance frameworks, which are analysed in detail in our companion report Exploring legal mechanisms for data stewardship.[85] Rather, they work hand in hand with rights-based and legal approaches to ensure public confidence and trust in appropriate uses of data. Some participatory approaches can, and will, help shape the future of rights-based approaches, governance and regulation.

Outcomes from different approaches to participatory data governance can include:

  • enabling people and society to better understand and appreciate the wider public benefit that comes from ‘donating their data’ in a given context
  • enabling designers of data-governance frameworks to understand the boundaries of data use and management: what people feel is acceptable and unacceptable for the access, use and sharing of their data in a given context
  • enabling the use of collective intelligence shared by data stewards and beneficiaries, to improve the quality of data protection and data governance when gathering and using personal or sensitive data, or data about marginalised groups
  • strengthening accountability, by opening opaque data uses up to democratic, civic scrutiny and constructive criticism (particularly in contexts where the use of data is likely to have a significant impact on individuals or diverse groups, or where there is a risk of bias and self-selection in the use of datasets)
  • building public confidence in the way data is gathered and used, particularly where third parties access data, by ensuring people are able to oversee and hold to account decisions about rights to access, share and control that data
  • tightening the feedback loops that exist between those who are stewards and fiduciaries of data, and those to whom the data relates (either directly, or indirectly).

These are all potentially valuable outcomes of embedding participatory data stewardship mechanisms. The mechanisms that enable them will differ – to be impactful, they are likely to vary depending on the context, the use of the data, the type of data being governed, and those most likely, now and in future, to be affected by its use and governance.

Conclusion

This report highlights the range and plurality of participatory data stewardship mechanisms through which people and society are able to influence and shape data governance processes, and the potential benefits for beneficiaries, public bodies and those designing and deploying data-driven systems and governance frameworks.

By examining real-world case studies, it has set out some of the ways in which those managing, controlling and governing the use of data can avoid ‘going through the empty ritual of participation’ described by Arnstein – and shift away from practices that are harmful, disempowering or coercive of people, towards practices that promote a collaborative and co-creative approach to working together.

The framework for participation demonstrates the multiple and overlapping goals participation can have, in the context of the design, development and deployment of data-driven systems and data-governance frameworks. The range of different mechanisms that serve different goals and purposes can complement each other and are mutually reinforcing.

Effective participation must start with the goal of informing people about their data – with transparency (the right to be informed and the right to take meaningful action based on what one knows), where the data steward shares information about what is happening to people’s data in ways that enable them to take action.

For participation to become meaningful, it must extend beyond transparency towards mechanisms that aim to understand, interact with, be advised by and then respond to people’s views (consult and involve) in decision-making about data – shifting power dynamics in the process.

The end goal of participation is to realise a set of conditions where people have the potential to be empowered, so that their perspectives form part of the design of data governance, are responded to, and in turn build confidence and capacity for beneficiaries to continue to participate in the data governance process.

As this report makes apparent, there is considerable creative potential for developers, designers, beneficiaries and users of data-driven systems and data-governance frameworks to work together in shaping improved mechanisms for stewardship. The aim is to create a virtuous cycle, where participation effects substantial change in practice, and public confidence and trust generated by participatory approaches effect improved outcomes in the use of data for people and society.

Methodology

This report and its framework have been developed and informed by extensive mixed-methods, qualitative and case study-based research, encompassing:

  • a case study-based analysis of over one hundred examples of participatory data sharing[86]
  • a literature review of data metaphors and narratives
  • a desk-based review and synthesis of grey and academic literature on existing models of public engagement in technology and data
  • the deliberations of the Ada Lovelace Institute and Royal Society’s Legal Mechanisms for Data Stewardship working group.[87]

Appendix

Sherry Arnstein’s ‘ladder of citizen participation’ sets out how institutions might ascend the metaphorical rungs in involving people in critical decisions that affect their lives. The lower rungs of the ladder represent nudging or manipulating people to act in a particular way, or merely attempting to tell people about a decision; the higher rungs grant greater levels of power and involvement in shaping those decisions.

Figure 6: Arnstein’s ‘Ladder of Citizen Participation’

 

Figure 6: The participation spectrum
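
For readers who prefer a structured rendering, the participation spectrum – from informing people about data use, through consulting, involving and collaborating with them, to empowering them – can be expressed as an ordered scale. The toy Python sketch below is our own illustration of those five levels, not an artefact of the framework graphics themselves.

```python
from enum import IntEnum


class ParticipationLevel(IntEnum):
    """The report's participation spectrum, from informing people about data use
    through to empowering them to decide (toy rendering, our own illustration)."""
    INFORM = 1       # transparency: people are told what happens to their data
    CONSULT = 2      # people's views are sought and considered
    INVOLVE = 3      # people's lived experience shapes design decisions
    COLLABORATE = 4  # people negotiate trade-offs with power-holders
    EMPOWER = 5      # beneficiaries actively make data-governance decisions


def shares_power(level: ParticipationLevel) -> bool:
    """Higher rungs of the ladder involve some transfer of decision-making power."""
    return level >= ParticipationLevel.COLLABORATE


print(shares_power(ParticipationLevel.CONSULT))  # False
print(shares_power(ParticipationLevel.EMPOWER))  # True
```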

Acknowledgements

We would like to thank the following colleagues for taking time to review a draft of this paper, and offering their expertise and feedback:

  • Alix Dunn, Computer Says Maybe and Ada Lovelace Institute Board member
  • Dr Allison Gardner, Keele University
  • Andrew Strait, Ada Lovelace Institute
  • Astha Kapoor, The Aapti Institute
  • Carly Kind, Ada Lovelace Institute
  • Professor Diane Coyle, Bennett Institute for Public Policy, Cambridge University
  • Hetan Shah, British Academy and Ada Lovelace Institute Board member
  • Jack Hardinges, Open Data Institute
  • Dr Jeni Tennison, Open Data Institute
  • Kasia Odrozek, Mozilla Foundation
  • Katie Taylor, Wellcome Trust
  • Professor Lina Dencik, The Cardiff Data Justice Lab
  • Lisa Murphy, NHSx
  • Dr Mahlet Zimeta, Open Data Institute
  • Miranda Marcus, Wellcome Trust
  • Dr Miranda Wolpert, Wellcome Trust.

We are also grateful to techUK and to the Cardiff Data Justice Lab, who both organised workshops to facilitate input into this research, and for attendance at those workshops by a range of technology companies, policymakers, civil-society organisations and academics.


This report was lead authored by Reema Patel, with substantive contributions from Aidan Peppin, Valentina Pavel, Jenny Brennan, Imogen Parker and Cansu Safak.

Preferred citation: Ada Lovelace Institute. (2021). Participatory data stewardship. Available at: https://www.adalovelaceinstitute.org/report/participatory-data-stewardship/

References

[1] Triggle, N. (2014). ‘Care.data: How did it go so wrong?’ BBC News. 19 Feb. Available at: https://www.bbc.co.uk/news/health-26259101 [Accessed 6 Jul. 2021]

[2] Fruchter, N., Yuan, B. and Specter, M. (2018). ‘Facebook/Cambridge Analytica: Privacy lessons and a way forward’. Internet Policy Research Initiative at MIT. Available at: https://internetpolicy.mit.edu/blog-2018-fbcambridgeanalytica/.

[3] Centre for Data Ethics and Innovation. (2020). Independent report: Addressing trust in public sector data use. GOV.UK. Available at: https://www.gov.uk/government/publications/cdei-publishes-its-first-report-on-public-sector-data-sharing/addressing-trust-in-public-sector-data-use [Accessed 15 February 2021]

[4] Arnstein, S. (1969). ‘A Ladder of Citizen Participation’. Journal of the American Institute of Planners, 35(4), pp.216-224. Available at: https://www.tandfonline.com/doi/abs/10.1080/01944366908977225

[5] Patel, R. and Gibbon, K. (2018). ‘Why decisions about the economy need you’. The RSA. Available at: https://www.thersa.org/blog/2017/04/why-decisions-about-the-economy-need-you [Accessed 15 February 2021]

[6] Patel, R. and Peppin, A. (2020). ‘Exploring principles for data stewardship’. Ada Lovelace Institute. Available at: https://www.adalovelaceinstitute.org/project/exploring-principles-for-data-stewardship/ [Accessed 16 Feb. 2021]

[7] Kapoor, Astha and Whitt, Richard S. (2021). Nudging Towards Data Equity: The Role of Stewardship and Fiduciaries in the Digital Economy. February 22. Available at SSRN: https://ssrn.com/abstract=3791845 or http://dx.doi.org/10.2139/ssrn.3791845

[8] Ada Lovelace Institute. (2021). Exploring legal mechanisms for data stewardship. Available at: https://www.adalovelaceinstitute.org/report/legal-mechanisms-data-stewardship/

[9] Centre for Data Ethics and Innovation. (2020)

[10] Olavsrud, T. (2021). ‘Data governance: A best practices framework for managing data assets.’ CIO. https://www.cio.com/article/3521011/what-is-data-governance-a-best-practices-framework-for-managing-data-assets.html

[11] Kapoor, Astha and Whitt, Richard S. (2021). Nudging Towards Data Equity: The Role of Stewardship and Fiduciaries in the Digital Economy. 22 February. Available at SSRN: https://ssrn.com/abstract=3791845 or http://dx.doi.org/10.2139/ssrn.3791845

[12] The Information Commissioner’s Office defines the data subject as ‘the identified or identifiable living individual to whom personal data relates’. Available at: https://ico.org.uk/for-organisations/data-protection-fee/legal-definitions-fees/#subject

[13] The use of stewardship to describe the governance of common resources was foundational to the work of Nobel Prize-winning economist Elinor Ostrom, who developed design principles for collective governance. Though often focused on shared natural resources like pastures, forests or fisheries, applying Ostrom’s principles to data can help us to think through thorny challenges of doing good with data. See Ostrom, E. (2015). Governing the Commons. Cambridge: Cambridge University Press. The Ada Lovelace Institute has mapped Ostrom’s principles against real-world examples where data is stewarded for social causes, or on behalf of data subjects, see: https://docs.google.com/spreadsheets/d/1hAN8xMJuxobjARAWprZjtcZgq1lwOiFT7hf2UsiRBYU/edit#gid=1786878085

[14] See Ada’s post, Disambiguating data stewardship, for more on the development of this definition. Available at: https://www.adalovelaceinstitute.org/blog/disambiguating-data-stewardship/

[15] ‘Datafication’ is a term that describes the process of rendering human and social behaviour in quantifiable ways, in turn releasing it as a new form of economic and social value. See the Ada Lovelace Institute’s report The data will see you now for a detailed explanation of how this operates in relation to health data. Available at: https://www.adalovelaceinstitute.org/report/the-data-will-see-you-now/

[16] Office for Statistics Regulation. (2021). Ensuring statistical models command public confidence. Available at: https://osr.statisticsauthority.gov.uk/publication/ensuring-statistical-models-command-public-confidence/ [Accessed 19 Aug 2021]

[17] Arnstein, S. (1969). ‘A Ladder of Citizen Participation’. Journal of the American Institute of Planners, 35(4), pp.216-224. Available at: https://www.tandfonline.com/doi/abs/10.1080/01944366908977225

[18] Patel, R. and Gibbon, K. (2018). ‘Why decisions about the economy need you’. The RSA. Available at: https://www.thersa.org/blog/2017/04/why-decisions-about-the-economy-need-you [Accessed 15 February 2021]

[19] Wachter, S. (2021). A right to explanation. Alan Turing Institute. Available at: https://www.turing.ac.uk/research/impact-stories/a-right-to-explanation [Accessed 15 February 2021]

[20] See website from the Alan Turing Institute on Project ExplAIn. Leslie, D. (2019). Project ExplAIn. The Alan Turing Institute. Available at: https://www.turing.ac.uk/news/project-explain [Accessed 15 February 2021].

[21] Ruiz, J. (2018). ‘Machine learning and the right to explanation in GDPR’. Open Rights Group. Available at: https://www.openrightsgroup.org/blog/machine-learning-and-the-right-to-explanation-in-gdpr/ [Accessed 15 February 2021]

[22] ICO. (2020). What goes into an explanation? Available at: https://ico.org.uk/for-organisations/guide-to-data-protection/key-data-protection-themes/explaining-decisions-made-with-artificial-intelligence/part-1-the-basics-of-explaining-ai/what-goes-into-an-explanation/ [Accessed 5 Jul. 2021]

[23] Ananny M, Crawford K. (2018). Seeing without knowing: Limitations of the transparency ideal and its application to algorithmic accountability. New Media & Society. 20(3):973-989. doi:10.1177/1461444816676645

[24] Kaminski, M. (2020). ‘Understanding Transparency in Algorithmic Accountability’. Forthcoming in Cambridge Handbook of the Law of Algorithms, ed. Woodrow Barfield, Cambridge University Press. U of Colorado Law Legal Studies Research Paper No. 20-34. Available at SSRN: https://ssrn.com/abstract=3622657

[25] L. Stirton, M. Lodge. (2002). Transparency Mechanisms: Building Publicness into Public Services. Journal of Law and Society. https://doi.org/10.1111/1467-6478.00199

[26] See, for example the model cards from Google Cloud: https://modelcards.withgoogle.com/about

[27] Gebru, T., Morgenstern, J., Vecchione, B., Vaughan, J.W., Wallach, H., Daumeé III, Hal and Crawford, K. (2018). Datasheets for Datasets. Available at: https://arxiv.org/abs/1803.09010

[28] Dimitrova, A., (2020). ‘Helsinki and Amsterdam with first ever AI registries’ TheMAYOR.eu Available at: https://www.themayor.eu/en/a/view/helsinki-and-amsterdam-with-first-ever-ai-registries-5982 [Accessed 15 Feb. 2021]

[29] Dimitrova, A. (2020)

[30] Safak, C. and Parker, I. (2020). ‘Meaningful transparency and in(visible) algorithms’. Ada Lovelace Institute. Available at: https://www.adalovelaceinstitute.org/blog/meaningful-transparency-and-invisible-algorithms

[31] Edwards, L. and Veale, M. (2018). ‘Enslaving the Algorithm: From a “Right to an Explanation” to a “Right to Better Decisions”?’. IEEE Security & Privacy, 16(3), pp.46–54. Available at: https://arxiv.org/pdf/1803.07540.pdf

[32] Coyle, D., & Weller, A. (2020). ‘“Explaining” machine learning reveals policy challenges’. Science, 368 (6498), pp.1433-1434. Available at: https://doi.org/10.1126/science.aba9647

[33] Van den Boomen, M. (2014). Transcoding the digital: How metaphors matter in new media. Instituut voor Netwerkcultuur. Available at: https://dspace.library.uu.nl/handle/1874/289883

[34] Puschmann, C. and Burgess, J. (2014). ‘Big Data, Big Questions | Metaphors of Big Data’, International Journal of Communication, 8(0), p. 20. Available at: https://ijoc.org/index.php/ijoc/article/view/2169

[35] Rajan, A. (2017). ‘Data is not the new oil’, BBC News. Available at: https://www.bbc.com/news/entertainment-arts-41559076; Lupton, D. (2013). ‘Swimming or drowning in the data ocean? Thoughts on the metaphors of big data’, This Sociological Life. Available at: https://simplysociology.wordpress.com/2013/10/29/swimming-or-drowning-in-the-data-ocean-thoughts-on-the-metaphors-of-big-data/; Doctorow, C. (2008). ‘Cory Doctorow: why personal data is like nuclear waste.’ The Guardian. Available at: http://www.theguardian.com/technology/2008/jan/15/data.security (All accessed: 15 February 2021)

[36] Coyle D., Diepeveen S., Wdowin J., et al. (2020). The value of data: Policy Implications. Bennett Institute for Public Policy Cambridge, the Open Data Institute and Nuffield Foundation. Available at: https://www.bennettinstitute.cam.ac.uk/media/uploads/files/Value_of_data_Policy_Implications_Report_26_Feb_ok4noWn.pdf

[37] Lupton D. (2013). ‘Swimming or drowning in the data ocean? Thoughts on the metaphors of big data’, This Sociological Life, 29 October. Available at: https://simplysociology.wordpress.com/2013/10/29/swimming-or-drowning-in-the-data-ocean-thoughts-on-the-metaphors-of-big-data/ (Accessed: 15 February 2021)

[38] Kerssens, N. (2019). ‘De-Agentializing Data Practices: The Shifting Power of Metaphor in 1990s Discourses on Data Mining’, Journal of Cultural Analytics, 1(1), p. 11049. Available at: https://dspace.library.uu.nl/handle/1874/387149

[39] Tisne M. (2019). ‘Data isn’t the new oil, it’s the new CO2’. Luminate Group. Available at: https://luminategroup.com/posts/blog/data-isnt-the-new-oil-its-the-new-co2 (Accessed: 15 February 2021).

[40] Arnstein, S. (1969). ‘A Ladder of Citizen Participation’. Journal of the American Institute of Planners, 35(4), pp.216-224. Available at: https://www.tandfonline.com/doi/abs/10.1080/01944366908977225

[41] The Consultation Institute. (2018). The Gunning Principles – Implications. Available at: https://www.consultationinstitute.org/the-gunning-principles-implications/ [Accessed 15 Feb. 2021]

[42] Patel, R. and Peppin, A. (2020). ‘Making visible the invisible: what public engagement uncovers about privilege and power in data systems’. Ada Lovelace Institute. Available at: https://www.adalovelaceinstitute.org/blog/public-engagement-uncovers-privilege-and-power-in-data-systems/ [Accessed 5 Jul. 2021]

[43] Wolpert, M. (2020). ‘Global mental health databank pilot launches to discover what helps whom and why in youth anxiety and depression’. LinkedIn. Available at: https://www.linkedin.com/pulse/global-mental-health-databank-pilot-launches-discover-wolpert/?trackingId=uh3v1y0AQ3WGMK1Ssvq9JA%3D%3D [Accessed 5 Jul. 2021]

[44] Carroll, S.R., Hudson, M., Holbrook, J., Materechera, S. and Anderson, J. (2020). ‘Working with the CARE principles: operationalising Indigenous data governance.’ Ada Lovelace Institute. Available at: https://www.adalovelaceinstitute.org/blog/care-principles-operationalising-indigenous-data-governance/ [Accessed 5 Jul. 2021]

[45] Carroll, S.R., Garba, I., Figueroa-Rodríguez, O.L., Holbrook, J., Lovett, R., Materechera, S., Parsons, M., Raseroka, K., Rodriguez-Lonebear, D., Rowe, R., Sara, R., Walker, J.D., Anderson, J. and Hudson, M., (2020). The CARE Principles for Indigenous Data Governance. Data Science Journal, 19(1), p.43. Available at: https://datascience.codata.org/articles/10.5334/dsj-2020-043/

[46] University of Manchester Digital Futures. (2015). The CityVerve Project. Available at: http://www.digitalfutures.manchester.ac.uk/case-studies/the-cityverve-project/ [Accessed 19 Aug. 2021]

[47] Hawkins, A.J. (2020). ‘Alphabet’s Sidewalk Labs shuts down Toronto smart city project’. The Verge. Available at: https://www.theverge.com/2020/5/7/21250594/alphabet-sidewalk-labs-toronto-quayside-shutting-down

[48] Solomon, S., & Abelson, J. (2012). ‘Why and when should we use public deliberation?’ The Hastings Center report, 42(2), p.17–20. Available at: https://doi.org/10.1002/hast.27

[49] Understanding Patient Data and Ada Lovelace Institute. (2020). Foundations of Fairness: Where Next For NHS Health Data Partnerships? Available at: https://understandingpatientdata.org.uk/sites/default/files/2020-03/Foundations%20of%20Fairness%20-%20Summary%20and%20Analysis.pdf

[50] Understanding Patient Data and Ada Lovelace Institute. (2020)

[51] TheGovLab. (2020). The Data Assembly. Available at: https://thedataassembly.org/ [Accessed 5 Jul. 2021]

[52] Balaram, B. (2017). ‘The role of citizens in developing ethical AI’. The RSA. Available at: https://www.thersa.org/blog/2017/10/the-role-of-citizens-in-developing-ethical-ai [Accessed 5 Jul. 2021]

[53] ICO. (2019). Project explAIn Interim report. Available at: https://ico.org.uk/media/2615039/project-explain-20190603.pdf [Accessed 5 Jul. 2021]

[54] The Ada Lovelace Institute. (2021). The Citizens’ Biometrics Council. Available at: https://www.adalovelaceinstitute.org/project/citizens-biometrics-council/ [Accessed 5 Jul. 2021]

[55] Peppin, A. (2021). ‘How charting public perspectives can show the way to unlocking the benefits of location data.’ Ada Lovelace Institute. Available at: https://www.adalovelaceinstitute.org/blog/public-perspectives-unlocking-benefits-location-data/ [Accessed 5 Jul. 2021]

[56] OECD. (2020). Innovative Citizen Participation and New Democratic Institutions: Catching the Deliberative Wave. Available at: innovative-citizen-participation-new-democratic-institutions-catching-the-deliberative-wave-highlights.pdf

[57] Carugati, F. (2020). ‘A Council of Citizens Should Regulate Algorithms’. Wired. Available at: https://www.wired.com/story/opinion-a-council-of-citizens-should-regulate-algorithms/ [Accessed 15 Feb 2021]

[58] Jørgensen, M.S. (2020). ‘A pioneer in trouble: Danish Board of Technology are facing problems’. EASST. Available at: https://easst.net/easst-review/easst-review-volume-311-march-2012/a-pioneer-in-trouble-danish-board-of-technology-are-facing-problems/ [Accessed 15 Feb 2021]

[59] Grundahl, J. (1995). ‘The Danish consensus conference model.’ In Joss, S.& Durant, J. (eds) Public Participation in Science: The Role of Consensus Conferences in Europe. Science Museum: London. Available at: https://people.ucalgary.ca/~pubconf/Education/grundahl.html

[60] GOV.UK, (2021). The Use of Public Engagement for Technological Innovation. Available at: https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/955880/use-of-public-engagement-for-technological-innovation.pdf

[61] Patel, R. (2015). ‘Digital Public Engagement – Lessons from the Sounding Board’. Involve. Available at: https://www.involve.org.uk/resources/blog/opinion/digital-public-engagement-lessons-sounding-board [Accessed 5 Jul. 2021]

[62] Ada Lovelace Institute. (2020). Confidence in a crisis? Building public trust in a contact tracing app. Available at: https://www.adalovelaceinstitute.org/report/confidence-in-crisis-building-public-trust-contact-tracing-app/ [Accessed 5 Jul. 2021]

[63] The RSA. (2017). The Citizens’ Economic Council ‘On The Road’. Available at: https://www.thersa.org/projects/archive/economy/citizens-economic-council/the-council2

[64] The RSA. (2017)

[65] Delacroix, S., and Lawrence, N. D. (2019). ‘Bottom-up Data Trusts: disturbing the ‘one size fits all’ approach to data governance’, International Data Privacy Law, Volume 9, Issue 4, pp. 236–252. Available at: https://doi.org/10.1093/idpl/ipz014

[66] Taylor, K. (2020). ‘Developing a mental health databank’. Medium. Available at: https://medium.com/wellcome-digital/developing-a-mental-health-databank-99c25a96001d [Accessed 15 Feb. 2021]

[67] Sage Bionetworks. (2020). Collaborating with youth is key to studying mental-health management. Available at: https://www.eurekalert.org/pub_releases/2020-11/sb-cwy111720.php [Accessed 5 Jul. 2021]

[68] Lansdell, S. and Bunting, M. (2019). Designing decision making processes for data trusts: lessons from three pilots. The Involve Foundation (Involve). Available at: http://theodi.org/wp-content/uploads/2019/04/General-decision-making-report-Apr-19.pdf

[69] Bietz, M., Patrick, K. and Bloss, C. (2019). ‘Data Donation as a Model for Citizen Science Health Research’. Citizen Science: Theory and Practice, 4(1), p.6. Available at: http://doi.org/10.5334/cstp.178

[70] UK BioBank. (2021). Explore your participation in UK Biobank. Available at: https://www.ukbiobank.ac.uk/explore-your-participation [Accessed 15 Feb. 2021]

[71] UK BioBank. (2021)

[72] Kim P, Milliken EL. (2019). ‘Minority Participation in Biobanks: An Essential Key to Progress: Methods and Protocols’. Methods in Molecular Biology. 1897: pp.43-50. Available at: https://www.researchgate.net/publication/329590514_Minority_Participation_in_Biobanks_An_Essential_Key_to_Progress_Methods_and_Protocols

[73] Groenwold, R., & Dekkers, O. (2020). ‘Missing data: the impact of what is not there’. European Journal of Endocrinology, 183(4), E7-E9. Available at: https://eje.bioscientifica.com/view/journals/eje/183/4/EJE-20-0732.xml

[74] Machirori, M. and Patel, R. (2021). ‘Turning distrust in data sharing into “engage, deliberate, decide”.’ Ada Lovelace Institute. Available at: https://www.adalovelaceinstitute.org/blog/distrust-data-sharing-engage-deliberate-decide/ [Accessed 5 Jul. 2021]

[75] For more on data cooperatives see chapter 2 in Exploring legal mechanisms for data stewardship (Ada Lovelace Institute, 2021)

[76] Data Collaboratives. (n.d.). Salus Coop. Available at: https://datacollaboratives.org/cases/salus-coop.html [Accessed 15 Feb. 2021]

[77] ESRC website. (n.d). Why public engagement is important. Available at: https://esrc.ukri.org/public-engagement/public-engagement-guidance/why-public-engagement-is-important/ [Accessed 15 February 2021]

[78] There are various definitions of values-led and instrumental rationales for public engagement including: Stirling, A. (2012). ‘Opening Up the Politics of Knowledge and Power in Bioscience’. PLoS Biology 10(1). Available at: https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.1001233

[79] The Involve Foundation (‘Involve’), (2019). How to stimulate effective public engagement on the ethics of artificial intelligence. Available at: https://www.involve.org.uk/sites/default/files/field/attachemnt/How%20to%20stimulate%20effective%20public%20debate%20on%20the%20ethics%20of%20artificial%20intelligence%20.pdf [Accessed 15 February 2021]

[80] Page, S. (2011). Diversity and complexity. Princeton, N.J: Princeton University Press.

[81] Clarke, R. (2015). Valuing Dialogue: Economic Benefits and Social Impacts. London: Sciencewise Expert Resource Centre. Available at: https://sciencewise.org.uk/wp-content/uploads/2018/12/Valuing-dialogue-2015.pdf [Accessed 15 February 2021]

[82] Centre for Public Impact. (2018). Finding Legitimacy. Available at: https://www.centreforpublicimpact.org/assets/documents/Finding-a-more-Human-Government.pdf [Accessed 15 February 2021]

[83] Fruchter, N., Yuan, B. and Specter, M. (2018). Facebook/Cambridge Analytica: Privacy lessons and a way forward. Internet Policy Research Initiative at MIT. Available at: https://internetpolicy.mit.edu/blog-2018-fb-cambridgeanalytica/

[84] ICO. (2018). Royal Free – Google DeepMind trial failed to comply with data protection law. Available at: https://ico.org.uk/about-the-ico/news-and-events/news-and-blogs/2017/07/royal-free-google-deepmind-trial-failed-to-comply-with-data

[85] Ada Lovelace Institute. (2020). Exit through the App Store? Rapid evidence review. Available at: https://www.adalovelaceinstitute.org/evidence-review/covid-19-rapid-evidence-review-exit-through-the-app-store/ [Accessed 5 Jul. 2021]

[86] Ada Lovelace Institute. (2020)

[87] Patel, R., Gibbon, K. and Peppin, A. (2020). Exploring principles for data stewardship. Ada Lovelace Institute. Available at: https://www.adalovelaceinstitute.org/project/exploring-principles-for-data-stewardship/ [Accessed 16 Feb. 2021]

[88] The Ada Lovelace Institute. (2021). Exploring legal mechanisms for data stewardship. Available at: https://www.adalovelaceinstitute.org/report/legal-mechanisms-data-stewardship


Executive summary

What can foundation model oversight learn from the US Food and Drug Administration (FDA)?

In the last year, policymakers around the world have grappled with the challenge of how to regulate and govern foundation models – artificial intelligence (AI) models like OpenAI’s GPT-4 that are capable of a range of general tasks such as text synthesis, image manipulation and audio generation. Policymakers, civil society organisations and industry practitioners have expressed concerns about the reliability of foundation models, the risk of misuse of their powerful capabilities and the systemic risks they could pose as more and more people begin to use them in their daily lives.

Many of these risks to people and society – such as the potential for powerful and widely used AI systems to discriminate against particular demographics, or to spread misinformation more widely and easily – are not new, but foundation models have some novel features that could greatly amplify the potential harms.

These features include their generality and ability to complete a range of tasks; the fact that they are ‘built on’ for a wide range of downstream applications, creating a risk that a single point of failure could lead to networked catastrophic consequences; fast and (sometimes) unpredictable jumps in their capabilities and behaviour, which make it harder to foresee harm; and their wide-scale accessibility, which puts powerful AI capabilities in the hands of a much larger number of people.

Both the UK and US governments have released voluntary commitments for developers of these models, and the EU’s AI Act includes some stricter requirements for models before they can be sold on the market. The US Executive Order on AI also places obligations on some developers of foundation models to test their systems for certain risks.[1] [2]

Experts agree that foundation models need additional regulatory oversight due to their novelty, complexity and lack of clear safety standards. Oversight needs to enable learning about risks, and to ensure iterative updates to safety assessments and standards.

Notwithstanding the unique features of foundation models, this is not the first time that regulators have grappled with how to regulate complex, novel technologies that raise a variety of sociotechnical risks.[3] One area where this challenge already exists is in life sciences. Drug and medical device regulators have a long history of applying a rigorous oversight process to novel, groundbreaking and experimental technologies that – alongside their possible benefits – could present potentially severe consequences for people and society.

This paper draws on interviews with 20 experts and a literature review to examine the suitability and applicability of the US Food and Drug Administration (FDA) oversight model to foundation models. It explores the similarities and differences between medical devices and foundation models, the limitations of the FDA model as applied to medical devices, and how the FDA’s governance framework could be applied to the governance of foundation models.

This paper highlights that foundation models may pose risks to the public that are similar to – or even greater than – Class III medical devices (the FDA’s highest risk category). To begin to address the mitigation of these risks through the lens of the FDA model, the paper lays out general principles to strengthen oversight and evaluation of the most capable foundation models, along with specific recommendations for each layer in the supply chain.

This report does not address questions of international governance implications, the political economy of the FDA or regulating AI in medicine specifically. Rather, this paper seeks to answer a simple question: when designing the regulation of complex AI systems, what lessons and approaches can regulators draw on from medical device regulation?

A note on terminology

Regulation refers to the legally binding rules that govern the industry, setting the standards, requirements and guidelines that must be complied with.

Oversight refers to the processes of monitoring and enforcing compliance with regulations, for example through audits, reporting requirements or investigations.

What is FDA oversight?

With more than one hundred years’ history, a culture of continuous learning, and increasing authority, the FDA is a long-established regulator, with FDA-regulated products accounting for about 20 cents of every dollar spent by US consumers.

The FDA regulates drugs and medical devices by assigning them a specific risk level corresponding to how extensive subsequent evaluations, inspections and monitoring will be at different stages of development and deployment. The more risky and more novel a product, the more tests, evaluation processes and monitoring it will undergo.

The FDA does this by providing guidance and setting requirements for drug and device developers to follow, including regulatory approval of any protocols the developer will use for testing, and evaluating the safety and efficacy of the product.

The FDA has extensive auditing powers, with the ability to inspect drug companies’ data, processes and systems at will. It also requires companies to report incidents, failures and adverse impacts to a central registry. There are substantial fines for failing to follow appropriate regulatory guidance, and the FDA has a history of enforcing these sanctions.

Core risk-reducing aspects of FDA oversight

  • Risk- and novelty-driven oversight: The riskier and more novel a product, the more tests, evaluation processes and monitoring there will be.
  • Continuous, direct engagement with developers from development through to market: Developers must undergo a rigorous testing process through a protocol agreed with the FDA.
  • Wide-ranging information access: The FDA has statutory powers to access comprehensive information, for example, clinical trial results and patient data.
  • Burden of proof on developers: Developers must demonstrate the efficacy and safety of a drug or medical device at various ‘approval gates’ before the product can be tested on humans or be sold on a market.
  • Balancing innovation with efficacy and safety: This builds acceptance for the FDA’s regulatory authority.

How suitable is FDA-style oversight for foundation models?

Our findings show that foundation models are at least as complex as and more novel than FDA Class III medical devices (the highest risk category), and that the risks they pose are potentially just as severe.[4][5][6] Indeed, the fact that these models are deployed across the whole economy, interacting with millions of people, means that they are likely to pose systemic risks far beyond those of Class III medical devices.[7] However, the exact risks of these models are so far not fully clear. Risk mitigation measures are uncertain and risk modelling is poor or non-existent.

The regulation of Class III medical devices offers policymakers valuable insight into how they might regulate foundation models, but it is also important that they are aware of the limitations.

Limitations of FDA-style oversight for foundation models

  • High cost of compliance: A high cost of compliance could limit the number of developers, which may benefit existing large companies. Policymakers may need to consider less restrictive requirements for smaller companies that have fewer users, coupled with support for such companies in compliance and via streamlined regulatory pathways.
  • Limited range of risks assessed: The FDA model may not be able to fully address the systemic risks and the risks of unexpected capabilities associated with foundation models. Medical devices are not general purpose, and the FDA model therefore largely assesses efficacy and safety in narrow contexts. Policymakers may need to create new, exploratory methods for assessing some types of risk throughout the foundation model supply chain, which may require increased post-market monitoring obligations.
  • Overreliance on industry: Regulatory agencies like the FDA sometimes need industry expertise, especially in novel areas where clear benchmarks have not yet been developed and knowledge is concentrated in industry. Foundation models present a similar challenge. This could raise concerns around regulatory capture and conflicts of interest. An ecosystem of independent academic and governmental experts needs to be built up to support balanced, well-informed oversight of foundation models, with clear mechanisms for those impacted by AI technologies to contribute. This could be at the design and development stage, eliciting feedback from pre-market ‘sandboxing’, or through market approval processes (under the FDA regime, patient representatives have a say in this process). At any step in the process, consideration should be given to who is involved (this could range from a representative panel to a jury of members of the public), the depth of engagement (from public consultations through to partnership decision-making), and methods (for example, from consultative exercises such as focus groups, to panels and juries for deeper engagement).

General principles for AI regulators

To strengthen oversight and evaluations of the most capable foundation models (for example, OpenAI’s GPT-4), which currently lag behind FDA oversight in aspects of risk-reducing external scrutiny:

  1. Establish continuous, risk-based evaluations and audits throughout the foundation model supply chain.
  2. Empower regulatory agencies to evaluate critical safety evidence directly, supported by a third-party ecosystem: third-party evaluations have consistently proven higher quality than self- or second-party evaluations across industries.
  3. Ensure independence of regulators and external evaluators, through mandatory industry fees and a sufficient budget for regulators that contract third parties. While existing sector-specific regulators, for example, the Consumer Financial Protection Bureau (CFPB) in the USA, may review downstream AI applications, there might be a need for an upstream regulator of foundation models themselves. The level of funding for such a regulator would need to be similar to that of other safety-critical domains, such as medicine.
  4. Enable structured access to foundation models and adjacent components for evaluators and civil society. This will help ensure the technology is designed and deployed in a manner that meets the needs of the people who are impacted by its use, and enable accountability mechanisms if it is not.
  5. Enforce a foundation model pre-approval process, shifting the burden of proof to developers.

Recommendations for AI regulators, developers and deployers

Data and compute layers oversight

  1. Regulators should compel pre-notification of, and information-sharing on, large training runs.
  2. Regulators should compel mandatory model and dataset documentation and disclosure for the pre-training and fine-tuning of foundation models,[8] [9] [10] including a capabilities evaluation and risk assessment within the model card for the (pre-)training stage and post-market (a minimal documentation sketch follows below).
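
To make the shape of such a disclosure concrete, the sketch below outlines, as Python dataclasses, the kind of fields a regulator might require for a large training run and its datasets. The structure and every field name (for example, TrainingRunDisclosure or estimated_training_compute_flops) are hypothetical illustrations of the idea, not an existing schema or regulatory requirement.

```python
# Hypothetical sketch of a machine-readable training-run disclosure.
# All class and field names are illustrative, not an established standard.
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class DatasetDocumentation:
    name: str
    sources: list[str]                  # provenance, e.g. web crawl, licensed corpora
    personal_data_present: bool         # whether personal data was identified
    known_gaps: list[str] = field(default_factory=list)  # known representation gaps


@dataclass
class TrainingRunDisclosure:
    model_name: str
    developer: str
    estimated_training_compute_flops: float
    datasets: list[DatasetDocumentation]
    capability_evaluations: dict[str, float]  # benchmark or red-team scores
    risk_assessment_summary: str              # narrative risk assessment for the model card
    post_market_monitoring_plan: str          # how incidents will be tracked after release


if __name__ == "__main__":
    disclosure = TrainingRunDisclosure(
        model_name="example-foundation-model",
        developer="Example Lab",
        estimated_training_compute_flops=1e25,
        datasets=[
            DatasetDocumentation(
                name="example-web-corpus",
                sources=["web crawl"],
                personal_data_present=True,
                known_gaps=["under-representation of low-resource languages"],
            )
        ],
        capability_evaluations={"harmful-content red-team pass rate": 0.12},
        risk_assessment_summary="Summary of capability, misuse and systemic risks.",
        post_market_monitoring_plan="Quarterly incident review shared with the regulator.",
    )
    print(f"{disclosure.model_name}: {len(disclosure.datasets)} documented dataset(s)")
```

In practice such disclosures would more likely take the form of standardised model cards and datasheets than code, but a machine-readable format would make cross-developer comparison and registry-building easier.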

Foundation model layer oversight

  1. Regulators should introduce a pre-market approval gate for foundation models, as this is the most obvious point at which risks can proliferate. In any jurisdiction, defining the approval gate will require significant work, with input from all relevant stakeholders. In critical or high-risk areas, depending on the jurisdiction and existing or foreseen pre-market approval for high-risk use, regulators should introduce an additional approval gate at the application layer of the supply chain.
  2. Third-party audits should be required as part of the pre-market approval process, and sandbox testing in real-world conditions should be considered.
  3. Developers should enable detection mechanisms for the outputs of generative foundation models.
  4. As part of the initial risk assessment, developers and deployers should document and share planned and foreseeable modifications throughout the foundation model’s supply chain.
  5. Foundation model developers, and high-risk application providers building on top of these models, should enable an easy complaint mechanism for users to swiftly report any serious risks that have been identified.

Application layer oversight

  1. Existing sector-specific agencies should review and approve the use of foundation models for a set of use cases, by risk level.
  2. Downstream application providers should make clear to end users and affected persons what the underlying foundation model is, including if it is an open-source model, and provide easily accessible explanations of systems’ main parameters and any opt-out mechanisms or human alternatives available.

Post-market monitoring

  1. An AI ombudsman should be considered, to take and document complaints or known instances of harms of AI. This should be complemented by a comprehensive remedies framework for affected persons based on clear avenues for redress.
  2. Developers and downstream deployers should provide documentation and disclosure of incidents throughout the supply chain, including near misses (an illustrative incident record is sketched after this list). This could be strengthened by requiring downstream developers (building on top of foundation models at the application layer) and end users (for example, medical or education professionals) to also disclose incidents.
  3. Foundation model developers, downstream deployers and hosting providers (for example GitHub or Hugging Face) should be compelled to restrict, suspend or retire a model from active use if harmful impacts, misuse or security vulnerabilities (including leaks or otherwise unauthorised access) arise.
  4. Host layer actors (for example cloud service providers or model hosting platforms) should also play a role in evaluating model usage and implementing trust and safety policies to remove harmful models that have demonstrated or are likely to demonstrate serious risks, and flagging harmful models to regulators when it is not in their power to take them down.
  5. AI regulators should have strong powers to investigate and require evidence generation from foundation model developers and downstream deployers. This should be strengthened by whistleblower protections for any actor involved in development or deployment who raises concerns about risks to health or safety.
  6. Any regulator should be funded to a level comparable to (if not greater than) regulators in other domains where safety and public trust are paramount and where underlying technologies form part of national infrastructure – such as civil nuclear, civil aviation, medicines, or road and rail.[11] Given the level of resourcing required, this may be partly funded by AI developers over a certain threshold.
  7. The law around AI liability should be clarified to ensure that legal and financial liability for AI risk is distributed proportionately along foundation model supply chains.
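
As an illustration of recommendation 2 above, the sketch below shows what a shared incident record, loosely inspired by the FDA’s adverse event reporting, might contain. The record structure, field names and severity scale are invented for illustration and are not an existing reporting standard.

```python
# Hypothetical sketch of an incident record that developers, deployers and
# hosting providers could file with a central registry or ombudsman.
# Field names and the severity scale are illustrative only.
from __future__ import annotations

from dataclasses import dataclass
from datetime import date
from enum import Enum


class Severity(Enum):
    NEAR_MISS = "near miss"      # no harm occurred, but plausibly could have
    MINOR = "minor"
    SERIOUS = "serious"
    SYSTEMIC = "systemic"        # affects many users or downstream systems


@dataclass
class IncidentReport:
    reported_on: date
    reporter_role: str           # e.g. "foundation model developer", "downstream deployer"
    model_identifier: str        # which model (and version) was involved
    supply_chain_layer: str      # e.g. "data", "foundation model", "application", "host"
    severity: Severity
    description: str
    corrective_action: str | None = None   # mitigation taken or planned


if __name__ == "__main__":
    report = IncidentReport(
        reported_on=date.today(),
        reporter_role="downstream deployer",
        model_identifier="example-foundation-model-v2",
        supply_chain_layer="application",
        severity=Severity.NEAR_MISS,
        description="Model produced unsafe medical advice; caught by a human reviewer.",
        corrective_action="Added a content filter and reported upstream to the developer.",
    )
    print(report.severity.value)
```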

Introduction

As governments around the world consider the regulation of artificial intelligence (AI), many experts are suggesting that lessons should be drawn from other technology areas. The US Food and Drug Administration (FDA) and its approval process for drug development and medical devices is one of the most cited areas in this regard.

This paper seeks to understand if and how FDA-style oversight could be applied to AI, and specifically to foundation models, given their complexity, novelty and potentially severe risk profile – each of which arguably exceeds those of the products regulated by the FDA.

This paper first maps the FDA review process for Class III medical software, to identify both the risk-reducing features and the limitations of FDA-style oversight. It then considers the suitability and applicability of FDA processes to foundation models and suggests how FDA risk-reducing features could be applied across the foundation model supply chain. It concludes with actionable recommendations for policymakers.

What are foundation models?

Foundation models are a form of AI system capable of a range of general tasks, such as text synthesis, image manipulation and audio generation.[12] Notable examples include OpenAI’s GPT-4 – which has been used to create products such as ChatGPT – and Anthropic’s Claude 2.

Advances in foundation models raise concerns about reliability, misuse, systemic risks and serious harms. Developers and researchers of foundation models have highlighted that their wide range of capabilities and unpredictable behaviours[13] could pose a series of risks, including:

  • Accidental harms: Foundation models can generate confident but factually incorrect statements, which could exacerbate problems of misinformation. In some cases this could have potentially fatal consequences, for example, if someone is misled into eating something poisonous or taking the wrong medication.[14] [15]
  • Misuse harms: These models could enable actors to intentionally cause harm, from harassment[16] through to cybercrime at a greater scale[17] or biosecurity risks.[18] [19]
  • Structural or systemic harms: If downstream developers increasingly rely on foundation models, this creates a single point of dependency on a model, raising security risks.[20] It also concentrates market power over cutting-edge foundation models as few private companies are able to develop foundation models with hundreds of millions of users.[21] [22] [23]
  • Supply chain harms: These are harms involving the processes and inputs used to develop AI, such as poor labour practices, environmental impacts and the inappropriate use of personal data or protected intellectual property.[24]

Context and environment

Experts agree that foundation models are a novel technology in need of additional oversight. This sentiment was shared by industry, civil society and government experts at an Ada Lovelace Institute roundtable on standards-setting held in May 2023. Attendees largely agreed that foundation models represent a ‘novel’ technology without an established ‘state of the art’ for safe development and deployment.

This means that additional oversight mechanisms may be needed, such as testing the models in a ‘sandbox’ environment or regular audits and evaluations of a model’s performance before and after its release (similar to the testing, approval and monitoring approaches used in public health). Such mechanisms would enable greater transparency and accessibility for actors with incentives more aligned with societal interest in assessing (second order) effects on people.[25]

Crafting AI regulation is a priority for governments worldwide. In the last three years, national governments across the world have sought to draft legislation to regulate the development and deployment of AI in different sectors of society.

The European AI Act takes a risk-based approach to regulation, with stricter requirements applying to AI models and systems that pose a high risk to health, safety or fundamental rights. In contrast, the UK has proposed a principles-based approach, calling for existing individual regulators to regulate AI models through an overarching set of principles.

Policymakers in the USA have proposed a different approach in the Algorithmic Accountability Act,[26] which would create a baseline requirement for companies building foundation models and AI systems to assess the impacts of ‘automating critical decision-making’ and empower an existing regulator to enforce this requirement. Neither the UK nor the USA has ruled out ‘harder’ regulation that would require the creation of a new (or empowering an existing) body for enforcement.

Regulation in public health, such as FDA pre-approvals, can inspire AI regulation. As governments seek to develop their approach to regulating AI, they have naturally turned to other emerging technology areas for guidance. One area routinely mentioned is the regulation of public health – specifically, the drug development and medical device regulatory approval process used by the FDA.

The FDA’s core objective is to ‘speed innovations that make food and drug products more effective, safer and more affordable’ to ‘maintain and improve the public’s health’. In practice, the FDA model requires developers of drugs or medical devices to provide (sufficiently positive) evidence on the safety risks, efficacy and accessibility of products before they are approved to be sold in a market or continue to the next development phase (referred to as pre-market approval or pre-approval).

Many call for FDA-style oversight for AI, though its detailed applicability for foundation models is largely unexamined. Applying lessons from the FDA to AI is not a new idea,[27] [28] [29] though it has recently gained significant traction. In a May 2023 Senate Hearing, renowned AI expert Gary Marcus testified that priority number one should be ‘a safety review like we use with the FDA prior to widespread deployment’.[30] Leading AI researchers Stuart Russell and Yoshua Bengio have also called for FDA-style oversight of new AI models.[31] [32] [33] In a recent request for evidence by the USA’s National Telecommunications and Information Administration on AI accountability mechanisms, 43 pieces of evidence mentioned the FDA as an inspiration for AI oversight.[34]

However, such calls often lack detail on how appropriate the FDA model is to regulate AI. The regulation of AI for medical purposes has received extensive attention,[35] [36] but there has not yet been a detailed analysis on how FDA-style oversight could be applied to foundation models or other ‘general-purpose’ AI.

Drug regulators have a long history of applying a rigorous oversight process to novel, groundbreaking and experimental technologies that – alongside their possible benefits – present potentially severe consequences.

Such technologies include gene editing, biotechnology and medical software. As with drugs, the effects of most advanced AI models are largely unknown but potentially significant.[37] Both public health and AI are characterised by fast-paced research and development progress, the complex nature of many components, their potential risk to human safety, and the uncertainty of risks posed to different groups of people.

As market sectors, public health and AI are both dominated by large private-sector organisations developing and creating new products sold on a multinational scale. Through registries, drug regulators ensure transparency and dissemination of evaluation methods and endpoint setting. The FDA is a prime example of drug regulation and offers inspiration for how complex AI systems like foundation models could be governed.

Methodology and scope

This report draws on expert interviews and literature to examine the suitability of applying FDA oversight mechanisms to foundation models. It includes lessons drawn from a literature review[38] [39] and interviews with 20 experts from industry, academia, thinktanks and government on FDA oversight and foundation model evaluation processes.[40] In this paper, we answer two core research questions:

  1. Under what conditions are FDA-style pre-market approval mechanisms successful in reducing risks for drug development and medical software?
  2. How might these mechanisms be applied to the governance of foundation models?

The report is focused on the applicability of aspects of FDA-style oversight (such as pre-approvals) to foundation models for regulation within a specific jurisdiction. It does not aim to determine if the FDA’s approach is the best for foundation model governance, but to inform policymakers’ decision-making. This report also does not answer how the FDA should regulate foundation models in the medical context.[41]

We focus on how foundation models might be governed within a jurisdiction, not on international cross-jurisdiction oversight. An international approach could be built on top of jurisdictional FDA-style oversight models through mutual recognition and trade limitations, as recently proposed.[42] [43]

We focus particularly on auditing and approval mechanisms, outlining criteria relevant for a future comparative analysis with other national and multinational regulatory models. Further research is needed to understand whether a new agency like the FDA should be set up for AI.

The implications and recommendations of this report will apply differently to different jurisdictions. For example, many downstream ‘high-risk’ applications of foundation models would have the equivalent of a regulatory approval gate under the EU AI Act (due to be finalised at the end of 2023). The most relevant learnings for the EU would therefore be considerations of what upstream foundation model approval gates could entail, or how a post-market monitoring regime should operate. For the UK and USA (and other jurisdictions), there may be more scope to glean ideas about how to implement an FDA-style regulatory framework to cover the whole foundation model supply chain.

‘The FDA oversight process’ chapter explores how FDA oversight functions and its strengths and weaknesses as an approach to risk reduction. We use Software as a Medical Device (SaMD) as a case study to examine how the FDA approaches the regulation of current ‘narrow’ AI systems (AI systems that do not have general capabilities). Then, the chapter on ‘FDA-style oversight for foundation models’ explores the suitability of this approach to foundation models. The paper concludes with recommendations for policymakers and open questions for further research.

Definitions

 

●      Approval gates are the specific points in the FDA oversight process at which regulatory approval decisions are made. They occur throughout the development process. A gate can only be passed when the regulator believes that sufficient evidence on safety and efficacy has been provided.

●      Class I–III medical devices: Class I medical devices are low-risk with non-critical consequences. Class II devices are medium risk. Class III devices can potentially cause severe harms.

●      Clinical trials, ‘also known as clinical studies, test potential treatments in human volunteers to see whether they should be approved for wider use in the general population’.[44]

●      Endpoints are targeted outcomes of a clinical trial that are statistically analysed to help determine efficacy and safety. They may include clinical outcome assessments or other measures to predict efficacy and safety. The FDA and developers jointly agree on endpoints before a clinical trial.

●      Foundation models are ‘AI models capable of a wide range of possible tasks and applications, such as text, image, or audio generation. They can be standalone systems or can be used as a ‘base’ for many other more narrow AI applications’.[45]

○      Upstream (in the foundation model supply chain) refers to the component parts and activities in the supply chain that feed into development of the model.[46]

○      Downstream (in the foundation model supply chain) refers to activities after the launch of the model and activities that build on a model.[47]

○      Fine-tuning is the process of training a pre-trained model with an additional specialised or context-specific dataset, removing the need to train a model from scratch.[48]

●      Narrow AI is ‘designed to be used for a specific purpose and is not designed to be used beyond its original purpose’.[49]

●      Pre-market approval is the point in the regulatory approval process where developers provide evidence on the safety risks, efficacy and accessibility of their products before they are approved to be sold in a market. Beyond pre-market, the term ‘pre-approvals’ generally describes a regulatory approval process before the next step along the development process or supply chain.

●      A Quality Management System (QMS) is a collection of business processes focused on achieving quality policy and objectives to meet requirements (see, for example ISO 9001 and ISO 13485),[50] [51] or on safety and efficacy (see, for example FDA Part 820). This includes management controls; design controls; production and process controls; corrective and preventative actions; material controls; records, documents, and change controls; and facilities and equipment controls.

●      Risk-based regulation ‘focuses on outcomes rather than specific rules and process as the goal of regulation’,[52] adjusting oversight mechanisms to the level of risk of the specific product or technology.

●      Software as a Medical Device (SaMD) is ‘Software intended to be used for one or more medical purposes that perform these purposes without being part of a hardware medical device’.[53]

●      The US Food and Drug Administration (FDA) is a federal agency (and part of the Department of Health and Human Services) that is charged with protecting consumers against impure and unsafe foods, drugs and cosmetics. It enforces the Federal Food, Drug, and Cosmetic Act and related laws, and develops detailed guidelines.

How to read this paper

This report offers insight from FDA regulators, civil society and private-sector companies on applying specific oversight mechanisms, proven in the life sciences, to the governance of AI and of foundation models specifically.

…if you are a policymaker working on AI regulation and oversight:

  • The section on ‘Applying key features of FDA-style oversight to foundation models’ provides general principles that can contribute to a risk-reducing approach to oversight.
  • The chapter on ‘Recommendations and open questions’ summarises specific mechanisms for developing and implementing oversight for foundation models.
  • For a detailed analysis of the applicability of life sciences oversight to foundation models, see the chapter ‘FDA-style oversight for foundation models’ and section on ‘The limitations of FDA oversight’.

…if you are a developer or designer of data-driven technologies, foundation models or AI systems:

  • Grasp the importance of rigorous testing, documentation and post-market monitoring of foundation models and AI applications. The introduction and ‘FDA-style oversight for foundation models’ chapter detail why significant investment in AI governance is important, and why the life sciences are a suitable inspiration.
  • The section on ‘Applying specific FDA-style processes along the foundation model supply chain’ describes mechanisms for each layer in the foundation model supply chain. They are tailored to data providers, foundation model developers, hosts and application providers. These mechanisms are based on proven governance methods used by regulators and companies in the pharmaceutical and medical device sectors.
  • Our chapter on ‘Recommendations and open questions’ provides actionable ways in which AI companies can contribute to a better AI oversight process.

…if you are a researcher or public engagement practitioner interested in AI regulation:

  • The introduction includes an overview of the methodology which may also offer insight for others interested in undertaking a similar research project.
  • In addition to a summary of the FDA oversight process, the main research contribution of this paper is in the chapter ‘FDA-style oversight for foundation models’.
  • Our chapter on ‘Recommendations and open questions’ outlines opportunities for future research on governance processes.
  • There is also potential in collaborations between researchers in life sciences regulation and AI governance, focusing on the specific oversight mechanisms and technical tools like unique device identifiers described in our recommendations for AI regulators, developers and deployers.

The FDA oversight process

The Food and Drug Administration (FDA) is the US federal agency tasked with enforcing laws on food and drug products. Its core objective is to help ‘speed innovations that make products more effective, safer and more affordable’ through ‘accurate, science-based information’. In 2023, it had a budget of around $8 billion, around half of which was paid through mandatory fees by companies overseen by the FDA.[54] [55]

The FDA’s regulatory mandate has come to include regulating computing hardware and software used for medical purposes, such as in-vitro glucose monitoring devices or breast cancer diagnosis software.[56] The regulatory category SaMD and adjacent software for medical devices encompasses AI-powered medical applications. These are novel software applications that may bear potentially severe consequences, such as software for eye surgeries[57] or automated oxygen level control under anaesthesia.[58] [59]

Understanding the most important components of FDA oversight lays the groundwork for the following chapter’s discussion of which aspects could suitably be applied to foundation models.

The FDA regulates drugs and medical devices through a risk-based approach. This seeks to identify potential risks at different stages of the development process. The FDA does this by providing guidance and setting requirements for drug and device developers, including agreed protocols for testing and evaluating the safety and efficacy of the drug or device. The definition of ‘safety’ and ‘efficacy’ are dependent on the context, but generally:

  • Safety refers to the type and likelihood of adverse effects. This is then described as ‘a judgement of the acceptability of the risk associated with a medical technology’. A ‘safe’ technology is described as one that ‘causes no undue harm’.[60]
  • Efficacy refers to ‘the probability of benefit to individuals in a defined population from a medical technology applied for a given medical problem’.[61] [62]

Some devices and drugs undergo greater scrutiny than others. For medical devices, the FDA has developed a Class I–III risk rating system; higher-risk (Class III) devices are required to meet more stringent requirements to be approved and sold on the market. For medical software, the focus lies more on post-market monitoring. The FDA allows software on the market with higher levels of risk uncertainty than drugs, but it monitors such software continuously.
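
One schematic way to read this tiering is as a mapping from risk class to oversight intensity, as in the minimal sketch below. The listed requirements and review intervals are simplified illustrations of the processes described in this chapter, not a restatement of FDA rules.

```python
# Illustrative mapping from device risk class to oversight intensity.
# Requirements and review intervals are simplified for illustration.
OVERSIGHT_BY_CLASS = {
    "Class I": {
        "pre_market": ["general controls", "basic documentation"],
        "post_market_review_months": 24,
    },
    "Class II": {
        "pre_market": ["510(k) comparison with a predicate device", "performance testing"],
        "post_market_review_months": 12,
    },
    "Class III": {
        "pre_market": [
            "FDA-approved clinical protocol",
            "clinical evidence of safety and efficacy",
            "pre-market approval",
        ],
        "post_market_review_months": 6,
    },
}


def required_oversight(risk_class: str) -> dict:
    """Return the illustrative oversight requirements for a risk class."""
    return OVERSIGHT_BY_CLASS[risk_class]


if __name__ == "__main__":
    for cls, reqs in OVERSIGHT_BY_CLASS.items():
        print(f"{cls}: {len(reqs['pre_market'])} pre-market requirements, "
              f"review every {reqs['post_market_review_months']} months")
```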

Figure 3: Classes of medical devices (applicable to software components and SaMD)[63]

The FDA’s oversight process follows five steps, which are adapted to the category and risk class of the drug, software or medical device in question.[64] [65]

The FDA can initiate reviews and inspections of drugs and medical devices (as well as other medical and food products) at three points: before clinical trials begin (Step 2), before a drug is marketed to the public (Step 4) and as part of post-market monitoring (Step 5). The depth of evidence required depends on the potential risk levels and novelty of a drug or device.

Approval gates – points in the development process where proof of sufficient safety and efficacy is required to move to the next step – are determined depending on where risks originate and proliferate.

This section illustrates the FDA’s oversight approach to novel Class III software (including narrow AI applications). Low-risk software and software similar to existing software go through a significantly shorter process (see Figure 3).

We illustrate each step using the hypothetical scenario of an approval process for medical AI software for guiding a robotic arm to take patients’ blood. This software consists of a neural network that has been trained with an image classification dataset to visually detect an appropriate vein and that can direct a human or robotic arm to this vein (see Figure 4).[66] [67]

While the oversight process for drugs and medical devices is slightly different, this section borrows insights from both and simplifies when suitable. This illustration will help to inform our assessment in the following chapter, of whether and how a similar approach could be applied to ensure oversight of foundation models.

Risk origination points are where risks first arise; risk proliferation points are where risks spread beyond the point at which they can be controlled.

Step 1: Discovery and development

Description: A developer conducts initial scoping and ideation of how to design a medical device or drug, including use cases for the new product, supply chain considerations, regulatory implications and needs of downstream users. At the start of the development process, the FDA uses pre-submissions, which aim to provide a path from conceptualisation through to placement on the market.

Developer responsibilities:

  • Determine the product and risk category to classify the device, which will determine the testing and evaluation procedure (see Figure 3).
  • While training the AI model, conduct internal (non-clinical) tests, and clearly document the data and algorithms used throughout the process in a Quality Management System (QMS).[68]
  • Follow Good Documentation Practice, which offers guidance on how to document procedures from development through to market, to facilitate risk mitigation, validation and verification, and traceability (to support regulators in the event of recall or investigations).
  • Inform the FDA on the necessity of new software, for example, for efficiency gains or improvements in quality.

FDA responsibilities:

  • Support developers in risk determination.
  • Offer guidance on, for example, milestones for (pre-)clinical research and data analysis.

Required outcomes: Selection of product and risk category to determine regulatory pathway.

Example scenario: A device that uses software to guide the taking of blood may be classified as an in-vitro diagnostics device, which the FDA has previously classified as Class III (highest risk class).[69]

Step 2: Pre-clinical research

Description: In this step, basic questions about safety are addressed through initial animal testing.

Developer responsibilities:

  • Propose endpoints of study and conduct research (often with a second party).
  • Use continuous tracking in the QMS and share results with FDA.

FDA responsibilities:

  • Approve endpoints of the study, depending on the novelty and type of medical device or drug.
  • Review results to allow progression to clinical research.

Required outcomes: Developer proves basic safety of product, allowing clinical studies with human volunteers in the next step.

Example scenario: This step is important for assessing risks of novel drugs. It would not usually be needed for medical software such as our example that helps take blood, as these types of software are typically aimed at automating or improving existing procedures.

Step 3: Clinical research

Description: Drugs, devices and software are tested on humans to make sure they are safe and effective. Unlike for foundation models and most AI research and development, institutional review for research with human subjects is mandatory in public health.

Developer responsibilities:

  • Create a research design (called a protocol) and submit it to an institutional review board (IRB) for ethical review, in line with Good Clinical Practice (GCP) principles and ISO standards such as ISO 14155.
  • Provide the FDA with the research protocol, the hypotheses and results of the clinical trials and of any other pre-clinical or human tests undertaken, and other relevant information.
  • Following FDA approval, hire an independent contractor to conduct clinical studies (as required by risk level); these may be in multiple regions or locations, as agreed with the FDA, to match future application environments.

For drugs, trials may take place in phases that seek to identify different aspects of a drug:

  • Phase 1 studies tend to involve less than 100 participants, run for several months and seek to identify the safety and dosage of a drug.
  • Phase 2 studies tend to involve up to several hundred people with the disease/condition, run for up to two years and study the efficacy and side effects.
  • Phase 3 studies involve up to 3,000 volunteers, can run for one to four years and study efficacy and adverse reactions.

FDA responsibilities:

  • Approve the clinical research design protocol before trials can proceed.
  • During testing, support the developer with guidance or advice at set intervals on protocol design and open questions.

Required outcomes: Once the trials are completed, the developer submits them as evidence to the FDA. The supplied information should include:

  • description of main functions
  • data from trials to prove safety and efficacy
  • benefit/risk and mitigation review, citing relevant literature and medical association guidelines
  • intended use cases and limitations
  • a predetermined change control plan, allowing for post-approval adaptations of software without the need for re-approval (for a new use, new approval is required)
  • QMS review (code, protocols of storing data, Health Protection Agency guidelines, patient confidentiality).

Example scenario: The developers submit a ‘submission of investigational device exemption’ to the FDA, seeking to simplify design, requesting observational studies of the device instead of randomised controlled trials. They provide a proposed research design protocol to the FDA. Once the FDA approves it, they begin trials in 15 facilities with 50 patients each, aiming to prove 98 per cent accuracy and reduction of waiting times at clinics. During testing, no significant adverse events are reported. The safety and efficacy information is submitted to the FDA.

Step 4: FDA review

Description: FDA review teams thoroughly examine the submitted data on the drug or device and decide whether to approve it.

Developer responsibilities: Work closely with the FDA to provide access to all requested information and facilities (as described above).

FDA responsibilities:

  • Assign specialised staff to review all submitted data.
  • In some cases, conduct inspections and audits of developer’s records and evidence, including site visits.
  • If needed, seek advice from an advisory committee, usually appointed by the FDA Commissioner with input from the Secretary of the federal Department of Health and Human Services.[70] The committee may include representation from patients, scientific academia, consumer organisations and industry (if decision-making is delegated to the committee, only scientifically qualified members may vote).

Required outcomes: Approval and registration, or no approval with request for additional evidence.

Example scenario: For novel software like the example here, there might be significant uncertainty. The FDA could request more information from the developer and consult additional experts. Decision-making may be delegated to an advisory committee to discuss open questions and approval.

Step 5: Post-market monitoring

Description: The aim of this step is to detect ‘adverse events’[71] (discussed further below) to increase safety iteratively. At this point, all devices are labelled with Unique Device Identifiers to support monitoring and reporting from development through to market. These identifiers are particularly important for identifying the underlying causes of, and corrective actions for, adverse events.

Developer responsibilities: Any changes or upgrades must be clearly documented, within the agreed change control plan.

FDA responsibilities:

  • Monitor safety of all drugs and devices once available for use by the public.
  • Monitor compliance on an ongoing basis through the QMS, with safety and efficacy data reviewed every six to 12 months.
  • Maintain a database on adverse events and recalls.[72]

Required outcomes: No adverse events or diminishing efficacy. If safety issues occur, the FDA may issue a recall.

Example scenario: Due to a reported safety incident with the blood-taking software, the FDA inspects internal emails and facilities. In addition, every six months, the FDA reviews a one per cent sample of patient data in the QMS and conducts interviews with patients and staff from a randomly selected facility.

Risk-reducing aspects of FDA oversight

Our interviews with experts on the FDA and a literature review[73] highlighted several themes. We group them into five risk-reducing aspects below.

Risk- and novelty-driven oversight

The approval gates described in the previous section lead to iterative oversight using QMS and jointly agreed research endpoints, as well as continuous post-market monitoring.

Approval gates are informed by risk controllability. Risk controllability is understood by considering the severity of harm to people; the likelihood of that harm occurring; proliferation, duration of exposure to population; potential false results; patient tolerance of risk; risk factors for people administering or using the drug or device, such as caregivers; detectability of risks; risk mitigations; the drug or device developer’s compliance history; and how much uncertainty there may be around any of these factors.[74]
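
One hypothetical way to make these factors concrete is to score each one and use the aggregate to decide how stringent the approval gate should be, as in the sketch below. The scoring scale, equal weighting and thresholds are invented for illustration; in practice the FDA’s determinations are qualitative, case-by-case judgements.

```python
# Hypothetical risk-controllability scoring, mirroring the factors listed above.
# Each factor is scored 0-5, where a higher score means the factor contributes
# more to uncontrolled risk (so poor detectability scores high).
from __future__ import annotations

RISK_FACTORS = [
    "severity_of_harm",
    "likelihood_of_harm",
    "exposure_duration_and_reach",
    "potential_for_false_results",
    "risk_to_operators_and_caregivers",
    "difficulty_of_detecting_risks",
    "uncertainty_about_the_above",
]


def controllability_score(scores: dict[str, int]) -> float:
    """Average the factor scores; higher means less controllable risk."""
    return sum(scores[factor] for factor in RISK_FACTORS) / len(RISK_FACTORS)


def oversight_tier(score: float) -> str:
    """Map an aggregate score to an illustrative oversight tier."""
    if score >= 3.5:
        return "pre-market approval gate plus continuous post-market monitoring"
    if score >= 2.0:
        return "pre-market notification plus periodic audits"
    return "documentation and self-assessment"


if __name__ == "__main__":
    example = {factor: 4 for factor in RISK_FACTORS}
    example["difficulty_of_detecting_risks"] = 2
    score = controllability_score(example)
    print(f"{score:.2f} -> {oversight_tier(score)}")
```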

Class III devices and related software – those that may guide critical clinical decisions or that are invasive or life-supporting – need FDA pre-approval before the product is marketed to the public. In addition, the clinical research design needs to be approved by the FDA.

Continuous, direct engagement of FDA with developers throughout the development process

There can be inspections at any step of the development and deployment process. Across all oversight steps, the FDA’s assessments are independent and not reliant on input from private auditors who may have profit incentives.

In the context of foundation models, where safety standards are unclear and risk assessments are therefore more exploratory, these assessments should not be guided by profit incentives.

In cases where the risks are less severe, for example, Class II devices, the FDA is supported by accredited external reviewers.[75] External experts also support reviews of novel technology where the FDA lacks expertise, although this approach has been criticised (see limitations below and advisory committee description above).

FDA employees review planned clinical trials, as well as clinical trial data produced by developers and their contractors. In novel, high-stakes cases, a dedicated advisory committee reviews evidence and decides on approval. Post market, the FDA reviews sample usage, complaint and other data approximately every six months.

Wide-ranging information access

By law, the FDA is empowered to request comprehensive evidence through audits, conduct inspections[76] and check the QMS. The FDA’s QMS regulation requires documented, comprehensive managerial processes for quality planning, purchasing, acceptance activities, nonconformities and corrective/preventative actions throughout design, production, distribution and post-market. While the FDA has statutory powers to access comprehensive information, for example, on clinical trials, patient data and in some cases internal emails, it releases only a summary of safety and efficacy post approval.

Putting the burden of proof on the developer

The FDA must approve clinical trials and their endpoints, and the labelling materials for drugs and medical devices, before they are approved for market. This model puts the burden of proof on the developer to provide this information or be unable to sell their product.

A clear approval gate entails the following elements:

  • The product development process in scope: The FDA’s move into regulating SaMD required it to translate regulatory approval gates for a drug approval process to the stages of a software development process. In the SaMD context, a device may be made up of different components, including software and hardware, that come from other suppliers or actors further upstream in the product development process. The FDA ensures the safety and efficacy of each component by requiring all components to undergo testing. If a component has been previously reviewed by the FDA, future uses of it can undergo an expedited review. In some cases, devices may use open-source Software of Unknown Provenance (SOUP). Such software needs either to be clearly isolated from critical components of the device, or to undergo demonstrable safety testing.[77]
  • The point of approval in the product development process: Effective gates occur once a risk is identifiable, but before it can proliferate or turn into harms. Certain risks (such as differential impacts on diverse demographic groups) may not be identifiable until after the intended uses of the device are made clear (for example will it be used in a hospital or a care home?). For technology with a wide spectrum of uses, like gene editing, developers must specify intended uses and the FDA allows trials with human subjects only in a few cases, where other treatments have higher risks or significantly lower chance of success.[78]
  • The evidence required to pass the approval gate: This is tiered depending on the risk class, as already described. The FDA begins with an initial broad criterion, such as simply not causing harm to the human body when used. Developers and contractors then provide exploratory evidence. Based on this, in the case of medicines, the regulator learns and makes further specifications, for example, around the drug elimination period. For medical devices such as heart stents, evidence could include the percentage reduction in the rate of major cardiac events.

Balancing innovation and risks enables regulatory authority to be built over time

The FDA enables innovation and access by streamlining approval processes (for example, similarity exemptions, pre-submissions) and approvals of drugs with severe risks but high benefits. Over time, Congress has provided the FDA with increasing information access and enforcement powers and budgets, to allow it to enforce ‘safe access’.

The FDA has covered more and more areas over time, recently adding tobacco control to its remit.[79] FDA-regulated products account for about 20 cents of every dollar spent by US consumers.[80] It has the statutory power to issue warnings, make seizures, impose fines and pursue criminal prosecution.

Safety and accessibility need to be balanced. For example, a piece of software that automates oxygen control may perform slightly less well than healthcare professionals, but if it reduces the human time and effort involved and therefore increases accessibility, it may still be beneficial overall. By finding the right balance, the FDA builds an overall reputation as an agency providing mostly safe access, enabling continued regulatory power.[81] When risk uncertainty is high, it can slow down the marketing of technologies, for example, allowing only initial, narrow experiments for novel technologies such as gene editing.[82]

The FDA approach does not rely on any one of these risk-reducing aspects alone. Rather, the combination of all five ensures the safety of FDA-regulated medical devices and drugs in most cases.[83] The five together also allow the FDA to continuously learn about risks and improve its approval process and its guidance on safety standards.

Risk- and novelty-driven oversight focuses learning on the most complex and important drugs, software and devices. Direct engagement and access to a wide range of information is the basis of the FDA’s understanding of new products and new risks.

With the burden of proof on developers through pre-approvals, they are incentivised to ensure the FDA is informed about safety and efficacy.

As a result of this approach to oversight, the FDA is better able to balance safety and accessibility, leading to increased regulatory authority.

‘The burden is on the industry to demonstrate the safety and effectiveness, so there is interest in educating the FDA about the technology.’

Former FDA Chief Counsel

The history of the FDA: 100+ years of learning and increasing power [84] [85] [86]

 

The creation of the FDA was driven by a series of medical accidents that exposed the risks drug development can pose to public safety. While the early drug industry initially pledged to self-regulate, and members of the public viewed doctors as the primary keepers of public safety, public outcry over tragedies like the Elixir Sulfanilamide disaster (see below) led to calls for an increasingly powerful federal agency.

Today the FDA employs around 18,000 people (2022 figures) with an $8 billion budget (2023 data). The FDA’s approach to regulating drugs and devices involves learning iteratively about the risks and benefits of products with every new evidence review it undertakes as part of the approval process.

Initiation

The 1906 Pure Food and Drugs Act was the first piece of legislation to regulate drugs in the USA. A groundbreaking law, it took nearly a quarter-century to formulate. It prohibited interstate commerce of adulterated and misbranded food and drugs, marking the start of federal consumer protection.

Learning through trade controls: This Act established the importance of regulatory oversight for product integrity and consumer protection.

Limited mandate

From 1930 to 1937, there were failed attempts to expand FDA powers, with relevant bills not being passed by Congress. This period underscored the challenges in evolving regulatory frameworks to meet public health needs.

Limited power and limited learning.

Elixir Sulfanilamide disaster

This 1937 event, where an untested toxic solvent caused over 100 deaths, marked a turning point in drug safety awareness.

Learning through post-market complaints: The Elixir tragedy emphasised the crucial need for pre-market regulatory oversight in pharmaceuticals.

Extended mandate

In 1938, the previously proposed Food, Drug, and Cosmetic Act was passed into law, changing the FDA’s regulatory approach by mandating review processes without requiring proof of fraudulent intent.

Learning through mandated information access and approval power: Pre-market approvals and the FDA’s access to drug testing information enabled the building of appropriate safety controls.

Safety reputation

During the 1960s, the FDA’s refusal to approve thalidomide – a drug prescribed to pregnant women that caused an estimated 80,000 miscarriages and infant deaths, and deformities in 20,000 children worldwide – further established its commitment to drug safety.

Learning through prevented negative outcomes: The thalidomide situation led the FDA to calibrate its safety measures by monitoring and preventing large-scale health catastrophes, especially in comparison with similar countries. Post-market recalls were included in the FDA’s regulatory powers.

Extended enforcement power

The 1962 Kefauver-Harris Amendment to the Federal Food, Drug, and Cosmetic Act was a significant step, requiring new drug applications to provide substantial evidence of efficacy and safety.

Learning through expanded enforcement powers: This period reinforced the evolving role of drug developers in demonstrating the safety and efficacy of their products.

Balancing accessibility with safety

The 1984 Drug Price Competition and Patent Term Restoration Act marked a balance between drug safety and accessibility, simplifying generic drug approvals. In the 2000s, Risk Minimization Action Plans were introduced, emphasising the need for drugs to have more benefits than risks, monitored at both the pre- and the post-market stages.

Learning through a lifecycle approach: This era saw the FDA expanding its oversight scope across product development and deployment for a deeper understanding of the benefit–risk trade-off.

Extended independence

The restructuring of advisory committees in the 2000s and 2010s enhanced the FDA’s independence and decision-making capability.

Learning through independent multi-stakeholder advice: The multiple perspectives of diverse expert groups bolstered the FDA’s ability to make well-informed, less biased decisions, reflecting a broad range of scientific and medical insights – although critics and limitations remain (see below).

Extension to new technologies

In the 2010s and 2020s, recognising the potential of technological advancements to improve healthcare quality and cost efficiency, the FDA began regulating new technologies such as AI in medical devices.

Learning through a focus on innovation: Keeping an eye on emerging technologies.

The limitations of FDA oversight

The FDA’s oversight regime is built for regulating food, drugs and medical devices, and more recently extended to software used in medical applications. Literature reviews[87] and interviewed FDA experts suggest three significant limitations of this regime’s applicability to other sectors.

Limited types of risks controlled

The FDA focuses on risks to life posed by product use, therefore focusing on reliability and (accidental) misuse risks. Systemic risks such as accessibility challenges, structural discrimination issues and novel risk profiles are not as well covered.[88] [89]

  • Accessibility risks include the cost barriers of advanced biotechnology drugs or SaMD for underprivileged groups.[90]
  • Structural discrimination risks include disproportionate risks to particular demographics caused by wider societal inequalities and a lack of representation in data. These may not appear in clinical trials or in single-device post-market monitoring. For example, SaMD algorithms have systematically misclassified Black patients’ healthcare needs because they suggested treatment based on past healthcare spending data that did not accurately reflect their requirements.[91]
  • Equity risks arise when manufacturers claim average accuracy across a population or use only for a specific population (for example, people aged 60+). The FDA only considers whether a product safely and effectively delivers according to the claims of its manufacturers – it doesn’t go beyond this to urge them to reach other populations. It does not yet have comprehensive algorithmic impact assessments to ensure equity and fairness.
  • False similarity risks originate in the accelerated FDA 510(k) approval pathway for medical devices and software, which works through comparison with already-approved products – referred to as predicate devices. Reviews of this pathway have shown ‘predicate creep’, where multiple generations of predicate devices slowly drift away from the originally approved use.[92] This could mean that predicate devices may not provide suitable comparisons for new devices.
  • Novel risk profiles challenge the standard regulatory approach of the FDA that rests on risk detection through trials before risks proliferate through marketing. Risks that are not typically detectable in clinical trials, due to their novelty or new application environments, may be missed. For example, the risk of water-contaminating foods is clear, but it may be less clear how to monitor for new pathogens that might be significantly smaller or otherwise different to those detected by existing routines.[93] While any ‘adverse events’ need to be reported to the FDA, risks that are difficult to detect might be missed.

Limited number of developers due to high costs of compliance

The FDA’s stringent approval requirements lead to costly approval processes that only large corporations can afford, as a multi-stage clinical trial can cost tens of millions of dollars.[94] [95] This can lead to oligopolies and monopolies, high drug prices because of limited competition, and innovation focused on areas with high monetary returns.

If this is not counteracted through governmental subsidies and reimbursement incentives, groups with limited means to pay for medications can face accessibility issues. It remains an open question whether small companies should be able to develop and market severe-risk technologies, or how governmental incentives and efforts can democratise the drug and medical device – or foundation model – development process.

Reliance on industry for expertise

The FDA sometimes relies on industry expertise, particularly in novel areas where clear benchmarks have not been developed and knowledge is concentrated in industry. This means that the FDA may seek input from external consultants and its advisory committees to make informed decisions.[96]

An overreliance on industry could raise concerns around regulatory capture and conflicts of interest – similar to other agencies.[97] For example, around 25 per cent of FDA advisory committee members had conflicts of interest in the past five years.[98] In principle, conflicted members are not allowed to participate, but dependency on their expertise regularly leads to this requirement being waived.[99] [100] [101] External consultants have been conflicted, too: one notable scandal occurred when McKinsey advised the FDA on opioid policy while being paid by corporations to help them sell the same drugs.[102]

A lack of independent expertise can reduce opportunities for the voices of people affected by high-risk drugs or devices to be heard. This in turn may undermine public trust in new drugs and devices. Oversight processes that are less dependent on industry expertise and funding have also been shown to discover more, and more significant, risks and inaccuracies.[103]

Besides these three main limitations, others include enforcement issues for small-scale illegal deployment of SaMD, which can be hard to identify;[104] [105] and device misclassifications in new areas.[106]

FDA-style oversight for foundation models

FDA Class III devices are complex, novel technologies with potentially severe risks to public health and uncertainties regarding how to detect and mitigate these risks.[107]

Foundation models are at least as complex, more novel and – alongside their potential benefits – likewise pose potentially severe risks, according to the experts we interviewed and recent literature.[108] [109] [110] They are also deployed across the economy, interacting with millions of people, meaning they are likely to pose systemic risks that are far beyond those of Class III medical devices.[111]

However, the risks of foundation models are so far not fully clear, risk mitigation measures are uncertain and risk modelling is poor or non-existent.

Leading AI researchers such as Stuart Russell and Yoshua Bengio, independent research organisations, and AI developers have flagged the riskiness, complexity and black-box nature of foundation models.[112] [113] [114] [115] [116] In a review on the severe risks of foundation models (in this case, the accessibility of instructions for responding to biological threats), the AI lab Anthropic states: ‘If unmitigated, we worry that these risks are near-term, meaning they may be actualised in the next two to three years.’[117]

As seen in the history of the FDA outlined above, it was a reaction to severe harm that led to its regulatory capacity being strengthened. Those responsible for AI governance would be well advised to act ahead of time to pre-empt and reduce the risk of similarly severe harms.

The similarities between foundation models and existing, highly regulated Class III medical devices – in terms of complexity, novelty and risk uncertainties – suggest that they should be regulated in a similar way (see Figure 5).

However, foundation models differ in important ways from Software as a Medical Device (SaMD). The definitions themselves reveal inherent differences in the range of applications and intended use:

Foundation models are AI models capable of a wide range of possible tasks and applications, such as text, image or audio generation. They can be stand-alone systems or can be used as a ‘base’ for many other more narrow AI applications.[118]

SaMD is more specific: it is software that is ‘intended to be used for one or more medical purposes that perform[s] these purposes without being part of a hardware medical device’.[119]

However, the most notable differences are more subtle. Even technology applied across a wide range of purposes, like general drug dispersion software, can be effectively regulated with pre-approvals. This is because the points of risk and the pathways to dangerous outcomes are well understood and agreed upon, and they all start from the distribution of products to consumers – something in which the FDA can intervene.

The first section of this chapter outlines why this is not yet the case for foundation models. The second section illustrates how FDA-style oversight can bridge this gap generally. The third section details how these mechanisms could be applied along the foundation model supply chain – the different stages of development and deployment of these models.

The foundation model challenge: unclear, distributed points of risk

In this section we discuss two key points of risk: 1) risk origination points, when risks arise initially; and 2) risk proliferation points, when risks spread without being controllable.

A significant challenge that foundation models raise is the difficulty of identifying where different risks originate and proliferate in their development and deployment, and which actors within that process should be held responsible for mitigating and providing redress for those harms.[120]

Risk origination and proliferation examples

Bias

Some risks may originate in multiple places in the foundation model supply chain. For example, the risk of a model producing outputs that reinforce racial stereotypes may originate in the data used to train the model, how it was cleaned, the weights that the model developer used, which users the model was made available to, and what kinds of prompts the end user of the model is allowed to make.[121] [122]

 

In this example, a series of evaluations for different bias issues might be needed throughout the model’s supply chain. The model developer and dataset provider would need to be obliged to proactively look for and address known issues of bias. It might also be necessary to find ways to prohibit or discourage end users from prompting a model for outputs that reinforce racial stereotypes.

Cybercrime

Another example is reports of GPT-4 being used to write code for phishing operations to steal people’s personal information. Where in the supply chain did such cyber-capabilities originate and proliferate?[123] [124] Did the risk originate during training (while general code-writing abilities were being built) or after release (allowing requests compatible with phishing)? Did it proliferate through model leakage, widely accessible chatbots like ChatGPT or Application Programming Interfaces (APIs), or downstream applications?

Some AI researchers have conceptualised the uncertainty over risks as a matter of the unexpected capabilities of foundation models. This ‘unexpected capabilities problem’ may arise during models’ development and deployment.[125] Exactly what risks this will lead to cannot be identified reliably, especially not before the range of potential use cases is clear.[126] In turn, this uncertainty means that risks may be more likely to proliferate rapidly (the ‘proliferation problem’),[127] and to lead to harms throughout the lifecycle – with limited possibility for recall (the ‘deployment safety problem’).[128]

The challenge in governing foundation models is therefore in identifying and mitigating risks comprehensively before they proliferate.[129]

There is a distinction to draw between risk origination (the point in the supply chain at which a risk, such as toxic content, may arise) and risk proliferation (the point in the supply chain at which a risk can be widely distributed to downstream actors). Identifying points of risk origination and proliferation can be challenging for different kinds of risks.

Foundation model oversight needs to be continuous throughout the supply chain. Identifying all inherent risks in a foundation model upstream is hard. Leaving risks to downstream companies is not the solution, because they may have proliferated already by this stage.

There are tools available to help upstream foundation model developers reduce risk before training (through filtering data inputs), and to assess risks during training (through clinical-trial-style protocols). More of these tools are needed. They are most effective when applied at the foundation model layer (see Figure 2 and Figure 6), given the centralised nature of foundation models. However, some risks might arise or be detectable only at the application layer, so tools for intervention at this layer are also necessary.
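
To make the idea of upstream intervention concrete, the sketch below shows, at a very high level, how a developer might screen documents against a blocklist and a simple quality threshold before a training run. The source names and thresholds are hypothetical; real filtering pipelines are considerably more sophisticated and would combine many such checks.

```python
from dataclasses import dataclass

@dataclass
class Document:
    text: str
    source: str

# Hypothetical blocklist and quality threshold - illustrative values only
BLOCKED_SOURCES = {"known-toxic-forum.example"}
MIN_LENGTH = 200

def passes_filters(doc: Document) -> bool:
    """Return True if a document clears the illustrative pre-training filters."""
    if doc.source in BLOCKED_SOURCES:
        return False
    if len(doc.text) < MIN_LENGTH:
        return False
    return True

def filter_corpus(corpus: list[Document]) -> list[Document]:
    """Keep only documents that pass the filters and report how many were removed."""
    kept = [d for d in corpus if passes_filters(d)]
    print(f"Removed {len(corpus) - len(kept)} of {len(corpus)} documents")
    return kept
```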

Applying key features of FDA-style oversight to foundation models

How should an oversight regime be designed so that it suits complex, novel, severe-risk technologies with distributed, unclear points of risk origination and proliferation?

Both foundation models and Class III devices pose potentially severe levels of risk to public safety and therefore require governmental oversight. For the former, this is arguably even more important given national security concerns (for example, the risk that such technologies could enable cyberattacks or widespread disinformation campaigns at far greater scales than current capabilities allow).[130] [131] [132]

Government oversight is also needed because of the limitations of private insurance for severe risks. As seen in the cases of nuclear waste insurance and financial crises, large externalities and systemic risks ultimately need to be absorbed by government.

Below we consider what we can learn from the oversight of FDA-regulated products and whether an FDA-style approach could provide effective oversight of foundation models.

Building on Raji et al.’s recent review[133] and interviews, current oversight regimes for foundation models can be understood alongside, and compared with, the core risk-reducing aspects of the FDA approach, as depicted in Figure 7.[134] [135] Current oversight and evaluations of GPT-4 lag behind FDA oversight in all dimensions.

Figure 7: Governance of GPT-4’s development and release, according to its 2023 system card and interviews, vs. FDA governance of Class III drugs.[136] [137] [138] While necessarily simplified, characteristics furthest to the right fit best for complex, novel technologies with potentially severe risks and with uncertainty about both the risks and how to measure them.[139]

‘We are in a “YOLO [you only live once]” culture without meaningful specifications and testing – “build, release, see what happens”.’

Igor Krawczuk on current oversight of commercial foundation models

The complexity and risk uncertainties of foundation models could justify similar levels of oversight to those provided by the FDA in relation to Class III medical devices.

This would involve an extensive ecosystem of second-party, third-party and regulatory oversight to monitor and understand the capabilities of foundation models and to detect and mitigate risks. The high speed of progress in foundation model development requires adaptable oversight institutions, including non-governmental organisations with specialised expertise. AI regulators need to establish and enforce improved foundation model oversight across the development and deployment process.

General principles for applying key features of the FDA’s approach to foundation model governance

  1. Establish continuous, risk-based evaluations and audits throughout the foundation model supply chain. Existing bug bounty programmes[140] and complaint-driven evaluation do not sufficiently cover potential risks. The FDA’s incident reporting system captures fewer risks than the universal risk-based reviews before market entry and post-market monitoring requirements.[141] Therefore, review points need to be defined across the supply chain of foundation models, with risk-based triggers. As already discussed, risks can originate at multiple sources, potentially simultaneously. Continuous engagement of reviewers and evaluators is therefore important to detect and mitigate risks before they proliferate.
  2. Empower regulatory agencies to evaluate critical safety evidence directly, supported by a third-party ecosystem. First-party self-assessments and second-party contracted auditing have consistently proven to be lower quality than accredited third-party or governmental audits.[142] [143] [144] [145] Regulators of foundation models should therefore have direct access to assess evaluation and audit evidence. This is especially significant when operating in a context where standards are unclear and audits are therefore more exploratory (in the style of evaluations). Regulators can also improve their understanding by consulting independent experts.
  3. Ensure independence of regulators and external evaluators. Oversight processes not dependent on industry expertise and funding have been proven to discover more, and more significant, risks and inaccuracies, especially in complex settings with vague standards.[146] [147] Inspired by the FDA approach, foundation model oversight could be funded directly through mandatory fees from AI labs and only partly through federal funding. Sufficient resourcing in these ways is essential, to avoid the need for additional resourcing that is associated with potential conflicts of interest. Consideration should also be given to an upstream regulator of foundation models as existing sector-specific regulators may only have the ability to review downstream AI applications. The level of funding for such a regulator needs to be similar to that of other safety-critical domains, such as medicine. Civil society and external evaluators could be empowered through access to federal computing infrastructure for evaluations and accreditation programmes.
  4. Enable structured access to foundation models and adjacent components for evaluators and civil society. Access to information is the foundation of an effective audit (it is necessary, though not sufficient on its own).[148] Providing information access to regulators – not just external auditors – increases audit quality.[149] Information access needs to be tiered to protect intellectual property and limit the risks of model leakage.[150] [151] Accessibility to civil society could increase the likelihood of innovations that meet the needs of the people impacted by their use, for example, through understanding public perceptions of the risks and perceived benefits of technologies. Foundation model regulation needs to strike a risk-benefit balance.
  5. Enforce a foundation model pre-market approval process, shifting the burden of proof to developers. If the regulator has the power to stop the development or sale of products, this significantly increases developers’ incentive to provide sufficient safety information. The regulatory burden needs to be distributed across the supply chain – with requirements in line with the risks at each layer of the supply chain. Cross-context risks and those with the most potential for wide-scale proliferation need to be regulated upstream at the foundation model layer; context-dependent risks should be addressed downstream in domain-specific regulation.

‘Drawing from very clear examples of real harm led the FDA to put the burden of proof on the developers – in AI this is flipped. We are very much in an ex post scenario with the burden on civil society.’

Co-founder of a leading AI think tank

 

‘We should see a foundation model as a tangible, auditable product and process that starts with the training data collection as the raw input material to the model.’

Kasia Chmielinski, Harvard Berkman Klein Center for Internet & Society

Learning through approval gates

The FDA’s capabilities have increased over time. Much of this has occurred through setting approval gates, which become points of learning for regulators. Given the novelty of foundation models and the lack of an established ‘state of the art’ for safe development and deployment, a similar approach could be taken to enhance the expertise of regulators and external evaluators (see Figure 2).

Approval gates can provide regulators with key information throughout the foundation model supply chain.

Some approval gates already exist under current sectoral regulation for specific downstream domains. At the application layer of a foundation model’s supply chain, the context of its use will be clearer than at the developer layer. Approval gates at this stage could require evidence similar to clinical studies for medical devices, to approximate risks. This could be gathered, for example, through an observational study on the automated allocation of physicians’ capacity based on described symptoms.

Current sectoral regulators may need additional resources, powers and support to appropriately evaluate the evidence and make a determination of whether a foundation model is safe to pass an approval gate.

Every time a foundation model is suggested for use, companies may already need to – or should – collect sufficient context-specific safety evidence and provide it to the regulator. For the healthcare capacity allocation example above, existing FDA – or MHRA (Medicines and Healthcare products Regulatory Agency, UK) – requirements and approval gates on clinical decision support software currently support extensive evaluation of such applications.[152]

Upstream stages of the foundation model supply chain, in particular, lack an established ‘state of the art’ defining industry standards for development and underpinning regulation. A gradual process might therefore be required to define approval requirements and the exact location of approval gates.

Initially, lighter approval requirements and stronger transparency requirements will enable learning for the regulator, allowing it to gradually set optimal risk-reducing approval requirements. The model access required by the regulator and third parties for this learning could be provided via mechanisms such as sandboxes, audits or red teaming, detailed below.

Red teaming is an approach originating in computer security. It describes exercises where individuals or groups (the ‘red team’) are tasked with looking for errors, issues or faults with a system, by taking on the role of a bad actor and ‘attacking’ it. In the case of AI, it has increasingly been adopted as an approach to look for risks of harmful outputs from AI systems.[153]

Once regulators have agreed inclusive[154] international standards and benchmarks for testing of upstream capabilities and risks, they should impose standardised thresholds for approval and endpoints. Until that point, transparency and scrutiny should be increased, and the burden of proof should be on developers to prove safety to regulators at approval gates.

The next section discusses in more specific detail how FDA-style processes could be applied to foundation model governance.

‘We need end-to-end oversight along the value chain.’

CEO of an algorithmic auditing firm

Applying specific FDA-style processes along the foundation model supply chain

Risks can manifest across the AI supply chain. Foundation models and downstream applications can have problematic behaviours originating in pre-training data, or they can develop new ones when integrated into complex environments (like a hospital or a school). This means that new risks can emerge over time.[155] Policymakers, researchers, industry and the public therefore ‘require more visibility into the risks presented by AI systems and tools’.

Regulation can ‘play an important role in making risks more visible, and the mitigation of risk more actionable, by developing policy to enable a robust and interconnected evaluation, auditing, and disclosure ecosystem that facilitates timely accountability and remediation of potential harms’.[156]

The FDA has processes, regulatory powers and a culture that helps to identify and mitigate risks across the development and deployment process, from pre-design through to post-market monitoring. This holistic approach provides lessons for the AI regulatory ecosystem.

There are also significant similarities between specific FDA oversight mechanisms and proposals for oversight in the AI space, suggesting that the latter proposals are generally feasible. In addition, new ideas for foundation model oversight can be drawn from the FDA, such as in setting endpoints that determine the evidence required to pass an approval gate. This section draws out key lessons that AI regulators could take from the FDA approach and applies them to each layer of the supply chain.

Data and compute layers oversight

There is an information asymmetry between governments and AI developers. This is demonstrated, for example, in the way that governments have been caught off-guard by the release of ChatGPT. This also has societal implications in areas like the education sector, where universities and schools are having to respond to a potential increase in students’ use of AI-generated content for homework or assessments.[157]

To be able to anticipate these implications, regulators need much greater oversight of the early stages of foundation model development, when large training runs (the key component of the foundation model development process) and the safety precautions for such processes are being planned. This will allow greater foresight over potentially transformative AI model releases, and early risk mitigation.

Pre-submissions and Good Documentation Practice

At the start of the development process, the FDA uses pre-submissions (pre-subs), which allow it to conduct ‘risk determination’. This benefits the developer because they can get feedback from the regulator at various points, for example on protocols for clinical studies. The aim is to provide a path from device conceptualisation through to placement on the market.

This is similar to an idea that has recently gained some traction in the AI governance space: that labs should submit reports to regulators ‘before they begin the training process for new foundation models, periodically throughout the training process, and before and following model deployment’.[158]

This approach would enable learning and risk mitigation by giving access to information that currently resides only inside AI labs (and which has not so far been voluntarily disclosed), for example covering compute and capabilities evaluations,[159] what data is used to train models, or environmental impact and supply chain data.[160] It would mirror the FDA’s Quality Management System (QMS), which documents compliance with standards (ISO 13485/820) and is based on Good Documentation Practice throughout the development and deployment process to ensure risk mitigation, validation and verification, and traceability (to support regulators in the event of recall or investigations).

As well as documenting compliance in this way, the approach means that the regulator would need to demonstrate similar good practice when handling pre-submissions. Developers would have concerns around competition: the relevant authorities would need to be legally compelled to observe confidentiality, to protect intellectual property rights and trade secrets. A procedure for documenting and submitting high-value information at the compute and data input layer would be the first step towards an equivalent to the FDA approach in the AI space.
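
As a purely illustrative sketch of what such a procedure might capture, the snippet below defines a hypothetical pre-submission record for a planned training run, loosely mirroring the FDA’s pre-subs. None of the field names or values reflect an existing or proposed standard.

```python
from dataclasses import dataclass, asdict
import json

@dataclass
class TrainingRunPreSubmission:
    developer: str
    model_name: str
    estimated_training_compute_flop: float   # planned scale of the training run
    data_sources: list[str]                  # high-level description of training data
    planned_safety_evaluations: list[str]
    intended_release_mode: str               # e.g. 'API', 'open weights', 'research access'

    def to_report(self) -> str:
        """Serialise the record for confidential submission to the regulator."""
        return json.dumps(asdict(self), indent=2)

# Illustrative usage with hypothetical values
submission = TrainingRunPreSubmission(
    developer="ExampleLab",
    model_name="example-model-v1",
    estimated_training_compute_flop=1e25,
    data_sources=["licensed web crawl", "public code repositories"],
    planned_safety_evaluations=["bias benchmarks", "cyber-capability red teaming"],
    intended_release_mode="API",
)
print(submission.to_report())
```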

Transparency via Unique Device Identifiers (UDIs)

The FDA uses UDIs for medical devices and stand-alone software. The aim of this is to support monitoring and reporting throughout the lifecycle, particularly to identify the underlying causes of ‘adverse events’ and what corrective action should be taken (this is discussed further below).[161] This holds some similarities to AI governance proposals, particularly the suggestion for compute verification to help ensure that (pre-) training rules and safety standards are being followed.

Specifically for the AI supply chain, this would apply at the developer layer, to the essential hardware used to train and run foundation models: compute chips. Chip registration and monitoring has gained traction because, unlike other components of AI development, this hardware can be tracked in the same manner as other physical goods (like UDIs). It is also seen as an easy win. Advanced chips are usually tagged with unique numbers, so regulators would simply need to set up a registry; this could be updated each time the chips change hands.[162]

Such a registry would enable targeted interventions. For example, Jason Matheny, the CEO of RAND, suggests that regulators should ‘track and license large concentrations of AI chips’, while ‘cloud providers, who own the largest clusters of AI chips, could be subject to ‘know your customer’ (KYC) requirements so that they identify clients who place huge rental orders that signal an advanced AI system is being built’.[163]

This approach would allow regulators and relevant third parties to track use throughout the lifecycle – starting with monitoring for large training runs to build advanced AI models and to verify safety compliance (for example, via KYC checks or providing information about the cybersecurity and risk management measures) for these training runs and subsequent development decisions. It would also support them to hold developers accountable if they do not comply.
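
A minimal sketch of how such a registry and a KYC-style trigger might work is given below, assuming a hypothetical threshold for what counts as a ‘large concentration’ of chips; the identifiers and field names are illustrative only.

```python
from dataclasses import dataclass

@dataclass
class ChipRecord:
    chip_id: str          # unique identifier, analogous to a UDI
    current_holder: str

# Hypothetical threshold: holdings above this trigger a KYC-style review
LARGE_CLUSTER_THRESHOLD = 10_000

class ChipRegistry:
    def __init__(self) -> None:
        self._records: dict[str, ChipRecord] = {}

    def register(self, chip_id: str, holder: str) -> None:
        self._records[chip_id] = ChipRecord(chip_id, holder)

    def transfer(self, chip_id: str, new_holder: str) -> None:
        """Update the registry each time a chip changes hands."""
        self._records[chip_id].current_holder = new_holder

    def holdings(self, holder: str) -> int:
        return sum(1 for r in self._records.values() if r.current_holder == holder)

    def needs_kyc_review(self, holder: str) -> bool:
        """Flag holders whose chip concentration suggests a large training run."""
        return self.holdings(holder) >= LARGE_CLUSTER_THRESHOLD
```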

Quality Management System (QMS)

The FDA’s quality system regulation is sometimes wrongly assumed to be only a ‘compliance checklist’ to be completed before the FDA approves a product. In fact, the QMS – a standardised process for documenting compliance – is intended to put ‘processes, trained personnel, and oversight’ in place to ensure that a product is ‘predictably safe throughout its development and deployment lifecycles’.

At the design phase, controls consist of design planning, design inputs that establish user needs and risk controls, design outputs, verification to ensure that the product works as planned, validation to ensure that the product works in its intended setting, and processes for transferring the software into the clinical environment.[164]

To apply a QMS to the foundation model development phase, it is logical to look at the data used to (pre-)train the model. This – alongside compute – is the key input at this layer of the AI supply chain. As with the pharmaceuticals governed by the FDA, the inputs will strongly shape the outputs, such as decisions on size (of dataset and parameters), purpose (while pre-trained models are designed to be used for multiple downstream tasks, some models are better suited than others to particular types of tasks) and values (for example, choices on filtering and cleaning the data).[165]

These decisions can lead to issues in areas such as bias,[166] copyright[167] and AI-generated data[168] throughout the lifecycle. Data governance and documentation obligations are therefore needed, with similar oversight to the FDA QMS for SaMD. This will build an understanding of where risks and harms originate and make it easier to stop them from proliferating by intervening upstream.

Regulators should therefore consider model and dataset documentation methods[169] for pre-training and fine-tuning foundation models. For example, model cards document information about the model’s architecture, testing methods and intended uses,[170] while datasheets document information about a dataset, including what kind of data is included and how it was collected and processed.[171] A comprehensive model card should also contain a risk assessment,[172] similar to the FDA’s controls for testing for effectiveness in intended settings. This could be based on uses foreseen by foundation model developers. Compelling this level of documentation would help to introduce FDA-style levels of QMS practice for AI training data.
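
The sketch below illustrates what machine-readable documentation of this kind might look like, in the spirit of model cards and datasheets. The field names and example values are assumptions for illustration, not a mandated schema.

```python
from dataclasses import dataclass, field

@dataclass
class Datasheet:
    dataset_name: str
    collection_method: str
    licence: str
    known_limitations: list[str] = field(default_factory=list)

@dataclass
class ModelCard:
    model_name: str
    architecture: str
    intended_uses: list[str]
    evaluation_methods: list[str]
    risk_assessment: dict[str, str]              # risk -> mitigation, per foreseen use
    training_datasheets: list[Datasheet] = field(default_factory=list)

# Illustrative example with hypothetical entries
card = ModelCard(
    model_name="example-foundation-model",
    architecture="decoder-only transformer",
    intended_uses=["text summarisation", "question answering"],
    evaluation_methods=["bias benchmarks", "toxicity red teaming"],
    risk_assessment={"stereotyping outputs": "filtered training data; output guardrails"},
    training_datasheets=[
        Datasheet(
            dataset_name="example-web-corpus",
            collection_method="filtered web crawl",
            licence="mixed/licensed",
            known_limitations=["English-dominant", "under-represents some dialects"],
        )
    ],
)
```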

Core policy implications

An approach to pre-notification of, and information-sharing on, large training runs could use the pre-registration process of the FDA as a model. As discussed above, under the FDA regime, developers are continuously providing information to the regulator, from the pre-training stage onwards.[173] This should also be the case in relation to foundation models.

It might also make sense to track core inputs to training runs by giving UDIs to microchips. This would allow compliance with regulations or standards to be tracked and would ensure that the regulator would have sight of non-notified large training runs. Finally, the other key input into training AI models – data – should adhere to documentation obligations, similarly to FDA QMS procedures.

Foundation model developer layer oversight

Decisions taken early in the development process have significant implications downstream. For example, models (pre-)trained on fundamental human rights values produce outputs that are less structurally harmful.[174] To reduce the risk of harm as early as possible, critical decisions that shape performance across the supply chain should be documented as they are made, before wide-scale distribution, fine-tuning or application.

Third-party evidence generation and endpoints

The FDA model relies on third-party efficacy and safety evidence to prove ‘endpoints’ (targeted outcomes, jointly agreed between the FDA and developers before a clinical trial) as defined in standards or in an exploratory manner together with the FDA. This allows high-quality information on the pre-market processes for devices to be gathered and submitted to regulators.

Narrowly defined endpoints are very similar to one of the most commonly cited interventions in the AI governance space: technical audits.[175] A technical audit is ‘a narrowly targeted test of a particular hypothesis about a system, usually by looking at its inputs and outputs – for instance, seeing if the system performs differently for different user groups’. Such audits have been suggested by many AI developers and researchers and by civil society.[176]

Regulators should therefore develop – or support the AI ecosystem to develop – benchmarks and metrics to assess the capabilities of foundation models, and possibly thresholds that a model would have to meet before it could be placed on the market. This would help standardise the approach to third-party compliance with evidence and measurement requirements, as under the FDA, and establish a culture of safety in the sector.

Clinical trials

In the absence of narrowly defined endpoints and in cases of uncertainty, the FDA works with developers and third-party experts to enable more exploratory scrutiny as part of trials and approvals. Some of these trials are based on iterative risk management and explorative auditing, and on small-scale deployment to facilitate ‘learning by doing’ on safety issues. This informs what monitoring is needed, provides iterative advice and leads to learning being embedded in regulations afterwards.

AI regulators could use similar mechanisms, such as (regulatory) sandboxes. This would involve pre-market, small-scale deployment of AI models in real-world but controlled conditions, with regulator oversight.

This could be done using a representative population for red-teaming, expert ‘adversarial’ red-teamers (at the foundation model developer stage), or sandboxing more focused on foreseeable or experimental applications and how they interact with end users. In some jurisdictions, existing regulatory obligations could be used as the endpoint and offer presumptions of conformity – and therefore market access – after sandbox testing (as in the EU AI Act).

It will take work to develop a method and an ecosystem of independent experts who can work on third-party audits and sandboxes for foundation models. But this is a challenge the FDA has met, as have other sectors such as aviation, motor vehicles and banking.[177] An approach like the one described above has been used in aviation to monitor and document incidents and devise risk mitigation strategies. This helped to encourage a culture of safety in the industry, reducing fatality risk by 83 per cent between 1998 and 2008 (at the same time as a five per cent annual increase in passenger kilometres flown).[178]

Many organisations already exist that can service this need in the AI space (for example, Eticas AI, AppliedAI, Algorithmic Audit, Apollo Research), and more are likely to be set up.[179]

An alternative to sandboxes is to consider structured access for foundation models, at least until it can be proven that a model is safe for wide-scale deployment.[180] This would be an adaptation of the FDA’s approach to clinical trials, which allows experimentation with a limited number of people when the technology has a wide spectrum of uses (for example, gene editing) or when the risks are unclear, to get insights while preventing any harms that arise from proliferation.

Applied to AI, this could entail a staged release process – something leading AI researchers have already advocated for. This would involve model release to a small number of people (for example, vetted researchers) so that ‘beta’ testing is not done on the whole population via mass deployment.

Internal testing and disclosure of ‘adverse events’

Another mechanism used at the development stage by the FDA is internal testing and mandatory disclosure of ‘adverse events’. Regulators could impose similar obligations on foundation model developers, requiring internal audits and red teaming[181] and the disclosure of findings to regulators. Again, these approaches have been suggested by leading AI developers.[182] They could be made more rigorous by coupling them with mandatory disclosure, as under the FDA regime.

The AI governance equivalent of reporting ‘adverse events’ might be incident monitoring.[183] This would involve a ‘systematic approach to the collection and dissemination of incident analysis to illuminate patterns in harms caused by AI’.[184] The approach could be strengthened further by including ‘near-miss’ incidents.[185]

In developing these proposals, however, it is important to bear in mind the challenges faced in the life sciences sector in making adverse event reporting suitably prescriptive. For example, clear indicators for what to report need to be established so that developers cannot claim ignorance and underreport.

However, it is not possible to foresee all potential effects of a foundation model. As a result, there needs to be some flexibility in incident reporting as well as penalties for not reporting. Medical device regulators in the UK have navigated this by providing high-level examples of indirect harms to look out for, and examples of the causes of these harms.[186] In the USA, drug and device developers are liable to report larger-scale incidents, enforced by the FDA through, for example, fines. If enacted effectively, this kind of incident reporting would be a valuable foresight mechanism for identifying emergent harms.
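
As an illustration of how such reporting could be made more prescriptive while retaining flexibility, the sketch below defines a hypothetical incident report structure, including ‘near-miss’ events, with an indicative rule for what must be escalated to the regulator. The severity levels and thresholds are assumptions.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum

class Severity(Enum):
    NEAR_MISS = "near miss"       # no harm occurred, but plausibly could have
    MINOR = "minor"
    SERIOUS = "serious"           # must be reported to the regulator

@dataclass
class IncidentReport:
    model_name: str
    date_observed: date
    description: str
    severity: Severity
    affected_users_estimate: int
    corrective_action: str

def must_report_to_regulator(report: IncidentReport) -> bool:
    """Hypothetical rule: serious incidents and large-scale events are reportable."""
    return report.severity is Severity.SERIOUS or report.affected_users_estimate > 10_000
```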

A pre-market approval gate for foundation models

After the foundation model developer layer, regulators should consider a pre-market approval gate (as used by the FDA) at the point just before the model is made widely available and accessible for use by other businesses and consumers. This would build on the mandatory disclosure obligations at the data and compute layers and involve submitting all documentation compiled from third-party audits, internal audits, red teaming and sandbox testing. It would be a rigorous regime, similar to the FDA’s use of QMS, third-party efficacy evidence, adverse event reporting and clinical trials.

AI regulators should ensure that documentation and testing practices are standardised, as they are in FDA oversight. This would ensure that high-value information is used for market approval at the optimal time, to minimise the risk of potential downstream harms before a model is released onto the market.

This approach also depends on developing adequate benchmarks and standards. As a stopgap, approval gates could initially be based on transparency requirements and the provision of exploratory evidence. As benchmarks and standards emerge over time, the evidence required could be more clearly defined.

Such an approval gate would be consistent with one of the key risk-reducing features of the FDA’s approach: putting the burden of proof on developers. Many of the concerns around third-party audits of foundation models (in the context of the EU AI Act) centre on the lack of technological expertise beyond AI labs. A pre-market approval gate would allow AI regulators to specify what levels of safety they expect before a foundation model can reach the market, but the responsibility for proving safety and reliability would be placed on the experts who wish to bring the model to market.

In addition, the approval gate offers the regulator and accredited third parties the chance to learn. As the regulator learns – and the technology develops – approval gates could be updated via binding guidance (rather than legislative changes). This combination of ‘intervention and reflection’ has ‘been shown to work in safety-critical domains such as health’.[187] Regulators and other third parties should cascade this learning downstream, for example, to parties who build on top of the foundation model. This is a key risk-reducing feature of the FDA’s approach: the ‘approvers’ and others in the ecosystem become more capable and more aware of safe use and risk mitigation.

While the burden of proof would be primarily on developers (who may use third parties to support in evidence creation), approval would still depend on the regulator. Another key lesson from FDA processes is that the regulator should bring in support from independent experts in cases of uncertainty, via a committee of experts, consumer and industry representatives, and patient representatives. This is important, as the EU’s regulatory regime for AI has been criticised for a lack of multi-stakeholder governance mechanisms, including ‘effective citizen engagement’.[188]

Indeed, many commercial AI labs say that they want avenues for democratic oversight and public participation (for example, OpenAI and Anthropic’s participation in ‘alignment assemblies’,[189] which seek public opinion to inform, for example, release criteria) but are unclear on how to establish them.[190] Introducing ways to engage stakeholders in cases of uncertainty as part of the foundation model approval process could help to address this. It would give a voice to those who could be affected by models with potentially societal-level implications, in the same way patients are given a voice in FDA review processes for SaMD. It might also help address one of the limitations of the FDA: an overreliance on industry expertise in some novel areas.

To introduce public participation in foundation model oversight in a meaningful way, it will be important to consider which approach to engagement is best suited to helping identify risks.

One criterion to consider is who should be involved, with options ranging from a representative panel or jury of members of the public to panels formed of members of the public at higher risk of harm or marginalisation.

Another criterion is the depth of engagement. This is often framed as a spectrum from low involvement, such as public consultations, through to deeper processes that involve partnership in decision-making.[191]

A third criterion is the method of engagement, which would depend on decisions about who should be involved and to what extent. For example, surveys or focus groups are common in consultative exercises; workshops can enable more involvement; and panels and juries allow for deeper engagement, which can result in members proposing recommendations. In any case, it will be important to consider whose voices, experiences and potential harms will be included or missed, and to ensure that those less represented or at greater risk of harm are part of the process.

Finally, there are ongoing debates about whether pre-market approval should be applied to all foundation models, or ‘tiered’ to ensure those with the most potential to impact society are subject to greater oversight.

While answering this question is beyond the scope of this paper, it seems important that both ex ante and ex post metrics are considered when establishing which models belong in which tier. The former might include, for example, measurement of modalities, the generality of the base model, the distribution method and the potential for adaptation of the model. The latter could include the number of downstream applications built on the model, the number of users across applications and how many times the model is being queried. Any regulator must have the power and capacity to update the makeup of tiers in a timely fashion as and when these metrics shift.
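
Purely to illustrate how ex ante and ex post metrics might be combined, the sketch below assigns a model to a coarse oversight tier. The metrics, thresholds and tier labels are hypothetical, and a regulator would need to revisit them as these metrics shift.

```python
from dataclasses import dataclass

@dataclass
class ModelMetrics:
    # Ex ante metrics (known before wide release)
    modalities: int                 # e.g. text, image, audio
    open_weights: bool
    # Ex post metrics (observed after release)
    downstream_applications: int
    monthly_users: int

def oversight_tier(m: ModelMetrics) -> str:
    """Return a coarse tier; thresholds are illustrative and would need regular review."""
    if m.monthly_users > 10_000_000 or m.downstream_applications > 1_000:
        return "highest scrutiny"
    if m.open_weights or m.modalities >= 3:
        return "enhanced scrutiny"
    return "baseline scrutiny"

# Illustrative usage
print(oversight_tier(ModelMetrics(modalities=3, open_weights=False,
                                  downstream_applications=2_500, monthly_users=50_000)))
```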

Application layer oversight

Following the AI supply chain, a foundation model is made available and distributed via the ‘host’ layer, by either the model provider (API access) or a cloud service provider (for example, Hugging Face, which hosts models for download).

Some argue that this layer should also have some responsibility for the safe development and distribution of foundation models (for example, through KYC checks, safety testing before hosting or take-down obligations in case of harm). But there is a reason why regulators have focused primarily on developers and deployers: they have the most control over decisions affecting risk origin and safety levels. For this reason, we also focus on interventions beyond the host layer.

However, a minimal set of obligations on host layer actors (such as cloud service providers or model hosting platforms) is necessary, as they could play a role in evaluating model usage, implementing trust and safety policies to remove models that have demonstrated or are likely to demonstrate serious risks, and flagging harmful models to regulators when it is not in their power to take them down. This is beyond the scope of this paper, and we suggest that the responsibilities of the host layer are addressed in further research.

Once a foundation model is on the market and it is fine-tuned, built upon or deployed by downstream users, its risk profile becomes clearer. Regulatory gates and product safety checks are introduced by existing regulators at this stage, for example in healthcare, the automotive sector or machinery (see UK regulation of large language models – LLMs – as medical devices, or the EU AI Act’s regulation of foundation models deployed in ‘high-risk’ areas). These are useful regulatory endpoints that should help to reduce risk and harm proliferation.

However, there are still lessons to be learned at the application layer from the FDA model. Many of the mechanisms used at the foundation model developer layer could be used at this layer, but with endpoints defined based on the risk profile of the area of deployment. This could take the form of third-party audits based on context-specific standards, or sandboxes including representative users based on the specific setting in which the AI system will be used.

Commercial off-the-shelf software (COTS) in critical environments

One essential mechanism for the application layer is a deployment risk assessment. Researchers have proposed that this should involve a review of ‘(a) whether or not the model is safe to deploy, and (b) the appropriate guardrails for ensuring the deployment is safe’.[192] This would serve as an additional gate for context-specific risks and is similar to the FDA’s rules for systems that integrate COTS in severe-risk environments. Under these rules, additional approval is needed unless the COTS is approved for use in that context.

A comparable AI governance regime could allow foundation models that pass the earlier approval gate to be used downstream unless they are to be used in a high-risk or critical sector, in which case a new risk assessment would have to be undertaken and further regulatory approval sought.

For example, foundation models applied in a critical energy system would be pre-approved as COTS. The final approval would still need to be given by energy regulators, but the process would be substantially easier for pre-approved COTS. The EU AI Act employs a similar approach: foundation models that are given a high-risk ‘intended purpose’ by downstream developers would have to undergo EU conformity assessment procedures.

Algorithmic impact assessments are a tool for assessing the possible societal impacts of an AI system before the system is in use (with ongoing monitoring often advised).[193] Such assessments should be undertaken when an AI system is to be deployed in a critical area such as cybersecurity, and mitigation measures put in place. This assessment should be coupled with a new risk assessment (in addition to that carried out by the foundation model developer), tailored to the area of deployment. This could involve additional context-specific guidance or questions from regulators, and the subsequent mitigation measures should address these.

Algorithmic impact and risk assessments are essential components at the application layer for high-risk deployments, and are very similar to the QMS imposed by the FDA throughout the development and deployment process. If they are done correctly, they can help to ensure that risk and impact mitigation measures are put in place to cover the lifecycle and will form the basis of post-market monitoring processes.

Some AI governance experts have suggested that these assessments should be complemented by user evaluation and testing – defined as assessments of user-centric effects of an application or system, its functionality and its restrictions, usually via user testing or surveys.[194] These evaluations could be tailored to the intended use context of an application, to ensure adequate representation of people potentially affected by it, and would be similar to the context-specific audit gates used by the FDA.

Post-market monitoring

Across sectors, one-off conformity checks have been shown to open the door for regulations to be ‘gamed’ or for emergent behaviours to be missed (see the Volkswagen emissions scandal).[195] These issues are even more likely to arise in relation to AI, given its dynamic nature, including the capacity to change throughout the lifecycle and for downstream users to fine-tune and (re)deploy models in complex environments. The FDA model shows how these risks can be reduced by having an ecosystem of reporting and foresight, and strong regulatory powers to act to mitigate risks.

MedWatch and MedSun reporting

Post-market monitoring by the FDA includes reporting mechanisms such as MedWatch and MedSun.[196] These mechanisms enable adverse event reporting for medical products, as well as monitoring of the safety and effectiveness of medical devices. Serious incidents are documented and their details made available to consumers.

In the AI space, there are similar proposals for foundation model developers, and for high-risk application providers building on top of these models, to implement ‘an easy complaint mechanism for users and to swiftly report any serious risks that have been identified’.[197] This should compel the upstream providers to take corrective action when they can, and to document and report serious incidents to regulators.

This is particularly important for foundation models that are provided via API, as in this case the provider maintains a huge degree of control over the underlying model.[198] This would mean that the provider would usually be able to mitigate or correct the emerging risk. It would also reduce the burden on regulators to document incidents or take corrective action. Leading AI developers have already committed to introducing a ‘robust reporting mechanism’ to allow ‘issues [that] may persist even after an AI system is released’ to be ‘found and fixed quickly’.[199] Regulators could consider putting such a regime in place for all foundation models.

Regulators could also consider detection mechanisms for generative foundation models. These would aim to ‘distinguish content produced by the foundation model from other content, with a high degree of reliability’, as recently proposed by the Global Partnership on AI.[200] Their report found that this is ‘technically feasible and would play an important role in reducing certain risks from foundation models in many domains’. Requiring this approach, at least for the largest model providers (who have the resources and expertise to develop detection mechanisms), could mitigate risks such as disinformation and subsequent undermining of the rule of law or democracy.

Other reporting mechanisms for foundation models have been proposed, which overlap with the FDA’s ‘usability and clinical data logging, and trend reporting’. For example, Stanford researchers have suggested that regulators should compel the disclosure of usage patterns, in the same manner as transparency reporting for online platforms.[201] This would greatly enhance understanding of ‘how foundation models are used (for example, for providing medical advice, preparing legal documents) to hold their providers to account’.[202]

Concern-based audits

Concern-based audits are a key part of the FDA’s post-market governance. They are triggered by real-world monitoring of consumers and impacts after approval. If concerns are identified, the FDA has strong enforcement mechanisms that allow it to access relevant data and documentation. The audits are rigorous and have been shown to have strong deterrence effects on negligent behaviour by drug companies.

Mechanisms for highlighting ‘concern’ in the AI space could include reporting mechanisms and ‘trusted flaggers’ – organisations that are formally recognised as independent, and as having the requisite expertise, for identifying and reporting concerns. People affected by the technologies could be given the right to lodge a complaint with supervisory authorities, such as an AI ombudsman, to support people affected by AI and to increase regulators’ awareness of AI harms as they occur.[203] [204] This should be complemented by a comprehensive remedies framework for affected persons, based on effective avenues for redress, including a right to lodge a complaint with a supervisory authority, judicial remedy and an explanation of individual decision-making.

Feedback loops

Post-market monitoring is a critical element of the FDA’s risk-reducing features. It is based on mechanisms to facilitate feedback loops between developers, regulators, practitioners and patients. As discussed above, Unique Device Identifiers at the pre-registration stage support monitoring and traceability throughout the lifecycle, while ongoing review of quality, safety and efficacy data via QMS further supports this. Post-market monitoring for foundation models should similarly facilitate such feedback loops. These could include customer feedback, usability and user prompt screening, human-AI interaction evaluations and cross-company reporting of trends and structural indicators. Beyond feedback to the provider, affected persons should also be able to report incidents directly to a regulatory authority, particularly where harm arises, or is reasonably foreseeable to arise.

Software of Unknown Provenance (SOUP)

In the context of safety-critical medical software, SOUP is software that has been developed with an unknown development process or methodology, or which has unknown safety-related properties. The FDA monitors for SOUP by compelling the documentation of pre-specified post-market software adaptations, meaning that the regulator can validate changes to a product’s performance and monitor for issues and unforeseen use in software.[205]

Requiring similar documentation and disclosure of software and cybersecurity issues after deployment of a foundation model would be a minimum sensible safeguard for both risk mitigation and regulator learning. This could also include sharing issues back upstream to the model developer so that they can take corrective action or update testing and risk profiles.

The approach should be implemented alongside the obligations around internal testing and disclosure of adverse events for foundation models at the developer layer. Some have argued that disclosure of near misses should also be required (as it is in the aviation industry)[206] as an added incentive for safe development and deployment.

Another parallel with the monitoring of SOUP can be seen in AI governance proposals for measures around open-source foundation models. To reduce the unknown element, and for transparency and accountability reasons, application providers – or whoever makes the model or system available on the market – could be required to make it clear to affected persons when they are engaging with AI systems and what the underlying model is (including if it is open source), and to share easily accessible explanations of systems’ main parameters and any opt-out mechanisms or human alternatives available.[207] This would be the first step to both corrective action to mitigate risk or harm, and redress if a person is harmed. It is also a means to identify the use of untested underlying foundation models.

Finally, similar to the FDA’s use of documentation of pre-specified, post-market software adaptations, AI regulators could consider mandating that developers and application deployers document and share planned and foreseeable changes downstream. This would have to be defined clearly and standardised by regulators to a proportionate level, taking into consideration intellectual property and trade secret concerns, and the risk of the system being ‘gamed’ in the context of new capabilities. In other sectors, such as aviation, there have been examples of changes being underreported to avoid new costs, such as retraining.[208] But a similar regime would be particularly relevant for AI models and systems, given their unique ability to learn and develop throughout their lifecycle.

The need for documenting or pre-specifying post-market adaptations of foundation models could be based on capabilities evaluations and risk assessments, so that new capabilities or risks that arise post-deployment are reported to the ecosystem. Significant changes could trigger additional safety checks, such as third-party (‘concern-based’, in FDA parlance) audits or red teaming to stress-test the new capabilities.
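
A minimal sketch of how documented post-market changes might be screened for significance is shown below; the change types and trigger rule are hypothetical assumptions rather than an established standard.

```python
from dataclasses import dataclass

# Hypothetical categories of change significant enough to warrant re-review
SIGNIFICANT_CHANGE_TYPES = {"new modality", "expanded tool use", "major fine-tune"}

@dataclass
class PostMarketChange:
    model_name: str
    change_type: str          # e.g. 'minor patch', 'major fine-tune', 'new modality'
    description: str
    new_capabilities_observed: bool

def triggers_additional_checks(change: PostMarketChange) -> bool:
    """Flag changes that should prompt a concern-based audit or renewed red teaming."""
    return change.change_type in SIGNIFICANT_CHANGE_TYPES or change.new_capabilities_observed

# Illustrative usage
change = PostMarketChange(
    model_name="example-foundation-model",
    change_type="major fine-tune",
    description="Fine-tuned on domain-specific medical dialogue",
    new_capabilities_observed=True,
)
print(triggers_additional_checks(change))  # True
```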

Investigative powers

The FDA’s post-market monitoring puts reporting obligations on providers and users, while underpinning this with strong investigative powers. It conducts ‘active surveillance’ (for example, under the Sentinel Initiative),[209] and it is legally empowered to check QMS and other documentation and logging data, request comprehensive evidence and conduct inspections.

Similarly, AI regulators should have powers to investigate foundation model developers and downstream deployers, such as for monitoring and learning purposes or when investigating suspected non-compliance. This could include off- and on-site inspections to gather evidence, to address the information asymmetries between AI developers and regulators, and to mitigate emergent risks or harms.

Such a regime would require adequate resources and sociotechnical expertise. Foundation models are a general-purpose technology that will increasingly form part of our digital infrastructure. In this light, there needs to be a recognition that regulators should be funded on a comparable level to other domains in which safety and public trust are paramount and where underlying technologies form important parts of national infrastructure – such as civil nuclear, civil aviation, medicines, and road and rail.[210]

Recalls, market withdrawals and safety alerts

The FDA uses recalls, market withdrawals and safety alerts when products are in violation of law. Recall can also be a voluntary action by manufacturers and distributors to meet their responsibility to protect public health and wellbeing from products that present risk or are otherwise defective.[211]

Some AI governance experts and standards bodies have called for foundation model developers to similarly establish standard criteria and protocols for when and how to restrict, suspend or retire a model from active use.[212] This would be based on monitoring by the original providers throughout the lifecycle for harmful impacts, misuse or security vulnerabilities (including leaks or otherwise unauthorised access).

Whistleblower protection

In the same way that the FDA mandates reporting, with associated whistleblower protections, of adverse events by employees, second-party clinical trial conductors and healthcare practitioners, AI regulators should protect whistleblowers (for example, academics, designers, developers, project contributors, auditors, product managers, engineers and economic operators) who suspect breaches of law by a developer or deployer or an AI model or system. This protection should be developed in a way that learns from the pitfalls of whistleblower law in other sectors, which have led to ineffective uptake or enforcement. This includes ensuring breadth of coverage, clear communication of processes and protections, and review mechanisms.[213]

Recommendations and open questions

The FDA model of pre-approval and monitoring is an important inspiration for regulating novel technologies with potentially severe risks, such as foundation models.

This model entails risk-based mandates for pre-approval based on mandatory safety evidence. This works well when risks reliably originate and can be identified before proliferating or developing into harms.

The general-purpose nature of foundation models requires exploratory external scrutiny upstream in the supply chain, and targeted sector-specific approvals downstream.

Risks need to be identified and mitigated before they proliferate. This is especially difficult for foundation models.[214] Explorative approval gates have been ‘shown to work in safety-critical domains such as health’, due to the combination of ‘intervention and reflection’. Pre-approvals offer the FDA a mechanism for intervention, allowing most risks to be caught.

Another important feature of oversight is reflection. In health regulation, this is achieved through ‘iteration via guidance, rather than requiring legislative changes’.[215] This is a key consideration for AI regulators, who should be empowered (and compelled) to frequently update rules via binding guidance.

A continuous learning process to build suitable approval and monitoring regimes for foundation models is essential, especially at the model development layer. Downstream, there needs to be targeted scrutiny and approval for deployment through existing approval gates in specific application areas.

Effective oversight of foundation models requires recurring, independent evaluations and audits and access to information, placing the burden of proof on developers – not on civil society or regulators.

Literature reviews of other industries[216] show that this might be achieved through risk-based reviews by empowered regulators and third parties, tiered access for evaluators, mandatory pre-approvals, and treating foundation models like auditable products.

Our general principles for AI regulators are detailed in the section ‘Applying key features of FDA-style oversight to foundation models’.

Recommendations for AI regulators, developers and deployers

Data and compute layers oversight

  1. Regulators should compel pre-notification of, and information-sharing on, large training runs. Providers of compute for such training runs should cooperate with regulators on monitoring (by registering device IDs for microchips) and safety verification (know-your-customer (KYC) checks and tracking).
    • FDA inspiration: pre-submissions, Unique Device Identifiers (UDIs)
  2. Regulators should compel mandatory model and dataset documentation and disclosure for the pre-training and fine-tuning of foundation models,[217] [218] [219] including a capabilities evaluation and risk assessment within the model card for the (pre-) training stage and throughout the lifecycle.[220] Dataset documentation should focus on a description of training data that is safe to be made public (what is in it, where was it collected, under what licence, etc.), coupled with structured access for regulators or researchers to the training data itself (while adhering to strict levels of cybersecurity, as even this access carries security risks). An illustrative sketch of such a documentation record follows this list.
    • FDA inspiration: Quality Management System (QMS)
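
To make the documentation recommendation above more concrete, the following is a minimal sketch of what a machine-readable model and dataset documentation record could look like, loosely inspired by model cards and datasheets. All class and field names are illustrative assumptions rather than a prescribed or standardised schema.

```python
# Illustrative sketch of model and dataset documentation records.
# Field names are assumptions, not a mandated format.
from dataclasses import dataclass, field
from typing import List

@dataclass
class DatasetSheet:
    """Hypothetical dataset documentation record."""
    name: str
    description: str               # what is in the dataset
    collection_method: str         # where and how it was collected
    licence: str                   # under what licence
    known_limitations: List[str] = field(default_factory=list)
    access_route: str = "structured access for regulators and vetted researchers"

@dataclass
class ModelCard:
    """Hypothetical model documentation record for (pre-)training and fine-tuning."""
    model_name: str
    version: str
    training_datasets: List[DatasetSheet]
    capability_evaluations: List[str]   # e.g. benchmark or red-teaming results
    risk_assessment: str                # summary of identified risks and mitigations
    intended_uses: List[str] = field(default_factory=list)
    out_of_scope_uses: List[str] = field(default_factory=list)

# Example: a documentation bundle a developer might disclose to a regulator.
card = ModelCard(
    model_name="example-foundation-model",
    version="0.1",
    training_datasets=[DatasetSheet(
        name="web-corpus-sample",
        description="Filtered web text",
        collection_method="Crawl, deduplicated and filtered",
        licence="Mixed; see per-source records",
    )],
    capability_evaluations=["capability benchmark results", "red-team findings"],
    risk_assessment="Summary of risks identified at the pre-training stage",
)
```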

Foundation model layer oversight

  1. Regulators should introduce a pre-market approval gate for foundation models, as this is the most obvious point at which risks can proliferate. In any jurisdiction, defining the approval gate will require significant work, with input from all relevant stakeholders. Clarity should be provided about which foundation models would be subject to this stricter form of pre-market approval. Based on the FDA findings, this gate should at least entail submission of evidence to prove safety and market readiness based on internal testing and audits, third-party audits and (optional) sandboxes. Making models available on a strict and controllable basis via structured access could be considered as a temporary fix until an auditing ecosystem and/or sandboxes are developed. Depending on the jurisdiction in question and existing or foreseen pre-market approval for high-risk use, an additional approval gate should be introduced at the application layer, using endpoints (outcomes or thresholds to be met to determine efficacy and safety) based on the risk profile of the area of deployment.
    • FDA inspiration: QMS, third-party efficacy evidence, adverse events reporting, clinical trials
  2. Third-party audits should be required as part of the pre-market approval process, and sandbox testing (as described in the previous recommendation) in real-world conditions should be considered. These should consist of – at least – a third-party audit based on context-specific standards. Alternatively, regulators could use sandboxes that include representative users (based on the setting in which the AI system will be used) to check conformity before deployment. Results should be documented and disclosed to the regulator.
    • FDA inspiration: third-party efficacy evidence, adverse events reporting, clinical trials
  3. Developers should enable detection mechanisms for outputs of generative foundation models.[221] Developers and deployers should make clear to affected persons and end users when they are engaging with AI systems. As an additional safety mechanism, they should build in detection mechanisms to allow end users and affected persons to ‘distinguish content produced by the foundation model from other content, with a high degree of reliability’.[222] Such detection mechanisms are important both as a defensive tool (for example, tagging AI-generated content) and also to enable study of model impacts. AI regulators could consider making this mandatory, at least for the most significant models (developers of which may have the resources and expertise to develop detection mechanisms). An illustrative sketch of one possible detection approach follows this list.
    • FDA inspiration: post-market safety monitoring
  4. As part of the initial risk assessment, developers and deployers should document and share planned and foreseeable modifications throughout the foundation model’s supply chain. A substantial modification that falls outside this scope should trigger additional safety checks, such as third-party (‘concern-based’) audits or red teaming to stress test the new capabilities.
    • FDA inspiration: concern-based audits, pre-specified change control plans
  5. Foundation model developers, and subsequently high-risk application providers building on top of these models, should enable an easy complaint mechanism for users to swiftly report any serious risks that have been identified. This should compel upstream providers to take corrective action when they can, and to document and report serious incidents to regulators. These feedback loops should be strengthened further by awareness-raising across the ecosystem about reporting, and sharing lessons learned on what has been reported and corrective actions taken.
    • FDA inspiration: MedWatch and MedSun programs
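
To illustrate the detection-mechanism recommendation above, the following is a minimal sketch of a signed provenance tag that a provider could attach to generated outputs so that a downstream party can check whether a given piece of content matches a record issued by that model. Everything here is an illustrative assumption: real detection mechanisms are more likely to use statistical watermarking or content-credential standards, and because this toy version relies on the provider's secret key, verification in practice would sit behind a provider-run service or use public-key signatures.

```python
# A minimal sketch of provenance tagging for generated outputs.
# The key, tag format and function names are hypothetical assumptions.
import base64
import hashlib
import hmac
import json

SIGNING_KEY = b"hypothetical-provider-key"  # held by the model provider

def tag_output(text: str, model_id: str) -> dict:
    """Attach a signed provenance record to a generated output."""
    record = {"model_id": model_id, "sha256": hashlib.sha256(text.encode()).hexdigest()}
    payload = json.dumps(record, sort_keys=True).encode()
    record["signature"] = base64.b64encode(
        hmac.new(SIGNING_KEY, payload, hashlib.sha256).digest()
    ).decode()
    return record

def verify_output(text: str, record: dict) -> bool:
    """Check that a piece of content matches its provenance record."""
    claimed = dict(record)
    signature = claimed.pop("signature", "")
    payload = json.dumps(claimed, sort_keys=True).encode()
    expected = base64.b64encode(
        hmac.new(SIGNING_KEY, payload, hashlib.sha256).digest()
    ).decode()
    return hmac.compare_digest(signature, expected) and \
        claimed["sha256"] == hashlib.sha256(text.encode()).hexdigest()

generated = "Example model output."
record = tag_output(generated, model_id="example-foundation-model")
assert verify_output(generated, record)             # untampered content verifies
assert not verify_output(generated + "!", record)   # edited content fails the check
```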

Application layer oversight

  1. Existing sector-specific agencies should review and approve the use of foundation models for a set of use cases, by risk level. Deployers of foundation models in high-risk or critical areas (to be defined in each jurisdiction) should undertake a deployment risk assessment to review ‘(a) whether or not the model is safe to deploy, and (b) the appropriate guardrails for ensuring the deployment is safe’.[223] Upstream developers should cooperate and share information with downstream customers to conduct this assessment. If the model is deemed safe, deployers should also undertake an algorithmic impact assessment to assess possible societal impacts of an AI system before the system is in use (with ongoing monitoring often advised).[224] Results should be documented and disclosed to the regulator. An illustrative sketch of such an assessment record follows this list.
    • FDA inspiration: COTS (commercial off-the-shelf software), QMS
  2. Downstream application providers should make clear to end users and affected persons what the underlying foundation model is, including if it is an open-source model, and provide easily accessible explanations of systems’ main parameters and any opt-out mechanisms or human alternatives available.[225]
    • FDA inspiration: Software of Unknown Provenance (SOUP)
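
As an illustration of the two-part deployment risk assessment described above, the sketch below combines information shared by the upstream developer with the deployer's own context and planned guardrails. The function name, fields and the simple decision rule are assumptions made for illustration, not a prescribed methodology.

```python
# A minimal sketch of a deployment risk assessment record.
# Names, fields and the toy decision rule are illustrative assumptions.
from typing import Dict, List

def deployment_risk_assessment(
    upstream_disclosure: Dict[str, List[str]],  # risks and evaluations shared by the developer
    deployment_context: str,                    # e.g. a high-risk or critical area
    planned_guardrails: List[str],
) -> Dict[str, object]:
    """Answer (a) is the model safe to deploy here, and (b) which guardrails apply."""
    unmitigated = [
        risk for risk in upstream_disclosure.get("known_risks", [])
        if not planned_guardrails  # toy rule: any known risk needs at least one guardrail
    ]
    return {
        "context": deployment_context,
        "safe_to_deploy": not unmitigated,  # (a)
        "guardrails": planned_guardrails,   # (b)
        "unmitigated_risks": unmitigated,
        "disclose_to_regulator": True,      # results documented and disclosed
    }

report = deployment_risk_assessment(
    upstream_disclosure={"known_risks": ["unequal performance across user groups"]},
    deployment_context="triage support in a clinical setting",
    planned_guardrails=["human review of all outputs", "opt-out route for affected persons"],
)
print(report["safe_to_deploy"])  # True, because guardrails are in place for the known risk
```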

Post-market monitoring

  1. An AI ombudsman should be considered, to receive and document complaints or known instances of harms of AI. This would increase regulators’ visibility of AI harms as they occur. It could be piloted initially for a relatively modest investment, but if successful it could dramatically improve redress for AI harms and the functionality of an AI regulatory framework as a whole.[226] An ombudsman should be complemented by a comprehensive remedies framework for affected persons based on clear avenues for redress.
    • FDA inspiration: concern-based audits, reporting of adverse events
  2. Developers and deployers should provide documentation and disclosure of incidents throughout the supply chain, including near misses.[227] This could be strengthened by requiring downstream developers (building on top of foundation models at the application layer) and end users (for example, medical or education professionals) to also disclose incidents. An illustrative sketch of an incident report record follows this list.
    • FDA inspiration: reporting of adverse events
  3. Foundation model developers and downstream deployers should be compelled to restrict, suspend or retire a model from active use if harmful impacts, misuse or security vulnerabilities (including leaks or other unauthorised access) arise. Such decisions should be based on standardised criteria and processes.[228]
  4. Host layer actors (for example, cloud service providers or model hosting platforms) should also play a role by evaluating model usage, implementing trust and safety policies to remove models that have demonstrated or are likely to demonstrate serious risks, and flagging harmful models to regulators when it is not in their power to take them down.
    • FDA inspiration: recalls, market withdrawals and safety alerts
  5. AI regulators should have strong powers to investigate and require evidence generation from foundation model developers and downstream deployers. This should be strengthened by whistleblower protections for anyone involved in the development or deployment process who raises concerns about risks to health or safety. This would support regulatory learning and act as a strong deterrent to rule breaking. Powers should include off- and on-site inspections and evidence-gathering mechanisms to address the information asymmetries between AI developers and regulators and to mitigate emergent risks or harms. Consideration should be given to the trade-offs between intellectual property, trade secret and privacy protections (and whether these could serve as undue legal loopholes) and the safety-enhancing features of investigative powers: regulators considering the FDA model across jurisdictions should clarify such legally contentious issues.
    • FDA inspiration: wide information access, active surveillance
  6. Any regulator should be funded to a level comparable to (if not greater than) regulators in other domains where safety and public trust are paramount and where underlying technologies form part of national infrastructure – such as civil nuclear, civil aviation, medicines, or road and rail.[229] Given the level of resourcing required, this may be partly funded by AI developers over a certain threshold (to be defined by the regulator, for example, based on annual turnover) – as is the case with the FDA[230] and the EU’s European Medicines Agency (EMA).[231] Such an approach is important to ensure that regulators have a source of funding that is stable, secure and, importantly, independent from political decisions or reprioritisation.
    • FDA inspiration: mandatory fees
  7. The law around AI liability should be clarified to ensure that legal and financial liability for AI risk is distributed proportionately along foundation model supply chains. Liability regimes vary between jurisdictions and a thorough assessment is beyond the scope of this paper, but across sectors regulating complex technology, clarity in liability is a key driver of compliance within companies and uptake of the technology. For example, lack of clarity as to end user liability in clinical AI is a major reason that uptake has been limited. Liability will be even more contentious in the foundation model supply chain when applications are developed on top of foundation models, and this must be addressed accordingly in any regulatory regime for AI.
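
As an illustration of supply-chain incident reporting that covers near misses as well as serious incidents, the sketch below defines a hypothetical report record and a simple escalation rule. The field names, severity scale and notification rule are assumptions made for this sketch, not an established reporting standard.

```python
# A minimal sketch of an incident report covering harms and near misses.
# Field names, severity scale and escalation rule are illustrative assumptions.
from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass
class IncidentReport:
    reported_on: date
    reporter_role: str          # e.g. "downstream developer", "end user", "deployer"
    model_id: str
    description: str
    severity: str               # "near miss", "minor", "serious"
    corrective_action: Optional[str] = None

def requires_regulator_notification(report: IncidentReport) -> bool:
    """Serious incidents are escalated; near misses are still logged and shared."""
    return report.severity == "serious"

report = IncidentReport(
    reported_on=date(2023, 11, 1),
    reporter_role="downstream developer",
    model_id="example-foundation-model",
    description="Model produced unsafe instructions despite content filters.",
    severity="serious",
    corrective_action="Filter updated; upstream provider notified.",
)
print(requires_regulator_notification(report))  # True -> document and disclose
```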

Overcoming the limitations of the FDA in a prospective AI regulatory regime

Having considered how the risk-reducing mechanisms of the FDA might be applied to AI governance, it makes sense to also acknowledge the limitations of the FDA regime, and to consider how they might also be counterbalanced in a prospective AI regulatory regime.

The first limitation is the lack of coverage for systemic risks, as the FDA focuses on risk to life. Systemic risks are prevalent in the AI space.[232] AI researchers have conceptualised systemic risk as societal harm and point out that it is similarly overlooked. Proposals to address this include: ‘(1) public oversight mechanisms to increase accountability, including mandatory impact assessments with the opportunity to provide societal feedback; (2) public monitoring mechanisms to ensure independent information gathering and dissemination about AI’s societal impact; and (3) the introduction of procedural rights with a societal dimension, including a right to access to information, access to justice, and participation in public decision-making on AI, regardless of the demonstration of individual harm’.[233] We have expanded on and included these mechanisms in our recommendations in the hope that they can overcome limitations centring on systemic risks.

The second limitation is the high cost of compliance and subsequent limited number of developers, given that the stringent approval requirements are challenging for smaller players to meet. Inspiration for how to counterbalance this may be gleaned from the EU’s FDA equivalent, the EMA. It offers tailored support to small and medium-sized enterprises (SMEs), via an SME Office that provides regulatory assistance for reduced fees. This has contributed to the approval rates for SME applicants increasing from 40 per cent in 2016 to 89 per cent in 2020.[234] Similarly, the UK’s NHS has an AI & Digital Regulations Service that gives guidance and advice on navigating regulation, especially for SMEs that do not have compliance teams.[235]

Streamlined regulatory pathways could be considered to further reduce burdens for AI models or systems with demonstrably promising potential (for example, for scientific discovery). The EMA has done this through its Advanced Therapy Medicine Products process, which streamlines approval procedures for certain medicines.[236]

Similar support mechanisms could be considered for SMEs and startups, as well as streamlined procedures for demonstrably beneficial AI technology, under an AI regulator.

The third limitation is the FDA’s overreliance on industry in some novel areas, because of a lack of expertise. Lack of capacity for effective regulatory oversight has been voiced as a concern in the AI space, too.[237] Some ideas exist for how to overcome this, such as the Singaporean AI Office’s use of public–private partnerships to utilise industry talent without being reliant on it.[238]

The EMA has grappled with similar challenges. Like the FDA, it overcomes knowledge gaps by having a pool of scientific experts, but it seeks to prevent conflict of interest by leaning substantially on transparency: the EMA Management Board and experts cannot have any financial or other interests in the industry they are overseeing, and the curricula vitae, declarations of interest and risk levels for these experts are publicly available.[239]

Taken together, these solutions might be considered to reduce the chances of the limitations of FDA governance being reproduced by an AI regulator.

Open questions

The proposed FDA-style oversight approach for foundation models is far from a detailed ready-to-implement guideline for regulators. We acknowledge the small sample of interviewees for this paper, and that many of our interview subjects may strongly support an FDA model for regulation. For further validation and detailing of the claims in this paper, we are especially interested in future work on three sets of questions.

Understanding foundation model risks

  • Across the foundation model supply chain, where exactly do foundation model risks[240] originate and proliferate, and which players need to be tasked with their mitigation? How can unknown risks be discovered?
  • How effective will exploratory and targeted scrutiny be in identifying different kinds of risks for foundation models?
  • Do current and future foundation models need to be categorised along risk tiers? If so, how? Do all foundation models need to go through an equally rigorous process of regulatory approvals?

Detailing FDA-style oversight for foundation models to foster ‘safe innovation’

  • For the FDA, what aspects of regulatory guidance were easier to prescribe, and to enforce in practice?
  • How do FDA-style oversight or specific oversight features address each risk of foundation models in detail?
  • How can FDA-style oversight for foundation models be integrated into international oversight regimes?[241]
  • What do FDA-style review, audit and inspection processes look like, step by step, for foundation models?
  • How can the limitations of the FDA approach be addressed in every layer of the foundation model supply chain? How can difficult-to-detect systemic risks be mitigated? How can the stifling of innovation, especially among SMEs, be avoided?
  • Are FDA-style product recalls feasible for a foundation model or for downstream applications of foundation models?
  • What role should third parties in the host layer play? While they have less remit over risk origin, might they have significant control over, for example, risk mitigation?
  • What are the implications of FDA-style oversight of foundation models for their accessibility, affordability and the sharing of their benefits?
  • How would FDA-style pre-approvals be enforced for foundation models, for example, for product recalls?
  • How is liability distributed in an FDA-style oversight approach?
  • Why is the FDA able to be stringent and cautious? How do political incentives around congressional oversight, and aversion to the risk of harm from medication, apply to foundation model regulation?
  • What can be learned from the political economy of the FDA and its reputation?
  • In each jurisdiction (for example, USA, UK, EU), how does an FDA-style approach for AI fit into the political economy and institutional landscape?
  • In each jurisdiction, how should liability law be adapted for AI to ensure that legal and financial liability for AI risk is distributed proportionately along foundation model supply chains?

Learnings from other regulators

  • What can be learned from regulators in public health in other jurisdictions, like the UK’s Medicines and Healthcare products Regulatory Agency (MHRA), the EU’s EMA and Health Canada?[242] [243] [244]
  • How can other non-health regulators, such as the US Federal Aviation Administration or National Highway Traffic Safety Administration, inspire foundation model oversight?[245]
  • How can novel forms of oversight and audits, such as cross-audits or joint audits, be coupled with processes from existing regulators?

Acknowledgements

This paper was co-authored by Merlin Stein (PhD candidate at the University of Oxford) and Connor Dunlop (EU Public Policy Lead at the Ada Lovelace Institute) with input from Andrew Strait.

Interviewees

The 20 interviewees included experts on FDA oversight and foundation model evaluation processes from industry, academia, and thinktanks, as well as government officials. This included three interviews with leading AI labs, two with third-party AI evaluators and auditors, nine with civil society organisations, and six with medical software regulation experts, including former FDA leadership and clinical trial leaders.

The following participants gave us permission to mention their names and affiliations (in alphabetical order). Ten interviewees not listed here did not provide their permission. Respondents do not represent any organisations they are affiliated with. They chose to add their name after the interview and were not sent a draft of this paper before publication. The views expressed in this paper are those of the Ada Lovelace Institute.

  • Kasia Chmielinski, Berkman Klein Center for Internet & Society
  • Gemma Galdón-Clavell, Eticas Research & Consulting
  • Gilian Hadfield, University of Toronto, Vector Institute and OpenAI, independent contractor
  • Sonia Khatri, independent SaMD and medical device regulation expert
  • Igor Krawczuk, Lausanne Institute of Technology
  • Sarah Myers West, AI Now Institute
  • Noah Strait, Scientific and Medical Affairs Consulting
  • Robert Trager, Blavatnik School of Government, University of Oxford, and Centre for the Governance of AI
  • Alexandra Tsalidas, Harvard Ethical Intelligence Lab
  • Rudolf Wagner, independent senior executive advisor for SaMD

Reviewers

We are grateful for helpful comments and discussions on this work from:

  • Ashwin Acharya
  • Markus Anderljung
  • Clíodhna Ní Ghuidhir
  • Xiaoxuan Liu
  • Deborah Raji
  • Sarah Myers West
  • Moritz von Knebel

Footnotes

[1] ‘Voluntary AI Commitments’ <www.whitehouse.gov/wp-content/uploads/2023/09/Voluntary-AI-Commitments-September-2023.pdf> accessed 12 October 2023

[2] ‘An EU AI Act that works for people and society’ (Ada Lovelace Institute 2023) <www.adalovelaceinstitute.org/policy-briefing/eu-ai-act-trilogues/> accessed 12 October 2023

[3] The factors that determine AI risk are not purely technical – sociotechnical determinants of risk are crucial. Features such as the context of deployment, the competency of the intended users, and the optionality of interacting with an AI system must all be considered, in addition to specifics of the data and AI model deployed. OECD, “OECD Framework for the Classification of AI Systems,” OECD Digital Economy Papers, no. 323 (February 2022), https://doi.org/10.1787/cb6d9eca-en.

[4] Markus Anderljung and others, ‘Frontier AI Regulation: Managing Emerging Risks to Public Safety’ (arXiv, 4 September 2023) <http://arxiv.org/abs/2307.03718> accessed 15 September 2023.

[5] ‘A Law for Foundation Models: The EU AI Act Can Improve Regulation for Fairer Competition – OECD.AI’ <https://oecd.ai/en/wonk/foundation-models-eu-ai-act-fairer-competition> accessed 15 September 2023.

[6] ‘Stanford CRFM’ <https://crfm.stanford.edu/report.html> accessed 15 September 2023.

[7] ‘While only a few well-resourced actors worldwide have released general purpose AI models, hundreds of millions of end-users already use these models, further scaled by potentially thousands of applications building on them across a variety of sectors, ranging from education and healthcare to media and finance.’ Pegah Maham and Sabrina Küspert, ‘Governing General Purpose AI’.

[8] Draft standards here are a very good example of the value of dataset documentation (i.e. declaring metadata) on what is used in training and fine-tuning models. In theory, this could also all be kept confidential as commercially sensitive information once a legal infrastructure is in place www.datadiversity.org/draft-standards

[9] Mitchell, Wu, Zaldivar, Barnes, Vasserman, Hutchinson, Spitzer, Raji and Gebru, (2019), ‘Model Cards for Model Reporting’, doi: 10.1145/3287560.3287596

[10] Gebru, Morgenstern, Vecchione, Vaughan, Wallach, Daum and Crawford, (2021), Datasheets for Datasets, https://m-cacm.acm.org/magazines/2021/12/256932-datasheets-for-datasets/abstract (Accessed: 27 February 2023) Hutchinson, Smart, Hanna, Denton, Greer, Kjartansson, Barnes and Mitchell, (2021), ‘Towards Accountability for Machine Learning Datasets: Practices from Software Engineering and Infrastructure’, doi: 10.1145/3442188.3445918;

[11] In the UK, the Civil Aviation Authority has a revenue of £140m and staff of over 1,000, and the Office for Nuclear Regulation around £90m with around 700 staff). An EU-level agency for AI should be funded well beyond this, given that the EU is more than six times the size of the UK.

[12] Algorithmic Accountability Act of 2022 <2022-02-03 Algorithmic Accountability Act of 2022 One-pager (senate.gov)> accessed 15 September 2023.

[13] Lingjiao Chen, Matei Zaharia and James Zou, ‘How Is ChatGPT’s Behavior Changing over Time?’ (arXiv, 1 August 2023) <http://arxiv.org/abs/2307.09009> accessed 15 September 2023.

[14] ‘AI-Generated Books on Amazon Could Give Deadly Advice – Decrypt’ <https://decrypt.co/154187/ai-generated-books-on-amazon-could-give-deadly-advice> accessed 15 September 2023.

[15] ‘Generative AI for Medical Research | The BMJ’ <www.bmj.com/content/382/bmj.p1551#> accessed 15 September 2023.

[16] Emanuel Maiberg ·, ‘Inside the AI Porn Marketplace Where Everything and Everyone Is for Sale’ (404 Media, 22 August 2023) <www.404media.co/inside-the-ai-porn-marketplace-where-everything-and-everyone-is-for-sale/> accessed 15 September 2023.

[17] Belle Lin, ‘AI Is Generating Security Risks Faster Than Companies Can Keep Up’ Wall Street Journal (10 August 2023) <www.wsj.com/articles/ai-is-generating-security-risks-faster-than-companies-can-keep-up-a2bdedd4> accessed 15 September 2023.

[18] Sarah Carter and others, ‘The Convergence of Artificial Intelligence and the Life Sciences’ <www.nti.org/analysis/articles/the-convergence-of-artificial-intelligence-and-the-life-sciences/> accessed 2 November 2023

[19] Dual Use of Artificial Intelligence-powered Drug Discovery – PubMed (nih.gov)

[20] Haydn Belfield, ‘Great British Cloud And BritGPT: The UK’s AI Industrial Strategy Must Play To Our Strengths’ (Labour for the Long Term 2023)

[21] Thinking About Risks From AI: Accidents, Misuse and Structure | Lawfare (lawfaremedia.org)

[22] ‘Governing General Purpose AI — A Comprehensive Map of Unreliability, Misuse and Systemic Risks’ (Stiftung Neue Verantwortung (SNV)); Anthropic, ‘Frontier Threats Red Teaming for AI Safety’

[23] www.deepmind.com/blog/an-early-warning-system-for-novel-ai-risks

[24] ‘Mission critical: Lessons from relevant sectors for AI safety’ (Ada Lovelace Institute 2023) <https://www.adalovelaceinstitute.org/policy-briefing/ai-safety/> accessed 23 November 2023

[25] ‘EU AI Standards Development and Civil Society Participation’ <www.adalovelaceinstitute.org/event/eu-ai-standards-civil-society-participation/> accessed 18 September 2023.

[26] Algorithmic Accountability Act of 2022 <2022-02-03 Algorithmic Accountability Act of 2022 One-pager (senate.gov)> accessed 15 September 2023.

[27] ‘The Problem with AI Licensing & an “FDA for Algorithms” | The Federalist Society’ <https://fedsoc.org/commentary/fedsoc-blog/the-problem-with-ai-licensing-an-fda-for-algorithms> accessed 15 September 2023.

[28] ‘Clip: Amy Kapczynski on an Old Idea Getting New Attention–an “FDA for AI”. – AI Now Institute’ <https://ainowinstitute.org/general/clip-amy-kapczynski-on-an-old-idea-getting-new-attention-an-fda-for-ai> accessed 15 September 2023.

[29] Dylan Matthews, ‘The AI Rules That US Policymakers Are Considering, Explained’ (Vox, 1 August 2023) <www.vox.com/future-perfect/23775650/ai-regulation-openai-gpt-anthropic-midjourney-stable> accessed 15 September 2023; Belenguer L, ‘AI Bias: Exploring Discriminatory Algorithmic Decision-Making Models and the Application of Possible Machine-Centric Solutions Adapted from the Pharmaceutical Industry’ (2022) 2 AI and Ethics 771 <https://doi.org/10.1007/s43681-022-00138-8>

[30] ‘Senate Hearing on Regulating Artificial Intelligence Technology | C-SPAN.Org’ <www.c-span.org/video/?529513-1/senate-hearing-regulating-artificial-intelligence-technology> accessed 15 September 2023.

[31] ‘AI Algorithms Need FDA-Style Drug Trials | WIRED’ <www.wired.com/story/ai-algorithms-need-drug-trials/> accessed 15 September 2023.

[32] ‘One of the “Godfathers of AI” Airs His Concerns’ The Economist <www.economist.com/by-invitation/2023/07/21/one-of-the-godfathers-of-ai-airs-his-concerns> accessed 15 September 2023.

[33] ‘ISVP’ <www.senate.gov/isvp/?auto_play=false&comm=judiciary&filename=judiciary072523&poster=www.judiciary.senate.gov/assets/images/video-poster.png&stt=> accessed 15 September 2023.

[34] ‘Regulations.Gov’ <www.regulations.gov/docket/NTIA-2023-0005/comments> accessed 15 September 2023.

[35] Guidelines for Artificial Intelligence in Medicine: Literature Review and Content Analysis of Frameworks – PMC (nih.gov)

[36] ‘Foundation Models for Generalist Medical Artificial Intelligence | Nature’ <www.nature.com/articles/s41586-023-05881-4> accessed 15 September 2023.

[37] Anthropic admitted openly that “we do not know how to train systems to robustly behave well“. ‘Core Views on AI Safety: When, Why, What, and How’ (Anthropic) <www.anthropic.com/index/core-views-on-ai-safety> accessed 18 September 2023.

[38] NTIA AI Accountability Request for Comment <www.regulations.gov/docket/NTIA-2023-0005/comments> accessed 18 September 2023.

[39] Inioluwa Deborah Raji and others, ‘Outsider Oversight: Designing a Third Party Audit Ecosystem for AI Governance’ (arXiv, 9 June 2022) <http://arxiv.org/abs/2206.04737> accessed 18 September 2023.

[40] See Appendix for a list of interviewees

[41] Michael Moor and others, ‘Foundation Models for Generalist Medical Artificial Intelligence’ (2023) 616 Nature 259.

[42] Lewis Ho and others, ‘International Institutions for Advanced AI’ (arXiv, 11 July 2023) <http://arxiv.org/abs/2307.04699> accessed 18 September 2023.

[43] Center for Devices and Radiological Health, ‘Medical Device Single Audit Program (MDSAP)’ (FDA, 24 August 2023) <www.fda.gov/medical-devices/cdrh-international-programs/medical-device-single-audit-program-mdsap> accessed 18 September 2023.

[44] Center for Drug Evaluation and Research, ‘Conducting Clinical Trials’ (FDA, 2 August 2023) <www.fda.gov/drugs/development-approval-process-drugs/conducting-clinical-trials> accessed 18 September 2023.

[45] ‘Explainer: What Is a Foundation Model?’ <www.adalovelaceinstitute.org/resource/foundation-models-explainer/> accessed 18 September 2023.
Alternatively: ‘any model that is trained on broad data (generally using self-supervision at scale) that can be adapted (e.g., fine-tuned) to a wide range of downstream tasks’.

Bommasani R and others, ‘On the Opportunities and Risks of Foundation Models’ (arXiv, 12 July 2022) <http://arxiv.org/abs/2108.07258>

[46] ‘Explainer: What Is a Foundation Model?’ <www.adalovelaceinstitute.org/resource/foundation-models-explainer/> accessed 18 September 2023.

[47] Ibid.

[48] AWS, ‘Fine-Tune a Model’ <https://docs.aws.amazon.com/sagemaker/latest/dg/jumpstart-fine-tune.html> accessed 3 July 2023

[49] ‘Explainer: What Is a Foundation Model?’ <www.adalovelaceinstitute.org/resource/foundation-models-explainer/> accessed 18 September 2023.

[50] ‘ISO – ISO 9001 and Related Standards — Quality Management’ (ISO, 1 September 2021) <www.iso.org/iso-9001-quality-management.html> accessed 2 November 2023.

[51] ‘ISO 13485:2016’ (ISO, 2 June 2021) <www.iso.org/standard/59752.html> accessed 2 November 2023.

 [52] OECD, ‘Risk-Based Regulation’ in OECD, OECD Regulatory Policy Outlook 2021 (OECD 2021) <www.oecd-ilibrary.org/governance/oecd-regulatory-policy-outlook-2021_9d082a11-en> accessed 18 September 2023.

[53] Center for Devices and Radiological Health, ‘International Medical Device Regulators Forum (IMDRF)’ (FDA, 15 September 2023) <www.fda.gov/medical-devices/cdrh-international-programs/international-medical-device-regulators-forum-imdrf> accessed 18 September 2023.

[54] Office of the Commissioner, ‘What We Do’ (FDA, 28 June 2021) <www.fda.gov/about-fda/what-we-do> accessed 18 September 2023.

[55] ‘FDA User Fees: Examining Changes in Medical Product Development and Economic Benefits’ (ASPE) <https://aspe.hhs.gov/reports/fda-user-fees> accessed 18 September 2023.

[56] ‘Premarket Approval (PMA)’ <www.accessdata.fda.gov/scripts/cdrh/cfdocs/cfpma/pma.cfm?id=P160009> accessed 18 September 2023.

[57] ‘Product Classification’ <www.accessdata.fda.gov/scripts/cdrh/cfdocs/cfPCD/classification.cfm?id=LQB> accessed 18 September 2023.

[58] Center for Devices and Radiological Health, ‘Et Control – P210018’ [2022] FDA <www.fda.gov/medical-devices/recently-approved-devices/et-control-p210018> accessed 18 September 2023.

[59] Note that only ~2% of SaMD are Class III; see ‘Approval of artificial intelligence and machine learning-based medical devices in the USA and Europe (2015–20): a comparative analysis’ (The Lancet Digital Health) and ‘Drugs and Devices: Comparison of European and U.S. Approval Processes’ (ScienceDirect)

[60] ‘Assessing the Efficacy and Safety of Medical Technologies (Part 4 of 12)’ (princeton.edu) accessed 18 September 2023.

[61] Ibid.

[62] For the purposes of this report, ‘effectiveness’ is used as a synonym of ‘efficacy’. In detail, effectiveness is concerned with the benefit of a technology under average conditions of use, whereas efficacy is the benefit under ideal conditions.

[63] ‘SAMD MDSW’ <www.quaregia.com/blog/samd-mdsw> accessed 18 September 2023.

[64] Office of the Commissioner, ‘The Drug Development Process’ (FDA, 20 February 2020) <www.fda.gov/patients/learn-about-drug-and-device-approvals/drug-development-process> accessed 18 September 2023.

[65] Eric Wu and others, ‘How Medical AI Devices Are Evaluated: Limitations and Recommendations from an Analysis of FDA Approvals’ (2021) 27 Nature Medicine 582.

[66] It can be debated whether this falls under the exact definition of SaMD as a stand-alone software feature, or as a software component of a medical device, but the lessons and process remain the same.

[67] SUMMARY OF SAFETY AND EFFECTIVENESS DATA (SSED) <www.accessdata.fda.gov/cdrh_docs/pdf21/P210018B.pdf> accessed 18 September 2023.

[68] A QMS is a standardised process for documenting compliance based on international standards (ISO 13485/820).

[69] Center for Devices and Radiological Health, ‘Overview of IVD Regulation’ [2023] FDA <www.fda.gov/medical-devices/ivd-regulatory-assistance/overview-ivd-regulation> accessed 18 September 2023.

[70] ‘When Science and Politics Collide: Enhancing the FDA | Science’ <www.science.org/doi/10.1126/science.aaw8093> accessed 18 September 2023.

[71] ‘Unique Device Identification System’ (Federal Register, 24 September 2013) <www.federalregister.gov/documents/2013/09/24/2013-23059/unique-device-identification-system> accessed 18 September 2023.

[72] ‘openFDA’ <https://open.fda.gov/data/faers/> accessed 10 November 2023.

[73] For example, Carpenter 2010, Hilts 2004, Hutt et al 2022

[74] ‘Factors to Consider Regarding Benefit-Risk in Medical Device Product Availability, Compliance, and Enforcement Decisions – Guidance for Industry and Food and Drug Administration Staff’.

[75] Center for Devices and Radiological Health, ‘510(k) Third Party Review Program’ (FDA, 15 August 2023) <www.fda.gov/medical-devices/premarket-submissions-selecting-and-preparing-correct-submission/510k-third-party-review-program> accessed 18 September 2023.

[76] Office of Regulatory Affairs, ‘What Should I Expect during an Inspection?’ [2020] FDA <www.fda.gov/industry/fda-basics-industry/what-should-i-expect-during-inspection> accessed 18 September 2023.

[77] ‘Device Makers Can Take COTS, but Only with Clear SOUP’ <https://web.archive.org/web/20130123140527/http://medicaldesign.com/engineering-prototyping/software/device-cots-soup-1111/> accessed 18 September 2023.

[78] ‘FDA Clears Intellia to Start US Tests of “in Vivo” Gene Editing Drug’ (BioPharma Dive) <www.biopharmadive.com/news/intellia-fda-crispr-in-vivo-gene-editing-ind/643999/> accessed 18 September 2023.

[79] ‘FDA Authority Over Tobacco’ (Campaign for Tobacco-Free Kids) <www.tobaccofreekids.org/what-we-do/us/fda> accessed 18 September 2023.

[80] FDA AT A GLANCE: REGULATED PRODUCTS AND FACILITIES, November 2020 <www.fda.gov/media/143704/download> accessed 18 September 2023.

[81] ‘Getting Smarter: FDA Publishes Draft Guidance on Predetermined Change Control Plans for Artificial Intelligence/Machine Learning (AI/ML) Devices’ (5 February 2023) <www.ropesgray.com/en/newsroom/alerts/2023/05/getting-smarter-fda-publishes-draft-guidance-on-predetermined-change-control-plans-for-ai-ml-devices> accessed 18 September 2023.

[82] Center for Veterinary Medicine, ‘Q&A on FDA Regulation of Intentional Genomic Alterations in Animals’ [2023] FDA <www.fda.gov/animal-veterinary/intentional-genomic-alterations-igas-animals/qa-fda-regulation-intentional-genomic-alterations-animals> accessed 18 September 2023.

[83] Andrew Kolodny, ‘How FDA Failures Contributed to the Opioid Crisis’ (2020) 22 AMA Journal of Ethics 743.

[84] Office of the Commissioner, ‘Milestones in U.S. Food and Drug Law’ [2023] FDA <https://www.fda.gov/about-fda/fda-history/milestones-us-food-and-drug-law> accessed 3 December 2023

[85] Reputation and Power (2010) <https://press.princeton.edu/books/paperback/9780691141800/reputation-and-power> accessed 3 December 2023

[86] ‘Hutt, Merrill, Grossman, Cortez, Lietzan, and Zettler’s Food and Drug Law, 5th – 9781636596952 – West Academic’ <https://faculty.westacademic.com/Book/Detail?id=341299> accessed 3 December 2023

[87] For example, Carpenter 2010, Hilts 2004, Hutt et al. 2022

[88] ‘Hutt, Merrill, Grossman, Cortez, Lietzan, and Zettler’s Food and Drug Law, 5th – 9781636596952 – West Academic’ <https://faculty.westacademic.com/Book/Detail?id=341299> accessed 18 September 2023.

[89] Eric Wu and others, ‘How Medical AI Devices Are Evaluated: Limitations and Recommendations from an Analysis of FDA Approvals’ (2021) 27 Nature Medicine 582.

[90] Other public health regulators, for example NICE (UK), cover accessibility risk to a larger degree than the FDA; similarly for structural discrimination risks, with NICE’s ‘Standing Together’ work on data curation and declarations of datasets used in developing SaMD. The FDA has over time developed similar programs.

[91] Ziad Obermeyer and others, ‘Dissecting Racial Bias in an Algorithm Used to Manage the Health of Populations’ (2019) 366 Science 447.

[92] ‘FDA-cleared artificial intelligence and machine learning-based medical devices and their 510(k) predicate networks’ <www.thelancet.com/journals/landig/article/PIIS2589-7500(23)00126-7/fulltext#sec1> accessed 18 September 2023.

[93] ‘How the FDA’s Food Division Fails to Regulate Health and Safety Hazards’ <https://politico.com/interactives/2022/fda-fails-regulate-food-health-safety-hazards> accessed 18 September 2023.

[94] Christopher J Morten and Amy Kapczynski, ‘The Big Data Regulator, Rebooted: Why and How the FDA Can and Should Disclose Confidential Data on Prescription Drugs and Vaccines’ (2021) 109 California Law Review 493.

[95] ‘Examination of Clinical Trial Costs and Barriers for Drug Development’ (ASPE) <https://aspe.hhs.gov/reports/examination-clinical-trial-costs-barriers-drug-development-0> accessed 18 September 2023.

[96] Office of the Commissioner, ‘Advisory Committees’ (FDA, 3 May 2021) <www.fda.gov/advisory-committees> accessed 18 September 2023.

[97] For example, Carpenter 2010, Hilts 2004, Hutt et al. 2022

[98] ‘FDA’s Science Infrastructure Failing | Infectious Diseases | JAMA | JAMA Network’ <https://jamanetwork.com/journals/jama/article-abstract/1149359> accessed 18 September 2023.

[99] Bridget M Kuehn, ‘FDA’s Science Infrastructure Failing’ (2008) 299 JAMA 157.

[100] ‘What to Expect at FDA’s Vaccine Advisory Committee Meeting’ (The Equation, 19 October 2020) <https://blog.ucsusa.org/genna-reed/vrbpac-meeting-what-to-expect/> accessed 18 September 2023.

[101] Office of the Commissioner, ‘What Is a Conflict of Interest?’ [2022] FDA <www.fda.gov/about-fda/fda-basics/what-conflict-interest> accessed 18 September 2023.

[102] The Firm and the FDA: McKinsey & Company’s Conflicts of Interest at the Heart of the Opioid Epidemic <https://fingfx.thomsonreuters.com/gfx/legaldocs/akpezyejavr/2022-04-13.McKinsey%20Opioid%20Conflicts%20Majority%20Staff%20Report%20FINAL.pdf> accessed 18 September 2023.

[103] Causholli M, Chambers DJ and Payne JL, ‘Future Nonaudit Service Fees and Audit Quality’ (2014) <onlinelibrary.wiley.com/doi/abs/10.1111/1911-3846.12042> accessed 21 September 2023; Jamal K and Sunder S, ‘Is Mandated Independence Necessary for Audit Quality?’ (2011) 36 Accounting, Organizations and Society 284 accessed 21 September 2023

[104] Reputation and Power (2010) <https://press.princeton.edu/books/paperback/9780691141800/reputation-and-power> accessed 18 September 2023.

[105] ‘Hutt, Merrill, Grossman, Cortez, Lietzan, and Zettler’s Food and Drug Law, 5th – 9781636596952 – West Academic’ <https://faculty.westacademic.com/Book/Detail?id=341299> accessed 18 September 2023.

[106] Ana Santos Rutschman, ‘How Theranos’ Faulty Blood Tests Got to Market – and What That Shows about Gaps in FDA Regulation’ (The Conversation, 5 October 2021) <http://theconversation.com/how-theranos-faulty-blood-tests-got-to-market-and-what-that-shows-about-gaps-in-fda-regulation-168050> accessed 18 September 2023.

[107] Center for Devices and Radiological Health, ‘Classify Your Medical Device’ (FDA, 14 August 2023) <www.fda.gov/medical-devices/overview-device-regulation/classify-your-medical-device> accessed 18 September 2023.

[108] Anderljung and others, ‘Frontier AI Regulation: Managing Emerging Risks to Public Safety’ (arXiv, 4 September 2023) <http://arxiv.org/abs/2307.03718> accessed 15 September 2023.

[109] ‘A Law for Foundation Models: The EU AI Act Can Improve Regulation for Fairer Competition – OECD.AI’ <https://oecd.ai/en/wonk/foundation-models-eu-ai-act-fairer-competition> accessed 18 September 2023.

[110] ‘Stanford CRFM’ <https://crfm.stanford.edu/report.html> accessed 18 September 2023.

[111] Pegah Maham and Sabrina Küspert, ‘Governing General Purpose AI’.

[112] ‘Frontier AI Regulation: Managing Emerging Risks to Public Safety’ <https://openai.com/research/frontier-ai-regulation> accessed 18 September 2023.

[113] ‘Auditing Algorithms: The Existing Landscape, Role of Regulators and Future Outlook’ (GOV.UK) <www.gov.uk/government/publications/findings-from-the-drcf-algorithmic-processing-workstream-spring-2022/auditing-algorithms-the-existing-landscape-role-of-regulators-and-future-outlook> accessed 18 September 2023.

[114] ‘Introducing Superalignment’ <https://openai.com/blog/introducing-superalignment> accessed 18 September 2023.

[115] ‘Why AI Safety?’ (Machine Intelligence Research Institute) <https://intelligence.org/why-ai-safety/> accessed 18 September 2023.

[116] ‘DAIR (Distributed AI Research Institute)’ (DAIR Institute) <https://dair-institute.org/> accessed 18 September 2023.

[117] Anthropic, ‘Frontier Threats Red Teaming for AI Safety’ <https://www.anthropic.com/index/frontier-threats-red-teaming-for-ai-safety> accessed 29 November 2023

[118] ‘Explainer: What Is a Foundation Model?’ <www.adalovelaceinstitute.org/resource/foundation-models-explainer/> accessed 18 September 2023.

[119] Center for Devices and Radiological Health, ‘Software as a Medical Device (SaMD)’ (FDA, 9 September 2020) <www.fda.gov/medical-devices/digital-health-center-excellence/software-medical-device-samd> accessed 10 November 2023.

[120] Pegah Maham and Sabrina Küspert, ‘Governing General Purpose AI’.

[121] ‘The Human Decisions That Shape Generative AI’ (Mozilla Foundation, 2 August 2023) <https://foundation.mozilla.org/en/blog/the-human-decisions-that-shape-generative-ai-who-is-accountable-for-what/> accessed 18 September 2023.

[122] ‘Frontier Model Security’ (Anthropic) <www.anthropic.com/index/frontier-model-security> accessed 18 September 2023.

[123] Is ChatGPT a cybersecurity threat? | TechCrunch

[124] ChatGPT Security Risks: What Are They and How To Protect Companies (itprotoday.com)

[125] Anderljung and others, ‘Frontier AI Regulation: Managing Emerging Risks to Public Safety’ (arXiv, 4 September 2023) <http://arxiv.org/abs/2307.03718>

[126] Ibid.

[127] Ibid.

[128] Ibid.

[129] ‘AI Assurance?’ <www.adalovelaceinstitute.org/report/risks-ai-systems/> accessed 21 September 2023.

[130] ‘Preparing for Extreme Risks: Building a Resilient Society’ (parliament.uk)

[131] Nguyen T, ‘Insurability of Catastrophe Risks and Government Participation in Insurance Solutions’ (2013) <www.semanticscholar.org/paper/Insurability-of-Catastrophe-Risks-and-Government-in-Nguyen/dcecefd3f24a099b958e8ac1127a4bdc803b28fb> accessed 21 September 2023

[132] Banias MJ, ‘Inside CounterCloud: A Fully Autonomous AI Disinformation System’ (The Debrief, 16 August 2023) <https://thedebrief.org/countercloud-ai-disinformation/> accessed 21 September 2023

[133] Raji ID and others, ‘Outsider Oversight: Designing a Third Party Audit Ecosystem for AI Governance’ (arXiv, 9 June 2022) <http://arxiv.org/abs/2206.04737> accessed 21 September 2023

[134] McAllister LK, ‘Third-Party Programs to Assess Regulatory Compliance’ (2012) <www.acus.gov/sites/default/files/documents/Third-Party-Programs-Report_Final.pdf> accessed 21 September 2023

[135] Science in Regulation, A Study of Agency Decisionmaking Approaches, Appendices 2012 <www.acus.gov/sites/default/files/documents/Science%20in%20Regulation_Final%20Appendix_2_18_13_0.pdf> accessed 21 September 2023

[136] ‘GPT-4 System Card’ (OpenAI, 2023) <https://cdn.openai.com/papers/gpt-4-system-card.pdf> accessed 21 September 2023

[137] Intensive evidence production by regulators themselves, as practised for example by the IAEA, is only suitable for non-complex industries

[138] The order does not indicate the importance of each dimension. The importance for risk reduction depends significantly on the specific implementation of the dimensions and the context.

[139] While other oversight regimes, such as those practised in cybersecurity, aviation and similar sectors, are also an inspiration for foundation models, FDA-style oversight is among the few that score towards the right on most dimensions identified in the regulatory oversight and audit literature and depicted above.

[140] ‘Announcing OpenAI’s Bug Bounty Program’ (OpenAI, 2022) accessed 21 September 2023

[141] ‘MAUDE – Manufacturer and User Facility Device Experience’ <www.accessdata.fda.gov/scripts/cdrh/cfdocs/cfmaude/search.cfm> accessed 21 September 2023

[142] ‘Auditor Independence and Audit Quality: A Literature Review – Nopmanee Tepalagul, Ling Lin, 2015’ <https://journals.sagepub.com/doi/abs/10.1177/0148558X14544505> accessed 21 September 2023

[143] ‘Customer-Driven Misconduct: How Competition Corrupts Business Practices – Article – Faculty & Research – Harvard Business School’ <www.hbs.edu/faculty/Pages/item.aspx?num=43347> accessed 21 September 2023

[144] Donald R. Deis Jr and Giroux GA, ‘Determinants of Audit Quality in the Public Sector’ (1992) 67 The Accounting Review 462 <www.jstor.org/stable/247972?casa_token=luGLXHQ3nAoAAAAA:clOnnu3baxAfZYMCx7kJloL08GI0RPboKMovVPQz7Z6bi9w4grsJEqz1tNIKJD88yFXbpc8iqLDoeZY9U5jnECBH99hKFWKk3-WxI9e__HBwlQ_bOBhSWQ> accessed 21 September 2023

[145] Engstrom DF and Ho DE, ‘Algorithmic Accountability in the Administrative State’ (9 March 2020) <https://papers.ssrn.com/abstract=3551544> accessed 21 September 2023

[146] Causholli M, Chambers DJ and Payne JL, ‘Future Nonaudit Service Fees and Audit Quality’ (2014) <onlinelibrary.wiley.com/doi/abs/10.1111/1911-3846.12042> accessed 21 September 2023

[147] Jamal K and Sunder S, ‘Is Mandated Independence Necessary for Audit Quality?’ (2011) 36 Accounting, Organizations and Society 284 accessed 21 September 2023

[148] Widder DG, West S and Whittaker M, ‘Open (For Business): Big Tech, Concentrated Power, and the Political Economy of Open AI’ (17 August 2023) <https://papers.ssrn.com/abstract=4543807> accessed 21 September 2023

[149] Lamoreaux PT, ‘Does PCAOB Inspection Access Improve Audit Quality? An Examination of Foreign Firms Listed in the United States’ (2016) 61 Journal of Accounting and Economics 313, accessed 21 September 2023

[150] ‘Introduction to NIST FRVT’ (Paravision) <www.paravision.ai/news/introduction-to-nist-frvt/> accessed 21 September 2023

[151] ‘Confluence Mobile – UN Statistics Wiki’ <https://unstats.un.org/wiki/plugins/servlet/mobile?contentId=152797274#content/view/152797274> accessed 21 September 2023

[152] ‘Large Language Models and Software as a Medical Device – MedRegs’ <https://medregs.blog.gov.uk/2023/03/03/large-language-models-and-software-as-a-medical-device/> accessed 21 September 2023

[153] Ada Lovelace Institute, AI assurance? Assessing and mitigating risks across the AI lifecycle (2023) < https://www.adalovelaceinstitute.org/report/risks-ai-systems/>

[154] ‘Inclusive AI Governance – Ada Lovelace Institute’ (2023) < www.adalovelaceinstitute.org/wp-content/uploads/2023/03/Ada-Lovelace-Institute-Inclusive-AI-governance-Discussion-paper-March-2023.pdf> accessed 21 September 2023

[155] ‘AI Assurance?’ <www.adalovelaceinstitute.org/report/risks-ai-systems/> accessed 21 September 2023

[156] ‘Comment of the AI Policy and Governance Working Group on the NTIA AI Accountability Policy’ (2023) <www.ias.edu/sites/default/files/AI%20Policy%20and%20Governance%20Working%20Group%20NTIA%20Comment.pdf> accessed 21 September 2023

[157] Weale S and correspondent SWE, ‘Lecturers Urged to Review Assessments in UK amid Concerns over New AI Tool’ The Guardian (13 January 2023) <https://www.theguardian.com/technology/2023/jan/13/end-of-the-essay-uk-lecturers-assessments-chatgpt-concerns-ai> accessed 23 November 2023

[158] ‘Proposing a Foundation Model Information-Sharing Regime for the UK | GovAI Blog’ <www.governance.ai/post/proposing-a-foundation-model-information-sharing-regime-for-the-uk> accessed 21 September 2023

[159] ‘Proposing a Foundation Model Information-Sharing Regime for the UK | GovAI Blog’ <www.governance.ai/post/proposing-a-foundation-model-information-sharing-regime-for-the-uk> accessed 21 September 2023

[160] ‘Regulating AI in the UK’ <www.adalovelaceinstitute.org/report/regulating-ai-in-the-uk/> accessed 21 September 2023

[161] ‘Unique Device Identification System’ (Federal Register, 24 September 2013) <www.federalregister.gov/documents/2013/09/24/2013-23059/unique-device-identification-system> accessed 21 September 2023

[162] ‘How We Can Regulate AI’ (Asterisk) <https://asteriskmag.com/issues/03/how-we-can-regulate-ai> accessed 21 September 2023

[163] ‘Opinion | Here’s a Simple Way to Regulate Powerful AI Models’ Washington Post (16 August 2023) <www.washingtonpost.com/opinions/2023/08/16/ai-danger-regulation-united-states/> accessed 21 September 2023

[164] Vidal DE and others, ‘Navigating US Regulation of Artificial Intelligence in Medicine—A Primer for Physicians’ (2023) 1 Mayo Clinic Proceedings: Digital Health 31

[165] ‘The Human Decisions That Shape Generative AI’ (Mozilla Foundation, 2 August 2023) <https://foundation.mozilla.org/en/blog/the-human-decisions-that-shape-generative-ai-who-is-accountable-for-what/> accessed 21 September 2023

[166] Birhane A, Prabhu VU and Kahembwe E, ‘Multimodal Datasets: Misogyny, Pornography, and Malignant Stereotypes’ (arXiv, 5 October 2021) <http://arxiv.org/abs/2110.01963> accessed 21 September 2023

[167] Schaul K, Chen SY and Tiku N, ‘Inside the Secret List of Websites That Make AI like ChatGPT Sound Smart’ (Washington Post) <www.washingtonpost.com/technology/interactive/2023/ai-chatbot-learning/> accessed 21 September 2023

[168] ‘When AI Is Trained on AI-Generated Data, Strange Things Start to Happen’ (Futurism) <https://futurism.com/ai-trained-ai-generated-data-interview> accessed 21 September 2023

[169] Draft standards here are a very good example of the value of dataset documentation (that is, declaring metadata) on what is used in training and fine-tuning models. In theory, this could also all be kept confidential as commercially sensitive information once a legal infrastructure is in place www.datadiversity.org/draft-standards

[170] Mitchell, Wu, Zaldivar, Barnes, Vasserman, Hutchinson, Spitzer, Raji and Gebru, (2019), ‘Model Cards for Model Reporting’, doi: 10.1145/3287560.3287596

[171] Gebru, Morgenstern, Vecchione, Vaughan, Wallach, Daum and Crawford, (2021), Datasheets for Datasets, <https://m-cacm.acm.org/magazines/2021/12/256932-datasheets-for-datasets/abstract >(Accessed: 27 February 2023); Hutchinson, Smart, Hanna, Denton, Greer, Kjartansson, Barnes and Mitchell, (2021), ‘Towards Accountability for Machine Learning Datasets: Practices from Software Engineering and Infrastructure’, doi: 10.1145/3442188.3445918;

[172] Shevlane T and others, ‘Model Evaluation for Extreme Risks’ (arXiv, 24 May 2023) <http://arxiv.org/abs/2305.15324> accessed 21 September 2023

[173] A pretrained AI model is a deep learning model that is already trained on large datasets to accomplish a specific task, meaning there are design choices which affect its output and performance (according to one leading lab, ‘language models already learn a lot about human values during pretraining’, and this is where ‘implicit biases’ arise).

[174] ‘running against a suite of benchmark objectionable behaviors… we find that the prompts achieve up to 84% success rates at attacking GPT-3.5 and GPT-4, and 66% for PaLM-2; success rates for Claude are substantially lower (2.1%), but notably the attacks still can induce behavior that is otherwise never generated.’ Zou A and others, ‘Universal and Transferable Adversarial Attacks on Aligned Language Models’ (arXiv, 27 July 2023) <http://arxiv.org/abs/2307.15043> accessed 21 September 2023

[175] Shevlane T and others, ‘Model Evaluation for Extreme Risks’ (arXiv, 24 May 2023) <http://arxiv.org/abs/2305.15324> accessed 21 September 2023; Nelson et al.; Kolt N, ‘Algorithmic Black Swans’ (25 February 2023) <https://papers.ssrn.com/abstract=4370566> accessed 21 September 2023

[176] Mökander J and others, ‘Auditing Large Language Models: A Three-Layered Approach’ [2023] AI and Ethics <http://arxiv.org/abs/2302.08500> accessed 21 September 2023; Wan A and others, ‘Poisoning Language Models During Instruction Tuning’ (arXiv, 1 May 2023) <http://arxiv.org/abs/2305.00944> accessed 21 September 2023; ‘Analyzing the European Union AI Act: What Works, What Needs Improvement’ (Stanford HAI) <https://hai.stanford.edu/news/analyzing-european-union-ai-act-what-works-what-needs-improvement> accessed 21 September 2023; ‘EU AI Standards Development and Civil Society Participation’ <www.adalovelaceinstitute.org/event/eu-ai-standards-civil-society-participation/> accessed 21 September 2023

[177] ’Outsider Oversight: Designing a Third Party Audit Ecosystem for AI Governance’ <https://dl.acm.org/doi/pdf/10.1145/3514094.3534181> accessed 21 September 2023

[178] Gupta A, ‘Emerging AI Governance Is an Opportunity for Business Leaders to Accelerate Innovation and Profitability’ (Tech Policy Press, 31 May 2023) <https://techpolicy.press/emerging-ai-governance-is-an-opportunity-for-business-leaders-to-accelerate-innovation-and-profitability/> accessed 21 September 2023

[179] Key Enforcement Issues of the AI Act Should Lead EU Trilogue Debate’ (Brookings) <www.brookings.edu/articles/key-enforcement-issues-of-the-ai-act-should-lead-eu-trilogue-debate/> accessed 21 September 2023

[180] ‘Structured Access’ – Toby Shevlane (2022)< https://arxiv.org/ftp/arxiv/papers/2201/2201.05159.pdf> accessed 21 September 2023

[181] ‘Systematic probing of an AI model or system by either expert or non-expert human evaluators to reveal undesired outputs or behaviors’.

[182] House TW, ‘FACT SHEET: Biden-Harris Administration Secures Voluntary Commitments from Leading Artificial Intelligence Companies to Manage the Risks Posed by AI’ (The White House, 21 July 2023) <www.whitehouse.gov/briefing-room/statements-releases/2023/07/21/fact-sheet-biden-harris-administration-secures-voluntary-commitments-from-leading-artificial-intelligence-companies-to-manage-the-risks-posed-by-ai/> accessed 21 September 2023

[183] ‘Keeping an Eye on AI’ <www.adalovelaceinstitute.org/report/keeping-an-eye-on-ai/> accessed 21 September 2023

[184] Janjeva A and others, ‘Strengthening Resilience to AI Risk’ (2023) <https://cetas.turing.ac.uk/sites/default/files/2023-08/cetas-cltr_ai_risk_briefing_paper.pdf> accessed 21 September 2023

[185] Shrishak K, ‘How to Deal with an AI Near-Miss: Look to the Skies’ (2023) 79 Bulletin of the Atomic Scientists 166

[186] ‘Guidance for Manufacturers on Reporting Adverse Incidents Involving Software as a Medical Device under the Vigilance System’ (GOV.UK) <www.gov.uk/government/publications/reporting-adverse-incidents-involving-software-as-a-medical-device-under-the-vigilance-system/guidance-for-manufacturers-on-reporting-adverse-incidents-involving-software-as-a-medical-device-under-the-vigilance-system> accessed 21 September 2023

[187] www.adalovelaceinstitute.org/blog/ai-regulation-learn-from-history/ Guidance always has its roots in legislation, but can be iterated more rapidly and flexibly whereas legislation requires several legal and political steps at minimum. Explainer here: www.oneeducation.org.uk/difference-between-laws-regulations-acts-guidance-policies/.

[188] www.tandfonline.com/doi/pdf/10.1080/01972243.2022.2124565?needAccess=true

[189] https://cip.org/alignmentassemblies

[190] https://arxiv.org/abs/2306.09871 ; https://openai.com/blog/democratic-inputs-to-ai

[191] Ada Lovelace Institute, Participatory data stewardship: A framework for involving people in the use of data’ (2021) < https://www.adalovelaceinstitute.org/report/participatory-data-stewardship/>

[192] Shevlane T and others, ‘Model Evaluation for Extreme Risks’ (arXiv, 24 May 2023) <http://arxiv.org/abs/2305.15324> accessed 21 September 2023

[193] ‘Examining the Black Box’ <www.adalovelaceinstitute.org/report/examining-the-black-box-tools-for-assessing-algorithmic-systems/> accessed 21 September 2023

[194] Nelson and et al., ‘AI Policy and Governance Working Group NTIA Comment.Pdf’ <www.ias.edu/sites/default/files/AI%20Policy%20and%20Governance%20Working%20Group%20NTIA%20Comment.pdf> accessed 21 September 2023

[195] Bill Chappell, ‘“It Was Installed For This Purpose,” VW’s U.S. CEO Tells Congress About Defeat Device’ NPR (8 October 2015) <www.npr.org/sections/thetwo-way/2015/10/08/446861855/volkswagen-u-s-ceo-faces-questions-on-capitol-hill> accessed 30 August 2023

[196] MedWatch is the FDA’s adverse event reporting program, while Medical Product Safety Network (MedSun) monitors the safety and effectiveness of medical devices. Commissioner O of the, ‘Step 5: FDA Post-Market Device Safety Monitoring’ [2018] FDA <www.fda.gov/patients/device-development-process/step-5-fda-post-market-device-safety-monitoring> accessed 21 September 2023

[197] AINOW, ‘Zero-Trust-AI-Governance.Pdf’ (August 2023) <https://ainowinstitute.org/wp-content/uploads/2023/08/Zero-Trust-AI-Governance.pdf> accessed 21 September 2023

[198] ‘The Value​​​ ​​​Chain of General-Purpose AI​​’ <www.adalovelaceinstitute.org/blog/value-chain-general-purpose-ai/> accessed 21 September 2023

[199] www.whitehouse.gov/briefing-room/statements-releases/2023/07/21/fact-sheet-biden-harris-administration-secures-voluntary-commitments-from-leading-artificial-intelligence-companies-to-manage-the-risks-posed-by-ai/

[200] Knott A and Pedreschi D, ‘State-of-the-Art Foundation AI Models Should Be Accompanied by Detection Mechanisms as a Condition of Public Release’ <https://gpai.ai/projects/responsible-ai/social-media-governance/Social%20Media%20Governance%20Project%20-%20July%202023.pdf> accessed 21 September 2023

[201] www.tspa.org/curriculum/ts-fundamentals/transparency-report/

[202] Bommasani R and others, ‘Do Foundation Model Providers Comply with the Draft EU AI Act?’ <https://crfm.stanford.edu/2023/06/15/eu-ai-act.html> accessed 21 September 2023

[203] ‘Keeping an Eye on AI’ <www.adalovelaceinstitute.org/report/keeping-an-eye-on-ai/> accessed 21 September 2023

[204] ‘Regulating AI in the UK’ <www.adalovelaceinstitute.org/report/regulating-ai-in-the-uk/> accessed 21 September 2023

[205] Zinchenko V and others, ‘Changes in Software as a Medical Device Based on Artificial Intelligence Technologies’ (2022) 17 International Journal of Computer Assisted Radiology and Surgery 1969

[206] Shrishak K, ‘How to Deal with an AI Near-Miss: Look to the Skies’ (2023) 79 Bulletin of the Atomic Scientists 166

[207] AINOW, ‘Zero-Trust-AI-Governance.Pdf’ (August 2023) <https://ainowinstitute.org/wp-content/uploads/2023/08/Zero-Trust-AI-Governance.pdf> accessed 21 September 2023

[208] ‘How Boeing 737 MAX’s Flawed Flight Control System Led to 2 Crashes That Killed 346 – ABC News’ <https://abcnews.go.com/US/boeing-737-maxs-flawed-flight-control-system-led/story?id=74321424> accessed 21 September 2023

[209] A new national system to more quickly spot possible safety issues, using existing electronic health databases to keep an eye on the safety of approved medical products in real time. This tool will add to, but not replace, FDA’s existing post-market safety assessment tools. Commissioner of the, ‘Step 5: FDA Post-Market Device Safety Monitoring’ [2018] FDA <www.fda.gov/patients/device-development-process/step-5-fda-post-market-device-safety-monitoring> accessed 21 September 2023

[210] In the UK, the Civil Aviation Authority has a revenue of £140m and staff of over 1,000, and the Office for Nuclear Regulation around £90m with around 700 staff. An EU-level agency for AI should be funded well beyond this, given that the EU is more than six times the size of the UK.

[211] Affairs O of R, ‘Recalls, Market Withdrawals, & Safety Alerts’ (FDA, 11 February 2022) <www.fda.gov/safety/recalls-market-withdrawals-safety-alerts> accessed 21 September 2023

[212] Team NA, ‘NIST AIRC – Govern’ <https://airc.nist.gov/AI_RMF_Knowledge_Base/Playbook/Govern> accessed 21 September 2023

[213]‘Committing to Effective Whistleblower Protection | En | OECD’ <www.oecd.org/corruption-integrity/reports/committing-to-effective-whistleblower-protection-9789264252639-en.html> accessed 21 September 2023

[214] Anderljung M and others, ‘Frontier AI Regulation: Managing Emerging Risks to Public Safety’ (arXiv, 4 September 2023) <http://arxiv.org/abs/2307.03718> accessed 21 September 2023

[215] Guidance always has its roots in legislation but can be iterated more rapidly and flexibly, whereas legislation requires several legal and political steps at minimum. ‘AI Regulation and the Imperative to Learn from History’ <www.adalovelaceinstitute.org/blog/ai-regulation-learn-from-history/> accessed 21 September 2023

Explainer here: www.oneeducation.org.uk/difference-between-laws-regulations-acts-guidance-policies/.

[216] Raji ID and others, ‘Outsider Oversight: Designing a Third Party Audit Ecosystem for AI Governance’ (arXiv, 9 June 2022) <http://arxiv.org/abs/2206.04737> accessed 21 September 2023

[217] Draft standards here are a very good example of the value of dataset documentation (i.e. declaring metadata) on what is used in training and fine-tuning models. In theory, this could also all be kept confidential as commercially sensitive information once a legal infrastructure is in place. www.datadiversity.org/draft-standards

[218] Mitchell, Wu, Zaldivar, Barnes, Vasserman, Hutchinson, Spitzer, Raji and Gebru, (2019), ‘Model Cards for Model Reporting’, doi: 10.1145/3287560.3287596

[219] Gebru, Morgenstern, Vecchione, Vaughan, Wallach, Daum and Crawford, (2021), Datasheets for Datasets, <https://m-cacm.acm.org/magazines/2021/12/256932-datasheets-for-datasets/abstract> (Accessed: 27 February 2023) Hutchinson, Smart, Hanna, Denton, Greer, Kjartansson, Barnes and Mitchell, (2021), ‘Towards Accountability for Machine Learning Datasets: Practices from Software Engineering and Infrastructure’, doi: 10.1145/3442188.3445918;

[220] Shevlane T and others, ‘Model Evaluation for Extreme Risks’ (arXiv, 24 May 2023) <http://arxiv.org/abs/2305.15324> accessed 21 September 2023

[221]

[222] Knott A and Pedreschi D, ‘State-of-the-Art Foundation AI Models Should Be Accompanied by Detection Mechanisms as a Condition of Public Release’ <https://gpai.ai/projects/responsible-ai/social-media-governance/Social%20Media%20Governance%20Project%20-%20July%202023.pdf> accessed 21 September 2023

[223] Shevlane T and others, ‘Model Evaluation for Extreme Risks’ (arXiv, 24 May 2023) <http://arxiv.org/abs/2305.15324> accessed 21 September 2023

[224] ‘Examining the Black Box’ <www.adalovelaceinstitute.org/report/examining-the-black-box-tools-for-assessing-algorithmic-systems/> accessed 21 September 2023

[225] AINOW, ‘Zero-Trust-AI-Governance.Pdf’ (August 2023) <https://ainowinstitute.org/wp-content/uploads/2023/08/Zero-Trust-AI-Governance.pdf> accessed 21 September 2023

[226] ‘Regulating AI in the UK’ <www.adalovelaceinstitute.org/report/regulating-ai-in-the-uk/> accessed 21 September 2023

[227] Shrishak K, ‘How to Deal with an AI Near-Miss: Look to the Skies’ (2023) 79 Bulletin of the Atomic Scientists 166

[228] Team NA, ‘NIST AIRC – Govern 1.7’ <https://airc.nist.gov/AI_RMF_Knowledge_Base/Playbook/Govern> accessed 21 September 2023

[229] In the UK, the Civil Aviation Authority has a revenue of £140m and staff of over 1,000, and the Office for Nuclear Regulation around £90m with around 700 staff). An EU-level agency for AI should be funded well beyond this, given that the EU is more than six times the size of the UK.

[230] In 2023 ~50% of the FDA’s ~$8bn budget was covered through mandatory fees by companies overseen by the FDA. See: < https://www.fda.gov/media/165045/download > accessed 24/11/2023

[231] 80% of the EMA’s funding comes from fees and charges levied on companies. See: EMA, “Funding,” European Medicines Agency, Sep. 17, 2018. <www.ema.europa.eu/en/about-us/how-we-work/governance-documents/funding> accessed Aug. 10, 2023

[232] ‘Governing General Purpose AI — A Comprehensive Map of Unreliability, Misuse and Systemic Risks’ (20 July 2023) <www.stiftung-nv.de/de/publikation/governing-general-purpose-ai-comprehensive-map-unreliability-misuse-and-systemic-risks> accessed 21 September 2023

[233] Nathalie Smuha: Beyond the Individual: Governing AI’s Societal Harm < https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3941956 > accessed Nov. 24, 2023

[234] EMA, “Success rate for marketing authorisation applications from SMEs doubles between 2016 and 2020,” European Medicines Agency, Jun. 25, 2021 <www.ema.europa.eu/en/news/success-rate-marketing-authorisation-applications-smes-doubles-between-2016-2020> accessed Aug. 10, 2023

[235] ‘AI and Digital Regulations Service for Health and Social Care – AI Regulation Service – NHS’ <www.digitalregulations.innovation.nhs.uk/> accessed 21 September 2023

[236] EMA, “Advanced therapy medicinal products: Overview,” European Medicines Agency, Sep. 17, 2018. <www.ema.europa.eu/en/human-regulatory/overview/advanced-therapy-medicinal-products-overview> accessed Aug. 10, 2023

[237] ‘Key Enforcement Issues of the AI Act Should Lead EU Trilogue Debate’ (Brookings) <www.brookings.edu/articles/key-enforcement-issues-of-the-ai-act-should-lead-eu-trilogue-debate/> accessed 21 September 2023

[238] Infocomm Media Development Authority, Aicadium, and AI Verify Foundation, ‘Generative AI: Implications for Trust and Governance’ 2023 <https://aiverifyfoundation.sg/downloads/Discussion_Paper.pdf> accessed 21 September 2023

[239] EMA, “Transparency,” European Medicines Agency, Sep. 17, 2018 <www.ema.europa.eu/en/about-us/how-we-work/transparency> (accessed Aug. 10, 2023).

[240] ‘Governing General Purpose AI — A Comprehensive Map of Unreliability, Misuse and Systemic Risks’ (20 July 2023) <www.stiftung-nv.de/de/publikation/governing-general-purpose-ai-comprehensive-map-unreliability-misuse-and-systemic-risks> accessed 21 September 2023

[241] Ho L and others, ‘International Institutions for Advanced AI’ (arXiv, 11 July 2023) <http://arxiv.org/abs/2307.04699> accessed 21 September 2023

[242] ‘Three Regulatory Agencies: A Comparison’ <www.hmpgloballearningnetwork.com/site/frmc/articles/three-regulatory-agencies-comparison> accessed 21 September 2023

[243] ‘COVID-19 Disruptions of International Clinical Trials: Comparing Guidances Issued by FDA, EMA, MHRA and PMDA’ (4 February 2020) <www.ropesgray.com/en/newsroom/alerts/2020/04/national-authority-guidance-on-clinical-trials-during-the-covid-19-pandemic> accessed 21 September 2023

[244] Van Norman GA, ‘Drugs and Devices: Comparison of European and U.S. Approval Processes’ (2016) 1 JACC: Basic to Translational Science 399

[245] Cummings ML and Britton D, ‘Chapter 6 – Regulating Safety-Critical Autonomous Systems: Past, Present, and Future Perspectives’ in Richard Pak, Ewart J de Visser and Ericka Rovira (eds), Living with Robots (Academic Press 2020) <www.sciencedirect.com/science/article/pii/B9780128153673000062> accessed 21 September 2023



A draft of this evidence review served as the basis for a virtual roundtable we held on 31 March 2022, convening a cross-section of academia, policy and civil society to discuss public attitudes towards the regulation of data in the UK.

Background

The need to meaningfully understand public attitudes towards data regulation has become urgent in the UK.

To ensure data policy and governance are aligned with societal values and needs, and worthy of public trust, it is vital to understand people’s perspectives and experiences in relation to data and data-driven technologies. It is therefore imperative that, as the UK Government develops renewed policy, strategy, guidance and regulation about data, policymakers have a robust understanding of relevant public attitudes.

This briefing paper is intended to support policymakers to build that understanding by presenting a review of evidence about public attitudes towards data that helps address the question: what does the UK public think about data regulation?

Introduction

Data is increasingly central to public services, innovation and civic life. This is highlighted through recent UK Government work, such as the ‘Data: A New Direction’ consultation in autumn 2021. This work aims to revisit data governance to strike a balance between permitting data use and protecting people’s rights.

There is a growing number of published studies on public attitudes towards data. However, in 2020 a comprehensive academic review of this research identified how the complex, context-dependent nature of public attitudes to data renders any single study unlikely to provide a conclusive picture. The authors of that review recommended that policymakers should ‘look beyond headline findings about public perceptions of data practices [and] engage with the breadth of evidence available across academic disciplines, policy domains and the third sector’.[footnote]Kennedy, H. et al. (2020). Public understanding and perceptions of data practices: a review of existing research. University of Sheffield. Available at: https://livingwithdata.org/project/wp-content/uploads/2020/05/living-with-data-2020-review-of-existing-research.pdf.[/footnote]

To support policymakers to engage with the breadth of existing research, we have reviewed evidence from nearly 40 studies about UK public attitudes to data conducted in recent years. The evidence presented in this paper is drawn from the body of research the Ada Lovelace Institute has used to inform our own policy positions and responses to recent government consultations.

Given the urgent nature of the question of how to regulate data with public support, this paper is not an exhaustive synthesis of every piece of research published on public attitudes towards data. Instead, it represents a curated overview of credible research that is relevant to recent data-related consultations, inquiries and debates.

In our review of this research, five key findings have emerged:

Summary of findings

  1. There is consistent evidence of public support for more and better regulation of data and data-driven technologies. (But more research is needed on what the public expects ‘better’ regulation to look like.)
  2. The UK public wants data-driven innovation, but expects it to be ethical, responsible and focused on public benefit. (Determining what constitutes ‘public benefit’ from data requires ongoing engagement with the public.)
  3. People want clearer information about data practices and how to enact their data rights. (But what this looks like in practice is not yet fully understood.)
  4. Creating a trustworthy data ecosystem is critical to protecting against potential public backlash or resistance. (Emerging research suggests that regulation is a key component in those ecosystems.)
  5. Public concerns around data should not be dismissed as lack of awareness or understanding, and simply raising awareness about the benefits of data will not increase public trust. (More research is needed to understand the connection between awareness of and attitudes to data.)

In this paper we present the evidence that underpins each of these findings. We also highlight gaps in our collective understanding and identify avenues for future research that policymakers, industry, academia and civil society must work together to address.

Detailed findings

1. There is consistent evidence of public support for more and better regulation of data and data-driven technologies

 

But more research is needed on what the public expects ‘better’ regulation to look like.

Findings from many surveys, focus groups and public dialogues conducted in recent years consistently indicate that the UK public does want data and digital regulation to be strengthened, and that any moves to deregulate this area would not achieve public support.

This evidence suggests that people expect regulation – and the governance structures that underpin it – to ensure that people’s rights and interests are protected, their privacy is preserved, and large technology companies and other data controllers are held to account for how they use their power. Many people feel current regulation does not do this effectively. The evidence also indicates that better regulation of data is necessary for public trust in the use of data.

Evidence

  • In 2020, a team of researchers from the Living With Data research project published an extensive review of literature related to public attitudes towards data. The review concluded that existing research suggests regulation is a requirement for fairer data practices.[footnote]Kennedy, H. et al. (2020). Public understanding and perceptions of data practices: a review of existing research. University of Sheffield. Available at: https://livingwithdata.org/project/wp-content/uploads/2020/05/living-with-data-2020-review-of-existing-research.pdf[/footnote]
  • UK research organisation Doteveryone conducted two large-scale surveys of UK public attitudes towards digital technology in 2018 and 2020. 64% of survey respondents thought that Government should regulate online services more heavily, even if that comes with disadvantages like making it harder for small businesses to make money or creating fewer choices for consumers. Respondents identified the Government (54%) and independent regulators (48%) as ‘having the most responsibility for directing the impacts of technology on people and society’. In 2018, Doteveryone reported that 66% of respondents felt the Government should play a role in enforcing rules that ensure people and society are treated fairly by technology companies.[footnote]Miller, C., Kitcher, H., Perera, K., Abiola, A., (2020) People, Power and Technology: The 2020 Digital Attitudes Report. London: doteveryone. Available at: https://doteveryone.org.uk/wp-content/uploads/2020/05/PPT-2020_Soft-Copy.pdf (Accessed: 4 March 2021); and Miller, C., Coldicutt, R., Kos, A., (2018) People, Power, Technology. London: doteveryone. Available at: https://doteveryone.org.uk/wp-content/uploads/2018/06/People-Power-and-Technology-Doteveryone-Digital-Attitudes-Report-2018.compressed.pdf (Accessed: 30 November 2021).[/footnote]
  • Findings from the first wave of the Centre for Data Ethics and Innovation’s (CDEI) Public Attitudes to Data and AI tracker were published in March 2022. The survey found that just 26% of the public report knowing at least a fair amount about digital regulation, and that few people express confidence that there are protections in place around digital technologies:

    31% of people agree that ‘the digital sector is regulated enough to protect my interests’, compared with 30% who disagree.

    The survey also found that ‘respondents were more likely to be willing to share data if [strict] rules are [in] place to protect them as users’.[footnote]Centre for Data Ethics and Innovation. (2022). Public Attitudes to Data and AI Tracker: Wave 1, p.35. Available at: https://www.gov.uk/government/publications/public-attitudes-to-data-and-ai-tracker-survey (Accessed: 15 November 2021).[/footnote]

  • The CDEI’s 2020 Trust in COVID-19 Technology poll found that fewer than half of people (43%) trust that the right rules and regulations are in place to ensure digital technology is used responsibly. A similar proportion (44%) wouldn’t know where to raise their concerns if they were unhappy with how a digital technology was being used.[footnote]Centre for Data Ethics and Innovation. (2020). Trust in technology: COVID-19. Available at: https://cdei.blog.gov.uk/wp-content/uploads/sites/236/2020/07/CDEI-Trust-in-Technology-Public-Attitudes-Survey-1.pdf (Accessed: 4 March 2021).[/footnote]
  • A 2020 study by UK academics looked at public perceptions of good data management. They found that, of a range of models for data management, the most preferred option was a ‘personal data store’ that would offer individuals direct control over their personal data. The second-most preferred option was a ‘regulatory public body overseeing “how organizations access and use data, acting on behalf of UK citizens”’. Across the other experiments the researchers conducted, regulation and regulatory oversight consistently ranked highly, alongside individual control and consent or opt-out mechanisms.[footnote]Hartman, T. et al. (2020). ‘Public perceptions of good data management: Findings from a UK-based survey’, Big Data & Society, 7(1). doi: 10.1177/2053951720935616.[/footnote]
  • In the context of biometrics specifically, the Ada Lovelace Institute’s 2019 survey found that 55% of people agree that the Government should limit police use of facial recognition to specific circumstances.[footnote]Ada Lovelace Institute. (2019). Beyond face value: public attitudes to facial recognition technology. Available at: https://www.adalovelaceinstitute.org/report/beyond-face-value-public-attitudes-to-facial-recognition-technology/ (Accessed: 23 February 2021).[/footnote] In a subsequent public deliberation exercise called the Citizens’ Biometrics Council, members of the public developed 30 recommendations for the governance of biometric technologies. Nine of these recommendations focused on regulation, legislation and oversight of biometrics, and several specifically called for new regulation for biometrics, beyond the existing GDPR, that ensures people’s rights are protected and data controllers and processors are held to account.[footnote]Peppin, A., Patel, R. and Parker, I. (2021). The Citizens’ Biometrics Council. Ada Lovelace Institute. Available at: https://www.adalovelaceinstitute.org/report/citizens-biometrics-council/ (Accessed: 29 March 2022).[/footnote]
  • A public dialogue on the ethics of location data, commissioned by the Geospatial Commission and UKRI’s Sciencewise programme, was carried out by Traverse and the Ada Lovelace Institute in 2021. The dialogue participants said that effective regulation, accountability and transparency are essential for ensuring public trust in the use of location data. They also thought that data collectors should be accountable to regulators and data subjects, with consequences for breaches or misuse. Many participants questioned whether current regulation and regulators are effective in achieving this.[footnote]McCool, S., Maxwell, M., Peppin, A., et al. (2021). Public dialogue on location data ethics. Geospatial Commission, Traverse, Ada Lovelace Institute. Available at: https://www.gov.uk/government/publications/public-dialogue-on-location-data-ethics. (Accessed 28 January 22)[/footnote]
  • A literature review conducted by Administrative Data Research UK in 2020 reported that several studies ‘identified an increase in public acceptance [of data use] after study participants were informed about’ data protection and governance mechanisms.[footnote]Waind, E. (2020). Trust, Security and Public Interest: Striking the Balance, p.18. Administrative Data Research UK. Available at: https://www.adruk.org/fileadmin/uploads/adruk/Trust_Security_and_Public_Interest-_Striking_the_Balance-_ADR_UK_2020.pdf (Accessed: 2 December 2021).[/footnote] This research also suggests that there is public support for penalties where data is misused and for laws regulating access to data.
  • Results from the Information Commissioner’s Office 2020 and 2021 annual public attitudes trackers show that the proportion of people who agree that ‘current laws and regulations sufficiently protect personal information’ increased from 33% to 49% between 2019 and 2020, but dropped to 42% in 2021.[footnote]Worledge, M. and Bamford, M. (2020). ICO Trust and Confidence Report. Harris Interactive and Information Commissioner’s Office. Available at: https://ico.org.uk/media/about-the-ico/documents/2618178/ico-trust-and-confidence-report-2020.pdf (Accessed: 29 March 2022).[/footnote] [footnote]Worledge, M. and Bamford, M. (2021) ICO Annual track findings, 2021. Information Commissioner’s Office. Available at: https://ico.org.uk/media/about-the-ico/documents/2620165/ico-trust-and-confidence-report-290621.pdf (Accessed: 29 March 2022).[/footnote]
  • In 2021, Which? conducted online deliberative focus groups with 22 people, focusing on specific components of the ‘Data: A New Direction’ consultation.[footnote]Which? (2021). The Consumer Voice: Automated Decision Making and Cookie Consents proposed by “Data: A new direction”. Available at: https://www.which.co.uk/policy/digital/8426/consumerdatadirection. (Accessed: 19 November 2021).[/footnote] They found that some participants ‘called for AI and algorithms to be more regulated than at present, given the potential for negative impacts on people’. Some participants also thought that ‘unchallenged solely automated decisions would further skew the imbalance of power between consumers and businesses and give companies more ways to rescind responsibility if something went wrong.’ 
  • In 2021 Ada conducted citizens’ juries on the use of data during health emergencies, such as the COVID-19 pandemic. The juries found that even in times of health crises, good governance that includes appropriate regulation is essential for public trust in the use of data.[footnote]Peppin, A., Patel, R., Alnemr, N., Machirori, M. and Gibbon, K. (forthcoming) Report on Citizens’ Juries on data governance during pandemics. Ada Lovelace Institute.[/footnote]

Future research avenues: data regulation

This evidence sends a strong signal that the UK public wants more and better regulation of data-driven technologies. However, more research is needed to understand exactly what the public expects this regulation to look like.

This has been explored in some domains, such as biometrics (for example, Ada’s Citizens’ Biometrics Council asked members of the public to set out their recommendations for the governance of biometrics). But for other domains, such as health or finance, what exactly the public thinks regulators and policymakers should focus on is not yet clear.

More research is also needed to understand how the public considers trade-offs and tensions that arise in the context of data regulation. While the Doteveryone survey, for example, suggested people want tougher regulation even at the expense of consumer choice, this remains an under-explored topic. The Geospatial Commission’s dialogue on location data offered some useful findings on this, demonstrating that public dialogue and deliberation are fruitful methods for studying these questions.

Finally, it is important to note that, while we have drawn on research that we feel is robust, some surveys have limited sample sizes, especially in terms of the number of people surveyed from relevant marginalised groups, such as digitally excluded people. Any future research must ensure the views of marginalised groups are meaningfully included.

2. The UK public wants data-driven innovation, but expects it to be ethical, responsible and focused on public benefit

 

Determining what constitutes ‘public benefit’ from data requires ongoing engagement with the public

Research shows that people expect the Government to support innovation, and expect public bodies and commercial companies to use new data-driven technologies to improve services and tackle societal problems. At the same time, they expect the Government to use regulation and oversight to limit how and where these technologies can be used.

In our view, evidence that people want both the benefits of data and regulatory limits on its use is not inconsistent or contradictory; it reflects how the public understand and experience both the potential good and the potential harm of data use. People want the benefits of data-driven innovation to be realised, but to minimise the harms they want that innovation to be safe, ethical and responsible, and to put the good of the public first.

Evidence

  • A series of focus groups and participatory workshops conducted in 2019 by the RSA, Open Data Institute and Luminate found that:

    People feel positive about the benefits of data-driven technologies, but want greater rights over the use of their data, and want to see the Government regulate companies’ uses of data.[footnote]Samson, R., Gibbon, K. and Scott, A. (2019). About Data About Us. The RSA. Available at: https://www.thersa.org/globalassets/pdfs/reports/data-about-us-final-report.pdf (Accessed: 2 December 2021).[/footnote]

  • Our 2019 survey of public attitudes towards the use of facial recognition found support for the use of the technology in certain circumstances. We found 70% support for the use of facial recognition by the police in criminal investigations, 54% support for its use to unlock smartphones and 50% support for its use in airports. This compares with very low support in other circumstances: just 7% in supermarkets to track shopper behaviour, 6% to monitor pupils’ expressions in schools and 4% in hiring processes. From this we concluded that any support for the use of data-driven systems like facial recognition is conditional on the context in which it is deployed, with use cases where there is clear public benefit enjoying more support. Importantly, people’s support for the use of facial recognition in any scenario ‘assumes appropriate safeguards were in place’.[footnote]Ada Lovelace Institute. (2019). Beyond face value: public attitudes to facial recognition technology. Available at: https://www.adalovelaceinstitute.org/report/beyond-face-value-public-attitudes-to-facial-recognition-technology/ (Accessed: 23 February 2021).[/footnote]
  • Our Citizens’ Biometrics Council reiterated this conclusion, finding that the use of biometric data and technologies can bring benefits in certain circumstances, but a prerequisite for their use is more effective regulation, independent oversight and strict standards for their development and deployment.[footnote]Peppin, A., Patel, R. and Parker, I. (2021). The Citizens’ Biometrics Council. Ada Lovelace Institute. Available at: https://www.adalovelaceinstitute.org/report/citizens-biometrics-council/ (Accessed: 29 March 2022).[/footnote]
  • A 2021 public dialogue commissioned by Understanding Patient Data and the National Data Guardian explored how health data can be used for public good. Participants concluded that data use should be subject to a ‘level of governance’ that balances the use of data for innovation with ethical and responsible practice, to ensure public benefit. Understanding Patient Data reported that participants argued for the need to ‘embed on-going evaluation of public benefit throughout the data life cycle’ and that, in the case of NHS data use, ‘public benefit must always outweigh profit.’[footnote]Hopkins Van Mil. (2021). Putting Good into Practice: A public dialogue on making public benefit assessments when using health and care data. Available at: https://www.gov.uk/government/publications/putting-good-into-practice-a-public-dialogue-on-making-public-benefit-assessments-when-using-health-and-care-data (Accessed: 15 April 2021).[/footnote] [footnote]Harrison, T. (2021).‘What counts as a “public benefit” for data use?’. Understanding Patient Data. Available at: http://understandingpatientdata.org.uk/news/what-counts-public-benefit-data-use (Accessed: 28 January 2022).[/footnote]
  • The CDEI conducted a set of focus groups about trust in data in 2021. Using 12 case studies, they found that participants were most supportive of data use cases when the benefits to society are clear and substantial, and when there is an intuitive and direct link between the data collected and the purposes for its use.[footnote]CDEI and Britain Thinks. (2021). Trust in Data, p.33. Available at: https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/1049179/Trust_In_Data_-_Publishable_Report__1.pdf (Accessed: 17 March 2022).[/footnote]
  • An Imperial College London-led survey of attitudes to health data sharing, published in 2020, found that ‘the more commercial the purpose of the receiving institution (e.g., for an insurance or technology company), the less often respondents were willing to share their anonymised personal health information’ in the UK and the USA.[footnote]Ghafur, S. et al. (2020). ‘Public perceptions on data sharing: key insights from the UK and the USA’. The Lancet Digital Health, 2(9), pp. e444–e446. doi: 10.1016/S2589-7500(20)30161-8[/footnote]
  • Deloitte and Reform’s 2017/18 State of the State report included a survey of 1,000 adults across the UK. It found that trust in Government use of data is ‘driven by a belief that it uses data for the good of society […] and its use is regulated’.[footnote]Deloitte and Reform. (2018). Citizens, government and business: the state of the State 2017-18, p.3. Available at: https://www2.deloitte.com/content/dam/Deloitte/uk/Documents/public-sector/deloitte-uk-the-state-of-the-state-report-2017.pdf (Accessed: 1 December 2021).[/footnote]
  • In 2020, the Ada Lovelace Institute conducted an online public deliberation on the Government’s use of data-driven technologies to address the COVID-19 crisis. Participants in the dialogue expressed how, even in extreme circumstances like a pandemic, data-driven technologies must comply with data-privacy regulations. Participants expected to see regulators take an active role in overseeing the use of data, and clear standards for its use and development.[footnote]Ada Lovelace Institute and Traverse. (2020). Confidence in a crisis? Available at: https://www.adalovelaceinstitute.org/report/confidence-in-crisis-building-public-trust-contact-tracing-app/ (Accessed: 4 March 2021).[/footnote]
  • In 2021 the UK Geospatial Commission and Sciencewise commissioned a public dialogue on the ethics of location data use. Delivered by Traverse and the Ada Lovelace Institute, the dialogue found that participants thought that why location data is used and who benefits from it are important when considering whether location data use is ethical and trustworthy, and that benefits to members of the public or wider society should be prioritised.[footnote]McCool, S. et al. (2021). Public dialogue on location data ethics. Geospatial Commission, Traverse, Ada Lovelace Institute. Available at: https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/1040807/Accessible_Public_dialogue_on_location_data_ethics_Engagement_report.pdf.[/footnote]
  • Researchers on the Observatory for Monitoring Data-Driven Approaches to COVID-19 (OMDDAC) programme surveyed UK public perceptions of data sharing for COVID-19 purposes in 2021. They found that people are more willing to share data if it will help address an emergency, but that the nature of the data, the urgency of the issue, and who will access or use the data all affect levels of comfort in data-sharing. The researchers conclude that ‘it cannot be assumed that the urgency of a global pandemic leads to people disregarding their concerns and engaging with data-sharing initiatives’.[footnote]Sutton, S. et al. (2021). Survey of Public Perceptions of Data Sharing for COVID-19 related purposes. The Observatory for Monitoring Data-Driven Approaches to COVID-19 (OMDDAC). Available at: https://www.omddac.org.uk/wp-content/uploads/2021/08/WP3-Snapshot.pdf (Accessed: 7 December 2021).[/footnote]
  • Academics from the Me and My Big Data project surveyed the UK public on their digital and data literacy. In 2021 they reported that people at all levels of online engagement are more favourable towards data collection when it is used for consumers’ benefit rather than companies’ benefit.[footnote]Yates, P. S. J. et al. (2021). Understanding Citizens Data Literacies Research Report, p. 125. Available at: https://www.liverpool.ac.uk/media/livacuk/humanitiesampsocialsciences/meandmybiddata/Understanding,Citizens,Data,Literacies,Research,,Report,Final.pdf (Accessed: 18 January 2022).[/footnote]

Future research avenues: public benefit

Evidence that the public expects data-driven innovation both to deliver benefits and to be ethical, responsible and focused on public benefit helps clarify the public’s priorities and desires. However, existing research has not yet offered detailed analysis of what the public consider ‘public benefit’ to be, nor of what they expect to happen when tensions arise: for example, if a company uses data to deliver a public service but increases its market value in the process, do the public consider this to be ‘responsible data use in the public interest’?

Some research explores this in specific contexts, particularly in health – such as Understanding Patient Data and the National Data Guardian’s Putting Good Into Practice report cited above – but more research is needed on how the public thinks tensions around the proportionate use of data in the public interest should be navigated. In the meantime, policymakers should heed the fact that, for the public, ‘innovation’ is not beneficial unless it is ethical and responsible.

It is also likely that what counts as ‘public benefit’ is context-dependent and will vary from case to case. This means that ongoing work, including public participation, will be needed to align data innovation with wider concepts of public benefit.

3. People want clearer information about data practices and how to enact their data rights

 

But what this looks like in practice is not yet fully understood

There is a large body of research on issues of transparency and people’s rights in relation to data use. This research suggests that people often want more specific, granular and accessible information about what data is collected, who it is used by, what it is used for and what rights data subjects have over that use. Often, people want this information so that they can make informed decisions, such as whether or not to consent to data collection, or who to raise a complaint with.

Evidence

  • In 2021 the CDEI commissioned Britain Thinks to conduct qualitative research with members of the UK public about transparency of algorithmic decision-making systems. They found that participants expected two ‘tiers’ of information that should be made publicly available. ‘Tier 1’ information should be made available to the general public and includes explanations of what data is collected, how that data is used, and how data subjects’ privacy is ensured. ‘Tier 2’ information should contain more technical detail, and be made available for expert scrutiny and for those members of the public with particular interest.[footnote]Centre for Data Ethics and Innovation. (2021). Complete transparency, complete simplicity. Available at: https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/995014/Complete_transparency__complete_simplicity_-_Accessible.pdf.[/footnote]
  • Participants in a series of focus groups conducted by the Royal Society of Arts (RSA) said that transparency around data use is critical, and they expect companies and organisations to be clear about how they use data. The RSA’s report concluded that overall people want greater honesty and transparency around data use.[footnote]Samson, R., Gibbon, K. and Scott. A. (2019). About Data About Us. The RSA. Available at: https://www.thersa.org/globalassets/pdfs/reports/data-about-us-final-report.pdf (Accessed: 2 December 2021).[/footnote]
  • The 2020 Information Commissioner’s Office (ICO) annual tracker found that:

    Only 37% of people agreed that companies and organisations are open and transparent about how they collect and use personal information,

    and only 31% agreed that it is easy to find out how personal information is stored and used. These figures dropped to 33% and 30% respectively in 2021.[footnote]Worledge, M. and Bamford, M. (2021). ICO Annual track findings, 2021. Information Commissioner’s Office. p.21. Available at: https://ico.org.uk/media/about-the-ico/documents/2620165/ico-trust-and-confidence-report-290621.pdf (Accessed: 29 March 2022).[/footnote]

  • In 2020 researchers at the University of Sheffield conducted a major survey of UK public attitudes towards data practices, called the Living With Data survey. It found that:

    83% of respondents want to know who has access to data about them and 80% want to know where data about them is stored.[footnote]Kennedy, H. et al. (2021) Living with Data survey report. University of Sheffield. Available at: https://livingwithdata.org/resources/living-with-data-survey-results/.[/footnote]

  • Understanding Patient Data has reported extensively on transparency and communication around the use of health data. Their research, based on workshops with health professionals and members of the public, has found that the language and processes around health data are confusing for many patients and members of the public. Most people feel clear, accessible explanations of what health data is collected, how it’s used and how it’s protected are crucial to building public trust in data use.[footnote]Understanding Patient Data. (2017). What are the best words to use when talking about data? Available at: http://understandingpatientdata.org.uk/what-are-best-words-use-when-talking-about-data (Accessed: 2 December 2021).[/footnote]
  • In 2018 the Academy of Medical Sciences conducted a series of dialogues with patients, the public and healthcare professionals. They found that there are ‘strong expectations from patients and the public for transparency around the use of data-driven technologies’.[footnote]The Academy of Medical Sciences. (2018). Our data-driven future in healthcare. Available at: https://acmedsci.ac.uk/file-download/74634438 (Accessed: 2 December 2021).[/footnote]
  • Ada’s 2020 citizens’ juries on COVID-19 technologies found that the public want a transparent evidence base around uses of data, particularly in high-stakes scenarios. A lack of transparency can generate suspicion and distrust: people feel it should be easy to know what data is held, by whom, for what purpose and for how long.[footnote]Ada Lovelace Institute and Traverse (2020) Confidence in a crisis? Available at: https://www.adalovelaceinstitute.org/report/confidence-in-crisis-building-public-trust-contact-tracing-app/ (Accessed: 4 March 2021).[/footnote]
  • Online deliberative focus groups conducted by Which? found that participants felt that the ability to challenge AI decisions was a ‘right not a privilege’, and safeguards like consent to data use remain important to the public. Which? reported that ‘consumers want to continue to be able to actively choose which cookies are consented to when they use the internet and what data is collected about them.’[footnote]Which? (2021). The Consumer Voice: Automated Decision Making and Cookie Consents proposed by “Data: A new direction”, p.8. Available at: https://www.which.co.uk/policy/digital/8426/consumerdatadirection (Accessed: 19 November 2021).[/footnote]
  • The ICO and the Alan Turing Institute conducted two citizens’ juries in 2019, as part of an initiative called Project ExplAIn.[footnote]Citizens’ Juries C.I.C. and Jefferson Centre. (2019). Artificial intelligence (AI) & explainability: Citizens’ Juries Report. Available at: http://assets.mhs.manchester.ac.uk/gmpstrc/C4-AI-citizens-juries-report.pdf (Accessed: 18 January 2022).[/footnote] These juries found that ‘the importance placed on an explanation for automated decisions’ varies depending on the scenario, and ‘a majority [of participants] felt that explanations for AI decisions should be offered in situations where non-AI decisions come with an explanation.’ In some high-stakes scenarios, or scenarios that are more technical than social, such as medical diagnosis, participants felt system accuracy was more important than transparency.
  • Ofcom’s 2020/21 Adult Media Use & Attitudes Survey reported that ‘the most common reasons internet users gave for allowing companies to collect and use their data were having appropriate reassurances on the protection and use of their data: including that they can opt out at any point and the company will stop using their data; or that their information will not be shared with other companies.’[footnote]Ofcom. (2021). Adults’ Media Use and Attitudes 2020/21, p.2. Available at: https://www.ofcom.org.uk/__data/assets/pdf_file/0025/217834/adults-media-use-and-attitudes-report-2020-21.pdf.[/footnote]
  • The 2021 Lloyds Bank Consumer Digital Index found that:

    36% of people who are offline say that more transparency about the data organisations have and how they are using it would encourage them to get online.

    44% of those offline say that the ability to easily stop organisations from using their data would encourage them to get online.[footnote]Lloyds Bank. (2021). Consumer Digital Index. Available at: https://www.lloydsbank.com/banking-with-us/whats-happening/consumer-digital-index.html (Accessed: 31 January 2022).[/footnote]

Future research avenues: transparency

Though a vast body of research points towards a public desire for transparency, there is limited understanding of what the public expect this to look like in practice. Some studies, such as those conducted by the CDEI and Understanding Patient Data cited above, have contributed to this understanding. However, more research is needed on what the public expect meaningful transparency, and the ability to enact their data rights, to look like in practice.

Further, there is little research that has focused on the link between transparency and understanding. Surveys about public awareness of data practices and GDPR, for instance, often do not explore whether transparency increases awareness, or whether awareness is sufficient or satisfactory. Nor does this work drill down into the specifics of people’s awareness: what people know, as opposed to what they are simply ‘aware’ of. This ‘awareness/understanding’ challenge is common in public attitudes studies, particularly polling, and requires further research in relation to data. Qualitative and co-production methodologies will therefore likely be key to better understanding what meaningful transparency should look like in practice.

4. Creating a trustworthy data ecosystem is critical to protecting against potential public backlash or resistance

 

Emerging research suggests that regulation is a key component in those ecosystems

Building or ensuring ‘public trust’ in the use of data is a common and laudable priority for policymakers and technology innovators. This is for good reason: without public trust, data innovation cannot proceed with social licence or legitimacy, and there is a risk of public backlash or resistance. Instances of this backlash are already evident in responses to the Cambridge Analytica scandal,[footnote]Butow, D. (2018). ‘Trust in Facebook has dropped by 66 percent since the Cambridge Analytica scandal’. NBC News. Available at: https://www.nbcnews.com/business/consumer/trust-facebook-has-dropped-51-percent-cambridge-analytica-scandal-n867011 (Accessed: 3 March 2022).[/footnote] and, more recently, to the A-level grading algorithm,[footnote]Porter, J. (2020). ‘UK ditches exam results generated by biased algorithm after student protests’. The Verge. Available at: https://www.theverge.com/2020/8/17/21372045/uk-a-level-results-algorithm-biased-coronavirus-covid-19-pandemic-university-applications (Accessed: 3 March 2022).[/footnote] and the GP Data for Planning and Research (GPDPR) programme.[footnote]Jayanetti, C. (2021). ‘NHS data grab on hold as millions opt out’. The Observer. Available at: https://www.theguardian.com/society/2021/aug/22/nhs-data-grab-on-hold-as-millions-opt-out (Accessed: 3 March 2022).[/footnote]

There is a rich body of research relating to public trust in data use. Our analysis of this evidence suggests that efforts to ‘build public trust’ too often place the burden on the public to be ‘more trusting’, and will do little to address the issue of trust in data. Instead, policymakers and regulators should focus on encouraging more trustworthy practices from data innovators and data processors.

Evidence

  • A 2014 study by Ipsos MORI for the Royal Statistical Society found a ‘data trust deficit’ in the UK, where the NHS and public institutions are among the most trusted when it comes to data use, but social media companies, technology companies and retail companies are the least trusted, and national and local government bodies fare somewhere in the middle.[footnote]Varley-Winter, O. and Shah, H. (2014). Royal Statistical Society research on trust in data and attitudes toward data use and data sharing. Royal Statistical Society. Available at: https://www.statslife.org.uk/images/pdf/rss-data-trust-data-sharing-attitudes-research-note.pdf (Accessed: 30 March 2021).[/footnote]
  • In 2022, the CDEI’s Public Attitudes to Data and AI tracker showed that the ‘data trust deficit’ found by the Royal Statistical Society remains today. Average trust in managing data was highest for the NHS (74%), banks (66%) and universities (63%). It was lowest for social media companies (33%) and big tech (46%). Local councils (51%), utilities providers (51%) and local independent businesses (49%) fare in the middle. Notably, the Government scored low in this survey, just below big tech at 44%. The CDEI reported that people’s willingness to share data is ‘closely related to trust in the organisation [that uses the data] to act in the public’s best interest’.[footnote]Centre for Data Ethics and Innovation. (2022). Public Attitudes to Data and AI Tracker: Wave 1, pp.31-32. Available at: https://www.gov.uk/government/publications/public-attitudes-to-data-and-ai-tracker-survey (Accessed: 15 November 2021).[/footnote]
  • Doteveryone’s 2020 survey found high levels of distrust in technology companies: only 19% of respondents agreed that tech companies design products and services with users’ best interests in mind, and qualitative data linked these responses to a lack of trust.[footnote]Miller, C., Kitcher, H., Perera, K., Abiola, A., (2020) People, Power and Technology: The 2020 Digital Attitudes Report. London: doteveryone. Available at: https://doteveryone.org.uk/wp-content/uploads/2020/05/PPT-2020_Soft-Copy.pdf (Accessed: 4 March 2021).[/footnote] Their 2018 survey found that:

    Only 25% of people say they trust technology companies ‘to do the right thing’.[footnote]Miller, C., Coldicutt, R. and Kos, A., (2018) People, Power, Technology. London: doteveryone. Available at: https://doteveryone.org.uk/wp-content/uploads/2018/06/People-Power-and-Technology-Doteveryone-Digital-Attitudes-Report-2018.compressed.pdf (Accessed: 30 November 2021).[/footnote]

  • In 2017 and 2018, the Open Data Institute surveyed public attitudes towards sharing personal data across the UK, France, Germany, Belgium and the Netherlands. It found that only 2% of people in the UK trust social media companies with data, 22% trust online retailers, and 94% say trust is an important factor when deciding whether or not to share data. Findings were similar in the other European countries surveyed.[footnote]Open Data Institute. (2018). ‘Who do we trust with personal data?’. Available at: https://theodi.org/article/who-do-we-trust-with-personal-data-odi-commissioned-survey-reveals-most-and-least-trusted-sectors-across-europe/ (Accessed: 4 March 2021).[/footnote]
  • The Living With Data survey found that trust in data use breaks down into three categories of trust in organisations to: a) keep personal data safe, b) gather and analyse data in responsible ways, and c) be open and transparent about what they do with data. Using this framework, they found that:

    67% to 69% of people trust the NHS with data, compared with only 5% who trust social media companies

    across all three categories.[footnote]Kennedy, H. et al. (2021). Living with Data survey report, p.27. University of Sheffield. Available at: https://livingwithdata.org/resources/living-with-data-survey-results/.[/footnote]

  • A 2020 report by the Ada Lovelace Institute drew on multiple public dialogues and concluded that public trust in data use depends not just on privacy and data protection, but also on whether digital interventions are effective and whether the organisations involved are perceived to be trustworthy.[footnote]Ada Lovelace Institute. (2020). No green lights, no red lines. Available at: https://www.adalovelaceinstitute.org/report/covid-19-no-green-lights-no-red-lines/ (Accessed: 4 March 2021).[/footnote]
  • The ICO found that public trust in companies and organisations storing and using data ‘shifted towards the neutral middle point in 2020’. Low trust dropped from 38% to 28%, and high trust dropped from 32% to 27%, while the proportion who neither trust nor distrust increased from 30% to 45%, suggesting growing ambivalence around data use.[footnote]Worledge, M. and Bamford, M. (2020). ICO Trust and Confidence Report. Harris Interactive and Information Commissioner’s Office. Available at: https://ico.org.uk/media/about-the-ico/documents/2618178/ico-trust-and-confidence-report-2020.pdf (Accessed: 29 March 2022).[/footnote] Alongside these shifts, the ICO’s findings also reiterate the ‘data trust deficit’: higher trust in public services using personal data, lower trust in social media and online platforms, and mid-level trust in government.

Future research avenues: trust

Creating ‘trustworthy data ecosystems’ is a complex challenge and will not be solved by policymakers or regulation alone. Our view is that the evidence presented above shows that trustworthy data ecosystems are core to public trust, but it does not show how to achieve those ecosystems. Entire research communities are dedicated to tackling various components that might be part of such ecosystems – such as transparency around data processors’ intentions, appropriate governance and rules, effective safeguards and protections for data subjects – but it is clear that it will be some time before this work yields change at scale.

We recommend that, to complement this emerging research field, policymakers and regulators should shift their mindset away from ‘building public trust’ and focus instead on designing policy, legislation and strategies that encourage more trustworthy data practices from innovators and data controllers. This will go a long way towards fostering trustworthy data ecosystems, incentivising more trustworthy practices, and realising the ultimate goal: widespread public trust in data use.

5. Public concerns around data should not be dismissed as lack of awareness or understanding, and simply raising awareness of the benefits of data will not increase public trust

 

More research is needed to understand the connection between awareness of and attitudes to data

Public concerns around an issue are commonly attributed to a lack of understanding, knowledge or awareness. This is particularly so in the case of data, where there is often a tendency to reach for solutions that would ‘raise public awareness’ or ‘inform the public’ about the benefits of data and related regulation.

While there is some evidence of correlation between higher awareness of data use and higher support for it, the assumption that higher awareness causes higher support is flawed. Beyond the correlation-causation fallacy,[footnote]See: Hansen, H. (2020). ‘Fallacies’, in Zalta, E. N. (ed.) The Stanford Encyclopedia of Philosophy, Summer 2020 edition. Metaphysics Research Lab, Stanford University. Available at: https://plato.stanford.edu/archives/sum2020/entries/fallacies/ (Accessed: 31 January 2022).[/footnote] this logic follows a widely criticised ‘deficit model’ of the public understanding of science and technology, which falsely assumes that the more a person knows about a technoscientific issue, the more they will support it.[footnote]Bauer, M. W., Allum, N. and Miller, S. (2007). ‘What can we learn from 25 years of PUS survey research? Liberating and expanding the agenda’. Public Understanding of Science, 16(1), pp. 79-95. doi: 10.1177/0963662506071287[/footnote] Such assumptions fail to recognise that concerns about data are found among people with both high and low levels of understanding, and that those concerns often persist or strengthen after people are provided with information about data use.

Moreover, there is increasing evidence that public trust in and support for data-driven technologies is correlated with factors such as digital exclusion or ethnicity, rather than awareness. This points to a corollary argument for taking public trust seriously: in an increasingly complex digital world, differences in trust in data are likely to contribute to digital inequality.

Evidence

  • Multiple public deliberations and dialogues – such as our Citizens’ Biometrics Council, the Geospatial Commission’s public dialogue on location data ethics, and multiple citizens’ juries on the use of health data – provide strong evidence that informing people about data does not necessarily make them more supportive of its use. These methods are designed to increase awareness and understanding by providing participants with information, access to experts, and time and support to consider this evidence carefully. They consistently report that as participants develop informed views, they recognise the benefits of data-driven innovation, but this does not diminish their concerns about harms that might arise from data use. Importantly, many dialogues about data end with participants recognising both the benefits and disbenefits of data, and concluding that neither wide deregulation nor blanket bans will work. Instead, nuanced approaches to regulation are required.
  • In 2021, Which? conducted online deliberative focus groups with 22 people, focusing on specific components of the ‘Data: A New Direction’ consultation. Which? reported that ‘consumers are fully able to understand complex issues such as automated decision-making when they are presented with information in accessible, digestible and relevant ways.’ They also found that, after considering information about automated decision-making and cookie notices, participants expected better regulation and oversight to ensure data subjects have control over data use and that their rights are protected (see above).[footnote]Which? (2021). The Consumer Voice: Automated Decision Making and Cookie Consents proposed by “Data: A new direction”. Available at: https://www.which.co.uk/policy/digital/8426/consumerdatadirection[/footnote]
  • The Living With Data survey, mentioned above, found that people want to know more about data uses, but that those who know most about them are the most concerned. Grouping respondents into four clusters, researchers found that the most well-informed respondents are more likely to be critical of data practices, and moderately well-informed respondents are more likely to be cautious. In other words, people who are better informed about data uses are more likely to hold negative attitudes towards them. This suggests that increased awareness of data uses does not result in increased trust or acceptance.[footnote]Kennedy, H. et al. (2021). Living with Data survey report. University of Sheffield. Available at: https://livingwithdata.org/resources/living-with-data-survey-results/. (Accessed: 29 March 2022).[/footnote]
  • Focus groups conducted by Britain Thinks for the CDEI in 2021 found that people often have negative views on data use because bad examples are more memorable. These findings suggest that such negative views are not the result of a lack of awareness or understanding.[footnote]CDEI and Britain Thinks. (2021). Trust in Data. Available at: https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/1049179/Trust_In_Data_-_Publishable_Report__1.pdf (Accessed: 17 March 2022).[/footnote]
  • Academics on the ‘Me and My Big Data’ project published research in 2021 about the UK public’s data literacy, outlining five categories that can be used to describe people’s knowledge, understanding and awareness about data and related issues. They found an overall majority of people are ‘uncomfortable with third-party sharing of data’. ‘Limited users’ – those with the lowest data literacy – are the second most uncomfortable with third-party sharing of personal data (71%), and ‘General users’ – who have just-above average data literacy – are the most uncomfortable with third-party sharing of personal data (74%). Those with much higher-than-average data literacy scores are ‘happier’ with data collection to deliver consumer benefit, but a majority of them (66%) still report discomfort around third-party data use.[footnote]Yates, P.S.J. et al. (2021). Understanding Citizens Data Literacies Research Report, p. 125. Available at: https://www.liverpool.ac.uk/media/livacuk/humanitiesampsocialsciences/meandmybiddata/Understanding,Citizens,Data,Literacies,Research,,Report,Final.pdf (Accessed: 18 January 2022).[/footnote]
  • According to the Lloyds Bank Consumer Digital Index, among the 14.9 million people in the UK with low digital engagement, 74% are concerned about using (digital) sites/tools to enter personal details. Among the 9.8 million with very high digital engagement, 58% are concerned about using sites/tools to enter personal details. 51% of non-users say they are worried about privacy and security and having their identity taken, and 44% say they are worried about how organisations use their data (up by more than 10 points since 2020).[footnote]Lloyds Bank. (2021). Consumer Digital Index. Available at: https://www.lloydsbank.com/banking-with-us/whats-happening/consumer-digital-index.html (Accessed: 31 January 2022).[/footnote]
  • Findings from the 2019 Oxford Internet Survey show that 70% of respondents are not comfortable with companies tracking them online, and that non-users are more likely than users to be concerned about privacy threats online (72% versus 52%). Only 29% of non-users think that technology is making things better. It is important to note that this higher concern is related to usage, not necessarily to awareness.[footnote]Blank, G., Dutton, W. H. and Lefkowitz, J. (2019). Perceived Threats to Privacy Online: The Internet in Britain, the Oxford Internet Survey, 2019. SSRN Scholarly Paper ID 3522106. Rochester, NY: Social Science Research Network. Available at: https://doi.org/10.2139/ssrn.3522106[/footnote]
  • The ICO’s 2020 trust and confidence survey suggests that different demographic groups hold different views about different kinds of data use. It found that ‘Non-BAME respondents (60%) are significantly more likely to have a high level of trust and confidence in the Police storing and using their personal information than BAME respondents (50%).’[footnote]Worledge, M. and Bamford, M. (2020). ICO Trust and Confidence Report. Harris Interactive and Information Commissioner’s Office. Available at: https://ico.org.uk/media/about-the-ico/documents/2618178/ico-trust-and-confidence-report-2020.pdf.[/footnote]

Future research avenues: awareness of data

The studies cited here have observed connections between understanding and attitudes, but they have not explicitly examined how the two are related. The evidence suggests that understanding alone does not engender trust; in fact, it may generate more critical views of data. It is also increasingly clear that trust is integral to closing some of the digital divides that contribute to unequal experiences and outcomes for people online, but exactly how trust and digital divides are connected remains poorly understood.

This means that, as noted by academics on the Living With Data project, more ‘analysis is needed [to] understand the relationship between awareness or understanding of data uses and attitudes towards them.’[footnote]Kennedy, H. et al. (2021). Living with Data survey report. University of Sheffield. Available at: https://livingwithdata.org/project/wp-content/uploads/2021/10/living-with-data-2020-survey-full-report-final-v2.pdf [/footnote]

Conclusion

It is clear that data – and the technologies built on it – will continue to define our societies and affect every member of the public. It is therefore paramount that people’s perspectives, attitudes and experiences directly shape the way data is governed. This is vital if we want the data-driven world we build to work for everyone, and if the UK wants to be world-leading in data governance and innovation.

The research reviewed here represents only part of a vast body of work conducted in academia, civil society, the public sector and industry. Much of this research looks beyond regulation alone, covering issues like inequality, privacy, agency and more. As data and data-driven technologies become increasingly central to our everyday lives, it is ever more important to bring these evidence-based insights together, identify key messages and themes, and develop policy that benefits people and society first and foremost.

The research reviewed for this briefing raises further questions that researchers and policymakers must consider:

  • How do the public expect regulation to balance the minimisation of harm with realising the benefits of data innovation?
  • How do the public define and determine what constitutes public benefit in data innovation?
  • What practical mechanisms for transparency meet public expectations?
  • What can public perspectives tell us about what trustworthy data use looks like in practice?
  • How can we build a more in-depth and robust understanding of the relationship between people’s awareness of data practices and their attitudes towards them?

In our report Participatory Data Stewardship, we describe a framework for involving the public in the governance of data.[footnote]Ada Lovelace Institute. (2021). Participatory data stewardship. Available at: https://www.adalovelaceinstitute.org/report/participatory-data-stewardship/ (Accessed: 10 January 2022).[/footnote] One approach is to conduct regular surveys that track public attitudes over time, and efforts to conduct such surveys will be informative for policymakers.[footnote]The CDEI public attitudes tracker is a good example of such regular surveying. See: Centre for Data Ethics and Innovation. (2022). Public Attitudes to Data and AI Tracker: Wave 1. Available at: https://www.gov.uk/government/publications/public-attitudes-to-data-and-ai-tracker-survey (Accessed: 15 November 2021).[/footnote] For these surveys to be meaningful, they will need to ensure granular representation of digitally excluded and other marginalised groups.

However, the value- and context-based nature of many of these questions – such as what constitutes a use of data for the public benefit – means traditional research methods may struggle to provide concrete answers or meaningfully incorporate public perspectives. Here, more participatory, deliberative and dialogue-based methods will be required, such as citizens’ assemblies, Government-led public dialogue, and co-design or ethnographic practices. More experimental methods, such as randomised controlled trials or futures thinking, will also be needed. These methods complement public opinion surveys: participants are supported to give informed and reasoned views, which are of more value to effective, evidence-based policymaking than survey responses that cannot offer the same depth of consideration.

There is one other key finding that this evidence review offers: the value of public participation lies not only in helping to align data use with societal values, but also in offering vital signals for what trustworthy, responsible and ethical data use looks like in practice.

We conclude this review by recommending that the Government, academia, industry and civil society work together to ensure public participation, engagement and attitudes research meaningfully informs and is embedded in the UK’s future data regulation.


Notes on the evidence included in this paper

  • Public attitudes research often describes public ‘expectations’. Where we have reported what the public ‘expect’ in this review, we use the term to mean what the public feel is necessary from data practices and regulation. ‘Expectation’, in this sense, does not refer to what people predict will happen.
  • Data and digital technologies (like the internet, artificial intelligence and computer algorithms) are deeply intertwined. In this review, we focus on research into public attitudes towards data specifically, but draw on research about digital technologies more broadly where we feel it is applicable, relevant and informative.
  • We focus on research conducted in the UK within recent years. The oldest evidence included dates from 2014, and the vast majority has been published since 2018. We chose this focus to ensure findings are relevant, given that events of recent years have had a profound influence on public attitudes towards technology, such as the Cambridge Analytica scandal, the growth and prominence of large technology companies in society and the impacts of the COVID-19 pandemic.
  • Various methodologies are used in the research we have cited, from national online surveys to deliberative dialogues, qualitative focus groups and more. Each of these methods has different strengths and limitations. We place no hierarchy on the value of any one particular method; each has a role to play, and the strengths of one approach complement the limits of another. It is the collective value of drawing from a range of robust methods that we prioritise, as this helps to address the complex and context-dependent nature of public attitudes towards data.

