
The tension between the promises and realities of data use in health and social care

A partnership with the Health Foundation to study the relationships between data-driven systems and inequalities in health

Mavis Machirori, Kira Allmann, Mai Stafford, Josh Keith

10 June 2022


Crowd of people walking along the South Bank next to the River Thames in London

You don’t need to look too deeply to recognise that there is a tension between the present, data-optimistic policy discourse and the unequal distribution of the benefits of data-driven health and social care.

While data can play a substantial role in understanding and addressing inequalities in this context, increased reliance on its use risks exacerbating existing forms of discrimination. Pursuing the discourse that the use of data itself brings positive outcomes, without addressing the infrastructural inequalities underlying data-driven systems, might even generate new forms of discrimination and create new vulnerable groups in ways that are not always easy to identify.

The use of data-driven technologies in the UK health and social care system has risen rapidly in recent years. The trend has only accelerated during the COVID-19 pandemic, when data-driven interventions have been a central feature of the emergency response.

The NHS COVID-19 contact tracing app, downloaded by more than 30 million people in England and Wales (and equivalents in Scotland and Northern Ireland); the Government’s coronavirus data dashboard, which many of us have relied upon as a trusted source of information on the prevalence and spread of the virus; the data-driven approach to identifying those who are clinically vulnerable; and the use of predictive analytics to anticipate additional pressure on NHS units are just a few examples of the types of data-driven interventions that have emerged.

The use of data-driven systems has expanded in parallel with the enthusiasm of policymakers. The draft Data Saves Lives white paper, which encourages service providers in the sector to embrace data analytics to improve health and social care provision, exemplifies the now prevalent discourse.

The recent Goldacre review recommends specific strategies to maximise the benefits of using data in health research. Steps towards this goal include avoiding duplication of work, establishing open methods, adopting clear standards for data curation, and using shared research environments that gradually phase out outdated privacy mechanisms, such as an over-reliance on pseudonymisation.

However, the tangible benefits that more effective data-driven interventions may hold for improving health and social care services – such as helping with early diagnoses, moving care closer to home, encouraging healthier behaviours, hospital bed management and care prioritisation – may come with unintended and possibly unexpected consequences.

They may deliver positive outcomes, but only for some people, because of the structural discrimination embedded in the systems that collect and process data, as previously highlighted by the Race and Health Observatory Review. Incomplete datasets and the exclusion of ethnic minority groups from data collection and analysis, for instance, interact with intersecting identities and circumstances of deprivation, and are already amplifying existing inequalities and cutting people out of the benefits of data-powered applications.

Studies are beginning to draw attention to some of these data-related issues, such as the emerging digital divide, the data divide and their effects on the widening social and health inequalities across diverse communities.

These analyses, however, have not yet been able to identify concretely how discrimination is exacerbated, or exactly which factors are at play, nor have they articulated how we can plan data-driven interventions in such a way that they reverse rather than accelerate the process.

In other words, there is a lack of evidence where we need it most: we do not really know what happens at the intersection of data-driven health and social care and existing inequalities and how the former may amplify or mitigate the latter.

The research partnership between the Health Foundation and the Ada Lovelace Institute has been established to develop this missing evidence base and understand what measures should be taken to ensure that the use of data benefits everyone. Our project has been investigating the data-driven technologies and methods adopted by the UK Government to stop the spread of COVID-19, with a focus on how they have impacted, negatively and positively, on health inequalities.

This question is particularly difficult to address due to the complex nature of the systems involved: the UK health and social care system, sophisticated data-driven interventions and systemic intersectional injustices. Potentially discriminatory assumptions can creep into each step of an intervention, from problem-framing to design to deployment, and into the stages of data collection, algorithmic training and data analysis by different research teams.

Notably, the decisions made at each of these steps and their immediate consequences are not always obviously connected to the inequalities people later face as a result. This means that they may also be creating new vulnerable groups.

For example, in the first phases of the COVID-19 pandemic, stay-at-home orders, informed by R-numbers deduced from contact tracing data, did not always acknowledge that social isolation could be exacerbated for those without access to digital technologies or the skills to use them. These issues increased mental health needs and reduced people’s ability to carry out necessary tasks, such as shopping online.

Digital inclusion or exclusion is also becoming a determinant of health in general. When people are unable to access remote public services, not only do they miss out on the benefits, but they are also unrepresented in the data collected through the service and used to develop new services in the future. Moreover, as has been highlighted, deploying data interventions without adequately interrogating the sources and the contexts of the data further fuels inequalities.

In fact, recognising and then mitigating inequalities is becoming increasingly challenging over time. To truly enable change, we need to be able to compare and establish which parts of a system impact on inequalities – either by reducing or exacerbating them. Crucially, we need to be able to link each intervention’s decision-making process to people’s experiences and outcomes.

This enduring challenge is fundamental to realising the promises of data and AI strategies for health and social care for all, but is made harder by the present pace of technological transformation. Overlooked inequalities are already being baked into data-driven systems, making the identification, measurement and documentation of unequal outcomes a truly complex task.

Improving our capacity to evaluate different systems might require the collection of more or different data, but it also requires tracing all decisions and assumptions made about what goes in and out of a data-driven system.

How are we investigating these issues?

Building on our initial report, ‘The data divide’, we aim to explain how the design and deployment of data-driven systems interact with existing inequalities, what impact this has on people’s experiences of health, and whether specific demographic groups are disproportionately affected.

We also aim to stimulate a thoughtful debate on the measures to take to ensure that data-driven systems make a positive contribution to reducing inequalities in health and social care.

Our project has three workstreams:

  1. Desk-based research on the deployment of data-driven approaches during the COVID-19 pandemic emergency, supplemented by stakeholder interviews with experts working in health and social care, data analytics, industry and local government offices.
  2. Place-based, ethnographically informed interviews conducted by peer-researchers from the APLE collective across the UK, drawing on their lived experience of social and economic marginalisation.
  3. An in-depth case study of one data-driven health service tool to understand how the design and decisions taken in the context of a specific intervention may affect and be affected by existing social inequalities.

We are particularly aware of the fact that the complexity of the issues at stake makes translating the evidence gathered into meaningful advice and recommendations challenging. How will different stakeholders, in their respective positions, approach the issues that the evidence surfaces? Which actions are likely to be more impactful in the short, medium and long term?

Continued engagement with analytics specialists, product owners, patient safety professionals, AI ethics practitioners and policy advisers from central Government will be a key part of understanding our findings in more detail, and co-creating recommendations for these different groups.

Furthermore, ensuring a more holistic understanding of inequalities requires that the voices of patients, marginalised communities and those who have supported them during the pandemic are heard in policy discussions too.

Convening these discussions will go part of the way towards fulfilling one of our objectives: explaining how developers, analysts, engineers, policymakers, project managers and frontline service providers can each play a role in mitigating inequalities. Inevitably, this will require bringing conceptual notions down to the level of day-to-day practice and making them relevant to individuals operating in the system, turning the issue of inequality from an abstract problem into a concrete question.

How can you get involved?

If any of these areas interest you, if you work with data to design or deploy data-driven interventions, or if you are interested in inequalities and want to be informed of our findings as they emerge, please get in touch with Mavis Machirori:

Image credit: mattjeacock