Don’t start with the data: a people-centred approach to addressing health inequalities

Why identities and experiences can’t be reduced to categories

Mavis Machirori

16 March 2022

Reading time: 8 minutes

On 14 February, the NHS Race and Health Observatory published a rapid review of evidence focusing on ‘ethnic inequalities in healthcare’. The report raises serious concerns about inequalities affecting UK minority ethnic groups.

We will use this evidence as a jumping-off point to understand how some forms of discrimination have become encoded in the health and care system’s use of data, and explore why a people-centred approach that focuses on the health experiences and identities of individuals and communities might be the key to tackling the sources of inequality in a world that uses data to address health and social concerns.

Based on literature searches covering 2011 to 2021 and looking at outcomes across mental health, digital access, maternity and neonatal care, genomics and the composition and experiences of the NHS workforce, the authors of the review identified ‘structural, institutional and interpersonal racism’ in the healthcare sector as the main cause of unequal outcomes.

While not explicitly focused on responding to the COVID-19 crisis, the Observatory’s report comes at a time of increased awareness of both health and social inequality in the UK. While these recent cases may be novel, inequalities in the healthcare system are not new, as the Race Observatory report evidences well. And of course, the COVID-19 pandemic has also highlighted how inequalities are entrenched in society beyond health and social care.

The findings demonstrate that, if unchecked, different factors can increase the COVID-related burden of mortality and morbidity faced by vulnerable groups, in areas ranging from excess mortality among people from marginalised communities and limited access to digital technologies to badly designed systems, such as algorithms that disproportionately allocate funds away from minority groups, or oxygen monitoring systems that work less effectively on darker skin.

Health inequalities are increasingly understood as a matter of social justice – from the Race and Health Observatory review, to the Ada Lovelace Institute’s work with the Health Foundation on inequalities in data-driven systems, to the latest Marmot Review. However, how to tackle the sources of inequality in a world that increasingly (and often selectively) uses data to address health and social concerns is still an open question.

Data plays an important role in evidencing inequalities, but data-processing practices themselves – the ways in which data is collected or not, labelled, interpreted and put to use in regular healthcare provision as well as during emergencies – may be revealing of the discriminatory ways in which marginalised social groups are considered and treated. The COVID-19 pandemic offers an interesting and challenging case study to understand how forms of discrimination have become encoded in the health and care system’s use of data.

Exploring how data interacts with inequalities requires a methodological reflection on the use of data itself – on the role of data and data users – and on how to think about inequalities in an expansive manner, that is, without always deferring to potentially reductive categories that focus, for instance, on ethnicity alone.

To comprehend the acceleration and creation of new inequalities in the healthcare system, we need to move from abstract or quantitative notions of inequality (and equality) to a holistic approach that begins by asking: Who is a system or emergency intervention designed for? Who benefits from it? Who does not benefit? And who is left worse off?

Adopting this perspective will mobilise a conception of health and social inequalities that goes beyond narrow ethnicity categories and improves our understanding of both ethnicity-related and other forms of inequality, based on various health and social factors. We need to consider both the system of representation itself and the many ways in which people are represented in a particular healthcare system, where ethnicity is only one way of knowing how people experience health and social outcomes.

In practice, this means conducting a critique of the use of data in healthcare that combines questioning and improving the current use of data with approaches that engage directly with marginalised communities and do not begin with data collection. We need to get better at consistently recording demographic and health data and go beyond the basics of individual datasets to consider how each of them is enriched by contextual information.

This does not necessarily entail accelerating data collection, but rather using existing information in better ways, whether that is about geographical, economic or other types of social divides. It also means looking genuinely at different forms of inequality from the perspective of the groups who experience it.

First, we should interrogate the system in which decisions are made over data and how these decisions lead to the racial/ethnic inequalities that the Race Observatory report has surfaced. This includes being transparent about the reasons that motivate each and every step of the ‘data lifecycle’, including why we choose to collect certain data, how we choose to analyse it, the methods of interpretation and the meaning attributed to data variables, and the ways in which the limitations we find in datasets are mitigated.

All decisions in this process, even the apparently innocuous ones, may function as pathways to inequality that demand investigation. Interrogating the ‘data lifecycle’ and the rationale behind each of its steps will make visible how inequalities arise from the specific actions and intentions of a variety of data users that operate in a particular data landscape – this being the data-powered component of health and social care structures and institutions as a whole. It will also enable us to make population-wide interventions without necessarily compromising individual needs, or at the very least, being more explicit about the trade-offs we choose to make.

Limitations in data processes were evident at the height of the COVID-19 pandemic, when interventions relied heavily on rapidly emerging data. The absence of relevant datasets (such as employment), the incompleteness of others (including those related to ethnic background), or the limited ability to combine data in ways reflecting people’s experiences meant that public health interventions had variable rates of success. Interventions were based on partial information and usually targeted at a ‘population’ (both general and of assumed homogeneity).

Even when interventions were explicitly aimed at supporting minority groups, identification of a single category of ethnicity with little further disaggregation or explanation disregarded many of the specific ways in which people faced inequalities. We identified the effects of these data choices in our earlier report, The data divide, the first output from our research programme in partnership with the Health Foundation.

The research found that not everyone benefited from the accelerated use of data-driven technologies during the COVID-19 crisis. The majority of those less likely to have benefited from data-driven and digital interventions were not only from minority groups, but also from low-income backgrounds or the elderly. Inequalities were exacerbated for those individuals who sat at intersections of vulnerabilities, like the 13% of respondents in our survey, who reported that they did not have a smartphone and were at high risk of COVID-19.

If a data-driven intervention or health and care system is designed for the general population, or without consideration of how intersectional inequality is experienced, then its design risks reflecting and perpetuating existing inequalities. By not responding to structural data issues, a health system may inadvertently set itself up as a racist structure, as the Observatory’s review warns. Responding to these kinds of intersectional inequalities is therefore not only about recognising individual identities, but demands that we respond to how power imbalances are embedded in the very processes that categorise, code and name people, whose experiences are noted through gender, race, sex, employment status, religion and more.

Understanding this context means recognising that two tasks lead us to a different and perhaps more critical one. The first is questioning who a system is designed for, and making explicit the rationale behind the decisions taken throughout the ‘data lifecycle’ and how they may interact with inequalities. The second is collecting data that reflects the multiple and intersecting causes of inequality and experiences of power dynamics. Together, they lead to the task of questioning the meaning of categories, codes and structures. As we improve how and what kind of data we collect, we cannot forget that identities and experiences cannot simply be reduced to categories.

This means that, instead of starting from the data, we need to adopt a people-centred approach that focuses on the health experiences and the identities of individuals and communities, which cannot always be seen in health-related datasets. This stance can be taken as championing a comprehensive approach to the issue of inequality, which aims to consider all factors at play, such as economic deprivation, geographical and place-based inequalities, digital inclusion and skills.

This approach takes the social determinants of health as dynamic and continually interacting causes. It also helps us see that the social determinants we now use to talk about inequalities do not exist outside of data decisions, but come about because of what we choose to measure and collect and of how we, in turn, mobilise specific categories to address public-health emergencies.

The approaches raised in this piece could enable the health and care system to design and deploy data-driven interventions and insights that not only avoid reproducing the situation reported by the Race Observatory review but actually mitigate existing racial/ethnic inequalities.

Starting from people’s experience means first recognising them beyond the data and then intentionally collecting data to address their specific needs, instead of using unrepresentative datasets to design systems based on measures of inequality that are not aligned with people’s everyday experiences.

If the pandemic can deliver anything positive, it will be the opportunity for all actors, and particularly data users, in the health and social care data landscape to consider how their decisions may have reproduced inequalities. It is no longer possible to believe that one’s professional role is not part of a structural pathway to increasing or reducing inequalities.

We cannot continue to talk about inequalities, and address them for any social group, without finding a way to truly represent people’s experiences in the data. This change in attitude may not necessarily start from data but from asking ‘Who is a specific choice going to benefit?’.

If you would like to find out more about our project on data-driven inequalities during the COVID-19 pandemic in collaboration with the Health Foundation, please email Mavis Machirori, Senior Researcher.