Facial recognition technology is a complex area, which means the risk of misunderstandings is high.
To make it simpler, we’ve created a grid of important terms and issues that can form a basis for a shared language. This will make it easier for everyone – from technical experts to policymakers to the public – to participate in the conversation on facial recognition.
Many things come under the umbrella of ‘facial recognition technology’: using your face to unlock your mobile phone, using a shop or a bar’s security camera footage to match against a watchlist of possible shoplifters, checking someone’s age when buying alcohol at self-checkouts, or using airport e-passport gates. All of these are called facial recognition technology, but there are significant differences in how they work, the data they use, where they happen and who is in control of the system.
These four dimensions help clarify different cases of facial recognition technology, and the issues that surround them:
- Task – when people talk about facial recognition technology, they’re usually referring to the use of advanced statistical analysis to do one of a number of tasks
- Data – the data on which a facial recognition system operates
- Deployment – where and how a facial recognition technology is used in the world
- Purpose – by whom, and for whom, the technology is used
Facial recognition technology usually refers to using advanced statistical analysis on images of people to do one (or more) of the following tasks:
- Detection – identifying that there is a face in an image or series of images, and where it is located in the image(s). This is usually the first step of a facial recognition process, to enable matching, identification or classification on only the relevant parts of an image.
- Clustering – grouping similar faces in a collection of images. For instance, if a system had photographs of people attending a football match, clustering could be used to group together the photos of each unique face at the football match, without having pre-existing data about attendees.
- Matching – comparing a facial image or images against a pre-existing set of images to see if there’s a match, e.g. matching faces from shop surveillance footage against a list of images of barred persons, or matching faces from a crowd against images of ‘persons of interest’ in police watchlists.
- Identification – comparing a face to a specific identity. This could be done for two purposes:
  - Verification – answering the question, ‘Is this person who we think they are?’ e.g. the police checking if the person in an image matches their suspect.
  - Authorisation – answering the question, ‘Is this person who they claim to be?’ e.g. to allow access to something, such as using Face ID to unlock an iPhone.
- Classification – identifying a characteristic of a face, such as age, gender or expression. This is sometimes referred to as facial analysis because it’s used to tell us something about the face, e.g. at a supermarket checkout to assess whether someone is old enough to buy alcohol.
Apple’s Face ID lets you unlock your phone with your face. This is an example of facial identification for authorisation. This differs from Facewatch, a facial recognition security system for businesses, which does facial matching of CCTV footage against a watchlist of ‘persons of interest’. Facial ‘emotion’ recognition has been in the news after use by Unilever in the UK for job interviewing. This is classification of facial images according to expressions – a smile, a frown – though the claim that this can then be used to identify emotion is highly contested.
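To make the difference between identification and matching more concrete, here is a minimal sketch in Python. It assumes each face has already been turned into a list of numbers (often called an embedding) by some model, and that two faces are compared with cosine similarity. The vectors, the names `verify` and `match_watchlist`, and the threshold of 0.8 are all invented for illustration; real systems may work quite differently.

```python
import math

# Assumed cut-off for calling two faces "the same"; real systems tune this value.
THRESHOLD = 0.8


def cosine_similarity(a, b):
    """Similarity between two vectors: 1.0 means they point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def verify(probe, enrolled, threshold=THRESHOLD):
    """Identification (one-to-one): does the probe face fit one specific identity?"""
    return cosine_similarity(probe, enrolled) >= threshold


def match_watchlist(probe, watchlist, threshold=THRESHOLD):
    """Matching (one-to-many): which watchlist entries resemble the probe face?"""
    scored = [(cosine_similarity(probe, vec), label) for label, vec in watchlist.items()]
    # Keep only entries at or above the threshold, most similar first.
    return [label for score, label in sorted(scored, reverse=True) if score >= threshold]


# Invented example vectors, not derived from real faces.
probe = [0.9, 0.1, 0.4]
enrolled_owner = [0.88, 0.12, 0.41]
watchlist = {"entry_a": [0.1, 0.9, 0.2], "entry_b": [0.9, 0.1, 0.38]}

print(verify(probe, enrolled_owner))
print(match_watchlist(probe, watchlist))
```

The comparison itself is the same in both functions. What changes is what the probe image is compared against: a single enrolled identity in one case, a whole list in the other. In this sketch, the choice of threshold decides how many faces count as a match.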
Facial recognition systems are biometric systems – they use biometric data, which is sensitive as it’s very personal and based on biological characteristics that are hard for people to change. If someone finds out your password, you can change it. But if someone has a recording or photo of your face, you probably don’t want to change your face. Many people already have photos of themselves on the internet, and it can be hard to spot if you’re recorded by CCTV, making it difficult for people to know when their data might be collected or subject to facial recognition technology.
Probe images are new, unknown images collected to input into a facial recognition system when it’s in use. Here are several ways probe images may differ across facial recognition systems:
- Personal/private vs public – images can be taken in public places, drawn from government databases or the internet, or collected privately.
- Retention and duration – how long data will be stored, what format it will be stored in (e.g. original images, metadata or abstracted representation) and where it will be stored are relevant, as are opportunities for redress and transparency about retention.
- Resolution and image quality – resolution and other quality factors such as lighting will make images more or less likely to be accurately recognised by a system.
There are also specific considerations around training data, the initial data used to develop a machine learning model for facial recognition (a process often referred to as ‘training’ the model). This is typically a set of images or features already labelled by a human that the model can ‘learn’ from. Additional considerations here include how representative the images are and the risk of bias: if one group of people is over- or under-represented in the training data, the system might be better or worse at recognising them.
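One way to see whether a system is better or worse at recognising some groups is to break its accuracy down by group instead of reporting a single overall figure. The sketch below assumes we already have a list of test outcomes, each tagged with a group label. The function name `accuracy_by_group`, the group labels and the results are all made up for illustration and do not describe any real system.

```python
from collections import defaultdict


def accuracy_by_group(results):
    """Return the share of correct outcomes for each group.

    `results` is an iterable of (group, was_correct) pairs.
    """
    totals = defaultdict(int)
    correct = defaultdict(int)
    for group, was_correct in results:
        totals[group] += 1
        if was_correct:
            correct[group] += 1
    return {group: correct[group] / totals[group] for group in totals}


# Invented outcomes: the overall accuracy is 50%, which hides a large gap.
results = [
    ("group_a", True), ("group_a", True), ("group_a", True), ("group_a", False),
    ("group_b", True), ("group_b", False), ("group_b", False), ("group_b", False),
]

rates = accuracy_by_group(results)
for group, rate in rates.items():
    print(f"{group}: {rate:.0%} recognised correctly")
```

A breakdown like this only measures the gap between groups. It does not show whether unrepresentative training data is the cause.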
Facial recognition systems can be used in different environments or scenarios – such as in airports, bars or out on the streets. This can be called the deployment of the system. Some ways to think about these different deployments are:
- Live vs after-the-fact – whether images are processed ‘live’, by which we mean in near-real time, or at a later point can change the way outputs can be used, and could have implications for transparency and civil liberties. For instance, ‘live’ facial recognition technology can result in actions being taken immediately, whereas performing facial recognition on historical images, such as yesterday’s CCTV, can raise questions as to whether people are aware of how their image is being processed.
- Controlled vs uncontrolled – a controlled environment is one where important factors like lighting, background and position can be controlled, e.g. a lab environment, or e-passport gates where parts of the process are standardised. An uncontrolled environment is the real world, where all these things may vary, making facial recognition much more challenging.
- Transparency – those deploying facial recognition technology may be more or less transparent about the fact it is being used, how it is being used and how it works. That means we may not always be able to identify the characteristics described here for every system.
Lastly, it’s important to consider for whom and by whom the technology is being used – whose purpose? In our first research into public attitudes, ‘Beyond Face Value’, we included a range of different purposes. We found 77% of people were uncomfortable with facial recognition technology being used in shops to track customers and 76% were uncomfortable with it being used by HR departments in recruitment, but that people were more comfortable with use in policing or airports.
Sometimes the technology will be used for more than one of these purposes (just as it could do more than one task), and will likely face multiple data considerations. Identifying what those purposes are, however, is important to framing the discussion and pinpointing causes of excitement or concern. A technology or system may be helpful for one group’s purposes, but problematic when used for the purpose of another.
At a glance
We hope these terms are helpful to conversations moving forward. If you’ve used them or have feedback, we’d love to hear about it. You can reach us at firstname.lastname@example.org. If you’d like to share these terms to help clarify the conversation, we’re on Twitter @adalovelaceinst.
We’ve also put the table of terms and definitions on GitHub and welcome contributions.
Thanks to William Isaac from Google DeepMind for feedback on a draft of this piece.