Data stewardship is a relatively new but increasingly popular concept that represents both a fundamental overhaul of existing top-down decision-making on data and a set of more functional practices for putting this into effect. It can help rethink how the value of data is defined and distributed, and create mechanisms for people to negotiate their data rights.
In a shocking admission in late April this year, the Government of India acknowledged in Parliament that no data was available on the number of migrant deaths due to the harsh lockdown the country imposed in response to the COVID-19 pandemic, which forced millions of migrant workers to move back to their homes at very short notice. Faced with the reality of no work or income, many took arduous journeys on foot in the absence of trains or buses.
There have been various efforts to consolidate data on migrants across the country during the health emergency. For example, the India Observatory dashboard run by a civil society organisation, the Foundation for Ecological Security (FES), presented data on movements of migrants in real time as well as information on critical facilities and relief measures. However, it lacked proper consent protocols, did not offer migrants control over their data and was unable to deploy data to provide better care to the workers.
This absence of reliable information renders the migrant’s experience invisible and points to an imbalance of power in the data economy. On one hand, migrant workers are in unequal and data-extractive relationships with the State, regularly parting with personal details, including biometric identification, in return for services like subsidised food grains or the right to work.
On the other hand, migrant workers’ vulnerabilities are accentuated when relevant data about them is unavailable during emergencies. This shows that data is fundamentally about power: its collection is used to exert control and surveillance, and its absence is used to exclude. This social and data injustice needs to be addressed to enable workers to bargain better with the State and the private sector.
One possible mechanism to tilt authority in favour of migrant workers in this case, and of vulnerable communities more broadly, is data stewardship – a set of governance, legal and technological instruments that enhance the agency and decision-making power of data subjects, whether individuals or communities, over how their data is governed, while harnessing data for societal impact.
There are several ways in which data can be stewarded with greater participation from the people whose data it is – data subjects. For instance, personal data stores enable individuals to seek benefits by sharing their data; data cooperatives can ensure that groups make data-related decisions in the interest of the collective; data trusts revolve around a legally entrusted group that makes decisions on behalf of data principals/subjects.
Examples of data stewardship models are emerging globally. Digi.me is a personal data store that aggregates individual data from multiple sources and allows subjects to share it with the companies they choose. This model aims to empower individuals and help them extract value from the data they produce.
Data cooperatives, such as the MiData platform, which shares medical data for research, allow members to collectively govern the use of their data. Similarly, Driver’s Seat, a driver-owned cooperative, collects and sells mobility data from drivers to city governments and research institutes to create value for society and people. The cooperative model can be used to generate societal value or help groups evolve mechanisms of participative governance and negotiate better on issues of data rights.
Data trusts have a similar function to cooperatives: they aim to empower collectives to draw more value from their data. But they are governed differently. A trust entails a legally defined board of trustees that exercises delegated consent on behalf of the people or community that are part of the trust, and that is committed to taking decisions in their best interest. Data trusts can be used for multiple purposes, from unlocking the value of data to solve societal problems to helping groups use data for collective empowerment. Each of these models has its own logic, rules and governance systems, defined by its particular context.
The Data Economy Lab, a collaborative effort between Aapti Institute and Omidyar Network, has studied over 100 existing models of data stewardship and is in the process of drawing out lessons about how to approach more operational questions when designing models across sectors. Our early insights suggest that the purpose of the steward, or the problem it is intended to solve, will define which stewardship governance framework to adopt. Further, sector-specific rules would apply to data stewards and shape their characteristics: a data stewardship model that is structured to make transport data available to mobility start-ups will be designed very differently from a migrants’ data stewarding body that exists to parley with the State for better use of existing data.
An evolved data stewardship ecosystem would include multiple types of stewards in different sectors, offering varying levels of control, governance models, security benefits and degrees of fiduciary responsibility. Data principals would be able to choose one data steward to use their medical data for research, and another to share their mobility data with municipalities for more efficient public transport. Giving users choice and control is fundamental to a people-centric data economy.
The vision of a vibrant, user-centric ecosystem of data stewardship models is still some way off. At the moment, we are still trying to answer foundational questions. For instance, if stewardship models aim to empower communities through data and want to implement bottom-up principles of governance, how do we define communities in the context of data? Digital groups are complex and, unlike geographic communities, do not have clearly defined boundaries. They are also difficult to organise and to bring together around specific issues. Furthermore, individuals may be part of multiple data communities, the interests of which may overlap or be antagonistic to one another. Individuals may be members of groups on social media as well as being platform workers, but each of these groups has different lived experiences with data. We have to study data communities further to identify shared norms and data experiences.
Another functional question that needs to be resolved is how we communicate the value of data stewardship as an ideal and, relatedly, the value of data itself. Given the unequal relationships between technology companies and people, the fact that data is a valuable resource that can be used for negotiation is still not widely understood. For instance, drivers on ride-hailing applications are forced to part with significant data that is used to generate profits for companies, but drivers who are already struggling for social security and labour rights are not equipped to bargain for data rights as well.
To reimagine the fundamentals of how power is organised and wielded in the data economy, there need to be multiple, simultaneous experiments to test various aspects of stewardship. It is imperative to study and learn, to understand the possibilities and limitations of stewardship in practice. It is also important to carry ideas for reframing data governance to diverse communities through civil society organisations, labour unions and citizen action groups, so that these communities can bring forward their concerns and help evolve models of data stewardship.
Through robust data stewardship practices, vulnerable communities such as migrant workers in India could be empowered to navigate exploitative and extractive data relationships with the State and private sector more effectively and discover avenues for how their data can be used in their own interest.
Building a system for operationalising data rights and holding those in power more accountable is imperative for rebalancing power towards individuals and communities.
This article is the second in a series about data stewardship. Across the series, researchers and practitioners working in different organisations and contexts, who each have a unique perspective on data stewardship, will share practical experience and research ideas.
It’s not possible or desirable for one person or organisation to decide what a ‘good’ use of data is. That’s why we hope this series and our research will help push forward thinking on how to govern data for good and ensure diverse voices contribute to defining it.
Image credit: imagedepotpro