Corporate and contractual mechanisms can create an ecosystem of trust where those involved:
- establish a common purpose
- share data on a controlled basis
- agree on structure (corporate or contractual).
Exploring legal mechanisms for data stewardship
This is chapter three from a joint publication from the Ada Lovelace Institute and the AI Council.
Why corporate and contractual mechanisms?
Corporate and contractual mechanisms can facilitate data sharing between parties for a defined set of aims or an agreed purpose. For the purposes of this report, it is envisaged that the overall purpose of a new data model will be to achieve more than mere data sharing, and data stewardship can be used to generate trust between all the parties and help overcome relevant contextual barriers. The core purpose for data sharing will be wider than just the benefit gained by those who make use of data.
The role of the data model we envisage therefore includes:
- enabling data to be shared effectively and on a sustainable basis
- being for the benefit of those sharing the data, and for wider public benefit
- protecting the interests of those with legal rights over data
- ensuring data is ethically used and in accordance with the rules of the institution
- ensuring data is managed safely and securely.
How to establish the right approach?
The involvement of an independent data steward is envisaged as a means of creating a trusted environment in which stakeholders feel comfortable sharing data with other parties whom they may not necessarily know, or with whom they have not had an opportunity to develop a relationship of trust.
Incentives for allowing greater access to data and for making best use of internal data will vary according to an individual organisation’s circumstances and sector. While increased efficiency, data insights, improved decision making, new products and services and getting value from data are potential drivers, there are also a number of challenges to sharing data:
- operating in highly competitive or regulated sectors, and concerns about undermining value in IP and confidential information
- a fear of being shown up as having poor-quality or limited data sets
- a fear of breaching commercial confidentiality, competition rules or GDPR
- a lack of knowledge of business models to support data sharing – access to examples, lessons learned and data sharing terms can help others feel able to share
- a lack of understanding of the potential benefits
- not knowing where to find the data or limited technical resource to implement (e.g. to extract the data and transform it into appropriate formats for ingestion into a data-sharing platform)
- fear of security and cybersecurity risks.
All these challenges can lead to inertia and lack of motivation.
Where a group of stakeholders see benefits in coming together to share data, they will still need to be confident that this is done in a way that maintains a fair equilibrium between them, and that no single stakeholder will dominate decisions on the management and sharing of data. To establish and maintain the confidence of the stakeholders, they should all be fully engaged in determining what legal mechanism should be established. One or two stakeholders deciding and simply imposing a structure on the others is unlikely to engender a sense of trust, confidence and common purpose.
It is for this reason that we recommend the following approach.
1. Establish a clearly defined purpose
Establishing a clearly defined purpose is the essential starting point for stakeholders. Not only will a compelling statement of purpose engender trust among stakeholders, but it will also provide the ultimate measure against which governance bodies and stakeholders can check to ensure that the data-sharing venture remains true to its purpose. A clearly defined purpose can also help in assessing compliance with certain principles of the GDPR and other data-related regulations, including ePrivacy,1 or Payment Services Directive 2,2 which are often tested against the threshold of whether a data-processing activity, or a way in which it is carried out, is ‘necessary’ for a particular purpose or objective.
Any statement of purpose will need to be underpinned by agreement on:
- the types of data of which the data-sharing venture will take custody or facilitate sharing
- the nature of the persons or organisations who will be permitted access to that data
- the purpose for which they will be permitted access to that data
- the data-stewardship model and governance arrangements for overseeing the structure and processing, including to enforce compliance with its terms and facilitate the exercise of rights by individuals, and to ensure that data providers and data users have adequate remedies if compliance fails.
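The first three points of agreement above can be read as the fields of a purpose-bound access rule. The following is a minimal sketch in Python of how a data steward might record and check them. The class names (`VenturePolicy`, `AccessRequest`) and the example values are hypothetical and are not drawn from any real scheme.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessRequest:
    """A request by a data user to access data (hypothetical structure)."""
    user_category: str  # e.g. "academic_researcher"
    data_type: str      # e.g. "energy_usage"
    purpose: str        # e.g. "public_interest_research"


@dataclass(frozen=True)
class VenturePolicy:
    """The agreed terms underpinning the statement of purpose (hypothetical)."""
    data_types: frozenset
    user_categories: frozenset
    purposes: frozenset

    def check(self, request: AccessRequest) -> list:
        """Return reasons for refusal; an empty list means the request is in scope."""
        reasons = []
        if request.data_type not in self.data_types:
            reasons.append(f"data type '{request.data_type}' is outside the agreed scope")
        if request.user_category not in self.user_categories:
            reasons.append(f"user category '{request.user_category}' is not permitted")
        if request.purpose not in self.purposes:
            reasons.append(f"purpose '{request.purpose}' is not an agreed purpose")
        return reasons


# Illustrative values only.
policy = VenturePolicy(
    data_types=frozenset({"energy_usage"}),
    user_categories=frozenset({"academic_researcher"}),
    purposes=frozenset({"public_interest_research"}),
)
request = AccessRequest("academic_researcher", "energy_usage", "marketing")
for reason in policy.check(request):
    print("Refused:", reason)
```

Returning the reasons for refusal, rather than a bare yes or no, is one way of supporting the transparency and remedies that the governance arrangements are meant to provide.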
2. Data provider considerations
The data-sharing model has to be an attractive proposition for the intended data providers, with clear value and benefit, and without unacceptable risk. There will need to be strong and transparent governance to engender the level of confidence required to encourage data sharing. This will include confidence in not only the data provider’s ability to share the data with the data-sharing venture without incurring regulatory risk or civil liability, but also in its ability to recoup losses from the data-sharing set up or from relevant data users if the governance fails and this results in a liability for the data provider. Other considerations for governance could be related to managing intellectual property rights and control over products developed based on the data shared.
3. Data user considerations
As with the data providers, the data-sharing model must be an attractive proposition for intended data users. The data will need to be of sufficient quality (including accuracy, reliability, currency and interoperability) and not too expensive, for the data users to want to participate. Data users will also require adequate protection against unlawful use of data. For example, in relation to personal data, data users will typically have no visibility of the origins of the datasets and the degree of transparency (or lack of it) provided to the underlying data subjects. They will also be relying on the data providers’ compliance with the governance model to ensure that use of the contributed datasets will not be a breach of third-party confidentiality or IP rights.
4. Data steward considerations
The data steward’s role is to make decisions and grant access to data providers’ data to approved data users in accordance with the purpose and rules of the data-sharing model. The steward may take on additional responsibilities such as due diligence on data providers and users, and enforcement of the purpose of the data-sharing model; however, the way in which the model is funded and structured3 will impact on the extent of any such duties and who is practically responsible for performing them.
The responsibilities of the data steward may in turn affect how data providers and data users assess the overall risk, and how trust develops in the relationships.
5. Relationship/legal personality
The formal relationship between the parties will depend on the previous steps and the project structure that the stakeholders are comfortable with, based on the relevant risk, economic, regulatory and commercial considerations. Where there is no distinct legal personality, the relationship may be governed by a series of contracts between the data providers, users and data steward, whether bilateral or a contract club with multiple parties. Where there is a legal personality, there will be the documents establishing the relevant legal entity, most likely alongside a series of contracts.
6. The rules
The rules of the data-sharing model will form part of the corporate and/or contractual relationship between stakeholders. This is discussed in more detail below in the ‘Mapping data protection requirements onto a data sharing venture’ section and in Annex 1 on ‘Existing mechanisms for supporting data stewardship’ when discussing regulatory mechanisms.
What is the appropriate legal structure?
As outlined, the aim is to design an ecosystem of trust. The data stewardship model will sit at the heart of this ecosystem. In this section we address two broad possibilities as to the legal form this should take:
- a contractual model: this would involve a standardised form of data-sharing agreement without the establishment of any form of additional legal structure or personality
- a corporate model: this would involve the establishment of a company or other legal person, which would be responsible for various tasks relating to the provision of access to and use of data. The documents of incorporation would be supplemented by contractual arrangements.
In the contractual model, all of the rules for the operation of the data venture would need to be set down (and repeated) in a series of bilateral (or multilateral) agreements between data providers and data users. This, when combined with the fact that each party would need to take action on its own behalf to enforce the terms of that agreement against any counterparties, makes it likely that providers of data will only be willing to provide access to data on highly specific terms.
Where the aims of the stakeholders will require significant flexibility and scalability, a simple contractual model may not be the most appropriate. For example, a contractual model does not easily accommodate dedicated resources which may be required to govern and administer a growing data-sharing establishment (such as full-time employees, for which an employing entity is required). Also, an independent entity may find it easier to vary the rules of participation, or make other changes for the benefit of all, as the model evolves or laws change, whereas a multilateral contractual arrangement may require protracted negotiation among the various stakeholders, who each bring their own commercial objectives to the discussion.
In the corporate model, there is a degree of flexibility and scalability that is lacking from the contractual model. This model requires a greater degree of trust on the part of stakeholders, however. In conceptual terms, data providers are being asked to give up a degree of control over the data they are providing – presumably in return for some incentive or reward. They will only do so if they feel they can trust the structure or organisation that has been set up to effect this.
We consider three forms of company here: a company limited by shares, a company limited by guarantee (a CLG) and a community interest company (a CIC).
Whichever form is chosen, the company in question would operate as the data-platform owner and manager, and would enter into contractual arrangements with providers of data and proposed users.
The contractual terms would allow for:
- required investment in the company to fund infrastructure requirements such as platform development and maintenance; this could be by way of non-returnable capital contribution or loan from either the data provider or data users, as circumstances merit
- required returns on supply of data
- required charges for use of the data
- other contractual rights and obligations specific to the circumstances including access to and usage of data.
Returns and charges could be related to commercial exploitation or fixed. Also, depending on the nature of the venture, data users may be obliged to share insights gained from access to the data with the venture so that it can be shared with other data users (e.g. see the Biobank example below). The contract terms would dictate all required obligations and liabilities between the contracting parties. Bear in mind that the structure of a data-sharing venture could be adapted over time. For example, at the outset, the stakeholders may not be in a position to finance the establishment and resourcing of a corporate entity, or it may not be seen as appropriate to a data-sharing trial. As the venture scales, however, the stakeholders may determine that a corporate structure should be implemented.
1. Choice of corporate form
One of the key questions that will determine the appropriate form of company is whether the data-sharing venture is intended to be able to make a profit other than for the benefit of its own business, i.e. whether profits are required to be applied to the furtherance of its business, or whether surplus profits may be distributed as dividends to the data-sharing venture’s shareholders.
CLGs are not usually used as a vehicle for a profit-making enterprise, and a CLG’s articles of association will often (but not always) prohibit or restrict the making of distributions to members. Any profits made by a CLG will generally be applied to a not-for-profit cause such as the data-sharing venture’s purpose.
A CLG may be the most appropriate vehicle where it is not envisaged that profit or surplus generated will be distributed to its members, and it is not envisaged that the institution will seek to raise debt or equity finance. In this case activities will need to be financed by other means, such as revenue generated from its own activities (including the provision of data services) or third-party funding. If the focus changes over time to encompass more commercial activities, then establishing a trading subsidiary company limited by shares could also be considered.
It should be borne in mind that a CLG (unlike a company limited by shares) does not have share capital that it is able to show on its balance sheet. This often makes it more difficult for a CLG to raise external debt finance. The alternative possibility available to companies limited by shares, of investment by way of equity finance, is precluded here because of the structure of the CLG. Because of these difficulties, it is worth drawing attention to CICs as a further alternative corporate vehicle.
A CIC is a limited-liability company that has been formed specifically for the purpose of carrying on a business for social purposes, or to benefit a community. Although it is a profit-making enterprise, its profits are largely applied to its community purpose rather than for private gain. This is achieved by way of a cap on any movements of value from the CIC to its shareholders or members (such as by way of dividends).
This model allows shareholders to share in some of the profit, while ensuring that the CIC continues to pursue its community purpose. CICs are regulated by the Office of the Regulator of CICs (the CIC Regulator), and are required to file a community interest statement at Companies House, which is also scrutinised by the CIC Regulator. The CIC’s share capital would appear on its balance sheet, thus increasing its ability to raise external finance.
If surpluses generated by its activities (including the provision of data services) are to be applied to its business, and its financing arrangements are secure, then a CLG will likely assist in gaining traction with those stakeholders who believe that the independence of the data trust would be compromised by virtue of its ability to pay dividends to shareholders. The structure of a company limited by guarantee provides a well-established framework of governance and liability management, and avoids the risk of exposure to a proliferation of liabilities that exists in shareholding and trust environments.
A guarantor, which could be a non-government organisation (NGO) or other suitably established and populated body, could be appointed to monitor compliance and governance. This could address the requirement for oversight in a way that is specific to the requirements of the platform and data supplier, and to subjects not easily undertaken by other pre-established bodies, such as the Charity Commission or Regulator of Community Interest Companies, neither of which is specifically equipped to perform this function.
2. Governance and rules
The agreed purpose for the data-sharing venture will drive the overall governance of the data arrangement and its objectives, the rules for its operation and the parameters for all data-sharing agreements entered into. That purpose and those objectives should be reflected (including, where appropriate, as binding obligations) in its governance framework, rules and the contractual framework governing the provision and use of data.
While governance and rules are not necessarily made public documents, the greater the degree of transparency as to the data venture’s operations, the greater the level of confidence that stakeholders and the wider public will be likely to feel in its functioning. Strong and transparent governance is a critical factor in establishing trust to encourage data sharing. The rules and governance framework will underpin the purpose. Confidence that strong governance will ensure strict compliance with the rules of the trust, and will address any failings, is critical.
There needs to be confidence that the interests of all key stakeholders are represented. In a corporate model, there are a number of means of achieving this that may include board representation and/or a mix of decision-making and advisory committees representing the various interest groups. Boards and committees that are made up of trusted, respected independent members will also help engender confidence.
Depending on the circumstances and scale of the data-sharing venture, as well as an overall Governance Board, there may be an Operations Committee, a Funding Risk Advisory Committee, an Ethics Committee, a Technical Committee and a Data Committee. Alternatively, committees might be set up to represent different groups of stakeholders e.g. data providers, data users and data subjects.
With the contractual model, it would also be possible to constitute an unincorporated governance body, such as a board that comprises representatives of the stakeholders, together with some independent members who have relevant expertise. However, one can foresee potential practical difficulties with governance bodies that are more ad hoc and decentralised, including generating sufficient trust for data providers and users to submit to the jurisdiction of the body via the contractual arrangements.
The documentation will need to cover the constituent parts that make up the data-sharing venture and also, if the contractual model is adopted, how these will be constituted from among the stakeholders. Participants will need to sign up to the rules of the venture, either as a stand-alone document, or by incorporation into the operational agreements, such as a data-provision agreement or data-use agreement, or the articles of a corporate vehicle. The exact contracting arrangements will be bespoke to the specific arrangement. If the venture is intended to enable additional participants to join, there will also need to be robust arrangements (e.g. through accession agreements) to avoid re-execution of multilateral arrangements for each new joiner.
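The practical weight of accession agreements can be shown with simple arithmetic. The sketch below assumes the worst case for the alternative, in which every pair of participants would otherwise need its own bilateral contract. The function names are illustrative.

```python
def bilateral_agreements(participants: int) -> int:
    """Contracts needed if every pair of participants signs its own agreement."""
    return participants * (participants - 1) // 2


def accession_agreements(participants: int) -> int:
    """Contracts needed if each participant accedes once to a common rulebook."""
    return participants


for n in (5, 20, 50):
    print(n, bilateral_agreements(n), accession_agreements(n))
```

Under that assumption, a venture with 20 participants would need 190 bilateral agreements but only 20 accession agreements, and each new joiner adds one document rather than one per existing participant.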
The common agreement could prescribe the arrangement in broad terms: the nature of the data that will be collected; the identity or class of the persons or organisations with whom it will be shared; and the uses to which such persons or organisations will be entitled to put that data.
It can address leaver/joiner bases,4 due diligence, and terms that underpin certain values or principles: for example, the five data-access ‘control dimensions’ commonly referred to as the ‘Five Safes’5 or, in the context of personal data, the core principles contained in Article 5 of the GDPR. It can also address change approval, the financial model for the operation of the club, dispute resolution, etc.
As mentioned above, the framework documents would need to cover the purpose of the venture and the type(s) of data in issue, along with the identity of persons or entities, or types of those that may be granted access, and the use to which they may put that data.
In addition, the documents will need to cover other important areas, such as:
- technical architecture
- decision-making roles
- the obligations of each participant and how any monitoring or audit of data use, particularly in respect of personal data, will take place
- information security.
There will inevitably be other areas that the rules should also cover.
Key legal considerations include data protection and privacy law; regulatory obligations or restrictions; commercial confidentiality; intellectual property rights; careful consideration of liability flows (particularly important if personal data is in issue); competition; and external contractual obligations. As will be seen from some of the examples (detailed in the section below), such as iSHARE, it is possible to utilise existing standard documents to cover off some of the key issues, rather than developing everything from scratch. For example, existing open-source licences could be used to protect intellectual property rights of the data providers and control data usage, bolstered by data-sharing arrangements specific to the venture.
As regards the nature of the data and its use in specific circumstances, the data providers may want to share data on a segregated and controlled basis. This means there will not be access to overall aggregated data, but there may be layered access or access to a limited number of aggregated datasets to reflect any restrictions on sharing of some data (e.g. certain data only to be shared with certain users or shared for specific insights/activities). In some instances there may be agreement to pool datasets between parties. The following requirements may be set:
- each contributor would provide raw data/datasets that include but are not limited to personal data, and that data could include normal personal data as well as special category/sensitive personal data
- no contributor would see all the raw data provided by the other contributors6
- each contributor would want to be able to analyse, and to derive data and insights from aggregate datasets, without being able to identify individuals or confidential data in the datasets
- individuals whose data is shared in this way would have the usual direct rights under data protection law in relation to the processing of their personal data.
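One common way to approach the third requirement above is to release only aggregate counts and to withhold any group too small to publish safely. The following is a minimal sketch in Python, assuming records are simple dictionaries. The threshold of 5 and the function name are assumptions made for illustration, and suppression of small counts does not on its own guarantee that individuals cannot be identified.

```python
from collections import Counter

MIN_GROUP_SIZE = 5  # assumed threshold; a real venture would fix this in its rules


def aggregate_counts(records, group_key, min_group_size=MIN_GROUP_SIZE):
    """Count records per group, suppressing groups below the minimum size."""
    counts = Counter(record[group_key] for record in records)
    return {
        group: (count if count >= min_group_size else None)  # None = suppressed
        for group, count in counts.items()
    }


# Illustrative pooled data: no names or identifiers are exposed in the output.
pooled = [{"region": "North"}] * 6 + [{"region": "South"}] * 2
print(aggregate_counts(pooled, "region"))
```

In this example the "South" group contains only two records, so its count is withheld while the "North" count is released.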
Mapping data protection requirements onto a data-sharing venture
Where the data-sharing venture will involve processing of personal data, it will of course be necessary for all data providers, users and others processing personal data to comply with the GDPR (see in Annex 1 some of the key GDPR considerations). Depending on the nature of the legal structure, there will be contractual terms and also potentially a Charter/Code of Conduct or Rulebook setting out the obligations of the data providers and data users, including those relating to the GDPR. In some sectors, these may incorporate by reference internationally recognised standards for data sharing, rather than completely reinventing the wheel.7
It will be necessary for each stakeholder who processes data (whether they are a data controller, joint data controller or data processor) to ensure they are compliant with GDPR requirements. This will be determined by the individual circumstances and a particular stakeholder may well be a data controller in some regards and a joint data controller in others. Similarly, a stakeholder may be a data controller as regards some processing and a data processor in relation to others.
Privacy-enhancing technologies (PETs) are increasingly being advocated as a means to help ensure regulatory compliance and the protection of commercially confidential information more generally. Examples include technologies facilitating pseudonymisation, access control and encryption of data (in transit and at rest), and more sophisticated PETs such as differential privacy and homomorphic encryption. This is an area of development with some already mature market offerings and others still undergoing significant development.
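To make two of these techniques concrete, the sketch below shows one simple form of pseudonymisation (a keyed hash of an identifier) and one simple form of differential privacy (Laplace noise added to a count). It is illustrative only: the function names, key and parameter values are assumptions, and a production system would rely on a vetted library and proper key management. Pseudonymised data also generally remains personal data under the GDPR.

```python
import hashlib
import hmac
import random


def pseudonymise(identifier: str, secret_key: bytes) -> str:
    """Replace an identifier with a keyed hash (HMAC-SHA256).

    The same identifier and key always give the same pseudonym, so records
    can still be linked, but the key is needed to recompute the mapping.
    """
    return hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


def noisy_count(true_count: int, epsilon: float, rng: random.Random) -> float:
    """Add Laplace noise of scale 1/epsilon to a count (sensitivity assumed to be 1)."""
    scale = 1.0 / epsilon
    # The difference of two exponential draws with mean `scale` is Laplace(0, scale).
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return true_count + noise


key = b"example-key-held-by-the-steward"  # hypothetical key for illustration
print(pseudonymise("patient-0001", key))
print(noisy_count(100, epsilon=0.5, rng=random.Random()))
```

A smaller `epsilon` adds more noise and so gives stronger protection at the cost of accuracy, a trade-off that the rules of the venture would need to settle.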
Examples of data-sharing initiatives with elements of data stewardship
1. The Data Sharing Coalition
The Data Sharing Coalition is an international initiative started in January 2020, after the Dutch Ministry of Economic Affairs and Climate Policy invited the market to seek cooperation in pursuit of cross-sectoral data-sharing.8 It ‘builds on existing data-sharing initiatives to enable data sharing across domains. By enabling multilateral interoperability between existing and future data-sharing initiatives with data sovereignty as a core principle, parties from different sectors and domains can easily share data with each other, unlocking significant economic and societal value.’
It aims to foster collaboration between a wider range of stakeholders, providing a platform for structured exchange of knowledge in the data-sharing coalition community.9 It plans to explore and document generic data-sharing agreements which it will capture in a Trust Framework governed by the Coalition. It will support the development of existing and new data-sharing initiatives, including around technical standards, data semantics, legal agreements, and trustworthy and reusable digital identities.
The Data Sharing Coalition has six core principles:
- Be open and inclusive: any interested party is welcome to participate in the Data Sharing Coalition.
- Deliver practical results: the Data Sharing Coalition will deliver functional frameworks and facilities that provide true value for all stakeholders of the data economy and that will help them accelerate in their data sharing context.
- Promote data sovereignty: the Data Sharing Coalition aims to enable the entitled party(ies) to control their data by including this as a requirement in the use cases and frameworks.
- Leverage existing building blocks: all Data Sharing Coalition frameworks and facilities will incorporate international open standards, technology and other existing facilities where possible.
- Utilise collective governance: all frameworks and facilities produced by the Data Sharing Coalition will be governed in a transparent, consensus-driven manner by a collective of all Data Sharing Coalition participants.
- Be ethical, societal and compliant: all activities of the Data Sharing Coalition are in line with societal values and compliant with relevant legislation.
It has two initial use cases:
- green mortgages for investment in energy-saving measures
- improving risk management for shipment insurance.
The Data Sharing Coalition currently has about 30 member participants, including: Connect2Trust, Dexes, ECP, Equinix, FOCWA, Fortierra, GO FAIR, HDN, International Data Spaces Association (IDSA), iSHARE, KPN, Maas-Lab, MedMij, Nederlandse AI Coalitie (NLAI Coalition), NEN, Netbeheer Nederland, Nexus, NOAB, Ockto, Roseman Labs, SAE ITC, SBR, SURF, Sustainable Rescue, TanQyou, Techniek Nederland, Thuiswinkel.org, Universiteit van Amsterdam, UNSense, Verbond van Verzekeraars and Visma Connect.
2. iSHARE
iSHARE is a Dutch Transport and Logistics Trust Framework for data sharing and was developed as part of the Government-backed Data Sharing Coalition.10
It is a decentralised model, where parties maintain control of what data will be shared with whom and on what conditions/for what purpose. iSHARE is not a platform but a framework. INNOPAY co-created the iSHARE framework with about 20 organisations (customs, ports, logistics, etc). The framework itself holds only the list of participants and the fact that they have agreed to, and demonstrated conformance with, operational, technical and legal specifications; so it deals with identification, authentication and access. The idea is that an accession agreement removes the need for separate bilateral agreements.
It doesn’t appear to involve any data stewardship in the sense of a trusted third party being given control of what data is shared, for what purpose and with whom.
iSHARE is trying to make available information on, or access to, a range of agreement terms to choose from. The website has a 50-page document setting out typical agreement terms for data sharing and then links to 10–15 sets of licences, with a table for each one setting out which of those typical terms that particular licence covers.11 The aim is to have 50 sets of terms during 2020. Currently, the licence agreements include Creative Commons, Google API Licence, Montreal, ONS, Open Banking, NIMHDA, Apache, CDLA – (copyleft Linux), Open Database Copyleft, Swedish API Open Source, Microsoft Data Use Agreement and Norwegian Open Data. Currently about 20 organisations are participants.
3. Amsterdam Data Exchange
AMDEX was initiated by the Amsterdam Economic Board and was backed by Amsterdam Science Park and Amsterdam Data Science.12 The project is supported by the City of Amsterdam.
‘The Amsterdam Data Exchange (in short: Amdex) aims to provide broad access to data for researchers, companies and private individuals. Inspired by the Open Science Cloud of the European Commission, the project is intended to connect with similar projects across Europe. And eventually even become part of a global movement to share data more easily.’
Amdex’s CTO, Ger Baron, is quoted as follows: ‘Since 2011, the municipality have had an open data policy. Municipal data is from the community and must therefore be available to everyone, unless privacy is at stake. In recent years we have learned to open up data in different ways… We want to share data, but under the right conditions. This requires a transparent data market which is exactly what the Amsterdam Data Exchange can offer.’
The owner decides which data can be shared with whom and under what conditions. They build a ‘market model in which everyone is able to consult and use data in a transparent, familiar manner.’13
4. INSIGHT: The Health Data Research Hub for Eye Health
INSIGHT is a collaboration between University Hospitals Birmingham NHS Foundation Trust (lead institution), Moorfields Eye Hospital NHS Foundation Trust, the University of Birmingham, Roche, Google and Action Against AMD.
INSIGHT’s objective is to make anonymised, large-scale data, initially from Moorfields Eye Hospital and University Hospitals Birmingham, available for patient-focused research to develop new insights in disease detection, diagnosis, treatments and personalised healthcare.
Access to the datasets curated by INSIGHT is through the Health Data Research Innovation Gateway. Applications to access the data will be reviewed by INSIGHT’s Chief Data Officer and then passed to the Data Trust Advisory Board (Data TAB). The Data TAB is formed of members of the public, patients and other stakeholders joining in a private capacity.
Applications will be accepted or rejected in a transparent manner and applicants will need to sign strict licensing agreements that prioritise data security and patient benefit.
Currently the governance of INSIGHT is managed through the Advisory Board but, as discussed at the recent ODI Data Institutions event, it is anticipated that a company limited by guarantee may be created.
5. Nallian for Cargo
Nallian is a common infrastructure for data sharing between commercial sectors.14 Nallian for Air Cargo is a set of applications built on top of Nallian’s Open Data Sharing Platform. The platform allows all stakeholders of a cargo community to connect and share relevant data across their processes, resulting in de-duplication and a single version of the truth for the benefit of airport operators, ground handlers, freight forwarders, shippers, etc. Each data source stays in control of who sees which parts of its data for which purpose. Example communities include Heathrow, Brussels and Luxembourg (e.g. Heathrow Cargo Cloud).15
6. Pistoia Alliance
The Pistoia Alliance’s mission is to lower barriers to R&D innovation by providing a legal framework to enable straightforward and secure pre-competitive collaboration.16 The Alliance is a global, not-for-profit members’ organisation conceived in 2007 and incorporated in 2009 by representatives of AstraZeneca, GSK, Novartis and Pfizer, who met at a conference in Pistoia, Italy.
The Pistoia Alliance’s projects help to overcome common obstacles to innovation and to transform R&D – whether identifying the root causes of inefficiencies, working with regulators to adopt new standards, or helping researchers implement AI effectively. There are currently more than 100 member companies – ranging from global organisations, to medium enterprises, to start-ups, to individuals – collaborating as equals on projects that generate value for the worldwide life sciences community.
7. UK Biobank
Biobanks collect biological samples and associated data for medical-scientific research and diagnostic purposes and organise these in a systematic way for use by others.17 The UK Biobank is a registered charity that had initial funding of circa £62 million. Its aim is to improve the prevention, diagnosis and treatment of a wide range of serious and life-threatening illnesses such as cancer, heart disease and dementia.
UK Biobank was established by the Wellcome Trust medical charity, Medical Research Council, Department of Health, Scottish Government and the Northwest Regional Development Agency. It has also had funding from relevant charities. UK Biobank is supported by the National Health Service (NHS). Researchers apply to access its resources. The resource is available to all bona fide researchers for all types of health-related research that is in the public interest. Researchers submit an application explaining what data they would like access to and for what purpose. The website provides summaries of funded research and academic papers.
Researchers have to pay for access to the resource on a cost-recovery basis for their proposed research, with a fixed charge for initiating the application review process and a variable charge depending on how many samples, tests and/or data are required for the research project.
- UK Biobank remains the owner of the database and samples, but will have no claim over any inventions that are developed by researchers using the resource (unless they are used to restrict health-related research or access to health-care unreasonably).
- Researchers granted access to the resource are required to publish their findings and return their results to UK Biobank so that they are available for other researchers to use for health-related research that is in the public interest.
The personal information of those joining the UK Biobank is held in strict confidence, so that identifiable information about them will not be available to anyone outside of UK Biobank. Identifying information is retained by UK Biobank to allow it to make contact with participants when required and to link with their health-related records. The level of access that is allowed to staff within UK Biobank is controlled by unique usernames and passwords, and restricted on the basis of their need to carry out particular duties.
8. Higher Education Statistics Agency
The Higher Education Statistics Agency (HESA) is the body responsible for collecting and publishing detailed statistical information about the UK’s higher education sector.18 It acts as a trusted steward of data that is made available and used by public-sector bodies including universities, public-funding bodies and the new Office for Students.
HESA was set up by agreement between funding councils, higher education providers and Government departments. It is a charitable company operating under a statutory framework and it is a recognised data source for ‘statistical information on all aspects of UK higher education’.19 It was confirmed as a designated data body (DDB) for Higher Education in England in 2018.20
HESA collects, assures and disseminates higher education data on behalf of specific public bodies, e.g. the Department for Business, Energy and Industrial Strategy (DBEIS), Department for Education (DfE), Office for Students (OfS), UK Research & Innovation (UKRI) and its counterparts in the rest of the UK. As DDB, it compiles appropriate information about higher education providers and courses and makes this available to OfS, UKRI and the Secretary of State for Education. It consults providers, students and graduate employers on the information it publishes. OfS holds HESA to account, reporting on its performance every three years.
HESA provides a trusted source of information, supporting better decision making, and promoting public trust in higher education. In addition, it is driven by the wider public purpose of advancing higher education in the UK.
It deploys statistical and open-data techniques to transform and present higher education data. It looks to develop low-cost techniques to improve quality and efficiency of data collection, and aims to ensure as much data as possible is open and accessible to all.
HESA may charge cost-based fees, operating on a subscription basis.
9. Safe Havens Scotland NHS Trusts for Patient Data
Safe Havens were developed in line with the Scottish Health Informatics Programme (SHIP), a blueprint that outlined a programme for a Scotland-wide research platform for the collation, management, dissemination and analysis of anonymised Electronic Patient Records (EPRs).21 The agreed principles and standards to which the Safe Havens are required to operate are set out in the Safe Haven Charter. They aim to obtain research funding from grants.
The Safe Havens provide a virtual environment for researchers to securely analyse data without the data leaving the environment. Their data repositories provide secure handling and linking of data from multiple sources for research projects. They also provide research support, bringing together teams around health data science. The research coordinators support researchers in navigating the data requirements and permissions landscape, and provide a mechanism to share the lessons from one project to the next. Users are researchers who are vetted and approved. Data is never released, and personal data cannot be sold. Together, the National Safe Haven within Scottish Informatics Linkage Collaboration (SILC)22 and the four NHS Research Scotland (NRS) Safe Havens have formed a federated network of Safe Havens in order to work collaboratively to support health informatics research across Scotland.
All the Safe Havens have individual responsibility to operate at all times in full compliance with all relevant codes of practice, legislation, statutory orders and in accordance with current good professional practice. Each Safe Haven may also work independently to provide advice and assistance to researchers as well as secure environments, to enable health informatics research on the pseudonymised research datasets they create. The charter and the network facilitate collaboration between the Safe Havens by ensuring that they all work to the same principles and standards.
Problems and opportunities addressed by corporate and contractual mechanisms
Many organisations have started to explore data sharing via the use of contracts, and this model is already used in practice. The complexity of the governance model will vary depending on whether the relationships involved are one-to-one or multi-party data-sharing arrangements, and whether there is a single use case or multiple uses for the same type of purpose. Where tools such as machine learning or AI become part of the agreement, further consideration is needed in defining the architecture of the legal mechanisms involved.
Multi-party and multi-use scenarios using corporate and contractual mechanisms will need to ensure an independent governance body is able to function within the structure. The role of the specific parties involved in the data ecosystem, their responsibilities, qualifications and potential competing interests will need to be considered and balanced. A difficult question emerges where the stewardship entity is absent. In this scenario, who would be the data steward with whom a contract could be entered into? For example, an oversight committee composed of representatives of data users and providers could be established, but this would not be a legal entity with an ability to contract.
Other requirements that will need thoughtful consideration, as mentioned throughout this chapter, are connected to the privacy and security of the data, the retention and deletion policy, restrictions on use and onward transfers, and rules on the publication of results or research.
To conclude, a series of steps needs to be walked through with stakeholders to reach an agreed decision about the model to be employed. Concrete use cases are more likely to generate tangible and efficient mechanisms for the sharing of data than vague overarching statements of general purpose. The key element here is stakeholder engagement: the more engagement that can be encouraged at the design stage – in terms of purpose, structure and governance – the more likely it is that a data-sharing venture institution will succeed.
Image credit: Who_I_am
- Directive 2002/58/EC of the European Parliament and of the Council of 12 July 2002 concerning the processing of personal data and the protection of privacy in the electronic communications sector (Directive on privacy and electronic communications). Available at: https://eur-lex.europa.eu/legal-content/EN/ALL/?uri=CELEX%3A32002L0058 [Accessed 18 Feb. 2021].
- Directive 2015/2366 of the European Parliament and of the Council of 25 November 2015 on payment services in the internal market. Available at: https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX:32015L2366 [Accessed 18 Feb. 2021].
- Article 11 and Recital 25 of the draft Data Governance Act include requirements for data-sharing services to be placed in a separate legal entity. This is required both in business-to-business data sharing as well as in business-to-consumer contexts where separation between data provision, intermediation and use needs to be provided. The text does not distinguish between closed or open groups.
- In order to improve the chances of participation, and where technically feasible, the exit arrangements for leavers should focus on the ability of a participant to leave the venture and remove their data. This respects the data sovereignty of the participant and enables them to remain in control of data, particularly important for personal data as participants will be conscious of their obligations under GDPR.
- The ‘Five Safes’ comprise: safe projects, safe people, safe data, safe settings and safe outputs. Ritchie, F. (2017). The “Five Safes”: a framework for planning, designing and evaluating data access solutions. [online] Zenodo. Available at: https://zenodo.org/record/897821 [Accessed 18 Feb. 2021].
- As part of the stewardship model, one of the protections should be that only the data needed for an activity is accessed by other participants/stakeholders.
- An example is the Rules of Participation used by Health Data Research UK (HDR UK). Organisations requesting data access from one of the hubs set up through HDR UK (including the INSIGHT hub) are required to commit to these rules, which reference published standards. See Health Data Research UK (2020). Digital Innovation Hub Programme Prospectus Appendix: Principles For Participation. [online]. Available at: www.hdruk.ac.uk/wp-content/uploads/2019/07/Digital-Innovation-Hub-Programme-Prospectus-Appendix-Principles-for-Participation.pdf [Accessed 18 Feb. 2021].
- See The Data Sharing Coalition (n.d.) Home. [online]. Available at: https://datasharingcoalition.eu.
- The Data Sharing Coalition published an exploration on standards and agreements for enabling data sharing. See Data Sharing Coalition (2021). Harmonisation Canvas [online]. Available at: https://datasharingcoalition.eu/app/uploads/2021/02/210205-harmonisation-canvas-v05-1.pdf
- See Support Centre for Data Sharing (2020). iSHARE: Sharing Dutch transport and logistics data. [online] Support Centre for Data Sharing. Available at: https://eudatasharing.eu/examples/ishare-sharing-dutch-transport-and-logistics-data [Accessed 18 Feb. 2021].
- Support Centre for Data Sharing (2019). Report on collected model contract terms. [online]. Available at: https://eudatasharing.eu/sites/default/files/2019-10/EN_Report%20on%20Model%20Contract%20Terms.pdf [Accessed 18 Feb. 2021].
- For more information see Amsterdam Smart City (2020). Amsterdam Data Exchange [online]. Available at: https://amsterdamsmartcity.com/updates/project/amsterdam-data-exchange-amdex [Accessed 18 Feb. 2021].
- For more information see Nallian (2020). Home. [online] Available at: www.nallian.com [Accessed 18 Feb. 2021].
- For more information see Heathrow (2020). Cargo. [online] Available at: www.heathrow.com/company/cargo [Accessed 18 Feb. 2021].
- For more information see Pistoia Alliance (2020). About. [online]. Available at: www.pistoiaalliance.org/membership/about [Accessed 18 Feb. 2021].
- For more information see UK Biobank (2020). Home. [online]. Available at: www.ukbiobank.ac.uk [Accessed 18 Feb. 2021].
- For more information see HESA (2020). About. [online] Available at: www.hesa.ac.uk/about
- HESA (2017). HE representatives comment on consultation on designated data body [online] hesa.ac.uk. Available at: www.hesa.ac.uk/news/19-10-2017/consultation-designated-data-body [Accessed 18 Feb. 2021].
- See HESA (2020). Designated Data Body. [online]. Available at: www.hesa.ac.uk/about/what-we-do/designated-data-body [Accessed 18 Feb. 2021].
- Scottish Government (2015). Charter for Safe Havens in Scotland: Handling Unconsented Data from National Health Service Patient Records to Support Research and Statistics. [online] www.gov.scot. Available at: www.gov.scot/publications/charter-safe-havens-scotland-handling-unconsented-data-national-health-service-patient-records-support-research-statistics/pages/3 [Accessed 18 Feb. 2021].
- For more information see Data Linkage Scotland (2020). Home. [online] Available at: www.datalinkagescotland.co.uk [Accessed 18 Feb. 2021].