This piece looks at data trusts and data trust frameworks.
Data trusts are gaining traction as an innovative way to facilitate trusted data sharing. The idea came to public attention in the October 2017 Hall/Pesenti ‘Growing the UK AI Industry’ report, which described data trusts as:
“a set of relationships underpinned by a repeatable framework, compliant with parties’ obligations, to share data in a fair, safe and equitable way”.
Passing the baton to the Open Data Institute (‘ODI’), the Report’s top recommendation was that:
“Government and industry should deliver a programme to develop Data Trusts – proven and trusted frameworks and agreements – to ensure exchanges are secure and mutually beneficial”.
Although incubated in AI, data trusts have broader potential across the field of data science and, more generally, in helping organisations manage data sharing responsibilities around GDPR/personal data, non-personal data, cloud/information security and governance as well as AI deployment/ethics. The ODI found in research in April 2019:
“that there is huge demand from private, public and third sector organisations in countries around the world to explore data trusts. Whilst organisations have different ideas about what data trusts could do, they are nevertheless enthusiastic and eager to find ways of sharing data whilst retaining trust, and still deriving benefits for themselves and others.”
In July 2019, the ICO endorsed this view in its draft data sharing code of practice consultation:
“There is a great deal of interest, both in the UK and internationally, in the concept of ‘data trusts’. … In essence they are a new model to enable access to data by new technologies (such as artificial intelligence), while protecting other interests and retaining trust, and following a “privacy by design” approach. They have potential for use in data sharing”.
Towards a definition of data trust
The ODI has done the heavy lifting around what a data trust is. It found the term interpreted variously as a ‘repeatable framework of terms and mechanisms’, ‘mutual organisation’, ‘legal structure’, ‘store of data’ and ‘public oversight of data access’, before coming down in favour of ‘a legal structure that provides independent stewardship of data’. As well as expecting a data trust to align with its principles for good data infrastructure, the ODI set out six characteristics that it believes a data trust should have:
- a clear purpose;
- a legal structure ‘including trustors, trustees with fiduciary duties and beneficiaries’;
- rights and duties over stewarded data;
- a defined decision-making process;
- a description of how benefits are shared; and
- sustainable funding.
In the commercial arena – likely to be where many data trusts will operate – there are two initial issues with the ODI’s suggested definition. First, advocating a legal structure implies a separate legal entity, which in turn imposes formalities that may not be necessary in all use cases, particularly where a similar result may more simply be obtained through an ecosystem of clearly defined contract terms that each participant expressly accepts.
Second, as commercial lawyers, we’re taught to steer clear of fiduciary duties where we can. This is because fiduciary duties are onerous and challenging to calibrate precisely, and because the remedies for breach of fiduciary duty are more extensive than for breach of contract. This is not to downplay data stewards’ responsibilities in any way, but to repeat that clearly expressed contractual rights and duties can achieve a similar (or better) result. Contractual terms are also easier to negotiate, and the risks they create are easier to insure.
The answer may be to accept that ‘data trust, the framework’ (what the Hall/Pesenti report described as a set of contractual relationships underpinned by a repeatable, legally compliant framework) can live alongside ‘data trust, the entity’ (proposed by the ODI) and that each may have a role to play in particular use cases.
What does a data trust framework (‘DTF’) look like?
Essentially, we’d see a DTF as a legal framework and a set of common operating rules, technical specifications and interfaces (APIs) applying for the DTF’s particular purposes and agreed between all the participants of the IT ecosystem concerned. Together, the legal and operating rules, specs and interfaces enable and manage all ‘lifecycle’ activities for the data concerned (acquisition, flow, storage, use, sharing, consumption and deletion) within the ecosystem.
The DTF is underpinned by a standardised approach to data categorisation, data management and data governance. Data categorisation uses a set of commonly defined terms to describe all use cases, that is, each data processing and data sharing activity within the ecosystem. Data management takes the key step of recognising data as a business asset (or liability) with a value (or risk) to ecosystem participants. Looking at data through the ‘asset/value’ lens in this way enables impact to be assessed and appropriate decisions taken on personal data (should it be hashed or anonymised? what are the AI ethics risks? what does the DPIA say?) and other data in the ecosystem. Data governance frames issues, decisions and responsibilities for the Board/senior stakeholders.
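To make the categorisation and ‘asset/value’ steps a little more concrete, here is a minimal Python sketch. It is not drawn from any particular DTF: the category names, the HANDLING_RULES table and the prepare_for_sharing function are all hypothetical, and a real framework would take its categories from an agreed standard and its handling decisions from the DPIA. Note also that keyed hashing of the kind shown is generally regarded as pseudonymisation rather than anonymisation, so the output would usually still be personal data.

```python
import hashlib
import hmac
from dataclasses import dataclass
from enum import Enum


class Category(Enum):
    """Hypothetical ecosystem-wide data categories (an assumption for illustration)."""
    PERSONAL_IDENTIFIER = "personal_identifier"
    PERSONAL_ATTRIBUTE = "personal_attribute"
    NON_PERSONAL = "non_personal"


class Handling(Enum):
    """Hypothetical handling decisions agreed by the participants."""
    SHARE_AS_IS = "share_as_is"
    PSEUDONYMISE = "pseudonymise"
    WITHHOLD = "withhold"


# Assumed rule table: one handling decision per category.
HANDLING_RULES = {
    Category.PERSONAL_IDENTIFIER: Handling.PSEUDONYMISE,
    Category.PERSONAL_ATTRIBUTE: Handling.WITHHOLD,
    Category.NON_PERSONAL: Handling.SHARE_AS_IS,
}


@dataclass(frozen=True)
class DataField:
    name: str
    category: Category
    value: str


def pseudonymise(value: str, key: bytes) -> str:
    """Replace a value with a keyed SHA-256 digest (HMAC), as a hex string."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def prepare_for_sharing(fields, key: bytes) -> dict:
    """Apply the agreed handling rule to each field before it leaves its owner."""
    shared = {}
    for item in fields:
        handling = HANDLING_RULES[item.category]
        if handling is Handling.SHARE_AS_IS:
            shared[item.name] = item.value
        elif handling is Handling.PSEUDONYMISE:
            shared[item.name] = pseudonymise(item.value, key)
        # Handling.WITHHOLD: the field is simply left out of the shared record.
    return shared
```

The point of the sketch is only that, once every field carries an agreed category, the sharing decision becomes a lookup rather than a case-by-case negotiation.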
Standardising the approach to data categorisation, management and governance enables DTFs to be built from componentry using ISO/IEC and other internationally agreed technical standards. For example, the ISO/IEC has published:
- ISO/IEC 19944 (on data categories, flows and use for cloud services and devices);
- ISO/IEC 29134 (on privacy impact assessment guidelines) and ISO/IEC 29151 (code of practice for PII/personal data); and
- ISO/IEC 38505-1 (on data governance for the organisation).
The ISO/IEC has also established JTC 1/SC 42 on AI, which has a number of AI standards in the pipeline.
Combining an approach based on technical standards with design sprints and usability workshops means that a particular DTF can be constructed quickly, that DTF can be modified for other use cases efficiently, and that different DTFs can work together.
Examples of data trusts and DTFs
Although DTFs look set to proliferate in the months ahead, they are currently few and far between. Examples include:
Silicon Valley Regional Data Trust. SVRDT aggregates and uses:
“data from different educational organisations in California and seeks to enable the use of data currently siloed in different organisations for purposes including policy, research and case management.”
Trūata. Trūata, which counts MasterCard and IBM as foundational partners:
“enables its clients to derive the maximum value from their data assets while complying with the highest data protection standards. Offering its clients a service to independently anonymise data, enabling them to conduct privacy-enhanced analytics to drive business growth, uphold customer trust and protect brand reputation.”
SITA BagTrust. SITA, the leading IT provider to the air transport industry, offers a range of services that ‘track baggage like a parcel’. SITA is planning to launch BagTrust as a new feature of its baggage services for its airline customers. BagTrust will enable each airline to manage its GDPR policy by setting preferences for data sharing and deciding which airport partner will have access to which pieces of data. The range of preferences is to be based on published rules that SITA has developed using its domain expertise. This ‘published rules’ approach is preferred to specific contract terms to foster greater transparency and trust.
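As an illustration only, a ‘published rules’ model of this kind might be sketched in Python as follows. Nothing here reflects SITA’s actual design or API: the PUBLISHED_RULES table, the partner roles, the field names and the AirlinePreferences class are all assumptions made for the example. The idea shown is that the published rules set the outer limit of what any partner role may receive, and each airline then chooses its own preferences within that limit.

```python
from dataclasses import dataclass, field

# Hypothetical published rules: the most that each partner role may ever receive.
PUBLISHED_RULES = {
    "departure_airport": {"bag_tag", "flight_number", "passenger_name"},
    "transfer_airport": {"bag_tag", "flight_number"},
}


@dataclass
class AirlinePreferences:
    """One airline's sharing choices, constrained by the published rules."""
    grants: dict = field(default_factory=dict)  # partner -> set of field names

    def grant(self, partner: str, role: str, fields: set) -> None:
        """Record which fields a partner may see, rejecting anything the rules forbid."""
        not_permitted = set(fields) - PUBLISHED_RULES[role]
        if not_permitted:
            raise ValueError(
                f"published rules do not allow {sorted(not_permitted)} for role {role!r}"
            )
        self.grants[partner] = set(fields)

    def view_for(self, partner: str, record: dict) -> dict:
        """Return only the parts of a baggage record that the partner may see."""
        permitted = self.grants.get(partner, set())
        return {name: value for name, value in record.items() if name in permitted}
```

A partner with no grant sees nothing, and an airline cannot grant more than the published rules allow, which is where the transparency benefit over bespoke contract terms would come from.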
HMG data trust pilot. In January 2019, the UK government announced it was investing £700,000 in a “world-first ‘data trust’ programme to be piloted in the UK”, with three initiatives including WILDLABS Tech Hub to tackle illegal wildlife poaching and WRAP to address food waste.
Data law and compliance issues are proliferating around personal data/GDPR, non-personal data, AI deployment/ethics, cloud/information security and data governance. Data volumes are growing by 30% to 40% each year. As a tool to help manage data sharing issues and in the face of this exponential growth, data trusts and DTFs appear to be on the cusp of widespread adoption with great potential as a practical and workable way forward. We will be hearing a lot more about them in the months ahead.