Data is often the fuel that powers AI used by organisations. It tailors search parameters, spots behavioural trends, and predicts possible future outcomes (to highlight just a few uses). In response, many of these organisations seek to accumulate and use as much data as possible, in order to make their systems work that little bit faster or more accurately.
In many cases, provided the data is not subject to copyright or other such restrictions, this raises few issues: organisations are able to amass large quantities of data that can be used initially to train their AI systems or, after deployment, to keep their datasets updated so that the latest and most accurate data is used.
This becomes a potential issue when the data being collected and used is personal information. For example, the principle of ‘data minimisation’ requires that only the necessary amount and type of personal data is used to develop an AI system. This is at odds with the ‘data hoarding’ corporate mentality described above, which seeks to know as much detail as possible. Furthermore, the principle of ‘purpose limitation’ places several restrictions on the re-use of historic data sets to train AI systems. This may cause particular headaches when working with an AI vendor that wishes to further commercialise an AI which has benefited from learnings and developments drawn from your data, in a way that goes beyond the purpose for which the data was originally provided.
The Information Commissioner’s Office (“ICO”), the UK’s data regulator, has however acknowledged that AI and personal data will forever be interlinked, unavoidably so in certain situations. In response, in November 2022, the ICO released guidance on how organisations can use AI and personal data appropriately and lawfully, in accordance with the data privacy regime of the UK. The guidance is supplemented by answers to a number of frequently raised questions about combining AI with personal data, including whether to carry out an impact assessment, whether outputs need to comply with the principle of accuracy, and whether organisations need permission to analyse personal data.
In this article we discuss some of the key recommendations in the context of the wider regulatory landscape for data and AI.
The guide offers eight methods organisations can use to improve their handling of AI and personal information.
Take a risk-based approach when developing and deploying AI:
A first port of call for organisations should be an assessment of whether AI is actually needed for the intended purpose. For the purposes of the proposed EU AI Regulation (“AI Act”) (and likely a similar category within the developing UK framework), most AI will typically fall within the remit of ‘high-risk’ if it engages with personal information. This will result in additional obligations and measures that the organisation will be required to follow in its deployment of the AI. The ICO therefore recommends a less technical and more privacy-preserving alternative where possible.
Should AI be chosen after this, a data protection impact assessment should be carried out to identify and minimise the data risks that the AI poses to data subjects, and to mitigate the harm it may cause. At this stage the ICO also recommends consulting different groups who may be impacted by the use of AI in this context, to better understand the potential risks.
Consider how decisions can be explained to the individuals affected:
As the ICO notes, it can be difficult to explain how AI arrives at certain decisions and outputs, particularly in the case of machine learning and complex algorithms where input values and trends change based on the AI’s ability to learn and teach itself based on the data it is fed.
Where possible, the ICO recommends that organisations:
- be clear and open with subjects on how and why personal data is being used;
- consider what explanation is needed in the context that the AI will be deployed;
- assess what explanations are likely to be expected;
- assess the potential impact of AI decisions to understand the detail required in explanations; and
- consider how individual rights requests will be handled.
The ICO has acknowledged that this is a difficult area of data privacy and has provided detailed guidance, co-badged with the Alan Turing Institute, on “Explaining decisions made with AI”.
Limit data collection to only what is needed:
Contrary to a belief widely held by organisations, the ICO recommends that data is kept to a minimum where possible. This does not mean that data cannot be collected, but rather that appropriate consideration must be given to the data that is collected and retained.
Organisations should therefore:
- ensure that the personal data used is accurate, adequate, relevant and limited, based on the context of the use of the AI; and
- consider which techniques can be used to preserve privacy as much as practical. For example, as the ICO notes, synthetic data or federated learning could be used to minimise the personal data being processed.
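By way of illustration only, the sketch below shows one simple way a data team might apply minimisation before training: keeping only the fields a model actually needs and replacing the direct identifier with a keyed hash. This is not one of the techniques named by the ICO (synthetic data and federated learning are considerably more involved), and the field names, function name and key handling are all hypothetical. Note that a keyed hash is pseudonymisation rather than anonymisation, so the output remains personal data.

```python
import hashlib
import hmac

# Hypothetical list of fields judged necessary for the model's purpose.
NEEDED_FIELDS = {"age_band", "region", "account_tenure_months"}


def minimise_record(record, needed_fields, id_field, secret_key):
    """Return a copy of `record` containing only the needed fields,
    plus a pseudonym derived from the direct identifier.

    The pseudonym is a keyed hash (HMAC-SHA256), so the same person
    maps to the same value without the raw identifier being stored.
    """
    pseudonym = hmac.new(
        secret_key,
        str(record[id_field]).encode("utf-8"),
        hashlib.sha256,
    ).hexdigest()
    # Drop every field that is not on the agreed list.
    minimised = {k: v for k, v in record.items() if k in needed_fields}
    minimised["pseudonym"] = pseudonym
    return minimised


# Illustrative usage with made-up data.
raw = {
    "customer_id": 123,
    "name": "A. Example",
    "email": "a@example.com",
    "age_band": "30-39",
    "region": "NW",
    "account_tenure_months": 14,
}
training_row = minimise_record(raw, NEEDED_FIELDS, "customer_id", b"demo-key")
```

In practice the list of needed fields would be justified and documented as part of the impact assessment, and the key would be stored separately from the training data.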
It should be noted that data protection’s accuracy principle does not mean that an AI system needs to be 100% statistically accurate (which is unlikely to be practically achievable). Instead, organisations should factor in the possibility of inferences or decisions being incorrect, and put processes in place to ensure fairness and overall accuracy of outcome.
Address risks of bias and discrimination at an early stage:
A persistent concern throughout many applications of AI, particularly those interacting with sensitive data, is bias and discrimination. This is made worse where the training data is dominated by one trend, as the biases present in such data will form part of the essential decision-making process of the AI, thereby ‘hardwiring’ bias into the system. All steps should therefore be taken to obtain as much variety as possible within the data used to train AI systems, to the extent that it accurately reflects the wider trend.
To better understand this issue, the ICO recommends that organisations:
- assess whether the data gathered is accurate, representative, reliable, relevant, and up to date in respect of the population or different sets of people to which the AI will be applied; and
- map out consequences of the decisions made by the AI system for different groups and assess whether these are acceptable from a data privacy regulatory standpoint as well as internally.
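The two checks above could be approached, in a very simplified form, along the following lines. This is a minimal sketch under stated assumptions: the group labels, reference population shares and function names are hypothetical, and what counts as an acceptable gap is a legal and policy judgment that code cannot make.

```python
from collections import Counter


def representation_gaps(training_groups, population_shares):
    """For each group, return (share in training data) minus
    (share in the reference population). A positive value means the
    group is over-represented in the training data."""
    if not training_groups:
        raise ValueError("training_groups must not be empty")
    counts = Counter(training_groups)
    total = len(training_groups)
    return {
        group: counts.get(group, 0) / total - share
        for group, share in population_shares.items()
    }


def favourable_rate_by_group(decisions):
    """`decisions` is an iterable of (group, favourable) pairs, where
    `favourable` is True if the AI's decision benefited the individual.
    Returns the proportion of favourable decisions per group."""
    totals, favourable = Counter(), Counter()
    for group, ok in decisions:
        totals[group] += 1
        if ok:
            favourable[group] += 1
    return {group: favourable[group] / totals[group] for group in totals}


# Illustrative usage with made-up data.
gaps = representation_gaps(["A"] * 8 + ["B"] * 2, {"A": 0.5, "B": 0.5})
rates = favourable_rate_by_group(
    [("A", True), ("A", True), ("A", False),
     ("B", True), ("B", False), ("B", False), ("B", False)]
)
```

A large representation gap, or a marked difference in favourable rates between groups, would not by itself establish unlawful discrimination, but it would indicate where closer review is needed.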
Where AI does produce biased or discriminatory decisions, this is likely to conflict with the requirement for processing of personal data to be fair, as well as with obligations under several other more specific regulatory frameworks. A prime example of this is the Equality Act 2010, which prohibits discrimination on the grounds of protected characteristics, by AI or otherwise. Organisations should take care to ensure that decisions are made in a way that avoids repercussions under the wider data privacy and AI regimes, as well as those specific to the sectors and activities in which they are involved.
Dedicate time and resources to preparing data:
As noted above, the quality of an AI’s output is only going to be as good as the data it is fed and trained with. Organisations should therefore ensure sufficient resources are dedicated to preparing the data to be used.
As part of this process, organisations should expect to:
- create clear criteria and lines of accountability about the labelling of data involving protected characteristics and/or special category data;
- consult members of protected groups where applicable to define the labelling criteria; and
- involve multiple human labellers to ensure consistency of categorisation and delineation and to assist with fringe cases.
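Consistency between human labellers can be measured rather than assumed. The sketch below uses Cohen's kappa, a standard agreement statistic for two labellers that corrects for agreement expected by chance. The ICO guidance as summarised here does not prescribe any particular metric, so this choice, along with the function names and sample labels, is an assumption made for illustration.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two labellers who labelled the same items.
    1.0 means perfect agreement; 0.0 means agreement no better than
    chance."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and equal in length")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement if each labeller assigned labels independently
    # according to their own label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(
        (freq_a[label] / n) * (freq_b[label] / n)
        for label in set(freq_a) | set(freq_b)
    )
    if expected == 1.0:
        # Both labellers used one identical label throughout.
        return 1.0
    return (observed - expected) / (1 - expected)


def disagreements(labels_a, labels_b):
    """Indices of items the two labellers labelled differently, for
    escalation to a further reviewer."""
    return [i for i, (a, b) in enumerate(zip(labels_a, labels_b)) if a != b]


# Illustrative usage with made-up labels.
first = ["x", "x", "y", "y"]
second = ["x", "x", "y", "x"]
kappa = cohens_kappa(first, second)
to_review = disagreements(first, second)
```

A low kappa suggests that the labelling criteria are unclear and should be revisited, while the list of disagreements identifies the fringe cases that need a further view.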
Ensure AI systems are made and kept secure:
It should be of little surprise that the addition of new technologies can create new security risks (or exacerbate current ones). In the context of the AI Act and UK data privacy regulation (and indeed when a more established UK AI regime emerges), organisations are/will be legally required to implement appropriate technical and organisational measures to ensure suitable security protocols are in place for the risk associated with the information.
In order to do this, organisations could:
- complete security risk assessments to create a baseline understanding of where risks are present;
- carry out model debugging on a regular basis; and
- proactively monitor the system and investigate any anomalies (in some cases, the AI Act and any future UK AI framework may require human oversight as an additional protective measure regardless of the data privacy requirement).
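As a rough illustration of what proactive monitoring might involve, the sketch below compares recent values of a monitored metric (for example, a daily rejection rate) against a historical baseline and flags values that deviate sharply. The metric, the threshold of three standard deviations and the function name are all hypothetical choices; a production system would typically use more robust methods and would route any flags to a person for investigation.

```python
import statistics


def flag_anomalies(baseline, recent, threshold=3.0):
    """Return the indices of values in `recent` that lie more than
    `threshold` standard deviations from the mean of `baseline`.
    `baseline` must contain at least two values."""
    mean = statistics.fmean(baseline)
    stdev = statistics.stdev(baseline)
    if stdev == 0:
        # A perfectly flat baseline: any change at all is flagged.
        return [i for i, value in enumerate(recent) if value != mean]
    return [
        i for i, value in enumerate(recent)
        if abs(value - mean) / stdev > threshold
    ]


# Illustrative usage with made-up daily rejection rates.
history = [0.10, 0.12, 0.11, 0.09, 0.10, 0.11, 0.10]
this_week = [0.10, 0.11, 0.30]
flagged = flag_anomalies(history, this_week)
```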
Human review of AI outcomes should be meaningful:
Depending on the purpose of the AI, it should be established early on whether the outputs are being used to support a human decision-maker or whether decisions are solely autonomous. As the ICO highlights, data subjects deserve to know whether decisions involving their data have been made purely autonomously, or with the assistance of AI. Where outputs are being used to assist a human, the ICO recommends that those outputs are reviewed in a meaningful way.
This would therefore require that reviewers are:
- adequately trained to interpret and challenge outputs made by AI systems;
- sufficiently senior to have the authority to override automated decisions; and
- able to account for additional factors that weren’t included as part of the initial input data.
Data subjects have the right under the UK GDPR not to be subject to a solely automated decision, where that decision has a legal or similarly significant effect, and also have the right to receive meaningful information about the logic involved in the decision. Therefore, although worded as a recommendation, where AI is making significant decisions, meaningful human review becomes a requirement (or at least must be available on request).
Work with external suppliers involved to ensure that AI is used appropriately:
A final recommendation offered by the ICO is that, where AI is procured from a third party, the organisation should work with that third party to ensure the AI is used appropriately. While it is usually the organisation’s responsibility (as controller) to comply with all regulations, this can be achieved more effectively with the involvement of those who create and supply the technology.
In order to comply with the obligations of both the AI Act and relevant data privacy regulations, organisations would therefore be expected to:
- choose a supplier by carrying out the appropriate due diligence ahead of procurement;
- work with the supplier to carry out assessments prior to deployment, such as impact assessments;
- agree and document roles and responsibilities with the external supplier, such as who will answer individual rights requests;
- request documentation from the external supplier that demonstrates they implemented a privacy by design approach; and
- consider any international transfers of personal data.
When working with some AI providers, for example larger providers who develop AI for a wide range of applications and also offer services to tailor their AI solutions for particular customers (and to commercialise the resulting learnings), it may not be clear whether they are a processor or a controller (or even a joint controller with the client for some processing). Where that company has enough freedom to use its expertise to decide what data to collect and how to apply its analytic techniques, it is likely to be a data controller as well.