Anonymisation has always been, and remains, a real challenge for those carrying out clinical research. To shed some light on the matter, the Medical Research Council (MRC), which is part of UK Research and Innovation, has recently published guidance on Identifiability, anonymisation and pseudonymisation (the guidance). Although the guidance states that it was developed with the participation of the Information Commissioner’s Office (ICO), it is not ICO-approved, so institutes and organisations should be cautious when relying on the criteria it sets out.
Identifiability of information
The guidance recognises that the law is binary on the definition of data in the sense that data is either personal data (subject to the General Data Protection Regulation 2016/679 (GDPR)) or anonymised (and, therefore, outside the scope of the GDPR). However, identifiability is far from being a binary concept. The guidance points out that identifiability is a continuum, which varies from direct, real-world identifiers (e.g. name, email addresses, NHS numbers and the Community Health Index in Scotland), through “jigsaw identifiability” (in other words, putting together pieces of information to identify individuals), to complete anonymity.
The guidance acknowledges that obtaining total anonymity of person-level information may be impossible and, particularly, that almost no person-level information, rich enough to be useful for research, could be considered inherently anonymous.
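To make “jigsaw identifiability” concrete, the short Python sketch below counts how many records share each combination of indirect identifiers. It is an illustration only and is not taken from the guidance: the field names (postcode_district, birth_year, sex) and the function names are hypothetical, and counting group sizes is a common check from the statistical disclosure control literature rather than a test the MRC prescribes. A record whose combination is unique is the kind of record that could be singled out by putting pieces of information together.

```python
from collections import Counter


def group_sizes(records, quasi_identifiers):
    """For each record, count the records sharing its quasi-identifier values."""
    def key(record):
        return tuple(record[q] for q in quasi_identifiers)

    sizes = Counter(key(r) for r in records)
    return [sizes[key(r)] for r in records]


def unique_records(records, quasi_identifiers):
    """Return the records whose quasi-identifier combination occurs only once."""
    sizes = group_sizes(records, quasi_identifiers)
    return [r for r, n in zip(records, sizes) if n == 1]


# Hypothetical person-level records with direct identifiers already removed.
records = [
    {"postcode_district": "EH1", "birth_year": 1980, "sex": "F"},
    {"postcode_district": "EH1", "birth_year": 1980, "sex": "F"},
    {"postcode_district": "EH1", "birth_year": 1975, "sex": "M"},
    {"postcode_district": "G12", "birth_year": 1980, "sex": "F"},
]

at_risk = unique_records(records, ["postcode_district", "birth_year", "sex"])
print(len(at_risk), "record(s) are unique on the chosen fields")
```

The more fields are combined, the smaller the groups become, which is consistent with the guidance’s point that information rich enough to be useful for research is rarely inherently anonymous.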
Is it possible to anonymise person-level information?
The MRC acknowledges that it is still possible to anonymise personal data and use it for research purposes without being subject to the GDPR, provided the following steps are taken with respect to the content and context of the information.
Content-wise, in order to ensure that information is rendered anonymous, organisations involved in clinical research should:
- Remove all direct real-world identifiers from the information (pseudonymisation); and
- Limit the potential identifiability of the remaining information through different techniques, including anonymisation techniques (e.g. Barnardisation).
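The two content steps above can be sketched in Python. This is a minimal illustration under stated assumptions, not the method set out in the guidance: the field names, the use of a keyed hash (HMAC) to derive a pseudonym, and the perturbation probability p are all hypothetical choices. Barnardisation is shown in its commonly described form, in which non-zero cells of a table of counts are randomly adjusted by -1, 0 or +1 and zero cells are left unchanged.

```python
import hashlib
import hmac
import random

# Hypothetical field names; the guidance does not prescribe a schema.
DIRECT_IDENTIFIERS = ("name", "email", "nhs_number")


def pseudonymise(record, secret_key):
    """Drop direct identifiers and add a keyed code in their place.

    The key should be held separately from the research dataset, so that
    recipients of the data cannot reverse or recompute the code.
    """
    code = hmac.new(
        secret_key, record["nhs_number"].encode("utf-8"), hashlib.sha256
    ).hexdigest()[:16]
    kept = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    kept["pseudonym"] = code
    return kept


def barnardise(counts, p=0.1, rng=None):
    """Adjust each non-zero count by -1 or +1, each with probability p.

    Zero counts are left unchanged; all other counts stay as they are
    with probability 1 - 2p.
    """
    rng = rng or random.Random()
    perturbed = []
    for count in counts:
        if count == 0:
            perturbed.append(0)
            continue
        draw = rng.random()
        if draw < p:
            delta = -1
        elif draw < 2 * p:
            delta = 1
        else:
            delta = 0
        perturbed.append(count + delta)
    return perturbed


# Illustrative record with made-up values.
record = {
    "name": "Jane Doe",
    "email": "jane@example.org",
    "nhs_number": "0000000000",
    "age_band": "40-49",
    "diagnosis": "asthma",
}
print(pseudonymise(record, b"key-held-by-the-data-custodian"))
print(barnardise([0, 3, 12, 1], p=0.1))
```

Note that the output of pseudonymise on its own is still personal data in the hands of anyone who holds the key, which is why the guidance pairs these content steps with controls on the context in which the information is viewed.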
Additionally, the control of the context in which the information will be viewed is also important and the steps to be taken include the following:
- Ensuring that the recipients of the information have no access to the codes used for pseudonymisation and that they do not also know other information that may help identification (e.g. when information is about one of their patients).
- Ensuring that appropriate controls are in place to limit the risk that recipients would make any attempt to re-identify information.
The guidance provides different criteria for controlling the context in which the information is viewed, depending on whether the information will be kept within the organisation or shared with third parties.
Where the information is shared with third parties, the guidance states that anonymisation can be achieved through:
- Data sharing agreements which impose obligations on the recipient not to attempt to re-identify information and envisage what to do in the event of accidental re-identification;
- Appropriate information security or governance policies in place;
- Appropriate training for all those accessing the information;
- Compliance with professional bodies’ codes of practice, including sanctions for not complying; and/or
- Sharing information in a safe environment (use is physically restricted in a trustworthy environment).
With regard to disclosure within an organisation, the guidance recognises that the organisation will have access to the real-world identifiers of the pseudonymised data and that internal controls alone would not be sufficient to render the data anonymous. In other words, the guidance acknowledges that personal data cannot be anonymised within the same organisation, which is in line with the GDPR interpretation of anonymisation. The guidance goes on to say that risks of common law disclosure within the organisation can be reduced through robust controls such as pseudonymisation, appropriate organisational security measures, internal training and compliance with professional bodies’ codes of practice.
The guidance also acknowledges that there may be rare occurrences (e.g. a rare disorder) where the likelihood of re-identification would be greater and the chances of obtaining anonymised information lower. In these circumstances, research could continue but subject to GDPR considerations, e.g. by managing the individuals’ expectations through consent. According to the guidance, anonymisation is also not possible for NHS numbers and the Community Health Index (CHI) in Scotland, because their identifiability is too great.
Pseudonymisation is not the same as anonymisation and, as set out above, pseudonymised information would still be subject to the GDPR. Indeed, pseudonymisation should be seen as a technical and organisational measure which limits the risk of re-identification to some extent.
Genetic (sequence) information
With regard to genetic (sequence) information, the guidance states that where the steps above are undertaken with respect to the content and context of the data (i.e. pseudonymisation, anonymisation techniques, data sharing agreements, employer sanctions, training, etc.), it is possible to share genetic information and use it for research purposes without the need to comply with the GDPR.
However, as genetic data can also expose information about family members, the aforementioned controls must be particularly robust when sharing this information. Additionally, the guidance explains that there are different types of genetic information and that, as the level of uniqueness of the genetic sequence rises, so does the likelihood of identification.
Similarly, it may not be possible to anonymise genetic data where researchers or clinical staff are familiar with specific sequence patterns and know who they relate to. In such cases, consent should be obtained to address confidentiality issues, and GDPR requirements should be taken into account.
EDPB and ICO guidance on anonymisation
The European Data Protection Board’s (EDPB) and the ICO’s traditional (pre-GDPR) standards for anonymisation have certainly been higher than those followed by the MRC. Institutes and organisations should therefore look at the guidance carefully and be aware that following it carries some risk. While the ICO has announced that it is working on updating the current version of its anonymisation code of practice, neither the EDPB work programme 2019/2020 nor the agendas of its recent plenary meetings have mentioned any forthcoming guidance on anonymisation at European level aimed at updating or replacing the existing guidance adopted by the former Article 29 Working Party back in 2014.
As the guidance has not been approved by the ICO, institutes and organisations wishing to rely on it should in any event continue to apply strong security to the data. In particular, recipients of data anonymised in accordance with the guidance should, to the extent possible, continue to treat such data as personal data to minimise risks.