The Data Protection Commissioner (“DPC”) recently published guidance on the use of data anonymisation and pseudonymisation techniques. In our last blog, we examined these concepts and some of the key points in the DPC’s guidance. We focused, in particular, on the difficulties in implementing these techniques and the scope of what is considered to be personal data.
In this blog, we take a closer look at some of the anonymisation techniques referenced in the DPC’s guidance. We also consider the data protection law obligations arising for organisations wishing to anonymise or pseudonymise certain data sets.
The Data Protection Acts 1988 and 2003 (the “Acts”) do not explicitly recognise the concepts of data anonymisation and pseudonymisation, meaning there is no prescriptive standard of anonymisation under Irish law. Consequently, an organisation hoping to employ anonymisation techniques will have to decide, on a case-by-case basis, what techniques to use to sufficiently anonymise data sets. In this regard, the DPC’s guidance note is useful for organisations wanting to assess what techniques (or combination of techniques) they should use.
The DPC’s guidance note discusses the two main forms of anonymisation, namely randomisation and generalisation.
Randomisation techniques involve altering personal data in order to remove the link between the data and the individual. There are a host of randomisation techniques available including “noise addition” and “permutation”.
Noise addition (or “noise injection”) involves the addition of random variables to personal data to reduce the risk that an individual can be identified from the data. For example, in a database which records the height of individuals, each individual’s height could be increased, or decreased, by a small random amount. The recorded height can then be stated to be accurate only within a certain range, such as +/-10cm.
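The height example above can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the function name `add_noise`, the sample heights and the use of a uniform random shift are hypothetical choices, not something prescribed by the DPC’s guidance.

```python
import random

def add_noise(heights_cm, max_shift_cm=10.0, seed=None):
    """Return a copy of the heights, each shifted by a random amount
    drawn uniformly from [-max_shift_cm, +max_shift_cm]."""
    rng = random.Random(seed)  # seeded only so the illustration is repeatable
    return [h + rng.uniform(-max_shift_cm, max_shift_cm) for h in heights_cm]

# Hypothetical sample data: heights in centimetres.
heights = [158.0, 171.5, 183.0, 166.0]
noisy = add_noise(heights, max_shift_cm=10.0, seed=42)

# Each published value is now only accurate to within +/-10cm of the true value.
for original, published in zip(heights, noisy):
    print(f"true={original:.1f}cm published={published:.1f}cm")
```

The size of the shift is a trade-off: a larger range makes identification harder but also makes the data less useful for analysis.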
Permutation, on the other hand, involves the swapping or shuffling of data between the records of individuals, making it harder to identify a particular individual. For example, a data set containing the height of individuals could be “randomised” by shuffling the height values so that they are no longer connected to other information about the individual. These techniques are useful in reducing the risk of inference and the matching up of data between data sets.
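As a rough sketch of permutation, the snippet below shuffles the height values across a small set of records. The record layout, the field names `record_id` and `height_cm`, and the function name `permute_column` are assumptions made purely for illustration.

```python
import random

def permute_column(records, column, seed=None):
    """Return new records in which the values of one column have been
    shuffled between records; all other fields are left untouched."""
    rng = random.Random(seed)  # seeded only so the illustration is repeatable
    values = [record[column] for record in records]
    rng.shuffle(values)
    return [{**record, column: value} for record, value in zip(records, values)]

# Hypothetical sample data.
people = [
    {"record_id": 1, "height_cm": 158.0},
    {"record_id": 2, "height_cm": 171.5},
    {"record_id": 3, "height_cm": 183.0},
    {"record_id": 4, "height_cm": 166.0},
]
shuffled = permute_column(people, "height_cm", seed=1)
```

The overall distribution of heights in the data set is preserved, which is what makes the technique attractive for statistical use, but a given height can no longer be relied on as belonging to the record it sits beside.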
Generalisation involves the dilution of identifiers attributable to data subjects so that individuals cannot be singled out. This can be done by modifying the scale of data attributable to an individual. For example, a data set containing the dates of birth of individuals could be diluted by recording only the year of birth rather than the full date. There is a wide range of generalisation techniques, including “k-anonymity”, “aggregation”, “l-diversity” and “t-closeness”.
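The date of birth example can be illustrated as follows. This is a minimal sketch: the field names `date_of_birth` and `birth_year`, the sample records and the function name `generalise_dob` are hypothetical.

```python
from datetime import date

def generalise_dob(records):
    """Return new records in which the full date of birth is replaced
    by the year of birth only."""
    generalised = []
    for record in records:
        # Copy every field except the full date of birth.
        reduced = {k: v for k, v in record.items() if k != "date_of_birth"}
        reduced["birth_year"] = record["date_of_birth"].year
        generalised.append(reduced)
    return generalised

# Hypothetical sample data.
patients = [
    {"record_id": 1, "date_of_birth": date(1980, 3, 14)},
    {"record_id": 2, "date_of_birth": date(1980, 11, 2)},
    {"record_id": 3, "date_of_birth": date(1975, 7, 30)},
]
diluted = generalise_dob(patients)
```

After generalisation, the first two records share the same birth year and can no longer be told apart on that attribute alone.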
The DPC briefly addresses other techniques, such as masking and pseudonymisation, and observes that these techniques, while useful, merely assist in reducing the risk of identification but are not sufficient on their own in anonymising data.
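One common way to pseudonymise an identifier is to replace it with a keyed hash. The sketch below uses Python’s standard `hmac` and `hashlib` modules; the key value, the sample email address and the function name `pseudonymise` are assumptions for illustration only, and the guidance does not mandate this or any other particular method.

```python
import hashlib
import hmac

# Hypothetical secret key. In practice it would be stored separately
# from the pseudonymised data set and access to it tightly restricted.
SECRET_KEY = b"example-key-do-not-use"

def pseudonymise(identifier, key=SECRET_KEY):
    """Replace an identifier with a keyed SHA-256 hash (HMAC)."""
    return hmac.new(key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()

first = pseudonymise("jane.doe@example.com")
second = pseudonymise("jane.doe@example.com")
other = pseudonymise("john.roe@example.com")
```

The same input always produces the same pseudonym, so records about one individual remain linkable, and anyone holding the key can recompute the pseudonym for a known identifier. That is consistent with the DPC’s observation that such techniques reduce risk but do not, on their own, anonymise data.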
What are the Legal Obligations under the Acts?
Data which has been irreversibly anonymised ceases to be “personal data” and so falls outside the scope of the Acts. However, in order to anonymise personal data, the starting assumption is that the personal data must first have been collected and processed by organisations in accordance with the Acts.
According to the DPC, the process of anonymisation itself constitutes the further processing of personal data. Accordingly, if organisations wish to render personal data anonymous, this should be done in accordance with the Acts. Therefore, organisations should ensure that personal data is:
- obtained and processed fairly;
- kept only for specified, explicit and lawful purposes; and
- used only in a manner that is compatible with these purposes.
The DPC also warns that an organisation which extracts personal data from an anonymised dataset must do so fairly and in compliance with the Acts. If an organisation can identify an individual from the personal data, the organisation may become a data controller, depending on whether it meets the other criteria of a data controller under the Acts. Organisations must be aware that extracting personal data from anonymised sources may result in the organisation becoming subject to the obligations of the Acts.
Tips for Organisations
Organisations wishing to make use of anonymisation and pseudonymisation techniques should:
- Use a combination of techniques to ensure that data is sufficiently de-identified. There are inherent limitations in some anonymisation and pseudonymisation techniques. Careful consideration is required in devising appropriate anonymisation techniques.
- Take into account all means reasonably likely to be used to identify an individual, whether those means are available within the organisation or to third parties. Consideration should be given to additional data sets which an organisation may obtain and which could allow it to identify an individual.
- Test the effectiveness of anonymisation techniques regularly to ensure that they are sufficiently robust to avoid the identification of individuals. Consideration should be given to developments in technology which may make re-identification easier over time.
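As one example of what a regular test might look like, the sketch below counts how many records share each combination of indirectly identifying attributes and reports the smallest group. This is the idea behind “k-anonymity”. The attribute names `birth_year` and `county`, the sample records and the function name `smallest_group_size` are hypothetical, and a check of this kind is only one input into an overall assessment.

```python
from collections import Counter

def smallest_group_size(records, quasi_identifiers):
    """Return the size of the smallest group of records sharing the same
    values for the given attributes (0 if there are no records)."""
    groups = Counter(
        tuple(record[attr] for attr in quasi_identifiers) for record in records
    )
    return min(groups.values()) if groups else 0

# Hypothetical sample data after generalisation.
released = [
    {"birth_year": 1980, "county": "Dublin"},
    {"birth_year": 1980, "county": "Dublin"},
    {"birth_year": 1975, "county": "Cork"},
    {"birth_year": 1975, "county": "Cork"},
    {"birth_year": 1990, "county": "Galway"},
]
k = smallest_group_size(released, ["birth_year", "county"])
print(f"Smallest group size: {k}")
```

Here the smallest group contains a single record (the 1990, Galway entry), which means that individual could be singled out on these two attributes alone and further generalisation or suppression would be worth considering.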
The content of this article is provided for information purposes only and does not constitute legal or other advice.