The privacy regulator, the Office of the Australian Information Commissioner (OAIC), has issued two new guides that will be useful for any business considering data sharing and analytics.

The first is the Guide to Data Analytics and the Australian Privacy Principles, a draft of which was issued in 2016. The second is De-identification and the Privacy Act, which replaces business resource 4, De-identification of data and information, originally issued in 2014.

Why are they relevant?

Guide to Data Analytics and the Australian Privacy Principles

Data analytics is often assumed to fall outside the Privacy Act because it involves working on big data sets, which many assume do not contain personal information. This assumption is often incorrect.

The Guide to Data Analytics and the Australian Privacy Principles helpfully defines some key terms, such as big data, data integration, data mining and data matching, to show how the use of data sets can involve the collection, use, creation and disclosure of personal information in the analytics process.

For those supplying data sets, or maintaining the sets used for analytics, and for those determining the output of the analytics process, the guide sets the threshold question: does the Privacy Act apply to this data?

If it does, it may prevent the intended analytics. Importantly, this is a question that should be asked before the process is undertaken so that if a Privacy Act risk is identified, it can be mitigated.

The next question is how the data can be modified, by de-identifying or anonymising it, so that the analysis can proceed without altering the data so much that it loses its utility.

It is a question often faced by those intending to undertake analytics, especially where companies intend to share data.

A key message from the Guide to Data Analytics and the Australian Privacy Principles is to review the risks at the outset and take steps to mitigate them from a privacy perspective. It also stresses the importance of transparency for individuals who provide information, as well as the contractual parameters required to mitigate risk.

There are practical tips throughout the guide, including brief case study examples. At the end, Attachment 1 consolidates all of the risk points and tips, including the important issues of sharing information and using it for direct marketing purposes.

The Guide to Data Analytics and the Australian Privacy Principles touches on de-identification and re-identification. It provides one cautionary example of a US university student who took a publicly available health insurance data set, combined it with other publicly available data sets such as voter registration and, using personal details of the State Governor, including his date of birth, accurately identified him and his sensitive medical information.

This then leads logically into the second guide that has been released.

De-identification and the Privacy Act

You may recall that the OAIC, together with CSIRO and Data61, issued the De-Identification Decision-Making Framework in 2017, which was an Australian adaptation of the UK ICO 2012 publication on data anonymisation. It is a hefty volume focusing on the how of de-identification.

This new publication, De-identification and the Privacy Act, is an 18-page, easy-to-read guide. It focuses on how the techniques in the lengthier De-Identification Decision-Making Framework relate to organisations' obligations under the Privacy Act, and provides a valuable view of how the regulator sees the technical steps in the de-identification process applying to those obligations.

Again, if information is de-identified and there is a low risk of re-identification, then the Privacy Act will not apply. This guide sets out some ways a user of de-identified information can be confident that they have taken all reasonable steps to de-identify it and that its subsequent use falls outside the Privacy Act.

The Guide considers various risk management techniques, including de-identification, and briefly raises the issue of utility and how it is reduced once de-identification techniques are applied.
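The trade-off between re-identification risk and utility can be illustrated with a small Python sketch. It is not drawn from the guide: the records are invented, and the measure used (the size of the smallest group of records sharing the same indirect identifiers, commonly called k-anonymity) and the generalise helper are assumptions chosen for illustration.

```python
from collections import Counter

# Invented records: no names, but exact date of birth and postcode are kept.
records = [
    {"dob": "1982-11-02", "postcode": "2000", "diagnosis": "condition A"},
    {"dob": "1982-04-19", "postcode": "2010", "diagnosis": "condition B"},
    {"dob": "1975-06-30", "postcode": "3000", "diagnosis": "condition C"},
    {"dob": "1975-01-08", "postcode": "3051", "diagnosis": "condition A"},
]


def smallest_group(rows, keys):
    """Size of the smallest group of rows sharing the same values for keys.
    A value of 1 means at least one row is unique on those fields."""
    counts = Counter(tuple(row[k] for k in keys) for row in rows)
    return min(counts.values())


def generalise(row):
    """Coarsen the identifying fields: keep only the birth year and
    the first two digits of the postcode."""
    return {
        "birth_year": row["dob"][:4],
        "postcode_area": row["postcode"][:2] + "xx",
        "diagnosis": row["diagnosis"],
    }


k_before = smallest_group(records, ("dob", "postcode"))
generalised = [generalise(r) for r in records]
k_after = smallest_group(generalised, ("birth_year", "postcode_area"))

print("smallest group before generalising:", k_before)
print("smallest group after generalising:", k_after)
```

Before generalising, every record is unique on date of birth and postcode. Afterwards, each record shares its values with at least one other, which lowers the risk of singling anyone out. The cost is utility: an analysis that needs exact age or suburb can no longer be run on the generalised data.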

It also raises considerations that apply after data has been released. As the US example above shows, de-identified data released into the public domain may not remain de-identified, so if data is to be publicly released the de-identification steps will need to be more robust.

This brings in additional controls in the data environment, including limiting access, maintaining a secure environment and, importantly, imposing contractual obligations on the use and distribution of data. We often see parties relying on a simple confidentiality agreement, but this guide makes clear that additional contractual controls to limit re-identification are required.

Takeaways

If you are going to share data or merge data sets, it is advisable to undertake a review before you begin, to ensure any personal information is de-identified.

Don’t simply rely on a confidentiality agreement to share data; ensure you have a contract that addresses risk and protects you in the event of a privacy breach.