We live in a modern, globally connected, technology-dependent world. Data is everywhere. Whether you 'like' a post on Facebook, purchase a book online or are simply browsing the web, chances are that somebody, somewhere, is collecting your data. It is estimated that the volume of data produced globally is increasing by 50% each year. By 2020, data production will be 44 times greater than it was in 2009.

Data (and, more particularly, 'big data') plays an increasingly important role in both the private and public sectors. Private sector organisations can use data on an individual's attributes to develop products and target advertising to gain a competitive advantage. Public sector organisations can share information and innovate in the fields of science and health for the public benefit.

A key issue arising from the use of information derived from big data is ensuring that, where such information constitutes 'personal information' regulated by Australia's privacy laws, it is collected, held, stored and disclosed in a manner that is consistent with those laws.

What is big data?

Big data is often defined using the technical 'three V' definition: data that is high volume, high velocity and high variety, and that consequently demands cost-effective, innovative forms of information processing to facilitate enhanced insight and decision making.

Big data, personal information and privacy

Big data, like other forms of data, may contain personal information. The collection, use, storage and disclosure of personal information are regulated by Australia's privacy laws.

Most public universities are regulated under applicable State privacy legislation, although some (such as the Australian National University), as well as private universities, are subject to the Commonwealth Privacy Act 1988. The Commonwealth Privacy Act contains the 13 Australian Privacy Principles (APPs). These are largely mirrored in State privacy legislation (although with some differences).

Under the Commonwealth Privacy Act, 'personal information' is defined as 'information or an opinion about an identified individual, or an individual who is reasonably identifiable whether the information or opinion is true or not and whether the information or opinion is recorded in a material form or not'.

APP 3 (collection of solicited personal information) is an important APP to consider when dealing with big data. An entity governed by the Commonwealth Privacy Act must not collect personal information of a non-sensitive nature unless the information is reasonably necessary for, or directly related to, the entity's functions or activities. (By way of example, the Victorian Privacy and Data Protection Act 2014 has a similar proscription in Information Privacy Principle 1, although the words 'reasonably' and 'directly related' are omitted.)

There are two key aspects to this APP – recognising when big data contains personal information, and then assessing whether the APP 3 proscription (or corresponding State counterpart) applies.

Personal information

Assessing whether big data contains personal information may pose difficulties from a technical perspective. Keeping data anonymous (so that an individual cannot be identified) can be challenging – particularly where disparate data sets are made available to be analysed by increasingly advanced data analytics techniques. Data that begins life as de-identified information can be combined and re-combined with other data in a way that allows one or more individuals to be re-identified – thereby rendering that data 'personal information'.
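
To illustrate how combining data sets can lead to re-identification, the following is a minimal sketch in Python. Everything in it is hypothetical and invented for illustration: the two data sets, the field names (postcode, birth_year, gender) and the reidentify function. It assumes that a 'de-identified' research data set still carries a few indirect identifiers, and that a second data set containing names shares those same fields. A record is treated as re-identified when exactly one named person matches it.

```python
from collections import defaultdict

# Hypothetical "de-identified" research records: names have been removed,
# but indirect identifiers (postcode, birth year, gender) remain.
deidentified_records = [
    {"postcode": "3000", "birth_year": 1985, "gender": "F", "diagnosis": "asthma"},
    {"postcode": "3000", "birth_year": 1990, "gender": "M", "diagnosis": "diabetes"},
    {"postcode": "3052", "birth_year": 1990, "gender": "M", "diagnosis": "migraine"},
]

# Hypothetical second data set that includes names and the same fields.
public_directory = [
    {"name": "Person A", "postcode": "3000", "birth_year": 1985, "gender": "F"},
    {"name": "Person B", "postcode": "3000", "birth_year": 1990, "gender": "M"},
    {"name": "Person C", "postcode": "3000", "birth_year": 1990, "gender": "M"},
    {"name": "Person D", "postcode": "3052", "birth_year": 1990, "gender": "M"},
]

# Fields assumed (for this sketch) to be shared by both data sets.
INDIRECT_IDENTIFIERS = ("postcode", "birth_year", "gender")


def linking_key(record):
    """Build the combination of indirect identifiers used to link records."""
    return tuple(record[field] for field in INDIRECT_IDENTIFIERS)


def reidentify(records, directory):
    """Return (name, diagnosis) pairs for records matching exactly one person."""
    names_by_key = defaultdict(list)
    for person in directory:
        names_by_key[linking_key(person)].append(person["name"])

    matches = []
    for record in records:
        candidates = names_by_key.get(linking_key(record), [])
        if len(candidates) == 1:  # a unique match points to one individual
            matches.append((candidates[0], record["diagnosis"]))
    return matches


reidentified = reidentify(deidentified_records, public_directory)
for name, diagnosis in reidentified:
    print(f"{name} is linked to the diagnosis '{diagnosis}'")
```

In this sketch the first and third records each match a single person, so they would arguably be about a 'reasonably identifiable' individual, while the second record matches two people and stays ambiguous. Real linkage techniques are considerably more sophisticated, so an absence of exact matches should not be taken as proof that data is anonymous.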

APP 3

In APP 3, the 'reasonably necessary' test is an objective test – whether a reasonable person who is properly informed would agree that the collection is necessary. The burden is on the APP entity to justify that the collection is reasonably necessary.

The following factors may be relevant in determining whether the collection of personal information is reasonably necessary for the APP entity's functions: 

  1. the primary purpose of collection; 
  2. how the personal information will be used in undertaking a function or activity; and 
  3. whether the activity can be carried out by the entity without collecting the personal information.

APP 5

APP entities are required under APP 5 to take reasonable steps to inform individuals of the purpose for which their personal information is being collected either at the time of collection or, if that is not practicable, as soon as practicable after the collection takes place.

In the context of big data, it is important for universities to carefully consider the purpose for which information will be used and to ensure that the purposes are adequately disclosed to individuals. For example, when collecting personal information comprised in big data, universities should consider whether the information will be used for a specific research project only or if it will be retained for use in future research.

Safeguards and protections when dealing with big data

In dealing with big data, safeguards and protections should be implemented to prevent anonymous data from being analysed and attributed to any particular individual, and to ensure continued compliance with APP 3, APP 5 and other relevant APPs (and their State counterparts).

In this respect, it is important for universities to: 

  1. understand what data is being collected (and from where), with the goal of minimising the personal information being collected; 
  2. consider how they intend to use the information and whether consents are needed for the intended use; 
  3. when compiling data sets for research purposes, consider whether the research can be effectively conducted using de-identified (and possibly aggregated) data; 
  4. ensure that their processes for de-identifying information are robust and up to date; 
  5. maintain appropriate security measures for big data (particularly where it contains personal information); and 
  6. include privacy and data security governance as part of their regular governance processes.
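
As one illustration of points 3 and 4, the sketch below shows a simple way of testing whether a data set is coarse enough that no combination of indirect identifiers singles out fewer than k people. This is a check commonly called k-anonymity. It is not prescribed by the APPs and is offered here only as an example technique. The records, the field names and the generalise function are all hypothetical, and the choice of k = 2 is an arbitrary assumption for the example.

```python
from collections import Counter


def group_sizes(records, identifier_fields):
    """Count how many records share each combination of identifier values."""
    return Counter(tuple(r[f] for f in identifier_fields) for r in records)


def is_k_anonymous(records, identifier_fields, k):
    """True if every combination of identifier values covers at least k records."""
    return all(n >= k for n in group_sizes(records, identifier_fields).values())


def generalise(record):
    """Coarsen a record: postcode to a two-digit area, birth year to a decade."""
    decade = record["birth_year"] // 10 * 10
    return {
        "postcode_area": record["postcode"][:2] + "xx",
        "birth_decade": f"{decade}s",
        "diagnosis": record["diagnosis"],
    }


# Hypothetical research records with names already removed.
records = [
    {"postcode": "3000", "birth_year": 1985, "diagnosis": "asthma"},
    {"postcode": "3052", "birth_year": 1988, "diagnosis": "diabetes"},
    {"postcode": "3121", "birth_year": 1991, "diagnosis": "migraine"},
    {"postcode": "3141", "birth_year": 1994, "diagnosis": "asthma"},
]

# Before generalisation every record is unique on (postcode, birth_year).
raw_ok = is_k_anonymous(records, ("postcode", "birth_year"), k=2)

# After generalisation each combination is shared by two records.
generalised = [generalise(r) for r in records]
generalised_ok = is_k_anonymous(generalised, ("postcode_area", "birth_decade"), k=2)

print("raw data passes:", raw_ok)
print("generalised data passes:", generalised_ok)
```

A check of this kind is only a starting point. Passing it does not by itself establish that information is de-identified for the purposes of privacy legislation, because other data sets or other fields may still allow individuals to be re-identified.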