In brief

  • The White House has released a report presenting the results of a 90-day study, announced by President Barack Obama on 17 January 2014, into the nature, impact, benefits and risks of ‘big data’.
  • The report provides a useful introduction to the nature of big data, the opportunities it creates and the unique challenges and concerns it raises, particularly in the privacy context. This article also applies an Australian lens to the US report.
  • The issues and recommendations outlined in the report provide guidance on, and a useful starting point for engaging with, the privacy, security and related issues and concerns raised by big data.

Release of the White House report on big data

On 1 May 2014, the White House released Big Data: Seizing Opportunities, Preserving Values (the Report),1 which presents the results of a 90-day study, announced by President Barack Obama on 17 January 2014, into the nature, impact, benefits and risks of ‘big data’.

Although the Report focuses on the US context and legal landscape, it provides a useful overview of, and guidance in relation to, the constantly evolving world of big data technology and analytics.

What is ‘big data’?

‘Big data’ refers to the significant overall volume and variety of data being created, captured and analysed today, particularly in comparison with historical volumes of (and methods of analysing) such data.

The Report echoes the now widely accepted definition of ‘big data’ as data that is ‘so large in volume, so diverse in variety or moving with such velocity, that traditional modes of data capture are insufficient’. That is, big data reflects our current environment of ‘near-ubiquitous data collection’, in which data is collected from a wide variety of sources, in a wide variety of formats and in near real time. These ‘3 Vs’ (volume, variety and velocity) are what differentiate big data from traditional modes of data collection and analysis.2

Big data technologies therefore rely on newer, more advanced data processing capabilities, business intelligence tools and analytical methods to make sense of the numerous and varied data sources collected across industry sectors.

The Report notes that the value derived from such technologies and analysis includes the capability to ‘boost economic productivity, drive improved consumer and government services, thwart terrorists and save lives’. To use the commercial context as an example, while perhaps not saving lives, big data technologies are often used to improve the customer experience and business performance, which may, at the customer level, involve analysing and predicting the personal preferences, behaviour and attitudes of individual customers.

Concerns raised by big data

The Report notes that ‘[t]he march of technology always raises questions about how to adapt our privacy and social values in response’, and the concerns raised in relation to big data are no different. These include, relevantly:

The effectiveness of de-identification of data

In the Australian context, the Privacy Act 1988 (Cth) will not govern big data analytics where the relevant entity has conducted demonstrably highly effective anonymisation or de-identification of the relevant data such that it no longer constitutes ‘personal information’. However, the Report notes that the use of de-identification to protect privacy is only ‘a limited proposition’, given that:

  • equally effective techniques exist to ‘re-identify’ purportedly de-identified or anonymised data, particularly where it can be matched or otherwise tied back to an individual when used in combination with other available information (known as ‘the mosaic effect’), and
  • effective de-identification may render the relevant information less useful, and limit individuals’ ability to correct any errors in such data.
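
The mosaic effect described above can be illustrated with a minimal sketch. It assumes a purportedly de-identified dataset that still carries a few indirect identifiers (postcode, birth year and sex) and a separate, publicly available dataset that lists the same attributes alongside names. All records, field names and function names below are hypothetical and were invented for illustration; none comes from the Report.

```python
# Hypothetical, fabricated records for illustration only.
# Names were removed from this dataset, but indirect identifiers remain.
deidentified_health = [
    {"postcode": "2000", "birth_year": 1975, "sex": "F", "diagnosis": "asthma"},
    {"postcode": "2000", "birth_year": 1982, "sex": "M", "diagnosis": "diabetes"},
    {"postcode": "3051", "birth_year": 1975, "sex": "F", "diagnosis": "migraine"},
]

# A separate dataset (assumed to be publicly available) that includes names.
public_register = [
    {"name": "Person A", "postcode": "2000", "birth_year": 1975, "sex": "F"},
    {"name": "Person B", "postcode": "2000", "birth_year": 1982, "sex": "M"},
    {"name": "Person C", "postcode": "2000", "birth_year": 1982, "sex": "M"},
    {"name": "Person D", "postcode": "3051", "birth_year": 1975, "sex": "F"},
]

# Attributes shared by both datasets that, in combination, may single out a person.
QUASI_IDENTIFIERS = ("postcode", "birth_year", "sex")


def key(record):
    """Build a lookup key from the shared indirect identifiers."""
    return tuple(record[field] for field in QUASI_IDENTIFIERS)


def reidentify(deidentified, auxiliary):
    """Return {record index: name} where exactly one auxiliary record matches."""
    names_by_key = {}
    for person in auxiliary:
        names_by_key.setdefault(key(person), []).append(person["name"])

    matches = {}
    for index, record in enumerate(deidentified):
        candidates = names_by_key.get(key(record), [])
        if len(candidates) == 1:  # a unique match ties the record to one person
            matches[index] = candidates[0]
    return matches


matches = reidentify(deidentified_health, public_register)
for index, name in matches.items():
    print(name, "->", deidentified_health[index]["diagnosis"])
```

In this sketch the second health record is not re-identified because two people share its combination of attributes. The fewer people who share a combination, the more likely it is that data thought to be de-identified can be tied back to an individual.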

The protection of metadata

In general, the privacy protection afforded to ‘metadata’, or ‘transaction records about communications and documents’, is weaker than the protection afforded to the content of those communications and documents. However, the rise of big data has led to a corresponding increase in the amount and type of metadata being collected (for example, IP addresses and geo-location data), which has the potential to be used to reveal more information about individuals than ever before.

Potential discrimination and unfairness

The use of big data analytics also has the potential to lead to discriminatory treatment of individuals, whether as:

  • an inadvertent outcome of relying on apparently ‘neutral’ technology, for example, where a technology that relies on smartphone data to identify roads requiring repair unintentionally favours neighbourhoods with (wealthier) smartphone owners, or
  • an intentional result of the collection of data to compile a profile of an individual, including by imposing differential pricing on different groups of people or even denying products, services, employment, credit or housing to certain groups.

Because users typically have little visibility of, or control over, such uses of their data, this treatment may be particularly harmful to individuals.
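
The inadvertent case can be sketched with simple arithmetic. The example below assumes two neighbourhoods with the same number of damaged roads but different rates of smartphone ownership, and a council that allocates repair crews in proportion to the reports it receives. The figures, names and allocation rule are hypothetical and are not drawn from the Report.

```python
# Hypothetical figures for illustration only: both areas have the same
# underlying need, but differ in how many residents carry a smartphone.
neighbourhoods = {
    "Neighbourhood X": {"potholes": 100, "smartphone_share": 0.8},
    "Neighbourhood Y": {"potholes": 100, "smartphone_share": 0.2},
}


def expected_reports(potholes, smartphone_share):
    """Assume a pothole is reported only where a smartphone owner encounters it."""
    return potholes * smartphone_share


def allocate_repairs(areas, crews):
    """Split the available crews in proportion to the reports received per area."""
    reports = {name: expected_reports(**area) for name, area in areas.items()}
    total = sum(reports.values())
    return {name: crews * count / total for name, count in reports.items()}


allocation = allocate_repairs(neighbourhoods, crews=10)
for name, share in allocation.items():
    print(f"{name}: {share:.1f} of 10 crews")
```

Although the allocation rule never refers to wealth or to any personal attribute, the area with more smartphone owners receives most of the crews, because the data reflects who is able to report rather than where repairs are needed.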

The persistence of data

The increasing ability to capture, store and share data in large volumes, in multiple locations (both locally and in the cloud) and for an indefinite period means that individuals are unlikely to be able to maintain personal control over their data. As a result, the accuracy and security of such data become increasingly important.

The usefulness of the ‘notice and consent’ privacy framework

The overall trends evident in the rise of big data (including the collection, retention and use of ‘as much data as possible’, the issues that arise with effectively de-identifying data and the reuse of data for purposes - and by persons - well beyond those initially contemplated) call into question the ‘notice and consent’ framework of privacy protection.

This framework, whereby individuals must be notified of, and consent to, the collection of data, places the responsibility on the individual, whom the Report notes ‘is not well equipped to understand or contest consent notices as they are currently structured in the marketplace’. Instead, the Report indicates that a ‘responsible use’ framework, which places the emphasis on the entities that collect, maintain and use data and the subsequent use such entities make of that data, may become more relevant.

Recommendations of the Report

The Report makes a number of recommendations, including:

  1. advancing the ‘Consumer Privacy Bill of Rights’ released by the White House in 2012, which focuses on individual control, transparency, respect for the context of collection of data, security, access and accuracy, reasonable limits on collection and accountability,
  2. passing national data breach legislation,3
  3. extending US privacy protections to non-US persons (as a means of recognising that ‘privacy is a worldwide value’),
  4. engaging in, and leading, international conversations on big data with partners including the European Union, APEC and the OECD, to ensure ‘interoperable global privacy frameworks’,
  5. paying increased attention to the potential for discrimination in big data technologies and analysis, and
  6. additional recommendations in relation to the use of big data in the educational, government and law enforcement context.

How does this relate to the Australian context?

Although the US privacy landscape described in the Report, which is largely fragmented and sector-specific, differs greatly from the regime imposed by the Privacy Act 1988 (Cth),4 the Report provides a useful starting point from which to engage with the privacy, security and related issues and concerns raised by big data. Companies should therefore aim to be aware of developments in, and recommendations with respect to, big data laws and practices both in Australia - including the ADMA Best Practice Guideline - Big Data,5 the Australian Public Service Big Data Strategy6 and the Office of the Australian Information Commissioner’s Privacy Business Resource on the De-identification of Personal Information7 - and worldwide in order to comply with ‘best practice’ privacy management.