What is Big Data?
Big Data is about using mathematical models to spot patterns or “footprints” in large datasets. The area is gaining prominence. Kenneth Cukier recently appeared on BBC Radio 4’s “Start the Week” to discuss it. He has also teamed up with Viktor Mayer-Schönberger to publish a new book: “Big Data: A Revolution That Will Transform How We Live, Work and Think”. The book discusses risks and benefits of Big Data and privacy concerns. Interestingly, the authors also say that anonymisation is not possible when it comes to Big Data!
Where can it be used?
There is no doubt that Big Data is big news. You just have to think about data on the digital services you use or the TV and movies that you stream. Twitter recently released a 20-page booklet for advertisers on trends in television viewing. The data shows how people use Twitter when watching TV programmes and what they say about them. Research from the University of Cambridge says that a person’s political leanings, age, gender and sexual orientation can be deciphered by studying their Facebook “likes”!
Anonymisation is not the answer?
So Big Data is big news. Let’s circle back to that new book. It’s interesting that the authors say that anonymisation is not possible when it comes to Big Data. If that’s correct, then we have a problem. Assume Big Data cannot escape compliance risk through de-identification or anonymisation. Then look at the draft General Data Protection Regulation. The new Regulation (currently about to collapse under the weight of its own amendments) includes amendments on “profiling” (Article 20) proposed by the Albrecht Committee.
Albrecht proposal on “profiling”
Profiling is defined by Albrecht as: “any form of automated processing of personal data intended to evaluate certain personal aspects relating to a natural person or analyse or predict in particular that natural person’s performance at work, economic situation, location, health, personal preferences, reliability or behaviour”. This is pretty much any kind of profiling required for Big Data! The Albrecht proposal is that you may only “profile” people where it is (1) necessary to enter into or carry out a contract; (2) expressly authorised by EU law; or (3) based on the data subject’s consent. This is a much broader prohibition than the one in the original draft or in the current Data Protection Directive.
Amendments proposed by the other Committees take a different approach. They prohibit profiling which results in decisions which are “unfair or discriminatory” (borrowing equivalent concepts from consumer law). However, it isn’t clear how the consumer law concepts would map across to the data protection world. Other amendments specifically encourage the use of pseudonymised data. But surely the use of pseudonyms would be good practice in any event (in line with the new principle of data minimisation)?
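For readers wondering what pseudonymisation might look like in practice, here is a minimal sketch in Python. It is only an illustration and is not taken from the Regulation or from any of the proposed amendments. It assumes a keyed hash (HMAC-SHA256) as the pseudonymisation technique, and the names `pseudonymise`, `pseudonymise_records`, `user_id` and the example key are all hypothetical.

```python
import hashlib
import hmac


def pseudonymise(identifier: str, secret_key: bytes) -> str:
    """Replace an identifier with a keyed hash (HMAC-SHA256).

    The same identifier and key always give the same pseudonym, so records
    can still be linked for analysis without exposing the raw identifier.
    """
    return hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


def pseudonymise_records(records, secret_key, id_field="user_id"):
    """Return copies of the records with the identifier field pseudonymised.

    `id_field` is a hypothetical field name chosen for this example.
    """
    result = []
    for record in records:
        copy = dict(record)  # leave the original record untouched
        copy[id_field] = pseudonymise(str(record[id_field]), secret_key)
        result.append(copy)
    return result


# Hypothetical example data and key (a real key would be stored securely).
example_key = b"example-secret-key"
viewing_data = [
    {"user_id": "alice@example.com", "programme": "News"},
    {"user_id": "alice@example.com", "programme": "Drama"},
    {"user_id": "bob@example.com", "programme": "News"},
]
pseudonymised = pseudonymise_records(viewing_data, example_key)
```

Note that whoever holds the key can recompute a pseudonym from a known identifier, and the remaining fields may still single a person out. So this is pseudonymisation rather than anonymisation, which fits the authors' point that true anonymisation is hard to achieve with Big Data.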
A technical point
Well, I have another thought: the new Regulation requires compliance with gateway conditions, as the current Directive does. The new consent gateway will require consents to be “explicit”. It will also be much harder to rely on “legitimate interests” as an alternative, as you will have to specify the “legitimate interests” in privacy policies up front. This pushes us firmly towards a “permission-based” model of which Big Data will have to form part. So we may already be in a consent-based solution.
Over-prescription of Big Data could kill it off before it has begun. Big Data becomes little data? Isn’t it better to create a legal regime in which it can be implemented “safely” (by using pseudonyms and proper transparency) rather than making it subject to a prior consent or equivalent condition?