Big Data and Data Protection: Preparing for Tales of the Unexpected
Data protection law – the bundle of statutory duties on those who handle personal data about individuals and the corresponding rights for the individuals concerned – sits plumb in the centre of data law, an increasingly broad and complex amalgam of contract law, intellectual property and regulation.
A looming challenge for data protection lawyers is Big Data: the aggregation and analysis of datasets of great volume, variety and velocity for the purpose of competitive advantage, an area where the business world is just at the start of a period of rapid adoption.
On 28 July 2014, the Information Commissioner’s Office (ICO) published a paper on Big Data and Data Protection. Unsurprisingly, the paper’s main themes are that Big Data’s complexity is no reason not to apply data protection law, and that the Data Protection Act 1998 (DPA) is fundamentally fit for purpose when it comes to Big Data. In applying the DPA’s principles to the issues that arise and providing some practical pointers on how to address them, the paper makes a timely contribution to Big Data management and governance but fights shy of a number of the more challenging technical and legal issues.
Big Data – Data Protection issues
The report is specific about a number of Big Data use cases for personal data:
“If, for example, information that people have put on social media is going to be used to assess their health risks or their creditworthiness, or to market certain products to them, then unless they are informed of this and asked to give their consent, it is unlikely to be … fair” (paragraph 69).
More generally, in focusing on how the DPA’s principles of fairness (Principle 1), purpose limitation (Principle 2) and data minimisation (Principles 3 and 5) apply in the Big Data world, the report emphasises:
that specific and informed consent must be obtained before data collected for one purpose is analysed for a materially different purpose;
that organisations must find the right point to explain the benefits of the analytics, present users with a meaningful choice, and then respect that choice; and
that the fact that Big Data analytics involves collecting as much data as possible (‘N = all’) does not mean that the DPA’s data minimisation principles do not apply: “finding a correlation does not retrospectively justify obtaining the data in the first place. Organisations therefore need to be able to articulate at the outset why they need to collect and process particular datasets” (paragraph 73).
It would have been helpful to get an expression of ICO’s views on the technical legal questions of quantifying the harm that individuals may suffer, and the corresponding liability that may arise, as a result of using non-DPA compliant Big Data analytics, and it is to be hoped that ICO will shed light on this before too long.
Practical pointers towards addressing Data Protection issues in Big Data
Having illustrated the tensions between the DPA and Big Data, the paper also suggests pointers that organisations should address when considering Big Data analytics:
anonymisation: data is no longer personal if fully anonymised, but the growing power of Big Data means that absolute anonymisation may not be possible, so that organisations “should focus on mitigating the risks [of re-identification] to the point where the chance … is extremely remote” (paragraph 42) using “solutions proportionate to the risk [which] may involve a range and combination of technical measures such as data masking, pseudonymisation, aggregation and banding, as well as legal and organisational safeguards” (paragraph 43). This formulation is rather bland, however, and the report shies away from more contentious technical considerations about the ability to re-identify anonymised and pseudonymised data.
privacy impact assessments: the report advocates the privacy impact assessment as a tool to be used before processing begins to assess how Big Data analytics is likely to affect the individuals whose data is being processed and whether processing is fair.
building trust: citing IBM and Nectar loyalty card operator Aimia, “some evidence” is noted of companies “developing an approach to Big Data that focuses on the impact of the analytics on individuals” (paragraph 137) with companies looking:
“to place big data in a wider and essentially ethical context. In other words, they are asking not only “can we do this with the data?”, ie does it meet regulatory requirements, but also “should we do this with the data?” ie is it what customers expect, or should expect?”.
information governance: developing the theme of “a trust based ethical approach”, ICO notes “a growing emphasis on the issue of data quality and information governance in relation to Big Data analytics” (paragraph 141).
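By way of illustration only, the technical measures the report lists under anonymisation (pseudonymisation, banding and aggregation) might be sketched as follows. This is a minimal Python sketch and not anything the report itself prescribes: the field names, the secret key, the band width and the suppression threshold are all hypothetical, and none of these steps on its own guarantees that the chance of re-identification is “extremely remote”.

```python
import hashlib
import hmac
from collections import Counter

# Hypothetical secret key (an assumption for this sketch); in practice it
# would be held separately from the dataset it protects.
SECRET_KEY = b"example-key-held-separately"


def pseudonymise(identifier: str, key: bytes = SECRET_KEY) -> str:
    """Replace a direct identifier with a keyed hash (HMAC-SHA256)."""
    return hmac.new(key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


def band_age(age: int, width: int = 10) -> str:
    """Coarsen an exact age into a band such as '30-39'."""
    lower = (age // width) * width
    return f"{lower}-{lower + width - 1}"


def aggregate(records, min_count: int = 2):
    """Count records per age band, dropping bands with fewer than min_count."""
    counts = Counter(band_age(r["age"]) for r in records)
    return {band: n for band, n in counts.items() if n >= min_count}


# Hypothetical input records.
records = [
    {"email": "a@example.com", "age": 34},
    {"email": "b@example.com", "age": 37},
    {"email": "c@example.com", "age": 52},
]

# Record-level output: direct identifier replaced, exact age coarsened.
masked = [
    {"id": pseudonymise(r["email"]), "age_band": band_age(r["age"])}
    for r in records
]

# Aggregate output: small groups are suppressed rather than published.
summary = aggregate(records)
```

In this sketch the record-level output keeps a stable pseudonym so that records can still be linked across datasets, which is precisely why anyone holding the key, or enough auxiliary data, may be able to re-identify individuals. The aggregate output is coarser and suppresses the single record in the 50-59 band.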
ICO’s paper is a timely reminder of the pervasiveness of data protection law in the developing world of Big Data and is a start to providing helpful compliance pointers to organisations.
On anonymisation, the report regards data as no longer personal where “it is not possible to identify an individual from the data itself or from that data in combination with other data, taking account of all the means that are reasonably likely to be used to identify them” (paragraph 40).