The speed at which vast and complex “bits” can now be processed is staggering. This information is being collected, stored and analyzed to drive efficiency and create actionable intelligence – translating into increased revenues. In short, in today’s world data is power.

For those working with technologies that collect data sets and apply Machine Learning and Deep Learning techniques, there is a minefield of legal issues to avoid.

Privacy

Privacy has long been established as a right and, despite popular belief, it is not dead (yet).

An individual’s consent is usually required before Personally Identifiable Information (PII) about him or her can be stored, used or shared. However, one of the main hurdles for those who collect data is that such information is often amassed from across the globe. Crawlers and scrapers are being employed to gather data from all over the web, but different jurisdictions have different laws, regulations and interpretations. What constitutes PII or consent in one country may not be the same in the next.

These laws are also evolving. For example, Israel’s Protection of Privacy Law (1981) is soon to be bolstered by the Protection of Privacy Regulations (Data Security) 2017 (the Regulations), which come into force in March 2018, and the new European General Data Protection Regulation (GDPR) comes into force in May 2018. In addition, different sectors such as banking and healthcare have their own privacy provisions.

The breadth of laws to consider poses a serious challenge for those engaged in Big Data.

Not only do we usually have the right to object to our data being collected, in certain circumstances we also have the right of erasure (AKA the right to be forgotten), and can demand that our information be deleted at a later stage. This means that companies that hold PII need to ensure they have the ability to locate and erase such information further down the line.
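
To make the "locate and erase" requirement concrete, here is a minimal sketch of one way to keep track of where a person's records live so they can be removed on request. The class `SubjectIndex`, the store names and the record IDs are all hypothetical, and in-memory dictionaries stand in for real databases; an actual system would also need to deal with backups, logs and copies held by third parties.

```python
from collections import defaultdict


class SubjectIndex:
    """Hypothetical index of where each data subject's records are stored."""

    def __init__(self):
        # store name -> {record_id: record}; stands in for real databases
        self._stores = defaultdict(dict)
        # subject_id -> {(store name, record_id)}
        self._locations = defaultdict(set)

    def save(self, store, record_id, subject_id, record):
        """Store a record and remember which subject it belongs to."""
        self._stores[store][record_id] = record
        self._locations[subject_id].add((store, record_id))

    def locate(self, subject_id):
        """Return every (store, record_id) pair held for this subject."""
        return sorted(self._locations.get(subject_id, set()))

    def erase(self, subject_id):
        """Delete all records for this subject; return how many were removed."""
        locations = self._locations.pop(subject_id, set())
        for store, record_id in locations:
            self._stores[store].pop(record_id, None)
        return len(locations)


index = SubjectIndex()
index.save("crm", "c-17", "subject-1", {"email": "a@example.com"})
index.save("analytics", "a-204", "subject-1", {"page_views": 12})
index.save("crm", "c-18", "subject-2", {"email": "b@example.com"})

erased = index.erase("subject-1")  # removes both records held for subject-1
```

The design point is simply that erasure is only feasible if the link between a person and their records is recorded at the time the data is stored, rather than reconstructed later.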

Many jurisdictions require the holder of data to register a database with the local privacy regulator – in Israel, the Israeli Law, Information and Technology Authority (ILITA). Given the value of data, it is unsurprising that the news is full of stories of database breaches.

Under the Regulations, Database Owners are required to immediately notify ILITA and, in some instances, data subjects of serious security breaches. While this increases transparency, it also increases exposure to litigation and class actions, and can be a PR nightmare. Accordingly, the security concerns around PII should not be trivialized.

A popular way of avoiding these privacy mines is to anonymize and aggregate all data obtained and store as little of it as possible.
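
As a rough illustration of what "anonymize and aggregate" can look like in practice, the sketch below drops direct identifiers from each record and keeps only group counts, suppressing groups too small to hide an individual. The field names, the `DIRECT_IDENTIFIERS` set and the threshold of 3 are assumptions made for this example, not legal standards; whether the result counts as anonymous under any particular law is a separate question.

```python
from collections import Counter

# Hypothetical field names; what counts as PII depends on the jurisdiction.
DIRECT_IDENTIFIERS = {"name", "email", "phone"}


def strip_identifiers(record):
    """Return a copy of the record without the direct identifier fields."""
    return {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}


def aggregate_counts(records, group_key, min_group_size=3):
    """Count records per group, dropping groups below the size threshold."""
    counts = Counter(strip_identifiers(r)[group_key] for r in records)
    return {group: n for group, n in counts.items() if n >= min_group_size}


records = [
    {"name": "A", "email": "a@example.com", "city": "Haifa"},
    {"name": "B", "email": "b@example.com", "city": "Haifa"},
    {"name": "C", "email": "c@example.com", "city": "Haifa"},
    {"name": "D", "email": "d@example.com", "city": "Eilat"},
]

# Only the aggregated view is kept; the raw records can then be discarded.
summary = aggregate_counts(records, "city")
```

Here the single Eilat record is suppressed because a group of one would point straight back to an individual.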

With both the GDPR and the Regulations coming into force early next year, the former carrying significant fines (the maximum being the greater of €20 million or 4% of total worldwide annual turnover for the preceding financial year) and the latter carrying civil and criminal liability, anyone engaged in Big Data in Israel should be taking steps to mitigate risks and ensure compliance.
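
The "greater of" rule for the maximum GDPR fine is easy to misread, so here is the arithmetic as a small sketch. The function name and the sample turnover figures are illustrative only; the result is the ceiling described above, not a prediction of what a regulator would actually impose.

```python
FIXED_CAP_EUR = 20_000_000   # the EUR 20 million figure
TURNOVER_PERCENT = 4         # the 4% of worldwide annual turnover figure


def max_gdpr_fine_eur(annual_turnover_eur):
    """Return the fine ceiling: the greater of EUR 20 million or 4% of turnover."""
    turnover_based = annual_turnover_eur * TURNOVER_PERCENT / 100
    return float(max(FIXED_CAP_EUR, turnover_based))


# Hypothetical companies: one with EUR 100 million turnover, one with EUR 1 billion.
small_company_cap = max_gdpr_fine_eur(100_000_000)
large_company_cap = max_gdpr_fine_eur(1_000_000_000)
```

For the smaller company, 4% of turnover is €4 million, so the €20 million figure applies; for the larger one, 4% is €40 million, which exceeds it. The two measures cross at a turnover of €500 million.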

Ownership

Enterprises understand that although data is power, holding it close to their chest is not always the best way to leverage it and realize its economic potential, so such information is often shared with third parties. Take the autonomous driving space, for example: car manufacturers such as BMW may share data on vehicle performance and driver safety and behavior with the OEMs developing their cars’ autonomous driving capabilities. These OEMs may then combine BMW’s information with their existing data sources and/or create a Deep Learning model based on BMW’s information to achieve this goal. The question then arises as to who owns what in the co-mingled data or the conclusions derived from the model.

Data is usually protected by copyright laws, but when it is shared it is paramount to clearly and contractually define each party’s rights and obligations, including ownership, scope of license, confidentiality, limitation on liability and consequences of termination of the relationship.

Competition

Data is being seen less as a byproduct and more as an asset in and of itself. Some recent acquisitions clearly suggest that the target’s potential to acquire data can be the driving force behind the deal. Think of Microsoft’s $26 billion purchase of LinkedIn last year, which brought it an enormous amount of data about users’ work, skills and interests. If data is indeed power and can be used to suppress competition, then legislators and antitrust regulators will soon sit up and reconsider how they calculate market power and categorize restrictive trade practices, affecting both how data companies behave and their existing relationships.

Discrimination

Algorithms dissect data and spew out conclusions. Allegedly this enhances our ability to make evidence-based decisions and avoid human bias. However, an algorithm is only as impartial as its author or the data sets it works with. There have been various high-profile instances of technology discriminating. For example, Google Photos tagged several African-American users as gorillas (experts say this was probably due to insufficient representation of African-Americans in the database used to train the computer vision algorithm), and Microsoft’s artificial intelligence bot “Tay”, designed to learn from the behaviors of Twitter users, went rogue last year after processing racist tweets.

There is a plethora of laws prohibiting discrimination, and if your algorithm falls foul of them, the fact that it relied on raw data rather than any intent to discriminate will not absolve you of liability.

With the proliferation of Big Data in the last few years, the role that data is playing in replacing human decision making and the move towards a connected world, I am sure we have not heard the last word from our legislators and regulators, and the legal landscape on which businesses are being built will continue to shift. Even in today’s legal world, those engaged in Big Data had better be careful where they tread.