This article originally appeared in the UCLA Law Review.
On August 22, 2018, California Attorney General Xavier Becerra wrote a letter to both the California State Senate and Assembly.1 The letter stated that A.B. 375, the California Consumer Privacy Act of 2018 (hereinafter Act) imposed “unworkable obligations and serious operational challenges upon” the AG’s office, as the Act appointed the AG as the chief enforcer of California’s new data privacy law.2 Among other complaints, the letter stated that the AG would not be able to provide advice to businesses or timely complete a rulemaking in connection with the Act. While lawmakers passed legislation amending the Act that addressed some of Becerra’s collateral concerns,3 the core rulemaking concerns remain. Because of this inaction, it is becoming clear that the Act will go into effect with few substantive alterations or rulemaking amendments.
Though the Act stands to pose a challenge to the AG’s office, the real concern is the impact the Act will have on companies researching or deploying blockchain and artificial intelligence (AI). These companies must consider how they can comply with their new obligations under the Act or possibly face enormous penalties. Compliance with the statute’s requirements may create substantial obstacles to the wide-scale deployment of AI4 and blockchain technologies in the California and national markets. Indeed, in many cases, compliance will not be possible due to the very nature of these technologies. As such, the Act will stifle the ability of AI and blockchain companies to innovate and prevent them from operating effectively.
I. The Basics of Blockchain and AI Technologies
A blockchain is generally “an open, distributed ledger that can record transactions between two parties efficiently and in a verifiable and permanent way.”5 The ledger itself is created from a growing list of transactional records, called blocks. Typically, each block contains a cryptographic hash of the previous block, a timestamp, and transaction data. The transaction data is generally summarized by a Merkle tree root hash: in a Merkle tree, each non-leaf node is labelled with a cryptographic hash of its child nodes, so the root commits to every transaction in the block. A peer-to-peer network validates each new block. Once recorded, the data in any given block cannot be altered retroactively without alteration of all subsequent blocks, which requires consensus of the network majority.
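The structure described above can be illustrated with a minimal sketch in Python. It is a simplified teaching model, not the design of any particular blockchain: the function names, the block layout, and the sample transactions are all assumptions made for illustration, and the sketch omits the peer-to-peer validation and consensus steps entirely. It shows only why editing one recorded transaction invalidates the chain unless every later block is rewritten as well.

```python
import hashlib
import json
import time


def sha256(data):
    """Return the hex SHA-256 digest of a string."""
    return hashlib.sha256(data.encode("utf-8")).hexdigest()


def merkle_root(transactions):
    """Hash transactions pairwise, level by level, until one root remains."""
    if not transactions:
        return sha256("")
    level = [sha256(tx) for tx in transactions]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # assumption: duplicate the last hash on odd levels
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


def block_hash(block):
    """Hash the block header: previous hash, timestamp, and Merkle root."""
    header = {k: block[k] for k in ("prev_hash", "timestamp", "merkle_root")}
    return sha256(json.dumps(header, sort_keys=True))


def make_block(prev_hash, transactions, timestamp=None):
    """Build a block (hypothetical layout) linked to its predecessor by hash."""
    return {
        "prev_hash": prev_hash,
        "timestamp": time.time() if timestamp is None else timestamp,
        "transactions": list(transactions),
        "merkle_root": merkle_root(transactions),
    }


def chain_is_valid(chain):
    """Check each block's Merkle root and its link to the previous block."""
    for i, block in enumerate(chain):
        if block["merkle_root"] != merkle_root(block["transactions"]):
            return False
        if i > 0 and block["prev_hash"] != block_hash(chain[i - 1]):
            return False
    return True


# Hypothetical two-block ledger.
genesis = make_block("0" * 64, ["alice pays bob 5"], timestamp=0)
second = make_block(block_hash(genesis), ["bob pays carol 2"], timestamp=1)
chain = [genesis, second]
```

In this sketch, changing a transaction in `genesis` makes `chain_is_valid(chain)` return `False`. Recomputing the Merkle root of `genesis` does not repair the chain, because the hash stored in `second` no longer matches. That is the property that makes after-the-fact deletion of a single record so difficult.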
Another common attribute of a blockchain is its open nature; each block is generally open to inspection. Indeed, the public inspection of these blocks is a key feature of this consensus validation technology. To be clear, this does not mean that the blockchain must be public and permissionless. However, if the consensus is only achieved in a closed, private network, many of the values of consensus are lost, including the inalterability of the records, the network effect, and many of the consensus security benefits.6
AI, broadly speaking, is the study of any device that perceives its environment and takes actions that maximize its chance of successfully achieving its goals.7 There are a number of ways in which this is accomplished. However, a common feature of almost all of these AI technologies is the accumulation of data, which is then used to inform the decisional processes of the AI.8 This acquired AI data is often personal information (PI) under the Act. Examples include healthcare (using Personal Health Information), automotive (geolocation data), finance (fraud detection), advertising (predictive behavior), and many others.9 AI technologies, in a very real sense, are composed of, and depend on, the data that they incorporate. It is not simple or effective to delete or “forget” incorporated data because the entire AI technology is based on learning, and deleting the incorporated data defeats the learning process.10
II. The Act’s New Requirements Upon Blockchain and AI Businesses
Under the Act, businesses must address new consumer rights relating to the sales of PI to third parties. Almost all of these rights will likely be broadly applied and require detailed discussion.
The Act applies to any organization that conducts business in California and satisfies any of three conditions: (1) has annual gross revenue in excess of $25 million; (2) annually buys, receives for the business’s commercial purposes, sells, or shares for commercial purposes the personal information of fifty thousand or more consumers, households, or devices, alone or in combination; or (3) derives 50 percent or more of its annual revenue from selling consumers’ personal information. The definitions of “consumers” and “sales” under the Act are so broad that, effectively, a business is covered if it collects fifty thousand personal identifiers (names, social security numbers, IP addresses, device identifiers, or many others, as defined below) or has more than half its revenue related to the sale or transfer of that information. These thresholds will be met very quickly by ongoing blockchain and AI businesses. Assuming that a business is covered by the Act, and many will be, the next issue is the broad definition of PI provided by the Act.
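The three-part coverage test described above can be written as a short decision function. This is a rough sketch of the thresholds as summarized in this article, not a statement of how the statute would be applied to any particular business. The function and parameter names are hypothetical, and the hard questions (what counts as a "consumer," a "sale," or "doing business" in California) are assumed to have been answered before the numbers are supplied.

```python
# Thresholds as summarized in the text above.
REVENUE_THRESHOLD_USD = 25_000_000   # "in excess of $25 million"
PI_SUBJECT_THRESHOLD = 50_000        # "fifty thousand or more" consumers, households, or devices
PI_SALES_REVENUE_SHARE = 0.50        # "50 percent or more" of annual revenue


def is_covered_business(
    does_business_in_california,
    annual_gross_revenue_usd,
    pi_subjects_per_year,
    share_of_revenue_from_pi_sales,
):
    """Return True if any one of the three conditions is satisfied.

    All inputs are hypothetical, pre-computed figures; deciding what counts
    toward each of them is a legal question this sketch does not address.
    """
    if not does_business_in_california:
        return False
    return (
        annual_gross_revenue_usd > REVENUE_THRESHOLD_USD
        or pi_subjects_per_year >= PI_SUBJECT_THRESHOLD
        or share_of_revenue_from_pi_sales >= PI_SALES_REVENUE_SHARE
    )


# Hypothetical example: a small startup with modest revenue whose public
# ledger has recorded the device identifiers of 50,000 users.
startup_is_covered = is_covered_business(True, 1_000_000, 50_000, 0.0)
```

Because the conditions are joined by "or," the hypothetical startup is covered on the identifier count alone, despite revenue well under the $25 million threshold.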
PI, under the Act, includes all “information that identifies, relates to, describes, is capable of being associated with, or could reasonably be linked, directly or indirectly, with a particular consumer or household.” Almost every application of a blockchain technology will contain PI in each block. As to AI, there will certainly be applications of the technology that do not gather PI. However, PI specifically includes, but is not limited to: (1) identifiers, such as names, aliases, addresses, and IP addresses; (2) characteristics of protected classifications under California or federal law; (3) commercial information, including records of personal property, products or services purchased, or consuming histories or tendencies; (4) biometric information; (5) Internet or other electronic network activity information, such as browsing history; (6) geolocation data; (7) audio, electronic, visual, thermal, olfactory, or similar information; (8) professional or employment-related information; (9) education information; and finally, (10) any inferences drawn from any of the information identified to create a profile about a consumer. Most AI and nearly all blockchain technologies will gather one or more of these types of PI from a consumer. In addition, each of those blocks will be automatically shared with everyone in the peer-to-peer network, potentially multiplying the responsibilities.
With regard to this PI, all covered businesses must satisfy five new customer rights: (1) the right to know what categories of PI have been collected and the purpose for the collection, which must be disclosed specifically as to each customer on demand; (2) the right to access, which includes copies of all specific pieces of PI collected for the requesting customer; (3) the customer’s right to force companies to delete PI collected from the customer, subject to vaguely defined exceptions;11 (4) the right to opt out of the sale of PI to third parties, where “sale” is broadly defined to include all transfers (including orally or in writing) to a third party “for monetary or other valuable consideration,” subject to a “business purpose” exception;12 and (5) the right to equal service, which means that customers opting out or exercising other rights cannot receive different prices or benefits by virtue of exercising those rights.13 These requirements may serve to bolster some individual privacy rights, but the benefits to the customer may be outweighed by the burden that these new rights impose on firms attempting to provide innovative services using blockchain or AI technologies.
III. The Act’s Detrimental Effects on Blockchain and AI Businesses
Consider a blockchain technology that operates a public distributed ledger of all transactions in a pet food business. That technology would utilize this public ledger to effectuate transactions between the consumer and their third-party payment services. Each exchange of information with the consumer’s payment processors would likely not constitute a “sale” of PI.14 However, each publication of the transaction on the public ledger would constitute a “sale” of the same PI, because it is a publication of the PI for a business purpose. Each ledgered “sale” would thus be subject to deletion or opt-out at the consumer’s demand. However, in the typical public blockchain, neither option would be possible.15 Most blockchain technologies therefore cannot comply with the California Act, and blockchain kibble is likely doomed.16
Similarly, consider a customer service AI, which builds its conversations from the information gathered in thousands of conversations and applies it to a specific consumer conversation in California. If that information was transferred to any third party, that consumer could exercise their rights to deletion and opt-out, which would require the business to tear the conversation out of the technology and prevent any further sales. Moreover, AI technologies are often in conversation with each other, making it even more difficult to meet this requirement. If a company’s AI interacted with that of Google Assistant or Amazon Alexa over the customer service inquiry—which is already likely today—it would be nearly impossible to honor the new consumer rights.
There are also a number of other problems posed by the Act to AI technologies. First, the right to know the purpose behind data collection will require companies to manually review significant algorithmic decisions, raising the overall cost of AI. This raises problems similar to those posed by the requirement under the EU General Data Protection Regulation (GDPR) that companies must have humans review certain algorithmic decisions.17 Both restrictions significantly raise labor costs and, by folding humans into the process, prevent the automation functions that are the point of AI. In addition, the Act is broader than GDPR in that each individual consumer can demand the purposes of their specific PI collections, which magnifies the inefficiencies exponentially.
The right to erasure could also damage AI systems. All AI systems that improve themselves by learning from the data they process necessarily “remember” the data that they used in order to learn. Erasing data that underpins key rules in an AI technology’s behavior will limit its efficacy and may even break the AI.18 Similarly, the right to explanation could reduce AI accuracy because there is inherently a trade-off between accuracy and transparency in algorithmic decisions.19
The regulatory dangers are not only confined to direct customer interaction. PI is defined as a set of specific data types, but also includes information that “relates to” or “describes” a particular consumer or household, “browsing history, search history, and information regarding a consumer’s interaction with an Internet Web site, application, or advertisement,” and “[i]nferences drawn from any of the information identified in this subdivision to create a profile about a consumer reflecting the consumer’s preferences, characteristics, psychological trends, preferences, predispositions, behavior, attitudes, intelligence, abilities, and aptitudes.”20 Because of this definition, blockchain and AI companies doing work with other companies’ PI may be considered “third parties” under the Act. As third parties, AI and blockchain companies would have nearly all of the same responsibilities as a business collecting the PI directly from the consumer.
While the third-party problem can be addressed with specialized contracts in certain circumstances, those contracts
prohibit the entity receiving the information from retaining, using, or disclosing the personal information for any purpose other than for the specific purpose of performing the services specified in the contract for the business, or as otherwise permitted by this title, including retaining, using, or disclosing the personal information for a commercial purpose other than providing the services specified in the contract with the business.21
As noted above, neither AI nor blockchain technologies can silo information in that manner. Therefore, it is unlikely that third-party AI or blockchain companies can circumvent the requirements set out under the Act.
B. Vagueness Kills
The Act appears to substantially increase regulatory risks for AI and blockchain firms. These are complicated rules with sometimes arduous compliance mechanisms. The costs of compliance, especially in new technology areas, will most likely deter investment. And the fines and penalties under the Act will only compound those difficulties. The risks will disproportionately impact smaller innovative firms—as smaller firms typically generate less income than their larger counterparts, the maximum fines end up being proportionally costlier for small companies, which will be even less likely to adopt AI.
Vague rules could also deter companies from using de-identified data. Although the California Act (like GDPR) permits exemptions for de-identified data,22 the overbreadth and vagueness of the rules swallow up this exception. For example, information that is retained in pseudonymously identified profiles is not de-identified. Therefore, in the context of blockchain, most blocks, even when cryptographically hashed, will violate the Act.
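A short sketch may help show why hashing an identifier produces a pseudonym rather than de-identified data. The example is hypothetical throughout: the records, the e-mail addresses, and the function names are invented for illustration, and nothing here is a determination of how the Act would treat any real system. It shows only the technical point that a cryptographic hash is deterministic, so every record tied to the same person stays linked under the same hashed key.

```python
import hashlib
from collections import defaultdict


def pseudonymize(identifier):
    """Replace an identifier with its SHA-256 digest (a stable pseudonym)."""
    normalized = identifier.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def build_profiles(records):
    """Group purchases under the hashed identifier of the purchaser."""
    profiles = defaultdict(list)
    for identifier, purchase in records:
        profiles[pseudonymize(identifier)].append(purchase)
    return dict(profiles)


# Hypothetical purchase records from a pet food ledger.
records = [
    ("pat@example.com", "kibble, 20 lb"),
    ("sam@example.com", "cat litter"),
    ("pat@example.com", "flea medication"),
]
profiles = build_profiles(records)
```

No e-mail address appears in `profiles`, yet both of the first customer's purchases sit under a single key. Anyone who knows or guesses the address can recompute that key and retrieve the whole profile.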
In addition, even if the data were effectively de-identified already, it would have to be re-identified upon consumer request. For example, the FTC indicated in its 2012 report that de-identification is considered successful when there is a reasonable basis to believe that the remaining information in a particular record cannot be used to identify an individual.23 This process will no longer be effective because the FTC’s definition is narrower than PI under the Act. Among other things, the “identity” of an individual is much broader under the Act. Therefore, data that has been de-identified to FTC standards would need to be re-identified for individualized disclosure to requesting consumers under the Act. This will undermine companies’ incentives to process and share de-identified data that could be used to improve AI and blockchain systems, while at the same time driving some firms to process personal data when de-identified data would suffice and, as a result, to incur unnecessary compliance costs and restrict their range of legal uses.
Some in the blockchain and AI industry may think that their efforts can, for now, be considered as mere research, not business. However, this approach will provide no relief because research is actually doubly constrained under the Act. To be retained despite a request for deletion, research must be peer-reviewed scientific research “in the public interest” and must be “[c]ompatible with the [specific business purpose] for which the [PI] was collected.”24 Therefore, research must be tracked and disclosed on demand. Third-party research and analysis and “inferences” made by academics would be treated as PI and subject to opt-out. Therefore, for example, researchers could not utilize blockchain or AI data gathered by a retail business to track the medical or sociological inferences of that retail activity. Some of the best research insights come from looking at old data in a new manner. Similarly, research into additional business applications for the data (for example, marketing research that is accomplished by examining or otherwise processing purchased consumer data) could not be considered either in the “public interest” or for the same purpose that the information was proffered. Thus, the narrow scope of this exception makes such innovative thought nearly impossible.
The Act seriously threatens the ability of AI and blockchain companies to innovate or even operate if they have any contact with California. Companies should start planning for the changes now. As part of that analysis, affected companies might consider pushing for comments and rulemaking from the AG and possible challenges to the Act in court. Among other things, such actions should be focused on the business purpose exceptions to the Act. As discussed above, both blockchain and AI technologies rely on the immutability of their gathered data. Since companies in this space literally make their business about unchangeable data gathered from consumers, perhaps the maintenance of such data should provide a clear exception to the Act’s requirements. The extension of these exceptions should flow from further research and discussion about privacy, data security, and data hygiene as collective concepts, what could be referred to as Complete Information Security. Increasingly, the seemingly disparate concepts of state-sponsored intelligence, online disinformation campaigns, irresponsible or inaccurate reporting, altered data sets, compromised communication accounts, doxing campaigns, intellectual property theft, network security, and many other concepts occupy the same intellectual space. All of these concepts can, at the same time, define aspects of the ongoing modern struggle to maintain pristine old information, integrate new helpful information, and make informed future decisions. Complete Information Security is a new integrative concept I have coined that seeks to define the old struggle to make informed decisions.
Thinking in that integrative manner, perhaps the answer to the larger privacy question is not to force secondary inspection, deletion, and opt-outs upon blockchain and AI in order to externally ensure the proper accuracy and use of data. Instead, perhaps the answer may be to employ these new technologies to build accuracy and use restrictions into the data collection itself: a priori complete information security, not post hoc. After all, we will not need to enforce compliance where the technology guarantees it. Looking forward toward technological innovation bakes integrative privacy and data-security principles into the discussion at the time of collection. In this manner, AI and blockchain technologies might be the best chance to actually effect beneficial changes and to ensure our complete information security.