Under the Hart-Scott-Rodino Antitrust Improvements Act of 1976 (“HSR Act”), certain mergers and other transactions are subject to review by the federal antitrust agencies—namely, the Department of Justice’s Antitrust Division and the Federal Trade Commission’s Bureau of Competition. In some cases, the antitrust agencies issue requests for “additional information and documentary material relevant to the proposed acquisition,”1 generally known as “Second Requests.”

Responding to a Second Request, particularly complying with the document production specifications, can be time-consuming and expensive. And these costs continue to skyrocket because of the accelerating volume of email and other “electronically stored information” (“ESI”).

Two decades ago, a hundred boxes of documents would have been a sizeable production response to a Second Request. Today, a production volume in excess of a thousand boxes would not be unusual.

The collection, review, and production of all this ESI have presented a growing challenge to merging parties and their counsel. The latest tool to manage this discovery is the analytics-based technique known as “predictive coding.”

Prior to this year, the Antitrust Division's Model Second Request required parties to describe only “the search methodologies and the applications used to execute the search,” which arguably excluded predictive coding. In March, however, the agency modified its Model Second Request to require an explanation of predictive coding strategies.2

Even more significantly, the Wall Street Journal reported earlier this year that the Antitrust Division had, for the first time, approved a request to use predictive coding to streamline the determination of which documents needed to be produced in connection with a high-profile merger in the alcohol industry.3

These recent actions by the Antitrust Division are noteworthy because they reflect the transformation of predictive coding into a recognized method of containing compliance costs. All companies seeking approval under the HSR Act should be aware of how predictive coding may substantially reduce expenses.

What is Predictive Coding?

Predictive coding, sometimes known as “technology-assisted review” (“TAR”) or suggestive coding, is a process by which the computer takes the lead in deciding whether documents are responsive to a Second Request.

Unlike traditional keyword-based Boolean searching, predictive coding ranks documents by assigning numerical scores to them. For some predictive coding systems, these scores will reflect an overall relative likelihood of a given document being responsive and/or privileged. For other systems, the scores will reflect the level of similarity of the document to previously identified relevant or privileged material. Furthermore, predictive coding is a learning technology—and this function sets it apart from other analytical techniques.
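The difference between a yes-or-no keyword match and a ranked score can be sketched in a few lines of Python. This is only an illustration of the similarity-based scoring described above, not the method of any particular product: the function names, the sample documents, and the choice of cosine similarity over raw word counts are all assumptions made for the example.

```python
import math
import re
from collections import Counter

def tokens(text):
    """Split text into lowercase words."""
    return re.findall(r"[a-z]+", text.lower())

def boolean_hit(text, required_terms):
    """Keyword search: a document either contains every term or it does not."""
    words = set(tokens(text))
    return all(term in words for term in required_terms)

def cosine_similarity(a, b):
    """Similarity between two word-count vectors, from 0.0 to 1.0."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def rank_by_similarity(documents, known_responsive):
    """Score each document against known-responsive text, highest score first."""
    reference = Counter(tokens(known_responsive))
    scored = [(cosine_similarity(Counter(tokens(d)), reference), d) for d in documents]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)

# Hypothetical documents, invented for illustration.
documents = [
    "Pricing strategy for imported beer in the U.S. market",
    "Lunch menu for the office party",
    "Competitive analysis of import beer pricing and distribution",
]
known_responsive = "beer pricing and competitive strategy for the import market"

ranked = rank_by_similarity(documents, known_responsive)
for score, doc in ranked:
    print(f"{score:.2f}  {doc}")
```

A keyword search such as `boolean_hit(doc, ["beer", "pricing"])` can only include or exclude a document, whereas the ranked list lets reviewers start with the highest-scoring material and decide where to draw the line.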

Generally speaking, the process works like this: Lawyers analyze sample (“seed”) documents, and their determinations are input into the coding engine, which is able to generalize those decisions across the entire collection. The document scores are continually refined through further analysis so that—much like the self-correction of spam filters—the software minimizes disagreement with human reviewers and “learns” how to recognize what the lawyers are seeking.
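The seed-and-refine loop described above might look roughly like the following Python sketch. It substitutes a toy naive Bayes word model for whatever engine a commercial tool actually uses, and every name in it (`SeedTrainedScorer`, `review_round`, `lawyer_decision`) is hypothetical. Sending lawyers the documents the model is least sure about is one possible selection strategy, assumed here for illustration.

```python
import math
import re
from collections import Counter

def tokens(text):
    """Split text into lowercase words."""
    return re.findall(r"[a-z]+", text.lower())

class SeedTrainedScorer:
    """Toy naive Bayes model trained on lawyer-coded documents (illustrative only)."""

    def __init__(self):
        self.word_counts = {True: Counter(), False: Counter()}
        self.doc_counts = {True: 0, False: 0}

    def learn(self, text, responsive):
        """Record one lawyer decision (responsive or not) for a document."""
        self.word_counts[responsive].update(tokens(text))
        self.doc_counts[responsive] += 1

    def score(self, text):
        """Return an estimated probability of responsiveness between 0 and 1."""
        vocab = set(self.word_counts[True]) | set(self.word_counts[False])
        log_odds = math.log((self.doc_counts[True] + 1) / (self.doc_counts[False] + 1))
        for word in tokens(text):
            for label, sign in ((True, 1), (False, -1)):
                total = sum(self.word_counts[label].values())
                # Laplace smoothing keeps unseen words from zeroing out the score.
                p = (self.word_counts[label][word] + 1) / (total + len(vocab) + 1)
                log_odds += sign * math.log(p)
        return 1 / (1 + math.exp(-log_odds))

def review_round(scorer, unreviewed, lawyer_decision, batch_size=2):
    """Lawyers code the documents the model is least sure about; the model retrains."""
    by_uncertainty = sorted(unreviewed, key=lambda d: abs(scorer.score(d) - 0.5))
    batch = by_uncertainty[:batch_size]
    for doc in batch:
        scorer.learn(doc, lawyer_decision(doc))
    return [d for d in unreviewed if d not in batch]

# Hypothetical seed set and collection, invented for illustration.
seed = [
    ("Beer pricing strategy for the U.S. import market", True),
    ("Holiday party catering order", False),
]
collection = [
    "Import beer pricing forecast",
    "Catering invoice for holiday lunch",
    "Strategy memo on beer distribution",
    "Parking garage access form",
]

def lawyer_decision(doc):
    """Stand-in for a human reviewer's call."""
    return "beer" in doc.lower()

scorer = SeedTrainedScorer()
for text, label in seed:
    scorer.learn(text, label)

remaining = review_round(scorer, collection, lawyer_decision)
for doc in remaining:
    print(f"{scorer.score(doc):.2f}  {doc}")
```

In practice the round would be repeated, with each batch of lawyer decisions fed back in, until the scores stop changing in ways that matter to the reviewers.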

Although predictive coding is a relatively new technology, the early returns suggest it may allow for substantial cost savings. Last year, for instance, a study by the RAND Corporation observed that the traditional review process consumes more than 70% of the total costs of a document production. RAND therefore concluded that predictive coding “has the potential to significantly reduce costs without compromising the quality of the assessment when compared with large-scale reviews conducted in the traditional way.”4

While predictive coding is not yet widespread in civil litigation, it is becoming more common. Several recent decisions offer a judicial imprimatur for using the technology to review and produce documents.5

Antitrust Case Study

As noted above, the Antitrust Division approved the first-ever use of predictive coding in an HSR review when it examined the proposed merger between Anheuser-Busch InBev and Grupo Modelo. The agency arrived at a settlement with the parties in April 2013.

As part of this process, the agency requested strategic plans and other competitive data from spirits company Constellation Brands Inc., which the beer makers had lined up as a buyer for assets that would be sold in the deal, and Crown Imports LLC, a joint venture between Modelo and Constellation.

With DOJ’s approval, the companies identified a universe of more than a million documents requiring review. Counsel then loaded the documents into a predictive coding program and reviewed a seed set to train the software. This human review was repeated until the agency and parties were comfortable with the scores assigned to the remaining documents.

Constellation and Crown Imports eventually turned over hundreds of thousands of documents. According to reports, however, the review cost half of what would have been spent using traditional methods to comply with a Second Request.

In the end, Grupo Modelo and AB InBev agreed to sell Grupo Modelo’s U.S. business to Constellation, and the Antitrust Division dropped its challenge to the merger.

The Future

In a February 2013 speech, Deputy Assistant Attorney General Renata B. Hesse observed: “When it works well, predictive coding reduces the document review and production burden on parties while still providing the [antitrust] division with the documents it needs to fairly and fully analyze transactions and conduct.” At the same time, Ms. Hesse cautioned: “Of course, for predictive coding to work for the division, we require a high degree of cooperation and transparency about the implementation and structure of the predictive coding process.”6

Thus, the truism “an ounce of prevention is worth a pound of cure” retains its wisdom as the process for responding to Second Requests continues to evolve.

Now more than ever, it is critical to establish a working relationship based on trust with the antitrust agencies as early in the process as possible. Putting every protocol on the table and determining reasonable ways to reduce burden should be part of every negotiation with the antitrust agencies, from preservation to collection to production.

As with any complex process, the devil is in the details, so careful planning must go into how the potentially responsive population is defined, how the predictive models will be trained, how to address non-assessable documents, and how to ensure the validity of the overall process.
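One way to think about validating the overall process is to have lawyers code a random sample of documents and compare their calls with the software's scores. The Python sketch below assumes that approach and reports recall (the share of truly responsive documents the software would have produced) and precision (the share of produced documents that are truly responsive). The function name, the cutoff, and the sample figures are hypothetical; the measures and thresholds actually used would be a matter for negotiation with the agency.

```python
def recall_and_precision(sample, cutoff):
    """Compare model scores with lawyer calls on a random validation sample.

    sample: list of (model_score, lawyer_says_responsive) pairs.
    cutoff: documents scoring at or above this value would be produced.
    """
    true_pos = sum(1 for score, responsive in sample if score >= cutoff and responsive)
    false_neg = sum(1 for score, responsive in sample if score < cutoff and responsive)
    false_pos = sum(1 for score, responsive in sample if score >= cutoff and not responsive)
    found = true_pos + false_neg
    produced = true_pos + false_pos
    recall = true_pos / found if found else None
    precision = true_pos / produced if produced else None
    return recall, precision

# Hypothetical validation sample, invented for illustration.
validation_sample = [
    (0.92, True),
    (0.81, True),
    (0.64, False),
    (0.40, True),
    (0.22, False),
    (0.08, False),
]

recall, precision = recall_and_precision(validation_sample, cutoff=0.5)
print(f"recall={recall:.2f} precision={precision:.2f}")
```

A real validation sample would need to be far larger than six documents for the estimates to be meaningful.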