Anyone engaged or interested in the electronic disclosure process should pay close attention to the landmark decision handed down earlier this year by the English High Court in Pyrrho Investments Ltd v MWB Property Ltd (‘Pyrrho’),1 the first English decision to approve the use of predictive coding software.
What is Predictive Coding?
Electronically stored information (‘ESI’) often constitutes most, if not all, of the information held by parties to litigation. The digital nature of our lives and businesses leads to large volumes of ESI being created, duplicated and stored in a variety of formats, locations and jurisdictions.
Parties to litigation may face difficulties in determining how much ESI they have or where it is located, or they may simply have too much to sift through. In past years, parties have relied on simple keyword or Boolean searches (using operators such as AND, OR and NOT in a search) to sift through data, but these search methods can be unreliable, prone to human error, and under- or over-inclusive. Predictive coding is a powerful tool that seeks to address these issues. It has the potential to augment human decision making with computer-aided pattern recognition.
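The weakness of Boolean searching can be shown with a short Python sketch. The query, the documents and the function names below are hypothetical and are not drawn from Pyrrho; they simply show how a fixed query can both miss a relevant document and catch an irrelevant one.

```python
import re


def words(text):
    """Return the set of lower-case alphabetic words in the text."""
    return set(re.findall(r"[a-z]+", text.lower()))


def matches(text):
    """Hypothetical query: lease AND (rent OR arrears) AND NOT newsletter."""
    w = words(text)
    return "lease" in w and ("rent" in w or "arrears" in w) and "newsletter" not in w


# Illustrative documents only (assumed for this example).
relevant_hit = "Rent arrears under the lease now exceed six months."
relevant_miss = "The tenant is behind on payments under the tenancy agreement."
irrelevant_hit = "Staff offer: lease a car and park rent-free for a month."

for doc in (relevant_hit, relevant_miss, irrelevant_hit):
    print(matches(doc), "|", doc)
```

The second document concerns the same subject matter but uses different vocabulary, so the query misses it (under-inclusive). The third shares the keywords but not the subject, so the query returns it (over-inclusive).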
Predictive coding (also known as technology-assisted review, or TAR) is computer software that can be trained to perform searches and assess the relevance of ESI. At the outset, it is critical to understand that predictive coding is meant to enhance the efficiency of manual review, not replace human review entirely. In most reviews using predictive coding software, no document is disclosed without a lawyer reviewing it manually. The software instead performs an initial ranking of documents based on the likelihood that each document will be relevant. In theory, then, the documents most likely to be relevant will be front-loaded; the review is meant to stop when the reviewers reach the group of documents deemed less likely to be relevant. This approach is, at least in theory, a vast improvement over so-called “linear review,” in which reviewers simply look at each document one by one without regard to its likely responsiveness.
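The ranking step described above can be sketched in a few lines of Python. This is a minimal illustration only: it assumes a small seed set already coded by a lawyer and uses a simple naive-Bayes-style word weighting. It is not the method of any particular predictive coding product, and the function names and sample documents are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def train(seed_set):
    """Learn a log-odds weight per word from (text, is_relevant) pairs coded by a lawyer."""
    relevant, irrelevant = Counter(), Counter()
    for text, is_relevant in seed_set:
        target = relevant if is_relevant else irrelevant
        target.update(set(tokenize(text)))  # count each word once per document
    n_rel = sum(1 for _, is_relevant in seed_set if is_relevant)
    n_irr = len(seed_set) - n_rel
    weights = {}
    for word in set(relevant) | set(irrelevant):
        # Add-one smoothing avoids taking the logarithm of zero.
        p_rel = (relevant[word] + 1) / (n_rel + 2)
        p_irr = (irrelevant[word] + 1) / (n_irr + 2)
        weights[word] = math.log(p_rel) - math.log(p_irr)
    return weights


def score(weights, text):
    """Sum the learned weights of the distinct words in a document; unseen words count as 0."""
    return sum(weights.get(word, 0.0) for word in set(tokenize(text)))


def rank(weights, documents):
    """Order documents from most to least likely to be relevant."""
    return sorted(documents, key=lambda doc: score(weights, doc), reverse=True)


# Hypothetical seed set: True means the lawyer coded the document as relevant.
seed_set = [
    ("lease renewal dispute over the property", True),
    ("rent arrears on the property lease", True),
    ("office party on friday", False),
    ("lunch menu for friday", False),
]
unreviewed = [
    "friday lunch plans",
    "draft lease for the property",
    "quarterly newsletter",
]

weights = train(seed_set)
for doc in rank(weights, unreviewed):
    print(round(score(weights, doc), 2), doc)
```

Reviewers would then work down the ranked list from the top. In practice the model is usually retrained as further documents are coded, and the point at which review stops is a matter for agreement between the parties or direction by the court.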
The Facts & Decision
In Pyrrho, the parties had been ordered to give standard disclosure, i.e. to disclose those documents supporting or adversely affecting the case of any of the parties to the dispute. However, it came to light that the Claimants possessed over three million potentially relevant electronic files. In view of the enormous expense of manually searching through these documents, the parties agreed that predictive coding ought to be used but felt it necessary to seek formal approval from the Court.
Master Matthews stated that what fundamentally matters in the disclosure process is the scope and quality of the search, rather than the listing and production for inspection of the relevant documents found. Master Matthews listed ten factors that favoured the use of predictive coding technology in this case:
- Experience in other jurisdictions has been that predictive coding software can be useful in appropriate cases.
- There is no evidence that predictive coding leads to less accurate disclosure than manual review, and there is some evidence that it is more accurate.
- The use of a computer to apply the approach of a senior lawyer towards the initial sample will result in greater consistency in the review compared to using multiple lower-grade fee earners each independently applying the relevant criteria.
- There is nothing in the CPR or Practice Directions to prohibit the use of such software.
- The number of electronic documents in the case before him was huge, amounting to over 3 million.
- The cost of manually searching the documents would be enormous, amounting to several million pounds at least. Thus a full manual review would be unreasonable within paragraph 25 of Practice Direction 31B because a suitable automated alternative exists at lower cost.
- The cost of predictive coding software depends on various factors but the estimates in the case were obviously far less than the cost of the full manual alternative. However, there may be some manual review to be carried out once the software had done all it could.
- The value of the claims made in the case was in the tens of millions of pounds. Thus the estimated costs were proportionate.
- The trial is not to be heard until June 2017. This would provide plenty of time to consider other disclosure methods if predictive coding turned out to be unsatisfactory.
- The parties agreed on the use of the software and how to use it.
Taking all these factors into account, it was held that the use of predictive coding was suitable in this case and would promote the overriding objective of dealing with cases justly and at proportionate cost.2
Ultimately, many aspects of a case are relevant in determining whether predictive coding is suitable, such as cost, time and party agreement. In this case, the Court came down on the side of predictive coding. Future computing developments may reduce the cost of predictive coding, while further refinement of the technology may produce more powerful data analytics that improve the precision and accuracy of its results, making it an even more attractive option.
The decision in Pyrrho – handed down on 16th February 2016 – saw both parties in agreement as to the suitability of predictive coding. What if parties do not agree? It is not hard to imagine a case where parties disagree over the use of predictive coding given its novelty. The Court will resolve any such dispute in light of the overriding objective. In addition, there are matters where the types of documents involved or the issues at play make predictive coding a less feasible option.
A more difficult case for the Court would be one where the costs of manual review and predictive coding are the same, or similar, and the parties disagree. Assuming there are large volumes of documents, how should that disagreement be resolved? If the studies referred to in Pyrrho are to be believed,3 and predictive coding is, in fact, more accurate than human review, does justice demand that it be used as a matter of course? Certainly such an argument can be made.