Like most complex commercial litigation, class actions typically require parties to preserve, collect, review and produce large volumes of data. In addition to adding legal costs and requiring the active involvement of in-house counsel and information technology staff, compliance with disclosure obligations can have a significant organizational impact on executives and other senior business leaders whose work may be affected by identification and collection activities. Furthermore, failing to meet obligations to preserve and produce relevant data can create significant legal and reputational risk.

There are of course important differences between class actions and other complex litigation. For one, class actions typically focus on the defendant’s conduct. As a result, the burden of complying with disclosure obligations falls almost entirely on the defendant, and so class counsel rarely have an incentive to limit the breadth of their disclosure demands. As well, in a class action, the obligation to make production of documents does not arise until after the case has been certified, which can take years. Managing documents in a class action can therefore be complex and costly. However, it is also an area where good project management can have a significant impact on managing litigation costs.

In this blog post, we examine two important elements of managing the costs of document production in a class action: (1) defining the scope through data mapping and identification, and (2) designing a defensible and cost-effective process for collecting and reviewing documents that takes advantage of recent technological innovations in e-discovery.

Data Mapping and Identification

An accurate understanding of an organization’s potentially relevant data is essential to meeting preservation and disclosure obligations and managing the costs of a class action because that understanding will help define the scope of document-related activities.

An up-to-date data map which describes the organization’s major sources of data can greatly assist in forming this understanding. At a minimum, for each source of data, such as an e-mail system or file server, a data map should describe what it contains, where it is located, and who is responsible for it.
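As a rough illustration only, the minimum contents of a data map described above (what a source contains, where it is located, and who is responsible for it) can be captured in a very simple structure. The Python sketch below is an assumption about how such a map might be recorded; the field names, the helper function and the sample entries are all hypothetical and are not drawn from any particular organization or tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataSource:
    """One entry in a data map (field names are illustrative assumptions)."""
    name: str      # the source itself, e.g. an e-mail system or file server
    contents: str  # what the source contains
    location: str  # where the source is located
    owner: str     # who is responsible for the source


def sources_owned_by(data_map, owner):
    """Return the names of the sources a given person or team is responsible for."""
    return [source.name for source in data_map if source.owner == owner]


# Hypothetical entries, for illustration only.
DATA_MAP = [
    DataSource("corporate e-mail", "employee mailboxes",
               "cloud tenant", "IT messaging team"),
    DataSource("finance file server", "budgets and invoices",
               "head office data centre", "IT infrastructure team"),
    DataSource("mail archive", "journaled e-mail",
               "cloud tenant", "IT messaging team"),
]

print(sources_owned_by(DATA_MAP, "IT messaging team"))
```

Even a map this simple lets counsel quickly work out which IT contacts to involve when a particular source becomes relevant. In practice a data map would likely record more detail, such as retention periods and backup arrangements.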

Once a class action has been commenced, a data map can facilitate the process of identifying custodians within the organization who may have relevant data. This identification process typically involves interviews with individuals with potential knowledge of the matters at issue in the class action to more precisely determine:

  • the nature of the data (i.e., how might it be relevant?)
  • the location of the data, including any copies, such as backups
  • the device or medium (desktop, laptop, mobile device, network-connected storage, cloud storage)
  • the age of the data
  • the volume of the data
  • the number of records reflected in the data
  • the format of the data
  • the time span covered by the data
  • the business purpose of the data

Much of the value of data mapping and identification comes from improved litigation cost management. With a detailed understanding of the organization’s potentially relevant data, outside counsel are able to provide a much more precise estimate of the cost of downstream steps, such as document review. This understanding also arms outside counsel with the information necessary to approach class counsel, and if necessary the court, during the mandatory discovery planning stage to argue for reductions in the scope of production on the grounds that broad production will be disproportionately costly.

It is therefore crucial to engage both IT staff and custodians, particularly senior business leaders, during the identification process. If custodians do not participate meaningfully in the identification process, there is a risk of either over-collection or under-collection of potentially relevant data. Over-collection results from custodians failing to give precise information regarding their potentially relevant data and leads to wasted time, effort and cost in collecting and reviewing irrelevant data. Conversely, under-collection occurs when custodians fail to disclose potentially relevant sources of data, creating a risk of unwelcome surprises and possibly sanctions for failure to comply with disclosure obligations.

As we’ve noted, an important distinction between class actions and other complex litigation is that the document production stage can be delayed, sometimes by many years, while the parties contest the suitability of the case for certification as a class action. There is a natural and understandable desire to defer document identification activities until after certification, but there are also good reasons not to delay. Identifying potentially relevant data can be a valuable element of blueprinting the defence by facilitating estimates of defence costs. As well, delay increases the risk that important data will be deleted or lost as a result of the passage of time. A potential option is to engage in limited identification at the outset of the case, with a view to expanding that effort if and when the class action gets to the document production phase.

Cost-Effective and Defensible Process Design for Document Collection and Review

After potentially relevant data has been identified and preserved, it must be collected, reviewed and produced. There are opportunities to reduce costs during both collection and review through careful project management.

Data collection represents a significant opportunity for cost savings through in-sourcing. As internal IT staff will have the best understanding of the organization’s data and the technical means necessary to collect it, integrating them into the collection process can create significant efficiencies, provided that they undertake the activity in a legally defensible fashion.

Following collection, documents must be assessed for relevance (is this within the scope of what we must disclose?) and privilege (are there grounds to assert the document is privileged?). Optionally, the review process may consider significance: is this document likely to matter in the case? A significance review requires more effort, but it can reduce costs over the course of the class action by early identification of the documents that actually matter.

The costs of document review can be reduced by effectively leveraging teams of people and technological innovation. Increasingly, large document review projects employ specialized teams of lawyers or other professionals. These teams can be cost-efficient because the team members are well-versed in the processes and technologies used for document review and can be both more efficient and less costly than non-specialized reviewers.

A related and growing trend is to outsource review to a third-party service provider that uses contract lawyers, non-lawyers with document review experience, or foreign-trained lawyers working in lower-cost jurisdictions to conduct the document review. Although outsourcing can further reduce costs, there can also be risks related to protection of private or confidential information, maintenance of privilege, and the adequacy of quality control. Generally, a process that balances these competing considerations will employ one or two levels of review by specialized review teams.

Another opportunity for cost savings is the use of sampling. Many class actions involve allegations of systematic conduct that breaches a duty said to be owed to a class of people. Whether or not such conduct took place in a systematic fashion is a question that can potentially be proved or disproved by examining only a representative sample of relevant documents. As a sampling approach expressly contemplates that not all relevant documents will be produced, it necessarily requires an agreement with class counsel or approval by the court.
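To give a sense of scale, the sketch below shows one common statistical way of sizing and drawing a simple random sample. It is an assumption on our part rather than a prescribed method: it uses Cochran's sample size formula with a finite population correction, and the 95% confidence level (z = 1.96), 5% margin of error and function names are illustrative defaults. What counts as a "representative" sample in a given case remains a matter for agreement with class counsel or approval by the court.

```python
import math
import random


def sample_size(population, margin=0.05, z=1.96, p=0.5):
    """Cochran's sample size formula with finite population correction.

    margin: acceptable margin of error (0.05 means +/- 5 percentage points)
    z:      z-score for the confidence level (1.96 is roughly 95%)
    p:      assumed proportion; 0.5 is the most conservative choice
    """
    n0 = (z ** 2) * p * (1 - p) / (margin ** 2)
    n = n0 / (1 + (n0 - 1) / population)
    return min(population, math.ceil(n))


def draw_sample(doc_ids, size, seed=0):
    """Draw a simple random sample without replacement.

    A fixed seed makes the selection reproducible, which helps if the
    sampling process later has to be explained or repeated.
    """
    rng = random.Random(seed)
    return rng.sample(doc_ids, size)


doc_ids = list(range(100_000))  # hypothetical document identifiers
n = sample_size(len(doc_ids))
sample = draw_sample(doc_ids, n)
print(n, len(sample))
```

Under these assumptions, a population of 100,000 documents calls for a sample of only a few hundred, which is where the potential cost saving comes from.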

Computer-assisted review (CAR) is increasingly used to reduce costs in large cases, including class actions. CAR covers a number of technologies that expedite the organization and prioritization of documents by:

  • identifying duplicate documents so that only one copy is reviewed (exact de-duplication)
  • identifying nearly duplicate documents so that they can be reviewed together (near de-duplication)
  • grouping e-mails into chains so that they can be reviewed together (e-mail threading)
  • identifying e-mail chains so that e-mails included within a larger chain are not reviewed (e-mail thread suppression)
  • identifying documents that are conceptually similar so they can be reviewed collectively (document clustering)
  • classifying documents using statistical analysis (predictive coding)
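Exact de-duplication is the simplest of these technologies to illustrate. The sketch below assumes that two documents are "exact duplicates" when their raw content produces the same SHA-256 hash; that definition, the function name and the sample documents are our own illustrative choices. Commercial review platforms may instead hash selected fields, particularly for e-mail, so this should be read as a simplified picture of the idea rather than a description of any specific product.

```python
import hashlib


def deduplicate(documents):
    """Keep the first document seen for each distinct content hash.

    documents: iterable of (doc_id, content_bytes) pairs.
    Returns (unique_ids, duplicates), where duplicates maps each
    dropped doc_id to the doc_id of the copy that was retained.
    """
    first_seen = {}   # content hash -> id of the retained copy
    unique_ids = []
    duplicates = {}
    for doc_id, content in documents:
        digest = hashlib.sha256(content).hexdigest()
        if digest in first_seen:
            duplicates[doc_id] = first_seen[digest]
        else:
            first_seen[digest] = doc_id
            unique_ids.append(doc_id)
    return unique_ids, duplicates


# Hypothetical documents, for illustration only.
docs = [
    ("DOC-001", b"Quarterly pricing memo"),
    ("DOC-002", b"Board minutes"),
    ("DOC-003", b"Quarterly pricing memo"),  # byte-for-byte copy of DOC-001
]
unique_ids, duplicates = deduplicate(docs)
print(unique_ids)
print(duplicates)
```

Recording which retained copy each duplicate maps to, rather than simply discarding duplicates, preserves the information needed to show later which custodians held a copy of the document.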

Each of these CAR technologies provides opportunities for cost savings by:

  • Reducing the number of documents requiring review. Exact de-duplication has long been used to avoid review of multiple copies of the same document, but newer analytical technologies can further reduce costs. For example, e-mail thread suppression can reduce the volume of e-mails to be reviewed by 10-30%, with a commensurate cost saving. Predictive coding can also be used to identify documents that statistical analysis suggests are very likely to be irrelevant, again reducing what needs to be examined by a human reviewer.
  • Increasing reviewer productivity. Grouping similar or related documents together through e-mail threading and clustering potentially allows reviewers to make faster decisions and apply them to groups of documents.
  • Reducing quality control costs by improving consistency. Near de-duplication can be invaluable during a review for privileged documents by allowing reviewers to identify similar documents that are also privileged, reducing the risk of inconsistency and inadvertent disclosure of privileged documents. Similarly, technologies that group similar or related documents together reduce the risk of inconsistent judgment calls across the review team, reducing quality control costs.
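The idea behind e-mail thread suppression can be sketched in a few lines. The example below is a deliberately simplified assumption: it treats an e-mail as suppressible when its text, after stripping quote markers and normalizing whitespace and case, appears in full inside a longer e-mail. The function name and sample messages are hypothetical. Real threading tools are considerably more sophisticated and would also have to account for attachments and edited quotations, which this sketch ignores.

```python
def suppress_contained(emails):
    """Split e-mails into those to review and those that can be suppressed.

    emails: dict mapping message id -> body text.
    An e-mail is suppressed when its normalized text is wholly contained
    in a longer e-mail, on the assumption that reviewing the longer
    e-mail also covers the shorter one.
    """
    def normalize(body):
        # Drop ">" quote markers, collapse whitespace, ignore case.
        return " ".join(body.replace(">", " ").split()).lower()

    normalized = {eid: normalize(body) for eid, body in emails.items()}
    to_review, suppressed = [], []
    for eid, text in normalized.items():
        contained = bool(text) and any(
            other_id != eid and len(other) > len(text) and text in other
            for other_id, other in normalized.items()
        )
        (suppressed if contained else to_review).append(eid)
    return to_review, suppressed


# Hypothetical messages, for illustration only.
emails = {
    "m1": "Can we meet Tuesday?",
    "m2": "Tuesday works.\n> Can we meet Tuesday?",
    "m3": "Booked.\n> Tuesday works.\n> > Can we meet Tuesday?",
    "m4": "Lunch on Friday?",
}
to_review, suppressed = suppress_contained(emails)
print(to_review)
print(suppressed)
```

Here only the final e-mail in the chain and the unrelated message are left for review. The pairwise comparison is kept simple for readability; it would be too slow for large collections, where e-mails would first be grouped into threads.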

In terms of defensibility, certain technologies, such as de-duplication, are well-accepted by opposing counsel and the courts. More cutting-edge technologies, such as predictive coding, may be more controversial, depending on how they are used. Familiarity with rapidly evolving trends in technology, e-discovery practice and case law is essential to designing a process that is likely to withstand challenge within the class action.

Effectively integrating CAR into the document review process in a class action requires careful planning and a high level of legal and technical expertise to ensure that it is cost-effective and defensible. For example, a failure to use technological tools correctly can result in relevant documents being excluded from human review and production. The costs to remediate this non-disclosure can easily outweigh the efficiencies of using the technology in the first place. The solution is to integrate quality control processes into the review plan at the outset so that quality issues with either reviewers or technology can be identified and addressed at an early stage.
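One way to build such a quality control check into a review plan is to have human reviewers examine a random sample of the documents that the technology excluded from review, and to measure how many turn out to be relevant. The sketch below assumes this approach; the function name and parameters are hypothetical, and `is_relevant` stands in for a human reviewer's decision on each sampled document. The acceptable rate of missed documents is a judgment call that the sketch does not attempt to answer.

```python
import random


def estimate_missed_rate(excluded_ids, is_relevant, sample_size, seed=0):
    """Estimate the share of excluded documents that are actually relevant.

    excluded_ids: list of ids the technology excluded from human review
    is_relevant:  callable standing in for a human reviewer's decision
    sample_size:  number of excluded documents to check by hand
    """
    size = min(sample_size, len(excluded_ids))
    if size == 0:
        return 0.0
    rng = random.Random(seed)  # fixed seed so the check is reproducible
    sample = rng.sample(excluded_ids, size)
    hits = sum(1 for doc_id in sample if is_relevant(doc_id))
    return hits / size


# Hypothetical data: 1,000 excluded documents, of which ids 0-49 are
# treated as relevant for the purposes of the illustration.
excluded = list(range(1000))
rate = estimate_missed_rate(excluded, lambda doc_id: doc_id < 50, 200)
print(f"Estimated share of relevant documents missed: {rate:.1%}")
```

Running a check of this kind early, and repeating it as the review progresses, gives the team a chance to adjust the process before the cost of remediation grows.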


Managing documents is a major sub-project, or series of sub-projects, within a class action, and can be a significant driver of overall costs. Scope definition through data mapping and identification, together with effective process design that leverages both teams of people and technology, can materially assist in the management of class actions by improving cost estimates, creating opportunities for cost reductions and keeping the class action project on budget.