Take Away: This article provides a general summary of some of the various forms of litigation support technologies currently available. Investment in litigation support technology, either up-front or in connection with a particular case, may increase efficiency and therefore reduce the expense of performing document review.
In the years since the 2006 Amendments to the Federal Rules of Civil Procedure, a wide variety of vendors have come to the marketplace with technology solutions aimed at saving companies money by improving the efficiency of electronic discovery. In 2007, the Sedona Conference put out a very helpful and still relevant paper called “Best Practices For the Selection of Electronic Discovery Vendors: Navigating the Vendor Proposal Process.” In this report, Sedona notes: “The number of vendors in the electronic discovery business has ballooned in recent years, and there are now hundreds of companies offering electronic discovery services in one form or another. Many have come to the world of electronic discovery by way of expanding existing services, such as software vendors, litigation support providers, document management experts, or forensic specialists.”
The trend continues. The annual Socha-Gelbmann survey, which formerly served as a widely read and marketed informal ranking of vendors, has tracked the market for the past seven years. The 2009 Socha-Gelbmann Electronic Discovery Survey, which discontinued the rankings, makes the following observation:
These are strange times indeed for the world of electronic data discovery. Many providers have suffered from dwindling revenues, some have failed. Yet as quickly as established providers shrink, disband, or are absorbed by others, new ones appear — with the ranks of EDD [Electronic Data Discovery] providers swelling to more than 600. The new kids in town are arriving with creative, innovative approaches to e-discovery: integrated litigation hold platforms; more efficient and user-friendly data collection software; strong, scalable, sophisticated advanced search tools; platforms capable of handling hundreds of languages; software as a service for nearly everything; and deeper services to deploy and support all these and more.
Both the Sedona RFP paper and the annual Socha-Gelbmann surveys (past and present) are useful tools for companies that may be considering investing in litigation support technology or services. Below, we outline some (but not nearly all) of the types of systems in use today.
Options include traditional client-installed software as well as hosted online software services delivered via the Software-as-a-Service (SaaS) model, formerly known as the Application Service Provider (ASP) model. In the usual case, the installed system requires a greater up-front investment, but expenditures are then limited to maintenance and new releases. Businesses that have a high frequency of litigation and a strong IT department often prefer this model. SaaS offers the benefit of “pay as you go” while also relieving the business of the burden of building and supporting the technology infrastructure.
Within the very wide range of functionality for litigation support systems, two basic functions are data repositories used to store the data during the case (“Discovery Vaults”) and software that allows users to process this data (“Evidence Processing Platforms”). Commercially available litigation support systems often combine the concept of a Discovery Vault with the functionality of an Evidence Processing Platform, although systems that do only one or the other, or aspects of one or the other, are certainly in abundance as well.
A Discovery Vault essentially allows a party to collect all the relevant information from its many systems and store it in a single database. One obvious advantage of a Discovery Vault is increased certainty of ongoing evidence preservation without the business impact associated with preserving the data on the company's active systems. Moreover, a central storage location may aid in the processing of evidence. At the very least, it allows evidence processing tools to be pointed at one system rather than several, each with its own interface.
The ability to clearly establish data authenticity is a “must have” for any Discovery Vault. For example, many systems apply a hashing algorithm to each record before it is loaded into the Vault. Hashing creates a string of letters and numbers (“hash value”) that, for practical purposes, uniquely identifies the record's contents. Any change to the data after this point would result in a different hash value. Thus, hashing results in a “digital fingerprint” useful in demonstrating that a record stored in the Vault is the same one that was pulled from the business system from which it originated. The best practice is to hash the data at the point it is collected, to avoid the potential for chicanery between extraction from the source system and load into the Discovery Vault.
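To make the “digital fingerprint” idea concrete, here is a minimal sketch in Python. It assumes SHA-256 as the hashing algorithm; the article does not name one, and actual products may use a different algorithm. The function names `hash_record` and `verify_record` are hypothetical, chosen only for illustration.

```python
import hashlib


def hash_record(data: bytes) -> str:
    """Return the SHA-256 hash value of a record's raw bytes as hex text."""
    return hashlib.sha256(data).hexdigest()


def verify_record(data: bytes, hash_at_collection: str) -> bool:
    """Check that a stored record still matches its collection-time hash."""
    return hash_record(data) == hash_at_collection


# Hash at the point of collection, before the record leaves the source system.
original = b"Q3 forecast attached. Please review before Friday."
fingerprint = hash_record(original)

# Later, confirm that the copy held in the vault is unchanged.
unchanged = verify_record(original, fingerprint)        # True
altered = verify_record(original + b" ", fingerprint)   # False: one added space
```

Even a single added space yields a different hash value, which is why recording the hash at collection time lets a party later show that the stored copy matches what was originally pulled.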
Hashing is also useful in performing data de-duplication, either before data is loaded into the Discovery Vault, or afterwards, by operation of the Evidence Processing Platform. The gist is that each electronic record (Word document, email, etc.) is compared to all other electronic records by comparing their respective hash values. Exact duplicate files will generate the same hash values. For purposes of litigation, programs that apply the hashing algorithm on a sub-document basis are generally more helpful than those that compare entire documents only. Hashing smaller portions of each document allows for the identification of “near duplicates,” those that are close enough such that only one need be preserved, reviewed, and produced.
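The following Python sketch illustrates both ideas under stated assumptions: exact duplicates are found by grouping records on their whole-file hash, and near duplicates are approximated by hashing each paragraph separately and measuring the overlap. Splitting on blank lines, the SHA-256 algorithm, and the overlap measure are our own illustrative choices, not a description of how any particular product works.

```python
import hashlib
from collections import defaultdict


def group_exact_duplicates(records: dict[str, bytes]) -> list[list[str]]:
    """Group record IDs whose contents produce the same hash value."""
    by_hash: dict[str, list[str]] = defaultdict(list)
    for record_id, content in records.items():
        by_hash[hashlib.sha256(content).hexdigest()].append(record_id)
    return [ids for ids in by_hash.values() if len(ids) > 1]


def paragraph_hashes(text: str) -> set[str]:
    """Hash each non-empty paragraph separately (a sub-document basis)."""
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    return {hashlib.sha256(p.encode("utf-8")).hexdigest() for p in paragraphs}


def similarity(text_a: str, text_b: str) -> float:
    """Share of distinct paragraph hashes that two documents have in common."""
    a, b = paragraph_hashes(text_a), paragraph_hashes(text_b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


records = {
    "doc-1": b"Meeting moved to Tuesday.",
    "doc-2": b"Meeting moved to Tuesday.",
    "doc-3": b"Meeting moved to Wednesday.",
}
duplicate_groups = group_exact_duplicates(records)  # doc-1 and doc-2 match

draft = "Dear Ms. Lee,\n\nThe contract is attached.\n\nRegards,\nSam"
final = "Dear Ms. Lee,\n\nThe contract is attached.\n\nBest regards,\nSam"
score = similarity(draft, final)  # 2 shared paragraphs out of 4 distinct
```

What score counts as “close enough” is a judgment call for the parties and their counsel; any threshold applied to `score` would be an assumption layered on top of this sketch.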
In addition to de-duplication, certain parameters may allow for further “first pass culling” of information. For example, data that is earlier than or later than a certain date may be irrelevant. In some cases, the parties have even agreed upon certain search terms up front, allowing for any document not containing these terms to be culled out before a human reviewer ever has to look at it.
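A first-pass cull of this kind can be sketched in a few lines of Python. The `Record` structure, its field names, and the simple case-insensitive substring match are assumptions made for illustration; real platforms typically support richer term syntax.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Record:
    record_id: str
    sent: date
    text: str


def first_pass_cull(records, start, end, terms):
    """Keep records dated within [start, end] that contain at least one term."""
    lowered = [t.lower() for t in terms]
    kept = []
    for r in records:
        in_range = start <= r.sent <= end
        has_term = any(t in r.text.lower() for t in lowered)
        if in_range and has_term:
            kept.append(r)
    return kept


records = [
    Record("r1", date(2008, 3, 14), "Pricing for the Acme contract"),
    Record("r2", date(2005, 1, 2), "Old Acme contract notes"),  # outside dates
    Record("r3", date(2008, 6, 1), "Lunch on Friday?"),         # no agreed term
]
survivors = first_pass_cull(
    records, date(2007, 1, 1), date(2009, 12, 31), ["acme", "pricing"]
)
survivor_ids = [r.record_id for r in survivors]  # only r1 remains
```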
The best Evidence Processing Platforms provide a variety of centralized searching options that can be consistently applied to a heterogeneous record set. For example, some tools will allow the user to search data, regardless of its format, by any number of criteria, such as custodian, date range, record type, key word, proximity, concept searching, and other means.
More sophisticated tools also include proprietary algorithms and graphical displays to speed the process of homing in on relevant data by grouping and describing data automatically. Such tools may, for example, group data into pre-set categories such as all documents that may be privileged attorney/client communications. Such tools also typically allow users to group data based upon criteria the user supplies, for example, all documents containing a certain key word. Analytics performed by some systems can highlight interesting patterns, such as the magnitude of email chatter related to a particular individual or a particular key word over time.
At some point in the process, an actual human needs to eyeball the remaining (non-culled) documents and decide what to do with them. Convenient data review interfaces, note taking, and redaction functionality can therefore have a big impact on the efficiency of personnel involved in performing this legal review. Many systems allow for the creation of “review sets” to manage the activities of a team of reviewers. Users are provided a window to view the electronic document, a place to make notes, and the ability to perform redaction. Most systems provide several note taking and marking options, including text boxes for comments as well as a quick means to designate a document or a group of documents as belonging to a particular category. For example, users may be provided with radio buttons for “relevant” or “not relevant.” Likewise, check boxes may allow a reviewer to quickly mark a document or group of documents “privileged.” A standard feature is the ability of an administrative user to set up case specific categories for reviewers to apply in their review. Some systems allow users to perform redaction on the fly, while others allow users to designate documents for redaction through a collective batch process.
After each electronic document has been processed and a party is ready to produce its discovery responses to the requesting party, one final step remains. A production set must be generated in an acceptable format and delivered via acceptable media. In some cases, this may simply mean printing out hard copies of each electronic document. The electronic equivalent is to provide an image of each document, in either PDF or TIFF format. If metadata must be produced, a “load file” can be generated, which pairs the image of each document with a flat file containing the relevant metadata. Finally, if native production will be made, the actual Word documents and other files are simply copied verbatim and delivered “as is.”
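As a rough illustration of the flat-file portion of a load file, the Python sketch below writes one delimited row of metadata per produced document, with a column pointing at the corresponding image. The field names, the pipe delimiter, and the sample values are hypothetical; in practice the layout is dictated by the receiving party's review platform or by agreement between the parties.

```python
import csv
import io

# Hypothetical metadata fields; a real production would use the agreed set.
FIELDS = ["doc_id", "custodian", "date_sent", "image_path"]


def build_load_file(documents: list[dict]) -> str:
    """Return delimited text with one metadata row per produced document."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=FIELDS, delimiter="|", lineterminator="\n"
    )
    writer.writeheader()
    for doc in documents:
        # Missing fields are written as empty values rather than omitted.
        writer.writerow({name: doc.get(name, "") for name in FIELDS})
    return buffer.getvalue()


documents = [
    {"doc_id": "ABC0000001", "custodian": "J. Smith",
     "date_sent": "2008-03-14", "image_path": "IMAGES/ABC0000001.tif"},
    {"doc_id": "ABC0000002", "custodian": "R. Jones",
     "date_sent": "2008-03-15", "image_path": "IMAGES/ABC0000002.tif"},
]
load_file_text = build_load_file(documents)
```

Each row ties a document identifier to both its metadata and the location of its image, which is what allows the receiving side to load the two together.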
Typically, each produced document must also be marked with a unique identifier, commonly referred to as a “Bates stamp.” In the case of native production, one option some have proposed is to produce a sub-set of the hash value for each document, which would avoid the need to modify the original by electronically “stamping” a Bates number on it.
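The two identification approaches can be compared in a short Python sketch. The prefix, the number width, the choice of SHA-256, and the 12-character length of the shortened hash are all assumptions made for illustration.

```python
import hashlib


def bates_number(prefix: str, sequence: int, width: int = 7) -> str:
    """Format a sequential identifier such as ABC0000001."""
    return f"{prefix}{sequence:0{width}d}"


def hash_identifier(content: bytes, length: int = 12) -> str:
    """Use the leading characters of a file's hash value as its identifier."""
    return hashlib.sha256(content).hexdigest()[:length].upper()


first_stamp = bates_number("ABC", 1)  # "ABC0000001"

# For native production, the identifier is derived from the file itself,
# so nothing has to be stamped onto (and thereby alter) the original.
native_file = b"contents of a native Word document"
native_id = hash_identifier(native_file)
```

Because a shortened hash is not guaranteed to be unique, a party using this approach would want to confirm that no two documents in the production set share an identifier, and lengthen the sub-set if they do.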
Click here to view a graphical representation of the various systems that may potentially be implicated in a hypothetical case.