Absent a showing of sound methodology, Judge Scheindlin finds identification, self-collection, and keywords to be insufficient.
In her latest opinion in the National Day Laborer matter, Judge Shira A. Scheindlin of the U.S. District Court for the Southern District of New York highlights many of the pitfalls encountered in identifying, collecting, searching, and producing relevant information in the discovery process. Although the case was decided under the demanding standard of the Freedom of Information Act (FOIA), many of its lessons apply in the context of the Federal Rules of Civil Procedure (FRCP). Judge Scheindlin applies insights from cases decided under the FRCP, noting that "much of the logic behind the increasingly well-developed caselaw on e-discovery searches is instructive in the FOIA search context because it educates litigants and courts about the types of searches that are or are not likely to uncover all responsive documents."
In National Day Laborer, the plaintiffs made a request under the FOIA seeking information from five federal agencies concerning the controversial Secure Communities program. In response to the plaintiffs' requests (and a court order), the defendant agencies performed "massive" searches and made voluminous productions. The parties then cross-moved for summary judgment based on whether those searches (and consequently, the productions) were adequate. To support their motions, each defendant agency filed a declaration describing its efforts. Those declarations varied greatly in level of detail and illustrated a multitude of different approaches. In the end, Judge Scheindlin ordered further action by the defendants, the extent of which was driven not only by the accuracy of the searches performed, but also by the degree to which those search efforts were documented and verifiable.
Scope of Required Searches
Much of the opinion focuses on which custodians' files should be searched. Several of the defendant agencies contended that a reasonable search had been performed when the records of those custodians most likely to have responsive information were searched. The court disagreed, noting that the government is "not required to search only the files of the . . . custodians who are 'most likely' to have responsive records; it must also search other locations that are reasonably likely to contain records." Paralleling her findings in Pension Committee and other matters decided under the FRCP, Judge Scheindlin found that the government attorneys should contact potential custodians to determine whether they were involved in the activity at issue and have relevant information. The National Day Laborer opinion also reiterates the duty to follow up with custodians who fail to respond after receiving a request to provide responsive documents. The court noted that it is "plainly improper" to view "a non-response as a 'no records' response." The court also noted that the following inactions may render a search for responsive information inadequate:
- Failure to search the records of former employees
- Failure to search archived records
- Failure to follow through on obvious leads, such as searching the records of individuals copied on key correspondence
- Failure to search, or to describe the extent of the search, for materials in the possession of third parties, including independent contractors retained by the government who were involved in the decision-making process
Document Identification and Review Methodology
In attempting to defend its search for responsive materials as adequate, the Federal Bureau of Investigation (FBI) stated in its declaration only that it had conducted a "manual review" of documents to find relevant records, without describing what that review actually entailed. Judge Scheindlin could not conclude from the phrase "manual review" that every document had been reviewed by human eyes; it was much more likely that the FBI custodians looked through specific categories of documents or searched using keywords. Absent evidence of what specifically had been done to locate relevant information, the court found that the FBI failed to carry its burden under the FOIA standard of showing that its search was reasonably calculated to uncover all relevant records.
Reliance on Custodian Self-Collection Without Appropriate Supervision
Several of the responding government agencies did not monitor custodians but instead relied on them to conduct appropriate searches to gather relevant information. Judge Scheindlin noted that "most custodians cannot be 'trusted' to run effective searches because designing legally sufficient electronic searches in the discovery or FOIA contexts is not part of their daily responsibilities." Quoting her opinion in Pension Committee, Judge Scheindlin reiterated, "'I note that not every employee will require hands-on supervision from an attorney. However, attorney oversight of the process, including the ability to review, sample, or spot-check the collection efforts[,] is important.'" The court will not simply take at face value unsupported assertions that custodians have "designed and conducted a reasonable search." Counsel must take appropriate steps to oversee and monitor the process, documenting each step along the way.
The Limitations of Keyword Searches
A number of defendant agencies provided custodians with lists of keywords to use in searching for relevant information. For the most part, the custodians were asked to run the searches themselves and provide the resulting relevant documents. However, these search terms were not tested, and documents without search-term hits were not sampled. Running parallel to cases decided under the FRCP, the opinion highlights the inadequacies of simple keyword searches and the need for parties using such terms to establish their efficacy. Looking beyond keyword search, the court noted that "parties can (and frequently should) rely on latent semantic indexing, statistical probability models, and machine learning tools to find responsive documents," explaining that "[t]hrough iterative learning, these methods (known as 'computer-assisted' or 'predictive' coding) allow humans to teach computers what documents are and are not responsive to a particular FOIA or discovery request and they can significantly increase the effectiveness and efficiency of searches."
The court further opined that "in order to determine adequacy, it is not enough to know the search terms. The method in which they are combined and deployed is central to the inquiry." Without such evidence, the court could not assess the adequacy of the searches. To rectify any shortcomings in the defendants' responses without wasteful and redundant re-searching, the court ordered the parties to cooperate in crafting search terms and protocols, including methods to test search-term validity.
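For readers who want a concrete picture of what testing search terms can involve, the following is a minimal Python sketch, not a method prescribed by the opinion. It shows two points from the discussion above: the same terms capture different documents depending on whether they are combined with OR or AND, and a sample of the documents with no hits can be reviewed to estimate how much responsive material the terms miss. The sample documents, the function names, and the is_responsive stand-in for a human reviewer's judgment are all hypothetical.

```python
import random
from typing import Callable, Iterable, List

# Hypothetical documents and terms, for illustration only.
DOCS = [
    "Memo on Secure Communities opt-out policy",
    "S-Comm rollout schedule for county jails",
    "Lunch order for Friday",
    "Opt-out questions from state officials",
    "Parking garage notice",
]
TERMS = ["secure communities", "opt-out"]


def hits_any(text: str, terms: Iterable[str]) -> bool:
    """True if the text contains at least one term (terms joined by OR)."""
    lowered = text.lower()
    return any(term.lower() in lowered for term in terms)


def hits_all(text: str, terms: Iterable[str]) -> bool:
    """True if the text contains every term (terms joined by AND)."""
    lowered = text.lower()
    return all(term.lower() in lowered for term in terms)


def sample_null_set(
    docs: List[str],
    matcher: Callable[[str], bool],
    sample_size: int,
    seed: int = 0,
) -> List[str]:
    """Draw a random sample from the documents the search did NOT hit."""
    null_set = [doc for doc in docs if not matcher(doc)]
    rng = random.Random(seed)  # fixed seed so the sample can be reproduced
    return rng.sample(null_set, min(sample_size, len(null_set)))


def missed_rate(sample: List[str], is_responsive: Callable[[str], bool]) -> float:
    """Share of the sampled no-hit documents that a reviewer marks responsive."""
    if not sample:
        return 0.0
    return sum(1 for doc in sample if is_responsive(doc)) / len(sample)


def reviewer(doc: str) -> bool:
    """Stand-in for human review; in practice a person makes this call."""
    lowered = doc.lower()
    return "s-comm" in lowered or "opt-out" in lowered or "secure communities" in lowered


or_sample = sample_null_set(DOCS, lambda d: hits_any(d, TERMS), sample_size=10)
and_sample = sample_null_set(DOCS, lambda d: hits_all(d, TERMS), sample_size=10)
or_missed = missed_rate(or_sample, reviewer)
and_missed = missed_rate(and_sample, reviewer)
```

In this toy set, the OR search misses the document that uses the abbreviation "S-Comm," and the AND search also misses the document that mentions only "opt-out." A record of the terms, how they were combined, the sample drawn, and the review results is the kind of documentation that lets a court assess whether a search was adequate.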
While this case is specific to the high standard imposed by the FOIA, lessons regarding the shortcomings of untested keyword searches, inadequately supervised custodian self-collection, and failure to search in appropriate locations also apply where discovery is governed by the FRCP. It is no less important in that context to have a defensible discovery plan and a sound process for its implementation.