Takeaway: All the technology and complex algorithms in the world can’t replace the human element when it comes to finding relevant data through searches. It pays to work closely with custodians and opposing counsel to craft the proper search terms.
In the beginning there was electronic data, and when a party needed to produce relevant documents as part of litigation, its attorney would collect the electronic data, review it, and produce whatever was proper. This was not a big change from the paper world, where someone would go to the filing cabinet, grab the corresponding file, and make copies for the attorney to review.
As the use of technology increased and the cost of digital media sharply decreased, the amount of data being stored grew at an ever-increasing rate. Smart organizations responded by using keyword search terms to limit the amount of data the attorney needed to review, thus saving themselves some money in legal fees. Typically these were crude search terms, but they did a decent job of “culling” down the amount of data to be reviewed.
For example, if you were involved in litigation over the construction of Washington Elementary School, you might search through the custodian’s files for the words “Washington” and “School”, or possibly “construction”, thus avoiding having to review documents that have nothing at all to do with the case. This is still somewhat crude, however, as you would miss any emails or documents that referred only to the “building”, abbreviated Elementary School as “ES”, or used one of the many other terms individuals might use to refer to the school.
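To make the limitation concrete, here is a minimal sketch of keyword culling in Python. The sample emails, the `cull` helper, and the choice to keep a document when it hits any one of the terms are our own assumptions for illustration, not a description of any particular review tool.

```python
import re


def matches_any_term(text, terms):
    """Return True if any term appears in the text as a whole word (case-insensitive)."""
    for term in terms:
        if re.search(r"\b" + re.escape(term) + r"\b", text, flags=re.IGNORECASE):
            return True
    return False


def cull(documents, terms):
    """Keep only the documents that hit at least one search term."""
    return {doc_id: text for doc_id, text in documents.items()
            if matches_any_term(text, terms)}


# Hypothetical custodian emails, invented for this example.
documents = {
    "email-1": "Schedule update for the Washington Elementary School project.",
    "email-2": "The building inspection failed again on Friday.",
    "email-3": "Change order for the ES roof attached.",
    "email-4": "Lunch on Thursday?",
}

terms = ["Washington", "School", "construction"]
kept = cull(documents, terms)
print(sorted(kept))  # ['email-1']
```

The search correctly drops the lunch invitation, but it also drops email-2 and email-3, which are about the school and simply never use the chosen words.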
So, as the amount of data continued to grow, the idea of concept searching came to the forefront as a potential solution. With concept searching, instead of having to think of all the different words an individual might use when writing about a school construction project, you let the technology search for the concept of school construction in documents, not just the exact words. Now, when someone refers to the building or the project, those documents will be recognized as relevant. You don’t have to know all the words, nor search for every word that might be used to refer to the same idea. (A simpler example might be litigation involving a sports league. Concept searching would return any mention of any team in the league, as opposed to requiring a keyword search for each team individually.)
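As a rough illustration of the idea, the sketch below expands one concept into a list of related terms before searching. This is a toy stand-in: the `CONCEPTS` map is hand-built and hypothetical, whereas real concept search tools typically work out such relationships from the documents themselves rather than from a fixed list.

```python
import re

# Hypothetical, hand-built concept map, assumed purely for illustration.
CONCEPTS = {
    "school construction": ["school", "construction", "building", "project", "ES"],
}


def concept_search(documents, concept):
    """Return the ids of documents that mention any term tied to the concept."""
    terms = CONCEPTS[concept]
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(t) for t in terms) + r")\b",
        re.IGNORECASE,
    )
    return sorted(doc_id for doc_id, text in documents.items()
                  if pattern.search(text))


# Hypothetical custodian emails, invented for this example.
documents = {
    "email-1": "Schedule update for the Washington Elementary School project.",
    "email-2": "The building inspection failed again on Friday.",
    "email-3": "Change order for the ES roof attached.",
    "email-4": "Lunch on Thursday?",
}

hits = concept_search(documents, "school construction")
print(hits)  # ['email-1', 'email-2', 'email-3']
```

Here the emails that mention only the building or the “ES” abbreviation are found, without the searcher having to list each word in the query.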
Of course, while both keyword and concept searching will help reduce the amount of document review you need to do, they are still far from perfect. In our example above, if your client is a construction company that specializes in school construction and is located in the state of Washington, your search process is still going to result in volumes of unrelated documents being loaded for review. In order to be effective, your search criteria would have to be much more complicated than the simple terms we’ve used here.
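One way criteria can become more complicated is a proximity rule, where a term only counts if it appears near a related term. The sketch below is one possible approach under our own assumptions: the `within_window` helper, the two-word window, and the sample documents are all hypothetical.

```python
import re


def within_window(text, anchor, companions, window=2):
    """Return True if `anchor` occurs within `window` words of any companion term."""
    tokens = re.findall(r"[a-z]+", text.lower())
    anchor_positions = [i for i, tok in enumerate(tokens) if tok == anchor]
    companion_positions = [i for i, tok in enumerate(tokens) if tok in companions]
    return any(abs(a - c) <= window
               for a in anchor_positions
               for c in companion_positions)


# Hypothetical documents from a school builder based in Washington state.
documents = {
    "doc-a": "Washington Elementary School retaining wall drawings attached.",
    "doc-b": "We are the leading school construction firm in Washington state.",
    "doc-c": "Punch list for Washington ES is overdue.",
}

companions = {"elementary", "school", "es"}
hits = sorted(doc_id for doc_id, text in documents.items()
              if within_window(text, "washington", companions))
print(hits)  # ['doc-a', 'doc-c']
```

The marketing sentence in doc-b contains “school”, “construction”, and “Washington”, so a simple keyword search would pull it in, but the proximity rule leaves it out. A rule like this trades one risk for another: a tighter window means fewer unrelated documents and a greater chance of missing a relevant one.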
Search is still very much an inexact science, which is why teams of people are still trying to find a better way. One of those efforts is the Text Retrieval Conference (TREC) Legal Track, which in 2008 took a good look at the human element of search. As you can read in this Law.com article, the 2008 tests had some rather striking results: the team that spent the most time working with topic authorities got the best results. This suggests that getting the best search results is a function of collaboration and communication, not solely technology.
As much as we techies might like to be able to simply sit in a back room with a powerful search engine and get the best search results ourselves, the truth appears to be that the best results are actually found through working directly with the people who wrote these documents and bringing in the right team. The search team includes these “subject matter experts,” the attorneys, and possibly even linguists, who can help craft the best searches. Given the complexity of the task, designing the search in broad daylight with the cooperation and collaboration of the opposing party (with all due protections in place for attorney-client privilege, trade secrets, and privacy) may be a useful means of avoiding costly discovery disputes, as has been frequently reported in recent months, including on our own blog here, here and here.
Search is a complex area, made all the more complex because it involves interfacing technology with the way we, as humans, communicate with each other. We all know that individuals communicate in many different ways, with their own styles and use of language, and that every case is different because it involves different people. We shouldn’t expect the search process to work the same way in every case either. There is something refreshing in the realization that in an ever more technologically complex world, it’s still people who make the difference between success and failure.