Monday, December 6, 2010

The Downstream Cost of Data IS the Problem

Every day, it seems, there is yet another article about how corporations need to clean up their records management programs and deploy millions of dollars of technology to solve the “eDiscovery” cost problem. In an economy where corporations are hard pressed to make and sell widgets, much less fund a fleet of consultants and a massive IT undertaking, this is a big pill to swallow for most in the business world. The vast majority of corporate America views records management and associated technology as more of a compliance issue, because litigation just does not consume enough of the corporate budget to get the finance department's attention. When it does, especially for those big guys that get sued a lot, there are ample technology and consultants available to implement those solutions. While legal departments and IT professionals would really prefer to get rid of all that unnecessary information, the business people who make and sell the widgets think their records and information management (RIM) programs work just fine. They can generally find what they need, when they need it. That is not to say that every company, large or small, litigious or not, couldn’t benefit from some proactive data organization. The reality is that in most cases those programs just don’t get properly funded. So, we are left with solving the real problem: the cost of sending documents out to the lawyers and investigators.
"People have an average of 30,000 e-mails per year per person," says Atlanta-based SunTrust Banks deputy general counsel Brian Edwards. Over the last five years, that has meant as many as 1.5 million documents for a single matter. Throwing $150-$300-per-hour law firm associates at the mess, for privilege and responsiveness review, is too expensive. "Without a tool that would let you do it faster ... you could get 50-100 document decisions per hour per person," Edwards says. That's 3,000 documents for one person's 40-hour week: "Then do the math for 900,000." See “How to Keep ESI at Bay in E-Discovery” by Erik Sherman, Corporate Counsel, November 29, 2010 (LTN Technology News at Law.com).
We did the math, and that’s 18,000 review hours at 50 document decisions per hour (“DDH”). At $150 per hour, the review cost for Mr. Edwards’ 900,000 documents would be $2.7 million! One person working an average of 40 hours a week would need over 8 years to finish that review. Technology can make the process faster, and there is ample technology available to increase review speeds. Depending on the tool and the workflow, review speed can reach 100, 200 or even 300 DDH and higher. Going fast, however, is not the only area where improvement is needed. In Mr. Edwards' example, reviewing 900,000 records where the response rate is likely very low is, in most cases, simply not cost justified. Search, retrieve and review, no matter how fast the DDH, is only part of the answer. More advanced processes and technologies are required. Advanced technology has been around for some time. Predictive coding algorithms use sophisticated Bayesian inference from review inputs, key terms and concepts to organize information based upon perceived and sometimes user-validated review. “Products that offer only keyword searches, no matter how robust, may have insufficient power. You can no longer rely on so-called Boolean combinations of keywords,” adds David Stanton, a partner at New York-headquartered Pillsbury Winthrop Shaw Pittman.
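For readers who want to check the arithmetic above or try other review speeds, here is a minimal sketch in Python. The function name `review_cost` and the 52-working-weeks-per-year figure are our own assumptions for illustration; the 900,000 documents, the $150 hourly rate and the DDH values come from the figures quoted above.

```python
def review_cost(documents, decisions_per_hour, hourly_rate,
                hours_per_week=40, weeks_per_year=52):
    """Return (review hours, review cost, reviewer-years) for a linear review.

    Assumes every document gets exactly one decision and that a reviewer
    works hours_per_week for weeks_per_year (no vacation; an assumption).
    """
    hours = documents / decisions_per_hour
    cost = hours * hourly_rate
    years = hours / (hours_per_week * weeks_per_year)
    return hours, cost, years


if __name__ == "__main__":
    # Compare the review speeds mentioned in the text at $150 per hour.
    for ddh in (50, 100, 200, 300):
        hours, cost, years = review_cost(900_000, ddh, 150)
        print(f"{ddh:>3} DDH: {hours:>8,.0f} hours, "
              f"${cost:>12,.0f}, {years:.1f} reviewer-years")
```

At 50 DDH this reproduces the 18,000 hours and $2.7 million above. Even at 300 DDH the same 900,000 documents still take 3,000 review hours, which is why speed alone is only part of the answer.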

“Some technologies that can speed results are semantic indexing that deciphers the meaning of what people have written, text clustering that groups files with common content, and sophisticated Bayesian statistical analysis (emphasis added) that uses feedback to improve results.” (Law.com)
Legal professionals have historically been slow to adopt sophisticated technology and processes, primarily because understanding has not yet reached the point where advanced data reduction techniques can be widely utilized. Even techniques as simple as data sampling, while accepted, are difficult to support and justify without the right process and advice. Sampling becomes vital when manually reviewing every document is impossible. "You're in a world of information retrieval," says Stanton. "You've stepped out of the law."

Sophisticated statistical techniques like reviewing only representative samples of documents can slice the workload, when done properly.
“Sample size can be relatively small as long as it’s properly randomized,” says Stanton. “If your desired error rate is 5 percent and your confidence level is 95 percent, then you have to review about one in every 400 documents.” The good news? That’s about 2,500 documents out of 1 million. The bad news? Someone, whether staffer or consultant, has to understand the math involved. (Law.com)
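To give a sense of the math someone has to understand, here is a minimal sketch of one common textbook calculation: the sample size needed to estimate a proportion from a simple random sample. It rests on assumptions that are ours, not the article's. We read "error rate" as a margin of error of plus or minus 5 percentage points, use z = 1.96 for 95 percent confidence, take the worst-case proportion of 0.5, and apply a finite population correction. The function name `sample_size` is hypothetical.

```python
import math


def sample_size(margin_of_error, z=1.96, proportion=0.5, population=None):
    """Sample size for estimating a proportion from a simple random sample.

    margin_of_error: half-width of the confidence interval (0.05 = +/- 5 points)
    z: z-score for the confidence level (1.96 is the usual value for 95%)
    proportion: assumed true proportion; 0.5 is the most conservative choice
    population: if given, apply the finite population correction
    """
    n = (z ** 2) * proportion * (1 - proportion) / margin_of_error ** 2
    if population is not None:
        n = n / (1 + (n - 1) / population)
    return math.ceil(n)


if __name__ == "__main__":
    print(sample_size(0.05))                        # unlimited population
    print(sample_size(0.05, population=1_000_000))  # 1 million documents
```

Under these particular assumptions both calls return 385, and the answer barely moves as the collection grows. That is not the same as reviewing a fixed fraction of the collection, so the figure depends heavily on how "error rate" is defined and on what is being estimated. Defining those terms is precisely where the right process and advice come in.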
Legal professionals, we believe, get too hung up on the need to understand every line of code behind a technology: how does it work? Too much emphasis is placed on what needs to be defensible, as defensibility relates to the technology being used. If you're only using a black box technology to organize information, understanding how the technology works is less important, provided the end product is based upon a larger workflow that supplies validation, that is, checks and balances. Technology is just a tool. How those tools are used, not necessarily how they work, is what must be understood and defended. Before we reach any sort of understanding on how to reduce the cost of discovery, however, everyone involved must recognize that review cost IS the problem that needs to be solved. No matter where you spend your money, something at some level must be done to reduce reviewable volume with confidence that nothing important gets left behind.