TAR - Not Just For Big Data Volume Cases
The events of the last couple of weeks have given me a great
real-life example to share with you regarding Technology Assisted Review
(TAR). These use-case anecdotes are right in line with our
educational program on TAR this month. It is our duty to keep
educating ourselves on the technology available and on the risks and
benefits of its use. Below are two great examples demonstrating that
TAR is valuable, delivering ROI, not only in big data volume cases,
but in small ones as well.
The use of TAR and its workflows is now nearly common practice
(and in fact almost mandatory in BIG data volume cases). Indeed, in our
shop, we just completed a large 8.5-million-record case where the lawyers
reviewed only 6,000 documents (less than 1%) to achieve technology
training stabilization. What is stabilization?
Stabilization is the point where stability scores tell us that the technology
has learned all it is likely going to learn from a sample review. Because
of how well TAR worked in that case, we measured over $1.4 million in ACTUAL review
cost savings, based solely on what TAR indicated would not be relevant
documents. The vast majority of what this process identified as relevant
(over 350,000 documents) was produced without review; a claw-back agreement was
in place to protect any privileged documents produced. About
30,000 documents belonging to priority custodians had to be reviewed before production.
The legal team chose to review only what TAR determined to be relevant.
Precision was measured at 77%. What does that mean? 77% of what the
TAR process deemed relevant was in fact relevant, as confirmed by human
review. That precision rate is very good, and the savings remarkable,
right?
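For readers who like to see the arithmetic, here is a minimal sketch of how a precision figure like that 77% is computed. The counts are purely illustrative, not the actual tallies from this case:

```python
# Precision: the share of documents the TAR tool flagged as relevant
# that human reviewers confirmed were actually relevant.
def precision(confirmed_relevant: int, flagged_relevant: int) -> float:
    return confirmed_relevant / flagged_relevant

# Illustrative: 770 of 1,000 machine-flagged documents confirmed relevant.
print(f"{precision(770, 1_000):.0%}")  # prints "77%"
```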
Well, that wasn’t the only remarkable thing we learned about TAR this week. I ran into a lawyer at an event a few weeks back and we exchanged greetings. I gave him my business card and told him, “Call me if you ever need help with eDiscovery.” A week later, my phone rang, and the conversation began, “I have your card here, and I remember you said to call if I ever need help with this ‘eDiscovery stuff.’”
He needed help indeed, and fast. He represented a client who had been sued over a trademark issue. They were sitting on the wrong side of a motion-to-compel ruling that required them to collect, filter, review, and produce in less than two weeks. The attorney had a three-person staff to get the work done and knew that the normal approach would not meet the deadline; an extension was not available. He asked if I had any idea what he should do.
We were looking at what most shops would consider a small case: one custodian, which traditionally means not a great number of documents. The attorney was from a small firm, with limited resources, a limited budget, and limited time. I advised that we treat this matter as if it were the 8.5-million-record case I described above, and use TAR and its workflows. I am sharing the steps we took below. Again, this feeds directly back to my opening paragraph: some lawyers today are not familiar with technology, which is one of the primary drivers behind the amendments to the ABA Model Rules of Professional Conduct. In those cases, we use a defined step-by-step process to educate and inform clients about how the process works.
The upshot in this “small” case is that the deadline was
met. In fact, we were a day early. Documents reviewed: 650.
Documents produced: 12,211.
Step 1: Collect Data. Ooooops – we discovered the custodian in this small case had much more data than expected -- more than 300 GB! Finding more data was not conducive to meeting the tight deadline in a standard approach!
Step 2: Filter out all the file types we do not want or need – the lawyer decided to focus on a few very specific file types. Process and deduplicate. Weed out whatever we can by other judgmental means. The result – 210,000 documents remain. OK, that is better than the original collection, but way too much to review!
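The culling in Step 2 boils down to two mechanical operations: keep only the chosen file types, and drop exact duplicates. A minimal sketch of that logic, with a hypothetical list of file types standing in for whatever the lawyer actually selected:

```python
import hashlib

# Hypothetical focus list; the actual file types would come from counsel.
KEEP_EXTENSIONS = {".docx", ".xlsx", ".pdf", ".msg"}

def filter_and_dedupe(files):
    """files: iterable of (filename, content_bytes) pairs.
    Keep only the chosen file types and drop exact duplicates,
    identified by a hash of each file's content."""
    seen_hashes = set()
    kept = []
    for name, content in files:
        ext = "." + name.rsplit(".", 1)[-1].lower() if "." in name else ""
        if ext not in KEEP_EXTENSIONS:
            continue  # unwanted file type
        digest = hashlib.sha256(content).hexdigest()
        if digest in seen_hashes:
            continue  # exact duplicate of a file already kept
        seen_hashes.add(digest)
        kept.append(name)
    return kept
```

Real eDiscovery platforms dedupe on normalized metadata-plus-content hashes, but the principle is the same: identical content is reviewed once.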
Step 4: Enter TAR and Envize™, our machine-learning tool with Active Learning. We use the initial (completely untested) terms and run analytics on just the 28,000 documents hitting those terms. We create a few “judgmental” random samples and launch into review/training. No control batch, because Envize™ doesn’t need one, at least not at this stage.
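To make the training step concrete: active learning works by repeatedly asking reviewers to code the documents the model is least certain about. This is a generic uncertainty-sampling sketch, not Envize's actual selection logic, and every document name and score below is illustrative:

```python
# One selection round of uncertainty sampling: the model scores each
# document from 0.0 (non-relevant) to 1.0 (relevant), and we route the
# documents with scores closest to 0.5 (most uncertain) to reviewers.
def select_for_training(scores: dict, batch_size: int = 3):
    return sorted(scores, key=lambda doc: abs(scores[doc] - 0.5))[:batch_size]

scores = {"a": 0.51, "b": 0.9, "c": 0.1, "d": 0.49}
print(select_for_training(scores, 2))  # prints "['a', 'd']"
```

Each round of human coding retrains the model, and the cycle repeats until the stability scores level off.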
Step 6: Stabilization occurred very quickly. Figure 1 above shows the result after 815 documents. At this point, we switch to Continuous Active Learning (CAL) to feed the reviewers highly relevant content – documents that have the highest relevance scores.
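The switch to CAL in Step 6 just changes the sort order: instead of serving the most uncertain documents, the queue serves the highest-scoring unreviewed ones. A minimal sketch, with made-up document names and scores standing in for the tool's classifier output:

```python
# Continuous Active Learning batch selection: feed reviewers the
# unreviewed documents with the highest model relevance scores.
def next_cal_batch(scores: dict, reviewed: set, batch_size: int = 3):
    unreviewed = {doc: s for doc, s in scores.items() if doc not in reviewed}
    return sorted(unreviewed, key=unreviewed.get, reverse=True)[:batch_size]

scores = {"doc1": 0.92, "doc2": 0.15, "doc3": 0.88, "doc4": 0.40, "doc5": 0.71}
print(next_cal_batch(scores, reviewed={"doc1"}))  # prints "['doc3', 'doc5', 'doc4']"
```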
Step 7: After just a few hundred CAL documents reviewed, the lawyers report that they are confident the technology has done its job and ask that we run the privilege screen and produce. We suggest QC and audits. The lawyer says he is not looking for precision, just to make sure we are not missing anything, and doesn’t care if the production is a bit over-inclusive. We ultimately review a random sample of the “left behind” documents, just to make sure we were not missing anything. We were not.
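That final audit in Step 7 is a simple random sample drawn from the documents TAR did not flag. A minimal sketch of the draw, with a fixed seed used here only so the illustration is repeatable:

```python
import random

def left_behind_sample(doc_ids, sample_size, seed=42):
    """Draw a simple random sample from the documents TAR left behind.
    Reviewing the sample and finding nothing relevant supports the
    conclusion that the process is not missing responsive material."""
    rng = random.Random(seed)  # fixed seed for a repeatable illustration
    return rng.sample(list(doc_ids), min(sample_size, len(doc_ids)))
```

How large the sample should be depends on the confidence level and margin of error the team wants; that statistical sizing is a topic for another post.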
Conclusion? TAR has utility beyond big-data
volume cases. Almost any case of any size that involves ESI can benefit from
machine learning technology and a sound TAR workflow.
Want to learn more? See the July webinar replay here:
TAR: A Peek Inside the Black Box.