Wednesday, July 6, 2011

So Easy A Cave Man Can Do IT? Part III – ITs Complex

Introduction
In Part I of the Cave Man series, "Lessons From Howrey", we looked at some lessons taken from the failure of Howrey.  Heavy investment in a state-of-the-art litigation support center, the associated support staff and infrastructure, and an inability to compete with "service providers" were cited among the many reasons the firm failed.
Part II, “Keeping Up with The Joneses”, dove into Information Technology (IT) Security and touched on the complexity of IT infrastructure.  Part III, “ITs Complex”, will continue to explore IT complexities. 
Law firm IT has come a long way over the past several years.  We don't want to send the wrong message here. There are many law firm IT environments that are sophisticated, mature and well maintained, with considerable investment in infrastructure year over year.  There have been, and will continue to be, firms that successfully manage Electronic Discovery – soup-to-nuts – largely in-house.  Electronic Discovery is not just for paralegals, techno geeks and other non-lawyers anymore.  A growing number of lawyers are choosing litigation support, practice support and broader ESI management roles over the practice of law alone.  Some are successfully merging the practice of law and technology.  Many mid-sized firms, not to mention the large ones, simply can't survive without in-house expertise and capability.  Hosting case management applications and maintaining some level of processing capability are becoming increasingly important.  Electronic Discovery across the EDRM on a large scale, however, is a completely different ball game.  Most firms won't follow Howrey there.  For those that do, this "So Easy a Cave Man Can Do IT" series is for you.  In this article, we will cover just one aspect of the business – data storage and the associated bandwidth.
Storage – You Need Lots of Space!
"Why do you need so much space?" comes the cry from the CIO or IT Manager. A few years ago, a typical case consumed gigabytes (GB) of network storage. Today we speak in terabytes (TB) and are beginning to speak of petabytes (PB). One of the many things a surprised IT manager needs to know about electronic discovery and ESI processing and hosting environments is that processed data takes up far more room than the original collection. In other words, 1 TB of original data can require two, three or even four TB of network storage space, because during processing files are replicated and rendered at least once and sometimes many times.  A SAN (Storage Area Network) and sound data management principles are essential for ESI processing and hosting.  A SAN is a complex array of disk storage devices that utilizes block-level storage.  It should not be confused with a NAS (Network Attached Storage), which utilizes file-level storage, or with the far inferior DAS (Direct-Attached Storage), which is simply a storage device attached directly to a server or other device with an operating system that is itself attached to the network.
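To put that expansion factor in concrete terms, here is a minimal back-of-the-envelope sketch in Python. The 2x–4x multipliers are illustrative assumptions only; the real factor depends on the workflow and the tools involved.

# Rough, illustrative estimate of the SAN space a matter may consume
# once processed. The expansion factors are assumptions, not fixed rules.

def processing_footprint_tb(original_tb, expansion_factor):
    """Estimated network storage (TB) occupied after processing."""
    return original_tb * expansion_factor

for factor in (2, 3, 4):
    footprint = processing_footprint_tb(1.0, factor)
    print(f"1 TB of original data at {factor}x expansion -> {footprint:.0f} TB on the SAN")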
In a traditional environment, users call for MS Word, email and other application files one at a time, and a typical SAN or NAS configuration works just fine. In a complex processing and hosting environment, however, the game changes dramatically. File "transactions" in a processing environment are far more dynamic, with millions of requests hitting the storage array in a time span such a system is normally configured to handle just thousands of transactions. Oh, and you need to back everything up.  Most commercial processing environments are fully redundant, meaning there are essentially two SANs – one purely for redundancy.  Everything then gets backed up digitally or to tape and kept off site for disaster recovery.  That effort alone can require considerable expertise and expense.  Careful customization and continuous optimization of a highly dynamic, complex storage environment is critical.
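A bit of purely hypothetical arithmetic illustrates the transaction-rate gap. The file counts, operations per file and user numbers below are made-up round figures, chosen only to show the order-of-magnitude difference between a processing run and ordinary office use.

# Hypothetical request-rate arithmetic; every number here is an assumption.

def requests_per_second(file_count, ops_per_file, hours):
    """Average storage operations per second generated over a work window."""
    return file_count * ops_per_file / (hours * 3600)

# A processing run: 5 million files, ~4 reads/writes each, over an 8-hour window
processing_load = requests_per_second(5_000_000, 4, 8)

# An office workload: 500 users each touching ~200 documents in the same window
office_load = requests_per_second(500 * 200, 1, 8)

print(f"Processing load:     ~{processing_load:,.0f} storage ops/sec")
print(f"Typical office load: ~{office_load:,.0f} storage ops/sec")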
Bandwidth - How much is enough?

There is no easy answer to how much bandwidth is enough. When we think of bandwidth, we usually think of the pipe that sends your information across the internet from point A to point B. Internet bandwidth is actually a fairly easy problem to solve in most locations today. You just buy as much bandwidth as you need, right? IT is only money after all. Of course, you can't control the bandwidth at the other end. You know the one – the user who sits at home with a 14 kbit/s wireless modem and complains about your bandwidth.
The more complex bandwidth challenge is internal bandwidth – the speed at which communication occurs between devices within the environment and, in the case of processing, between the various devices needed for processing.  In computing, these paths are referred to as input/output, or I/O.  When there is not enough bandwidth (internally or externally), a system can become "I/O bound". This condition exists when data is being requested faster than the storage and network paths can deliver it.  In the most simplistic terms, there is not enough bandwidth for data to travel from one device to another.  And there are many pathways data must travel on the way to its ultimate home on the SAN.
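One way to picture an I/O-bound system is as a pipeline that can only move data as fast as its slowest link. The throughput figures in this sketch are hypothetical placeholders, not benchmarks; the point is simply that the weakest path sets the pace for everything else.

# Hypothetical throughput (MB/s) for each link in the data path.
# The slowest link caps end-to-end throughput - the classic I/O bottleneck.

stage_throughput_mb_s = {
    "external USB source media": 30,
    "network path (switches, cabling, NICs)": 110,
    "SAN storage array": 400,
    "processing servers": 250,
}

bottleneck = min(stage_throughput_mb_s, key=stage_throughput_mb_s.get)
cap = stage_throughput_mb_s[bottleneck]
print(f"End-to-end throughput is capped at ~{cap} MB/s by the {bottleneck}.")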

Life usually begins (acquisition) and ends (production) on some external storage media.   Until very recently, transferring files from external storage relied on very slow USB (Universal Serial Bus) connections.  So it did not matter how much "bandwidth" one had elsewhere on the network; data transfer was, and still is, highly restricted, even with newer USB 3.0 devices that are roughly 10 times faster than USB 2.0 connections.  Even with more expensive hardware such as solid state drives (SSDs) with SATA (Serial Advanced Technology Attachment) connections, we are nowhere close to network-grade data transfer speeds from external devices.  Careful planning and expectation setting are critical.
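The rough sketch below uses nominal interface ratings to show why the external media step deserves that planning. These are theoretical ceilings only; sustained real-world transfer rates are considerably lower.

# Hours to move 1 TB at nominal interface speeds. These are theoretical
# ceilings; sustained real-world transfer rates are considerably lower.

nominal_gbps = {
    "USB 2.0": 0.48,            # 480 Mbit/s
    "USB 3.0": 5.0,             # 5 Gbit/s
    "SATA III SSD": 6.0,        # 6 Gbit/s
    "10 Gigabit Ethernet": 10.0,
}

def copy_hours(size_tb, gbps):
    """Hours to move size_tb terabytes at a given gigabits-per-second rate."""
    gigabits = size_tb * 8 * 1000  # 1 TB ~= 8,000 gigabits (decimal units)
    return gigabits / gbps / 3600

for interface, rate in nominal_gbps.items():
    print(f"{interface}: ~{copy_hours(1, rate):.1f} hours per TB (best case)")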

Once data makes it to the network, there remains ample opportunity for bottlenecks. There are switches, routers, Ethernet cables and connections, and a variety of other data pathways all competing for “bandwidth”. All of that infrastructure must be implemented, configured, maintained, optimized and kept redundant. Considerable storage management expertise is needed for a processing operation of any size. IT is not just complex, it is also expensive. In Part IV, we will explore the rapidly moving software side of the business.

Conclusion

Acquiring, filtering, processing, hosting, reviewing and producing large volumes of information require considerable space and bandwidth.  Highly available, highly redundant storage and the associated bandwidth challenges alone require not only a considerable, ongoing investment in state-of-the-art infrastructure, but considerable expertise as well. The most advanced technology on the planet is only as good as the human capital investment behind it. IT is complex.
