Information Governance Archives

More Updates from the EDRM Annual Meeting – eDiscovery Trends

May 10, 2013

Yesterday, we discussed some general observations from the Annual Meeting for the Electronic Discovery Reference Model (EDRM) group and discussed some significant efforts and accomplishments by the (suddenly heavily talked about) EDRM Data Set project. Here are some updates from other projects within EDRM.

It should be noted these are summary updates and that most of the focus on these updates is on accomplishments for the past year and deliverables that are imminent. Over the next few weeks, eDiscovery Daily will cover each project in more depth with more details regarding planned activities for the coming year.

Model Code of Conduct (MCoC)

The MCoC was introduced in 2011 and became available for organizations to subscribe last year. To learn more about the MCoC, you can read the code online here, or download it as a 22-page PDF file here. Subscribing is easy! To voluntarily subscribe to the MCoC, you can register on the EDRM website here. Identify your organization, provide information for an authorized representative and answer four verification questions (truthfully, of course) to affirm your organization’s commitment to the spirit of the MCoC, and your organization is in! You can also provide a logo for EDRM to include when adding you to the list of subscribing organizations. Pending a survey of EDRM members to determine if any changes are needed, this project has been completed. Team leaders include Eric Mandel of Zelle Hofmann, Kevin Esposito of Rivulex and Nancy Wallrich.

Information Governance Reference Model (IGRM)

The IGRM team has continued to make strides and improvements on an already terrific model. Last October, they released version 3.0 of the IGRM. As their press release noted, “The updated model now includes privacy and security as primary functions and stakeholders in the effective governance of information.” IGRM continues to be one of the most active EDRM projects, with strong participation. This year, the early focus – as quoted from Judge Andrew Peck’s keynote speech at Legal Tech this past year – is “getting rid of the junk”. Project leaders are Aliye Ergulen from IBM, Reed Irvin from Viewpointe and Marcus Ledergerber from Morgan Lewis.

Search

One of the best examples of the new, more agile process for creating deliverables within EDRM comes from the Search team, which released its new draft Computer Assisted Review Reference Model (CARRM), which depicts the flow for a successful Computer Assisted Review project. The entire model was created in only a matter of weeks. Early focus for the Search project for the coming year includes adjustments to CARRM (based on feedback at the annual meeting). You can also still send your comments regarding the model to mail@edrm.net or post them on the EDRM site here. A webinar regarding CARRM is also planned for late July. Kudos to the Search team, including project leaders Dominic Brown of Autonomy and also Jay Lieb of kCura, who got unmerciful ribbing for insisting (jokingly, I think) that TIFF files, unlike Generalissimo Francisco Franco, are still alive. 🙂

Jobs

In late January, the Jobs Project announced the release of the EDRM Talent Task Matrix diagram and spreadsheet, which is available in XLSX or PDF format. As noted in their press release, the Matrix is a tool designed to help hiring managers better understand the responsibilities associated with common eDiscovery roles. The Matrix maps responsibilities to the EDRM framework, so the eDiscovery duties associated with each role can be assigned to the appropriate parties. Project leader Keith Tom noted that next steps include surveying EDRM members regarding the Matrix, requesting and co-authoring case studies and white papers, and creating a short video on how to use the Matrix.

Metrics

In today’s session, the Metrics project team unveiled the first draft of the new Metrics model to EDRM participants! Feedback was provided during the session and the team will make the model available for additional comments from EDRM members over the next week or so, with a goal of publishing for public comments in the next two to three weeks. The team is also working to create a page to collect Metrics measurement tools from eDiscovery professionals that can benefit the eDiscovery community as a whole. Project leaders Dera Nevin of TD Bank and Kevin Clark noted that June is “budget calculator month”.

Other Initiatives

As noted yesterday, there is a new project to address standards for working with native files in the different EDRM phases led by Eric Mandel from Zelle Hofmann and also a new initiative to establish collection guidelines, spearheaded by Julie Brown from Vorys. There is also an effort underway to refocus the XML project, as it works to complete the 2.0 version of the EDRM XML model. In addition, there was quite a spirited discussion as to where EDRM is heading as it approaches ten years of existence and it will be interesting to see how the EDRM group continues to evolve over the next year or so. As you can see, a lot is happening within the EDRM group – there’s a lot more to it than just the base Electronic Discovery Reference Model.

So, what do you think? Are you a member of EDRM? If not, why not? Please share any comments you might have or if you’d like to know more about a particular topic.

Disclaimer: The views represented herein are exclusively the views of the author, and do not necessarily represent the views held by CloudNine Discovery. eDiscoveryDaily is made available by CloudNine Discovery solely for educational purposes to provide general information about general eDiscovery principles and not to provide specific legal advice applicable to any particular circumstance. eDiscoveryDaily should not be used as a substitute for competent legal advice from a lawyer you have retained and who has agreed to represent you.

Reporting from the EDRM Annual Meeting and a Data Set Update – eDiscovery Trends

May 9, 2013

The Electronic Discovery Reference Model (EDRM) Project was created in May 2005 by George Socha of Socha Consulting LLC and Tom Gelbmann of Gelbmann & Associates to address the lack of standards and guidelines in the electronic discovery market. Now, beginning its ninth year of operation with its annual meeting in St. Paul, MN, EDRM is accomplishing more than ever to address those needs. Here are some highlights from the meeting, and an update regarding the (suddenly heavily talked about) EDRM Data Set project.

Annual Meeting

Twice a year, in May and October, eDiscovery professionals who are EDRM members meet to continue the process of working together on various standards projects. This will be my eighth year participating in EDRM at some level and, oddly enough, I’m assisting with PR and promotion (how am I doing so far?). eDiscovery Daily has referenced EDRM and its phases many times in the 2 1/2 years plus history of the blog – this is our 144th post that relates to EDRM!

Some notable observations about today’s meeting:

New Participants: More than half the attendees at this year’s annual meeting are attending for the first time. EDRM is not just a core group of “die-hards”, it continues to find appeal with eDiscovery professionals throughout the industry.
Agile Approach: EDRM has adopted an Agile approach to shorten the time to complete and publish deliverables, a change in philosophy that facilitated several notable accomplishments from working groups over the past year including the Model Code of Conduct (MCoC), Information Governance Reference Model (IGRM), Search and Jobs (among others). More on that tomorrow.
Educational Alliances: For the first time, EDRM has formed some interesting and unique educational alliances. In April, EDRM teamed with the University of Florida Levin College of Law to present a day and a half conference entitled E-Discovery for the Small and Medium Case. And, this June, EDRM will team with Bryan University to provide an in-depth, four-week E-Discovery Software & Applied Skills Summer Immersion Program for Law School Students.
New Working Group: A new working group to be led by Eric Mandel of Zelle Hofmann was formed to address standards for working with native files in the different EDRM phases.

Tomorrow, we’ll discuss the highlights for most of the individual working groups. Given the recent amount of discussion about the EDRM Data Set group, we’ll start with that one today!

Data Set

The EDRM Enron Data Set has been around for several years and has been a valuable resource for eDiscovery software demonstration and testing (we covered it here back in January 2011). The data in the EDRM Enron PST Data Set files is sourced from the FERC Enron Investigation release made available by Lockheed Martin Corporation. It was reconstituted as PST files with attachments for the EDRM Data Set Project. So, in essence, EDRM took data that was already publicly available and made it much more usable. Initially, the data was made available for download on the EDRM site, then subsequently moved to Amazon Web Services (AWS).

In the past several days, there has been much discussion about the personally-identifiable information (“PII”) available within the FERC release (and consequently the EDRM Data Set), including social security numbers, credit card numbers, dates of birth, home addresses and phone numbers. As a result, the EDRM Data Set has been taken down from the AWS site.

The Data Set team led by Michael Lappin of Nuix and Eric Robi of Elluma Discovery has been working on a process (using predictive coding technology) to identify and remove the PII data from the EDRM Data Set. Discussions about this process began months ago, prior to the recent discussions about the PII data contained within the set. The team has completed this iterative process for V1 of the data set (which contains 1,317,158 items), identifying and removing 10,568 items with PII, HIPAA and other sensitive information. This version of the data set will be made available within the EDRM community shortly for peer review testing. The data set team will then repeat the process for the larger V2 version of the data set (2,287,984 items). A timetable for republishing both sets should be available soon and the efforts of the Data Set team on this project should pay dividends in developing and standardizing processes for identifying and eliminating sensitive data that eDiscovery professionals can use in their own data sets.
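To make the idea of flagging sensitive items concrete, here is a minimal sketch. It is not the Data Set team's process (which, as noted above, uses predictive coding); it is a much simpler pattern-matching approach, shown only for illustration. The function names and regular expressions are assumptions made for this example: it looks for text shaped like a U.S. social security number and for 13 to 16 digit runs that pass the Luhn checksum used by payment card numbers. A real cleanup effort would need far broader coverage and human review of the hits.

```python
import re

# Hypothetical patterns for illustration only; real PII detection needs far more.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")          # e.g. 123-45-6789
CARD_RE = re.compile(r"\b(?:\d[ -]?){13,16}\b")        # 13-16 digits, optional spaces/dashes


def luhn_valid(digits):
    """Return True if a string of digits passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:          # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_pii_candidates(text):
    """Return (kind, matched_text) pairs for SSN-like and card-like strings."""
    hits = [("ssn", m.group()) for m in SSN_RE.finditer(text)]
    for m in CARD_RE.finditer(text):
        digits = re.sub(r"\D", "", m.group())
        if 13 <= len(digits) <= 16 and luhn_valid(digits):
            hits.append(("card", m.group().strip(" -")))
    return hits
```

Calling find_pii_candidates() on the extracted text of each item would give a list of candidates to route for review or removal; items with no hits are not guaranteed to be clean, which is one reason a pattern-only approach falls short of what the team describes.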

The team has also implemented a Forensic Files Testing Project site where users can upload their own “modern”, non-copyrighted file samples that are typically encountered during electronic discovery processing to provide a more diverse set of data than is currently available within the Enron data set.

So, what do you think? How has EDRM impacted how you manage eDiscovery? Please share any comments you might have or if you’d like to know more about a particular topic.

I Tell Ya, Information Governance Gets No Respect – eDiscovery Trends

April 4, 2013

If Rodney Dangerfield were a records manager, he probably would say something like this: “I tell ya, my CEO is so dumb, I taught him how to defensibly delete – he forgot how to preserve!” Ba-dum-bum!

As reported by Sean Doherty in Law Technology News (New Research Reveals Information Governance Gets No Respect), a new report from 451 Research has indicated that “although lawyers are bullish about the prospects of information governance to reduce litigation risks, executives, and staff of small and midsize businesses, are bearish and ‘may not be placing a high priority’ on the legal and regulatory needs for litigation or government investigation.”

In its March report, E-Discovery and E-Disclosure 2013: The Ongoing Journey to Proactive Information Governance, 451 Research conducted a survey of small, midsize, and large companies late last year regarding the handling of corporate data, with a specific focus on “enterprise IT and included relational and non-relational databases, data warehousing, text analytics, and business intelligence.”

Most notable in the survey, as reported by Doherty, “Of the 2,320 respondents, less than one-half believed that an information governance program was important to their organization” and “only 32 percent of senior management believed information governance important.” Apparently, “more than one-half of the information technology staff who responded thought it important”. According to the article, “larger organizations viewed information governance more importantly.”

Doherty also notes that “The 451 Research report covers a lot of ground in approximately 50 pages, the importance of IG by enterprise size, industry, job function, and jurisdiction; legal technology trends including the impact of social media and bring-your-own-device programs on e-discovery; a breakdown of e-discovery costs; and U.S. and state case law as well as a survey of legal and regulatory developments in the EU.”

For more information about the report, including other report findings and comments from David Horrigan (eDiscovery and information governance analyst and author of the report), click on the article link above.

Want specific survey results? To purchase a copy of the report (for $3,750), click here.

In our recent thought leader interview series, several of our thought leaders mentioned information governance as a leading emerging trend within the industry. This report appears to suggest that we still have a long way to go in educating organizations on the importance of a sound information governance program.

So, what do you think? Do those survey numbers surprise you? Please share any comments you might have or if you’d like to know more about a particular topic.

eDiscovery Daily Is Thirty! (Months Old, That Is)

March 21, 2013

Thirty months ago yesterday, eDiscovery Daily was launched. It’s hard to believe that it has been 2 1/2 years since our first three posts that debuted on our first day. 635 posts later, a lot has happened in the industry that we’ve covered. And, yes we’re still crazy after all these years for committing to a daily post each business day, but we still haven’t missed a business day yet. Twice a year, we like to take a look back at some of the important stories and topics during that time. So, here are just a few of the posts over the last six months you may have missed. Enjoy!

Industry Consolidation Continues: If you think there have been a lot of acquisitions in the eDiscovery industry, you’re right.
Don’t Be “Duped”: Files with Different HASH Values Can Still Be the Same.
Want the Right Balance of Recall and Precision in Your Search? Try Proximity Searches.
Are You Requesting the Best Production Format for Your Case? Maybe not, according to Craig Ball.
In a recent case, Both Sides Were Instructed to Use Predictive Coding or Show Cause Why Not.
Did you know that Only One in Eight Records Managers Trusts Their ESI?
Plaintiff Hammered with Case Dismissal for “Egregious” Discovery Violations: Apparently, destroying your first computer with a sledgehammer and using Evidence Eliminator and CCleaner on your second computer are not considered to be best practices for preservation.
Even a “Rap Weasel” can be sanctioned for spoliation of data. It isn’t every day that we cite The Hollywood Reporter for a story.
Problems with Review? It’s Not the End of the World.
$2.9 Billion? Is the eDiscovery Software Market Going to Double by 2017?
Want to catch up on 2012 eDiscovery cases? Here is your chance.
Is 31,000 Missed Relevant Documents an Acceptable Outcome for Predictive Coding? It might be, if the alternative is 62,000 missed relevant documents.
What do various eDiscovery thought leaders think about the industry? For the third year in a row, we find out.
Must Losing Plaintiff Pay Defendant $2.8 Million for Predictive Coding of One Million Documents? Court Says Yes.
Do you have some misperceptions about predictive coding? Maybe so. Here are Five Common Myths About Predictive Coding.

In addition, Jane Gennarelli has been publishing an excellent series to introduce new eDiscovery professionals to the litigation process and litigation terminology. Here is the latest post, which includes links to the previous twenty-one posts.

Thanks for noticing us! We’ve nearly quadrupled our readership since the first six month period and almost septupled (that’s grown 7 times in size!) our subscriber base since those first six months! We appreciate the interest you’ve shown in the topics and will do our best to continue to provide interesting and useful eDiscovery news and analysis. And, as always, please share any comments you might have or if you’d like to know more about a particular topic!

Outlook Emails Can Take Many Forms – eDiscovery Best Practices

March 19, 2013

Most discovery requests include a request for emails of parties involved in the case. Email data is often the best resource for establishing a timeline of communications in the case and Microsoft® Outlook is the most common email program used in business today. Outlook emails can be stored in several different forms, so it’s important to be able to account for each file format when collecting emails that may be responsive to the discovery request.

There are several different file types that contain Outlook emails, including:

EDB (Exchange Database): The server files for Microsoft Exchange, the server environment that manages Outlook emails in an organization. In the EDB file, a user account is created for each person authorized at the company to use email (usually, but not always, employees). The EDB file stores all of the information related to email messages, calendar appointments, tasks, and contacts for all authorized email users at the company. EDB files are the server-side collection of Outlook emails for an organization that uses Exchange, so they are a primary source of responsive emails for those organizations. Not all organizations that use Outlook use Exchange, but larger organizations almost always do.

OST (Outlook Offline Storage Table): Outlook can be configured to keep a local copy of a user’s items on their computer in an Outlook data file that is named an offline Outlook Data File (OST). This allows the user to work offline when a connection to the Exchange computer may not be possible or wanted. The OST file is synchronized with the Exchange computer when a connection is available. If the synchronization is not current for a particular user, their OST file could contain emails that are not on the EDB server file, so OST files may also need to be searched for responsive emails.

PST (Outlook Personal Storage Table): A PST file is another Outlook data file that stores a user’s messages and other items on their computer. It’s the most common file format for home users or small organizations that don’t use Exchange, but instead retrieve mail from an ISP or other mail provider (typically through POP3 or IMAP). In addition, Exchange users may move or archive messages to a PST file (either manually or via auto-archiving) to move them out of the primary mailbox, typically to keep their mailbox size manageable. PST files often contain emails not found in either the EDB or OST files (especially when Exchange is not used), so it’s important to search them for responsive emails as well.

MSG (Outlook MSG File): MSG is a file extension for a mail message file format used by Microsoft Outlook and Exchange. Each MSG file is a self-contained unit for the message “family” (email and its attachments) and individual MSG files can be saved simply by dragging messages out of Outlook to a folder on the computer (which could then be stored on portable media, such as CDs or flash drives). As these individual emails may no longer be contained in the other Outlook file types, it’s important to determine where they are located and search them for responsiveness. MSG is also the most common format for native production of individual responsive Outlook emails.

Other Outlook file types that might contain responsive information are EML (Electronic Mail), which is the Outlook Express email format, and PAB (Personal Address Book), which, as the name implies, stores the user’s contact information.

Of course, Outlook emails are not just stored within EDB files on the server or these other file types on the local workstation or portable media; they can also be stored within an email archiving system or synchronized to phones and other portable devices. Regardless, it’s important to account for the different file types when collecting potentially responsive Outlook emails for discovery.
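As a rough illustration of accounting for these file types during collection, the sketch below walks a folder and groups any files it finds by the extensions discussed above. The function name inventory_outlook_files and the descriptions in the lookup table are assumptions made for this example, and matching on file extension alone is a simplification: renamed files, email archives and mobile devices would not be caught this way.

```python
from collections import defaultdict
from pathlib import Path

# Extensions taken from the file types described in this post.
OUTLOOK_TYPES = {
    ".edb": "Exchange Database",
    ".ost": "Outlook Offline Storage Table",
    ".pst": "Outlook Personal Storage Table",
    ".msg": "Outlook MSG File",
    ".eml": "Electronic Mail (Outlook Express)",
    ".pab": "Personal Address Book",
}


def inventory_outlook_files(root):
    """Group files under `root` by Outlook-related extension (case-insensitive)."""
    found = defaultdict(list)
    for path in Path(root).rglob("*"):
        ext = path.suffix.lower()
        if path.is_file() and ext in OUTLOOK_TYPES:
            found[ext].append(path)
    return dict(found)
```

For example, inventory_outlook_files("D:/collection") (a hypothetical path) returns a dictionary keyed by extension, so a quick count per file type shows whether, say, stray PST or MSG files exist alongside the expected OST files.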

So, what do you think? Are you searching all of these file types for responsive Outlook emails? Please share any comments you might have or if you’d like to know more about a particular topic.

Craig Ball of Craig D. Ball, P.C. – eDiscovery Trends, Part 3

March 8, 2013

This is the tenth (and final) of the 2013 LegalTech New York (LTNY) Thought Leader Interview series. eDiscoveryDaily interviewed several thought leaders at LTNY this year and generally asked each of them the following questions:

What are your general observations about LTNY this year and how it fits into emerging trends?
If last year’s “next big thing” was the emergence of predictive coding, what do you feel is this year’s “next big thing”?
What are you working on that you’d like our readers to know about?

Today’s thought leader is Craig Ball. A frequent court appointed special master in electronic evidence, Craig is a prolific contributor to continuing legal and professional education programs throughout the United States, having delivered over 1,000 presentations and papers. Craig’s articles on forensic technology and electronic discovery frequently appear in the national media, and he writes a monthly column on computer forensics and eDiscovery for Law Technology News called Ball in your Court, as well as blogs on those topics at ballinyourcourt.com.

Craig was very generous with his time again this year and our interview with Craig had so much good information in it, we couldn’t fit it all into a single post. Wednesday was part 1 and yesterday was part 2. Today is the third and last part. A three-parter!

Note: I asked Craig the questions in a different order and, since the show had not started yet when I interviewed him, instead asked about the sessions in which he was speaking.

What are you working on that you’d like our readers to know about?

I’m really trying to make 2013 the year of distilling an extensive but idiosyncratic body of work that I’ve amassed through years of writing and bring it together into a more coherent curriculum. I want to develop a no-cost casebook for law students and to structure my work so that it can be more useful for people in different places and phases of their eDiscovery education. So, I’ll be working on that in the first six or eight months of 2013 as both an academic and a personal project.

I’m also trying to go back to roots and rethink some of the assumptions that I’ve made about what people understand. It’s frustrating to find lawyers talking about, say, load files when they don’t really know what a load file is; they’ve never looked at a load file. They’ve left it to somebody else and, so, the resolution of difficulties has gone through so many hands and is plagued by so much miscommunication. I’d like to put some things out there that will enable lawyers, in a non-threatening and accessible way, to gain comfort in having a dialog about the fundamentals of eDiscovery that you and I take for granted, so that we don’t have to have this reliance upon vendors for the simplest issues. I don’t mean that vendors won’t do the work, but I don’t think we should have to bring a technical translator in for every phone call.

There should be a corpus of competence that every litigator brings to the party, enabling them to frame basic protocols and agreements that aren’t merely parroting something that they don’t understand, but enabling them to negotiate about issues in ways that the resolutions actually make sense. Saying “I won’t give you 500 search terms, but I’ll give you 250” isn’t a rational resolution. It’s arbitrary.

There are other kinds of cases where you can identify search terms “all the live long day” and they’re really never going to get you that much closer to the documents you want. The best example in recent years was the Pippins v. KPMG case. KPMG was arguing that they could use search terms against samples to identify forensically significant information about work day and work responsibility. That didn’t make any sense to me at all. The kind of data they were looking for wasn’t going to be easily found by using keyword search. It was going to require finding data of a certain character and bringing a certain kind of analysis to it, not an objective culling method like search terms. Search terms have become like the expression “if you have a hammer, the whole world looks like a nail”. We need to get away from that.

I think a little education made palatable will go a long way. We need some good solid education and I’m trying to come up with something that people will borrow and build on. I want it to be something that’s good enough that people will say “let’s just steal his stuff”. That’s why I put it out there – it’s nice that they credit me and I appreciate it; but if what you really want to do is teach people, you don’t do it for the credit, you do it for the education. That’s what I’m about, more this year than ever before.

Thanks, Craig, for participating in the interview!

And to the readers, as always, please share any comments you might have or if you’d like to know more about a particular topic!

Craig Ball of Craig D. Ball, P.C. – eDiscovery Trends, Part 2

March 7, 2013

This is the tenth (and final) of the 2013 LegalTech New York (LTNY) Thought Leader Interview series. eDiscoveryDaily interviewed several thought leaders at LTNY this year and generally asked each of them the following questions:

What are your general observations about LTNY this year and how it fits into emerging trends?
If last year’s “next big thing” was the emergence of predictive coding, what do you feel is this year’s “next big thing”?
What are you working on that you’d like our readers to know about?

Today’s thought leader is Craig Ball. A frequent court appointed special master in electronic evidence, Craig is a prolific contributor to continuing legal and professional education programs throughout the United States, having delivered over 1,000 presentations and papers. Craig’s articles on forensic technology and electronic discovery frequently appear in the national media, and he writes a monthly column on computer forensics and eDiscovery for Law Technology News called Ball in your Court, as well as blogs on those topics at ballinyourcourt.com.

Craig was very generous with his time again this year and our interview with Craig had so much good information in it, we couldn’t fit it all into a single post. Yesterday was part 1. Today is part 2 and part 3 will be published in the blog on Friday. A three-parter!

Note: I asked Craig the questions in a different order and, since the show had not started yet when I interviewed him, instead asked about the sessions in which he was speaking.

I noticed that you are speaking at a couple of sessions here. What would you like to tell me about those sessions?

{Interviewed the evening before the show} I am on a Technology Assisted Review panel with Maura Grossman and Ralph Losey that should be as close to a barrel of laughs as one can have talking about technology assisted review. It is based on a poker theme – which was actually Matt Nelson’s (of Symantec) idea. I think it is a nice analogy, because a good poker player is a master or mistress of probabilities, whether intuitively or overtly performing mental arithmetic that is essentially a set of statistical and probability calculations. Such calculations are key to quality assurance and quality control in modern review.

We have to be cautious not to require the standards for electronic assessments to be dramatically higher than the standards applied to human assessments. It is one thing with a new technology to demand more of it to build trust. That’s a pragmatic imperative. It is another thing to demand so exalted a level of scrutiny that you essentially void all advantages of the new technology, including the cost savings and efficiencies it brings. You know the old story about the two hikers that encounter the angry grizzly bear? They freeze, and then one guy pulls out running shoes and starts changing into them. His friend says “What are you doing? You can’t outrun a grizzly bear!” The other guy says “I know. I only have to outrun you”. That is how I look at technology assisted review. It does not have to be vastly superior to human review; it only has to outrun human review. It just has to be as good or better while being faster and cheaper.

We cannot let the vague uneasiness about the technology cause it to implode. If we have to essentially examine everything in the discard pile, so that we not only pay for the new technology but also back it up with the old, that’s not going to work. It will take a few pioneers who get the “arrows in the back” early on – people who spend more to build trust around the technology that is missing at this juncture. Eventually, people are going to say “I’ve looked at the discard pile for the last three cases and this stuff works. I don’t need to look at all of that any more.”

Even the best predictive coding systems are not going to be anywhere near 100% accurate. They start from human judgment where we’re not even sure what “100% accurate” is, in the context of responsiveness and relevance. There’s no “gold standard”. Two different qualified people can look at the same document and give a different assessment and approximately 40% of the time, they do. And, the way we decide who’s right is that we bring in a third person. We indulge the idea that the third person is the “topic authority” and what they say goes. We define their judgment as right; but, even their judgments are human. To err being human, they’re going to make misjudgments based on assumptions, fatigue, inattention, whatever.

So, getting back to the topic at hand, I do think that the focus on quality assurance is going to prompt a larger and long overdue discussion about the efficacy of human review. We’ve kept human review in this mystical world of work product for a very long time. Honestly, the rationale for work product doesn’t naturally extend over to decisions about responsiveness and relevance, even though most of my colleagues would disagree with me out of hand. They don’t want anybody messing with privilege or work product. It’s like religion or gun control: you can’t even start a rational debate.

Things are still so partisan and bitter. The notions of cooperation, collaboration, transparency, translucency, communication – they’re not embedded yet. People come to these processes with animosity so deeply seated that you’re not really starting on a level playing field with an assessment of what’s best for our system of justice. Justice is someone else’s problem. The players just want to win. That will be tough to change.

We “dinosaurs” will die off, and we won’t have to wait for the glaciers to advance. I think we will have some meteoric events that will change the speed at which the dinosaurs die. Technology assisted review is one. We’ve seen a meteoric rise in the discussion of the topic, the interest in the topic, and I think it will have a meteoric effect in terms of more rapidly extinguishing very bad and very expensive practices that don’t carry with them any more superior assurance of quality.

Craig Ball of Craig D. Ball, P.C. – eDiscovery Trends, Part 1

March 6, 2013

This is the tenth (and final) of the 2013 LegalTech New York (LTNY) Thought Leader Interview series. eDiscoveryDaily interviewed several thought leaders at LTNY this year and generally asked each of them the following questions:

What are your general observations about LTNY this year and how it fits into emerging trends?
If last year’s “next big thing” was the emergence of predictive coding, what do you feel is this year’s “next big thing”?
What are you working on that you’d like our readers to know about?

Today’s thought leader is Craig Ball. A frequent court appointed special master in electronic evidence, Craig is a prolific contributor to continuing legal and professional education programs throughout the United States, having delivered over 1,000 presentations and papers. Craig’s articles on forensic technology and electronic discovery frequently appear in the national media, and he writes a monthly column on computer forensics and eDiscovery for Law Technology News called Ball in your Court, as well as blogs on those topics at ballinyourcourt.com.

Craig was very generous with his time again this year and our interview with Craig had so much good information in it, we couldn’t fit it all into a single post. So, today is part 1. Parts 2 and 3 will be published in the blog on Thursday and Friday. A three-parter!

Note: I asked Craig the questions in a different order and, since the show had not started yet when I interviewed him, instead asked about the sessions in which he was speaking.

If last year’s “next big thing” was the emergence of predictive coding, what do you feel is this year’s “next big thing”?

I think this is the first year where I do not have a ready answer to that question. It’s like the wonderful movie Groundhog Day. I am on the educational planning board for the show, and as hard as we try to find and present fresh ideas, technology assisted review is once again the dominant topic.

This year, we will see the marketing language change, repositioning the (forgive the jargon) “value proposition” for the tools being sold so that it continues to move towards the concept of information governance. If knowledge management had a “hook up” here at LTNY with eDiscovery, their offspring would be information governance. Information governance represents a way to spread the cost of eDiscovery infrastructure among different budgets. It’s not a made up value proposition. Security and regulatory people do have a need, and many departments can ultimately benefit from more granular and regimented management of their unstructured and legacy information stores.

I remain something of a skeptic about what has come to be called “defensible deletion.” Most in-house IT people do not understand that, even after you purchase a single instance de-duplication solution, you’re still going to have as much as 40% “bloat” in your collection of data between local stores, embedded and encoded attachments, etc. So, there are marked efficiencies we can achieve by implementing sensible de-duplication and indexing mechanisms that are effective, ongoing and systemic. Consider enterprise indexing models that basically let your organization and its information face an indexing mechanism in much the same way as the internet faces Google. Almost all of us interact with the internet through Google, and often get the information we are seeking from the Google index or synopsis of the data without actually proceeding to the indexed site. The index itself becomes the resource, and the document indexed a distinct (and often secondary) source. We must ask ourselves: “if a document is indexed, does it ever leave our collection?”
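To make the “bloat” point concrete, here is a minimal sketch of hash-based de-duplication, assuming the simplest approach of hashing each stored item’s raw bytes with SHA-256. The file names and contents are hypothetical, and real single instance products work in their own ways; the sketch only illustrates why an identical attachment that is encoded inside a message can slip past whole-file hashing.

```python
import base64
import hashlib
from collections import defaultdict


def content_hash(data: bytes) -> str:
    """Hash the raw bytes of an item; identical bytes give identical hashes."""
    return hashlib.sha256(data).hexdigest()


def group_duplicates(items):
    """Group item names by the hash of their bytes.

    items: dict mapping a name to its raw bytes.
    Returns a dict mapping each hash to the list of names that share it.
    """
    groups = defaultdict(list)
    for name, data in items.items():
        groups[content_hash(data)].append(name)
    return dict(groups)


# Hypothetical collection: two exact copies, one revised version, and one
# message that carries the same attachment in base64-encoded form.
attachment = b"Q4 forecast"
docs = {
    "mailbox_a/report.docx": attachment,
    "mailbox_b/report.docx": attachment,
    "share/report_final.docx": b"Q4 forecast v2",
    "archive/message.eml": (
        b"Content-Transfer-Encoding: base64\n\n" + base64.b64encode(attachment)
    ),
}

groups = group_duplicates(docs)
exact_copies = groups[content_hash(attachment)]

print(f"{len(docs)} stored items, {len(groups)} distinct hashes")
print("Exact copies found:", sorted(exact_copies))
```

Whole-file hashing collapses the two exact copies into one, but the encoded copy inside the message hashes differently and is still stored, which is one way a collection stays larger than the de-duplication count suggests.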

I also think eDiscovery education is changing and I am cautiously optimistic. But, people are getting just enough better information about eDiscovery to be dangerous. And, they are still hurting themselves by expecting there to be some simple “I don’t really need to know it” rule of thumb that will get them through. And, that’s an enormous problem. You can’t cross examine from a script. Advocates need to understand the answers they get and know how to frame the follow up and the kill. My cautious optimism respecting education is a function of my devoting so much more of my time to education at the law school and professional levels as well as for judicial organizations. I am seeing a lot more students interested in the material at a deeper level, and my law class that just concluded in December impressed me greatly. The level of enthusiasm the students brought to the topic and the quality and caliber of their questions were as good as any I get from my colleagues in the day to day practice of eDiscovery. Not just from lawyers, but also from people like you who are deeply immersed in this topic.

That is not so much a credit to my teaching (although I hope it might be). The greatest advantage that students have is that they haven’t yet acquired bad habits and don’t come with preconceived notions about what eDiscovery is supposed to be. Conversely, many lawyers literally do not want to hear about certain topics–they “glaze” and immediately start looking for a way to say “this cannot be important, I cannot have to know this”. Law students don’t waste their energy that way. If the professor says “you need to know this”, then they make it their mission to learn. Yesterday, I had a conversation with a student where she said “I really wish we could have learned more about search strategies and more ways to apply sophisticated tools hands on”. That’s exactly what I wish lawyers would say.

I wish lawyers were clamoring to better understand things like search or de-duplication or the advantages of one form of production over another. Sometimes, I feel like I am alone in my assessment that these are crucial issues. If I am the only one thinking that settling on forms of productions early and embracing native forms of production is crucial to quality, what is wrong with me?

I am still surprised at how many people TIFF most of their collection or production.

They have no clue how really bad that is, not just in terms of cost but also in terms of efficiency. I am hoping the dialogue about TAR will bring us closer to a serious discussion about quality in eDiscovery. We never had much of a dialogue about the quality of human review or the quality of paper production. Either we didn’t have the need, or, more likely, we were so immersed in what we were doing we did not have the language to even begin the conversation.

I wrote in a blog post recently about a cool experiment discussed in my college Introductory Psychology course that involved raising kittens such that they could only see for a few hours a day in an environment composed entirely of horizontals or verticals. Apparently, if you are raised from birth only seeing verticals, you do not learn to see horizontals, and vice-versa. So, if I raise a kitten among the horizontals and take a black rod and put it in front of them, they see it when it is horizontal. But, if I orient it vertically, it disappears in their brain. That is kind of how we are with lawyers and eDiscovery.

There are just some topics that you and I and our colleagues see the importance of, but lawyers have been literally raised without the ability to see why those things matter. They see what has long been presented to them in, say, Summation or Concordance, as an assemblage of lousy load files and error ridden OCR and colorless images stripped of embedded commentary. They see this information so frequently and so exclusively that they think that’s the document and, since they only have paper document frames of reference (which aren’t really that much better than TIFFs), they think this must be what electronic evidence looks like. They can’t see the invisible plane they’ve been bred to overlook.

You can look at a stone axe and appreciate the merits of a bronze axe – if all that you’re comparing it to are prehistoric tools, a bronze axe looks pretty good. But, today we have chainsaws. I want lawyers demanding chainsaws to deal with electronic information and to throw away those incredibly expensive stone axes; but, unfortunately, they make more money using stone axes. But, not for long. I am seeing the “house of cards” start to shake and the house of cards I am talking about is the $100 to $300 (or more) per gigabyte pricing for eDiscovery. I think that model is not only going to be short lived, but will soon be seen as negligence in the lawyers who go that route and as exploitive gouging by service providers, like selling a bottle of water for $10 after Hurricane Sandy. There is a point at which price gouging will be called out. We can’t get there fast enough.

Nigel Murray of Huron Legal – eDiscovery Trends

February 28, 2013

This is the eighth of the 2013 LegalTech New York (LTNY) Thought Leader Interview series. eDiscoveryDaily interviewed several thought leaders at LTNY this year and generally asked each of them the following questions:

What are your general observations about LTNY this year and how it fits into emerging trends?
If last year’s “next big thing” was the emergence of predictive coding, what do you feel is this year’s “next big thing”?
What are you working on that you’d like our readers to know about?

Today’s thought leader is Nigel Murray. Nigel is Managing Director at Huron Legal. Nigel has been at the forefront of the litigation support and e-Disclosure industry in the UK since 1991. He managed the first e-disclosure project to go before a U.K. court in the early 2000s and has since advised and worked with many clients in the U.K., mainland Europe and the Middle East in a range of industry sectors. Prior to joining Huron, Nigel was the founder and managing director of TRILANTIC, the first U.K.-based e-disclosure company, and a litigation support manager in a major international law firm. Nigel has been a speaker at engagements throughout the U.S., Europe and the Middle East, and he has published multiple articles.

What are your general observations about LTNY this year and how it fits into emerging trends?

This was my 15th Legal Tech show over 18 years and it was as good as ever. The show attracts all the key people in the industry to New York where new ideas and concepts are discussed and shared in an informal environment. This year did not bring any startling “new” technology, more a shift along the evolutionary cycle.

If last year’s “next big thing” was the emergence of predictive coding, what do you feel is this year’s “next big thing”?

I have three thoughts about “big things” for the coming year. The first is a continued refinement of the thinking on technology assisted review. This year, technology assisted review (sometimes called predictive coding) is becoming widespread and there are now a lot of companies that offer it. However, my personal view is that there are still only a few of those solutions that are defensible and repeatable. Regardless of how good the technology is, it still requires a great deal of expertise and work flow to actually get what you want out of it. I predict that one of the challenges that will arise at some point will be a court case against a company that offers technology assisted review after a review has gone wrong. The people who really understand computer assisted review understand that it requires a process.

Another area that has been around for a while, but is gaining emphasis, is the effective management of a corporation’s data. New, affordable technologies are available to dramatically reduce the amount of rubbish within an organization, as well as de-duplicate the huge volumes of data. That falls into a number of areas within the EDRM model and within organizations’ structures: it’s partly risk, partly records and information management (RIM) and partly information governance. I feel that over the next three years, the whole area could become increasingly important. Now, that will drive down the cost of eDiscovery because if, after you have effectively whittled down your rubbish and got rid of the duplicates, you have only one-third of the documents to manage, your eDiscovery costs are going to be dramatically lower.

Data management combines with the third area that I think will be talked about this year, and that is information security. A lot of corporations understand the importance of keeping their information secure and some corporations, like banks, are required to do so. However, the model that we have built up is that even though companies may keep their sensitive data secure internally, when it is time for discovery, they give the data to other organizations to process and work with; and those organizations may not have that same level of security. At a fascinating dinner the other night, I heard about 20 to 50 corporations saying, “we cannot trust our law firms to look after our data securely.” The keynote speaker told the dinner that he had recently gone to a law firm and asked whether they believed they were secure, and they said “of course, we are secure.” He then produced the minutes from the firm’s board meeting two days previously! Stories like that are becoming widely known by corporations, so I think the effect is that the corporations are increasingly going to want to keep the data behind their own firewalls. The data will be reduced, analyzed and hosted behind a company’s firewall and the external review entity and the law firm will be looking at the data within that domain. I think that is going to be a significant change to this industry.

What are you working on that you’d like our readers to know about?

At Huron Legal this year, we have launched Integrated Analytics, which falls under the TAR/CAR brackets. Integrated Analytics is built around data analytics specialists who are both lawyers and database administrators, so they understand data and are lawyers as well, which is an unusual, but effective, combination. The approach that we have taken is that we will work with internal counsel and external advisers where we do the “pushing of the buttons” and perform the searches. We prefer to do it for our clients because junior and senior attorneys charging 200 an hour are not necessarily the most qualified to be performing the analytics in the most defensible and reputable manner. So, we launched this service to help our clients reduce the amount of data that needs to be reviewed and also speed up that review process. We have the expertise to get through data more quickly, resulting in cost savings, so it’s a different model from those who try to do it themselves. We also provide within our pricing expert testimony from statisticians and lawyers on our process, if required. The launch of our Integrated Analytics team is our big news here at the show.

Thanks, Nigel, for participating in the interview!

And to the readers, as always, please share any comments you might have or if you’d like to know more about a particular topic!

Disclaimer: The views represented herein are exclusively the views of the author, and do not necessarily represent the views held by CloudNine Discovery. eDiscoveryDaily is made available by CloudNine Discovery solely for educational purposes to provide general information about general eDiscovery principles and not to provide specific legal advice applicable to any particular circumstance. eDiscoveryDaily should not be used as a substitute for competent legal advice from a lawyer you have retained and who has agreed to represent you.

George Socha of Socha Consulting LLC – eDiscovery Trends

February 25, 2013

This is the seventh of the 2013 LegalTech New York (LTNY) Thought Leader Interview series. eDiscoveryDaily interviewed several thought leaders at LTNY this year and generally asked each of them the following questions:

What are your general observations about LTNY this year and how it fits into emerging trends?
If last year’s “next big thing” was the emergence of predictive coding, what do you feel is this year’s “next big thing”?
What are you working on that you’d like our readers to know about?

Today’s thought leader is George Socha. A litigator for 16 years, George is President of Socha Consulting LLC, offering services as an electronic discovery expert witness, special master and advisor to corporations, law firms and their clients, and legal vertical market software and service providers in the areas of electronic discovery and automated litigation support. George has also been co-author of the leading survey on the electronic discovery market, The Socha-Gelbmann Electronic Discovery Survey; in 2011, he and Tom Gelbmann converted the Survey into Apersee, an online system for selecting eDiscovery providers and their offerings. In 2005, he and Tom Gelbmann launched the Electronic Discovery Reference Model project to establish standards within the eDiscovery industry – today, the EDRM model has become a standard in the industry for the eDiscovery life cycle and there are nine active projects with over 300 members from 81 participating organizations. George has a J.D. from Cornell Law School and a B.A. from the University of Wisconsin – Madison.

What are your general observations about LTNY this year and how it fits into emerging trends?

First of all, this year’s show has a livelier feel to it after a few years where it was feeling a bit flat, no doubt due to the economy. The show has more “spark” to it, which is good not just for this conference but also for the industry and where it’s at and where it’s going.

As for the curriculum, if last year was the year of TAR/CAR/Predictive Coding, so was this year. It’s also the year of “big data” – whatever “big data” means – and it may or may not be the year of information governance – whatever that means. I think a lot of what we see continues to focus on the same underlying set of issues, and providers are being ever more creative with the packages of the services, software and the capabilities they are offering. They are trying to figure out how to get those offerings in front of the consuming audience with a compelling story addressing the question of why you should go the extra step and use what they have to offer instead of doing things as you always have done them. Predictive coding is still more discussion than action, but it is interesting to hear the different opinions. I moderated a panel with two trial lawyers who are heads of their eDiscovery practice groups, who talked about the processes they now go through with clients when discussing predictive coding, to determine whether it’s appropriate for a given case. The two attorneys were discussing the benefits of CAR, the drawbacks, how much extra it is likely to cost, how much it is likely to save and whether it is likely to even save anything. This is a discussion that didn’t happen much a year ago and hardly at all two years ago. To place this in context, however, I have worked with one corporation that has been doing what we now call Computer Assisted Review since 2003 to my direct knowledge and, I am told, since 2000. CAR is not new in terms of techniques, rather it is new in terms of its packaging and presentation and “productization”.

If last year’s “next big thing” was the emergence of predictive coding, what do you feel is this year’s “next big thing”?

If you look at the eDiscovery industry, what the software providers have been developing and the skills and expertise that the service providers and law firms have been building up over the years, they are amassing a powerful set of capabilities that until now has been focused on one pretty narrow set of issues – eDiscovery. I see people starting to take those tools, techniques and experience and beginning to point them in new directions far beyond just eDiscovery, because most of what we deal with in eDiscovery applies in other areas as well. For example, I see a turn toward broader information governance issues, such as how you get your electronic house in order so that things like eDiscovery become less of a pain point, how you do a better job of figuring out what is and what isn’t a record, and how you can get rid of content you have been holding onto for years. These issues extend beyond eDiscovery. They include what you do to identify compliance challenges, and monitoring whether you are meeting those challenges in an effective fashion. You could use the same technologies and approaches to improve how you manage your intellectual property assets, essentially pointing the EDRM framework in a new direction. I think we are on the brink of what could be an enormous expansion of uses of these capabilities that have been developed in a niche area for some time now.

What are you working on that you’d like our readers to know about?

With regard to EDRM, we are approaching our tenth year. We are looking to that milestone and asking ourselves what EDRM should be today, what it should be tomorrow, and what can we do to improve what we do and how we do it. We are going to shift to smaller working groups focused on more targeted projects with a shorter delivery cycle. You can see the beginnings of that in some of our recently published deliverables.

The Computer Assisted Review Reference Model (CARRM) (our blog post about CARRM here) was our first outcome using this process and the second was the EDRM Talent Task Matrix (our blog post about it here) that we published on Monday. For now, the Talent Task Matrix consists of a diagram that helps explain the concept as well as an accompanying spreadsheet which is available in Excel format (XLSX) or Adobe Acrobat (PDF) format that anyone can download. We are looking for comments and feedback on the matrix and anticipate that it will fill a need and a gap that are not otherwise being addressed.

With regard to Apersee, providers continue to add information about themselves and we continue to add features. In the past year, we replaced the search engine with a faceted search mechanism that is simpler to use. We added an Event Calendar with links to Apersee providers. We added in a Press Release section which works in much the same way. We’re looking to develop two additional sections which take specific types of content associated with providers and make that available within the application. The underlying notion is to better help consumers evaluate providers on many dimensions, with an easily followed structure to the content available through the site.

Finally, we added the ability for consumers to submit Special Requests, so that if in looking for a provider and searching through the website they do not find the result they need, they always can submit a special request to us through the click of a button. We reformulate the message and send it out to about 2,700 people in the provider community. Unless you choose otherwise, the request is totally anonymous. Typically, we get back 20 to 40 relevant responses within the first few hours, which usually is more information than the requestor can handle. The responses from the request system have been very positive.

Thanks, George, for participating in the interview!

And to the readers, as always, please share any comments you might have or if you’d like to know more about a particular topic!
