Intro to Predictive Coding: Overview & Interpretation of Terminology June 2014 | Page 6
Chapter One
Introduction
Predictive coding uses computers and machine learning to
reduce the number of documents in large document sets to
those that are relevant to the matter. It is a highly effective
method for culling data sets to save time, money and effort.
Predictive coding learns to categorize documents (for example,
as responsive or non-responsive) based on a relatively small
sample of example documents.
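To make the idea of learning from a small sample concrete, here is a minimal sketch in Python. The source does not name a specific algorithm, so the sketch assumes a simple multinomial naive Bayes classifier as a stand-in. The function names (`tokenize`, `train`, `classify`) and the four-document `seed_set` are hypothetical, invented purely for illustration; a production system would use far more examples and a more robust model.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace splitting is enough for this toy example.
    return text.lower().split()


def train(examples):
    """Build per-label word counts from (text, label) example pairs."""
    word_counts = {}          # label -> Counter of word frequencies
    doc_counts = Counter()    # label -> number of example documents
    vocab = set()             # every word seen in any example
    for text, label in examples:
        doc_counts[label] += 1
        counts = word_counts.setdefault(label, Counter())
        for word in tokenize(text):
            counts[word] += 1
            vocab.add(word)
    return word_counts, doc_counts, vocab


def classify(model, text):
    """Return the label with the highest naive Bayes log-score."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label in doc_counts:
        # Start from the prior: the share of examples carrying this label.
        score = math.log(doc_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word not in vocab:
                continue  # ignore words never seen in the examples
            # Add-one (Laplace) smoothing avoids taking log(0).
            score += math.log(
                (word_counts[label][word] + 1) / (total_words + len(vocab))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical sample of reviewed documents, coded by a human expert.
seed_set = [
    ("merger agreement draft attached for review", "responsive"),
    ("please review the merger pricing terms", "responsive"),
    ("lunch menu for the office party", "non-responsive"),
    ("office party parking reminder", "non-responsive"),
]

model = train(seed_set)
print(classify(model, "updated merger terms attached"))
print(classify(model, "party lunch reminder"))
```

The human-coded examples play the role of the sample described above: the classifier only generalizes the coding decisions it was shown, which is why the quality of the example documents matters.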
Predictive coding is not magic. It does not replace all human
review. It does not cure cancer. Predictive coding applies
mathematical algorithms and statistical analysis to emulate
the decisions that an authoritative expert would make,
based on the evidence in the documents.
Predictive coding allows one person or a small group of people
to effectively review millions of documents in a short period of
time, with higher accuracy and consistency, and at a much
lower cost than traditional review methods. In predictive
coding, a computer is “trained” to distinguish between
responsive and non-responsive documents. The system can
then use the