Intro to Predictive Coding: Overview & Interpretation of Terminology June 2014 | Page 28
Glossary
category assigned to the most similar labeled
example (its nearest neighbor) or examples. Knearest neighbor classification uses the k most
similar classified objects to determine the
classification of an unknown object.
Population – the universe of things about which we
are trying to infer with our samples. For example,
the population may be the set of documents that
we want to classify as putatively responsive or
putatively non-responsive. The group from which
we pull our samples. Also called the sampling
frame.
Precision – the proportion of retrieved documents
that are responsive. See also recall.
Predictive coding – a group of machine learning
technologies that predict which documents are
and are not responsive based on the decisions
applied by a subject matter expert to a small
sample of documents.