Intro to Predictive Coding: Overview & Interpretation of Terminology June 2014 | page 18
There are several ways to conduct an evaluation
after predictive coding:
a. After the documents have been categorized by the
system, review can be continued on newly generated
random samples of documents. That is, the same expert
continues to evaluate random samples of documents until
a sample size the parties agree is adequate has been
obtained. The system’s efficacy on this sample is taken as a
measure of its performance.
b. A separate random sample of documents designated by
the predictive coding system as non-responsive can be
evaluated to compute the Elusion measure. Elusion is the
proportion of documents classified as putatively
non-responsive that should have been classified as responsive.
Ideally, only a small proportion of the documents in the
putatively non-responsive set will be found to be
responsive. In practice, the proportion of responsive
documents in the putatively non-responsive set should be
only a small fraction of the prevalence of responsive
documents. Elusion, therefore, needs to be compared to
the original estimate of responsive document prevalence.
The size of this sample will depend on the required
confidence level and the acceptable width of the
confidence interval (the margin of error).
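
The Elusion calculation in item b can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a prescribed protocol: the function names are hypothetical, the counts in the usage note are invented, and the sample-size function uses the standard normal-approximation formula for a proportion with no finite-population correction, which may differ from the method the parties agree on.

```python
import math
from statistics import NormalDist


def elusion(responsive_found: int, sample_size: int) -> float:
    """Share of a random sample drawn from the putatively
    non-responsive set that reviewers judged responsive."""
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    if not 0 <= responsive_found <= sample_size:
        raise ValueError("responsive_found must be between 0 and sample_size")
    return responsive_found / sample_size


def elusion_to_prevalence_ratio(elusion_rate: float, prevalence: float) -> float:
    """Elusion expressed as a fraction of the original prevalence
    estimate; smaller values mean fewer responsive documents were missed."""
    if prevalence <= 0:
        raise ValueError("prevalence must be positive")
    return elusion_rate / prevalence


def required_sample_size(confidence_level: float,
                         margin_of_error: float,
                         expected_proportion: float = 0.5) -> int:
    """Assumed formula: n = z^2 * p * (1 - p) / e^2, rounded up.
    expected_proportion = 0.5 is the most conservative choice."""
    if not 0 < confidence_level < 1:
        raise ValueError("confidence_level must be between 0 and 1")
    if margin_of_error <= 0:
        raise ValueError("margin_of_error must be positive")
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(0.5 + confidence_level / 2)
    p = expected_proportion
    return math.ceil(z ** 2 * p * (1 - p) / margin_of_error ** 2)
```

For example, if reviewers find 6 responsive documents in a sample of 400 putatively non-responsive documents, elusion(6, 400) gives an Elusion of 1.5%. Against an original prevalence estimate of 10%, elusion_to_prevalence_ratio(0.015, 0.10) is about 0.15, so Elusion is roughly one-seventh of prevalence. Under the assumed formula, required_sample_size(0.95, 0.05) returns 385 documents for a 95% confidence level and a margin of error of plus or minus 5 percentage points.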