Intro to Predictive Coding: Overview & Interpretation of Terminology June 2014 | Page 10
There are several ways that systems can get their training
examples. These training documents are a sample of all of the
documents in the collection. The examples can be selected
randomly and then categorized, provided by expert reviewers,
chosen by the computer, or determined by some combination of
these.

It gives the new document the same category as the most
similar trained example.
5. Active Learning. An iterative process that presents for reviewer
judgment those documents that are most likely to be misclassified.
In conjunction with Support Vector Machines, it presents those
documents that are closest to the current position of the
separating line. The line is moved if any of the presented
documents has been misclassified.
6. Language Modeling. A mathematical approach that seeks to
summarize the meaning of words by looking at how they are used
in the set of documents. Language modeling in predictive coding
builds a model for word occurrence in the responsive and in the
non-responsive documents and classifies documents according to
the model that best accounts for the words in a document being
considered.
7. Relevance Feedback. A computational model that adjusts the
criteria for implicitly identifying responsive documents following
feedback by a knowledgeable user as to which documents are
relevant and which are not.
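
As an illustration of the Language Modeling entry (item 6) above, the
following is a minimal Python sketch, not a description of any particular
product. It assumes documents are already split into lowercase words, uses
simple word counts with add-one smoothing as the "model for word
occurrence," and treats both categories as equally likely beforehand. The
function names and the toy documents are hypothetical and chosen only for
this example.

```python
import math
from collections import Counter


def train_unigram_model(documents):
    """Count how often each word occurs across a list of tokenized documents."""
    counts = Counter()
    for tokens in documents:
        counts.update(tokens)
    return counts


def log_likelihood(tokens, counts, vocabulary_size):
    """Log-probability of the tokens under a word-count model.

    Add-one smoothing (an assumption of this sketch) keeps words that were
    never seen in training from producing a probability of zero.
    """
    total = sum(counts.values())
    return sum(
        math.log((counts[word] + 1) / (total + vocabulary_size))
        for word in tokens
    )


def classify(tokens, responsive_counts, non_responsive_counts):
    """Assign the category whose model best accounts for the words."""
    # Shared vocabulary, plus one extra slot standing in for unseen words.
    vocabulary_size = len(set(responsive_counts) | set(non_responsive_counts)) + 1
    responsive_score = log_likelihood(tokens, responsive_counts, vocabulary_size)
    non_responsive_score = log_likelihood(
        tokens, non_responsive_counts, vocabulary_size
    )
    # Ties fall to "non-responsive"; that choice is arbitrary in this sketch.
    if responsive_score > non_responsive_score:
        return "responsive"
    return "non-responsive"


# Hypothetical training examples, already categorized by a reviewer.
responsive_model = train_unigram_model([
    ["merger", "price", "agreement"],
    ["price", "negotiation", "merger"],
])
non_responsive_model = train_unigram_model([
    ["lunch", "menu", "friday"],
    ["holiday", "party", "friday"],
])

print(classify(["merger", "price", "update"], responsive_model, non_responsive_model))
print(classify(["friday", "lunch"], responsive_model, non_responsive_model))
```

In this toy case the first document is assigned to the responsive category
and the second to the non-responsive category, because each shares more
words with the corresponding training examples. A production system would
typically add weighting for how common each category is and a more careful
treatment of rare words.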