Intro to Predictive Coding: Overview & Interpretation of Terminology June 2014

There are several ways that systems can get their training examples. These training documents are a sample of all of the documents in the collection. The examples can be selected randomly and then categorized, provided by expert reviewers, chosen by the computer, or determined by some combination of these.

It gives the new document the same category as the most similar trained example.

5. Active Learning. An iterative process that presents for reviewer judgment those documents that are most likely to be misclassified. In conjunction with Support Vector Machines, it presents those documents that are closest to the current position of the separating line. The line is moved if any of the presented documents has been misclassified.

6. Language Modeling. A mathematical approach that seeks to summarize the meaning of words by looking at how they are used in the set of documents. Language modeling in predictive coding builds a model for word occurrence in the responsive and in the non-responsive documents and classifies documents according to the model that best accounts for the words in a document being considered.

7. Relevance Feedback. A computational model that adjusts the criteria for implicitly identifying responsive documents following feedback by a knowledgeable user as to which documents are relevant and which are not.
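
As a rough illustration of the language modeling entry (item 6), the sketch below builds one word-occurrence model from responsive examples and one from non-responsive examples, then assigns a new document to whichever model gives its words the higher likelihood. It assumes a simple unigram model with add-one smoothing, which is only one of many possible choices and is not necessarily what any particular predictive coding product uses. The function names, the whitespace tokenizer, and the sample documents are all hypothetical and invented for this example.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately crude tokenizer)."""
    return text.lower().split()


def train_unigram_model(documents, vocabulary, alpha=1.0):
    """Estimate smoothed log-probabilities for every vocabulary word.

    alpha is an assumed smoothing constant so unseen words never get
    probability zero.
    """
    counts = Counter()
    for doc in documents:
        counts.update(tokenize(doc))
    total = sum(counts[w] for w in vocabulary) + alpha * len(vocabulary)
    return {w: math.log((counts[w] + alpha) / total) for w in vocabulary}


def log_likelihood(model, text):
    """Sum the log-probabilities of the document's known words."""
    return sum(model[w] for w in tokenize(text) if w in model)


def classify(text, responsive_model, non_responsive_model):
    """Pick the category whose model best accounts for the document's words.

    Ties (for example, a document with no known words) fall back to
    "non-responsive" in this sketch.
    """
    score_r = log_likelihood(responsive_model, text)
    score_n = log_likelihood(non_responsive_model, text)
    return "responsive" if score_r > score_n else "non-responsive"


# Hypothetical training examples, invented for illustration only.
responsive_docs = ["merger pricing agreement", "pricing terms for the merger"]
non_responsive_docs = ["lunch menu for friday", "office party on friday"]

# Both models share one vocabulary so their probabilities are comparable.
vocabulary = set()
for doc in responsive_docs + non_responsive_docs:
    vocabulary.update(tokenize(doc))

responsive_model = train_unigram_model(responsive_docs, vocabulary)
non_responsive_model = train_unigram_model(non_responsive_docs, vocabulary)

label = classify("draft merger agreement", responsive_model, non_responsive_model)
print(label)
```

The gap between the two scores can also serve as a crude confidence signal: documents whose scores are nearly equal are the ones this sketch is most likely to misclassify, which is the kind of document an active learning process (item 5) would present to a reviewer.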
