Intro to Predictive Coding: Overview & Interpretation of Terminology June 2014 | Page 27

Glossary

Latent Semantic Indexing (LSI)—the use of latent semantic analysis to index a collection of documents.

Machine learning—a branch of computer science that deals with designing computer programs to extract information from examples. For example, properties that distinguish between responsive and nonresponsive documents may be extracted from example documents in each category. The goal is to predict the correct category for future untagged examples based on the knowledge extracted from the previously classified examples. Example approaches include neural networks, support vector machines, Bayesian classifiers and others.

Nearest neighbor classification—a statistical procedure that classifies objects, such as documents, according to the most similar item that has already been assigned a category label. This approach uses a set of labeled examples to classify subsequent unlabeled items, by choosing the