Intro to Predictive Coding: Overview & Interpretation of Terminology June 2014 | Page 27
Glossary
Latent Semantic Indexing (LSI)—the use of latent
semantic analysis to index a collection of
documents.
Machine learning—a branch of computer science
that deals with designing computer programs
to extract information from examples. For example,
properties that distinguish between responsive and
nonresponsive documents may be extracted from
example documents in each category. The goal is to
predict the correct category for future untagged
examples based on the knowledge extracted from
the previously classified examples. Common
approaches include neural networks, support vector
machines, and Bayesian classifiers.
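As a rough illustration of the machine learning entry above, the sketch below trains a very small Bayesian (naive Bayes) classifier on a few tagged example documents and then predicts a category for an untagged one. The sample texts, the "responsive" and "nonresponsive" labels as strings, and the function names train and predict are all assumptions made for this example. They are not drawn from any particular predictive coding product, and real systems use far more training data and richer features.

```python
import math
from collections import Counter


def train(examples):
    """Count words per category from (text, label) pairs (hypothetical helper)."""
    word_counts = {}
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(text.lower().split())
    return word_counts, label_counts


def predict(model, text):
    """Return the category with the highest naive Bayes log score."""
    word_counts, label_counts = model
    vocab = set()
    for counts in word_counts.values():
        vocab.update(counts)
    total_docs = sum(label_counts.values())

    best_label, best_score = None, float("-inf")
    for label, counts in word_counts.items():
        # Start from the share of training documents in this category.
        score = math.log(label_counts[label] / total_docs)
        total_words = sum(counts.values())
        for word in text.lower().split():
            # Add-one smoothing keeps unseen words from zeroing out the score.
            score += math.log((counts[word] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Illustrative, made-up training documents that a reviewer has already tagged.
tagged = [
    ("merger agreement draft attached", "responsive"),
    ("revised merger terms for review", "responsive"),
    ("lunch menu for friday", "nonresponsive"),
    ("office holiday party schedule", "nonresponsive"),
]

model = train(tagged)
print(predict(model, "please review the merger draft"))
```

The classifier never sees a rule such as "documents about the merger are responsive". It only sees the tagged examples and extracts the word patterns that separate the two categories, which is the sense of "extract information from examples" in the definition.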
Nearest neighbor classification—a statistical
procedure that classifies objects, such as
documents, according to the most similar item that
has already been assigned a category label. This
approach uses a set of labeled examples to classify
subsequent unlabeled items, by choosing the