This is known as perceptual linear prediction (PLP).
The actual speech spectrum (obtained by a DFT of the speech samples)
is modified based on the principles of critical-band auditory masking
and the unequal sensitivity of human hearing at different
frequencies [4].
The next step is to convert the predictor coefficients into feature
vectors. Examples of such vectors include the predictor coefficients
themselves, cepstral coefficients and their derivatives, line spectral
pairs (LSP), log area ratios (LAR), vocal-tract area functions, and the
impulse response h(n) of the filter H(z) [5]. For speaker recognition,
the cepstral coefficients were found to provide the best results [6].
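
As an illustration of this conversion, the following is a minimal sketch of the standard recursion from predictor coefficients to cepstral coefficients. It assumes the all-pole model is written as H(z) = 1 / (1 - sum_k a_k z^-k); the sign convention used elsewhere in this text may differ, in which case the a_k must be negated. The function name `lpc_to_cepstrum` and the parameter `n_ceps` are hypothetical, and the gain term c_0 is omitted.

```python
def lpc_to_cepstrum(a, n_ceps):
    """Convert predictor coefficients to cepstral coefficients.

    a      : list [a_1, ..., a_p], assumed to define
             H(z) = 1 / (1 - sum_k a_k z^-k)
    n_ceps : number of cepstral coefficients c_1..c_n_ceps to return
    """
    p = len(a)
    c = [0.0] * (n_ceps + 1)  # c[0] is unused (gain term omitted)
    for n in range(1, n_ceps + 1):
        # Direct term a_n exists only while n <= p.
        acc = a[n - 1] if n <= p else 0.0
        # Recursive term: sum over k of (k/n) * c_k * a_{n-k},
        # restricted to 1 <= n-k <= p.
        for k in range(max(1, n - p), n):
            acc += (k / n) * c[k] * a[n - k - 1]
        c[n] = acc
    return c[1:]


# For a first-order predictor with a_1 = 0.5, the cepstrum has the
# closed form c_n = 0.5**n / n, which gives a quick sanity check.
ceps = lpc_to_cepstrum([0.5], 3)
```

The cepstral sequence of an all-pole filter is infinite, so `n_ceps` may exceed the predictor order p; the coefficients beyond p are determined entirely by the recursion.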
The cepstrum provides a good measure of the difference between the
spectral envelopes of the speech frames from which the cepstral
vectors were derived.
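
One common way to quantify this difference is the Euclidean distance between two cepstral vectors. The sketch below is only an illustration of that idea, not necessarily the measure used in this text; the function name `cepstral_distance` is hypothetical, and both vectors are assumed to have the same length and to exclude c_0.

```python
import math


def cepstral_distance(c1, c2):
    """Euclidean distance between two cepstral vectors of equal length."""
    if len(c1) != len(c2):
        raise ValueError("cepstral vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(c1, c2)))


# Identical frames give zero distance; the distance grows as the
# cepstral vectors (and hence the spectral envelopes) diverge.
d = cepstral_distance([0.0, 0.0], [3.0, 4.0])
```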