O-Engineers Aug 2017 | Page 39

The first and earliest approach uses long-term averages of acoustic features, such as spectrum representations or pitch. The second approach models the speaker-dependent acoustic features within the individual phonetic sounds that comprise the utterance. The third and latest approach uses discriminative neural networks (NN). Speech and speaker recognition are, in general, subsets of pattern recognition. Thus, three stages are applied in any speaker recognition task: (1) training, (2) testing, and (3) implementation.

The logic behind speaker recognition is to classify differences in the speaker’s articulatory organs, the shape of the vocal tract, the size of the nasal cavity, intonation, and prosody in order to identify the speaker correctly. Furthermore, a language model can be used to improve performance. In practice, significant errors are introduced into the training and testing data by environmental noise, convolutional or white noise, and the speaker’s