2.6. Audio-Based Feature Extraction and Pattern Classification

Audio-based feature extraction parameterizes a speech signal into a sequence of feature vectors that are less redundant than the raw waveform and therefore better suited to statistical modeling. Although speech signals are nonstationary, their short-term segments can be treated as approximately stationary. Classical signal processing techniques, such as spectral and cepstral analysis, can therefore be applied to short segments of speech on a frame-by-frame basis.
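The frame-by-frame idea can be illustrated with a minimal sketch. It is not the feature extractor used in this book: the 25 ms frame length, 10 ms hop, Hamming window, 12 retained coefficients, and the synthetic two-tone test signal are all assumptions chosen for illustration, and the helper names `frame_signal` and `real_cepstrum` are hypothetical. The sketch cuts a signal into overlapping short-term frames and computes a plain real cepstrum (the inverse FFT of the log magnitude spectrum) for each frame, giving one feature vector per frame.

```python
import numpy as np


def frame_signal(signal, frame_len, hop_len):
    """Split a 1-D signal into overlapping frames, one frame per row."""
    n_frames = 1 + (len(signal) - frame_len) // hop_len
    # Row i holds samples [i * hop_len, i * hop_len + frame_len).
    idx = np.arange(frame_len)[None, :] + hop_len * np.arange(n_frames)[:, None]
    return signal[idx]


def real_cepstrum(frames, n_coeffs):
    """Real cepstrum per frame: inverse FFT of the log magnitude spectrum."""
    frame_len = frames.shape[1]
    windowed = frames * np.hamming(frame_len)      # taper the frame edges
    magnitude = np.abs(np.fft.rfft(windowed, axis=1))
    log_magnitude = np.log(magnitude + 1e-10)      # small floor avoids log(0)
    cepstrum = np.fft.irfft(log_magnitude, n=frame_len, axis=1)
    return cepstrum[:, :n_coeffs]                  # keep low-order coefficients


# Assumed settings for illustration only.
sample_rate = 8000                                 # Hz
frame_len = sample_rate * 25 // 1000               # 25 ms -> 200 samples
hop_len = sample_rate * 10 // 1000                 # 10 ms -> 80 samples

# Synthetic stand-in for one second of speech (two sinusoids).
t = np.arange(sample_rate) / sample_rate
signal = np.sin(2 * np.pi * 200 * t) + 0.5 * np.sin(2 * np.pi * 400 * t)

frames = frame_signal(signal, frame_len, hop_len)
features = real_cepstrum(frames, n_coeffs=12)
print(features.shape)                              # (number of frames, 12)
```

Each row of `features` is the feature vector for one frame, so the array as a whole is the sequence of feature vectors described above. Practical systems typically replace the plain cepstrum with perceptually motivated variants, but the framing step is the same.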

It is well known that physiological and behavioral characteristics differ from one speaker to another. While the physiological differences (e.g., vocal tract shape) cause low-level spectral features to vary among speakers, the behavioral differences ...
