Regularizing Linear Discriminant Analysis for Speech Recognition
Authors: Hakan Erdogan
Published in: Interspeech 2005 - Eurospeech
Publication year: 2005
Abstract: Feature extraction is an essential first step in speech recognition applications. In addition to static features extracted from each frame of speech data, it is beneficial to use dynamic features (called Δ and ΔΔ coefficients) that use information from neighboring frames. Linear Discriminant Analysis (LDA) followed by a diagonalizing maximum likelihood linear transform (MLLT) applied to spliced static MFCC features yields important performance gains as compared to MFCC+Δ+ΔΔ features in most tasks. However, since LDA is obtained using statistical averages trained on limited data, it is reasonable to regularize LDA transform computation by using prior information and experience. In this paper, we regularize LDA and heteroscedastic LDA transforms using two methods: (1) using statistical priors for the transform in a MAP formulation, and (2) using structural constraints on the transform. As the prior, we use a transform that computes static+Δ+ΔΔ coefficients. Our structural constraint is in the form of a block-structured LDA transform where each block acts on the same cepstral parameters across frames. The second approach suggests using new coefficients for static, first difference and second difference operators as compared to the standard ones to improve performance. We test the new algorithms on two different tasks, namely TIMIT phone recognition and AURORA2 digit sequence recognition in noise. We obtain consistent improvement in our experiments as compared to MFCC features. In addition, we obtain encouraging results in some AURORA2 tests as compared to LDA+MLLT features.
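
The abstract describes a transform that computes static, first-difference and second-difference (delta and delta-delta) coefficients from spliced frames, and a block structure in which each block acts on the same cepstral parameter across frames. The sketch below is one way to illustrate that idea with NumPy; it is not the paper's implementation. The regression window `n=2`, the edge-repeating padding, the frame-by-frame ordering of the spliced vector, and all function names are assumptions made for the example.

```python
import numpy as np


def delta_weights(n=2):
    """Regression weights of a first-difference operator over 2n+1 frames."""
    k = np.arange(-n, n + 1, dtype=float)
    return k / np.sum(k ** 2)


def splice(frames, context):
    """Concatenate each frame with `context` neighbors on each side.

    frames: (T, dim) array. Edge frames are repeated as padding (an assumption).
    Returns a (T, (2*context + 1) * dim) array, ordered frame by frame.
    """
    num_frames = frames.shape[0]
    padded = np.pad(frames, ((context, context), (0, 0)), mode="edge")
    return np.hstack([padded[i:i + num_frames] for i in range(2 * context + 1)])


def static_delta_transform(dim, n=2):
    """Build the (3*dim, (4n+1)*dim) matrix that maps spliced frames to
    static + delta + delta-delta features."""
    context = 2 * n                      # delta-delta needs 2n frames per side
    width = 2 * context + 1
    d1 = delta_weights(n)
    static = np.zeros(width)
    static[context] = 1.0                # pick the center frame
    delta = np.zeros(width)
    delta[n:n + d1.size] = d1            # delta weights, centered in the window
    delta2 = np.convolve(d1, d1)         # delta operator applied twice
    operators = np.stack([static, delta, delta2])   # shape (3, width)
    # Kronecker product with the identity: each operator acts on one
    # cepstral index at a time, across frames.
    return np.kron(operators, np.eye(dim)), context


def block_structure_mask(dim, width, num_operators=3):
    """Boolean mask of entries that couple the same cepstral index across frames."""
    return np.kron(np.ones((num_operators, width)), np.eye(dim)).astype(bool)


# Usage on random stand-in data (not real MFCCs).
rng = np.random.default_rng(0)
mfcc = rng.standard_normal((50, 13))
W, context = static_delta_transform(dim=13)
features = splice(mfcc, context) @ W.T   # shape (50, 39)
```

Under these assumptions, the matrix `W` is zero everywhere outside `block_structure_mask`, so it is itself an instance of a block-structured transform. Letting the entries inside the mask vary, instead of fixing them to the standard weights, would be one reading of the "new coefficients" for the static and difference operators mentioned in the abstract.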