From: http://tech.groups.yahoo.com/group/icsi-speech-tools/message/144
It's generally recommended to always use some kind of feature mean and
variance normalization with Quicknet.
The minimum amount of normalization is to use a single set of mean
and variance estimates calculated over the entire training set. In
this case, normalization simply amounts to translating the
origin of the feature space (the mean normalization) and re-scaling
the axes of the feature space (the variance normalization), since all
data is normalized using the same mean and variance estimates. This is
recommended as a minimum amount of normalization since very big or
very small numbers can cause sigmoids to saturate, and since the
optimal learning rate during MLP training may depend on the scale of
the data (and thus normalization makes it less likely you'll need to
re-tune the learning rate).
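For concreteness, here is a minimal numpy sketch of this global
scheme. Quicknet has its own tooling for computing and applying
these statistics, so this is just an illustration of the arithmetic;
the array names and shapes are assumptions, not anything from
Quicknet itself.

    import numpy as np

    # Hypothetical training set: a list of utterances, each an
    # (n_frames, n_features) array of acoustic feature vectors.
    utterances = [np.random.randn(100, 39) for _ in range(10)]

    # One mean/variance estimate per feature dimension, pooled
    # over every frame in the entire training set.
    all_frames = np.concatenate(utterances, axis=0)
    global_mean = all_frames.mean(axis=0)
    global_std = all_frames.std(axis=0)

    # Subtracting the mean translates the origin of the feature
    # space; dividing by the standard deviation re-scales its axes.
    # Every utterance is transformed with the same shared estimates.
    normalized = [(u - global_mean) / global_std for u in utterances]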
It's very common to go further and do normalization at the utterance
or speaker level. This can be useful for reducing the amount of
variability in the features due to speaker differences and channel
differences.
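A per-utterance variant of the same sketch might look like the
following (again purely illustrative; the eps term is an assumed
guard against a zero-variance feature dimension):

    import numpy as np

    def normalize_per_utterance(utt, eps=1e-8):
        # Each utterance supplies its own mean and variance
        # estimate, which removes per-utterance offsets caused by
        # speaker and channel differences.
        return (utt - utt.mean(axis=0)) / (utt.std(axis=0) + eps)

    # Reusing the hypothetical `utterances` list from the sketch
    # above:
    normalized = [normalize_per_utterance(u) for u in utterances]

Per-speaker normalization works the same way, except the statistics
are pooled over all of a speaker's utterances rather than computed
one utterance at a time.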