In standard ASR systems, the pronunciation models are derived from the dictionary. Different speakers, however, may pronounce the same word with different phone sequences, e.g. by dropping some consonants or merging adjacent phones.
In this paper, the pronunciation model (i.e. the lexicon) is adapted to each speaker to deal with this kind of variability.
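
To make the idea concrete, here is a minimal sketch of what adapting a lexicon to one speaker could look like. It is not the paper's method, which is not described here. It assumes a lexicon that stores a prior probability for each pronunciation variant, and it re-estimates those probabilities from pronunciations observed for the speaker using simple count smoothing toward the prior. The names `BASE_LEXICON`, `adapt_lexicon`, and `tau`, as well as the phone symbols and prior values, are all made up for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical speaker-independent lexicon:
# word -> {pronunciation (tuple of phones): prior probability}.
BASE_LEXICON = {
    "and": {("ae", "n", "d"): 0.8, ("ae", "n"): 0.2},
    "the": {("dh", "ah"): 0.7, ("dh", "iy"): 0.3},
}


def adapt_lexicon(base, observations, tau=5.0):
    """Return a speaker-specific copy of `base`.

    `observations` is a list of (word, phones) pairs, e.g. taken from a
    phone-level alignment of the speaker's adaptation data (an assumption).
    Each variant is re-weighted as
        (count + tau * prior) / (total_count + tau),
    so `tau` controls how strongly the speaker-independent prior is trusted.
    Words that are not in `base` are ignored in this sketch.
    """
    counts = defaultdict(Counter)
    for word, phones in observations:
        counts[word][tuple(phones)] += 1

    adapted = {}
    for word, priors in base.items():
        total = sum(counts[word].values())
        # Keep the dictionary variants and add any new ones seen for the speaker.
        variants = set(priors) | set(counts[word])
        adapted[word] = {
            pron: (counts[word][pron] + tau * priors.get(pron, 0.0)) / (total + tau)
            for pron in variants
        }
    return adapted


# A speaker who drops the final consonant of "and" five times.
speaker_obs = [("and", ["ae", "n"])] * 5
speaker_lexicon = adapt_lexicon(BASE_LEXICON, speaker_obs)
```

With these toy numbers, the reduced form of "and" moves from a prior of 0.2 to 0.6 for this speaker, while "the", which was not observed, keeps its original probabilities. A real system would obtain the observations from forced alignment or phone recognition and would plug the adapted probabilities into the decoder.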