Tuesday, December 8, 2009


Some information about the WSJCAM0 corpus:
1. 140 speakers in total, with about 110 utterances per speaker.

2. 92 training speakers, 20 development test speakers and two sets of 14 evaluation test speakers. Each speaker provides approximately 90 utterances and an additional 18 adaptation utterances.

3. The same set of 18 adaptation sentences was recorded by each speaker, consisting of one recording of background noise, 2 phonetically balanced sentences and the first 15 adaptation sentences from the initial WSJ experiment.

4. Each training speaker read out some 90 training sentences, selected randomly in paragraph units. This is the empirically determined maximum number of sentences that could be squeezed into one hour of speaker time.

5. Each of the 48 test speakers read 80 sentences. The final development test group consists of 20 speakers.
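
As a quick sanity check on the numbers in the list above, the speaker groups and per-speaker counts can be tallied in a few lines of Python. This is only a sketch: the dictionary keys and variable names are made up for illustration, and the per-speaker figures are the nominal values from the notes (the training count is "approximately 90", so real counts may differ).

```python
# Speaker groups as listed in the notes above (key names are illustrative).
speaker_groups = {
    "training": 92,
    "development_test": 20,
    "evaluation_test_1": 14,
    "evaluation_test_2": 14,
}

# All groups together should give the stated total number of speakers.
total_speakers = sum(speaker_groups.values())

# Everyone who is not a training speaker is a test speaker.
test_speakers = total_speakers - speaker_groups["training"]

# Nominal per-speaker utterance counts: read sentences plus the
# 18 adaptation sentences recorded by every speaker.
ADAPTATION_SENTENCES = 18
training_per_speaker = 90 + ADAPTATION_SENTENCES
test_per_speaker = 80 + ADAPTATION_SENTENCES

print("total speakers:", total_speakers)
print("test speakers:", test_speakers)
print("utterances per training speaker:", training_per_speaker)
print("utterances per test speaker:", test_per_speaker)
```

The groups add up to the 140 speakers of item 1 and the 48 test speakers of item 5. The nominal per-speaker totals (108 for training speakers, 98 for test speakers) are a little below the rough figure of 110 given in item 1, which fits with the training count being approximate.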

The CD-ROM publication consists of six discs, with contents organized as follows:

  • discs 1 and 2 - training data from head-mounted microphone
  • disc 3 - development test data from head-mounted microphone, plus first set of evaluation test data
  • discs 4 and 5 - training data from desk-mounted microphone
  • disc 6 - development test data from desk-mounted microphone, plus second set of evaluation test data
There are 90 utterances from each of 92 speakers that are designated as training material for speech recognition algorithms. An additional 48 speakers each read 40 sentences containing only words from a fixed 5,000-word vocabulary, and another 40 sentences using a 64,000-word vocabulary, to be used as testing material. Each of the 140 speakers also recorded a common set of 18 adaptation sentences. Recordings were made from two microphones: a far-field desk microphone and a head-mounted close-talking microphone.
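
The disc layout and the counts above can be written down as a small lookup table, which is handy when deciding which disc to mount for a given experiment. The sketch below is only an illustration: the `DISCS` table, the subset labels and the `discs_for` helper are hypothetical names, not part of the corpus distribution, and the totals are nominal figures per microphone channel derived from the numbers quoted above.

```python
# Disc number -> (microphone, subsets on that disc), following the list above.
# The labels are illustrative, not official directory names.
DISCS = {
    1: ("head-mounted", ["training"]),
    2: ("head-mounted", ["training"]),
    3: ("head-mounted", ["development test", "evaluation test 1"]),
    4: ("desk-mounted", ["training"]),
    5: ("desk-mounted", ["training"]),
    6: ("desk-mounted", ["development test", "evaluation test 2"]),
}


def discs_for(microphone, subset):
    """Return the sorted disc numbers holding `subset` for `microphone`."""
    return sorted(
        disc
        for disc, (mic, subsets) in DISCS.items()
        if mic == microphone and subset in subsets
    )


# Nominal utterance totals per microphone channel.
training_utterances = 92 * 90      # 92 training speakers, 90 sentences each
test_utterances = 48 * (40 + 40)   # 48 test speakers, 40 + 40 sentences each
adaptation_utterances = 140 * 18   # every speaker, 18 adaptation sentences

print(discs_for("head-mounted", "training"))
print(discs_for("desk-mounted", "development test"))
print(training_utterances, test_utterances, adaptation_utterances)
```

With the figures as quoted, this gives 8,280 training, 3,840 test and 2,520 adaptation utterances per microphone channel. These are nominal values, since the per-speaker training count is only approximate.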


