ABOUT THE TRAINING AND TEST MATERIAL USED FOR SPHINX

Appendix B.3. "Training and Test Speakers" in Kai-Fu Lee's
dissertation [1] enumerates the set of 120 speakers comprising the
Resource Management Corpus's Speaker Independent Training and
Development Test Sets. Lee regarded this set of 120 speakers as
consisting of 80 training speakers, a subset of the Development
Test set he terms the "first 25 evaluation speakers", a set of 10
"March-87 evaluation speakers", and a set of 6 "October-87
evaluation speakers". "Since one speaker is overlapped between the
two evaluation sets [speaker gwt0], there are actually 15 test
speakers"; these 15 are the test speakers cited in Lee's early work
with Sphinx.
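
The speaker counts above can be cross-checked with a little arithmetic.
The following is only an illustrative sketch: the variable names are
invented here, and it assumes that gwt0 is the sole speaker shared
between any of the groups.

```python
# Group sizes as stated in the text; names are illustrative only.
training = 80        # Speaker Independent training speakers
first_25_dev = 25    # "first 25 evaluation speakers" (Development Test)
march_87 = 10        # "March-87 evaluation speakers"
october_87 = 6       # "October-87 evaluation speakers"
shared = 1           # gwt0 appears in both March-87 and October-87 sets

# Distinct test speakers after removing the double-counted speaker.
test_speakers = march_87 + october_87 - shared

# Total distinct speakers across all groups (assumes no other overlap).
total_speakers = training + first_25_dev + test_speakers

print(test_speakers, total_speakers)
```

Under these assumptions the tallies agree with the 15 test speakers and
the 120 speakers enumerated in Appendix B.3.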

Lee's reference to speakers from the Resource Management Corpus
Development Test set as "evaluation speakers" has caused some
confusion, since another portion of the Corpus is designated as the
Resource Management "Evaluation Test" set. At the time of Lee's
early work, the Evaluation Test set had not been released to the
DARPA research community; Lee's "evaluation speakers" are in fact
speakers from the Speaker Independent Development Test set.

Lee combined all 80 of the training speakers and the "first 25 of
the evaluation speakers" ["Development Test" speakers] to form a
set of 105 speakers for system training. For each of these 105
speakers, Lee used all 40 of the available sentence utterances
(including the 10 rapid adaptation sentence utterances for each of
the Development Test speakers) to "train the HMMs".
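
The size of the resulting training set follows from these figures. The
sketch below is a back-of-the-envelope tally, not a count taken from
the corpus itself; it assumes that each of the 105 speakers contributed
exactly 40 usable utterances, and the variable names are illustrative.

```python
# Training pool as described in the text; names are illustrative only.
training_speakers = 80 + 25      # training speakers + "first 25" Development Test speakers
utterances_per_speaker = 40      # assumed uniform across all 105 speakers

training_utterances = training_speakers * utterances_per_speaker

print(training_speakers, training_utterances)
```

If the assumption holds, this gives 105 speakers and 4200 training
sentence utterances.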

Subsequent to the work with Sphinx reported in Lee's dissertation,
the DARPA June '88 Benchmark Test Material was designated. This
test set includes material from 8 of the speakers included in the
corpus of 80 [Speaker Independent] training speakers. This occurred
because of an unanticipated overlap of speakers in the Speaker
Dependent and Speaker Independent components of the corpus.
Accordingly, for the June 1988 and subsequent tests, it became
necessary to designate subsets of the Speaker Independent portions
of the corpus that did not involve overlaps of training and test
material. This effort resulted in designation of the "Standard 72
Speaker" and "Augmented 109 Speaker" Training Sets for use with the
June 1988 and subsequent DARPA Benchmark Tests.

Lee used an aggregate set of 15 speakers for the results reported
in his dissertation, including speakers from both the March '87 and
October '87 DARPA Benchmark Tests. Ten sentence utterances for each
of these 15 speakers were used for test purposes, for a total test
set size of 150 sentence utterances. Because Lee used an aggregated
set of test material, it is difficult to make direct comparisons
between the results of Lee's early work with Sphinx and other
results reported on the individual March 1987 and October 1987 test
sets.

[1] Lee, Kai-Fu, "Large-Vocabulary Speaker-Independent Continuous
Speech Recognition: The SPHINX System", Ph.D. Dissertation,
Carnegie Mellon University Computer Science Department, Report No.
CMU-CS-88-148, April 1988.
