The CMU SPHINX [1, 2] continuous speech recognition system uses context-dependent phonetic hidden Markov models (HMMs) to achieve state-of-the-art recognition performance on large-vocabulary applications. The system used to generate results for the designated October 1989 evaluation test had the following notable features:

(1) Speech input to the system was represented by three independent 8-bit codebooks built from the following features (a schematic sketch of this front-end computation appears after the references):
    - 12 bilinear-transformed LPC cepstrum coefficients, at a 10 ms frame rate.
    - 12 differenced (delta) cepstrum coefficients, computed over a 40 ms window.
    - Normalized energy and differenced energy (2 features total).

(2) Generalized triphone [2] models were trained by first training all triphone models (both within-word and between-word triphones) and then clustering these triphone models using a maximum likelihood criterion. A total of 1100 generalized triphones were trained. The generalized triphone models were interpolated with context-independent phone models.

(3) A corrective training algorithm [4] was applied to enhance discrimination given a grammar. The model parameters were modified to make those contributing to correct recognition more likely, and those contributing to incorrect or near-miss recognitions less likely.

(4) A Viterbi beam search augmented with word duration modeling was used for decoding (a schematic sketch of the beam-pruned search appears after the references).

References:

[1] Lee, K.F., "Automatic Speech Recognition: The Development of the SPHINX System", Kluwer Academic Publishers, Boston, MA, 1989.
[2] Lee, K.F., Hon, H.W., Reddy, R., "An Overview of the SPHINX Speech Recognition System", IEEE Transactions on Acoustics, Speech, and Signal Processing, January 1990.
[3] Lee, K.F., "Context-Dependent Phonetic Hidden Markov Models for Continuous Speech Recognition", IEEE Transactions on Acoustics, Speech, and Signal Processing, April 1990.
[4] Lee, K.F., Mahajan, S., "Corrective and Reinforcement Learning for Speaker-Independent Continuous Speech Recognition", Computer Speech and Language, April 1990.
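
The following is a minimal, illustrative sketch of the three parameter streams listed under feature (1); it is not the original SPHINX front end. It assumes the 12 bilinear-transformed LPC cepstrum coefficients and a per-frame log energy are already computed at a 10 ms frame rate, that the 40 ms delta window is realized as the difference between the frames 20 ms ahead and 20 ms behind the current frame, and that energy is normalized relative to the utterance maximum; all of these details are assumptions of the sketch.

    import numpy as np

    def sphinx_style_streams(cepstra, log_energy):
        """Illustrative front-end streams (not the original SPHINX code).

        cepstra:     (T, 12) array of bilinear-transformed LPC cepstrum
                     coefficients, one row per 10 ms frame (assumed given).
        log_energy:  (T,) array of per-frame log energies (assumed given).

        Returns the three parameter streams that would each be vector
        quantized into an independent 8-bit (256-entry) codebook.
        """
        T = cepstra.shape[0]

        # Delta cepstrum over a 40 ms window: difference of the frames
        # 20 ms ahead and 20 ms behind (2 frames at a 10 ms frame rate).
        # Edge frames are handled by clamping indices (an assumption here).
        idx_fwd = np.minimum(np.arange(T) + 2, T - 1)
        idx_bwd = np.maximum(np.arange(T) - 2, 0)
        delta_cepstra = cepstra[idx_fwd] - cepstra[idx_bwd]

        # Normalized energy (relative to the utterance maximum) and
        # differenced energy over the same 40 ms window.
        norm_energy = log_energy - log_energy.max()
        delta_energy = log_energy[idx_fwd] - log_energy[idx_bwd]
        energy_stream = np.stack([norm_energy, delta_energy], axis=1)  # (T, 2)

        return cepstra, delta_cepstra, energy_stream

Each of the three streams would then be quantized into its own 256-entry (8-bit) codebook, for example with a k-means-style vector quantization procedure.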
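
Below is a minimal sketch of time-synchronous Viterbi decoding with beam pruning, the search strategy named under feature (4). The data structures (`log_init`, `log_trans`, `log_obs`) and the pruning threshold are illustrative assumptions rather than the SPHINX implementation, and the word-duration term that the system adds at word exits is only indicated by a comment.

    import math

    def beam_viterbi(log_init, log_trans, log_obs, beam=20.0):
        """Time-synchronous Viterbi search with beam pruning (illustrative).

        log_init:  dict state -> initial log probability.
        log_trans: dict state -> list of (next_state, transition log prob).
        log_obs:   list over frames; log_obs[t] is a dict state -> observation
                   log likelihood (e.g. derived from the discrete output
                   distributions and the frame's three VQ codeword indices).
        beam:      hypotheses scoring more than `beam` below the best
                   hypothesis in a frame are discarded.
        """
        scores = dict(log_init)          # active hypotheses before frame 0
        backptrs = []                    # per-frame backpointers for traceback

        for obs in log_obs:
            new_scores, bp = {}, {}
            for state, score in scores.items():
                for nxt, log_p in log_trans.get(state, []):
                    # A word-duration log score would be added here on
                    # transitions that leave a word-final state.
                    cand = score + log_p + obs.get(nxt, -math.inf)
                    if cand > new_scores.get(nxt, -math.inf):
                        new_scores[nxt] = cand
                        bp[nxt] = state
            if not new_scores:
                raise ValueError("all hypotheses pruned; check the model inputs")
            # Beam pruning relative to the best hypothesis of this frame.
            best = max(new_scores.values())
            scores = {s: v for s, v in new_scores.items() if v >= best - beam}
            backptrs.append(bp)

        final = max(scores, key=scores.get)
        return final, scores[final], backptrs

The state sequence, and hence the word sequence, is recovered by walking the per-frame backpointer tables from `final` back to the first frame.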