The RuSTeN (Russian through
Switched Telephone Network) database was recorded between March 2001 and
February 2003 by
The files were recorded at a
sampling frequency of 11025 Hz, single channel, 16-bit linear.
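The stated recording format can be verified file by file. The following is a minimal sketch in Python using the standard `wave` module, assuming the files are ordinary WAV files that this module can read; the function name `matches_rusten_format` and the constants are hypothetical and not part of the database distribution.

```python
import wave

# Format stated in the database description.
EXPECTED_RATE_HZ = 11025
EXPECTED_CHANNELS = 1
EXPECTED_SAMPLE_WIDTH_BYTES = 2  # 16-bit linear samples


def matches_rusten_format(path):
    """Return True if the WAV file at `path` is 11025 Hz, mono, 16-bit.

    Hypothetical helper for illustration only.
    """
    with wave.open(path, "rb") as wav_file:
        return (
            wav_file.getframerate() == EXPECTED_RATE_HZ
            and wav_file.getnchannels() == EXPECTED_CHANNELS
            and wav_file.getsampwidth() == EXPECTED_SAMPLE_WIDTH_BYTES
        )
```

Running such a check over all files before further processing would flag any file whose header does not match the description.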
Each of the speakers made at
least 5 calls from different locations and/or telephone sets.
Most of the calls were made from a home or office environment with an uncontrolled noise level. In addition, one call per speaker was made from a public telephone (with either street or metro station noise in the background).
The recordings are spontaneous conversations (sometimes guided by the near-end speaker) between the caller and the speech database collector on various subjects (the weather, the caller’s biography, hobbies, etc.). Each recording includes approximately 150 seconds of speech from the far-end speaker and at least 5 seconds from the near-end speaker. In addition, during each call the caller was asked to utter the usual set of digits (0-9) and the words “yes” and “no”.
The time interval between two
successive sessions is at least 2 days.
The database contains 125
far-end speakers, 58 male and 67 female. Each far-end speaker is represented
by at least 5 speech files. The sound files are in WAV format. The speech
filenames encode the following information: FFF (far-end speaker number) and SS
(session number).
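The FFF and SS fields can be extracted from a filename programmatically. The sketch below is in Python and rests on an assumption: the description does not give the exact filename layout, so the pattern used here (three digits, an optional single non-digit separator, then two digits, followed by the extension) is a guess that should be adjusted to the actual files. The function name `parse_speech_filename` is hypothetical.

```python
import re
from pathlib import PurePath

# ASSUMPTION: the name stem is FFF, an optional one-character non-digit
# separator, then SS (e.g. "01203.wav" or "012_03.wav"). The database
# description does not specify the exact layout.
_NAME_PATTERN = re.compile(r"(\d{3})\D?(\d{2})")


def parse_speech_filename(filename):
    """Return (speaker_number, session_number), or None if no match."""
    stem = PurePath(filename).stem  # drop the directory and the extension
    match = _NAME_PATTERN.fullmatch(stem)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))
```

Grouping the parsed results by speaker number gives a simple way to confirm that every far-end speaker has at least 5 speech files.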
Information about the far-end speakers is available in the speakers.doc file.