1. Title: Russian through Switched Telephone Network (RuSTeN)
2. Authors: Anrey Raev, Serguei Koval, Natalia Smirnova, Daria Khitrova, Vitaly Stepanov.
tel: +7 812 3258848 (STC)
raev/koval/nsmirnova/khitrova@speechpro.com
3. Data Type: speech
4. Data Source: telephone. Collected between 2001 and 2003 at the Speech Technology Center (STC).
5. Project: “Trawl” (Automatic Voice Identification System in Telephone Channel). The purpose of the project was to develop software for automatic identification of speakers based on voice samples acquired through telephone channels. The system was trained on the telephone speech corpus RuSTeN. The project has been completed. The “Trawl” software achieves 91% far-end speaker identification reliability, provided that the sample recording is at least 16 sec long and the questioned recording is at least 96 sec long.
6. Application: speaker identification
7. Language: Russian (RUS)
8. Special license:
9. Copyright: Portions © 2001 Speech Technology Center Limited
10. Description:
· Speech in WAV format.
· 627 speech files and 2 doc files, uncompressed.
· Recorded at a sample rate of 11025 Hz, 16-bit linear, 1 channel. The corpus size is 4.5 GB (about 60 hours). The audio files can be played in any sound editor.
· The corpus is recorded on 1 DVD.
· Directory contents:
|_DOC_____
|    readme.doc
|    speakers.doc
|_________
|
|_DATA____
|    00101.wav
|    00102.wav
|    00103.wav
|    00104.wav
|    00105.wav
|    00201.wav
|    00202.wav
|    ......
|    16705.wav
|_________
The Doc directory on CD1 contains general information about the corpus (readme.doc), the procedure specification (procedure.doc) and information about the speakers (speakers.doc): sex, age, education, place where born and raised, and current residence.
The Data directories contain speech files in WAV format. The filenames are read as FFFSS, where FFF stands for the far-end speaker code and SS for the session number. The codes were given to the recruited speakers before or during the first call. However, some speakers failed to record the required minimum of 5 sessions, and the quality of the recorded material was not always acceptable. In all such cases the recordings were not included in the corpus, which is why certain speakers and/or sessions are missing.
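The FFFSS naming convention described above can be decoded mechanically. The following is a minimal Python sketch, not part of the corpus distribution: the function name `parse_rusten_name` is hypothetical, and the sketch assumes that every speech file name consists of exactly five digits followed by the `.wav` extension.

```python
import re
from pathlib import PurePath

# Five digits: FFF = far-end speaker code, SS = session number.
_NAME_RE = re.compile(r"(\d{3})(\d{2})")

def parse_rusten_name(filename):
    """Split a file name such as '00101.wav' into (speaker code, session number)."""
    stem = PurePath(filename).stem  # drop any directory part and the extension
    match = _NAME_RE.fullmatch(stem)
    if match is None:
        raise ValueError(f"not an FFFSS file name: {filename!r}")
    speaker, session = match.groups()
    # Keep the speaker code as a zero-padded string; the session is a plain integer.
    return speaker, int(session)
```

For example, `parse_rusten_name("16705.wav")` yields speaker code `"167"` and session number `5`.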
The Data directories on each CD contain:
CD1 – 17 speakers (001-017)
CD2 – 18 speakers (018-035)
CD3 – 18 speakers (036-039, 041-048, 050-053, 055, 057)
CD4 – 20 speakers (058-061, 064-066, 069-075, 077-079, 082, 086, 088)
CD5 – 21 speakers (092-094, 099-101, 103-107, 109-111, 113-119)
CD6 – 22 speakers (120-125, 127, 129-133, 138-140, 143-148, 151)
CD7 – 9 speakers (152, 154, 155, 157, 158, 161, 164, 166, 167)
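The per-disc speaker lists above can be cross-checked against their stated counts. The Python sketch below is only an illustration: the `DISCS` table is transcribed by hand from the list above, and the helper name `expand_codes` is hypothetical.

```python
def expand_codes(spec):
    """Expand a spec like '036-039, 055' into a list of 3-digit speaker codes."""
    codes = []
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            first, last = part.split("-")
            codes.extend(f"{n:03d}" for n in range(int(first), int(last) + 1))
        else:
            codes.append(f"{int(part):03d}")
    return codes

# Disc number -> (stated speaker count, speaker codes as listed above).
DISCS = {
    1: (17, "001-017"),
    2: (18, "018-035"),
    3: (18, "036-039, 041-048, 050-053, 055, 057"),
    4: (20, "058-061, 064-066, 069-075, 077-079, 082, 086, 088"),
    5: (21, "092-094, 099-101, 103-107, 109-111, 113-119"),
    6: (22, "120-125, 127, 129-133, 138-140, 143-148, 151"),
    7: (9, "152, 154, 155, 157, 158, 161, 164, 166, 167"),
}

# Each expanded list should have exactly as many codes as the stated count.
for disc, (stated, spec) in DISCS.items():
    assert len(expand_codes(spec)) == stated, f"CD{disc} count mismatch"

total_speakers = sum(stated for stated, _ in DISCS.values())
```

Under this transcription every stated count matches its list, and the counts sum to 125 speakers.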
Quality control:
Some of the files were erroneously recorded at a sample rate of 8 kHz, in particular: 07904, 08604, 08605, 08801, 08802, 09201, 09301, 10005.
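Because some files deviate from the nominal 11025 Hz rate, it may be worth checking each file header before processing. The sketch below uses Python's standard `wave` module. It assumes the files are plain PCM WAV files readable by that module and have a lowercase `.wav` extension; the helper name `find_off_rate_files` and the directory argument are hypothetical.

```python
import wave
from pathlib import Path

NOMINAL_RATE_HZ = 11025  # nominal sample rate stated in the corpus description

def find_off_rate_files(data_dir, expected_rate=NOMINAL_RATE_HZ):
    """Return {file stem: sample rate} for WAV files not at the expected rate."""
    off_rate = {}
    for path in sorted(Path(data_dir).glob("*.wav")):
        # Only the header is needed; no audio frames are read.
        with wave.open(str(path), "rb") as wav:
            rate = wav.getframerate()
        if rate != expected_rate:
            off_rate[path.stem] = rate
    return off_rate
```

Files reported by such a check could then be resampled or excluded, depending on the application.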
Suggested price: 1,000 USD