Voicemail Corpus Part I

Item Name: Voicemail Corpus Part I
Author(s): M Padmanabhan, G Ramaswamy, B Ramabhadran, P Gopalakrishnan, C Dunn
LDC Catalog No.: LDC98S77
ISBN: 1-58563-141-8
ISLRN: 074-386-777-466-6
Member Year(s): 1998
DCMI Type(s): Sound
Sample Type: 1-channel ulaw
Sample Rate: 8000
Data Source(s): telephone speech
Application(s): speech recognition
Language(s): English
Language ID(s): eng
Online Documentation: LDC98S77 Documents
Licensing Instructions: Subscription & Standard Members, and Non-Members
Citation: Padmanabhan, M, et al. Voicemail Corpus Part I LDC98S77. Web Download. Philadelphia: Linguistic Data Consortium, 1998.

Introduction

This corpus was created by M. Padmanabhan, G. Ramaswamy, B. Ramabhadran, P. S. Gopalakrishnan and C. Dunn.

Data

This corpus consists of 1,801 messages collected from volunteers at various IBM sites in the United States, which make up the training data set, and 42 messages in the development test set. The average voicemail message is 31 seconds long and contains about 100 words. Approximately 38% of the messages are from male speakers; the remainder are from female speakers. All messages were transcribed by IBM.
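As a rough sense of scale, the averages above can be turned into approximate totals. The sketch below is only a back-of-the-envelope estimate. It assumes the quoted averages apply to the 1,801 messages, and that the 1-channel ulaw audio at 8000 Hz is stored as one byte per sample with file headers ignored. The function and constant names are illustrative and are not part of the corpus documentation.

```python
# Figures quoted in the corpus description.
MESSAGES = 1801          # messages in the training data set
AVG_SECONDS = 31         # average message duration
AVG_WORDS = 100          # approximate words per message
SAMPLE_RATE_HZ = 8000    # from the "Sample Rate" field

# Assumption: 8-bit mu-law, single channel, no header overhead.
BYTES_PER_SAMPLE = 1


def estimate_totals(n_messages=MESSAGES):
    """Return approximate totals implied by the per-message averages."""
    total_seconds = n_messages * AVG_SECONDS
    return {
        "hours": total_seconds / 3600,
        "words": n_messages * AVG_WORDS,
        "audio_bytes": total_seconds * SAMPLE_RATE_HZ * BYTES_PER_SAMPLE,
    }


if __name__ == "__main__":
    totals = estimate_totals()
    print(f"Audio:  about {totals['hours']:.1f} hours")
    print(f"Words:  about {totals['words']:,}")
    print(f"Bytes:  about {totals['audio_bytes'] / 1e6:.0f} MB of raw samples")
```

Under these assumptions the estimate comes to roughly 15.5 hours of speech and about 180,000 transcribed words. The actual figures depend on the real distribution of message lengths.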

Updates

There are no updates at this time.

Pricing

The Reduced Licensing Fee for this corpus is US$150.
