1999 HUB4 Broadcast News Evaluation English Test Material

Item Name: 1999 HUB4 Broadcast News Evaluation English Test Material
Author(s): Linguistic Data Consortium
LDC Catalog No.: LDC2000S88
ISBN: 1-58563-176-0
ISLRN: 691-755-940-811-0
Member Year(s): 2000
DCMI Type(s): Sound
Data Source(s): broadcast news
Application(s): speech recognition
Language(s): English
Language ID(s): eng
Online Documentation: LDC2000S88 Documents
Licensing Instructions: Subscription & Standard Members, and Non-Members
Citation: Linguistic Data Consortium. 1999 HUB4 Broadcast News Evaluation English Test Material LDC2000S88. Web Download. Philadelphia: Linguistic Data Consortium, 2000.

Introduction

This publication contains the English evaluation test material used in the 1999 NIST Broadcast News Transcription Evaluation, administered by the NIST Spoken Natural Language Processing Group and produced by the Linguistic Data Consortium (catalog number LDC2000S88, ISBN 1-58563-176-0).

Data

The test material is contained in two SPHERE-formatted waveform files. The file bn99en_1.sph (set1) contains 1.5 hours of Broadcast News excerpts from the set2 epoch of the previous year's (1998) evaluation. The file bn99en_2.sph (set2) contains 1.5 hours of Broadcast News excerpts from the summer of 1998. Each file should be recognized separately, per the Broadcast News English Evaluation Specification.
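Because the waveforms are in SPHERE format, each file begins with a plain-text header that can be inspected before decoding any audio. The following is a minimal sketch, assuming the standard NIST SPHERE layout (a "NIST_1A" magic line, a header-size line, "name -type value" fields, and a closing "end_head"). The function name read_sphere_header and the demo header values are hypothetical and are not taken from this corpus.

```python
import io


def read_sphere_header(stream):
    """Parse a NIST SPHERE header from a seekable binary stream.

    Returns (header_size_in_bytes, dict_of_fields).
    """
    magic = stream.readline().strip()
    if magic != b"NIST_1A":
        raise ValueError("not a SPHERE file: missing NIST_1A magic line")
    # The second line gives the total header size in bytes (commonly 1024).
    header_size = int(stream.readline().strip())
    stream.seek(0)
    raw = stream.read(header_size).decode("ascii", errors="replace")

    fields = {}
    for line in raw.splitlines()[2:]:
        line = line.strip()
        if line == "end_head":
            break
        if not line or line.startswith(";"):
            continue  # skip blank lines and comments
        parts = line.split(None, 2)
        if len(parts) < 3:
            continue  # skip malformed field lines
        name, ftype, value = parts
        if ftype == "-i":
            fields[name] = int(value)
        elif ftype == "-r":
            fields[name] = float(value)
        else:
            fields[name] = value  # string fields such as "-s3 pcm"
    return header_size, fields


# Synthetic header for illustration only; the values are made up.
_demo = (
    b"NIST_1A\n   1024\n"
    b"sample_rate -i 16000\n"
    b"channel_count -i 1\n"
    b"sample_coding -s3 pcm\n"
    b"end_head\n"
).ljust(1024, b" ")
demo_size, demo_fields = read_sphere_header(io.BytesIO(_demo))
```

With the actual data, the same function could be called on a file opened in binary mode, for example open("bn99en_1.sph", "rb"); the audio samples start at the returned header size.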

Additional test material for each set is also included. Test materials include unpartitioned evaluation map files (bn99en_1.uem), automatically generated segmentation files (bn99en_1.seg), transcripts from the evaluation (bn99en_1.utf) and the utf.dtd used to validate the transcripts, reference STM files (bn99en_1.stm), and transcript orthography mapping files (en981118.glm). For more complete information, see the 1998 HUB4 Website.
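The reference STM files are line-oriented text, so individual segments are easy to read programmatically. The following is a rough sketch, assuming the usual NIST STM record layout (file, channel, speaker, begin time, end time, an optional label in angle brackets, then the transcript) and ";;" comment lines. The function name parse_stm_line and the sample record are hypothetical; the exact label contents in this corpus may differ.

```python
def parse_stm_line(line):
    """Parse one STM record into a dict, or return None for comments/blanks."""
    line = line.strip()
    if not line or line.startswith(";;"):
        return None
    parts = line.split(None, 5)
    if len(parts) < 5:
        raise ValueError("STM record needs at least five fields: " + line)
    fname, channel, speaker, begin, end = parts[:5]
    rest = parts[5] if len(parts) > 5 else ""

    label = None
    if rest.startswith("<"):
        # Optional label such as <o,f0,male>; keep its contents verbatim.
        label, _, rest = rest.partition(">")
        label = label[1:]
        rest = rest.strip()

    return {
        "file": fname,
        "channel": channel,
        "speaker": speaker,
        "begin": float(begin),
        "end": float(end),
        "label": label,
        "text": rest,
    }


# Made-up record for illustration; not taken from bn99en_1.stm.
sample = "bn99en_1 1 spk1 10.50 14.25 <o,f0,male> hello world"
segment = parse_stm_line(sample)
comment = parse_stm_line(";; a comment line")
```

Keeping the label as an uninterpreted string avoids guessing at its internal structure.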

Updates

There are no updates at this time.

Note that the waveform and transcript data on this disc are licensed through the Linguistic Data Consortium (LDC) and are subject to usage restrictions. Contact the LDC for license agreement information.

Pricing

The Reduced Licensing Fee for this corpus is US$150.
