1999 HUB4 Broadcast News Evaluation English Test Material
Item Name: | 1999 HUB4 Broadcast News Evaluation English Test Material |
Author(s): | Linguistic Data Consortium |
LDC Catalog No.: | LDC2000S88 |
ISBN: | 1-58563-176-0 |
ISLRN: | 691-755-940-811-0 |
DOI: | https://doi.org/10.35111/r4e7-nb71 |
Member Year(s): | 2000 |
DCMI Type(s): | Sound |
Data Source(s): | broadcast news |
Application(s): | speech recognition |
Language(s): | English |
Language ID(s): | eng |
Online Documentation: | LDC2000S88 Documents |
Licensing Instructions: | Subscription & Standard Members, and Non-Members |
Citation: | Linguistic Data Consortium. 1999 HUB4 Broadcast News Evaluation English Test Material LDC2000S88. Web Download. Philadelphia: Linguistic Data Consortium, 2000. |
Related Works: | View |
Introduction
This publication contains the English evaluation test material used in the 1999 NIST Broadcast News Transcription Evaluation, administered by the NIST Spoken Natural Language Processing Group and produced by the Linguistic Data Consortium (catalog number LDC2000S88, ISBN 1-58563-176-0).
Data
The test material is contained in two SPHERE-formatted waveform files. The file bn99en_1.sph (set1) contains 1.5 hours of Broadcast News excerpts from the set2 epoch of the previous year's (1998) evaluation. The file bn99en_2.sph (set2) contains 1.5 hours of Broadcast News excerpts from the summer of 1998. Each file should be recognized separately, per the Broadcast News English Evaluation Specification.
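As a rough illustration of what "SPHERE-formatted" means in practice, the sketch below reads the plain-text header that precedes the audio samples in a NIST SPHERE file. It assumes the usual layout: a `NIST_1A` magic line, a header-size line, `name -type value` fields, and a closing `end_head`. The function names and the sample header values are made up for illustration and are not taken from the corpus documentation.

```python
def parse_sphere_header(raw: bytes) -> dict:
    """Parse the ASCII header block of a NIST SPHERE file into a dict."""
    lines = raw.decode("ascii", errors="replace").split("\n")
    if lines[0].strip() != "NIST_1A":
        raise ValueError("missing NIST_1A magic line; not a SPHERE header")
    # The second line holds the total header size in bytes (commonly 1024).
    fields = {"header_size": int(lines[1].strip())}
    for line in lines[2:]:
        line = line.strip()
        if line == "end_head":
            break
        if not line or line.startswith(";"):
            continue  # skip blank lines and comments
        parts = line.split(None, 2)
        if len(parts) != 3:
            continue  # skip anything that is not "name -type value"
        name, type_code, value = parts
        if type_code == "-i":
            fields[name] = int(value)
        elif type_code == "-r":
            fields[name] = float(value)
        else:
            fields[name] = value  # "-sN" string fields
    return fields


def read_sphere_header(path: str) -> dict:
    """Read only the header of a SPHERE file on disk (path is supplied by the caller)."""
    with open(path, "rb") as f:
        prefix = f.read(16)  # "NIST_1A\n" plus the 8-byte size line
        size = int(prefix.split(b"\n")[1].strip())
        f.seek(0)
        return parse_sphere_header(f.read(size))


# Hypothetical header, for illustration only; real values come from the files.
sample = (
    b"NIST_1A\n   1024\n"
    b"sample_rate -i 16000\n"
    b"channel_count -i 1\n"
    b"sample_coding -s3 pcm\n"
    b"end_head\n"
)
header = parse_sphere_header(sample)
```

Calling `read_sphere_header("bn99en_1.sph")` on a local copy would return the same kind of dictionary; the audio samples begin at byte offset `header_size`.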
Additional test material for each set is also included. Test materials include evaluation map files (bn99en_1.uem), automatically generated segmentation files (bn99en_1.seg), transcripts from the evaluation (bn99en_1.utf) and the utf.dtd used to validate the transcripts, reference STM files (bn99en_1.stm), and transcript orthography mapping files (en981118.glm). For more complete information, see the 1998 HUB4 Website.
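The reference STM files can be inspected with a few lines of code. The sketch below assumes the segment time mark layout used by NIST scoring tools: one segment per line, with file ID, channel, speaker, start time, end time, an optional `<...>` label, and then the transcript, and with `;;` marking comment lines. The function name, speaker ID, and sample line are hypothetical; check the corpus documentation for the exact conventions used in bn99en_1.stm.

```python
def parse_stm_line(line: str):
    """Parse one STM segment line; return None for blank or comment lines."""
    line = line.strip()
    if not line or line.startswith(";;"):
        return None
    parts = line.split(None, 5)
    if len(parts) < 5:
        raise ValueError(f"too few fields in STM line: {line!r}")
    file_id, channel, speaker, start, end = parts[:5]
    rest = parts[5] if len(parts) > 5 else ""
    label = None
    # An optional label such as <o,f0,male> may precede the transcript.
    if rest.startswith("<") and ">" in rest:
        label, _, rest = rest.partition(">")
        label += ">"
        rest = rest.strip()
    return {
        "file": file_id,
        "channel": channel,
        "speaker": speaker,
        "start": float(start),
        "end": float(end),
        "label": label,
        "text": rest,
    }


# Made-up segment line, for illustration only.
segment = parse_stm_line("bn99en_1 1 spk1 10.50 15.25 <o,f0,male> hello world")
comment = parse_stm_line(";; this is a comment")
```

Returning `None` for comments lets a caller filter a whole file with a simple list comprehension.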
Updates
There are no updates at this time.
Note that the waveform and transcript data on this disc are licensed through the Linguistic Data Consortium (LDC) and are subject to usage restrictions. Contact the LDC for license agreement information.
Additional Licensing Instructions
This 'members-only' corpus is available to current members, who can request the data at the listed reduced license fee. Contact ldc@ldc.upenn.edu for information about becoming a member.
Copyright
Portions Copyright 1998 PRI-Public Radio International
Portions Copyright 1997-1998 ABC News
Portions Copyright 1998 NBC News
Portions Copyright 1997-1998 Cable News Network, Inc.
All Rights Reserved