WAV

.wav files are compressed speech sample files.  Each .wav file has an
ascii header, which is not compressed, followed by the compressed
speech data.  Compression was done using the "shorten" speech
compression algorithm developed by Tony Robinson at Cambridge
University, UK, and implemented by NIST for use in their SPHERE v2.0
software package.  Accordingly, the ascii file headers are SPHERE
format headers, comprising the first 1024 bytes of each .wav file.
The first line specifies the header type and the second line specifies
the header length.  Below is a sample header:

NIST_1A
   1024
sample_byte_format -s2 10
channel_count -i 1
sample_count -i 15312
sample_rate -i 8000
sample_n_bytes -i 2
sample_sig_bits -i 16
end_head

As per the NIST format definition, the bytes between the "end_head"
line and the 1024th byte are undefined.  Full details on the header
format are given in the file "header.doc" in this directory.
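
The header layout described above can be read with a few lines of code.
The following is a minimal Python sketch, not part of the SPHERE
package; the function name "parse_sphere_header" is hypothetical.  It
assumes each field line has the form "<name> <type> <value>", where the
type is "-i" (integer), "-r" (real) or "-s<N>" (string of N characters);
"header.doc" remains the authoritative description.

```python
def parse_sphere_header(data: bytes):
    """Parse a SPHERE header from the first bytes of a .wav file.

    Returns (header_size, fields).  Hypothetical helper for illustration.
    """
    # latin-1 never fails to decode, which matters because the bytes
    # after "end_head" are undefined.
    lines = data.decode("latin-1").split("\n")
    if len(lines) < 2 or lines[0].strip() != "NIST_1A":
        raise ValueError("not a NIST_1A SPHERE header")
    header_size = int(lines[1].strip())  # second line: header length

    fields = {}
    for line in lines[2:]:
        line = line.strip()
        if line == "end_head":
            break  # everything after this line is undefined
        if not line:
            continue
        parts = line.split(None, 2)
        if len(parts) != 3:
            raise ValueError("malformed header line: %r" % line)
        name, ftype, value = parts
        if ftype == "-i":
            fields[name] = int(value)
        elif ftype == "-r":
            fields[name] = float(value)
        else:
            # Assumed to be a "-s<N>" string field; kept as text.
            fields[name] = value
    return header_size, fields
```

For the sample header shown above, dividing sample_count by sample_rate
gives the length of the recording: 15312 / 8000 = 1.914 seconds.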

In order to uncompress the speech data, you should obtain the latest
version of the SPHERE 2.0 software package.  This is available for
free via anonymous ftp, as follows ("%" is the operating system
prompt; "ftp>" is the prompt given by the ftp program):

        % ftp jaguar.ncsl.nist.gov
        Name:  anonymous
        Password:  <your email address>
        ftp> binary
        ftp> cd pub
        ftp> get sphere_2.0_Beta2.tar.Z
        ftp> bye
        % uncompress sphere_2.0_Beta2.tar.Z
        % tar xf sphere_2.0_Beta2.tar

If you do not have access to ftp file transfer services, please
contact the Linguistic Data Consortium; the software package can be
sent to you by mail on the media of your choice.

Once the SPHERE 2.0 package has been installed, the .wav files can be
uncompressed using the "w_decode" command; this program will convert
the compressed waveform data into 16-bit signed linear sample data,
using the byte order that is native to the system running the program.
Please refer to the documentation included in the SPHERE package for
further information.
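
Once the data has been uncompressed, the samples can be read as 16-bit
signed integers in native byte order.  The Python sketch below is an
illustration only: the function name "read_native_samples" is
hypothetical, and it assumes the caller has already separated the raw
sample bytes from any header (for instance, by skipping the first 1024
bytes if the decoded file keeps its SPHERE header, which should be
checked against the SPHERE documentation).

```python
import array


def read_native_samples(raw: bytes) -> list:
    """Interpret raw bytes as 16-bit signed samples in native byte order.

    Hypothetical helper; 'raw' must contain sample data only, no header.
    """
    samples = array.array("h")  # "h" = signed short, native byte order
    if samples.itemsize != 2:
        raise RuntimeError("signed short is not 16 bits on this platform")
    usable = len(raw) - (len(raw) % 2)  # ignore a trailing odd byte
    samples.frombytes(raw[:usable])
    return samples.tolist()
```

Native byte order is used here only because that is what "w_decode" is
described as producing; data moved between machines of different byte
order would need to be byte-swapped first.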


LOLA

The lola files are ascii "location and label" files.  They are similar
to the ".phn" files of the TIMIT database, except that:

1) the locations are given in a unit of time other than the sample;
2) a short header states what this unit is.

Each file in this distribution has the header

	MillisecondsPerFrame: 3.0
	END OF HEADER

The header is followed by a series of lines, one per segment, of the form

	<begin frame> <end frame + 1>  label

For example

	200  237   ah
	237  289   m

The [ah] segment extends from frame 200 to frame 236 inclusive.  The
end value is 237, one past the last frame, for "historical" reasons.

The lola files have the extension ".ptl".
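
As an illustration of the format, the Python sketch below reads a lola
file and converts the frame numbers to milliseconds.  The function name
"parse_lola" is hypothetical, and the sketch assumes that every segment
line starts with two integers followed by the label.

```python
def parse_lola(text: str):
    """Parse the contents of a lola file.

    Returns (ms_per_frame, segments), where each segment is a tuple
    (begin_ms, end_ms, label) and end_ms is exclusive.
    Hypothetical helper for illustration.
    """
    lines = iter(text.splitlines())

    # Header: look for the frame duration, stop at the end marker.
    ms_per_frame = None
    for line in lines:
        line = line.strip()
        if line == "END OF HEADER":
            break
        if line.startswith("MillisecondsPerFrame:"):
            ms_per_frame = float(line.split(":", 1)[1])
    if ms_per_frame is None:
        raise ValueError("MillisecondsPerFrame not found in header")

    # Body: "<begin frame> <end frame + 1> label", one segment per line.
    segments = []
    for line in lines:
        parts = line.split()
        if not parts:
            continue
        begin, end = int(parts[0]), int(parts[1])
        label = " ".join(parts[2:])
        segments.append((begin * ms_per_frame, end * ms_per_frame, label))
    return ms_per_frame, segments
```

With the header and the two example lines above, the [ah] segment
starts at 200 * 3.0 = 600 ms and ends just before 237 * 3.0 = 711 ms.
Because the end value is one past the last frame, the segment length in
frames is simply end minus begin (here 37 frames, or 111 ms).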