ABOUT SPEECH FILE COMPRESSION AND UNCOMPRESSION
-----------------------------------------------

All speech data files in the VAHA corpus (*.sph) are stored on the
CD-ROMs in compressed form. The particular form of compression used
is version 2.0 of ``shorten'', developed by Tony Robinson of Cambridge
University. This algorithm is intended to give optimal compression
results for speech sample data, and the version 2.0 release provides
improved performance for digitally captured telephone (mu-law) sample
data, compared to earlier releases. (Earlier versions of shorten will
not perform correctly when uncompressing the VAHA data.)

Two software implementations of shorten are available. Both produce
the same output when uncompressing the VAHA speech data, and complete
source code is included here for both.

One is a stand-alone program developed by Tony Robinson (in the
``shorten'' directory). This package includes a pre-compiled
executable program file for MS-DOS users (shorten.exe), so that MS-DOS
users do not need a C compiler system to get started. (Users of other
operating systems will need a C compiler and related utilities to make
an executable program from the source files; but assuming these are
available, the actual compilation is very simple.)

The other is a set of functions embedded in the NIST SPHERE software
package (in the ``sphere'' directory). This package must be installed
via a process involving creation of some object library files and
several executable utility programs; this installation process is
designed for use with the UNIX operating system, and is unlikely to be
easily adaptable to other systems.

In choosing which implementation to use, people who are not using UNIX
platforms should simply use the stand-alone shorten program; this will
be sufficient to provide uncompressed mu-law sample data. People who
are using a UNIX system have a choice between shorten and sphere.
These differ in the following regards.

Stand-alone shorten is compact, and is easy and quick to install and
use, but it does only one thing: compression or uncompression of
speech files.

The sphere package is much larger, takes longer to install, and may
require custom installation steps on some UNIX systems; execution
speed may be slightly slower (but perhaps not significantly so);
program usage is only slightly more complex, by virtue of having
options to support a wider range of activities. The ``w_decode''
utility can produce uncompressed output in 16-bit linear form if
desired (with selectable byte order), rather than the original mu-law.
It also makes sure that the file header of the resulting file is
updated to reflect all changes to the data; this allows for use of
other sphere utilities (w_edit, h_edit, h_read, etc) on the output
data, which can be very convenient.

UNIX users should NOT use both packages (e.g. shorten to uncompress
and other sphere utilities to do other things); shorten will not
modify the file headers, which will cause the sphere utilities to
perform incorrectly on the resulting files. If you intend to use
other sphere utilities (or other processes that recognize and use
sphere file headers), be sure to use ``w_decode'' for uncompression.

The following explains how to use each package to uncompress the
waveform data; it will be assumed that the programs can be found in
the user's current execution path, and that the names ``infile.sph''
and ``outfile.sph'' represent suitable file names, with directory
paths included if necessary, to locate and identify the input and
output files.

SHORTEN:

    shorten -x -a 1024 infile.sph outfile.sph

(The "-a 1024" option specifies that the 1024-byte sphere header
should be passed through unmodified to the output file; without this
option, the command will fail.)

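
For users who need to uncompress many files with stand-alone shorten,
the command above can be wrapped in a small script. The following is
only a sketch: the function names ``shorten_command'' and
``uncompress_tree'' are hypothetical, and it assumes that Python 3 is
available, that shorten is in the execution path, and that the input
files end in lower-case ``.sph''.

```python
import subprocess
from pathlib import Path


def shorten_command(infile, outfile, header_bytes=1024):
    """Build the argument list for the shorten command shown above.

    -x selects uncompression; -a passes the first header_bytes bytes
    (the sphere header) through to the output unmodified.
    """
    return ["shorten", "-x", "-a", str(header_bytes),
            str(infile), str(outfile)]


def uncompress_tree(src_dir, dst_dir):
    """Uncompress every *.sph file under src_dir into dst_dir.

    The directory layout under src_dir is reproduced under dst_dir.
    (Hypothetical helper; requires shorten in the execution path.)
    """
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    for infile in sorted(src_dir.rglob("*.sph")):
        outfile = dst_dir / infile.relative_to(src_dir)
        outfile.parent.mkdir(parents=True, exist_ok=True)
        # check=True raises CalledProcessError if shorten fails.
        subprocess.run(shorten_command(infile, outfile), check=True)
```

Building the command as a list, rather than as a single string, avoids
shell quoting problems with unusual file names.
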
SPHERE:

    w_decode -o ulaw infile.sph outfile.sph
or
    w_decode -o pcm infile.sph outfile.sph

(The first form restores the original mu-law sample data; the second
converts the samples to a native 16-bit linear form before writing to
outfile.sph.)

Note that the compression has been done in a way that leaves the
sphere headers uncompressed; it is therefore possible to read the
headers without having to uncompress the files first.
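
As an illustration of reading a header directly from a compressed
file, the following sketch parses the first 1024 bytes of a .sph file.
It assumes the usual sphere header layout: a ``NIST_1A'' line, a line
giving the header size, then one ``name -type value'' line per field
(-i for integer, -r for real, -sN for a string of N characters), ending
with ``end_head''. The function names are hypothetical; the sphere
utilities remain the reference tools for header access.

```python
def parse_sphere_header(raw):
    """Parse the bytes of a sphere header into a dictionary.

    Assumes the layout described above; stops at 'end_head'.
    """
    lines = raw.decode("ascii", errors="replace").split("\n")
    if len(lines) < 2 or lines[0].strip() != "NIST_1A":
        raise ValueError("not a sphere header")
    fields = {"header_size": int(lines[1].strip())}
    for line in lines[2:]:
        line = line.strip()
        if line == "end_head":
            break
        parts = line.split(None, 2)  # name, type, rest of line
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        name, ftype, value = parts
        if ftype == "-i":
            fields[name] = int(value)
        elif ftype == "-r":
            fields[name] = float(value)
        else:  # -sN string field; keep the value as text
            fields[name] = value
    return fields


def read_sphere_header(path, header_bytes=1024):
    """Read and parse the header of a (possibly compressed) .sph file."""
    with open(path, "rb") as f:
        return parse_sphere_header(f.read(header_bytes))
```

For example, read_sphere_header("infile.sph") returns a dictionary of
whatever fields the header contains, without touching the compressed
sample data that follows it.
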