Thursday, November 24, 2011

[HTK] Increase HTK feature dimension limit

Every HTK feature file starts with a header that specifies basic information about the parameters. The HTK Book describes the format as follows:

HTK format files consist of a contiguous sequence of samples preceded by a header. Each sample is a vector of either 2-byte integers or 4-byte floats. 2-byte integers are used for compressed forms as described below and for vector quantised data as described later in section 5.11. HTK format data files can also be used to store speech waveforms as described in section 5.8 

The HTK file format header is 12 bytes long and contains the following data:

nSamples - number of samples in file (4-byte integer)
sampPeriod - sample period in 100ns units (4-byte integer)
sampSize - number of bytes per sample (2-byte integer)
parmKind - a code indicating the sample kind (2-byte integer)
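The header layout above can be sketched in C as follows. This is only an illustration, not code from HTK: the names HtkHeader, read_be32, read_be16 and parse_htk_header are made up for this example, and it assumes the header is stored in big-endian byte order, which is the HTK default.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical struct mirroring the four header fields (not HTK's own type). */
typedef struct {
    int32_t nSamples;    /* number of samples in file */
    int32_t sampPeriod;  /* sample period in 100ns units */
    int16_t sampSize;    /* number of bytes per sample */
    int16_t parmKind;    /* code indicating the sample kind */
} HtkHeader;

/* Assemble big-endian integers byte by byte, so the host byte order
   does not matter. Assumption: the file uses big-endian order. */
int32_t read_be32(const unsigned char *p)
{
    uint32_t v = ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
                 ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
    return (int32_t)v;
}

int16_t read_be16(const unsigned char *p)
{
    uint16_t v = (uint16_t)(((unsigned)p[0] << 8) | (unsigned)p[1]);
    return (int16_t)v;
}

/* Decode the 12-byte header from buf.
   Returns 0 on success, -1 if an argument is NULL or buf is too short. */
int parse_htk_header(const unsigned char *buf, size_t len, HtkHeader *out)
{
    if (buf == NULL || out == NULL || len < 12)
        return -1;
    out->nSamples   = read_be32(buf);       /* bytes 0..3   */
    out->sampPeriod = read_be32(buf + 4);   /* bytes 4..7   */
    out->sampSize   = read_be16(buf + 8);   /* bytes 8..9   */
    out->parmKind   = read_be16(buf + 10);  /* bytes 10..11 */
    return 0;
}
```

To use it, read the first 12 bytes of a feature file with fread and pass them to parse_htk_header. For uncompressed float features, sampSize / 4 then gives the feature dimension.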

From the above specification, sampSize is a 2-byte (short) integer, so its maximum value is 32767. For uncompressed data stored as 4-byte floats, the maximum dimension of each sample is therefore 32767/4, which rounds down to 8191. In practice, however, features of little more than 1000 dimensions already cause the HTK tools to report the following error:

OpenParmChannel: cannot read HTK Header in File 

The reason is that the function ReadHTKHeader in the file HWave.c contains a check on the sampSize value:

   if (hdr.sampSize <= 0 || hdr.sampSize > 5000 || hdr.nSamples <= 0 ||
       hdr.sampPeriod <= 0 || hdr.sampPeriod > 1000000)
      return FALSE;

In other words, the dimension of the feature vector in HTK is limited by this check rather than by the data type specified in the header format. In the standard version of HTK, uncompressed features can have at most 5000/4 = 1250 dimensions. To raise the limit, change the number 5000 and recompile HTK, but remember that sampSize is a short integer, so any value larger than 32767 has no further effect.

The check is at about line 1427 of the file HTKLib/HWave.c.
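To see how the limit translates into a feature dimension, here is a small standalone sketch. It is not part of HTK: max_float_dim and header_is_plausible are hypothetical helper names, and the second function only restates the conditions of the check quoted above, with the hard-coded 5000 turned into a parameter.

```c
#include <assert.h>

/* Largest number of 4-byte float components that fits in a sample
   of at most limit_bytes bytes (integer division rounds down). */
int max_float_dim(int limit_bytes)
{
    return limit_bytes / 4;
}

/* Same conditions as the check quoted from ReadHTKHeader, but with the
   sampSize limit passed in instead of the constant 5000.
   Returns 1 if the header values are accepted, 0 if they are rejected. */
int header_is_plausible(long nSamples, long sampPeriod,
                        int sampSize, int limit_bytes)
{
    if (sampSize <= 0 || sampSize > limit_bytes || nSamples <= 0 ||
        sampPeriod <= 0 || sampPeriod > 1000000)
        return 0;
    return 1;
}
```

With the stock limit, max_float_dim(5000) gives 1250, and a 1251-dimensional float feature (sampSize 5004) is rejected. With the limit raised to 32767, the same header is accepted and max_float_dim(32767) gives 8191.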

Posted via email from Troy's posterous


  1. Can you please explain what to do about a similar error, shown below?

    ERROR [+1013] IsWave: cannot read HTK Header in File fSX9.wav
    FATAL ERROR - Terminating program HCopy

    I am trying to create an mfc file from a wav file using HCopy...
    My config file is

    # Configuration File

    TARGETRATE = 100000.0
    WINDOWSIZE = 250000.0
    PREEMCOEF = 0.97
    NUMCHANS = 26
    CEPLIFTER = 22
    NUMCEPS = 12

    Thanks in advance..

    1. Did you record the data yourself, or does it come from a database? I think that is where the issue lies. If it is from the TIMIT database, SOURCEFORMAT in your configuration file must be NIST.