The FFT algorithm implemented in the HTK is the Decimation-In-Time FFT algorithm as detailed explained in the attached file. The computation could be reflected in the following flow graph ( 8-poing DFT ):
A minor difference is that in the HTK implementation, the Wn=exp(j * 2 * pi / n), while in common the Wn adopted is Wn=exp(-j * 2 * pi).
The above FFT algorithm only deals with complex numbers. To do DFT on the real valued signal sequence we need a real version of the FFT, for which we could utilize the complex FFT. In the second document from Texas Instrument, they give a fast implementation of the FFT for real valued sequences ( on page 12-19). The algorithm is illustrated briefly in following figure: