Wednesday, March 31, 2010

Figure spanning 2 columns in LaTeX

If you're using two columns in a LaTeX document, you'll often find that a table or figure is too wide for a single column. In that case, use
\begin{figure*}
\end{figure*}

\begin{table*}
\end{table*}

and the figure or table will span the full width of the page.

Posted via email from Troy's posterous

Tuesday, March 23, 2010

Statistical significance

In statistics, a result is called statistically significant if it is unlikely to have occurred by chance.

The amount of evidence required to accept that an event is unlikely to have arisen by chance is known as the significance level or critical p-value. In traditional Fisherian statistical hypothesis testing, the p-value is the probability, conditional on the null hypothesis, of the observed data or more extreme data.

p(Data|NullHypothesis)

If the obtained p-value is small, then either the null hypothesis is false or an unusual event has occurred.

In the SCTK toolkit, the null hypothesis is:

There is no performance difference between the two systems.

Thus the p-value is the probability, assuming the two systems do not differ, of the test statistic taking a value at least as extreme as the one actually observed.

So, the smaller the p-value is, the more statistically significant the difference between the two systems is.
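As a rough illustration of how such a p-value can be computed, here is a minimal sketch of a two-sided exact sign test on paired per-utterance error counts. This is not SCTK's implementation; the function name and the sample error counts are made up for the example.

```python
from math import comb


def sign_test_p_value(errors_a, errors_b):
    """Two-sided exact sign test on paired per-utterance error counts.

    Null hypothesis: neither system is more likely than the other to have
    fewer errors on an utterance. Ties are dropped.
    """
    wins_a = sum(a < b for a, b in zip(errors_a, errors_b))
    wins_b = sum(a > b for a, b in zip(errors_a, errors_b))
    n = wins_a + wins_b
    if n == 0:
        return 1.0  # no utterance differs, so there is no evidence at all
    k = min(wins_a, wins_b)
    # Probability of a split at least this lopsided under a fair coin.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical error counts: system A is better on all 8 utterances.
system_a = [1, 0, 2, 1, 0, 3, 1, 0]
system_b = [2, 1, 3, 2, 1, 4, 2, 1]
print(sign_test_p_value(system_a, system_b))  # 0.0078125
```

A value this small means that, if the two systems really performed the same, such a one-sided split would be very unlikely.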

Speaker Adaptation Method

Liang_ICASSP_2010.pdf (135 KB)

In the attached paper, the authors combine two speaker adaptation methods to improve speech synthesis performance in cross-lingual experiments.

The system is HMM-based, and the two methods are:

1) Decision Tree Marginalization

2) HMM State Mapping

From: http://publications.idiap.ch/downloads/papers/2009/Liang_ICASSP_2010.pdf

Investigation on Tandem Approach

In this attached paper, the authors investigated different aspects of the Tandem approach in the hybrid system.

The overall system they used:

Different processing combinations:

The best combination is deltas+PCA+normalize.

Investigations into Tandem Acoustic Modeling for the Aurora Task.pdf (71 KB)

euro01-aurora-poster.pdf (54 KB)

Tandem Approach for NN/HMM

The Tandem approach to NN/HMM uses a neural network (NN) for feature extraction. The resulting posterior features are then used as input to a conventional HMM system.
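A minimal sketch of the feature-extraction side of this idea, in plain Python. It assumes the NN produces one vector of output activations per frame; the softmax, the log of the posteriors, the flooring constant and the per-dimension normalization are assumptions made for illustration, not details taken from the attached paper. Any decorrelation step (such as PCA) is left out.

```python
import math


def softmax(logits):
    """Convert NN output activations into posterior probabilities."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def tandem_features(frames_logits, floor=1e-10):
    """Turn per-frame NN outputs into log-posterior feature vectors."""
    features = []
    for logits in frames_logits:
        posteriors = softmax(logits)
        features.append([math.log(max(p, floor)) for p in posteriors])
    return features


def normalize(features):
    """Per-dimension mean and variance normalization over all frames."""
    count, dim = len(features), len(features[0])
    means = [sum(f[d] for f in features) / count for d in range(dim)]
    stds = [
        math.sqrt(sum((f[d] - means[d]) ** 2 for f in features) / count) or 1.0
        for d in range(dim)
    ]
    return [[(f[d] - means[d]) / stds[d] for d in range(dim)] for f in features]


# Hypothetical NN outputs for three frames and three phone classes.
nn_outputs = [[2.0, 0.5, -1.0], [0.1, 1.5, 0.3], [-0.7, 0.2, 2.2]]
hmm_input = normalize(tandem_features(nn_outputs))
```

The vectors in hmm_input would then take the place of the usual acoustic features when training the HMM system.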

The architecture is:

The training procedure:

Discussion:

Details are in the attached paper.

icassp00-nnhmm.pdf (59 KB)

icassp2000-poster.pdf (84 KB)

Monday, March 22, 2010

Install JDT on top of CDT

Inside the Eclipse IDE, go to Help -> Install New Software.

Add the following Eclipse Galileo repository:
http://download.eclipse.org/releases/galileo

to 'Available Software Sites'.

Then select the JDT from the Programming Languages category.

The following are some useful repositories:

Eclipse Galileo Repository - http://download.eclipse.org/releases/galileo
galileo - http://download.eclipse.org/tools/cdt/releases/galileo
update site - http://www.eclipse.org/jdt/core/update-site

Friday, March 19, 2010

Phoneme Recognition

In their paper, Petrov et al. report a phone error rate (PER) of 21.4% on TIMIT.

List of different methods mentioned in their paper:

In their paper, Sung et al. use Hidden Conditional Random Fields for phone recognition on TIMIT and achieve 28.3% PER.

emnlp07a.pdf (326 KB)

asru09.pdf (210 KB)

icassp10.pdf (66 KB)

icassp97.pdf (394 KB)

high accuracy phone recognition using context clustering and quasi-triphonic models.pdf (850 KB)

Wednesday, March 17, 2010

Using Sclite

Step I: Convert the recognition results generated by HTK to a format that sclite accepts, such as the trn format.

In trn format, each line is one utterance, followed by the utterance id in parentheses.

e.g:

I LIKE ICE CREAM (c02abd30)
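The conversion could be sketched roughly as follows. It assumes the HTK results are in a master label file (MLF) where each utterance starts with a quoted file name and ends with a line containing a single period, and where word labels are never purely numeric. The function name mlf_to_trn and the sample lines are made up for the example.

```python
def mlf_to_trn(mlf_lines):
    """Convert the lines of an HTK MLF into trn-format lines."""
    trn_lines = []
    utt_id, words = None, []
    for raw in mlf_lines:
        line = raw.strip()
        if not line or line == "#!MLF!#":
            continue  # skip blank lines and the MLF header
        if line.startswith('"'):
            # "*/c02abd30.rec" -> c02abd30
            file_name = line.strip('"').split("/")[-1]
            utt_id = file_name.rsplit(".", 1)[0]
            words = []
        elif line == ".":
            # End of the utterance: emit "WORDS (utterance id)".
            trn_lines.append(f"{' '.join(words)} ({utt_id})")
            utt_id, words = None, []
        else:
            fields = line.split()
            # Drop the optional start and end times in front of the label.
            skipped = 0
            while skipped < 2 and len(fields) > 1 and fields[0].isdigit():
                fields = fields[1:]
                skipped += 1
            words.append(fields[0])
    return trn_lines


# Hypothetical recognizer output for one utterance.
sample = [
    "#!MLF!#",
    '"*/c02abd30.rec"',
    "0 3600000 I -1234.5",
    "3600000 7200000 LIKE -2345.6",
    "ICE",
    "CREAM",
    ".",
]
print("\n".join(mlf_to_trn(sample)))  # I LIKE ICE CREAM (c02abd30)
```

Tokens such as sentence start or end markers are not filtered here; whether they need to be removed depends on the recognizer setup.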

Step II: Generate the alignment using sclite

e.g:

sclite -F -i wsj -r ref.trn -h w15_bg.trn -o sgml

-F scores fragments as correct; it is an alternative to "-d", which uses "diff" for the alignment;

-i sets the utterance id type to be wsj (Wall Street Journal)

-r sets the reference transcription to be file "ref.trn"

-h sets the recognized results file to be "w15_bg.trn"

-o sets the output result format to be "sgml"

The output is a file named "w15_bg.trn.sgml".

Note: each experimental result file has to be aligned against the reference separately to generate its own "sgml" file.
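When there are several result files, the alignment step can be scripted. The sketch below only reuses the flags from the example above; the helper names are made up, and it assumes sclite is on the PATH and names its output by appending ".sgml" to the hypothesis file name, as in the example.

```python
import subprocess


def sclite_command(ref_trn, hyp_trn):
    """Build the sclite call from the example for one hypothesis file."""
    return ["sclite", "-F", "-i", "wsj", "-r", ref_trn, "-h", hyp_trn, "-o", "sgml"]


def align_all(ref_trn, hyp_trns, run=subprocess.run):
    """Align every hypothesis file and return the expected sgml file names."""
    sgml_files = []
    for hyp_trn in hyp_trns:
        # check=True raises an error if sclite exits with a failure code.
        run(sclite_command(ref_trn, hyp_trn), check=True)
        sgml_files.append(hyp_trn + ".sgml")
    return sgml_files
```

For example, align_all("ref.trn", ["w15_bg.trn", "w15_tg.trn"]) runs sclite twice and returns the two sgml file names used in the next step.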

Step III: Significance Test using sc_stats

e.g:

cat w15_bg.trn.sgml w15_tg.trn.sgml | sc_stats -p -t mapsswe -v -u -n result_bg_tg

-p reads the alignments from stdin, here the piped output of cat;

-t specifies the test to be mapsswe (the Matched Pairs Sentence Segment Word Error Test)

-v outputs a detailed analysis for each test performed on a pair of hypothesis files

-u unifies the test results into a single comparison matrix instead of creating a comparison matrix for each test

-n sets the name that the output report files begin with
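The same pipeline can be driven from a script. This is a rough sketch that reuses only the flags from the command above; the helper names are made up, and it assumes sc_stats is on the PATH.

```python
import subprocess


def sc_stats_command(report_name, test="mapsswe"):
    """Build the sc_stats call from the example above."""
    return ["sc_stats", "-p", "-t", test, "-v", "-u", "-n", report_name]


def run_significance_test(sgml_paths, report_name, run=subprocess.run):
    """Concatenate the sgml alignments (like cat) and pipe them to sc_stats."""
    chunks = []
    for path in sgml_paths:
        with open(path, "rb") as handle:
            chunks.append(handle.read())
    # The joined bytes are sent to sc_stats on stdin, which is what -p expects.
    return run(sc_stats_command(report_name), input=b"".join(chunks), check=True)
```

Calling run_significance_test(["w15_bg.trn.sgml", "w15_tg.trn.sgml"], "result_bg_tg") corresponds to the shell command shown above.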

Appendix:

sc_stats options:

Sc_stats Commandline Options

The commandline options for sc_stats can be broken into four categories:

- Input File Options
- Output Options
- Report Generation Options
- Statistical Test Options

Input File Options:

-p

Alignments are read from 'stdin' as input to sc_stats. The format of the input must be in the "sgml" output format, created either by '-o sgml' or by piped input from another sctk utility.

Output Options:

-e desc

Description of the ensemble of hyp files.

-O output_dir

Writes all output files into output_dir. Defaults to the hypfile's directory.

-n name

Writes all multiple hypothesis file reports to files beginning with 'name'. Using '-' writes to stdout. Default: 'Ensemble'

Report Generation Options:

-g

- The 'range' graphs are an ASCII representation of the variability in error rates for a given speaker. The graph is sorted by the mean of the statistic computed for each speaker.

- The 'grange' graph is a gnuplot version of the same data plotted in 'range'. There are two sets of files created. The first set, which is called '*.grange.spk.plt' and '*.grange.spk.dat', contains the gnuplot command files and data files respectively for the speaker performance variability across systems graph. The second set, which is called '*.grange.sys.plt' and '*.grange.sys.dat', contains the gnuplot command files and data files respectively for the system performance variability across speakers graph.

- The 'grange2' graph is similar to the 'grange' graph except that each system's speaker word error scores are identified by a unique symbol.

-r

prn -: Example
sum -: Example
rsum -: Example
lur -: Example
es -: Example
res -: Example
none -: Produce no output reports, Default.

Statistical Test Options:

-t

mcn -: Perform the McNemar Test.
mapsswe -: Perform the Matched Pairs Sentence Segment Word Error Test
sign -: Perform the Sign Test
wilc -: Perform the Wilcoxon Signed Rank Test
anovar -: Perform the Analysis of Variance by Rank Test
std -: This is a shorthand notation to do the 'standard' four tests: mcn, mapsswe, wilc and sign.

-v

For each test performed on a pair of systems files, output a detailed analysis.

-u

Rather than creating a comparison matrix for each test, unify statistical test results into a single comparison matrix.

-f

E -> Percentage Word Error
R -> Percentage Words Correctly Recognized
W -> Percentage Word Accuracy

Statistically Significant Test - ScLite

Materials are from: http://www.isip.piconepress.com/projects/speech/software/tutorials/production/fundamentals/current/section_04/s04_03_p06.html

significance testing.pdf (125 KB)

signtest.pdf (208 KB)

Wednesday, March 10, 2010

chmod

File permissions

Use the chmod command to set file permissions.

The chmod command uses a three-digit code as an argument.

The three digits of the chmod code set permissions for these groups in this order:

Owner (you)
Group (a group of other users that you set up)
World (anyone else browsing around on the file system)

Each digit of this code sets permissions for one of these groups as follows. Read is 4. Write is 2. Execute is 1.

The sums of these numbers give combinations of these permissions:

0 = no permissions whatsoever; this person cannot read, write, or execute the file
1 = execute only
2 = write only
3 = write and execute (1+2)
4 = read only
5 = read and execute (4+1)
6 = read and write (4+2)
7 = read and write and execute (4+2+1)
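The arithmetic above can be checked with a small sketch that splits a three-digit code back into its permissions. The function name describe_mode is made up for the example.

```python
PERMISSION_BITS = (("read", 4), ("write", 2), ("execute", 1))
CLASSES = ("owner", "group", "world")


def describe_mode(code):
    """Explain a three-digit chmod code such as '744'."""
    if len(code) != 3 or any(digit not in "01234567" for digit in code):
        raise ValueError("expected three digits between 0 and 7")
    return {
        # A permission is granted when its value is part of the digit's sum.
        who: [name for name, bit in PERMISSION_BITS if int(digit) & bit]
        for who, digit in zip(CLASSES, code)
    }


print(describe_mode("750"))
# {'owner': ['read', 'write', 'execute'], 'group': ['read', 'execute'], 'world': []}
```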

Chmod commands on file apple.txt (use wildcards to include more files)
Command	Purpose
chmod 700 apple.txt	Only you can read, write to, or execute apple.txt
chmod 777 apple.txt	Everybody can read, write to, or execute apple.txt
chmod 744 apple.txt	You can read, write to, or execute apple.txt; everybody else can only read it
chmod 444 apple.txt	You and everyone else can only read apple.txt

Detecting File Permissions

You can use the ls command with the -l option to show the file permissions set. For example, for apple.txt, I can do this:

$ ls -l apple.txt
-rwxr--r--   1 december december       81 Feb 12 12:45 apple.txt
$

The sequence -rwxr--r-- shows the permissions set for the file apple.txt. The first - indicates that apple.txt is a regular file. The next three letters, rwx, show that the owner has read, write, and execute permissions. Then the next three symbols, r--, show that the group permissions are read only. The final three symbols, r--, show that the world permissions are read only.
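The same permission string can also be read from a script. This is a small sketch assuming a POSIX system; it creates its own temporary apple.txt, sets mode 744 on it, and prints the string that ls -l would show in the first column.

```python
import os
import stat
import tempfile


def permission_string(path):
    """Return the ls -l style permission string for a file."""
    return stat.filemode(os.stat(path).st_mode)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "apple.txt")
    with open(path, "w") as handle:
        handle.write("apple\n")
    os.chmod(path, 0o744)  # same effect as: chmod 744 apple.txt
    result = permission_string(path)

print(result)  # -rwxr--r--
```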