Showing posts with label Research. Show all posts

Tuesday, December 8, 2009

WSJCAMP0

Some information about the WSJCAMP0 corpus:
1. 140 speakers in total, with 110 utterances per speaker;

2. 92 training speakers, 20 development test speakers, and two sets of 14 evaluation test speakers. Each speaker provides approximately 90 utterances and an additional 18 adaptation utterances.

3. The same set of 18 adaptation sentences was recorded by each speaker, consisting of one recording of background noise, 2 phonetically balanced sentences and the first 15 adaptation sentences from the initial WSJ experiment.

4. Each training speaker read out some 90 training sentences, selected randomly in paragraph units. This is the empirically determined maximum number of sentences that could be squeezed into one hour of speaker time.

5. Each of 48 test speakers read 80 sentences. The final development test group consists of 20 speakers.

The CD-ROM publication consists of six discs, with contents organized as follows:

  • discs 1 and 2 - training data from head-mounted microphone
  • disc 3 - development test data from head-mounted microphone, plus first set of evaluation test data
  • discs 4 and 5 - training data from desk-mounted microphone
  • disc 6 - development test data from desk-mounted microphone, plus second set of evaluation test data
There are 90 utterances from each of 92 speakers that are designated as training material for speech recognition algorithms. An additional 48 speakers each read 40 sentences containing only words from a fixed 5,000 word vocabulary, and another 40 sentences using a 64,000 word vocabulary, to be used as testing material. Each of the total of 140 speakers also recorded a common set of 18 adaptation sentences. Recordings were made from two microphones: a far-field desk microphone and a head-mounted close-talking microphone.

http://ccl.pku.edu.cn/doubtfire/CorpusLinguistics/LDC_Corpus/available_corpus_from_ldc.html#wsjcam0




Tuesday, August 18, 2009

Doing what the brain does - how computers learn to listen

Max Planck scientists develop model to improve computer language recognition


 


We see, hear and feel, and make sense of countless diverse, quickly changing stimuli in our environment seemingly without effort. However, doing what our brains do with ease is often an impossible task for computers. Researchers at the Leipzig Max Planck Institute for Human Cognitive and Brain Sciences and the Wellcome Trust Centre for Neuroimaging in London have now developed a mathematical model which could significantly improve the automatic recognition and processing of spoken language. In the future, algorithms of this kind, which imitate brain mechanisms, could help machines to perceive the world around them. (PLoS Computational Biology, August 12th, 2009)

Many people will have personal experience of how difficult it is for computers to deal with spoken language. For example, people who 'communicate' with the automated telephone systems now commonly used by many organisations need a great deal of patience. If you speak just a little too quickly or slowly, if your pronunciation isn't clear, or if there is background noise, the system often fails to work properly. The reason for this is that, until now, the computer programs in use have relied on processes that are particularly sensitive to perturbations. When computers process language, they primarily attempt to recognise characteristic features in the frequencies of the voice in order to identify words.

'It is likely that the brain uses a different process', says Stefan Kiebel from the Leipzig Max Planck Institute for Human Cognitive and Brain Sciences. The researcher presumes that the analysis of temporal sequences plays an important role in this. 'Many perceptual stimuli in our environment could be described as temporal sequences.' Music and spoken language, for example, are comprised of sequences of different length which are hierarchically ordered. According to the scientist’s hypothesis, the brain classifies the various signals from the smallest, fast-changing components (e.g., single sound units like 'e' or 'u') up to big, slow-changing elements (e.g., the topic). The significance of the information at various temporal levels is probably much greater than previously thought for the processing of perceptual stimuli. 'The brain permanently searches for temporal structure in the environment in order to deduce what will happen next', the scientist explains. In this way, the brain can, for example, often predict the next sound units based on the slow-changing information. Thus, if the topic of conversation is the hot summer, 'su…' will more likely be the beginning of the word 'sun' than the word 'supper'.

To test this hypothesis, the researchers constructed a mathematical model which was designed to imitate, in a highly simplified manner, the neuronal processes which occur during the comprehension of speech. Neuronal processes were described by algorithms which processed speech at several temporal levels. The model succeeded in processing speech; it recognised individual speech sounds and syllables. In contrast to other artificial speech recognition devices, it was able to process sped-up speech sequences. Furthermore it had the brain’s ability to 'predict' the next speech sound. If a prediction turned out to be wrong because the researchers made an unfamiliar syllable out of the familiar sounds, the model was able to detect the error.

The 'language' with which the model was tested was simplified - it consisted of the four vowels a, e, i and o, which were combined to make 'syllables' consisting of four sounds. 'In the first instance we wanted to check whether our general assumption was right', Kiebel explains. With more time and effort, consonants, which are more difficult to differentiate from each other, could be included, and further hierarchical levels for words and sentences could be incorporated alongside individual sounds and syllables. Thus, the model could, in principle, be applied to natural language.

'The crucial point, from a neuroscientific perspective, is that the reactions of the model were similar to what would be observed in the human brain', Stefan Kiebel says. This indicates that the researchers’ model could represent the processes in the brain. At the same time, the model provides new approaches for practical applications in the field of artificial speech recognition.

Original work:

Stefan J. Kiebel, Katharina von Kriegstein, Jean Daunizeau, Karl J. Friston
Recognizing sequences of sequences
PLoS Computational Biology, August 12th, 2009.




Max Planck Society
for the Advancement of Science
Press and Public Relations Department

Hofgartenstrasse 8
D-80539 Munich
Germany

PO Box 10 10 62
D-80084 Munich

Phone: +49-89-2108-1276
Fax: +49-89-2108-1207

E-mail: presse@gv.mpg.de
Internet: www.mpg.de/english/

Head of scientific communications:
Dr. Christina Beck (-1275)

Press Officer / Head of corporate communications:
Dr. Felicitas von Aretin (-1227)

Executive Editor:
Barbara Abrell (-1416)


ISSN 0170-4656

 

PDF (121 KB)


Contact:

Dr Christina Schröder
Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig
Tel.: +49 (0)341 9940-132
E-mail: cschroeder@cbs.mpg.de


Dr Stefan Kiebel
Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig
Tel.: +49 (0)341 9940-2435
E-mail: kiebel@cbs.mpg.de


Saturday, August 8, 2009

Statistics

For Today's Graduate, Just One Word: Statistics
New York Times (08/06/09) Lohr, Steve; Fuller, Andrea

The statistics field's popularity is growing among graduates as they realize that it involves more than number crunching and deals with pressing real-world challenges, and Google chief economist Hal Varian predicts that "the sexy job in the next 10 years will be statisticians." The explosion of digital data has played a key role in the elevation of statisticians' stature, as computing and the Web are creating new data domains to investigate in myriad disciplines. Traditionally, social sciences tracked people's behavior by interviewing or surveying them. “But the Web provides this amazing resource for observing how millions of people interact,” says Jon Kleinberg, a computer scientist and social networking researcher at Cornell, who won the 2008 ACM-Infosys Foundation award. In research just published, Kleinberg and two colleagues tracked 1.6 million news sites and blogs during the 2008 presidential campaign, using algorithms that scanned for phrases associated with news topics like “lipstick on a pig.” The Cornell researchers found that, generally, the traditional media leads and the blogs follow, typically by 2.5 hours, though a handful of blogs were quickest to mention quotes that later gained wide attention. IDC forecasts that the digital data surge will increase by a factor of five by 2012. Meeting this challenge is the job of the newest iteration of statisticians, who use powerful computers and complex mathematical models to mine meaningful patterns and insights out of massive data sets. "The key is to let computers do what they are good at, which is trawling these massive data sets for something that is mathematically odd," says IBM researcher Daniel Gruhl. "And that makes it easier for humans to do what they are good at--explain those anomalies." The American Statistical Association estimates that the number of people attending the statistics profession's annual conference has risen from about 5,400 in recent years to some 6,400 this week.



Monday, June 8, 2009

Very Short Tutorial: How to submit jobs through SGE?


  1. Construct a short shell script that runs your program. In this example, we create submit_tut.sh with the following contents:

    #!/bin/sh
    # Print where and when the job ran, plus any arguments passed via qsub.
    date
    hostname
    uptime
    echo parameters are $*
    date

    Important Note: Please make sure that your script does not die before all your processes complete, or our cleanup script will kill your processes. You can ensure this by putting a "wait" at the end of your script if you "background" any of your processes, or (better) by not using background processes at all. A minimal sketch of such a script is shown after this list.

  2. Load the SGE settings.

    $ . /opt/sge/settings.sh

    Note the "." in front of "/opt/sge/settings.sh". You need that to source the settings file or it will not work!! Do this exactly once for every session you want to use SGE. You may also want to put it in your .profile.

  3. Submit your job:

    qsub <jobname>

    in our case:

    $ qsub submit_tut.sh

    $ qsub submit_tut.sh test

    $ qsub submit_tut.sh test test2

  4. Other SGE commands you might be interested in:

    You can check on the progress of your jobs by using the command

    $ qstat

    If you wish to abort your run, you can use the command

    $ qdel -u <userid> # or
    $ qdel <jobid>...

    Do not worry if your jobs went into the background as we have a cleanup procedure that will help you remove your processes within 5 minutes of your 'qdel'.

    SGE also does process accounting. Contents of the accounting database can be accessed using the command

    $ qacct
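
As a minimal sketch of the note in step 1 (the script name submit_bg.sh and the two long_job_*.sh commands are placeholders, not part of the cluster setup), a submit script that backgrounds its work should end with "wait":

    #!/bin/sh
    # submit_bg.sh - hypothetical example: two commands run in the background;
    # the trailing "wait" keeps the script alive until both finish, so the
    # cleanup procedure does not kill them.
    date
    ./long_job_a.sh &
    ./long_job_b.sh &
    wait        # block until all background children have exited
    date

It would be submitted in the same way as above, e.g. "qsub submit_bg.sh".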

There is also an X11-based GUI tool for all of SGE's controls.

  1. To start it, you need to enable X11 forwarding. Here's a link containing instructions on setting up X11 support on a Windows machine.

  2. To verify if X11 is enabled on your particular session, just do

    $ echo $DISPLAY

    If the $DISPLAY variable is set to something like localhost:10.0, you're all set to go.

  3. Load the SGE settings if you have not done so.

  4. $ . /opt/sge/settings.sh

  5. Start the qmon program:

    $ qmon

Details about how to use the GUI and SGE are in the SGE User Guide, available here.

 

From: https://www.comp.nus.edu.sg/cf/tembusu/sge.html

Thursday, June 4, 2009

S2S (Speech-to-Speech Translation)

Construction of robust systems for speech-to-speech translation to facilitate cross-lingual oral communication has been the dream of speech and natural language researchers for decades. It is technically extremely difficult because of the need to integrate a set of complex technologies – Automatic Speech Recognition (ASR), Natural Language Understanding (NLU), Machine Translation (MT), Natural Language Generation (NLG), and Text-to-Speech Synthesis (TTS) – that are far from mature on an individual basis, much less when cascaded together. Blindly integrating ASR, MT and TTS components does not produce acceptable results because typical machine translation technologies, primarily oriented towards well-formed written text, are not adequate for conversational speech material rife with imperfect syntax and speech recognition errors. Initial work in this area in the 1990s, for example by researchers at CMU and Japan's ATR labs, resulted in systems severely limited to a small vocabulary or otherwise constrained in the variety of expressions supported. Currently, the only commercially available speech translation technology is Phraselator, a simple unidirectional translation device customized for military use. It searches a fixed set of English sentences and plays back the corresponding voice recordings in foreign languages, and cannot handle bidirectional speech.

Resources:

IBM Lab: Speech-to-Speech Translation
http://domino.watson.ibm.com/comm/research.nsf/pages/r.uit.innovation.html

http://www.google.com/goog411/
http://googlesystem.blogspot.com/2008/10/machine-translation-and-speech.html

TC-STAR
http://www.tc-star.org/

CMU-LTI
http://www.lti.cs.cmu.edu/Research/cmt-projects.html

http://domino.watson.ibm.com/comm/research.nsf/pages/r.uit.innovation.html/$FILE/speech_to_speech.mpg

Books:

Incremental speech translation

http://books.google.com.sg/books?id=QEr6dTamixQC&printsec=frontcover&dq=speech+translation&ei=QbonSuPPNY2GkQTb3KjaCg#PPA1,M1

Verbmobil: Foundations of Speech-to-Speech Translation

 By Wolfgang Wahlster
http://books.google.com.sg/books?id=RiT0aAzeudkC&printsec=frontcover

Speech-to-speech translation

http://books.google.com.sg/books?id=T0diAAAAMAAJ&q=speech+translation&dq=speech+translation&ei=QbonSuPPNY2GkQTb3KjaCg&pgis=1

Machine Translation

 By Conrad Sabourin, Laurent Bourbeau
http://books.google.com.sg/books?id=IsqLGQAACAAJ&dq=speech+translation&ei=QbonSuPPNY2GkQTb3KjaCg

KI 2006

 By Christian Freksa, Michael Kohlhase, Kerstin Schill
One of the main lessons learned from all the research during the past three decades is that the problems of natural language understanding can only be cracked by the combined muscle of deep and shallow processing approaches. This means that corpus-based and probabilistic methods must be integrated with logic-based and linguistically inspired approaches to achieve true progress on this AI-complete problem.


Thursday, May 28, 2009

QuickNet II

1. Install dpwelib. Get it from http://www.icsi.berkeley.edu/~dpwe/projects/sprach/sprachcore.html (http://www.icsi.berkeley.edu/%7Edpwe/projects/sprach/dpwelib-2009-02-24.tar.gz).

Just type "./configure",
then "make",
and finally "sudo make install".
That's it.
You will find the tools in "/usr/local/bin" and "/usr/local/lib".

2. Install feacat. Get it from http://www.icsi.berkeley.edu/~dpwe/projects/sprach/sprachcore.html (ftp://ftp.icsi.berkeley.edu/pub/real/davidj/feacat.tar.gz).

Make sure dpwelib and QuickNet are installed.
Then type "./configure --with-dpwelib=/usr/local/lib --with-quicknet=/usr/local/lib",
next "make",
and finally "make install".
That's it.
You will find the feacat tool in "/usr/local/bin".
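
Putting the two installs together, here is a condensed sketch. It assumes the tarballs linked above have already been downloaded into the current directory; the extracted folder names below are a guess and may differ.

    # dpwelib
    tar xzf dpwelib-2009-02-24.tar.gz
    cd dpwelib-2009-02-24
    ./configure && make && sudo make install      # tools go to /usr/local/bin and /usr/local/lib
    cd ..
    # feacat (requires dpwelib and QuickNet to be installed first)
    tar xzf feacat.tar.gz
    cd feacat
    ./configure --with-dpwelib=/usr/local/lib --with-quicknet=/usr/local/lib
    make && sudo make install                     # installs feacat into /usr/local/bin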

Monday, May 25, 2009

Using QuickNet with tools outside SPRACHcore

The SPRACHcore package has not been updated since 2004. We have switched to releasing new versions of individual components of SPRACHcore. We recommend that you use the new versions. The most recent version of the core neural net code (QuickNet) has many speed improvements and other enhancements, and it is available here. The most recent version (compatible with gcc4) of the dpwelib sound utilities is dpwelib-2009-02-24.tar.gz. The most recent versions of feacat, feacalc, noway, and pfile_utils (a set of 18 feature file tools including pfile_klt) are here.

Do not build the whole SPRACHcore again!



QuickNet - 1

Platform
--------

This testbed has only been tested on an x86 Red Hat Linux platform.

NOTE: Quicknet includes versions of the qnstrn MLP training tool that
are optimized for specific architectures such as the Pentium 4 or the
AMD Opteron.  Because these different versions of the tool use
different matrix routines, there may be different patterns of rounding
errors.  Thus if you want maximum scientific comparability of
different results, don't mix different versions of the qnstrn tool in
your experiments. 
(This is not expected to be a significant issue for
the forward pass tool, qnsfwd, because only training involves a
feedback process which can magnify the significance of different
rounding error patterns.)  For maximum comparability with the results
quoted in this README file, the tools/train script specifically
invokes the Pentium 4 qnstrn binary from Quicknet release 3.11, which
is named qnstrn-v3_11-P4SSE2 at ICSI.  (This binary cannot be used with older
processors which predate the Pentium 4, but it will run on an AMD
Opteron.)  To change this, edit the variable $qnstrnBinary in
tools/train.

tools/train invokes single-threaded MLP training.  On a multi-core
machine, training can be sped up by making it multi-threaded using the mlp3_threads option of qnstrn. 
The most convenient way to measure
training speed is by the MCUPS number reported in the qnstrn log
file. If you use more than one thread, you will probably get more
MCUPS if you increase the value of the mlp3_bunch_size option.
However, if it is increased too much, this can reduce the quality of
the trained MLP.  The maximum bunch size before this problem occurs
depends on the corpus.
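
For example, one simple way to compare the training speed of different runs is to pull the MCUPS figures out of their qnstrn log files (the log file names below are hypothetical):

    for log in train-1thread.log train-4threads.log; do
        echo "== $log =="
        grep -i mcups "$log"     # qnstrn reports training speed in MCUPS
    done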

Feature calculation
-------------------

The neural net software uses the pfile feature file format, which
stores the features for many utterances together in a single file.
The SPRACHcore feacalc tool can calculate a pfile of PLP features
directly.  Pfiles can be created from other feature file formats using
the SPRACHcore feacat tool.




Sources for getting started with QuickNet for MLP training


http://www.icsi.berkeley.edu/Speech/papers/gelbart-ms/hybrid-testbed/

http://www.icsi.berkeley.edu/~dpwe/projects/sprach/sprachcore.html

Noisy Numbers data and Numbers speech recognizer


Noisy ISOLET and ISOLET testbeds


The SPRACHcore software package


ICSI Speech FAQ


Tembusu

1. Connect to the cluster:
ssh tembusu2.comp.nus.edu.sg -l [username]
Then you will be asked for the password. The username and password are the same as your mysoc account.

Official instructions are:
from: https://www.comp.nus.edu.sg/cf/tembusu/login.html
Introduction

Tembusu2 accounts (aka home directories) are SEPARATE and DELINKED from the standard SoC Unix account.

Instructions for Login to Linux-based nodes

You must first have created an account on tembusu2.

To log in to any "access" node on the cluster, use

ssh -t -l username tembusu2

You will be directed to an available "access" node in a round-robin fashion.

To log in to a particular "access" node, e.g. access5 or access10, use the following:

ssh -t -p 2005 -l username tembusu2

ssh -t -p 2010 -l username tembusu2

To access any "access" node from within the cluster, use:

ssh nodename

e.g.:

ssh access0

ssh access12

Tembusu2's "comp" nodes are meant for batch jobs and are therefore not for login.

Instructions for Login to Solaris-based nodes (Obsolete)

You must first have created an account on tembusu2.

Currently, there is only one UltraSPARC-based Solaris "access" node on the cluster. It can be accessed using the hostname saga.

ssh -t -l username saga

X11 connections

All the cluster access nodes are configured to support SSH X11 tunneling. This provides a secure way to forward any GUI application to your desktop. You just need to enable X11 support on your desktop to use this feature when you connect to the cluster.

Do note that even though an application forwards its GUI to your desktop, it is really running on the access node you are connected to. As such, it can only access the file system in Tembusu and has no access to any files on your desktop.

See here for instructions to enable X11 tunneling on your ssh client.




2. Check the available disk space
df
or
df -h /home/l/[username]

More information about this cluster https://www.comp.nus.edu.sg/cf/tembusu/index.html



Saturday, May 23, 2009

QUICKNET installation on Linux

1. Download the QuickNet on http://www.icsi.berkeley.edu/Speech/qn.html . ( ftp://ftp.icsi.berkeley.edu/pub/real/davidj/quicknet.tar.gz )

2. Download the ATLAS BLAS libraries from http://math-atlas.sourceforge.net/ . ( http://sourceforge.net/project/showfiles.php?group_id=23725 )

3. Download the rtst library for testing from http://www.icsi.berkeley.edu/Speech/qn.html . ( ftp://ftp.icsi.berkeley.edu/pub/real/davidj/rtst.tar.gz )

4. Download the example data files from http://www.icsi.berkeley.edu/Speech/qn.html . ( ftp://ftp.icsi.berkeley.edu/pub/real/davidj/quicknet_testdata.tar.gz )

5. Install ATLAS BLAS libraries.
5.1 Turn off CPU throttling.
Use the command: sudo /usr/bin/cpufreq-selector -g performance
Before this command, we can run "cat /proc/cpuinfo" to see the current CPU information. Note the "cpu MHz" value. Also, if more than one processor is listed, run the cpufreq-selector command for each processor, e.g. "sudo /usr/bin/cpufreq-selector -c 1 -g performance" to turn off throttling for the second processor; the first processor is denoted processor 0 and is the default when "-c" is not specified.
After issuing the command to turn off CPU throttling, run "cat /proc/cpuinfo" again to list the CPU information, and you will see that the "cpu MHz" value has increased.
5.2 After all the processors are set to performance mode, extract the source from the downloaded tarball and create a build folder under the ATLAS folder, for example named "build".
Then first "cd" to the ATLAS folder.
cd build
../configure -b 64 -D c -DPentiumCPS=2400

"-b 64" set the target to be 64 bit, which is recommanded if no other reasons.
"-D c -DPentiumCPS=2400" tells the configuration the CPU is Cour2Core 2.4GHz.
After I type the above command, it prompts errors saying that no Fortran compiler is found. Then,
sudo apt-get install gfortran
sudo apt-get install fort77

Next, rerun the command "../configure -b 64 -D c -DPentiumCPS=2400".
This time the last line printed is "DONE configure". Yes!
5.3 Type the command: make build
After several steps of the build, I got the following errors:
ATL_dset_xp1yp0aXbX.c: Assembler messages:
ATL_dset_xp1yp0aXbX.c:96: Error: bad register name `%rsp)'
ATL_dset_xp1yp0aXbX.c:97: Error: bad register name `%rsp)'
ATL_dset_xp1yp0aXbX.c:101: Error: bad register name `%rsi)'
ATL_dset_xp1yp0aXbX.c:102: Error: bad register name `%rsi'
make[6]: *** [ATL_dset_xp1yp0aXbX.o] Error 1
make[6]: Leaving directory `/home/troy/Software/quicknet/ATLAS/build/src/blas/level1'
make[5]: *** [dgen] Error 2
make[5]: Leaving directory `/home/troy/Software/quicknet/ATLAS/build/src/blas/level1'
make[4]: *** [dlib] Error 2
make[4]: Leaving directory `/home/troy/Software/quicknet/ATLAS/build/src/blas/level1'
make[3]: *** [lib.grd] Error 2
make[3]: Leaving directory `/home/troy/Software/quicknet/ATLAS/build/src/auxil'
make[2]: *** [IStage1] Error 2
make[2]: Leaving directory `/home/troy/Software/quicknet/ATLAS/build/bin'
ERROR 437 DURING CACHESIZE SEARCH!!.  CHECK INSTALL_LOG/Stage1.log FOR DETAILS.
make[2]: Entering directory `/home/troy/Software/quicknet/ATLAS/build/bin'
cd /home/troy/Software/quicknet/ATLAS/build ; make error_report
make[3]: Entering directory `/home/troy/Software/quicknet/ATLAS/build'
make -f Make.top error_report
make[4]: Entering directory `/home/troy/Software/quicknet/ATLAS/build'
uname -a 2>&1 >> bin/INSTALL_LOG/ERROR.LOG
gcc -v 2>&1  >> bin/INSTALL_LOG/ERROR.LOG
Using built-in specs.
Target: i486-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 4.3.3-5ubuntu4' --with-bugurl=file:///usr/share/doc/gcc-4.3/README.Bugs --enable-languages=c,c++,fortran,objc,obj-c++ --prefix=/usr --enable-shared --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --enable-nls --with-gxx-include-dir=/usr/include/c++/4.3 --program-suffix=-4.3 --enable-clocale=gnu --enable-libstdcxx-debug --enable-objc-gc --enable-mpfr --enable-targets=all --with-tune=generic --enable-checking=release --build=i486-linux-gnu --host=i486-linux-gnu --target=i486-linux-gnu
Thread model: posix
gcc version 4.3.3 (Ubuntu 4.3.3-5ubuntu4)
gcc -V 2>&1  >> bin/INSTALL_LOG/ERROR.LOG
gcc.real: '-V' option must have argument
make[4]: [error_report] Error 1 (ignored)
gcc --version 2>&1  >> bin/INSTALL_LOG/ERROR.LOG
tar cf error_Core264SSE3.tar Make.inc bin/INSTALL_LOG/*
gzip --best error_Core264SSE3.tar
mv error_Core264SSE3.tar.gz error_Core264SSE3.tgz
make[4]: Leaving directory `/home/troy/Software/quicknet/ATLAS/build'
make[3]: Leaving directory `/home/troy/Software/quicknet/ATLAS/build'
make[2]: Leaving directory `/home/troy/Software/quicknet/ATLAS/build/bin'
Error report error_.tgz has been created in your top-level ATLAS
directory.  Be sure to include this file in any help request.
cat: ../../CONFIG/error.txt: No such file or directory
cat: ../../CONFIG/error.txt: No such file or directory
make[1]: *** [build] Error 255
make[1]: Leaving directory `/home/troy/Software/quicknet/ATLAS/build'
make: *** [build] Error 2

Having no idea how to proceed...
Eventually I found that some people install the prebuilt ATLAS library instead.
Using the Synaptic Package Manager, search for atlas and select "libatlas-sse2-dev" to install. Two dependencies, "libatlas-headers" and "libatlas3gf-sse2", are installed along with it.
Running sudo apt-get install libatlas-sse2-dev on the command line should also work.
From: http://seehuhn.de/pages/linear

However, after installing that package, I could not find the ATLAS include and lib files needed for QuickNet's installation. So I removed those packages and retried the installation from source.

Searching the web, I found that the error was caused by the 32-bit/64-bit settings. As an experiment, I configured the build not to use the default system architecture detection.
../configure -b 64 -D c -DPentiumCPS=2400 -v 2 -Si archdef 0
make build

With these two commands it worked. What's more, after the make build command it automatically ran make check, make tune and make install, and finally even make clean.

In my opinion, changing -b 64 to -b 32 in the configuration might also work, or perhaps just adding the last parameter, "-Si archdef 0", is enough.

Finally, it was correctly installed after nearly 3 hours!

However, one thing to note is that when typing the make install command, be sure you have root privileges, or the operation will be denied. So use "sudo make install" for this command.
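
For reference, here is a condensed sketch of the sequence that finally worked for ATLAS on this machine (Ubuntu, dual-core 2.4 GHz CPU). The PentiumCPS value and the cpufreq-selector call are specific to this setup and should be adjusted for other hardware.

    sudo /usr/bin/cpufreq-selector -g performance    # turn off CPU throttling (repeat with -c 1 for the second core)
    sudo apt-get install gfortran fort77             # Fortran compilers required by the ATLAS build
    cd ATLAS
    mkdir build && cd build
    ../configure -b 64 -D c -DPentiumCPS=2400 -v 2 -Si archdef 0
    make build                                       # on this setup this also ran check, tune and install
    sudo make install                                # run with root privileges, or it will be denied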


6. Install QuickNet.
Extract QuickNet source files and the test data files.
"cd" to into the source folder.
mk build
cd build
../configure --with-blas=atlas --with-testdata=/home/troy/Software/quicknet/quicknet_testdata CPPFLAGS="-I/usr/local/atlas/include" LDFLAGS="-L/usr/local/atlas/lib"
make
sudo make install

In this step, errors occur at the make step.
In file included from ../QN_utils.cc:53:
../QN_fir.h:116: error: extra qualification ‘QN_InFtrStream_FIR::’ on member ‘FillDeltaFilt’
../QN_fir.h:118: error: extra qualification ‘QN_InFtrStream_FIR::’ on member ‘FillDoubleDeltaFilt’
../QN_utils.cc: In function ‘void QN_output_sysinfo(const char*)’:
../QN_utils.cc:87: error: ‘ATL_CCVERS’ was not declared in this scope
../QN_utils.cc:88: error: ‘ATL_CCFLAGS’ was not declared in this scope

What I did was simply open the "QN_fir.h" file and remove 'QN_InFtrStream_FIR::', since in the declaration of a class member function there is no need to qualify it with the class name.
For the second error, I just commented out those two lines, as they only print some information.

Also, if the option "--with-testdata" is specified, after the install we can use the scripts built from the source, named "testdata_*.sh", to test QuickNet.
For example, run command:
cd build
./testdata_qnstrn.sh


Finally, it is installed on my machine. Next thing is to understand how it works, so that I can use it for my experiments. Good luck to myself!

Thursday, May 21, 2009

Wu Jun's Beauty of Mathematics

Beauty of Mathematics

http://jun.wu.googlepages.com/beautyofmathematics

数学之美

(Written in Chinese)

I am writing a series of essays introducing the applications of mathematics in natural language processing, speech recognition, web search, etc., for non-technical readers. Here are the links:

0. Page Rank ( 网页排名算法 )

1. Language Models (统计语言模型)

2. Chinese word segmentation  (谈谈中文分词)

3. Hidden Markov Model and its application in natural language processing (隐含马尔可夫模型)

4. Entropy - the measurement of information (怎样度量信息?)

5. Boolean algebra and search engine index (简单之美:布尔代数和搜索引擎的索引)

6. Graph theory and web crawler (图论和网络爬虫 Web Crawlers)

7. Information theory and its applications in NLP  (信息论在信息处理中的应用)

8. Fred Jelinek and modern speech and language processing (贾里尼克的故事和现代语言处理)

9. how to measure the similarity between queries and web pages.  (如何确定网页和查询的相关性)

10. Finite state machine and local search (有限状态机和地址识别)

11. Amit Singhal: AK-47 Maker in Google (Google 阿卡 47 的制造者阿米特.辛格博士)

12. The Law of Cosines and news classification (余弦定理和新闻的分类)

13.  Fingerprint of information and its applications (信息指纹及其应用)

14. The importance of precise mathematical modeling (谈谈数学模型的重要性)

15. Complexity and simplicity - a few masters of natural language processing (繁与简 自然语言处理的几位精英)

16. Don't put all of your eggs in one basket - Maximum Entropy Principles, part A (不要把所有的鸡蛋放在一个篮子里 -- 谈谈最大熵模型(A))

17. Don't put all of your eggs in one basket - Maximum Entropy Principles, part B (不要把所有的鸡蛋放在一个篮子里 -- 谈谈最大熵模型(B))

18. Not everything that glitters is gold - on search engine spam (闪光的不一定是金子 谈谈搜索引擎作弊问题, Search Engine Anti-SPAM)

19. Matrix operations and text classification (矩阵运算和文本处理中的分类问题)

20. The Godfather of NLP - Mitch Marcus (自然语言处理的教父 马库斯)

21. The extension of HMM - Bayesian Networks (马尔可夫链的扩展 贝叶斯网络)

22. The principles of cryptography - thoughts prompted by the TV series "An Suan" (由电视剧《暗算》所想到的 — 谈谈密码学的数学原理)

23. How many keystrokes does it take to type a Chinese character? - on Shannon's First Theorem (输入一个汉字需要敲多少个键 — 谈谈香农第一定律)

Jun Wu's homepage (Chinese)

Jun Wu's homepage (English)

Labs

Electronic Visualization Laboratory (EVL) at the University of Illinois at Chicago http://www.evl.uic.edu/index2.php

Sunday, May 17, 2009

Hit rate and False alarm rate

 


From: http://www.ecmwf.int/products/forecasts/guide/Hit_rate_and_False_alarm_rate.html


Verification measures like the RMSE and the ACC value equally the case of an event being forecast but not observed and that of an event being observed but not forecast. But in real life, the failure to forecast a storm that occurred will normally have more dramatic consequences than forecasting a storm that did not occur. To assess forecast skill under these conditions, another type of verification must be used.



For any threshold (like frost/no frost, rain/dry or gale/no gale) the forecast is simplified to a yes/no statement (categorical forecast). The observation itself is put in one of two categories (event observed/not observed). Let H denote "hits", i.e. all correct yes-forecasts: the event is predicted to occur and it does occur; F denote false alarms, i.e. all incorrect yes-forecasts; M denote missed forecasts, i.e. all incorrect no-forecasts, where the event was predicted not to occur but did occur; and Z denote all correct no-forecasts. Assume altogether N forecasts of this type, with H+F+M+Z=N. A perfect forecast sample is one in which F and M are zero. A large number of verification scores are computed from these four values.





A forecast/verification table:

    forecast \ obs    observed    not observed
    forecast              H            F
    not forecast          M            Z

The frequency bias BIAS = (H+F)/(H+M) is the ratio of the yes-forecast frequency to the yes-observation frequency.

The proportion of correct PC = (H+Z)/N gives the fraction of all forecasts that were correct. It is usually very misleading because it credits correct "yes" and "no" forecasts equally and is strongly influenced by the more common category (typically the "no" event).

The probability of detection POD = H/(H+M), also known as the Hit Rate (HR), measures the fraction of observed events that were correctly forecast.

The false alarm ratio FAR = F/(H+F) gives the fraction of forecast events that turned out not to be observed.

The probability of false detection POFD = F/(Z+F), also known as the false alarm rate, measures the fraction of false alarms among the cases where the event did not occur. POFD is generally associated with the evaluation of probabilistic forecasts, where it is combined with POD in the Relative Operating Characteristic (ROC) diagram.

A very simple measure of success for categorical forecasts is the difference POD - POFD, known as the Hanssen-Kuipers score or True Skill Score. Among other properties, it can easily be generalised to the verification of probabilistic forecasts (see section 7.4 of the original guide).
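
As a worked example of these definitions, the shell snippet below computes the scores from a hypothetical contingency table; the counts H, F, M and Z are made up purely for illustration.

    H=42; F=18; M=8; Z=132    # hits, false alarms, misses, correct no-forecasts
    awk -v H=$H -v F=$F -v M=$M -v Z=$Z 'BEGIN {
        N = H + F + M + Z
        printf "BIAS = %.2f\n", (H + F) / (H + M)             # frequency bias
        printf "PC   = %.2f\n", (H + Z) / N                   # proportion correct
        printf "POD  = %.2f\n", H / (H + M)                   # probability of detection (hit rate)
        printf "FAR  = %.2f\n", F / (H + F)                   # false alarm ratio
        printf "POFD = %.2f\n", F / (Z + F)                   # probability of false detection
        printf "KSS  = %.2f\n", H / (H + M) - F / (Z + F)     # Hanssen-Kuipers / True Skill Score
    }'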

Saturday, May 16, 2009

JabRef

JabRef reference manager: http://jabref.sourceforge.net/

 


Contents

  1. Introduction
  2. Screenshot
  3. Installation
  4. Adding & Editing References
  5. Hints and Tips
  6. Further Reading

Introduction

JabRef is an open source graphical bibliography reference manager. It uses BibTeX as its native file format, which makes it ideal for modifying your LaTeX bibliographies.

Screenshot

[Screenshot: scrn-JabRef.png]

Installation

JabRef is in the Universe repository for Ubuntu Hardy, Intrepid and Jaunty, and can be installed using the package manager: System → Administration → Synaptic Package Manager.

You may need to enable the Universe repository; see the Repositories/Ubuntu page for instructions on how to do this.

JabRef can be started from Application → Office → JabRef.

Adding & Editing References

Before adding references, create a new database using the File → New Database menu option, or open an existing BibTeX file using File → Open Database.

JabRef provides a number of ways to add references:

  1. BibTeX → Add New Entry menu option will allow a new reference type to be selected and added.

  2. BibTeX → Add new entry from plain text menu option will open a window into which a plain text BibTeX reference can be inserted.

  3. An existing BibTeX file can be added into an existing or new database using the File → Import into New database or File → Import into current database menu options.

References can be edited by:

  1. Selecting BibTeX → Edit entry.

  2. Selecting an entry and pressing Ctrl + E.
  3. Double clicking on an entry.

Hints and Tips

A file can be linked to a reference using the file field in the General tab when editing an entry. A linked file can be opened from within JabRef by right clicking the file field and selecting Open or selecting the entry and pressing F4.

A reference key can be copied by selecting a reference and pressing Ctrl + K.

If you use Lyx or Kile to edit LaTeX documents you can transfer a reference to these editors by pressing the button towards the right of the toolbar with the Lyx logo.

Further Reading

Finding the right bibliographic/reference tool

From: http://www.fauskes.net/nb/bibtools/

Handling references and bibliographic information is an essential part of all research. Summarized, this task can be divided into three parts:

  1. Finding
  2. Organizing
  3. Displaying/formatting

As a PhD student, I need tools to help me with these tasks. In this article I will share with you some thoughts and experiences I have had during my search for the right bibliographic tool.

Update: Added Referencer to the list of tools.

Update: Added RRiki to the list of tools.

Update: Added KBibTeX to the list of tools.

What I am looking for

Finding references has become quite easy. Services like Engineering Village, IEEE Xplore and many others make it a breeze to search huge online bibliographic databases. Thousands of reference records and full-text articles are available for download, making it easy to have your own personal library stored on your hard disk. A good bibliographic tool should make it easy to import and organize downloaded references and documents.

When it comes to formatting bibliographic information in articles and reports, there is for me only one option: BibTeX and LaTeX. BibTeX uses style files to produce bibliographies, and handles all the formatting. Some bibliographic tools come with hundreds of output styles for different publications. For me this is not important, since I let BibTeX do all the work.

Many people use MS Word, even though Word is not well suited for writing scientific papers and technical reports. Most of the commercial bibliographic software can easily be integrated with MS Word. However, for me it is more important to have BibTeX support than a nifty Word plug-in.

So what I am looking for in a bibliographic tool is:

  • Easy to import references from online sources.
  • Must support common bibliographic formats, especially BibTeX.
  • It should be possible to organize references in categories, by authors, by keywords etc.
  • Nice and functional GUI
  • Non-hassle interface with BibTeX and LaTeX.
  • Easy access to electronic documents and URLs.
  • MS Windows compatible. Cross platform is a plus
  • Should not cost me a lot of money :-)

The tools

A quick search on the Internet reveals that there are many tools to choose from. Fortunately, there are a few web pages that link to and describe most of them. Two such pages are:

The latter covers only free and open source tools.

Commercial software

The most popular commercial tools seem to be:

All three are from Thomson ResearchSoft. They have a long history and are recommended by many universities. Their user interfaces seem unchanged since the Windows 3.1 era and would have benefited from a face lift. They are all quite similar in functionality and have tons of import/export filters and output styles. ProCite also has some nice functionality for organizing references in groups.

Some tools with more modern user interfaces are:

The first three programs on the list have much of the same functionality as the software from Thomson. Biblioscape and Bibliographix have built in word processors that let you write notes and drafts. They also offer functionality for organizing references in folders. BibTeXMng is a shareware program for manipulating BibTeX files.

If you do not need software that runs on your personal computer, there exist tools that only need a browser and a connection to the Internet. Two such tools are:

Free and open source software

I have always been attracted to free and open source software. I have no principles against commercial software, but when I compare free and commercial software in cost-benefit terms, free and open source software tends to win.

Among the free software are most of the tools that use the BibTeX format as their native database or output format:

BibEdit
A small and simple program for creating and editing BibTeX files. No import or export functionality. Windows only.
BibDB
A BibTeX database manager, bundled with Scientific Workplace. Has an ancient looking GUI, but has a decent set of features.
JabRef
A Java based graphical front end to manage BibTeX databases. Nice GUI, can easily import from online sources and is actively developed.
Pybliographer
A framework for managing bibliographic databases. Written in Python. Provides a scripting environment and a Gnome GUI. Can also be used with formats other than BibTeX.
BibDesk
Graphical BibTeX-bibliography manager for Mac OS X.
KBibTeX
A graphical BibTeX editor for KDE.
Referencer
A Gnome application for organizing documents and references. Supports automatic metadata retrieval if a DOI code or arXiv ID is available.

My favourite is currently JabRef. It is an excellent program. The group functionality in JabRef has improved a lot in the latest versions and is now quite powerful. The import-filters are not perfect, but JabRef is constantly evolving and they will probably be improved in future versions.

I am attracted to Pybliographer since it is written in Python, my favourite programming language. Sadly the GUI only runs on Linux, and it does not have all the features of JabRef.

All of the tools above are basically front ends for plain BibTeX files. I find this very convenient and it's a good solution for single users. However, if a bibliographic database is to be used and edited by multiple users, it may be a problem to keep everything in a single BibTeX file. A solution to this problem may be to have an online bibliographic database with a web front end. Some such tools are:

WIKINDX
A web based bibliographic and quotations/notes management and article authoring system designed either for single use or multi-user collaborative use across the Internet.
refbase
A web-based, platform-independent, multi-user interface for managing scientific literature and citations
Aigaion
A web-based platform for managing annotated bibliographies. It allows the user(s) to order publications in a self-chosen topic structure

All of the tools run on the Apache web server with PHP and MySQL. They have good support for BibTeX, but they also support other bibliographic formats compatible with Endnote, Refman, Procite etc. The main strength of a web based tool is that multiple users can use and maintain a common bibliographic database. Personally, I find web applications a bit awkward to use compared to desktop applications. However, with technologies like AJAX, web applications will become more attractive.

RRiki is an interesting mix between a desktop application and a web application:

RRiki
A tool for storing and organizing information on citations for sources (articles,books etc), notes, figures and dossiers on researchers. RRiki uses the Ruby-On-Rails framework to display and store information on a MySQL database via a web-browser.

For information about other free tools, take a look at the Open standards and software for bibliographies and cataloging page.

Concluding remarks

I am currently using JabRef for organizing my bibliographic information. It satisfies many of my needs. Some of the commercial alternatives may have more features, but most of them are too MS Word oriented and lack proper BibTeX support. I am also reluctant to pay for software when the free alternatives are so good.

The web based tools are very interesting alternatives, but with my current work flow I prefer to work directly with the BibTeX files. However, I see the usefulness of having an online bibliographic database and I will probably create one in the near future in addition to my personal BibTeX database.

I still haven't found the perfect bibliographic tool. However, by choosing open source tools I can contribute to and influence the development. My first contribution is BibConverter, a simple web application that converts citations from IEEE Xplore, Engineering Village and ISI Web of Science to the BibTeX format. The tool outputs BibTeX records that are more accurate and contain more information than those produced by the export functionality of IEEE Xplore and EV2. Read more about BibConverter in a separate notebook entry.

Thursday, May 14, 2009

Online System Rates Images by Aesthetic Quality

Online System Rates Images by Aesthetic Quality
Penn State Live (05/05/09) Spinelle, Jenna; Messer, Andrea

Pennsylvania State University (PSU) has launched the Aesthetic Quality Inference Engine (ACQUINE), an online system for determining the aesthetic quality of an image. The online photo-rating system helps establish the foundation for determining how people will react emotionally to a visual image. ACQUINE delivers ratings--from zero to 100--within seconds, based on visual aspects such as color saturation, color distribution, and photo composition. PSU researchers hope to improve upon the system's current performance level of more than 80 percent consistency between human and computer ratings. "Furthermore, aesthetics represents just one dimension of human emotion," says PSU professor James Z. Wang. "Future systems will perhaps strive to capture other emotions that pictures arouse in people." Wang says that linking cameras to ACQUINE could potentially enable a photographer to instantly see how the public might perceive a photo.



Wednesday, April 15, 2009

How to understand an algorithm

It should be mentioned immediately that the reader should not expect to read an algorithm as if it were part of a novel; such an attempt would make it pretty difficult to understand what is going on. An algorithm must be seen to be believed, and the best way to learn what an algorithm is all about is to try it. The reader should always take pencil and paper and work through an example of each algorithm immediately upon encountering it in the text. This is a simple and painless way to gain an understanding of a given algorithm, and all other approaches are generally unsuccessful.

 

From: The Art of Computer Programming, Vol. 1
