Tuesday, May 7, 2013
$ sudo apt-get install tasksel
$ sudo tasksel install lamp-server
Create a mysql database
mysql> CREATE DATABASE moodle;
Create a mysql user
mysql> GRANT ALL PRIVILEGES ON moodle.* TO 'yourusername'@'localhost' IDENTIFIED BY 'yourpassword' WITH GRANT OPTION;
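The two statements above can also be run in one shell step. The sketch below just generates the SQL and shows how to pipe it to MySQL; the database name, user name and password are placeholder values, so substitute your own:

```shell
#!/bin/sh
# Generate the SQL that creates the Moodle database and its user account.
# DBNAME, DBUSER and DBPASS are example values -- replace them with your own.
DBNAME=moodle
DBUSER=yourusername
DBPASS=yourpassword

SQL="CREATE DATABASE $DBNAME DEFAULT CHARACTER SET utf8;
GRANT ALL PRIVILEGES ON $DBNAME.* TO '$DBUSER'@'localhost' IDENTIFIED BY '$DBPASS' WITH GRANT OPTION;
FLUSH PRIVILEGES;"

echo "$SQL"
# To actually run it against the server:
# echo "$SQL" | mysql -u root -p
```

FLUSH PRIVILEGES is added so the new grants take effect immediately without restarting the server.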
After installing a copy of Moodle for development, the first thing you should do is:
- Go to Site administration -> Development -> Debugging
- Set Debug messages to DEVELOPER, and Turn on Display debug messages. (Consider turning on some of the other options too.)
- In the administration block, search for "Cache" then
- Turn off Cache all language strings.
- Set Text cache lifetime to No
- Turn on Theme designer mode
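The same developer settings can alternatively be forced in config.php so that they survive a database reset. The $CFG property names below (debug, debugdisplay, themedesignermode, langstringcache) follow standard Moodle conventions, but double-check them against your Moodle version; this is only a sketch, and the config.php path is an example:

```shell
#!/bin/sh
# Append developer settings to Moodle's config.php.
# CONFIG points at an example path -- adjust it for your installation.
CONFIG=./config.php

cat >> "$CONFIG" <<'EOF'
// Developer debugging: report everything and show messages on screen.
$CFG->debug = (E_ALL | E_STRICT);
$CFG->debugdisplay = 1;
// Theme designer mode: do not cache theme files.
$CFG->themedesignermode = true;
// Do not cache language strings.
$CFG->langstringcache = 0;
EOF
```

Settings hard-coded in config.php are locked in the admin UI, which is convenient on a throwaway development install.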
Immediately after the installation, set your name and contact e-mail. The name and e-mail become part of your commits and cannot be changed later once your commits are accepted into the Moodle code. Therefore we ask contributors to use their real names with capitalized first letters, e.g. "John Smith", not "john smith" or even "john5677".
git config --global user.name "Your Name"
git config --global user.email firstname.lastname@example.org
Unless you are the repository maintainer, it is wise to set your Git to not push changes in file permissions:
git config --global core.filemode false
Then register the upstream remote:
cd moodle
git remote add upstream git://git.moodle.org/moodle.git
Then use the following commands to keep the standard Moodle branches in your Github repository synced with the upstream repository. You may wish to store them in a script so that you can run it every week after the upstream repository is updated.
#!/bin/sh
git fetch upstream
for BRANCH in MOODLE_19_STABLE MOODLE_20_STABLE MOODLE_21_STABLE MOODLE_22_STABLE MOODLE_23_STABLE MOODLE_24_STABLE master; do
    git push origin refs/remotes/upstream/$BRANCH:$BRANCH
done
Thursday, January 24, 2013
For research purposes, we usually collect clean speech and pure noise recordings and then generate noisy speech by mixing them. Sometimes filters are applied to simulate channel distortion effects. Speech at different SNRs is created and used for evaluation; the Aurora 2 dataset, for example, uses 6 SNRs from 20dB down to -5dB.
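A minimal sketch of such mixing, assuming sox is installed and clean.wav/noise.wav are hypothetical input files: the noise is scaled so that the mixture reaches a target SNR, given the RMS levels of the two signals (the RMS values below are made-up examples; in practice they can be read from `sox file.wav -n stat` output).

```shell
#!/bin/sh
# Scale the noise so that mixing with the clean signal yields the target SNR.
# CLEAN_RMS and NOISE_RMS are example values.
TARGET_SNR=10
CLEAN_RMS=0.20
NOISE_RMS=0.05

# gain = (clean_rms / noise_rms) * 10^(-SNR/20)
GAIN=$(awk -v c=$CLEAN_RMS -v n=$NOISE_RMS -v snr=$TARGET_SNR \
    'BEGIN { printf "%.4f", (c / n) * exp(-snr / 20 * log(10)) }')
echo "noise gain: $GAIN"

# Mix the scaled noise into the clean signal (hypothetical file names):
# sox -m -v 1.0 clean.wav -v "$GAIN" noise.wav noisy_snr10.wav
```

With the example values, the clean signal is 12dB above the noise before scaling, so the noise gain comes out slightly above 1 to bring the mixture down to 10dB SNR.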
To understand the difficulty of recognizing noisy speech, spectrograms at the different SNRs are plotted. Even in these relatively high-resolution spectrograms, the patterns are already quite confusing at 0dB SNR, and at -5dB it is hard to separate the speech patterns from the noise.
In speech recognition, however, the FBank features used are of much lower resolution than the spectrograms. Due to their value ranges, the patterns are relatively hard to discern. That is also why CMVN is usually applied to the features before they are fed to neural networks.
The CMVN-normalized FBank features are shown below. The dynamic parameters actually help a lot in locating the patterns.
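As an illustration, per-utterance CMVN simply subtracts the mean and divides by the standard deviation of each feature dimension across the utterance. A toy sketch on a single dimension (the feature values are made up):

```shell
#!/bin/sh
# Toy per-utterance CMVN on one feature dimension.
# The feature values are made-up examples.
FEATS="1.0 2.0 3.0 4.0 5.0"

NORMED=$(echo "$FEATS" | awk '{
    n = NF
    for (i = 1; i <= n; i++) sum += $i            # first pass: mean
    mean = sum / n
    for (i = 1; i <= n; i++) ss += ($i - mean)^2  # second pass: variance
    sd = sqrt(ss / n)
    for (i = 1; i <= n; i++) printf "%.3f ", ($i - mean) / sd
}')
echo "$NORMED"
```

After normalization each dimension has zero mean and unit variance over the utterance, which removes per-recording offsets and scaling before the features reach the network.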
From my experience the dynamic coefficients are really helpful, but I have always wondered whether that is simply because of the higher dimensionality. If we used higher-dimensional static features instead, we would have much more detailed information; would that outperform the dynamic features? However, when I try to extract the same number of FBanks, several dimensions always give 0 values. This may suggest that current feature extraction methods are limited in some respects.
First, the per-utterance normalized static parts (40D) of the above features are displayed below:
The following are illustrations of 80 static FBank features.
Although more FBanks are visually more appealing (at least to me), it is hard to say whether the ASR can benefit from them. Perhaps features learnt automatically from the waveform signals would help.
Tuesday, August 21, 2012
This article briefly summarizes the design and implementation of the Pronunciation Evaluation Web Portal for the GSoC 2012 Pronunciation Evaluation Project.
The pronunciation evaluation system consists mainly of the following components:
1) Database management module: stores, retrieves and updates all the necessary information, including user information and data such as phrases, words, correct pronunciations and assessment scores.
2) User management module: handles new user registration, information updates, password changes/resets and so on.
3) Audio recording and playback module: records the user's pronunciation for further processing.
4) Exemplar verification module: judges whether a given recording is an exemplar or not.
5) Pronunciation assessment module: provides numerical evaluation at the phoneme level (which can be aggregated into higher-level scores) in both acoustic and duration aspects.
6) Phrase library module: allows users to add new phrases to the database for evaluation.
7) Human evaluation module: supports human experts in evaluating users' pronunciations, which can be compared with the automatically generated evaluations.
The website can be tested at http://talknicer.net/~li-bo/datacollection/login.php. Do let me know (email@example.com) if you encounter any problems, as the site needs quite a lot of testing before it works robustly. The complete setup of the website can be found at http://cmusphinx.svn.sourceforge.net/viewvc/cmusphinx/branches/speecheval/troy/. More detailed functionality and implementation notes can be found in a more manual-like report:
Although this is the end of GSoC, it is just the start of our project of leveraging open source speech technologies to improve people's lives around the world. We are currently preparing to use Amazon Mechanical Turk to collect more exemplar data through our web portal, to build a rich database for improved pronunciation evaluation performance, and to make the learning much more fun through gamification.