Sunday, August 29, 2010

5 indispensable IT skills of the future


Computerworld - In the year 2020, technical expertise will no longer be the sole province of the IT department. Employees throughout the organization will understand how to use technology to do their jobs.

Yet futurists and IT experts say that the most sought-after IT-related skills will be those that involve the ability to mine overwhelming amounts of data, protect systems from security threats, manage the risks of growing complexity in new systems, and communicate how technology can increase productivity.

1. Analyzing Data

By 2020, the amount of data generated each year will reach 35 zettabytes, or about 35 million petabytes, according to market researcher IDC. That's enough data to fill a stack of DVDs reaching from the Earth to the moon and back, according to John Gantz, chief research officer at IDC.

Demand will be high for IT workers with the ability to not only analyze dizzying amounts of data, but also work with business units to define what data is needed and where to get it.

These hybrid business-technology employees will have IT expertise and an understanding of business processes and operations. "They are people who understand what information people need" and how that information translates into profitability, says David Foote, president and CEO of IT workforce research firm Foote Partners LLC. "You'll have many more people understanding the whole data 'supply chain,' from information to money," he says.

2. Understanding Risk

Risk management skills will remain in high demand through 2020, says futurist David Pearce Snyder, especially at a time when business wrestles with growing IT complexity. Think of IT problems on the scale of BP's efforts to stop the Gulf of Mexico oil spill, or Toyota's work to correct sudden acceleration in some of its cars, Snyder says.

"When you're in the time of rapid innovation," which is happening now and will continue into 2020, he contends, "you run into the law of unintended consequences -- when you try something brand-new in a complex world, you can be certain that it's going to produce unexpected consequences." Businesses will seek out IT workers with risk management skills to predict and react to these challenges.

3. Mastering Robotics

Robots will have taken over more jobs by 2020, according to Joseph Coates, a consulting futurist in Washington. IT workers specializing in robotics will see job opportunities in all markets, he adds.

"You can think of [robots] as humanlike devices, but you have to widen that to talk about anything that is automated," Coates says. Robotics jobs will involve research, maintenance and repair. Specialists will explore uses for the technology in vertical markets. For example, some roboticists might specialize in health care, developing equipment for use in rehabilitation facilities, while others might create devices for the handicapped or learning tools for children.

4. Securing Information

Since we're spending more and more time online, verifying users' identities and protecting privacy will be big challenges by 2020, because fewer interactions will be face-to-face, more personal information may be available online, and new technologies could make it easier to impersonate people, according to a report by PricewaterhouseCoopers. Teleworkers will also represent a larger portion of the workforce, opening up a slew of corporate security risks.

"We're in a dangerous place," because many employees are tech-savvy, yet they "don't understand the first thing about data security," Foote explains. "That will change in 2020," when companies will cast an even wider net over data security -- including the data center, Internet connectivity and remote access, he predicts.

5. Running the Network

Network systems and data communications management will remain a top priority in 2020, but as companies steer away from adding to the payroll, they will turn to consultants to tell them how to be more productive and efficient, says Snyder, who follows predictions from the U.S. Bureau of Labor Statistics.

"You have already cut as many people as you can, so now you can only increase productivity," he says. "Someone has to come in here and tell me how to better use the technology that I have."

Posted via email from Troy's posterous

Friday, August 27, 2010

[Tools] matplotlib

Plotting package for python.


matplotlib is a python 2D plotting library which produces publication quality figures in a variety of hardcopy formats and interactive environments across platforms. matplotlib can be used in python scripts, the python and ipython shell (à la MATLAB® or Mathematica®), web application servers, and six graphical user interface toolkits.
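As a quick illustration (a minimal sketch; the non-interactive Agg backend and the file names are my choices, so it runs without a display), here is a script that plots a curve and saves it to both raster and vector hardcopy formats:

```python
# Minimal matplotlib sketch: plot a sine curve and save it in
# two hardcopy formats (PNG raster, PDF vector), using the
# non-interactive Agg backend so no display is required.
import math

import matplotlib
matplotlib.use("Agg")
import matplotlib.pyplot as plt

x = [i * 0.01 for i in range(629)]   # 0 .. ~2*pi
y = [math.sin(v) for v in x]

fig, ax = plt.subplots()
ax.plot(x, y, label="sin(x)")
ax.set_xlabel("x")
ax.set_ylabel("sin(x)")
ax.legend()

fig.savefig("sine.png")   # raster hardcopy
fig.savefig("sine.pdf")   # vector hardcopy
```

The same figure object can be saved to any supported format just by changing the file extension.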


Monday, August 16, 2010

[Linux] Graph format conversion commands

1. convert

This command (part of ImageMagick) is really powerful: it can handle many kinds of image formats, including various vector formats.
Use "-trim" to crop away the white margins and convert only the drawing area, for example:

convert -trim figure.eps figure.png

2. epstopdf

This command converts EPS files to PDF while keeping them in vector form, for example:

epstopdf figure.eps


Wednesday, August 11, 2010

[Statistics] Point Estimation

6103_chap01.pdf (399 KB)

From Wiki:

In statistics, point estimation involves the use of sample data to calculate a single value (known as a statistic) which is to serve as a "best guess" for an unknown (fixed or random) population parameter.
More formally, it is the application of a point estimator to the data.
In general, point estimation should be contrasted with interval estimation.
Point estimation should be contrasted with general Bayesian methods of estimation, where the goal is usually to compute (perhaps to an approximation) the posterior distributions of parameters and other quantities of interest. The contrast here is between estimating a single point (point estimation), versus estimating a weighted set of points (a probability density function). However, where appropriate, Bayesian methodology can include the calculation of point estimates, either as the expectation or median of the posterior distribution or as the mode of this distribution.
In a purely frequentist context (as opposed to Bayesian), point estimation should be contrasted with the specific interval estimation calculation of confidence intervals.

From other resources:

For a population whose distribution is known but depends on one or more unknown parameters, point estimation predicts the value of the unknown parameter and interval estimation determines the range of the unknown parameter.

In data summarization, point estimation is used to estimate the mean, variance, standard deviation, or any other statistical parameter describing the data.

In time-series prediction, point estimation is used to predict one or more values appearing later in the sequence by estimating parameters from a sample.

Methods to obtain point estimates:
1) method of moments
2) maximum likelihood estimation
3) Bayes estimators
4) expectation-maximization (EM)
5) robust estimation
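As a small worked example (pure Python; the sample and its true parameters are made up here), the method of moments estimates the mean and variance by matching sample moments to population moments. For a normal distribution these estimates coincide with the maximum likelihood estimates:

```python
# Method-of-moments point estimates for a normally distributed sample:
# match the first two sample moments to the population mean and variance.
# (For the normal distribution these equal the maximum likelihood estimates.)
import random

random.seed(0)
sample = [random.gauss(5.0, 2.0) for _ in range(10000)]  # true mean 5, variance 4

n = len(sample)
mean_hat = sum(sample) / n                                   # first moment -> mu
var_hat = sum((x - mean_hat) ** 2 for x in sample) / n       # second central moment -> sigma^2

print(mean_hat, var_hat)  # should be close to 5.0 and 4.0
```

The bias and mean squared error criteria below can then be checked by repeating this over many simulated samples.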

Criteria to assess estimators:
1) bias
2) mean squared error
3) standard error
4) efficiency 
5) consistency



Thursday, August 5, 2010

[ASR] A Study on Lattice Rescoring with Knowledge Scores for Automatic Speech Recognition

isca_rescoring.pdf (181 KB)

15 HMM-based articulatory detectors are adopted to generate log-likelihood ratio (LLR) features for a later-stage neural network (NN) that predicts phone posteriors.

Meanwhile, those LLRs are also used directly to rescore the lattices generated by the standard HMM ASR system, which has been shown to yield better performance.

The articulatory knowledge scores are generated by the HMM-based detectors, which are better than NN-based detectors.
The problem with the NN-based scores is that they tend to fluctuate.

Automatic Speech Attribute Transcription (ASAT) paradigm.

Frame-level LLRs work better than segment-level ones.

The 15 articulators adopted in this paper are: fricative, vowel, stop, nasal, approximant, low, mid, high, labial, coronal, dental, velar, retroflex, glottal, and silence.


[LVCSR] A phonetic feature based lattice rescoring approach to LVCSR

In the previous work, the authors developed a detector-based high-performance phone recognizer. Articulatory information is extracted using a bank of speech feature detectors implemented as MLPs. A final event merger, another MLP, combines the different detector outputs to predict phoneme posteriors, which are used as the HMMs' emission probabilities for decoding.

In this paper, the detector-based phoneme recognizer is extended to LVCSR. With the state-of-the-art HMM-based speech recognizer, word lattices are generated. Then the high-quality monophone posteriors generated by the detector-based recognizer are used to rescore the lattices in a second decoding stage.

Compared with standard MLE- and MMI-trained HMM systems, the rescored lattices yield lower WER on the WSJ0 corpus.

The system structure is illustrated in the figure below:

4960471.pdf (286 KB)


[NN/HMM] Introducing Phonetically Motivated, Heterogeneous Information into Automatic Speech Recognition

Phonetically motivated experts are investigated for multi-stream automatic speech recognition.

The two experts adopted are:
1) {vowel, consonant, nasal, liquid, silence}
2) {voiced, unvoiced, silence}

The basic system is an MLP that predicts phone posteriors, and two settings are used: one is full-band and the other is multi-band.

The fusion of the original model and the expert system is done by simply multiplying their posteriors together.
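A minimal sketch of that fusion rule (the posterior values here are hypothetical; the product is renormalized so the fused scores again sum to one):

```python
# Fuse base-model phone posteriors with an expert's posteriors by
# elementwise multiplication followed by renormalization.
def fuse(base, expert):
    prod = [b * e for b, e in zip(base, expert)]
    total = sum(prod)
    return [p / total for p in prod]

base = [0.5, 0.3, 0.2]    # hypothetical phone posteriors from the MLP
expert = [0.6, 0.3, 0.1]  # hypothetical expert scores for the same phones
fused = fuse(base, expert)
print(fused)
```

Classes that both streams favor are reinforced, while classes either stream considers unlikely are suppressed.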

The two experimental settings are displayed below:


Tuesday, August 3, 2010

Map of the brain's network

The scientists focused on the long-distance network of 383 brain regions and 6,602 long-distance brain connections that travel through the brain’s white matter, which are like the “interstate highways” between far-flung brain regions, he explained, while short-distance gray-matter connections (based on neurons) constitute “local roads” within a brain region and its sub-structures.


Monday, August 2, 2010

Split lossless audio (ape, flac, wv, wav) by cue file in Ubuntu


Lossless audio files can be split by cue file using “shnsplit” (part of the “shntool” package). You will also need the “cuebreakpoints” tool (part of the “cuetools” package). To install cuetools and shntool in Ubuntu/ Kubuntu, open a terminal window and enter the following:

sudo apt-get install cuetools shntool

You will also need software for your preferred lossless audio format. For Monkey’s Audio you need to install “mac” – see here for details. For FLAC and WavPack formats you need to install “flac” and “wavpack” respectively:

sudo apt-get install flac wavpack

Shnsplit requires a list of break-points with which to split an audio file. Conveniently, cuebreakpoints prints the break-points from a cue or toc file in a format that can be used by shnsplit. You can pipe the output of cuebreakpoints to shnsplit as follows:

cuebreakpoints sample.cue | shnsplit -o flac sample.flac

In this example, a flac file called “sample.flac” is split according to the break-points contained in “sample.cue” and the results are output in the flac format.

The output file format is specified via the “-o” option. If you don’t specify an output format your split files will be in shntool’s default format (i.e., wave files, “wav”).

To split a Monkey’s Audio file by cue file and output the results in the flac format:

cuebreakpoints sample.cue | shnsplit -o flac sample.ape

Note that a default prefix “split-track” is used to name the output files (split-track01, split-track02, split-track03, …). You can specify your own prefix via the “-a” option.

To see all the options for shntool split, type “shntool split -h” or “shnsplit -h”.

Transferring tags

The audio files output by shnsplit do not contain tag data. However you can use the “cuetag” script (installed as part of the cuetools package) to transfer tag data directly from a cue file to your split audio files. You specify the individual audio files corresponding to the tracks contained in your cue file as follows:

cuetag sample.cue split-track01.flac split-track02.flac split-track03.flac split-track04.flac

This will transfer the tag data contained in “sample.cue” to the flac audio tracks “split-track01.flac” “split-track02.flac” “split-track03.flac” and “split-track04.flac”.

The above command could be streamlined as:

cuetag sample.cue split-track*.flac

Cuetag works with flac, ogg and mp3 files. The cuetag script is not currently able to handle file names containing spaces.

Note: If you are running flac version 1.1.4 or higher then you may need to make some small changes to the cuetag script before it will work correctly with flac files. Open the cuetag script (for Ubuntu installations it will be located at /usr/bin/cuetag) in a text editor and make these two changes: 1) search for the text “remove-vc-all” and replace it with “remove-all-tags”; 2) search for “import-vc-from” and replace it with “import-tags-from”.
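Those two replacements can also be made non-interactively with sed. The snippet below demonstrates them on a stand-in file (the file name and its contents are made up for illustration); on a real system you would point sed at a backed-up copy of /usr/bin/cuetag instead:

```shell
# The two replacements cuetag needs, applied with sed.
# "cuetag.demo" is a stand-in for a (backed-up) copy of the cuetag script.
printf 'metaflac --remove-vc-all\nmetaflac --import-vc-from=-\n' > cuetag.demo
sed -i 's/remove-vc-all/remove-all-tags/g; s/import-vc-from/import-tags-from/g' cuetag.demo
cat cuetag.demo
```

After the edit, only the new option names remain in the script.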


Sunday, August 1, 2010

From movies

Life is complicated and we do our best!

Instead of finding the one, find the one you love and make him/her the perfect one of yours.
