Sunday, October 24, 2010

[DBN - 0003 ] A theoretical framework for back-propagation

The theoretical formalism described in this paper seems to be well suited to the description of many different variations of back-propagation.

From a historical point of view, back-propagation had been used in the field of optimal control long before its application to connectionist systems was proposed.

The central problem that back-propagation solves is the evaluation of the influence of a parameter on a function whose computation involves several elementary steps.
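
A minimal illustration of that statement (a toy two-step computation of my own, not the paper's notation): the influence of each parameter is obtained by propagating derivatives backwards through the elementary steps.

import numpy as np

def forward(x, w1, w2):
    # two elementary steps: a nonlinear step followed by a linear step
    h = np.tanh(w1 * x)
    y = w2 * h
    return h, y

def backward(x, w1, w2, h, dL_dy):
    # chain rule applied step by step, from the output back to each parameter
    dL_dw2 = dL_dy * h
    dL_dh = dL_dy * w2
    dL_dw1 = dL_dh * (1.0 - h ** 2) * x   # d/du tanh(u) = 1 - tanh(u)^2
    return dL_dw1, dL_dw2

# influence of w1 and w2 on the squared error 0.5*(y - d)**2
x, w1, w2, d = 0.5, 1.2, -0.7, 0.3
h, y = forward(x, w1, w2)
print(backward(x, w1, w2, h, dL_dy=y - d))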

This paper presents a mathematical framework for studying back-propagation based on the Lagrangian formalism. In this framework, inspired by optimal control theory, back-propagation is formulated as an optimization problem with non-linear constraints. 

The Lagrange function is the sum of an output objective function, which is usually the sum of squared differences between the actual and desired outputs, and a constraint term that describes the network dynamics.
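
A hedged sketch of that formulation (the notation below is mine and may differ from the paper's): write the forward pass of an N-layer network as the constraints x_k = f_k(x_{k-1}, w_k), with desired output d. The Lagrangian is then roughly

L(w, x, \lambda) \;=\; \tfrac{1}{2}\,\lVert x_N - d \rVert^2 \;+\; \sum_{k=1}^{N} \lambda_k^{\top}\big(x_k - f_k(x_{k-1}, w_k)\big).

Setting \partial L / \partial \lambda_k = 0 recovers the forward propagation, setting \partial L / \partial x_k = 0 yields the backward recursion for the multipliers \lambda_k (back-propagation itself), and \partial L / \partial w_k gives the gradient with respect to the weights.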

This approach suggests many natural extensions to the basic algorithm.

Other easily described variations involve either additional terms in the error function, additional constraints on the set of solutions, or transformations of the parameter space.
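
For example (my illustration, not taken from the paper), a weight-decay penalty is an additional term in the error function,

C'(x_N, w) \;=\; \tfrac{1}{2}\,\lVert x_N - d \rVert^2 \;+\; \tfrac{\gamma}{2}\,\lVert w \rVert^2,

which leaves the constraint term, and therefore the backward recursion, unchanged; only the gradient with respect to the weights picks up an extra \gamma w term.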


[DBN - 0002 ] Preface to the Special Issue on Connectionist Symbol Processing

0002-connectionist.pdf (300 KB)

Connectionist networks are composed of relatively simple, neuron-like processing elements that store all their long-term knowledge in the strengths of the connections between processors.

The network generally learns to use distributed representations in which each input vector is represented by activity in many different hidden units, and each hidden unit is involved in representing many different input vectors.

The ability to represent complex hierarchical structures efficiently and to apply structure-sensitive operations to these representations seems to be essential.

The outcomes of these two battles suggest that as the learning procedures become more sophisticated the advantage of automatic parameter tuning may more than outweigh the representational inadequacies of the restricted systems that admit such optimization techniques. 

Clearly, the ultimate goal is efficient learning procedures for representationally powerful systems. The disagreement is about which of these two objectives should be sacrificed in the short term. 


[DBN - 0001 ] Learning Representations by back-propagating errors

The original article on error back-propagation for neural networks, by Rumelhart, Hinton, and Williams (Nature, 1986).


Wednesday, October 13, 2010

[Matlab] How does imagesc work

The Matlab function imagesc automatically scales the data matrix so that it can be displayed as a colored figure. The actual colors depend on the colormap used.

How does the scaling work? It is just a linear scaling that maps the original data values to colormap indices.

A blog post by Steve Eddins explains this in detail: http://blogs.mathworks.com/steve/2006/02/10/all-about-pixel-colors-part-3/.

The equation used to map the data values to colormap indices is sketched below.
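
Reconstructing it from the linked post (here C is a data value, [c_min, c_max] are the color limits, and m is the number of rows in the colormap), the mapping is roughly

\mathrm{index} \;=\; \Big\lfloor \frac{C - c_{\min}}{c_{\max} - c_{\min}}\, m \Big\rfloor + 1,

with the result clipped to the range [1, m], so values at or below c_min get the first color and values at or above c_max get the last one.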


Friday, October 8, 2010

[Linux] Install PLearn on Ubuntu and Suse SLES 10sp3

PLearn is a C++ Machine Learning package.

The installation on Ubuntu is quite easy:

Install all the prerequisite and recommended packages through the system's Synaptic package manager.
Note that on your system the versions may differ; the version numbers should not be smaller than those listed on the guide page.
In my installation, the "edfblas" package was not found, so libblas was installed instead;
also, for LAPACK, lapack3 and lapack3-dev were not found, so liblapack3 and liblapack3-dev were installed instead.

After the package installation, edit the ~/.bashrc file to include the following lines:

export PLEARNDIR=${HOME}/PLearn
export PATH=$PLEARNDIR/scripts:$PLEARNDIR/commands:${PATH}
export PYTHONPATH=$PLEARNDIR/python_modules

Change these to the correct paths for your own system.

Next, just follow the instructions on that page.

To compile the executable, run pymake plearn.cc

The installation on Suse SLES 10sp3 is a little more difficult:

The major problem is installing the required packages. On the Suse system, the package names are different from those used in Ubuntu:

there is no libboost package; it is simply called boost
libnspr4 is called mozilla-nspr4
libncurses is simply called ncurses
python-numarray is not needed, but python-numpy-dev is necessary

In general, when a package is not found, try searching for the key part of its name.

One more comment on installing these packages: use YaST instead of zypper. I am not sure why, but some packages that zypper cannot find are found by YaST, even with the same installation sources.

After the package installation, everything is the same as on Ubuntu and you are nearly done.


[Deep Learning] An Empirical Evaluation of Deep Architectures on Problems with Many Factors of Variation

Deep architectures have been proposed in recent years, and experiments on simple tasks have shown much better results than existing shallow models.

The idea of deep architectures is that higher-level abstract features can be captured, and these are believed to be much more robust to variations in the original feature space.

Through the layered structure of a deep architecture, the effects of different factors of variation can be modeled at different levels.

Deep architectures are believed to work well on problems where the underlying data distribution can be thought of as a product of factor distributions, which means that each sample corresponds to a combination of particular values of these factors.
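
As a hypothetical illustration of what a combination of factor values means (the factors, function names, and parameters below are my own, not the data-generation procedure used in the paper):

import numpy as np
from scipy.ndimage import rotate

def apply_factors(digit, angle, noise_level, rng):
    # each sample is one combination of factor values: (rotation angle, background noise level)
    rotated = rotate(digit, angle, reshape=False, mode="constant")
    background = rng.uniform(0.0, noise_level, size=digit.shape)
    return np.clip(np.maximum(rotated, background), 0.0, 1.0)

rng = np.random.default_rng(0)
base = np.zeros((28, 28))
base[6:22, 12:16] = 1.0   # stand-in for a 28x28 MNIST digit
sample = apply_factors(base, angle=30.0, noise_level=0.5, rng=rng)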


In this paper, the authors experiment with many factors of variation on the MNIST digit database to compare different models, including shallow SVMs, deep belief networks, and stacked auto-associators.

(Figures from the original post are not reproduced here: the shallow structures, the DBN structure, the stacked auto-associator structure, and the results of their experiments.)

From the results, the deep architectures handle the variations well most of the time, but there are also cases where they are worse than the shallow architectures.

The deep learning algorithms also need to be adapted in order to scale to harder, potentially "real life" problems.

In the talk presented by one of the authors, Dumitru Erhan, another set of experiments was shown for comparison, using multi-layer kernel machines built by stacking up kernel PCAs (Schoelkopf et al., 1998).

They showed that the multi-layer kernel machines work pretty well.
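
A minimal sketch of that stacking idea (the component counts, kernels, and use of scikit-learn below are my own illustrative choices, not the settings from the talk):

from sklearn.datasets import load_digits
from sklearn.decomposition import KernelPCA
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# two kernel-PCA "layers" stacked, with a standard SVM classifier on top
mkm = make_pipeline(
    KernelPCA(n_components=64, kernel="rbf"),
    KernelPCA(n_components=32, kernel="rbf"),
    SVC(kernel="rbf"),
)

X, y = load_digits(return_X_y=True)
X = X / 16.0   # scale pixel values to [0, 1]
mkm.fit(X[:1200], y[:1200])
print("held-out accuracy:", mkm.score(X[1200:], y[1200:]))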

One last thing: their experiments were done using PLearn.

erhan_talk.pdf (389 KB)

