
ACAT'2002 24-28 June, Moscow, RUSSIA
[pic] Joint Institute for Nuclear Research
141980 Dubna, Moscow region, RUSSIA

Laboratory of Information Technologies

Effective training algorithms
for RBF-networks

Gennadi A. OSOSKOV, Alexey STADNIK

e-mail: ososkov@jinr.ru http://www.jinr.ru/~ososkov


Outline
1. Introduction, image recognition problem
2. Problem formulation for a security system
3. Multilayer perceptron, arising hindrances
4. Proposed RBF-network design and image preprocessing
Training algorithm for RBF-nets with Mahalanobis
distance. Some examples.
5. The first security application
6. How to decrease neural net dimensionality for image handling
. Principal component method
. Wavelets for image scaling
7. Applications and results
8. Conclusion

2. Problem formulation for a security system

The neural network is considered as a means for fast and reliable
recognition of any member of a substantial group of human faces.
Reliability requirements:
. probability of identification error = 0.01;
. probability of misidentification = 0.005.
Only frontal views of face images are considered further; they are
digitized (by a video camera, for instance) and stored as a 2D raster. In
the majority of cases a raster of not less than 80x100 pixels with 8-bit
grey levels is sufficient to distinguish the individual features of a person.
To obtain a reliable level of recognizability, the MLP must first be
trained on a sample of digitized face images of all persons to be
recognized.
After training, the NN must adequately recognize any of the faces from
the sample and unambiguously flag any "stranger" face. The network must
function in real circumstances, when a person can slightly vary his pose,
have minor changes of facial expression, hair style or make-up, be
unshaven, etc.
Such reliability and robustness requirements can be met by including in
the training sample more than one face image (up to 10) of the same person.

3. Multilayer perceptron
[pic]
Arising hindrances:
. "Damnation of dimension" leading to very long back-propagation training;
. Arbitrariness in choosing of the hidden layer neurons. The MLP structure
is fixed during the training;
. Difficulties with selecting the training sample to be long enough to
guarantee the correct classification.
4. Proposed RBF-network design
[pic]
Such neural nets are known as RBF-nets, Radial Basis Function
neural networks. RBF-nets differ from the MLP in two respects: in
their metric (not only L2, but also the Manhattan or Mahalanobis
metric can be used) and in the activation function (a Gaussian
instead of the MLP's sigmoid).
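As an illustration of the metric choice, here is a minimal sketch (our
notation, not taken from the original program) of the squared Mahalanobis
distance d^2(x, c) = (x − c)^T S^(−1) (x − c) between a sample x and a
neuron centre c; with S^(−1) equal to the identity it reduces to the
squared L2 distance:

    // Squared Mahalanobis distance; Sinv is assumed to be a precomputed
    // inverse covariance matrix (p x p) of the training samples.
    #include <vector>

    double mahalanobis2(const std::vector<double>& x,
                        const std::vector<double>& c,
                        const std::vector<std::vector<double>>& Sinv) {
        const std::size_t p = x.size();
        std::vector<double> d(p);
        for (std::size_t i = 0; i < p; ++i) d[i] = x[i] - c[i];  // x - c
        double s = 0.0;
        for (std::size_t i = 0; i < p; ++i)
            for (std::size_t j = 0; j < p; ++j)
                s += d[i] * Sinv[i][j] * d[j];                   // d^T Sinv d
        return s;
    }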
Our RBF-net innovations are as follows:
. a new structure;
. a new training algorithm.
4.1 New structure
[pic]

4.2 The strategy of training
The main features of the first training algorithm:

. use as activation function F(x) = 1 if x ≤ θ and F(x) = 0 if x > θ,
with the additional threshold parameter θ, which is also optimized
during training (stated as code after this list);
. dynamically add neurons to the hidden layer;
. train the layers of the network separately: first, clusterization;
second, mapping to the desired output;
. train each neuron in a layer also separately(!)
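
Stated as code, the activation rule above is a simple hard threshold
(a minimal illustrative fragment; 'distance' stands for whichever metric
has been selected):

    // Hard-threshold activation of the first training algorithm: the
    // neuron fires only when the sample lies within radius theta of
    // its centre.
    double activation(double distance, double theta) {
        return distance <= theta ? 1.0 : 0.0;
    }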

Separate training of each neuron in all layers gives high speed and
finiteness of the training procedure. During the training procedure the
whole training set is separated into three subsets:
. samples which are already classified by the network (AC);
. samples which are not yet classified (NC);
. samples which are classified by the current neuron (CC).

The idea of the training procedure is to train a single neuron on NC
(the not-yet-classified samples) and then add the neuron to the
RBF-network. The algorithm stops when NC becomes empty.
The strategy of training a single neuron is:
. randomly choose one sample in NC;
. allow the threshold parameter θ to grow;
. add the samples which are closer, in terms of the selected metric,
than θ to CC and remove them from NC;
. recalculate the synaptic weights of the neuron as the centre of
gravity of the corresponding samples in the CC set;
. keep increasing the threshold parameter θ as long as the captured
samples belong to the same class;
. add the new neuron to the network, moving all samples from CC to AC.
Such a training procedure guarantees a finite number of training cycles and
100% correct classification on the training set; a schematic sketch of the
whole loop is given below.
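
To make the procedure concrete, here is a schematic C++ sketch of the
neuron-growing loop (our illustration, not the authors' RBFNN2 code; the
L2 metric is used for brevity, and all names are ours):

    #include <cmath>
    #include <cstdlib>
    #include <vector>

    struct Sample { std::vector<double> x; int label; };
    struct Neuron { std::vector<double> center; double theta; int label; };

    // Squared Euclidean distance; the Manhattan or Mahalanobis metric
    // would plug in here instead.
    static double dist2(const std::vector<double>& a,
                        const std::vector<double>& b) {
        double s = 0.0;
        for (std::size_t i = 0; i < a.size(); ++i)
            s += (a[i] - b[i]) * (a[i] - b[i]);
        return s;
    }

    // Train one neuron: seed it with a random sample from NC, let theta
    // grow while the nearest remaining sample has the same class, and
    // recentre the neuron on the captured samples (CC) after each capture.
    Neuron trainSingleNeuron(std::vector<Sample>& NC,
                             std::vector<Sample>& AC) {
        std::size_t seed = std::rand() % NC.size();
        Neuron n{NC[seed].x, 0.0, NC[seed].label};
        std::vector<Sample> CC{NC[seed]};
        NC.erase(NC.begin() + seed);

        while (!NC.empty()) {
            std::size_t best = 0;                    // nearest NC sample
            for (std::size_t i = 1; i < NC.size(); ++i)
                if (dist2(n.center, NC[i].x) < dist2(n.center, NC[best].x))
                    best = i;
            if (NC[best].label != n.label) break;    // stop at a foreign class
            n.theta = std::sqrt(dist2(n.center, NC[best].x)); // grow theta
            CC.push_back(NC[best]);
            NC.erase(NC.begin() + best);
            // recalculate the centre as the centre of gravity of CC
            std::vector<double> c(n.center.size(), 0.0);
            for (const Sample& s : CC)
                for (std::size_t i = 0; i < c.size(); ++i)
                    c[i] += s.x[i] / CC.size();
            n.center = c;
        }
        AC.insert(AC.end(), CC.begin(), CC.end());   // CC joins the AC set
        return n;
    }

    // Grow the hidden layer until NC is empty: a finite number of cycles
    // and 100% recall on the training set, as stated above.
    std::vector<Neuron> trainHiddenLayer(std::vector<Sample> NC) {
        std::vector<Neuron> layer;
        std::vector<Sample> AC;
        while (!NC.empty()) layer.push_back(trainSingleNeuron(NC, AC));
        return layer;
    }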
Therefore, as the result, we have a layer which produces exactly one
activated neuron for each sample in the training set; the last layer,
which maps to the desired output, can then be "trained" simply by setting
to 1 the weights connected to that activated neuron, while all the others
are set to 0.
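
Continuing the sketch above (reusing its Neuron type), this output-layer
"training" is a single pass that wires each hidden neuron to the output
unit of its own class:

    // Since exactly one hidden neuron fires per training sample, the
    // output layer needs no iterative training: weight 1 to the neuron's
    // own class unit, weight 0 everywhere else.
    std::vector<std::vector<double>> mapToOutputs(
            const std::vector<Neuron>& hidden, int numClasses) {
        std::vector<std::vector<double>> W(
            numClasses, std::vector<double>(hidden.size(), 0.0));
        for (std::size_t j = 0; j < hidden.size(); ++j)
            W[hidden[j].label][j] = 1.0;
        return W;
    }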
We then complete this first training algorithm with the possibility of
keeping an extra sample set containing the wrongly classified samples
(WCS). Further on we refer to it as the second algorithm.
Three examples of 2D classification by the MLP and the RBF net follow.
Different classes are marked by different colors.
Example 1. [pic]
From left to right: (1) the training set; (2) classification by the
RBF-network; (3) classification by the MLP.
[pic]
Example 2 demonstrates the difference in efficiency of the two RBF-net
training algorithms. From left to right: (1) the training set; (2) the
RBF-network trained by the first algorithm; (3) the RBF-net trained by
the second algorithm.

[pic]

Example 3 shows the classification result for the well-known benchmark
problem of separating two embedded spirals. From left to right: (1) the
training set; (2) the RBF-network trained by the first algorithm; (3) the
RBF-net trained by the second algorithm.



5. The first security application
Now we were ready to work with frontal face images. At first we used the
following set as the training sample (see the figure below):
[pic]

After training on this small set, the RBF neural network with the L2
metric was able to recognize without errors specially distorted faces
from the same set (see the next picture).






[pic]






However, as soon as we decided to apply our RBF net to the 400-image set
from the famous Cambridge face database, the neural net began to mix up
faces.

The reason was foreseeable.

Let us consider a digitized face image as a vector x, i.e. a point in a
space of an unthinkable dimensionality like p = 80x100 = 8000. All these
points occupy only a very tiny part, a miserable subspace, of this giant
space. Therefore our attempts to search this whole space for a particular
image, without taking into account any specifics of human faces and of
that particular face, are doomed to be unreliable.




6. Principal component method (PCM)

PCM is a way to project our data onto this subspace, extracting the most
adequate features by using the information about mutual correlations,
i.e. the covariance matrix C = E[(x − x̄)(x − x̄)^T], where x̄ is the mean
image. There is an orthogonal transform W (the Karhunen-Loeve transform)
which converts C to its diagonal form Λ = W^T C W, where the eigenvalues
λ1 ≥ λ2 ≥ ... ≥ λp of C are numbered in descending order. One can keep
only the m most essential components (m << p).
[pic]
Main components as a function of their number
Thus we can now express the source data via these main components,
x ≈ x̄ + y1 w1 + ... + ym wm, with yi = wi^T (x − x̄),
where the wi are the columns of W, neglecting the non-important ones.


PCM computational consequences

1. Being linear, the principal component method can easily be realized as
the first layer of an RBF net (see the sketch below);
2. It gives a considerable economy of computing resources, which is
important both in the RBF-net training phase and for improving the net's
reliability.
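
For illustration, here is a hedged sketch of how the components could be
computed, using the Eigen linear-algebra library (an assumption of ours;
the original software may implement this differently):

    // Rows of X are training images flattened to p = width*height values.
    #include <Eigen/Dense>

    // Returns the p x m matrix W whose columns are the m most essential
    // principal components, largest eigenvalue first.
    Eigen::MatrixXd principalComponents(const Eigen::MatrixXd& X, int m) {
        Eigen::RowVectorXd mean = X.colwise().mean();
        Eigen::MatrixXd centered = X.rowwise() - mean;          // x - x_bar
        Eigen::MatrixXd C =
            centered.transpose() * centered / double(X.rows() - 1);
        // Karhunen-Loeve transform = eigen-decomposition of C; Eigen
        // returns eigenvalues in ascending order, so take the last m
        // columns, reversed. (For p ~ 8000 one would in practice
        // decompose the much smaller n x n Gram matrix instead.)
        Eigen::SelfAdjointEigenSolver<Eigen::MatrixXd> es(C);
        return es.eigenvectors().rightCols(m).rowwise().reverse();
    }

    // The first, linear layer of the RBF net is then just the projection
    // y = W^T (x - x_bar), i.e. Y = centered * W for a whole batch.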

However, PCM has some shortcomings:
. the principal component computation is very expensive, so the RBF-net
training time grows steeply as the number of clients increases;
. as soon as the number of new clients joining the face base exceeds
20-30% of its current size, the RBF-net must be retrained completely,
since the prognostic capability of the covariance matrix is rather
restricted;
. applying PCM to the collection of frontal face images from the
Cambridge face database, we found that the obtained main components (see
the figure on the right) are too dependent on variations of the source
images in lighting, background, etc.
[pic]
Main components of some faces from the Cambridge face database without
prior wavelet transformation

Therefore wavelet preprocessing has been applied.
[pic]
It removes the dependence on lighting and performs a scaling of the
images, although some important face features are lost.

Main components of the same face images after their preprocessing by
2D Gaussian wavelets

[pic]
A fast algorithm was developed for 2D vanishing-moment wavelets. Applying
it to the image below we obtain the following wavelet expansion:

[pic] [pic]

A face image and its 2D wavelet expansion





Summing the three wavelets (vertical, horizontal and diagonal) we obtain
a wavelet transform independent of the image's variability in lighting,
background and size.
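
As an illustration of what these filters look like, here is a hedged
sketch of the three second-order Gaussian wavelet kernels (a simplified
construction of ours; normalisation constants and the exact scheme of the
original fast algorithm may differ):

    #include <cmath>
    #include <vector>

    using Kernel = std::vector<std::vector<double>>;

    // Second-order Gaussian derivative kernels; size should be odd.
    // dir: 0 = horizontal (d2/dx2), 1 = vertical (d2/dy2),
    //      2 = diagonal (d2/dxdy)
    Kernel gaussWavelet2(int size, double sigma, int dir) {
        Kernel k(size, std::vector<double>(size));
        const int h = size / 2;
        const double s2 = sigma * sigma;
        for (int y = -h; y <= h; ++y)
            for (int x = -h; x <= h; ++x) {
                const double g = std::exp(-(x * x + y * y) / (2.0 * s2));
                double v;
                if (dir == 0)      v = (x * x / s2 - 1.0) / s2 * g;
                else if (dir == 1) v = (y * y / s2 - 1.0) / s2 * g;
                else               v = (x * y) / (s2 * s2) * g;
                k[y + h][x + h] = v;
            }
        return k;
    }
    // Each kernel integrates to zero (vanishing moments), so a constant
    // illumination offset drops out of the convolution; summing the three
    // filtered images gives the combined transform described above.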


[pic]

The lower row shows the results of applying 2D Gaussian second-order
wavelets to the face images of the upper row.

Nevertheless, after a detailed study of the efficiency and
misidentification probability of the C++ program implementing our
RBF-like neural network with the second training algorithm (RBFNN2), we
had to refrain from using wavelets for the present time. The reason was
the above-mentioned loss of face features for some types of faces.
Besides, it is easy for our security system to provide checkpoints with
uniform lighting and to keep the same distance to the photographed person.
After training RBFNN2 on 250 face images we tested it on 190 faces with
very promising results: 95% efficiency and not a single case of wrong
acceptance! The 5% inefficiency is due to a single case: among the 10
pictures of one person used for training, one was a photo of this man
making an unusual grimace, so the program did not accept precisely that
photograph of him.

However, we are still going to study in more detail the idea of applying
wavelets for a considerable face-image compression without losing
important features, in order to then apply the principal component method
to the wavelet coefficients obtained at the preprocessing stage.





Conclusion


. A new RBF-like neural network is proposed, which allows processing of
the raster information of digitized images;
. A study of the reliability of direct RBFNN2 application to frontal
face data shows the need for data preprocessing by extraction of
principal components after scaling the data by a 2D wavelet transform;
. Wavelet preprocessing, resulting in significant data compression, is
still under study;
. The corresponding object-oriented C++ software has been developed to
work with frontal face images recorded by a video camera. The first
results on the statistics provided by the Cambridge face database are
quite promising.