
A GENETIC ALGORITHM FOR OPTIMIZING A NEURAL NETWORK
CAPABLE OF LEARNING TO SEARCH FOR FOOD IN A RADIAL MAZE

Budilova E.V., Chepurnova N.E., Chepurnov S.A., Teriokhin A.T.

M.V.Lomonosov Moscow State University, Dept. of Biology.
Moscow 119899, Russia

chepurnovsa@mail.ru
at@ateriokhin.home.bio.msu.ru

We propose a hypothetical neural scheme that can carry out an optimal
search for food in a maze. The general structure of the network was fixed,
but its parameters were optimized by a procedure resembling natural
selection. The real prototype of the situation we investigated was a set of
experiments on how a rat learns to search for food in Olton's two-level
radial maze [Olton, 1979]. Such a maze consists of two round or polygonal
platforms: a larger lower one and a smaller upper one. The upper platform
has a hole in the middle and several radially directed tubular channels
("arms"), each of which can contain a piece of food and has doors that open
only in the centrifugal direction. A rat placed on the lower platform can
go up through the hole, pass through an arm, return down, go up again, pass
through the same or another arm, and so on. The task of the animal is to
visit only those arms that contain food. At first a real rat makes many
errors, i.e. it often visits empty arms, but after several series of
attempts the number of errors decreases considerably.

Our aim was to construct a control system imitating the learning process
and the behaviour of the real animal. The main part of the proposed system
is a neural network with asymmetrical links. The general structure of the
network was fixed, but its parameters had to be determined by learning. The
learning process was split into two parts, phylogenetic and ontogenetic
[Mangel, 1990].
The task of phylogenetic learning was to optimize the main parameters of
the network by a process imitating natural selection in a widely changing
environment. Ontogenetic learning, in turn, had to adjust the parameter
values of an individual to the specific environmental conditions in which
that individual found itself. This was done by a process similar to the
Hebbian learning rule [Hopfield, 1982], but this learning rule was itself
found as a result of phylogenetic learning.

The network consists of working neurons Ai, each of which triggers its own
program (assumed to be already formed) for entering arm i. These neurons
are connected by modifiable asymmetrical synaptic links that make it
possible to learn an effective order of visiting the arms.
Additionally, each neuron Ai is connected to a working-memory neuron Bi.
The state of Bi is set to 1 when Ai becomes 1 and decreases by some
proportion while Ai is -1. There are also two further systems: one takes
into account the geometry of the maze, and the other prevents more than one
neuron Ai from firing simultaneously. The final result of an attempt to
visit an arm is a change in the energy E stored by the animal, which is
made up of the cost of visiting the arm, a possible gain if food is found,
and other costs.

The experimental situation considered had a
peculiarity: after leaving an arm and returning through the central hole to
the upper platform, the rat finds itself in front of the arm opposite the
one just visited and so tends to enter precisely this opposite arm.
Naturally, this tendency weakens as the deviation from the opposite
direction increases, and it is weakest in the direction of the last visited
arm. It was therefore possible to define a natural measure of proximity
between arms that takes its maximum value when the angle between the arms
is 180 degrees. The influence of inter-arm proximities on the behaviour of
the rat was incorporated into our model by adding the matrix of proximities
to the matrix of synaptic weights.

Initially the weights of the links were set randomly. Then a number of
network copies, differing in the values of randomly generated parameters,
evolved, each in accordance with the description of the network and its
parameters. After a number of rounds of visits, the current population is
replaced by a new generation of networks in which the former copies are
represented in proportion to the energies they have stored. Random
mutations and crossovers can also be included at this stage. The procedure
is repeated until the parameter values become stable. The environment
could change at any moment, but periods of random duration with a fixed
food distribution were possible. Usually seven tours were sufficient for
learning to visit the maze arms successfully (or almost successfully).
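
The selection step described above (replacing the population by copies in
proportion to stored energy, optionally with mutations and crossovers) can
be sketched roughly as follows. This is only an illustrative sketch, not
the authors' implementation: the function name next_generation, the
representation of a network as a flat list of parameters, the shift of
energies by their minimum, the uniform crossover and the Gaussian mutation
are all assumptions introduced here.

```python
import random


def next_generation(population, energies, rng,
                    mutation_rate=0.1, mutation_scale=0.05):
    """Resample a population in proportion to stored energy (a sketch).

    population: list of parameter vectors (lists of floats), one per network
    energies:   energy stored by each network after its rounds of visits
    rng:        a random.Random instance
    """
    # Proportional selection needs non-negative weights. Shifting by the
    # minimum energy is an assumption of this sketch; it also means the
    # worst network never reproduces.
    low = min(energies)
    weights = [e - low for e in energies]
    if sum(weights) == 0:
        # All networks stored the same energy: select uniformly.
        weights = [1.0] * len(energies)

    children = []
    for _ in range(len(population)):
        # Draw two parents with probability proportional to the weights.
        mother, father = rng.choices(population, weights=weights, k=2)
        # Uniform crossover (assumed): each parameter comes from either parent.
        child = [m if rng.random() < 0.5 else f
                 for m, f in zip(mother, father)]
        # Gaussian mutation (assumed) on a random subset of the parameters.
        child = [g + rng.gauss(0.0, mutation_scale)
                 if rng.random() < mutation_rate else g
                 for g in child]
        children.append(child)
    return children
```

In a full simulation this function would be called once per generation,
after every network copy has completed its rounds of visits and its stored
energy has been recorded, and the loop would stop once the parameter values
no longer change appreciably.
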
We found that the computer simulation results show a considerable degree of
similarity between the real and the simulated behaviour. Nevertheless, much
work is still needed to bring the model into closer agreement with reality,
not only in terms of similarity to the external behaviour of the animal but
also in terms of adequately modelling its internal cognitive processes.
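
Two ingredients of the model described above, the proximity measure between
arms and the working-memory neurons Bi, can be illustrated with a short
sketch. The abstract does not give their functional forms, so everything
specific here is an assumption: evenly spaced arms, the proximity
(1 - cos(angle)) / 2, which is largest at 180 degrees and zero for the arm
just visited, the multiplicative decay factor, and the function names
themselves.

```python
import math


def proximity_matrix(n_arms, strength=1.0):
    """P[i][j] is largest when arm j is opposite arm i, zero when j == i.

    Assumes evenly spaced arms and a cosine-shaped proximity; the source
    only states that proximity peaks at an angle of 180 degrees.
    """
    matrix = []
    for i in range(n_arms):
        row = []
        for j in range(n_arms):
            angle = 2.0 * math.pi * (j - i) / n_arms
            row.append(strength * (1.0 - math.cos(angle)) / 2.0)
        matrix.append(row)
    return matrix


def effective_weights(weights, proximities):
    """Add the proximity matrix to the synaptic weight matrix element-wise."""
    return [[w + p for w, p in zip(w_row, p_row)]
            for w_row, p_row in zip(weights, proximities)]


def update_memory(b, a, decay=0.8):
    """Working-memory rule: Bi is set to 1 when Ai == 1, otherwise it decays.

    a holds the states of the working neurons (+1 or -1); the decay factor
    is a hypothetical parameter standing in for "some proportion".
    """
    return [1.0 if a_i == 1 else b_i * decay for a_i, b_i in zip(a, b)]
```

For an eight-arm maze, proximity_matrix(8)[0][4] is the largest entry in
row 0, which reflects the tendency to enter the arm opposite the one just
visited.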

Olton D.S., Becker J.T., Hendelman G.E. Hippocampus, space and memory.
Behav. Brain Sci., 1979, v.2, 313-365.
Mangel M. Evolutionary optimization and neural network models of behavior.
J. of Math. Biol., 1990, v.28, 237-256.
Hopfield J.J. Neural networks and physical systems with emergent
collective computational abilities. Proc. Natl. Acad. Sci. USA, 1982,
v.79, 2554-2558.