Genetic Programming
Genetic programming (GP) automatically create programs.
It uses the principles of Darwinian natural selection,
such as survival of the fitess, to evolve populations of programs.
The genetic material of individuals in the population are the
programs.
Each generation the fitter programs
are selected to be parents and mated to give them children.
There are two common types of genetic operation,
sexual and asexual.
The Genetic Programming Cycle.
From
GP+DS.
Asexual Reproduction
A sexual reproduction create children from a single parant.
This is often known as mutation.
The genetic material in the child comes from just one parent in the
previous generation.
In GP this means the program in the next generation is a copy of a
program in the previous generation.
It is common to make one or more random changes,
known as mutations, that make the child program slightly different
from its parent.
Sexual Mating
With sex, (parts of) two programs are combined to yeild the child
program.
Commonly two points are chosen at random.
One in the mother program and one in the father.
The code below the cutpoint in the father is copied.
The code below the cutpoint in the mother programme is deleted and
then replaced by that cut out of the father.
The newly formed program is their child.
Commonly these operations are performed on copies.
The idea is that the mother and father will tend to be better than
average (since they got selected to have children)
and therefore they will contain good fragments of code.
Therefore the child will also tend to contain good code.
Sometimes a child will be especially fortunate and will be created by
putting together good but different code fragments (and they will work
together well in the child).
Mum, fitness .64286, 2x-x (red subtree will be removed)
Dad, fitness .70588,
x\(x\(x-x**3))-x
(blue subtree copied)
Correct Program, fitness 1.0,
x+x**2-x
(blue subtree copied)
The newly formed child may also be mutated.
There are seveal types of crossover operator available.
Some use more than one crossover point in each parent.
Fitness
Some means is needed to decide which programs are to be parents.
Conventionally this is done by a fitness function which gives each
individal in the population a number (its fitness).
Conventionally this is a scalar number but one of the advantages of
eveoltionary approaches is the
multiobjectives can be
readily supported.
In genetic programming fitness is often calculated by running the
program (individual) on one or more inputs (test cases).
The answer returned by the program is then compared with
the desired answer. This is equaivelent to supervised learning.
The goal is to find a function (program) which describes the test
data.
That is we are fitting a function to some data.
By analogy with linear regression (which fits a line to data),
this is known as "symbolic regression".
C
code for the above example.
Genetic Programming Contacts
See.
W.B.Langdon@cs.ucl.ac.uk,
3 April 2001