Results from Genetic Programming for Combining Neural Networks for Drug Discovery
Figure 3 shows that the evolved combined classifier produced by genetic
programming is better than all of the 60 trained neural networks:
its ROC dominates the ROC of every one of them.
In fact Figure 4 shows its ROC is better even than their combined
Maximum Realisable ROC (MRROC).
(The movie shows the evolution of the ROCs in the GP population
in a similar run.)
The purple graph shows that size fair crossover and the four mutation
operators have succeeded in controlling bloat.
The evolved classifier is shown at the end of the movie.
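For reference, the MRROC is the convex hull of the individual
classifiers' ROC points: any point on the hull can be realised by
randomly alternating between two of the given classifiers. Below is a
minimal Python sketch of the hull computation; the function and
variable names are mine, not part of the GP system.

    def cross(o, a, b):
        # Cross product of vectors o->a and o->b; negative means a right turn.
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    def mrroc_hull(points):
        # points: (false positive rate, true positive rate) pairs, one per
        # classifier. Returns the upper convex hull from (0,0) to (1,1).
        pts = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})
        hull = []
        for p in pts:
            # Discard earlier points lying on or below the chord to p.
            while len(hull) >= 2 and cross(hull[-2], hull[-1], p) >= 0:
                hull.pop()
            hull.append(p)
        return hull

    print(mrroc_hull([(0.1, 0.5), (0.4, 0.7), (0.3, 0.9)]))
    # [(0.0, 0.0), (0.1, 0.5), (0.3, 0.9), (1.0, 1.0)] -- (0.4, 0.7) is dominated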
Fig. 3.
Performance of the 60 given neural networks and of genetic programming.
The area under the Receiver Operating Characteristics curve (ROC)
on the training data (horizontal axis) is plotted against that
on the holdout data (vertical axis).
Points below the diagonal indicate a degree of overtraining.
Receiver Operating Characteristics, on the training data,
of the evolved composite classifier.
For comparison the ROCs of the 60 given neural networks
are also plotted.
The 17 networks included in the final classifier
are shown in a different colour.
Fig. 4.
Receiver Operating Characteristics
of the evolved composite classifier.
For comparison the convex hull of the 60 given neural networks,
on the training and holdout data, is given.
Note that the convex hull classifier is no longer convex when
used to classify the holdout data.
Movie
showing the evolution of individual classifiers within the GP
population plotted in the ROC square.
Evolved P450 Classifier
(Figure: the five evolved trees, Tree 0 to Tree 4.)
Five combinations are evolved simultaneously.
Each consists of the neural networks trained by Clementine
(in red),
arithmetic functions,
constants
and the tuning (or sensitivity) parameter T (in blue).
The complete classifier is obtained by summing the outputs of the five
trees.
If the sum is negative, this predicts the chemical will be inactive.
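In outline (a sketch only: the real trees are evolved expressions, so
the callables below are hypothetical stand-ins):

    def classify(chemical_features, trees):
        # trees: the five evolved expressions, each mapping the chemical's
        # features (via the embedded neural networks) to a float.
        total = sum(tree(chemical_features) for tree in trees)
        # Negative sum -> predicted inactive; otherwise active.
        return "inactive" if total < 0.0 else "active"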
The Function set (a Python sketch of these primitives follows the list)
- floating point +, -, times and divide.
Division is protected: divide by zero always yields 1.0,
which prevents the GP system failing with a divide-by-zero fault.
- Max takes two arguments and returns the value of the larger.
- Min returns the value of the smaller.
- MaxA also takes two arguments but returns the signed value of the
one that is larger in absolute terms.
E.g. MaxA(-2, 1) returns -2.0.
- MinA returns the signed value of the one that is smaller
in absolute terms.
- INT returns the integer part of its input. E.g. INT(3.23) returns 3.
- FRAC returns the fractional part of its input.
E.g. FRAC(3.23) returns 0.23.
- ANNxx.y
is the output of the neural network trained on featureset group xx
and dataset y.
Its argument is its tuning parameter.
A value of 0.5 indicates no bias.
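As a concrete illustration, a minimal Python sketch of these
primitives; the spellings and tie-breaking choices are mine and the
original system's details may differ.

    def div(a, b):      # protected division
        return 1.0 if b == 0.0 else a / b

    def max_a(a, b):    # signed value that is larger in magnitude
        return a if abs(a) >= abs(b) else b    # max_a(-2.0, 1.0) -> -2.0

    def min_a(a, b):    # signed value that is smaller in magnitude
        return a if abs(a) <= abs(b) else b

    def int_part(x):    # integer part: int_part(3.23) -> 3.0
        return float(int(x))

    def frac(x):        # fractional part: frac(3.23) -> about 0.23
        return x - float(int(x))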
Using the evolved classifier
When presented with a chemical to be classified,
each of the neural networks in the evolved composite classifier
makes its own estimate of whether the chemical is active or not.
To do this it needs about 50 inputs
(which ones depend upon which of the 15 groups the neural network
belongs to).
The inputs are properties of the chemical.
There are a total of 699 chemical features (inputs).
These have been split into 15 groups of about 50.
The evolved classifier uses 17 neural networks
but some come from the same group.
In fact only 12 groups are used.
So approximately 560 features are needed for each chemical.
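The arithmetic behind that figure, as a quick check (group sizes are
approximate):

    TOTAL_FEATURES = 699
    N_GROUPS = 15
    groups_used = 12                      # the 17 networks span 12 groups
    print(round(TOTAL_FEATURES / N_GROUPS * groups_used))   # 559, roughly 560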
The outputs of the neural networks are
then treated as implicit inputs by the five evolved trees.
The value returned internally within the GP calculation
depends both on the threshold argument to the ANN function and
(implicitly) on the output of the neural network.
Additionally the tuning parameter T must be set
to a value in the range 0 to 1.
Note each GP tree makes a nonlinear combination of the results given
by several neural networks.
Finally the outputs of all five trees are summed.
If the value is non-negative,
the chemical is predicted to be active against the specific
P450 enzyme.
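The text leaves the exact arithmetic of the ANN primitive implicit.
One plausible reading, an assumption on my part rather than the
system's documented formula, is that the node returns the network's
raw output (in the range 0 to 1) minus its threshold argument, so that
a threshold of 0.5 applies no bias:

    # ASSUMED semantics, for illustration only; the real GP node may differ.
    def ann_node(network_output, threshold=0.5):
        # network_output: the neural network's raw output, in [0, 1].
        return network_output - threshold

    # Hypothetical use inside one tree: two networks output 0.9 and 0.4
    # for some chemical, with tuning parameter T = 0.3.
    T = 0.3
    value = ann_node(0.9, T) * ann_node(0.4)   # 0.6 * -0.1, about -0.06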
W.B.Langdon@cs.ucl.ac.uk
30 August 2001