GP thesis I wish I had written

Panel Discussion on GP thesis I wish I had written

Wednesday, 16 July, 1997, 11:30 - 12:30 pm

Genetic Programming (GP97),

Stanford, California, USA

Una-May O'Reilly, Artificial Intelligence Lab, MIT, unamay@ai.mit.edu

Suggestions for GP Students

Major potential topics
1. The interactions among genotype growth, problem-solving efficacy and problem definition. Why does genotype growth impede GP on some problems and not others? Can growth be controlled in step with search and in step with increasing the difficulty of the task? Are self-adaptive mechanisms the best way, or is there some simple, explicit, performance-dependent mechanism that can be added to GP?
2. A taxonomy of program discovery problems that accounts for primitive semantics and findings on how semantics interact with GP search capability.
3. An analysis of epistatic interaction and GP.
For one or all of the topics in 1., elaborate in terms of suggesting several specific investigations, a methodology, a problem class, related work, etc.
Regarding topic #3: The behaviour of a primitive depends on the other primitives in the GP program. Is it reasonable to consider data dependency and control dependency as a measure of epistasis? Can dependency flow graphs quantify epistasis in a program? How can the dependency of a primitive on its "neighbours" be quantified so that its value correlates with problem difficulty? GAs seem to have a niche in optimization problem space: they don't solve independent problems as efficiently as hill climbers, but as epistasis increases, a GA becomes a better choice than a hill climber. However, as epistasis increases further, the problem becomes too hard for any algorithm. What happens in GP?
What are good GP problems for a dissertation? By ``good'' let's mean useful for illustration and evidence that the concept you have for extending our knowledge about GP is worthwhile.
I think a dissertation needs to make use of 1) problems that are simple but illustrative, and scalable, and 2) one (or more) valid "real" problems to demonstrate a technique. I would suggest that designing a problem to investigate an issue is often an enlightening exercise that yields more than just a problem to work with.
What overused or unillustrative problems detract from convincing you about a GP research project?
The ant trail. Games and puzzles. I don't mind multiplexor because it's common in the literature and it's possible to understand it.
What is the most difficult aspect of GP of which to convince skeptical committee members? How did you convince them? What do committee members find intuitive about GP? Which of their intuitions mislead them?
It was difficult to justify that I hadn't thrown every bell and whistle into my GP+ algorithm. Committee members can intuitively come up with ideas such as modified operators, niching, competitive selection and distributed island models because they are knowledgeable enough about evolution to think up evolutionary mechanisms that might convert algorithmically.
I had to stress that the probabilistic nature of the algorithm and the population component make GP sufficiently complex that simple versions of it are worth studying and analysing.
My committee occasionally forgot that the process of natural evolution does not optimize - it adapts. And, they confused their engineering hats with their scientific hats.
Are there any areas in GP research you would steer a dissertation-student away from? Why?
I don't think GP dissertations that merely enhance the algorithm to solve a big real world problem qualify under academic standards (even though they are extremely important to the "real" world!). To meet dissertation standards, there must be analysis and new revelations about how something works.
Comment on what you consider a solid methodology for experimentation and investigations based on GP.
Let's suppose you're going to add something to vanilla GP. Start with a hypothesis about what the effect of your mechanism will be.
Collect illustrative problems and *beforehand* make sure you have measurable benchmarks which you can use to evaluate your hypothesis. Make sure you have a way of making fair comparisons.
Run a sufficient number of experiments to ensure you can say something with statistical confidence. Beware: GP runs have high variance in terms of success and level of success.
Regardless of whether your hypothesis is confirmed, explain what happened!! Failures are often more enlightening than successes.
Be prepared to do and try lots of things that don't make it into your dissertation.
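The advice above about fair comparisons and high run-to-run variance can be illustrated with a minimal sketch. It assumes you have recorded one number per independent run (for example, best fitness at the final generation) for vanilla GP and for your modified GP, and it uses a permutation test on the difference of means. The permutation test is a choice made for this sketch, not a method named in the discussion, and the function name, the sample values and the number of resamples are all hypothetical placeholders.

```python
import random


def permutation_test(a, b, n_resamples=10_000, seed=0):
    """Two-sided permutation test on the difference of the means of a and b.

    a, b: per-run results (one number per independent GP run) for two variants.
    Returns an estimated p-value for the hypothesis that both variants
    produce results from the same distribution.
    """
    rng = random.Random(seed)
    observed = abs(sum(a) / len(a) - sum(b) / len(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_resamples):
        # Reassign the run results to the two variants at random.
        rng.shuffle(pooled)
        left, right = pooled[:len(a)], pooled[len(a):]
        diff = abs(sum(left) / len(left) - sum(right) / len(right))
        # Small tolerance guards against floating-point rounding.
        if diff >= observed - 1e-12:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_resamples + 1)


if __name__ == "__main__":
    # Placeholder data only: these are NOT results from any real experiment.
    vanilla_runs = [0.61, 0.58, 0.72, 0.40, 0.66, 0.59, 0.35, 0.70]
    modified_runs = [0.64, 0.75, 0.69, 0.52, 0.71, 0.80, 0.49, 0.77]
    print("estimated p-value:", permutation_test(vanilla_runs, modified_runs))
```

With only a handful of runs per variant the estimated p-value will often be large even when a real difference exists, which is one concrete way to see why a "sufficient number of experiments" matters.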
How do your GP projects fit into the broad investigative perspective of your work or research?
I consider myself to be interested in the topic of learning under the umbrella of artificial intelligence. I see evolution as an intriguing example of adaptation and learning. I want to find evolution's niche in learning.
What tools, advice and references did you find particularly useful in preparing your dissertation?
A good set of statistics routines and a good plotter! Matlab is very handy now, though for my dissertation I used C and Perl to write stats routines and gnuplot to plot stuff.
I started writing using a LaTeX dissertation template and still prefer any text prep program that produces (efficient) PostScript. Figures are a pain, so on a slow day figure out how to get them onto a page with captions and sized properly.
I think the most inspiring book I've read on evolutionary computation is John Holland's book entitled "Adaptation in Natural and Artificial Systems". It is essential to keep up to date with all related conferences, books and journals (and they are growing in number!).
As I grew more confident about the individuality of my thesis subject - that no one else had my perspective or investigative slant, I engaged in email discussions with people whose work I respected.
Do you have advice on computational resources, software platforms?
I wrote my original stuff in LISP and still regret it. I really think a C or OO-based platform is more efficient. Write your code threaded (e.g. distributed GP) and don't worry about the extra overhead you introduce if you run it on one processor; I bet that will be dwarfed by other computations.
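As a rough illustration of writing the code threaded from the start, here is a minimal Python sketch in which fitness evaluation goes through a worker pool whose size is a parameter, so the same code runs unchanged on one processor. Everything in it is an assumption made for illustration: the list-of-integers "program" representation, the `fitness` stand-in and the `evaluate_population` helper are hypothetical, and Python is used here only for brevity rather than the C or OO platform recommended above.

```python
from concurrent.futures import ThreadPoolExecutor


def fitness(program):
    # Hypothetical stand-in: a "program" is a list of ints and its fitness
    # is their sum. A real GP system would execute the program on test cases.
    return sum(program)


def evaluate_population(population, max_workers=1):
    """Evaluate every individual; results are in the same order as population."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fitness, population))


if __name__ == "__main__":
    population = [[1, 2, 3], [4, 5], [6]]
    # Same call on one worker or several; only max_workers changes.
    print(evaluate_population(population, max_workers=1))
    print(evaluate_population(population, max_workers=4))
```

In CPython, threads do not speed up CPU-bound pure-Python work because of the global interpreter lock, so a real speedup would likely need `ProcessPoolExecutor` or a compiled fitness function. The point of the sketch is only the structure: parallelism is a parameter rather than a later rewrite.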
Other comments?
Read other people's dissertations to get ideas. Contact them if you want to discuss something.