EVOSTAR2024: EVOSTAR
PROGRAM FOR THURSDAY, APRIL 4TH
Days:
previous day
next day
all days

View: session overviewtalk overview

09:00-11:00 Session 7A: EvoCOP 1 - Operators, learning, and applications
Location: Room A
09:00
Q-learning Based Framework for Solving the Stochastic E-waste Collection Problem

ABSTRACT. Electrical and Electronic Equipment (EEE) has evolved into a gateway for accessing technological innovations. The rapid turnover of EEE imposes substantial pressure on the environment due to the shortened life cycles. E-waste encompasses discarded EEE and its components which are no longer in use. This study focuses on the e-waste collection problem and models it as a Vehicle Routing Problem with a heterogeneous fleet and a multi-period planning problem with time windows as well as stochastic travel times. Two different Q-learning-based methods are designed to enhance the search procedure for finding solutions of the problem. The first method involves utilizing the state-action value (Q-value) to determine the order of multiple improvement operators within the GRASP framework. The second method involves a hyperheuristic that extracts a stochastic policy from the Q-table to select heuristic operators during the search. Computational experiments demonstrate that both methods perform competitively with state-of-the-art methods in newly-generated small-sized instances, while the performance gap widens as the size of the problem instances increases.

09:25
Studies on Multi-objective Role Mining in ERP Systems

ABSTRACT. A common concept to ensure the security of IT systems, in which multiple users share access to common resources, is Role Based Access Control (RBAC). Permissions, which correspond to the authorization to perform an operation on a data or business object are grouped into roles. These roles are then assigned to users. The corresponding optimization problem, the so-called Role Mining Problem (RMP), aims at finding a role concept comprising a minimal set of such roles and was shown to be NP-complete. However, in real-world role mining scenarios, it is typically the case that, besides the number of roles, further key figures must be consulted in order to adequately evaluate role concepts. Therefore, in this paper, the RMP is extended to a multi-objective (MO) optimization problem. Potential optimization objectives are discussed in the context of Enterprise Resource Planning (ERP) systems. Furthermore, it is shown, how evolutionary algorithms for the RMP can be adapted to meet the requirements of MO role mining. Based on this, the integration of different optimization objectives is examined and evaluated in a series of experiments.

09:50
Greedy heuristic guided by lexicographic excellence

ABSTRACT. This article deals with a basic greedy algorithm which, element by element, is able to myopically construct a feasible solution to a wide family of combinatorial optimization problems. The novelty is to guide the greedy algorithm by considering the elements of the problem by order of merit, following a social ranking method. Social rankings come from social choice theory. The method used in the present article, called lexicographic excellence, sorts individual elements on the basis of the performances of groups of elements. In order to validate our approach, we conduct a theoretical analysis on matroid optimization problems, followed by a thorough experimental study on the multi-dimensional knapsack and the maximum weight independent set problem, leading to promising results.

10:15
Emergence of new local search algorithms with neuro-evolution

ABSTRACT. This paper explores a novel approach aimed at overcoming existing challenges in the realm of local search algorithms. The main objective is to better manage information within these algorithms, while retaining simplicity and generality in their core components. Our goal is to equip a neural network with the same information as the basic local search and, after a training phase, use the neural network as the fundamental move component within a straightforward local search process. To assess the efficiency of this approach, we develop an experimental setup centered around NK landscape problems, offering the flexibility to adjust problem size and ruggedness. This approach offers a promising avenue for the emergence of new local search algorithms and the improvement of their problem-solving capabilities for black-box problems.

09:00-11:00 Session 7B: EvoApplications 4 - Computational Intelligence for Sustainability
Location: Room B
09:00
Optimizing Urban Infrastructure for E-Scooter Mobility

ABSTRACT. This paper addresses the optimization of urban infrastructure for e-scooter mobility through a multi-criteria approach. The proposed problem considers redesigning road infrastructure to integrate e-scooters into a city's multimodal transportation system. The objectives involve improving cycle lane coverage for e-scooters while minimizing installation costs. A parallel multi-objective evolutionary algorithm is introduced to solve this problem, applied to a real-world instance based on Malaga city data. The results showcase the algorithm's effectiveness in exploring the Pareto front, offering diverse trade-off solutions. Key solutions are analyzed, highlighting different zones with varying trade-offs between travel time improvement and installation costs. Visualization of proposed infrastructure changes illustrates significant reductions in travel time and enhanced multimodality. Computational efficiency analysis indicates successful parallelization, achieving substantial speedup and high efficiency with up to 32 processing elements.

09:25
Improving Image Filter Efficiency: A Multi-Objective Genetic Algorithm Approach to Optimize Computing Power

ABSTRACT. For real-time applications in embedded systems, an efficient image filter is not defined solely by its accuracy but by the delicate balance it strikes between precision and computational speed. While one approach to manage an algorithm’s computing demands involves evaluating its complexity, an alterna- tive strategy employs a multi-objective algorithm to optimize both precision and computation. In this paper, we introduce a multi-objective adaptation of Cartesian genetic pro- gramming aimed at enhancing image filter performance. We refine the existing Cartesian genetic programming framework for image processing by integrating the elite non-dominated sorting genetic algorithm into the evolutionary process, thus enabling the generation of a set of Pareto front solutions that cater to multiple objectives. To assess the effectiveness of our framework, we conduct a study using an Urban Traffic dataset and compare our results with those obtained using the standard framework employing a mono-objective evolutionary strategy. Our findings re- veal two key advantages of this adaptation. Firstly, it generates individuals with nearly identical precision in one objective while achieving a substantial enhance- ment in the other objective. Secondly, the use of the Pareto front during the evo- lution process expands the research space, yielding individuals with improved fitness.

09:50
Heuristics for Evolutionary Optimization for the Centered Bin Packing Problem

ABSTRACT. The Bin Packing Problem (BPP) is an optimization problem where a number of objects are placed within a finite space. This problem has a wide range of applications, from improving the efficiency of transportation to reducing waste in manufacturing. In this paper, we are considering a variant of the BPP where irregular shaped polygons are required to be placed as close to the center as possible. This variant is motivated by its application in 3D printing, where central placement of the objects improves the printing reliability. To find (near) optimum solutions to this problem, we employ Evolutionary Algorithms, and propose several heuristics. We show how these heuristics interact with each other, and their most effective configurations in providing the best solutions.

10:15
Evolutionary Algorithms for Optimizing Emergency Exit Placement in Indoor Environments

ABSTRACT. The problem of finding the optimal placement of emergency exits in an indoor environment to facilitate the rapid and orderly evacuation of crowds is addressed in this work. A cellular-automaton model is used to simulate the behavior of pedestrians in such scenarios, taking into account factors such as the environment, the pedestrians themselves, and the interactions among them. A metric is proposed to determine how successful or satisfactory an evacuation was. Subsequently, two metaheuristic algorithms, namely an iterated greedy heuristic and an evolutionary algorithm (EA) are proposed to solve the optimization problem. A comparative analysis shows that the proposed EA is able to find effective solutions for different scenarios, and that an island-based version of it outperforms the other two algorithms in terms of solution quality.

10:40
Nature-inspired Portfolio Diversification using Ant Brood Clustering
PRESENTER: Ashish Lakhmani

ABSTRACT. Portfolio diversification is a crucial strategy for mitigating risk and enhancing long-term returns. This paper introduces a unique approach to large-scale diversification using Ant Brood Sorting clustering, a nature-inspired algorithm, in conjunction with co-integration measure of time series. Traditional diversification strategies often struggle during uncertain market times. In contrast, the proposed method leverages Ant Brood Sorting to group similar stocks based on the cointegration of their closing prices. This approach allows for the creation of diversified portfolios from a wide range of stocks. The study presents promising results, with clusters of stocks showing both high correlation and cosine similarity, validating the effectiveness of the approach. Silhouette score, a measure of cluster quality, and inter-cluster analysis demonstrate support in validating the results of the study by displaying similarities between the stocks being clustered and distinctiveness with stocks in other clusters. The research contributes to the application of nature-inspired algorithms in large-scale portfolio diversification, offering potential benefits for investors seeking resilient and balanced portfolios.

09:00-11:00 Session 7C: EvoApplications 5 - Misc applications
Location: Room C
09:00
Finding Near-Optimal Portfolios With Quality-Diversity
PRESENTER: Bruno Gašperov

ABSTRACT. The majority of standard approaches to financial portfolio optimization (PO) are based on the mean-variance (MV) framework. Given a risk aversion coefficient, the MV procedure yields a single portfolio that represents the optimal trade-off between risk and return. However, the resulting optimal portfolio is known to be highly sensitive to the input parameters, i.e., the estimates of the return covariance matrix and the mean return vector. It has been shown that a more robust and flexible alternative lies in determining the entire region of near-optimal portfolios. In this paper, we present a novel approach for finding a diverse set of such portfolios based on quality-diversity (QD) optimization. More specifically, we employ the CVT-MAP-Elites algorithm, which is scalable to high-dimensional settings with potentially hundreds of behavioral descriptors and/or assets. The results highlight the promising features of QD as a novel tool in PO.

09:25
Low-Memory Matrix Adaptation Evolution Strategies exploiting gradient information and Lévy flight

ABSTRACT. The Low-Memory Matrix Adaptation Evolution Strategy is a recent variant of CMA-ES that is specifically meant for large-scale numerical optimization. In this paper, we investigate if and how gradient information can be included in this algorithm, in order to enhance its performance. Furthermore, we consider the incorporation of Lévy flight to alleviate stability issues due to possibly unreliably gradient estimation as well as promote better exploration. In total, we propose four new variants of LMMA-ES, making use of real and estimated gradient, with and without Lévy flight. We test the proposed variants on two neural network training tasks, one for image classification through the newly introduced Forward-Forward paradigm, and one for a Reinforcement Learning problem, as well as five benchmark functions for numerical optimization.

09:50
Memory Based Evolutionary Algorithm for Dynamic Aircraft Conflict Resolution
PRESENTER: Sarah Degaugue

ABSTRACT. In this article, we focus on a dynamic aircraft conflict resolution problem. The objective of an algorithm dedicated to dynamic problems shifts from finding the global optimum to detecting changes and monitoring the evolution of the optima over time. In the air traffic control domain, there is added value in dealing quickly with the dynamic nature of the environment and providing the controller with solutions that are stable over time. In this article, we compare two approaches of an evolutionary algorithm for the management of aircraft in a control sector at a given flight level: one is naive, i.e. the resolution of the current situation is reset to zero at each time step, and the other is memory-based, where the last population of the optimisation is stored to initiate the resolution at the next time step. Both approaches are evaluated with basic and optimised operators and settings. The results are in favour of the optimised version with explicit memory, where conflict-free solutions are found quicker and the solutions are more stable over time. Furthermore in the case of an external action, although the diversity of the population could be lower with the memory-based approach, the presence of memory does not appear to be a hindrance and, on average, improves the solver's responsiveness.

10:15
GM4OS: an Evolutionary Oversampling Approach for Imbalanced Binary Classification Tasks

ABSTRACT. Imbalanced datasets pose a significant and longstanding challenge to machine learning algorithms, particularly in binary classification tasks. Over the past few years, various solutions have emerged, with a substantial focus on the automated generation of synthetic observations for the minority class, a technique known as oversampling. Among the various oversampling approaches, the Synthetic Minority Oversampling Technique (SMOTE) has recently garnered considerable attention as a highly promising method. SMOTE achieves this by generating new observations through the creation of points along the line segment connecting two existing minority class observations. Nevertheless, the performance of SMOTE frequently hinges upon the specific selection of these observation pairs for resampling. This research introduces the Genetic Methods for OverSampling (GM4OS), a novel oversampling technique that addresses this challenge. In GM4OS, individuals are represented as pairs of objects. The first object assumes the form of a GP-like function, operating on vectors, while the second object adopts a GA-like genome structure containing pairs of minority class observations. By co-evolving these two elements, GM4OS conducts a simultaneous search for the most suitable resampling pair and the most effective oversampling function. Experimental results, obtained on ten imbalanced binary classification problems, demonstrate that GM4OS consistently outperforms or yields results that are at least comparable to those achieved through linear regression and linear regression when combined with SMOTE.

10:40
Applying Graph Partitioning-Based Seeding Strategies to Software modularisation

ABSTRACT. Software modularisation is a pivotal facet within software engineering, seeking to optimise the arrangement of software components based on their interrelationships. Despite extensive investigations in this domain, particularly concerning evolutionary computation, the research emphasis has transitioned towards solution design and convergence analysis rather than pioneering methodologies. The primary objective is to attain efficient solutions within a pragmatic timeframe. Recent research posits that initial positions in the search space wield minimal influence, given the prevalent trend of methods converging upon akin local optima. This paper delves into this phenomenon comprehensively, employing graph partitioning techniques on dependency graphs to generate initial clustering arrangement seeds. Our empirical discoveries challenge conventional insight, underscoring the pivotal role of seed selection in software modularisation to enhance overall outcomes.

14:20-16:00 Session 10A: EvoApplications Best Paper nominations
Location: Room A
14:20
Evolving Feature Extraction Models for Melanoma Detection: A Co-operative Co-evolution Approach

ABSTRACT. As global mortality rates rise alongside an increasing incidence of skin cancer, it becomes increasingly clear that the pursuit of an effective strategy to combat this challenge is gaining urgency. In traditional practices, the diagnosis of skin cancer predominantly depends on manual inspection of skin lesions. Despite its prevalent use, this approach is beset with several limitations, such as subjectivity, time constraints, and the invasive nature of biopsy procedures. Addressing these obstacles, the burgeoning field of Artificial Intelligence (AI) has been instrumental in advancing Computer Automated Diagnostic Systems (CADS) for skin cancer. A critical aspect of these systems is feature extraction, a process crucial for discerning and utilising key characteristics from raw image data, thereby bolstering the efficacy of CADS. This study introduces a feature extraction model that evolves automatically, leveraging the principles of genetic programming and cooperative coevolution. This method generates a ensemble of models that collaboratively work to extract discerning features from images of skin lesions. The model's effectiveness is evaluated using a publicly accessible dataset. The findings indicate that the proposed method either matches or surpasses the performance of established benchmarks and recent methodologies in this field, underscoring its potential in enhancing skin cancer diagnostic processes.

14:45
Interpretable Solutions for Breast Cancer Diagnosis with Grammatical Evolution and Data Augmentation
PRESENTER: Yumnah Hasan

ABSTRACT. Medical imaging diagnosis increasingly relies on Machine Learning (ML) models. This is a task that is often hampered by severely imbalanced datasets, where positive cases can be quite rare. Their use is further compromised by their limited interpretability, which is becoming increasingly important. While post-hoc interpretability techniques such as SHAP and LIME have been used with some success on so-called black box models, the use of inherently understandable models makes such endeavours more fruitful. This paper addresses these issues by demonstrating how a relatively new synthetic data generation technique, STEM, can be used to produce data to train models produced by Grammatical Evolution (GE) that are inherently understandable. STEM is a recently introduced combination of the Synthetic Minority Over-sampling Technique (SMOTE), Edited Nearest Neighbour (ENN), and Mixup; it has previously been successfully used to tackle both between-class and within-class imbalance issues. We test our technique on the Digital Database for Screening Mammography (DDSM) and the Wisconsin Breast Cancer (WBC) datasets and compare Area Under the Curve (AUC) results with an ensemble of the top three performing classifiers from a set of eight standard ML classifiers with varying degrees of interpretability. We demonstrate that the GE-derived models present the best AUC while still maintaining interpretable solutions

15:10
3D Motion Analysis in MRI using a Multi-objective Evolutionary k-means Clustering

ABSTRACT. Many studies focused on gastric motility require the use of synthetic tracers to map the motion of content. Our study instead takes advantage of an unusual MRI acquisition protocol, combined with multi-objective optimised clustering to map the motion of food (peas, a natural 'tracer') in a human stomach. We chose NSGA-II to optimise the starting positions for a modified k-means to create optimum clusters. We compared our optimisation approach with a purely random approach that took an equal amount of processing time. Since we have no ground truth available, we have created alternative measures to evaluate our solutions: if the resulting pea velocities are within an expected range, and if each pea's motion is correlated with neighbouring peas. We found that the optimised version has a significant improvement over the purely random search. Furthermore, we found many interesting food motion behaviours, such as correlated pea motion and more complex motion dynamics such as collision. Overall we found that the combined optimisation and clustering approach produced interesting findings relating to food dynamics in a human stomach.

14:20-16:00 Session 10B: EvoMUSART 4 - Generative Visual Design and Illustration
Location: Room B
14:20
Evolving Visually-Diverse Graphic Design Posters

ABSTRACT. Finding unconventional visual solutions that stand out and draw attention is frequently one of the goals of Graphic Design (GD). However, to save time, graphic designers often adhere to design trends and templates, resulting in creations that frequently lack distinctive qualities. To speed up the creative process, among other techniques, researchers have been using evolutionary algorithms to generate and present novel GD solutions to end-users, so they can save time by working over the generated ideas. Nevertheless, state-of-the-art approaches often converge to a final group of results that look too similar to each other. In this paper, we test the application of niching techniques to generate a set of visually varied GD solutions and present it to end-users. GD posters were evolved as a proof of concept. The results suggest that implementing the tested niching technique in evolutionary systems can be a viable approach to present users with a wider range of ideas, at least for poster design.

14:45
Evaluation Metrics for Automated Typographic Poster Generation

ABSTRACT. Computational Design approaches facilitate the generation of typographic design, but evaluating these designs remains a challenging task. In this paper, we propose a set of heuristic metrics for typographic design evaluation, focusing on their legibility, which assesses the text visibility, aesthetics, which evaluates the visual quality of the design, and semantic features, which estimate how effectively the design conveys the content semantics. We experiment with a constrained evolutionary approach for generating typographic posters, incorporating the proposed evaluation metrics with varied setups, and treating the legibility metrics as constraints. We also integrate emotion recognition to identify text semantics automatically and analyse the performance of the approach and the visual characteristics outputs.

15:10
Evoboard: Geoboard-inspired Evolved Typefonts

ABSTRACT. Typo design is a field that deals with the creation of visually appealing designs for the written language. The work of the designer is time-consuming and requires many iterations until the final solution is achieved. Although a human expert is required to validate the final results, this task can be aided with the use of automatic design software. We propose Evoboard, an automatic algorithm that evolves a typefont using a geoboard-inspired representation where each character is a self-intersecting polygon. Evoboard uses a genetic algorithm to optimize the number of vertices of the polygon and their positions in a grid. The evolution of the population is guided by an Optical Character Recognition~(OCR) model that aims to maximize the recognition of the polygon as the target character. Thanks to this simple pipeline, both the OCR model and the representation can be easily modified by the user to their needs. In this work, Evoboard evolved a set of 36 alphanumeric characters that are both highly legible and aesthetically appealing, two very important aspects in type design.

15:35
PatternPortrait: Draw Me Like One of Your Scribbles

ABSTRACT. This paper introduces a process for generating abstract portrait drawings from pictures. Their unique style is created by utilizing single freehand pattern sketches as references to generate unique patterns for shading. The method involves extracting facial and body features from images and transforming them into vector lines. A key aspect of the research is the development of a graph neural network architecture designed to learn sketch stroke representations in vector form, enabling the generation of diverse stroke variations. The combination of these two approaches creates joyful abstract drawings that are realized via a pen plotter. The presented process garnered positive feedback from an audience of approximately 280 participants.

14:20-16:00 Session 10C: EuroGP 2
Location: Room C
14:20
Reducing Generational Computation in Counterexample-Driven Genetic Programming without Formal Specifications

ABSTRACT. Counterexample-driven genetic programming (CDGP) uses specifications provided as formal constraints in order to generate the training cases used to evaluate the evolving programs. It has also been extended to combine formal constraints and user-provided training data to solve symbolic regression problems. Here we show how the ideas underlying CDGP can also be applied using only user-provided training data, without formal specifications. We demonstrate the application of this method, called ``informal CDGP,'' to software synthesis problems. Our results show that informal CDGP finds solutions faster (i.e. with fewer program executions) than standard GP. Additionally, we propose two new variants to informal CDGP, and find that one produces significantly more successful runs on about half of the tested problems. Finally, we study whether the addition of counterexample training cases to the training set is useful by comparing informal CDGP to using a static subsample of the training set, and find that the addition of counterexamples significantly improves performance.

14:45
Enhancing Large Language Models-based Code Generation by Leveraging Genetic Improvement

ABSTRACT. In recent years, the rapid advances in neural networks for Natural Language Processing (NLP) have led to the development of Large Language Models (LLMs), able to substantially improve the state-of-the-art in many NLP tasks, such as question answering and text summarization. Among them, one particularly interesting application is automatic code generation based only on the problem description. However, it has been shown that even the most effective LLMs available often fail to produce correct code. To address this issue, we propose an evolutionary-based approach using Genetic Improvement (GI) to improve the code generated by an LLM using a collection of user-provided test cases. Specifically, we employ Grammatical Evolution (GE) using a grammar that we automatically specialize---starting from a general one---for the output of the LLM. We test 25 different problems and 5 different LLMs, showing that the proposed method is able to improve in a statistically significant way the code generated by LLMs. This is a first step in showing that the combination of LLMs and evolutionary techniques can be a fruitful avenue of research.

15:10
DALex: Lexicase-like Selection via Diverse Aggregation

ABSTRACT. Lexicase selection has been shown to provide advantages over other selection algorithms in several areas of evolutionary computation and machine learning. In its standard form, lexicase selection filters a population or other collection on the basis of randomly ordered training cases that are considered one at a time. This iterated filtering process can be time-consuming, particularly in settings with large numbers of training cases. Such settings include many symbolic regression and deep learning applications. In this paper, we propose a new method that is nearly equivalent to lexicase selection in terms of the individuals that it selects, but which does so significantly more quickly. The new method, called DALex (for Diversely Aggregated Lexicase), selects the best individual with respect to a weighted sum of training case errors, where the weights are sampled from a gaussian distribution. This allows us to formulate the core computation required for selection as matrix multiplication instead of recursive loops of comparisons, which in turn allows us to take advantage of optimized and parallel algorithms designed for matrix multiplication for speedup. Furthermore, we show that we can interpolate between lexicase-like behavior and the behavior of "relaxed" variants of lexicase selection, such as epsilon lexicase selection and batch lexicase selection, by adjusting a single hyperparameter, named "particularity pressure," which represents the importance granted to each individual training case. Results on program synthesis, deep learning, symbolic regression, and learning classifier systems demonstrate that DALex achieves significant speedups over lexicase selection and its relaxed variants while maintaining almost identical problem-solving performance. Under a fixed computational budget, these savings free up resources that can be directed towards increasing population size or the number of generations, enabling the potential for solving more difficult problems.

16:20-18:00 Session 11A: EvoCOP Best Paper nominations
Location: Room A
16:20
Hardest Monotone Functions for Evolutionary Algorithms

ABSTRACT. The study of hardest and easiest fitness landscapes is an active area of research. Recently, Kaufmann, Larcher, Lengler and Zou conjectured that for the self-adjusting (1, λ)-EA, Adversarial Dynamic BinVal (ADBV) is the hardest function to optimize. We introduce the function Switching Dynamic BinVal (SDBV) which coincides with ADBV whenever the number of remaining zeros in the search point is strictly less than n/2, where n denotes the dimension of the search space. We show, using a combinatorial argument, that for the (1 + 1)-EA, SDBV is drift-minimizing among the class of dynamic monotone functions. Our construction provides the first explicit example of an instance of the partially-ordered evolutionary algorithm (PO-EA) model with param- eterized pessimism introduced by Colin, Doerr and Férey, building on work of Jansen. We further show that the (1 + 1)-EA optimizes SDBV in Θ(n^3/2) generations. Our simulations demonstrate matching runtimes for both static and self-adjusting (1, λ) and (1 + λ)-EA. We further show, using an example of fixed dimension, that drift-minimization does not equal maximal runtime.

16:45
A Neural Network Based Guidance for a BRKGA: An Application to the Longest Common Square Subsequence Problem
PRESENTER: Jaume Reixach

ABSTRACT. In this work we apply machine learning to better guide a biased random key genetic algorithm (BRKGA) for the longest common square subsequence (LCSqS) problem. The problem is a variant of the well-known longest common subsequence (LCS) problem in which valid solutions are square strings. A string is square if it can be expressed as the concatenation of a string with itself. The original BRKGA is based on a reduction of the LCSqS problem to the LCS problem by cutting each input string into two parts. Our work consists in enhancing the search process of BRKGA for good cut points by using a machine learning approach, which is trained to produce promising cut points for the input strings of a problem instance. In this study, we show the benefits of this approach by comparing the enhanced BRKGA with the original BRKGA, using two benchmark sets from the literature. We show that the results of the enhanced BRKGA significantly improve over the original results, especially when tackling instances with non-uniformly generated input strings.

17:10
Sparse Surrogate Model for Optimization: Example of the Bus Stops Spacing Problem

ABSTRACT. Combinatorial optimization problems can involve computationaly expensive fitness function, making their resolution challenging. Surrogate models are one of the effective techniques used to solve such black-box problems by guiding the search towards potentially good solutions. In this paper, we focus on the use of surrogate based on multinomial approaches, particularly based on Walsh functions, to tackle pseudo-Boolean problems. Although this approach can be effective, a potential drawback is the growth of the polynomial expansion with problem dimension. We introduce a method for analyzing real-world combinatorial black-box problems defined through numerical simulation. This method combines Walsh spectral analysis and polynomial regression. Consequently, we propose a sparse surrogate model that incorporates selected, relevant terms and is simpler to optimize. To demonstrate our approach, we apply it to the bus stop spacing problem, an exemplary combinatorial pseudo-Boolean challenge.

16:20-18:00 Session 11B: EML 2 - EC for EML enhancement
Location: Room B
16:20
Cultivating Diversity: A Comparison of Diversity Objectives in Neuroevolution

ABSTRACT. Inspired by biological evolution’s ability to produce complex and intelligent beings, neuroevolution utilizes evolutionary algorithms for optimizing the connection weights and structure of artificial neural networks. With evolutionary algorithms often failing to produce the same level of diversity as biological evolution, explicitly encouraging diversity with additional optimization objectives has emerged as a successful approach. However, there is a lack of knowledge regarding the performance of different types of diversity objectives on problems with different characteristics. In this paper, we perform a systematic comparison between objectives related to structural diversity, behavioral diversity, and our newly proposed representational diversity. We explore these objectives' effects on problems with different levels of modularity, regularity, deceptiveness and discreteness and find clear relationships between problem characteristics and the effect of different diversity objectives -- suggesting that there is much to be gained from adapting diversity objectives to the specific problem being solved.

16:45
A Hierarchical Dissimilarity Metric for Automated Machine Learning Pipelines, and Visualizing Search Behaviour

ABSTRACT. In this study, the challenge of developing a dissimilarity metric for machine learning pipeline optimization is addressed. Traditional approaches, limited by simplified operator sets and pipeline structures, fail to address the full complexity of this task. Two novel metrics are proposed for measuring structural, and hyperparameter, dissimilarity in the decision space. A hierarchical approach is employed to integrate these metrics, prioritizing structural over hyperparameter differences. The Tree-based Pipeline Optimization Tool (TPOT) is utilized as the primary automated machine learning framework, applied on the abalone dataset. Novel visual representations of TPOT's search dynamics are also proposed, providing some deeper insights into its behaviour and evolutionary trajectories, under different search conditions. The effects of altering the population selection mechanism and reducing population size are explored, highlighting the enhanced understanding these methods provide in automated machine learning pipeline optimization.

17:10
DeepEMO: A Multi-Indicator Convolutional Neural Network-based Evolutionary Multi-Objective Algorithm

ABSTRACT. Quality Indicators (QIs) have been used in numerous Evolutionary Multi-objective Optimization Algorithms (EMOAs) as selection mechanisms within the evolutionary process. Because each QI prefers specific point-distribution properties, an Indicator-based EMOA (IB-EMOA) that uses a single QI has an intrinsically limited scope of problems it can solve accurately. To overcome the issues that IB-EMOAs have, we present the first results of a new general multi-indicator-based multi-objective evolutionary algorithm, denoted as DeepEMO. It uses a Convolutional Neural Network (CNN) as a hyper-heuristic to choose, depending on the Pareto-front geometry, the appropriate indicator-based selection mechanism at each generation of the evolutionary process. We employ state-of-the-art benchmark problems with different Pareto front geometries to test our approach. Our experimental results show that DeepEMO obtains competitive performance across multiple QIs. This is because the CNN is employed to classify the geometry of the point cloud that approximates the Pareto front. Hence, DeepEMO compensates for the weaknesses of a single QI with the strengths of others, showing that its performance is invariant to the Pareto front geometry.

17:35
Improving Generalization of Evolutionary Feature Construction with Minimal Complexity Knee Points in Regression

ABSTRACT. Genetic programming-based evolutionary feature construction is a widely used technique for automatically enhancing the performance of a regression algorithm. While it has achieved great success, a challenging problem in feature construction is the issue of overfitting, which has led to the development of many multi-objective methods to control overfitting. However, for multi-objective methods, a key issue is how to select the final model from the front with different trade-offs. To address this challenge, in this paper, we propose a novel minimal complexity knee point selection strategy in evolutionary multi-objective feature construction for regression to select the final model for making predictions. Experimental results on 58 datasets demonstrate the effectiveness and competitiveness of this strategy when compared to eight existing methods. Furthermore, an ensemble of the proposed strategy and existing model selection strategies achieves the best performance and outperforms four popular machine learning algorithms.

16:20-18:00 Session 11C: EvoApplications 7 - Analysis of EC Methods: Theory, Empirics, and Real-World Applications
Location: Room C
16:20
On the Potential of Multi-Objective Automated Algorithm Configuration on Multi-Modal Multi-Objective Optimization Problems

ABSTRACT. The complexity of Multi-Objective (MO) continuous optimization problems arises from a combination of different characteristics, such as the level of multi-modality. Earlier studies revealed that there is a conflict between solver convergence in objective space and solution set diversity in the decision space, which is especially important in the multi-modal setting. We build on top of this observation and investigate this trade-off in a multi-objective manner by using multi-objective automated algorithm configuration (MO-AAC) on evolutionary multi-objective algorithms (EMOA). Our results show that MO-AAC is able to find configurations that outperform the default configuration as well as configurations found by single-objective AAC in regards to both criteria, leading to new recommendations for high-performing default settings.

16:45
A Simple Statistical Test Against Origin-Biased~Metaheuristics

ABSTRACT. One of the strong points of evolutionary algorithms and other similar metaheuristics is their robustness, which means that their performance is consistent across large varieties of problem settings. In particular, such algorithms avoid preferring one solution to another unless the optimized function gives enough reasons for doing that. This property is formally captured as invariance with regards to certain transformations of the search space and the problem definition, such as translation or rotation.

The lack of some basic invariance properties in some recently proposed "nature-inspired" algorithms, together with the deliberate misuse of commonly used benchmark functions, can present them as excellent optimizers, which they are not. One particular class of such algorithms, origin-based metaheuristics, are good at finding an optimum at the origin and are much worse for any other purpose.

This paper presents a statistical testing procedure which can help to reveal such algorithms and to illustrate the negative aspects of their behavior. A case study involving 15 different algorithms shows that this test successfully detects most origin-biased algorithms.

17:10
On the Utility of Probing Trajectories for Algorithm-Selection

ABSTRACT. Machine-learning approaches to algorithm-selection typically take data describing an instance as input. Input data can take the form of features derived from the instance description or fitness landscape, or can be a direct representation of the instance itself, i.e. an image or textual description. Regardless of the choice of input, there is an implicit assumption that instances that are similar will elicit similar performance from algorithm, and that a model is capable of learning this relationship. We argue that viewing algorithm-selection purely from an instance perspective can be misleading as it fails to account for how an algorithm `views' similarity between instances. We propose a novel `algorithm-centric' method for describing instances that can be used to train models for algorithm-selection: specifically, we use short probing trajectories calculated by applying a solver to an instance for a very short period of time. The approach is demonstrated to be promising, providing comparable or better results to computationally expensive landscape-based feature-based approaches. Furthermore, projecting the trajectories into a 2-dimensional space illustrates that functions that are similar from an algorithm-perspective do not necessarily correspond to the accepted categorisation of these functions from a human perspective.

17:35
Evolving Artificial Neural Networks for Simulating and Analyzing Fish Social Interactions

ABSTRACT. Can we use computational modeling to infer whether fish can remember and/or anticipate each other's movements? What minimum of temporal input and internal complexity is sufficient to model a specific fish, or to produce generally "fish-like" behavior? Agent-based modeling to emulate biological behavior has been used to great effect, both in real-world and simulated experiments. We present feedforward neural network architectures for simulating fish social interactions, evolved using evolution strategies in two different experiments. Evolution of the temporal input of the partner fish's position when testing models on labeled data uncovers anticipation and/or memory capacities used by a focal fish. When testing via a general discriminator for fish-like trajectories, the right neural network architecture and temporal input are shown to be a necessary, but insufficient condition for highly lifelike simulations. Lifelike simulations for some datasets are possible as simple functions of the input, showing variability in the complexity of individual fish's social behaviors.