BENCH@GECCO26 - Good Benchmarking Practices for Evolutionary Computation
Webpage: https://sites.google.com/view/benchmarking-network/home
Description
Benchmarking plays a vital role in understanding the performance and search behaviour of sampling-based optimization techniques such as evolutionary algorithms. This workshop continues our series on good benchmarking practices, held at different conferences in the context of EC since 2020. The core theme is benchmarking evolutionary computation methods and related sampling-based optimization heuristics, but the focus changes each year.
For GECCO 2026, our focus will be on “Challenges in benchmarking dynamic optimisation problems”.
Many problems in the real world are dynamic in some way. The decision maker could change their opinion, thus modifying the objective function(s). The availability of materials might change, thus varying the constraints/requirements. The environment might heat up, thus influencing the dynamics of a fluid simulation. Such changing environments can be observed between optimisation runs or even during a single run.
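To make this concrete, the following is a minimal sketch (our own toy example, not one of the established benchmarks) of a dynamic objective: a sphere function whose optimum drifts over time, so the quality of a fixed solution changes as the evaluation counter advances. The parameters `period` and `amplitude` are hypothetical choices for illustration.

```python
import numpy as np

def moving_sphere(x: np.ndarray, t: int, period: int = 50, amplitude: float = 2.0) -> float:
    """Sphere function whose optimum drifts sinusoidally over time.

    x is the candidate solution, t the evaluation (time) counter;
    period and amplitude are illustrative parameters controlling how
    fast and how far the optimum moves per dimension.
    """
    shift = amplitude * np.sin(2.0 * np.pi * t / period)  # current optimum location
    return float(np.sum((x - shift) ** 2))

# The same solution evaluated at two time steps yields different values:
x = np.zeros(3)
print(moving_sphere(x, t=0), moving_sphere(x, t=12))
```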
While some dynamic problems have been addressed in the past, and some benchmarks exist, only a few types of dynamics are covered, and only at relatively low complexity. This workshop therefore aims to create an overview of the current state of existing benchmark problems and evaluation measures, and to agree on future steps towards a consolidated effort for a common benchmark that better streamlines this area of research.
We will address the following questions:
- What are the conceptual differences between benchmarking in static and in dynamic environments?
- What trends and open issues emerge from the recent TEVC special issue on Evolutionary Dynamic Optimization? https://ieeexplore.ieee.org/xpl/mostRecentIssue.jsp?punumber=4235&isnumber=11199991
- Which types of dynamic problems are common in real-world applications and in research, and how are they benchmarked?
- Are performance measures from transfer learning (jumpstart, transfer ratio, time to threshold, etc.) applicable? See the sketch after this list.
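As a starting point for that last question, here is a minimal sketch of how these measures could be computed from two performance curves (best-so-far objective values per evaluation, assuming maximisation). The function names and the summed-curve form of the transfer ratio follow common usage in the transfer learning literature, but the concrete definitions below are our own illustrative choices.

```python
import numpy as np

def jumpstart(with_transfer: np.ndarray, baseline: np.ndarray) -> float:
    """Difference in initial performance between the two curves."""
    return float(with_transfer[0] - baseline[0])

def transfer_ratio(with_transfer: np.ndarray, baseline: np.ndarray) -> float:
    """Ratio of total (summed) performance with and without transfer."""
    return float(with_transfer.sum() / baseline.sum())

def time_to_threshold(curve: np.ndarray, threshold: float) -> int:
    """First evaluation index at which the curve reaches the threshold,
    or -1 if it never does."""
    hits = np.flatnonzero(curve >= threshold)
    return int(hits[0]) if hits.size else -1
```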
Submission format
No paper submissions for this workshop
Organizers
Mike Preuss is an associate professor at the Leiden Institute of Advanced Computer Science. He is most interested in using modern AI algorithms to solve practical problems, most notably in ChemAI (e.g., for retrosynthesis), but generally in contexts where human expertise and new AI methods meet. This encompasses LLMs and image/video generation tools and how they can be meaningfully integrated into human workflows. Partly automated Procedural Content Generation (PCG) has long been a well-known concept in game AI and profits greatly from these new developments. Recently, Mike has also become involved with quantum games (quantum versions of board games such as Checkers) and with drone research.
Mike received his PhD from TU Dortmund University, Germany, in 2013, under the supervision of Hans-Paul Schwefel, in Evolutionary Computation, namely on methods for complex multimodal optimization tasks with a view to real-world applications such as the design of ship propulsion engines. In the following years, he stayed with the Information Systems department of WWU Münster, Germany, before starting his current position at Leiden University, where he established the Game Research Lab, bringing together topics ranging from educational games for teaching AI to (very recently) training Deep Reinforcement Learning algorithms to play Pokemon Red in a fully automated fashion. He is always on the lookout for interesting new problems that can be solved by means of modern AI algorithms, in and outside of computer games.
Olaf Mersmann is a Professor of Computer Science at the Federal University of Applied Administrative Sciences in Germany; before that, he was a Professor of Data Science at TH Köln - University of Applied Sciences. He received his BSc, MSc, and PhD in Statistics from TU Dortmund. His research interests include applying statistical and machine learning methods to large benchmark databases to gain insight into the structure of the algorithm choice problem, the automated design of benchmark functions and benchmark function sets, and the use of these methods on real-world engineering problems.