LLM-Assisted Crossover in Genetic Improvement of Software
This study explores the use of Large Language Models (LLMs) to improve the crossover process in genetic programming, as applied in the genetic improvement domain. Traditional crossover techniques typically combine parent variants by selecting modifications uniformly or randomly, without considering contextual relevance. This often results in inefficient searches and suboptimal solutions, because the combined modifications can be incompatible or redundant. In contrast, our LLM-assisted crossover leverages context to select and combine edits from parent solutions that are more likely to work well together, with the goal of producing higher-quality variants and accelerating optimization.
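The contrast between context-free and context-aware crossover can be illustrated with a minimal sketch. It is not MAGPIE's implementation or the authors' prompt: the representation of an edit as a string, the helper names (`pool_edits`, `build_prompt`, `parse_indices`), and the `ask_llm` callable standing in for any model client are all assumptions made for illustration.

```python
import random
import re
from typing import Callable, List, Sequence

Edit = str  # assumption: each edit is a short textual description of a change


def pool_edits(parent_a: Sequence[Edit], parent_b: Sequence[Edit]) -> List[Edit]:
    """Concatenate both parents' edits, dropping exact duplicates, keeping order."""
    seen = set()
    pool = []
    for edit in list(parent_a) + list(parent_b):
        if edit not in seen:
            seen.add(edit)
            pool.append(edit)
    return pool


def uniform_crossover(parent_a, parent_b, rng: random.Random) -> List[Edit]:
    """Context-free baseline: keep each pooled edit with probability 0.5."""
    return [e for e in pool_edits(parent_a, parent_b) if rng.random() < 0.5]


def build_prompt(source: str, pool: Sequence[Edit]) -> str:
    """Hypothetical prompt: show the code and the numbered candidate edits."""
    listing = "\n".join(f"{i}: {e}" for i, e in enumerate(pool))
    return (
        "Program under improvement:\n" + source
        + "\n\nCandidate edits:\n" + listing
        + "\n\nReply with the indices of edits that should be combined."
    )


def parse_indices(reply: str, size: int) -> List[int]:
    """Extract valid, de-duplicated edit indices from a free-text reply."""
    picked = {int(tok) for tok in re.findall(r"\d+", reply)}
    return sorted(i for i in picked if i < size)


def llm_assisted_crossover(
    parent_a: Sequence[Edit],
    parent_b: Sequence[Edit],
    source: str,
    ask_llm: Callable[[str], str],  # assumption: any function mapping prompt to reply
    rng: random.Random,
) -> List[Edit]:
    """Let the model choose which pooled edits to combine; fall back to the
    uniform baseline when the reply contains no usable index."""
    pool = pool_edits(parent_a, parent_b)
    chosen = parse_indices(ask_llm(build_prompt(source, pool)), len(pool))
    if not chosen:
        return uniform_crossover(parent_a, parent_b, rng)
    return [pool[i] for i in chosen]
```

Passing the model as a plain callable keeps the sketch independent of any particular LLM client, and the fallback ensures the search still produces a child when the reply cannot be parsed.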
We implemented this approach within MAGPIE, a unified genetic improvement framework. We evaluated it against five traditional crossover methods across seven benchmarks, measuring performance on four key metrics: average ranking, best variant execution time, efficiency in reaching performance milestones, and viable variant count. Results show that LLM-assisted crossover achieved an average ranking of 2.27 (on a scale where 1 is best and 6 is worst), making it the top-performing method across benchmarks based on the quality of the best variants produced. The LLM-based approach also improved the fitness (execution time) by an average of 8.5% over the best variant produced by the traditional methods. In terms of efficiency, the LLM-assisted crossover required on average 25.6% fewer variants to reach 25%, 50%, 75%, and 100% of the final performance improvement compared to the traditional methods. Additionally, the LLM-assisted crossover produced 4.8% more viable variants across scenarios, including both source code modification and parameter tuning cases.
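One plausible reading of the milestone metric is sketched below. The abstract does not give the exact definition, so this is an assumption: variants are taken in evaluation order, fitness is execution time (lower is better), non-viable variants are recorded as `None`, and a milestone is reached when the best time so far covers the given fraction of the run's final improvement over the baseline. The function name `variants_to_milestones` is hypothetical.

```python
from typing import Dict, Optional, Sequence


def variants_to_milestones(
    baseline: float,
    times: Sequence[Optional[float]],
    fractions: Sequence[float] = (0.25, 0.5, 0.75, 1.0),
) -> Dict[float, Optional[int]]:
    """Return, for each fraction, how many variants had been evaluated when the
    best-so-far time first covered that fraction of the final improvement."""
    viable = [t for t in times if t is not None]
    if not viable or min(viable) >= baseline:
        # No improvement over the baseline: no milestone is ever reached.
        return {f: None for f in fractions}

    total_gain = baseline - min(viable)
    reached: Dict[float, int] = {}
    best = baseline
    for count, t in enumerate(times, start=1):
        if t is not None and t < best:
            best = t
        for f in fractions:
            # Small tolerance guards against floating-point rounding.
            if f not in reached and baseline - best >= f * total_gain - 1e-12:
                reached[f] = count
    return {f: reached.get(f) for f in fractions}


# Example run: baseline of 10.0 s, five variants, one of them non-viable.
milestones = variants_to_milestones(10.0, [9.5, None, 8.0, 9.0, 6.0])
```

In this example the final improvement is 4.0 s; the third variant already covers half of it and the fifth covers all of it, so `milestones` is `{0.25: 3, 0.5: 3, 0.75: 5, 1.0: 5}`. Comparing such counts between crossover methods gives a "fewer variants to reach a milestone" figure of the kind reported above.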
These findings suggest that LLMs can significantly enhance genetic improvement by guiding the crossover process toward more effective and viable solutions, providing motivation for further research in LLM-assisted search algorithms.
Sun 27 Apr (displayed time zone: Eastern Time, US & Canada)
16:00 - 17:30 | Afternoon Session 2 | GI | at 202 | Chair(s): Thanatad Songpetchmongkol (University College London), Oliver Krauss (University of Applied Sciences Upper Austria)
16:00 | 30m | Talk | The gem5 C++ glibc Heap Fitness Landscape | GI
16:30 | 30m | Talk | LLM-Assisted Crossover in Genetic Improvement of Software | GI | Dimitrios Stamatios Bouras (Peking University), Justyna Petke (University College London), Sergey Mechtaev (Peking University)
17:00 | 25m | Meeting | Discussion | GI | Aymeric Blot (University of Rennes, IRISA / INRIA), Oliver Krauss (University of Applied Sciences Upper Austria)
17:25 | 5m | Awards | Awards and Closing | GI