ReVEL Integrates Multi-Turn LLM Reflection With Performance-Profile Grouping for Heuristic Evolution
ReVEL advances autonomous heuristic discovery by coupling structured performance-profile grouping with multi-turn LLM reflection inside an evolutionary loop, yielding more robust solutions than one-shot synthesis on combinatorial benchmarks.
ReVEL embeds large language models as interactive multi-turn reasoners inside evolutionary algorithms to automate heuristic design for NP-hard combinatorial optimization problems. The paper (arXiv:2604.04940) specifies two mechanisms: performance-profile grouping, which clusters heuristics by their behavioral signatures, and multi-turn feedback-driven reflection, in which the LLM analyzes each group's behavior to propose refinements. An EA meta-controller integrates the two, balancing exploration and exploitation. Experiments on standard benchmarks produced heuristics with statistically significant robustness gains over one-shot baselines.
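The paper does not publish reference code, but the grouping mechanism can be sketched as clustering heuristics by coarse per-instance performance vectors rather than mean scores. All names here (`behavior_signature`, `group_heuristics`, `evaluate`) are illustrative assumptions, not the authors' API:

```python
from collections import defaultdict

def behavior_signature(scores, n_bins=3):
    """Discretize a heuristic's per-instance scores into a coarse profile.

    Two heuristics share a signature when they succeed or fail on the same
    kinds of instances, even if their mean scores differ.
    """
    lo, hi = min(scores), max(scores)
    span = (hi - lo) or 1.0  # avoid division by zero for flat profiles
    return tuple(int((s - lo) / span * (n_bins - 1e-9)) for s in scores)

def group_heuristics(population, instances, evaluate):
    """Cluster heuristics by behavioral signature over a shared instance set."""
    groups = defaultdict(list)
    for h in population:
        scores = [evaluate(h, inst) for inst in instances]
        groups[behavior_signature(scores)].append((h, scores))
    return groups

# Toy usage: two monotonically increasing "heuristics" land in one group,
# a decreasing one lands in another, regardless of absolute score levels.
groups = group_heuristics(
    population=[lambda x: x, lambda x: x + 0.01, lambda x: -x],
    instances=[1.0, 2.0, 3.0],
    evaluate=lambda h, inst: h(inst),
)
```

The key design point this illustrates is that the signature is a behavioral shape, so the LLM receives a handful of group summaries instead of a flat list of scalar fitness values.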
ReVEL synthesizes concepts from Reflexion (arXiv:2303.11366), which demonstrated LLMs improving via verbal self-critique on trial outcomes, and FunSearch (arXiv:2311.06101), which paired LLMs with evaluators to evolve programs for cap set and bin-packing tasks. Unlike those works, ReVEL operates on grouped behavioral profiles rather than isolated scalar scores or single programs, supplying the LLM with compact yet semantically rich signals that prior one-shot code synthesis approaches (cited in the ReVEL paper) lacked. Earlier LLM-for-optimization efforts relied largely on unstructured scalar feedback, which limits a model's ability to detect patterns across a candidate population.
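The multi-turn reflection step can likewise be sketched as a loop that feeds group summaries and an accumulating transcript back to the model each turn. The LLM call is mocked here, and every name (`summarize_group`, `reflect_loop`, `propose_refinement`) is a hypothetical stand-in rather than the paper's interface:

```python
def summarize_group(signature, members):
    """Compact textual summary of one behavioral group for the LLM prompt."""
    mean = sum(sum(s) / len(s) for _, s in members) / len(members)
    return f"group {signature}: {len(members)} heuristics, mean score {mean:.2f}"

def reflect_loop(groups, propose_refinement, n_turns=3):
    """Multi-turn reflection: each turn sees all group summaries plus the
    transcript of earlier turns, so later critiques can build on prior ones."""
    transcript, proposals = [], []
    for turn in range(n_turns):
        prompt = "\n".join(summarize_group(sig, m) for sig, m in groups.items())
        prompt += "\n" + "\n".join(transcript)
        proposal = propose_refinement(prompt, turn)  # real system: LLM API call
        transcript.append(f"turn {turn}: proposed {proposal}")
        proposals.append(proposal)
    return proposals

# Toy usage with a mocked model that just labels each turn's proposal.
toy_groups = {(0, 1): [("h1", [0.2, 0.8])], (1, 0): [("h2", [0.9, 0.1])]}
props = reflect_loop(toy_groups, lambda prompt, turn: f"variant-{turn}", n_turns=2)
```

In ReVEL's framing, the proposals returned here would be handed to the EA meta-controller, which decides whether to exploit a strong group or explore a divergent one.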
The resulting heuristics exhibit greater diversity and transferability than those from non-reflective baselines, indicating a pathway toward autonomous algorithm discovery in which LLMs iteratively refine not only implementations but the search strategies themselves. The primary source reports consistent outperformance across its experiments; the Reflexion and FunSearch results supply the empirical context that ReVEL's grouping and meta-controller mechanisms build on.
AXIOM: ReVEL shows that grouping heuristics by behavioral profiles before reflective feedback lets LLMs identify higher-order improvement patterns, accelerating autonomous discovery of optimization algorithms beyond what isolated trial-and-error or one-shot generation can achieve.
Sources (3)
- [1] Primary Source (https://arxiv.org/abs/2604.04940)
- [2] Reflexion: Language Agents with Verbal Reinforcement Learning (https://arxiv.org/abs/2303.11366)
- [3] FunSearch: Mathematical discoveries from program search with large language models (https://arxiv.org/abs/2311.06101)