Adaptive Algorithmic Behavior for solving Mixed Integer Programs using Bandit Algorithms


State-of-the-art solvers for mixed integer programs (MIP) govern a variety of algorithmic components. Ideally, the solver adaptively learns to concentrate its computational budget on those components that perform well on a particular problem, especially if they are time consuming. In this work, we focus on three examples of such expensive algorithms, namely the classes of Large Neighborhood Search and diving heuristics as well as the available pricing strategies to the Simplex algorithm for solving LP relaxations. Typically, only one algorithm is run and evaluated at a time to update the overall selection strategy. We review several common strategies for such a selection scenario under uncertainty, also known as Multi Armed Bandit Problem. In order to apply those bandit strategies, we carefully design deterministic reward functions to rank and compare each individual heuristic or pricing algorithm within its respective group. We use simulations on publicly available MIP problems to calibrate the bandit strategies and show their individual learning curves. Finally, we discuss the computational benefits of using the proposed adaptive selection within the MIP solver SCIP.

Tokyo, Japan
Gregor Hendel
Senior Engineer