A graph‑learning method that quickly repairs production plans after machine breakdowns
Real-world plans for production or transport can break when something unexpected happens. Recomputing an optimal plan from scratch is often too slow. Simple repairs are fast but tend to give poor results. This paper proposes a middle way: a learning-driven repair that quickly finds a good, feasible plan after a disruption such as a machine breakdown.
The authors build a “learning-to-reoptimize” method that combines the classic fix-and-optimize idea with a Graph Neural Network (GNN). They represent the problem instance, the original plan, and the disruption as a graph. The GNN is trained to predict which decision variables (in particular the binary setup variables) are likely to need changing. A solver then reoptimizes only that small subset of variables while all others stay fixed at their values in the original plan. The authors use a single fix-and-optimize iteration to keep the computation short and to control how much the new plan can differ from the original.
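The variable-selection step described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes the GNN has already produced one score per binary setup variable (higher meaning "more likely to change"), and it frees the top-scoring variables up to a hypothetical `budget` while fixing the rest at their original values. The names `select_free_variables` and `build_fixings`, the `(item, machine, period)` key, and all numbers are assumptions made for this example.

```python
from typing import Dict, Set, Tuple

# Hypothetical key for a binary setup variable: (item, machine, period).
Var = Tuple[str, str, int]


def select_free_variables(scores: Dict[Var, float], budget: int) -> Set[Var]:
    """Pick the `budget` variables with the highest predicted change score."""
    ranked = sorted(scores, key=lambda v: scores[v], reverse=True)
    return set(ranked[:budget])


def build_fixings(original_plan: Dict[Var, int], free: Set[Var]) -> Dict[Var, int]:
    """Fix every variable outside `free` at its value in the original plan."""
    return {v: val for v, val in original_plan.items() if v not in free}


# Made-up scores standing in for GNN output (assumption, not real data).
scores = {
    ("A", "M1", 0): 0.05,
    ("A", "M1", 1): 0.90,
    ("B", "M1", 0): 0.10,
    ("B", "M1", 1): 0.75,
}
# Setup decisions of the original plan (1 = machine set up for the item).
original_plan = {
    ("A", "M1", 0): 1,
    ("A", "M1", 1): 0,
    ("B", "M1", 0): 0,
    ("B", "M1", 1): 1,
}

free = select_free_variables(scores, budget=2)
fixings = build_fixings(original_plan, free)
# `fixings` would be handed to a MILP solver as equality constraints;
# only the variables in `free` stay open in the single reoptimization pass.
```

In this sketch the `budget` plays the role of the control knob mentioned in the text: a smaller value keeps the new plan closer to the original and the solve shorter, while a larger value gives the solver more room to change the plan.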
To test the idea, they focus on a Lot Sizing Problem: a production‑planning model with multiple items, multiple machines, limited machine time, setup costs and times, inventory, and lost‑sales penalties. The problem is written as a Mixed‑Integer Linear Program (MILP) with variables for production quantities, inventories, and lost sales, together with binary setup and carry‑over variables. The paper models machine breakdowns that can make a previously valid plan infeasible. Both the original (nominal) solution and the reoptimized solution are produced by a general‑purpose MILP solver; the GNN decides which variables the solver is allowed to change.
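To see how a breakdown can invalidate a plan, consider the machine-time constraint on its own. The sketch below is a simplified assumption rather than the paper's formulation: it ignores setup carry-over, inventory, and lost sales, and treats a breakdown as a reduction of the time available on one machine in one period. The function name `capacity_violations` and all data are hypothetical.

```python
from typing import Dict, List, Tuple

ProdKey = Tuple[str, str, int]  # (item, machine, period), hypothetical indexing
Slot = Tuple[str, int]          # (machine, period)


def capacity_violations(
    production: Dict[ProdKey, float],
    setups: Dict[ProdKey, int],
    proc_time: Dict[str, float],
    setup_time: Dict[str, float],
    capacity: Dict[Slot, float],
) -> List[Slot]:
    """Return the (machine, period) slots whose time usage exceeds capacity."""
    used: Dict[Slot, float] = {}
    # Processing time: per-unit time multiplied by the quantity produced.
    for (item, machine, period), qty in production.items():
        slot = (machine, period)
        used[slot] = used.get(slot, 0.0) + proc_time[item] * qty
    # Setup time: charged once for every active setup (no carry-over here).
    for (item, machine, period), active in setups.items():
        if active:
            slot = (machine, period)
            used[slot] = used.get(slot, 0.0) + setup_time[item]
    return sorted(s for s, u in used.items() if u > capacity.get(s, 0.0) + 1e-9)


# Illustrative plan: item A in period 0, item B in period 1, one machine.
production = {("A", "M1", 0): 40.0, ("B", "M1", 1): 30.0}
setups = {("A", "M1", 0): 1, ("B", "M1", 1): 1}
proc_time = {"A": 1.0, "B": 2.0}
setup_time = {"A": 10.0, "B": 5.0}

nominal_capacity = {("M1", 0): 60.0, ("M1", 1): 70.0}
nominal_violations = capacity_violations(
    production, setups, proc_time, setup_time, nominal_capacity
)

# A breakdown halves the time available on M1 in period 1.
broken_capacity = dict(nominal_capacity)
broken_capacity[("M1", 1)] = 35.0
breakdown_violations = capacity_violations(
    production, setups, proc_time, setup_time, broken_capacity
)
```

With these numbers the plan uses 50 and 65 time units in the two periods, so it fits the nominal capacities of 60 and 70 but no longer fits period 1 once the breakdown cuts that capacity to 35. A repair then has to move production, change setups, or accept lost sales.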
Why this matters: the approach sits between slow full reoptimization and low‑quality quick fixes. By guiding the solver to focus its effort where it matters, the method can produce better solutions in the very short time windows available in operations. The authors report numerical experiments on a large dataset that show the GNN‑aided method handles different instance sizes and yields larger cost reductions than a baseline reoptimization strategy when both are given the same limited time.