
How to do online feedback optimization when your control actions change the randomness

June 18, 2026 | arXiv: 2606.19284v1

What is this paper about? The authors study online feedback optimization (OFO) for systems where the distribution of random parameters depends on the control actions. In other words, choosing a control input can change the way uncertainty behaves. This situation appears in engineering systems such as power grids with price-responsive devices. The paper develops and analyzes a measurement-based algorithm that can operate without knowing the full system model or the exact distributional map.

What the researchers did. They extend a projected primal–dual algorithm to the stochastic, decision-dependent setting. The algorithm uses real-time output measurements to build a gradient-like update, then projects the primal iterate onto the allowed input set and the dual iterate onto a surrogate set for the dual variables. Because the true dual constraint sets are not assumed known, the method works with surrogate dual sets H(n). The authors study the iterates relative to the sequence of "performatively stable" saddle points (fixed points that are optimal for the distribution they induce) rather than the saddle points of a fixed, known distribution.
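A toy illustration of performative stability. The sketch below is not taken from the paper; it assumes a hypothetical scalar problem with cost 0.5*y^2 - y*w and a hypothetical distributional map whose mean is E[w | x] = a + eps*x. For a fixed distribution the minimizer is y = E[w], so a performatively stable point is a fixed point of "deploy x, observe the induced distribution, re-optimize". The names `mean_of_w`, `best_response`, and the values of `a` and `eps` are illustrative assumptions.

```python
def mean_of_w(x, a=1.0, eps=0.3):
    """Hypothetical distributional map: the mean of w shifts linearly with x."""
    return a + eps * x


def best_response(mean_w):
    """Minimizer over y of E[0.5*y**2 - y*w] for a FIXED distribution of w.

    Setting the derivative y - E[w] to zero gives y = E[w].
    """
    return mean_w


def performatively_stable_point(x0=0.0, iters=100):
    """Repeatedly re-optimize against the distribution induced by the last decision."""
    x = x0
    for _ in range(iters):
        x = best_response(mean_of_w(x))
    return x


x_stable = performatively_stable_point()
# At the fixed point, x is optimal for the distribution that x itself induces.
print(x_stable, best_response(mean_of_w(x_stable)))
```

In this toy the iteration contracts because |eps| < 1, and the fixed point solves x = a + eps*x, that is, x = a / (1 - eps).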

How the method works at a high level. At each time step the controller measures the system output, forms a measurement-based approximation of the gradient of a regularized Lagrangian, and takes a gradient step: descent in the input variables and ascent in the dual variables. The step is followed by a projection onto the allowed input set and onto the surrogate dual set. The analysis accounts for four sources of error that affect tracking of the performatively stable points: (i) the intrinsic stochasticity of the problem, (ii) errors in the output measurements used to form gradients, (iii) time variation in the underlying problem, and (iv) any mismatch between the surrogate dual sets and the true dual sets.
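A minimal sketch of one such update. The steps described above can be illustrated on a scalar toy problem; this is a sketch under stated assumptions, not the authors' algorithm. Everything specific here is hypothetical: the plant y = c*x + w with disturbance w ~ N(eps*x, sigma^2), the cost 0.5*(x - x_ref)^2, the output constraint y <= y_max, the dual regularization weight `d`, the assumed-known sensitivity `c`, and the surrogate dual set [0, lam_max].

```python
import random


def clip(v, lo, hi):
    """Euclidean projection of a scalar onto the interval [lo, hi]."""
    return max(lo, min(hi, v))


def measure_output(x, rng, c=1.0, eps=0.2, sigma=0.1):
    """Hypothetical plant: y = c*x + w, where w ~ N(eps*x, sigma^2).

    The mean of the disturbance shifts with the input x, which is the
    decision-dependent randomness the text describes.
    """
    return c * x + rng.gauss(eps * x, sigma)


def ofo_step(x, lam, y_meas, alpha=0.05, c=1.0, x_ref=2.0, y_max=1.0,
             d=0.01, x_box=(0.0, 2.0), lam_max=5.0):
    """One projected primal-dual step on the regularized Lagrangian
    L(x, lam) = 0.5*(x - x_ref)**2 + lam*(y - y_max) - 0.5*d*lam**2,
    with the measured output y_meas used in place of a model of y.
    """
    grad_x = (x - x_ref) + c * lam         # c: assumed-known input-output sensitivity
    grad_lam = (y_meas - y_max) - d * lam  # built from the measurement, not a model
    x_new = clip(x - alpha * grad_x, *x_box)              # descent, then project onto input set
    lam_new = clip(lam + alpha * grad_lam, 0.0, lam_max)  # ascent, then project onto surrogate dual set
    return x_new, lam_new


def run(n_steps=3000, seed=0):
    """Closed loop: measure, update, repeat. Returns the tail average of x and the last lam."""
    rng = random.Random(seed)
    x, lam = 0.0, 0.0
    xs = []
    for _ in range(n_steps):
        y = measure_output(x, rng)
        x, lam = ofo_step(x, lam, y)
        xs.append(x)
    tail = xs[-500:]
    return sum(tail) / len(tail), lam


x_avg, lam_last = run()
print(x_avg, lam_last)
```

In this toy the iterates hover around the point where x - x_ref + c*lam = 0 and (c + eps)*x - y_max - d*lam = 0 hold on average. That point is stable for the distribution it induces, and the residual fluctuation around it comes from the measurement noise and the step size.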

Main theoretical result and why it matters. The paper provides an upper bound on the long-run mean-square tracking error (a steady-state bound on E[||z_n − z^P_n||^2]). That bound decomposes into interpretable terms tied to the four error sources above. The bound depends on the algorithm step size (often denoted α) and shows how each error source contributes to steady-state error. Special cases are spelled out: if the randomness does not change over time, then the stochasticity term drops out; if the surrogate dual sets contain the true dual variables, then the surrogate-mismatch term drops out; and if measurement error, time variation, and surrogate mismatch are all zero, the bound reduces to a term proportional to the step size, consistent with standard stochastic gradient descent theory.
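Why a bound proportional to the step size is the expected baseline. The following is a textbook-style toy, not the paper's bound. It assumes plain stochastic gradient descent on the scalar quadratic 0.5*mu*x^2 with zero-mean gradient noise of variance sigma2, independent of the iterate. The mean-square error then obeys the exact recursion e_{n+1} = (1 - alpha*mu)^2 * e_n + alpha^2 * sigma2, whose fixed point is alpha*sigma2 / (mu*(2 - alpha*mu)), roughly alpha*sigma2 / (2*mu) for small alpha. The values of `mu` and `sigma2` are illustrative.

```python
def steady_state_mse(alpha, mu=1.0, sigma2=1.0, iters=10_000, e0=1.0):
    """Iterate the exact mean-square-error recursion of noisy SGD on 0.5*mu*x**2.

    x_{n+1} = x_n - alpha*(mu*x_n + w_n) with E[w_n] = 0 and Var[w_n] = sigma2
    gives e_{n+1} = (1 - alpha*mu)**2 * e_n + alpha**2 * sigma2.
    """
    e = e0
    for _ in range(iters):
        e = (1.0 - alpha * mu) ** 2 * e + alpha ** 2 * sigma2
    return e


def closed_form_mse(alpha, mu=1.0, sigma2=1.0):
    """Fixed point of the recursion above (valid for 0 < alpha*mu < 2)."""
    return alpha * sigma2 / (mu * (2.0 - alpha * mu))


for alpha in (0.01, 0.02, 0.04):
    print(alpha, steady_state_mse(alpha), closed_form_mse(alpha))
```

Doubling the step size roughly doubles the steady-state error in this toy. This is the scaling the paper's special case recovers once measurement error, time variation, and surrogate mismatch are removed.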