Learning unknown nonlinear dynamics safely by combining a simple linear model with online Gaussian process learning
This paper presents a way for a controller to learn the unknown nonlinear parts of a system while, with high probability, never leaving a pre‑specified safe region. The authors assume you start with only a stabilizable linear approximation of the process. The unknown nonlinear remainder is modeled online with a Gaussian process (GP), a flexible statistical model that provides both a best estimate and an uncertainty for the unknown function. Safety is enforced by a Lyapunov-based, probabilistic invariant set, so the closed‑loop system stays stable with high probability during learning.
Concretely, the residual nonlinear dynamics g(x) are learned with a GP whose posterior gives a mean µ(x) and a standard deviation σ(x). The method uses a high‑confidence envelope µ ± βσ (β chosen from GP‑UCB style schedules and calibrated in practice) to bound the possible true residual. The authors plug that bound into a Lyapunov condition built from a quadratic Lyapunov function V(x)=x^T P x for the known linear part. Points that admit a control input u making the Lyapunov condition hold despite the worst‑case GP error form a probabilistic control‑invariant set (PCIS). By construction, trajectories starting inside this set remain inside it with probability at least 1−δ, where δ is a user‑chosen risk level.
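The two ingredients above — a GP posterior over the residual and a worst-case Lyapunov decrease test — can be sketched in a few lines. The following is a minimal illustration, not the paper's implementation: it assumes a scalar system x_dot = a·x + b·u + g(x) with V(x) = p·x², an RBF-kernel GP fit by direct linear solves, and hypothetical parameter values (a, b, p, β, the decay rate c, and the toy residual g(x) = 0.2·sin(x) are all choices made here for illustration).

```python
import numpy as np

# Illustrative sketch (not the paper's code): a scalar system
#   x_dot = a*x + b*u + g(x),  with V(x) = p*x**2,
# where g is the unknown residual learned by an RBF-kernel GP.

def gp_posterior(X, y, xq, length=0.5, sig_f=1.0, sig_n=1e-2):
    """Posterior mean and std of g at query point xq from data (X, y)."""
    k = lambda a, b: sig_f**2 * np.exp(-0.5 * (a[:, None] - b[None, :])**2 / length**2)
    K = k(X, X) + sig_n**2 * np.eye(len(X))   # kernel matrix with noise
    ks = k(np.array([xq]), X)[0]              # cross-covariances to the query
    mu = ks @ np.linalg.solve(K, y)
    var = sig_f**2 - ks @ np.linalg.solve(K, ks)
    return mu, np.sqrt(max(var, 0.0))

def lyapunov_safe(x, u, a=-1.0, b=1.0, p=1.0, mu=0.0, sigma=0.0, beta=2.0, c=0.1):
    """True iff V_dot <= -c*V for every residual in [mu - beta*sigma, mu + beta*sigma]."""
    grad_V = 2.0 * p * x
    # Worst case: the error bound beta*sigma always pushes V_dot upward.
    v_dot_worst = grad_V * (a * x + b * u + mu) + abs(grad_V) * beta * sigma
    return v_dot_worst <= -c * p * x**2

# Toy data: residual g(x) = 0.2*sin(x) observed at a few points.
X = np.linspace(-2, 2, 15)
y = 0.2 * np.sin(X)
mu, sigma = gp_posterior(X, y, xq=1.0)
print(lyapunov_safe(x=1.0, u=0.0, mu=mu, sigma=sigma))
```

States where some admissible u makes `lyapunov_safe` return True are exactly the candidates for membership in the probabilistic invariant set; as more data shrink σ, the worst-case term shrinks and more states pass the test.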
At each time step the controller solves a convex quadratic program (QP) to choose a control input. The QP trades staying close to the nominal stabilizing controller against exciting the system to gather informative data for the GP. To keep the QP convex and tractable, the authors linearize the variance‑seeking objective and use the current GP variance as a constant weight. The framework gives finite‑sample, high‑probability safety guarantees and lets the safe set expand adaptively when the GP uncertainty shrinks, enabling progressively more ambitious exploration.
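Because the variance-seeking objective is linearized with the current GP variance as a constant weight, the per-step problem stays a convex QP. In the scalar setting it reduces to projecting an unconstrained optimum onto an interval, which the hedged sketch below exploits; the system parameters, the box bounds, and the choice w = σ² are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

# Hedged sketch of the per-step QP for the scalar system
#   x_dot = a*x + b*u + g(x),  V(x) = p*x**2.
# Exploration is linearized: the current GP variance sigma**2 enters as a
# constant weight w on u (an assumption made here for illustration).

def safe_qp_step(x, u_nom, mu, sigma, a=-1.0, b=1.0, p=1.0,
                 beta=2.0, c=0.1, w=None, u_box=(-3.0, 3.0)):
    """min_u (u - u_nom)**2 - w*u  s.t. worst-case Lyapunov decrease and box bounds.

    In 1-D the QP solution is the unconstrained optimum projected onto the
    feasible interval implied by the (affine-in-u) Lyapunov constraint.
    """
    if w is None:
        w = sigma**2                      # constant exploration weight
    grad_V = 2.0 * p * x
    # Constraint: grad_V*(a*x + b*u + mu) + |grad_V|*beta*sigma <= -c*p*x**2
    rhs = -c * p * x**2 - grad_V * (a * x + mu) - abs(grad_V) * beta * sigma
    coef = grad_V * b                     # coefficient of u in the constraint
    lo, hi = u_box
    if coef > 0:
        hi = min(hi, rhs / coef)
    elif coef < 0:
        lo = max(lo, rhs / coef)
    elif rhs < 0:
        return None                       # no input can certify safety here
    if lo > hi:
        return None                       # infeasible: fall back to a safe backup controller
    u_star = u_nom + 0.5 * w              # unconstrained minimizer of the quadratic
    return float(np.clip(u_star, lo, hi))

u = safe_qp_step(x=1.0, u_nom=-0.5, mu=0.15, sigma=0.1)
print(u)
```

Returning None signals that the optimizer cannot certify safety at the current state, at which point a controller in this style would revert to the nominal stabilizing input inside the invariant set.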