Optimization and Control (math.OC)
Wed, 23 Aug 2023
1.Solving Elliptic Optimal Control Problems using Physics Informed Neural Networks
Authors:Bangti Jin, Ramesh Sau, Luowei Yin, Zhi Zhou
Abstract: In this work, we present and analyze a numerical solver for optimal control problems (without / with box constraint) for linear and semilinear second-order elliptic problems. The approach is based on a coupled system derived from the first-order optimality system of the optimal control problem, and applies physics informed neural networks (PINNs) to solve the coupled system. We present an error analysis of the numerical scheme, and provide $L^2(\Omega)$ error bounds on the state, control and adjoint state in terms of deep neural network parameters (e.g., depth, width, and parameter bounds) and the number of sampling points in the domain and on the boundary. The main tools in the analysis include offset Rademacher complexity and boundedness and Lipschitz continuity of neural network functions. We present several numerical examples to illustrate the approach and compare it with three existing approaches.
2.Non-ergodic linear convergence property of the delayed gradient descent under the strongly convexity and the Polyak-Łojasiewicz condition
Authors:Hyung Jun Choi, Woocheol Choi, Jinmyoung Seok
Abstract: In this work, we establish the linear convergence estimate for the gradient descent involving the delay $\tau\in\mathbb{N}$ when the cost function is $\mu$-strongly convex and $L$-smooth. This result improves upon the well-known estimates in Arjevani et al. \cite{ASS} and Stich-Karmireddy \cite{SK} in the sense that it is non-ergodic and is still established in spite of weaker constraint of cost function. Also, the range of learning rate $\eta$ can be extended from $\eta\leq 1/(10L\tau)$ to $\eta\leq 1/(4L\tau)$ for $\tau =1$ and $\eta\leq 3/(10L\tau)$ for $\tau \geq 2$, where $L >0$ is the Lipschitz continuity constant of the gradient of cost function. In a further research, we show the linear convergence of cost function under the Polyak-{\L}ojasiewicz\,(PL) condition, for which the available choice of learning rate is further improved as $\eta\leq 9/(10L\tau)$ for the large delay $\tau$. Finally, some numerical experiments are provided in order to confirm the reliability of the analyzed results.
3.An Accelerated Block Proximal Framework with Adaptive Momentum for Nonconvex and Nonsmooth Optimization
Authors:Weifeng Yang, Wenwen Min
Abstract: We propose an accelerated block proximal linear framework with adaptive momentum (ABPL$^+$) for nonconvex and nonsmooth optimization. We analyze the potential causes of the extrapolation step failing in some algorithms, and resolve this issue by enhancing the comparison process that evaluates the trade-off between the proximal gradient step and the linear extrapolation step in our algorithm. Furthermore, we extends our algorithm to any scenario involving updating block variables with positive integers, allowing each cycle to randomly shuffle the update order of the variable blocks. Additionally, under mild assumptions, we prove that ABPL$^+$ can monotonically decrease the function value without strictly restricting the extrapolation parameters and step size, demonstrates the viability and effectiveness of updating these blocks in a random order, and we also more obviously and intuitively demonstrate that the derivative set of the sequence generated by our algorithm is a critical point set. Moreover, we demonstrate the global convergence as well as the linear and sublinear convergence rates of our algorithm by utilizing the Kurdyka-Lojasiewicz (K{\L}) condition. To enhance the effectiveness and flexibility of our algorithm, we also expand the study to the imprecise version of our algorithm and construct an adaptive extrapolation parameter strategy, which improving its overall performance. We apply our algorithm to multiple non-negative matrix factorization with the $\ell_0$ norm, nonnegative tensor decomposition with the $\ell_0$ norm, and perform extensive numerical experiments to validate its effectiveness and efficiency.
4.Data-driven decision-focused surrogate modeling
Authors:Rishabh Gupta, Qi Zhang
Abstract: We introduce the concept of decision-focused surrogate modeling for solving computationally challenging nonlinear optimization problems in real-time settings. The proposed data-driven framework seeks to learn a simpler, e.g. convex, surrogate optimization model that is trained to minimize the decision prediction error, which is defined as the difference between the optimal solutions of the original and the surrogate optimization models. The learning problem, formulated as a bilevel program, can be viewed as a data-driven inverse optimization problem to which we apply a decomposition-based solution algorithm from previous work. We validate our framework through numerical experiments involving the optimization of common nonlinear chemical processes such as chemical reactors, heat exchanger networks, and material blending systems. We also present a detailed comparison of decision-focused surrogate modeling with standard data-driven surrogate modeling methods and demonstrate that our approach is significantly more data-efficient while producing simple surrogate models with high decision prediction accuracy.
5.Funnel MPC for nonlinear systems with arbitrary relative degree
Authors:Thomas Berger, Dario Dennstädt
Abstract: The Model Predictive Control (MPC) scheme Funnel MPC enables output tracking of smooth reference signals with prescribed error bounds for nonlinear multi-input multi-output systems with stable internal dynamics. Earlier works achieved the control objective for system with relative degree restricted to one or incorporated additional feasibility constraints in the optimal control problem. Here we resolve these limitations by introducing a modified stage cost function relying on a weighted sum of the tracking error derivatives. The weights need to be sufficiently large and we state explicit lower bounds. Under these assumptions we are able to prove initial and recursive feasibility of the novel Funnel MPC scheme for systems with arbitrary relative degree - without requiring any terminal conditions, a sufficiently long prediction horizon or additional output constraints.