Optimization and Control (math.OC)
Mon, 26 Jun 2023
1.Open-loop and closed-loop solvabilities for discrete-time mean-field stochastic linear quadratic optimal control problems
Authors:Teng Song, Bin Liu
Abstract: This paper discusses discrete-time mean-field stochastic linear quadratic optimal control problems whose weighting matrices in the cost functional are not assumed to be definite. The open-loop solvability is characterized by the existence of a solution to a mean-field forward-backward stochastic difference equation, together with a convexity condition and a stationarity condition. The closed-loop solvability is characterized by the existence of a regular solution to the generalized Riccati equations and of a solution to the linear recursive equation, and is also shown under the uniform convexity of the cost functional. Moreover, based on a family of uniformly convex cost functionals, the finiteness of the problem is characterized. In addition, a minimizing sequence is obtained whose convergence is equivalent to the open-loop solvability of the problem. Finally, some examples are given to illustrate the theory developed.
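For orientation, a prototypical finite-horizon problem of this class (in generic notation that may differ from the paper's, and with the weighting matrices $Q_k$, $\bar Q_k$, $R_k$, $\bar R_k$, $G$, $\bar G$ allowed to be indefinite) reads
\[
  x_{k+1} = \big(A_k x_k + \bar A_k\,\mathbb{E}[x_k] + B_k u_k + \bar B_k\,\mathbb{E}[u_k]\big)
          + \big(C_k x_k + \bar C_k\,\mathbb{E}[x_k] + D_k u_k + \bar D_k\,\mathbb{E}[u_k]\big) w_k ,
\]
\[
  J(u) = \mathbb{E}\sum_{k=0}^{N-1}\Big(\langle Q_k x_k, x_k\rangle + \langle \bar Q_k\,\mathbb{E}[x_k], \mathbb{E}[x_k]\rangle
        + \langle R_k u_k, u_k\rangle + \langle \bar R_k\,\mathbb{E}[u_k], \mathbb{E}[u_k]\rangle\Big)
        + \mathbb{E}\langle G x_N, x_N\rangle + \langle \bar G\,\mathbb{E}[x_N], \mathbb{E}[x_N]\rangle ,
\]
where $\{w_k\}$ is a scalar noise sequence; open-loop solvability asks for a minimizing control process, while closed-loop solvability asks for a minimizing feedback law.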
2.Nonconvex Stochastic Bregman Proximal Gradient Method with Application to Deep Learning
Authors:Kuangyu Ding, Jingyang Li, Kim-Chuan Toh
Abstract: The widely used stochastic gradient methods for minimizing nonconvex composite objective functions require the Lipschitz smoothness of the differentiable part. However, this requirement does not hold for problem classes such as quadratic inverse problems and the training of neural networks. To address this issue, we investigate a family of stochastic Bregman proximal gradient (SBPG) methods, which only require smooth adaptivity of the differentiable part. SBPG replaces the upper quadratic approximation used in SGD with the Bregman proximity measure, resulting in a better approximation model that captures the non-Lipschitz gradients of the nonconvex objective. We formulate the vanilla SBPG and establish its convergence properties in the nonconvex setting without a finite-sum structure. Experimental results on quadratic inverse problems attest to the robustness of SBPG. Moreover, we propose a momentum-based version of SBPG (MSBPG) and prove that it has improved convergence properties. We apply MSBPG to the training of deep neural networks with a polynomial kernel function, which ensures the smooth adaptivity of the loss function. Experimental results on representative benchmarks demonstrate the effectiveness and robustness of MSBPG in training neural networks. Since the additional computational cost of MSBPG compared with SGD is negligible in large-scale optimization, MSBPG can potentially be employed as a universal open-source optimizer in the future.
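To make the Bregman idea concrete, here is a minimal sketch (not the authors' code) of a single SBPG step for a smooth loss, using the quartic kernel $h(x)=\tfrac{a}{4}\|x\|^4+\tfrac12\|x\|^2$, a common choice of polynomial kernel for objectives with quartic growth; the absence of a nonsmooth regularizer and of momentum, and all names below, are simplifying assumptions.

import numpy as np

def sbpg_step(x, stoch_grad, eta, a=1.0):
    """Illustrative sketch of one stochastic Bregman proximal gradient step
    (smooth part only; no nonsmooth term, no momentum), with the kernel
    h(x) = (a/4)*||x||^4 + (1/2)*||x||^2, whose gradient is
    grad_h(x) = (a*||x||^2 + 1) * x.  The update solves
    grad_h(x_next) = grad_h(x) - eta * stoch_grad, which reduces to a
    scalar cubic equation for t = ||x_next||."""
    z = (a * float(np.sum(x * x)) + 1.0) * x - eta * stoch_grad  # point in mirror space
    r = float(np.linalg.norm(z))
    if r == 0.0:
        return np.zeros_like(x)
    # Unique nonnegative root of a*t^3 + t - r = 0 (strictly increasing in t),
    # found by Newton's method started to the right of the root.
    t = r
    for _ in range(50):
        t -= (a * t**3 + t - r) / (3.0 * a * t**2 + 1.0)
    return (t / r) * z  # x_next is parallel to z, with norm t

In MSBPG, stoch_grad would be replaced by a momentum-averaged gradient estimate, and a nonsmooth regularizer would add a Bregman proximal subproblem; those pieces are omitted here.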
3.The Implicit Rigid Tube Model Predictive Control
Authors:Saša V. Raković
Abstract: A computationally efficient reformulation of rigid tube model predictive control is developed. A unique feature of the derived formulation is its utilization of implicit set representations. This novel formulation does not require any set algebraic operations to be performed explicitly, and its implementation merely requires the use of standard optimization solvers.
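For context, the classical (explicit) rigid tube MPC formulation optimizes a nominal trajectory and tightens constraints by Minkowski set operations; in generic notation,
\[
  \min_{z_0,\,v_0,\dots,v_{N-1}} \ \sum_{k=0}^{N-1}\ell(z_k,v_k) + V_f(z_N)
  \quad \text{s.t.}\quad z_{k+1} = A z_k + B v_k,\ \ z_k \in X \ominus Z,\ \ v_k \in U \ominus K Z,\ \ z_N \in Z_f,\ \ x \in z_0 \oplus Z,
\]
with the control applied as $u = v_0 + K(x - z_0)$, where $Z$ is the rigid tube cross-section (a robust positively invariant set for $x^+=(A+BK)x+w$). Computing $Z$, $X\ominus Z$ and $U\ominus KZ$ explicitly is precisely the kind of set algebra that the implicit reformulation of this paper avoids.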
4.Optimal control of a parabolic equation with a nonlocal nonlinearity
Authors:Cyrille Kenne, Landry Djomegne, Gisèle Mophou
Abstract: This paper proposes an optimal control problem for a parabolic equation with a nonlocal nonlinearity. The system is described by a parabolic equation involving a nonlinear term that depends on the solution and on its integral over the domain. We prove the existence, uniqueness, and boundedness of the solution to the system. Regularity results for the control-to-state operator, the cost functional, and the adjoint state are also proved. We show the existence of optimal solutions and derive first-order necessary optimality conditions. In addition, second-order necessary and sufficient conditions for optimality are established.
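Purely as an illustration of the problem class (the paper's exact system, boundary conditions and cost may differ), one may think of
\[
  \min_u\ J(u) = \tfrac12\|y-y_d\|_{L^2(Q)}^2 + \tfrac{\alpha}{2}\|u\|_{L^2(Q)}^2
  \quad\text{s.t.}\quad \partial_t y - \Delta y + f\Big(y,\int_\Omega y\,dx\Big) = u \ \text{in } Q=\Omega\times(0,T),\ \ y=0 \ \text{on } \partial\Omega\times(0,T),\ \ y(\cdot,0)=y_0,
\]
where the nonlocal character comes from the dependence of the nonlinearity $f$ on the integral of the state over the whole domain.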
5.Stability of optimal shapes and convergence of thresholding algorithms in linear and spectral optimal control problems
Authors:Antonin Chambolle, Idriss Mazari-Fouquer, Yannick Privat
Abstract: We prove the convergence of the fixed-point (also called thresholding) algorithm in three optimal control problems under large volume constraints. This algorithm was introduced by C\'ea, Gioan and Michel, and is in constant use in the simulation of $L^\infty-L^1$ optimal control problems. In this paper we consider the optimisation of the Dirichlet energy, of Dirichlet eigenvalues, and of certain non-energetic problems. Our proofs rely on a new diagonalisation procedure for shape Hessians in optimal control problems, which leads to local stability estimates.
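In generic form (details vary across the three problems considered), the thresholding iteration acts on admissible sets $\omega\subset\Omega$ of prescribed volume $V_0$: given $\omega_n$, one computes a switching function $\psi_{\omega_n}$ (built from the state, and possibly adjoint, of the underlying PDE) and sets
\[
  \omega_{n+1} = \{x\in\Omega : \psi_{\omega_n}(x) > c_n\},\qquad c_n \ \text{chosen so that } |\omega_{n+1}| = V_0 ,
\]
so that fixed points satisfy a first-order (bang-bang) optimality condition; the paper shows that, under large volume constraints, this iteration converges and establishes local stability of the optimal shapes.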
6.Sum-of-squares relaxations for polynomial min-max problems over simple sets
Authors:Francis Bach (SIERRA)
Abstract: We consider min-max optimization problems for polynomial functions, where a multivariate polynomial is maximized with respect to a subset of variables, and the resulting maximal value is minimized with respect to the remaining variables. When the variables belong to simple sets (e.g., a hypercube, the Euclidean hypersphere, or a ball), we derive a sum-of-squares formulation based on a primal-dual approach. In the simplest setting, we provide a convergence proof when the degree of the relaxation tends to infinity and observe empirically that it can be finitely convergent in several situations. Moreover, our formulation leads to an interesting link with feasibility certificates for polynomial inequalities based on Putinar's Positivstellensatz.
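Concretely, the problems considered are of the form
\[
  \min_{x\in X}\ \max_{y\in Y}\ p(x,y),
\]
with $p$ a multivariate polynomial and $X$, $Y$ simple sets such as hypercubes, Euclidean hyperspheres, or balls; roughly speaking, the relaxation replaces the inner maximization by a degree-bounded sum-of-squares certificate (in the spirit of Putinar's Positivstellensatz), and it is this degree that is driven to infinity in the convergence result.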
7.Generalized Scaling for the Constrained Maximum-Entropy Sampling Problem
Authors:Zhongzhu Chen, Marcia Fampa, Jon Lee
Abstract: The best practical techniques for the exact solution of instances of the constrained maximum-entropy sampling problem, a discrete-optimization problem arising in the design of experiments, work within a branch-and-bound framework, using a variety of concave continuous relaxations of the objective function. A standard and computationally important bound-enhancement technique in this context is (ordinary) scaling, via a single positive parameter. Scaling adjusts the shape of continuous relaxations to reduce the gaps between the upper bounds and the optimal value. We extend this technique to generalized scaling, employing a positive vector of parameters, which allows much more flexibility and thus reduces the gaps significantly further. We give mathematical results aimed at supporting algorithmic methods for computing optimal generalized scalings, and we give computational results demonstrating the performance of generalized scaling on benchmark problem instances.
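To see where the extra flexibility comes from, note the identities (in generic notation, not necessarily the paper's) for a covariance matrix $C$, a subset $S$ with $|S|=s$, a scalar $\gamma>0$ and a positive vector $d$ with $D=\mathrm{Diag}(d)$:
\[
  \operatorname{ldet}\big((\gamma C)[S,S]\big) = s\ln\gamma + \operatorname{ldet} C[S,S],
  \qquad
  \operatorname{ldet}\big((D C D)[S,S]\big) = 2\sum_{i\in S}\ln d_i + \operatorname{ldet} C[S,S].
\]
Ordinary scaling shifts every subset's objective by the same constant, so any concave relaxation evaluated on $\gamma C$ yields a valid upper bound after subtracting $s\ln\gamma$, and $\gamma$ can then be optimized; generalized scaling shifts each subset by an index-dependent amount, which is the source of the additional freedom exploited in the paper.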
8.Gain Confidence, Reduce Disappointment: A New Approach to Cross-Validation for Sparse Regression
Authors:Ryan Cory-Wright, Andrés Gómez
Abstract: Ridge-regularized sparse regression involves selecting a subset of features that explains the relationship between a design matrix and an output vector in an interpretable manner. To select the sparsity and robustness of linear regressors, techniques like leave-one-out cross-validation are commonly used for hyperparameter tuning. However, cross-validation typically increases the cost of sparse regression by several orders of magnitude. Additionally, validation metrics are noisy estimators of the test-set error, with different hyperparameter combinations giving models with different amounts of noise. Therefore, optimizing over these metrics is vulnerable to out-of-sample disappointment, especially in underdetermined settings. To address this, we make two contributions. First, we leverage the generalization theory literature to propose confidence-adjusted variants of leave-one-out that display less propensity to out-of-sample disappointment. Second, we leverage ideas from the mixed-integer optimization literature to obtain computationally tractable relaxations of confidence-adjusted leave-one-out, thereby minimizing it without solving as many mixed-integer optimization problems. Our relaxations give rise to an efficient coordinate descent scheme which allows us to obtain significantly lower leave-one-out errors than other methods in the literature. We validate our theory by demonstrating that we obtain significantly sparser and comparably accurate solutions than popular methods like GLMNet, while suffering less out-of-sample disappointment. On synthetic datasets, our confidence adjustment procedure generates significantly fewer false discoveries and improves out-of-sample performance by 2-5% compared to cross-validating without confidence adjustment. Across a suite of 13 real datasets, a calibrated version of our procedure improves the test-set error by an average of 4% compared to cross-validating without confidence adjustment.
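The procedures discussed above repeatedly evaluate the leave-one-out error of a ridge-regularized regressor on a candidate support. For a fixed support this error has the classical closed form recalled in the sketch below; the confidence adjustment and the mixed-integer relaxations that are the paper's actual contributions are not reproduced, and the function and variable names are illustrative.

import numpy as np

def ridge_loo_error(X, y, support, lam):
    """Exact leave-one-out mean-squared error of ridge regression restricted
    to the columns in `support` (classical PRESS shortcut: the leave-one-out
    residual equals (y_i - yhat_i) / (1 - H_ii) for the hat matrix H).
    Illustrative only; the paper's confidence adjustment is not implemented."""
    Xs = X[:, support]                                            # candidate feature subset
    n, k = Xs.shape
    H = Xs @ np.linalg.solve(Xs.T @ Xs + lam * np.eye(k), Xs.T)   # hat matrix
    loo_residuals = (y - H @ y) / (1.0 - np.diag(H))
    return float(np.mean(loo_residuals ** 2))

A naive hyperparameter search would minimize this quantity over supports and over $\lambda$; the paper's point is that the raw leave-one-out value is a noisy estimate of test error whose minimization is vulnerable to out-of-sample disappointment, which motivates the confidence adjustment.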
9.Near-Optimal Fully First-Order Algorithms for Finding Stationary Points in Bilevel Optimization
Authors:Lesi Chen, Yaohua Ma, Jingzhao Zhang
Abstract: Bilevel optimization has various applications such as hyper-parameter optimization and meta-learning. Designing theoretically efficient algorithms for bilevel optimization is more challenging than for standard optimization because the lower-level problem defines the feasible set implicitly via another optimization problem. One tractable case is when the lower-level problem is strongly convex. Recent works show that second-order methods can provably converge to an $\epsilon$-first-order stationary point of the problem at a rate of $\tilde{\mathcal{O}}(\epsilon^{-2})$, yet these algorithms require a Hessian-vector product oracle. Kwon et al. (2023) resolved this problem by proposing a first-order method that achieves the same goal at a slower rate of $\tilde{\mathcal{O}}(\epsilon^{-3})$. In this work, we provide an improved analysis demonstrating that the first-order method can also find an $\epsilon$-first-order stationary point within $\tilde{\mathcal{O}}(\epsilon^{-2})$ oracle complexity, which matches the upper bounds for second-order methods in the dependency on $\epsilon$. Our analysis further leads to simple first-order algorithms that achieve similar near-optimal rates in finding second-order stationary points and in distributed bilevel problems.
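One common way to obtain a fully first-order method for bilevel problems with a strongly convex lower level is a penalty reformulation; the sketch below follows that idea and is not taken from the paper. The step sizes, the schedule for the penalty parameter $\lambda$, the inner-loop accuracy required for the stated rates, and all names are illustrative assumptions.

def penalty_hypergradient(grad_f_x, grad_f_y, grad_g_x, grad_g_y,
                          x, y_init, lam, inner_steps=100, inner_lr=0.01):
    """Illustrative sketch (not the paper's exact algorithm) of a fully
    first-order hypergradient estimate via the penalty function
    L_lam(x) = min_y [ f(x, y) + lam * (g(x, y) - min_y' g(x, y')) ],
    where f is the upper-level and g the (strongly convex) lower-level
    objective.  Only gradients of f and g are used; no Hessian-vector
    products.  Inputs x, y_init are NumPy arrays; grad_* are callables."""
    # Approximate y*(x) = argmin_y g(x, y) by gradient descent.
    y_g = y_init.copy()
    for _ in range(inner_steps):
        y_g = y_g - inner_lr * grad_g_y(x, y_g)
    # Approximate y*_lam(x) = argmin_y f(x, y) + lam * g(x, y).
    y_lam = y_init.copy()
    for _ in range(inner_steps):
        y_lam = y_lam - inner_lr * (grad_f_y(x, y_lam) + lam * grad_g_y(x, y_lam))
    # Gradient of the penalty function with respect to x.
    return grad_f_x(x, y_lam) + lam * (grad_g_x(x, y_lam) - grad_g_x(x, y_g))

An outer loop would update $x$ with this estimate while gradually increasing $\lambda$; the analysis in papers of this type concerns how accurately the inner problems must be solved, and how $\lambda$ must be scheduled, to reach an $\epsilon$-stationary point within the stated oracle complexity.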