Optimization and Control (math.OC)
Wed, 19 Jul 2023
1.On the Bredies-Chenchene-Lorenz-Naldi algorithm
Authors:Heinz H. Bauschke, Walaa M. Moursi, Shambhavi Singh, Xianfu Wang
Abstract: Monotone inclusion problems occur in many areas of optimization and variational analysis. Splitting methods, which utilize resolvents or proximal mappings of the underlying operators, are often applied to solve these problems. In 2022, Bredies, Chenchene, Lorenz, and Naldi introduced a new elegant algorithmic framework that encompasses various well-known algorithms including Douglas-Rachford and Chambolle-Pock. They obtained powerful weak and strong convergence results, where the latter type relies on additional strong monotonicity assumptions. In this paper, we complement the analysis by Bredies et al. by relating the projections of the fixed point sets of the underlying operators that generate the (reduced and original) preconditioned proximal point sequences. We also obtain strong convergence results in the case of linear relations. Various examples are provided to illustrate the applicability of our results.
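As a small illustration of the resolvent-based splitting methods mentioned in this abstract, the sketch below runs the classical Douglas-Rachford iteration on a hypothetical one-dimensional problem: minimize |x| + (x - 3)^2 / 2. The test problem, the step size `gamma`, and all function names are assumptions made for illustration and are not taken from the paper; the preconditioned framework of Bredies et al. is not implemented here.

```python
def prox_abs(z, gamma):
    """Proximal map of gamma * |x| (soft-thresholding)."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def prox_quad(z, gamma, target=3.0):
    """Proximal map of gamma * 0.5 * (x - target)**2."""
    return (z + gamma * target) / (1.0 + gamma)


def douglas_rachford(z0=0.0, gamma=1.0, iters=200):
    """Run Douglas-Rachford and return the shadow iterate prox_abs(z)."""
    z = z0
    for _ in range(iters):
        x = prox_abs(z, gamma)             # resolvent of the first operator
        y = prox_quad(2.0 * x - z, gamma)  # resolvent of the second, at the reflected point
        z = z + (y - x)                    # update of the governing sequence
    return prox_abs(z, gamma)


x_star = douglas_rachford()
print(x_star)
```

For this toy problem the optimality condition 0 ∈ ∂|x| + (x - 3) gives the minimizer x = 2, which the shadow iterate approaches.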
2.Stopping Rules for Gradient Method for Saddle Point Problems with Twoside Polyak-Lojasievich Condition
Authors:Muratidi A. Ya., Stonyakin F. S.
Abstract: The paper considers approaches, based on the gradient method with inexact information, to saddle point problems with a two-sided variant of the Polyak-Lojasiewicz condition, and proposes a stopping rule based on the smallness of the norm of the inexact gradient of the external subproblem. Meeting this rule, in combination with a suitable accuracy of solving the auxiliary subproblem, ensures that the resulting approximate solution of the original saddle point problem is of acceptable quality. The results of numerical experiments for various saddle point problems are discussed to illustrate the effectiveness of the proposed method, including a comparison with the proven convergence rate estimates.
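The following sketch illustrates the general idea of stopping an outer gradient method once the norm of an inexact outer gradient is small, with the inner (auxiliary) subproblem solved only to a prescribed accuracy. The test function f(x, y) = x^2/2 + xy - y^2/2, the tolerances `eps` and `delta`, the step sizes, and all function names are assumptions for illustration; this is not the authors' algorithm or their stopping rule in its exact form.

```python
def grad_x(x, y):
    """Partial derivative in x of f(x, y) = 0.5*x**2 + x*y - 0.5*y**2."""
    return x + y


def grad_y(x, y):
    """Partial derivative in y of the same f."""
    return x - y


def solve_inner(x, y, delta, step=0.5, max_iters=1000):
    """Gradient ascent in y, stopped once |df/dy| <= delta (inexact solve)."""
    for _ in range(max_iters):
        g = grad_y(x, y)
        if abs(g) <= delta:
            break
        y += step * g
    return y


def saddle_gradient_method(x0=1.0, y0=0.0, eps=1e-6, delta=1e-8,
                           step=0.25, max_iters=1000):
    """Outer gradient descent in x with an inexact inner maximization in y."""
    x, y = x0, y0
    for k in range(max_iters):
        y = solve_inner(x, y, delta)   # warm-started auxiliary subproblem
        g = grad_x(x, y)               # inexact gradient of the outer function
        if abs(g) <= eps:              # stopping rule on the inexact gradient
            return x, y, k
        x -= step * g
    return x, y, max_iters


x_hat, y_hat, outer_iters = saddle_gradient_method()
print(x_hat, y_hat, outer_iters)
```

Here f is strongly convex in x and strongly concave in y, so it satisfies a two-sided Polyak-Lojasiewicz condition, and its unique saddle point is (0, 0).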
3.Information Structures in AC/DC Grids
Authors:Josh A. Taylor
Abstract: The converters in an AC/DC grid form actuated boundaries between the AC and DC subgrids. We show how in both simple linear and balanced dq-frame models, the states on either side of these boundaries are coupled only by control inputs. This topological property imparts all AC/DC grids with poset-causal information structures. A practical benefit is that certain decentralized control problems that are hard in general are tractable for poset-causal systems. We also show that special cases like multi-terminal DC grids can have coordinated and leader-follower information structures.
4.Inexact Direct-Search Methods for Bilevel Optimization Problems
Authors:Youssef Diouane, Vyacheslav Kungurtsev, Francesco Rinaldi, Damiano Zeffiro
Abstract: In this work, we introduce new direct search schemes for the solution of bilevel optimization (BO) problems. Our methods rely on a fixed accuracy black box oracle for the lower-level problem, and deal with both smooth and potentially nonsmooth true objectives. We thus analyze, for the first time in the literature, direct search schemes in these settings, giving convergence guarantees to approximate stationary points, as well as complexity bounds in the smooth case. We also propose the first adaptation of mesh adaptive direct search schemes for BO. Some preliminary numerical results on a standard set of bilevel optimization problems show the effectiveness of our new approaches.
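To make the setting concrete, here is a minimal sketch of a coordinate (compass) direct search with a sufficient-decrease test, applied to an upper-level objective that can only be evaluated through a lower-level oracle of fixed accuracy. The toy bilevel problem, the oracle error `1e-4`, the constant `c`, and all names are hypothetical and chosen for illustration; the schemes analyzed in the paper may differ substantially.

```python
def lower_level_oracle(x, error=1e-4):
    """Hypothetical black box: the exact lower-level solution is
    y*(x) = x[0] + x[1]; the oracle returns it with a fixed error."""
    return x[0] + x[1] + error


def upper_objective(x):
    """Upper-level objective evaluated at the oracle's approximate y."""
    y = lower_level_oracle(x)
    return (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2 + y ** 2


def direct_search(f, x0, alpha=1.0, tol=1e-5, c=1e-4, max_iters=10000):
    """Compass search: poll +/- coordinate directions, accept a point only
    if it decreases f by at least c * alpha**2, otherwise halve the step."""
    x = list(x0)
    fx = f(x)
    n = len(x)
    directions = []
    for i in range(n):
        for sign in (1.0, -1.0):
            d = [0.0] * n
            d[i] = sign
            directions.append(d)
    for _ in range(max_iters):
        if alpha <= tol:
            break
        improved = False
        for d in directions:
            trial = [xi + alpha * di for xi, di in zip(x, d)]
            f_trial = f(trial)
            if f_trial < fx - c * alpha ** 2:   # sufficient decrease
                x, fx, improved = trial, f_trial, True
                break
        if not improved:
            alpha *= 0.5                        # unsuccessful poll: shrink step
    return x, fx


x_best, f_best = direct_search(upper_objective, [0.0, 0.0])
print(x_best, f_best)
```

With an exact oracle the minimizer of this toy problem is (4/3, -5/3); the fixed oracle error shifts the point found by the search only slightly.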
5.A non-monotone extra-gradient trust-region method with noisy oracles
Authors:Natasa Krejic, Natasa Krklec Jerinkic, Angeles Martinez, Mahsa Yousefi
Abstract: In this work, we introduce a novel stochastic second-order method, within the framework of a non-monotone trust-region approach, for solving the unconstrained, nonlinear, and non-convex optimization problems arising in the training of deep neural networks. The proposed algorithm makes use of subsampling strategies which yield noisy approximations of the finite sum objective function and its gradient. To effectively control the resulting approximation error, we introduce an adaptive sample size strategy based on inexpensive additional sampling. Depending on the estimated progress of the algorithm, this can yield sample size scenarios ranging from mini-batch to full sample functions. We provide convergence analysis for all possible scenarios and show that the proposed method achieves almost sure convergence under standard assumptions for the trust-region framework. We report numerical experiments showing that the proposed algorithm outperforms its state-of-the-art counterpart in deep neural network training for image classification and regression tasks while requiring a significantly smaller number of gradient evaluations.
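The ingredients named in this abstract (subsampled function and gradient estimates, a non-monotone acceptance test, and a sample size that grows when progress is poor) can be sketched on a one-dimensional least-squares toy problem. Everything below is an assumption for illustration: the data, the thresholds 0.1 and 0.75, the rule of doubling the batch after a rejected step, and the use of the maximum of recent subsampled losses as the reference value. It is not the algorithm proposed in the paper.

```python
import random


def batch_loss(w, a, b, idx):
    """Subsampled loss: mean of 0.5 * (a_i * w - b_i)**2 over idx."""
    return sum(0.5 * (a[i] * w - b[i]) ** 2 for i in idx) / len(idx)


def subsampled_trust_region(a, b, w0=0.0, radius=1.0, batch=2, memory=3,
                            iters=50, seed=0):
    rng = random.Random(seed)
    n = len(a)
    w = w0
    recent = []  # recent subsampled losses for the non-monotone reference
    for _ in range(iters):
        idx = rng.sample(range(n), min(batch, n))
        g = sum(a[i] * (a[i] * w - b[i]) for i in idx) / len(idx)
        h = sum(a[i] ** 2 for i in idx) / len(idx)
        if abs(g) <= 1e-12:
            break  # toy stopping test on the subsampled gradient
        # Minimizer of the quadratic model, restricted to the trust region.
        s = max(-radius, min(radius, -g / h))
        predicted = -(g * s + 0.5 * h * s * s)
        recent = (recent + [batch_loss(w, a, b, idx)])[-memory:]
        reference = max(recent)  # non-monotone: compare with worst recent value
        rho = (reference - batch_loss(w + s, a, b, idx)) / predicted
        if rho >= 0.1:
            w += s
            if rho >= 0.75:
                radius *= 2.0
        else:
            radius *= 0.5
            batch = min(n, 2 * batch)  # poor progress: use a larger sample
    return w


# Noise-free toy data with b_i = 2 * a_i, so the minimizer is w = 2.
a_data = [1.0, 2.0, 3.0, 4.0]
b_data = [2.0 * ai for ai in a_data]
w_hat = subsampled_trust_region(a_data, b_data)
print(w_hat)
```

Because the toy data are noise-free, every subsample has the same minimizer and the run is deterministic; with noisy data the batch-growth rule would actually come into play.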
6.Convergence Guarantees for Stochastic Subgradient Methods in Nonsmooth Nonconvex Optimization
Authors:Nachuan Xiao, Xiaoyin Hu, Kim-Chuan Toh
Abstract: In this paper, we investigate the convergence properties of the stochastic gradient descent (SGD) method and its variants, especially in training neural networks built from nonsmooth activation functions. We develop a novel framework that assigns different timescales to stepsizes for updating the momentum terms and variables, respectively. Under mild conditions, we prove the global convergence of our proposed framework in both single-timescale and two-timescale cases. We show that our proposed framework encompasses a wide range of well-known SGD-type methods, including heavy-ball SGD, SignSGD, Lion, normalized SGD and clipped SGD. Furthermore, when the objective function adopts a finite-sum formulation, we prove the convergence properties for these SGD-type methods based on our proposed framework. In particular, we prove that these SGD-type methods find the Clarke stationary points of the objective function with randomly chosen stepsizes and initial points under mild assumptions. Preliminary numerical experiments demonstrate the high efficiency of our analyzed SGD-type methods.
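Several of the SGD-type methods listed in this abstract differ mainly in how a momentum vector is turned into an update direction. The sketch below shows that common structure with sign, normalized, and clipped directions (heavy-ball corresponds to using the momentum unchanged). The update form m <- beta*m + (1 - beta)*g, x <- x - lr*direction(m), the parameter values, and all function names are assumptions for illustration and are not the framework defined in the paper; Lion in particular uses a different interpolation and is not reproduced here.

```python
import math


def sign_direction(m):
    """Componentwise sign of the momentum (SignSGD-style)."""
    return [float((v > 0) - (v < 0)) for v in m]


def normalized_direction(m, tiny=1e-12):
    """Momentum scaled to unit Euclidean norm (normalized SGD-style)."""
    norm = math.sqrt(sum(v * v for v in m))
    return [v / max(norm, tiny) for v in m]


def clipped_direction(m, threshold=1.0):
    """Momentum rescaled so its norm is at most `threshold` (clipped SGD-style)."""
    norm = math.sqrt(sum(v * v for v in m))
    scale = 1.0 if norm <= threshold else threshold / norm
    return [scale * v for v in m]


def momentum_step(x, m, grad, lr, beta, direction):
    """One update: m <- beta*m + (1 - beta)*grad, then x <- x - lr*direction(m)."""
    m_new = [beta * mi + (1.0 - beta) * gi for mi, gi in zip(m, grad)]
    d = direction(m_new)
    x_new = [xi - lr * di for xi, di in zip(x, d)]
    return x_new, m_new


# One step on the nonsmooth function f(x) = |x0| + 2*|x1| from x = (1, -1),
# where a subgradient is (1, -2).
x_new, m_new = momentum_step([1.0, -1.0], [0.0, 0.0], [1.0, -2.0],
                             lr=0.1, beta=0.9, direction=sign_direction)
print(x_new, m_new)
```

Passing `normalized_direction`, `clipped_direction`, or `lambda m: m` as `direction` gives the other variants with the same driver.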
7.An Operator-Splitting Approach for Variational Optimal Control Formulations for Diffeomorphic Shape Matching
Authors:Andreas Mang, Jiwen He, Robert Azencott
Abstract: We present formulations and numerical algorithms for solving diffeomorphic shape matching problems. We formulate shape matching as a variational problem governed by a dynamical system that models a flow of diffeomorphisms $f_t \in \operatorname{diff}(\mathbb{R}^3)$. We give an overview of our contributions in this area, and present an improved, matrix-free implementation of an operator splitting strategy for diffeomorphic shape matching. We showcase results for diffeomorphic shape matching of real clinical cardiac data in $\mathbb{R}^3$ to assess the performance of our methodology.