arXiv daily

Optimization and Control (math.OC)

Tue, 20 Jun 2023

1. A Lagrangian-Based Method with "False Penalty" for Linearly Constrained Nonconvex Composite Optimization

Authors: Jong Gwang Kim

Abstract: We introduce a primal-dual framework for solving linearly constrained nonconvex composite optimization problems. Our approach is based on a newly developed Lagrangian, which incorporates \emph{false penalty} and dual smoothing terms. This new Lagrangian enables us to develop a simple first-order algorithm that converges to a stationary solution under standard assumptions. We further establish global convergence, provided that the objective function satisfies the Kurdyka-{\L}ojasiewicz property. Our method provides several advantages: it simplifies the treatment of constraints by effectively bounding the multipliers without boundedness assumptions on the dual iterates; it guarantees global convergence without requiring the surjectivity assumption on the linear operator; and it is a single-loop algorithm that does not involve solving penalty subproblems, achieving an iteration complexity of $\mathcal{O}(1/\epsilon^2)$ to find an $\epsilon$-stationary solution. Preliminary experiments on test problems demonstrate the practical efficiency and robustness of our method.
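
To make the single-loop template concrete, the following minimal numpy sketch runs a generic linearized augmented-Lagrangian iteration for min_x f(x) + g(x) subject to Ax = b, with g taken as the l1-norm for illustration. The paper's false-penalty Lagrangian, step-size rules, and test problems are not reproduced here; every choice below is an illustrative assumption.

```python
import numpy as np

def prox_l1(v, t):
    # Soft-thresholding: proximal operator of t * ||.||_1 (our choice of g).
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def single_loop_primal_dual(grad_f, L_f, A, b, x0, rho=1.0, iters=5000):
    # Generic single-loop linearized augmented-Lagrangian iteration for
    # min_x f(x) + ||x||_1  s.t.  Ax = b  (illustrative stand-in only).
    x, y = x0.copy(), np.zeros(A.shape[0])
    tau = 0.9 / (L_f + rho * np.linalg.norm(A, 2) ** 2)  # primal step size
    for _ in range(iters):
        # One forward step on the smooth part of the augmented Lagrangian,
        # then a proximal step on the nonsmooth term g = ||.||_1.
        grad_aug = grad_f(x) + A.T @ (y + rho * (A @ x - b))
        x = prox_l1(x - tau * grad_aug, tau)
        y = y + rho * (A @ x - b)  # dual ascent on the constraint residual
    return x, y

# Smoke test with a convex stand-in f(x) = 0.5 * ||x||^2 (so L_f = 1).
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 20))
b = A @ rng.standard_normal(20)
x, y = single_loop_primal_dual(lambda z: z, 1.0, A, b, np.zeros(20))
print(np.linalg.norm(A @ x - b))  # constraint residual should shrink
```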

2. A gradient projection method for semi-supervised hypergraph clustering problems

Authors: Jingya Chang, Dongdong Liu, Min Xi

Abstract: Semi-supervised clustering problems focus on clustering data in the presence of partial label information. In this paper, we consider semi-supervised hypergraph clustering problems. We use the tensor associated with the hypergraph to construct an orthogonality-constrained optimization model. The optimization problem is solved by a retraction method, which employs the polar decomposition to map the gradient direction from the tangent space back to the Stiefel manifold. A nonmonotone curvilinear search is implemented to guarantee reduction in the objective function value. Convergence analysis demonstrates that the first-order optimality condition is satisfied at any accumulation point. Experiments on synthetic hypergraphs and on hypergraphs constructed from real data demonstrate the effectiveness of our method.
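
The polar-decomposition retraction at the heart of such methods can be sketched directly: given a point X on the Stiefel manifold and a tangent step xi, the retraction returns the orthogonal polar factor of X + xi, computable from a thin SVD. The toy objective below is an assumption for illustration, not the paper's hypergraph tensor model.

```python
import numpy as np

def polar_retraction(X, xi):
    # Polar retraction on the Stiefel manifold St(n, p): map X + xi back to
    # the manifold via its polar factor, R_X(xi) = U @ Vt from a thin SVD.
    U, _, Vt = np.linalg.svd(X + xi, full_matrices=False)
    return U @ Vt

def project_tangent(X, G):
    # Project an ambient gradient G onto the tangent space at X:
    # P_X(G) = G - X * sym(X^T G).
    XtG = X.T @ G
    return G - X @ (XtG + XtG.T) / 2.0

# Toy objective f(X) = -trace(X^T M X) with X^T X = I (a stand-in for the
# hypergraph tensor objective); gradient step + retraction.
rng = np.random.default_rng(1)
n, p = 10, 3
M = rng.standard_normal((n, n)); M = (M + M.T) / 2.0
X = np.linalg.qr(rng.standard_normal((n, p)))[0]
for _ in range(200):
    G = -2.0 * M @ X                                  # Euclidean gradient
    X = polar_retraction(X, -0.1 * project_tangent(X, G))
print(np.linalg.norm(X.T @ X - np.eye(p)))  # ~0: iterates stay feasible
```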

3. A Passivity-Based Method for Accelerated Convex Optimisation

Authors: Namhoon Cho, Hyo-Sang Shin

Abstract: This study presents a constructive methodology for designing accelerated convex optimisation algorithms in the continuous-time domain. The two key enablers are the classical concept of passivity from control theory and a time-dependent change of variables that maps the output of the internal dynamic system to the optimisation variables. The Lyapunov function associated with the optimisation dynamics is obtained as a natural consequence of specifying the internal dynamics that drives the state evolution as a passive linear time-invariant system. The passivity-based methodology provides a general framework with the flexibility to generate convex optimisation algorithms that guarantee different convergence-rate bounds on the objective function value. The same principle applies to the design of online parameter-update algorithms for adaptive control, by redefining the output of the internal dynamics to allow for feedback interconnection with the tracking-error dynamics.
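
As a point of reference for continuous-time accelerated dynamics, here is a forward-Euler simulation of the well-known Su-Boyd-Candes ODE x'' + (3/t) x' + grad f(x) = 0; the paper's passivity-based construction and its Lyapunov certificates are not reproduced, and the objective below is an illustrative assumption.

```python
import numpy as np

def accelerated_flow(grad_f, x0, t0=1.0, dt=1e-3, T=10.0):
    # Forward-Euler simulation of the accelerated gradient ODE
    # x'' + (3/t) x' + grad_f(x) = 0 (Su-Boyd-Candes); shown only to
    # illustrate continuous-time accelerated dynamics.
    x, v, t = x0.copy(), np.zeros_like(x0), t0
    while t < T:
        a = -(3.0 / t) * v - grad_f(x)   # acceleration prescribed by the ODE
        x, v, t = x + dt * v, v + dt * a, t + dt
    return x

# Toy quadratic objective f(x) = 0.5 * x^T Q x, minimized at the origin.
Q = np.diag([1.0, 10.0])
x_final = accelerated_flow(lambda x: Q @ x, np.array([5.0, 5.0]))
print(x_final)  # approaches the minimizer as T grows
```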

4. Stabilization and Spill-Free Transfer of Viscous Liquid in a Tank

Authors: Iasson Karafyllis, Miroslav Krstic

Abstract: Flow control occupies a special place in the fields of partial differential equations (PDEs) and control theory, where the complex behavior of solutions of nonlinear dynamics in very high dimension is not just to be understood but also to be assigned specific desired properties, by feedback control. Among several benchmark problems in flow control, the liquid-tank problem is particularly attractive as a research topic. The objective is to move a tank filled with liquid, suppress the nonlinear oscillations of the liquid, bring the tank and liquid to rest, and avoid liquid spillage in the process. In other words, this is a problem of nonlinear PDE stabilization subject to state constraints. This review article focuses on recent results on liquid-tank stabilization for viscous liquids. All possible cases are studied: with and without friction from the tank walls, and with and without surface tension. Moreover, novel results are provided for the linearization of the tank-liquid system, which yields a high-order PDE combining a wave equation with Kelvin-Voigt damping and an Euler-Bernoulli beam equation. The feedback design methodology presented in the article is based on Control Lyapunov Functionals (CLFs), suitably extended from the CLF methodology for ODEs to the infinite-dimensional case. The proposed CLFs are modifications and augmentations of the total energy functionals of the tank-liquid system, so that the dissipative effects of viscosity, friction, and surface tension are captured and additional dissipation by feedback is made relatively easy. The article closes with an extensive list of open problems.
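
One ingredient of the linearized tank-liquid model, a 1-D wave equation with Kelvin-Voigt damping, can be simulated with a few lines of finite differences. The sketch below omits the Euler-Bernoulli beam coupling and the boundary feedback, and all parameter values are illustrative assumptions.

```python
import numpy as np

# 1-D wave equation with Kelvin-Voigt damping, u_tt = c^2 u_xx + d u_xxt,
# pinned at both ends; explicit finite differences.
N, L, c, d = 200, 1.0, 1.0, 0.01
dx = L / N
dt = 0.2 * dx**2 / d           # conservative step for the damping term
x = np.linspace(0.0, L, N + 1)
u = np.sin(np.pi * x)          # initial profile (unit amplitude)
u_prev = u.copy()              # zero initial velocity

def lap(v):
    # Second difference with homogeneous Dirichlet boundaries.
    out = np.zeros_like(v)
    out[1:-1] = (v[2:] - 2.0 * v[1:-1] + v[:-2]) / dx**2
    return out

for _ in range(20000):
    u_t = (u - u_prev) / dt
    u_next = 2.0 * u - u_prev + dt**2 * (c**2 * lap(u) + d * lap(u_t))
    u_next[0] = u_next[-1] = 0.0
    u_prev, u = u, u_next
print(np.max(np.abs(u)))  # below the initial amplitude of 1.0: energy decays
```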

5. Graph-Based Conditions for Feedback Stabilization of Switched and LPV Systems

Authors: Matteo Della Rossa, Thiago Alves Lima, Marc Jungers, Raphaël M. Jungers

Abstract: This paper presents novel stabilizability conditions for switched linear systems with arbitrary and uncontrollable underlying switching signals. We distinguish and study two particular settings: i) the \emph{robust} case, in which the active mode is completely unknown and unobservable, and ii) the \emph{mode-dependent} case, in which the controller depends on the currently active switching mode. The technical developments are based on graph-theoretic tools, relying in particular on the path-complete Lyapunov functions framework. The main idea is to use directed and labeled graphs to encode Lyapunov inequalities for the design of robust and mode-dependent piecewise linear state-feedback controllers. This results in novel and flexible conditions, with the particular feature of being in the form of linear matrix inequalities (LMIs). Our technique thus provides a first controller-design strategy that allows piecewise linear feedback maps and piecewise quadratic (control) Lyapunov functions by means of semidefinite programming. Numerical examples illustrate the application of the proposed techniques and the relations between the graph order, robustness, and closed-loop performance.
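
For context, the classical mode-dependent design with one quadratic Lyapunov function per mode reduces to the LMIs below (for a discrete-time switched system under arbitrary switching); the paper generalizes such conditions to arbitrary path-complete graphs and piecewise linear feedback. The sketch assumes cvxpy with an SDP-capable solver installed, and the system data are illustrative.

```python
import numpy as np
import cvxpy as cp

# Mode-dependent state feedback u = K_i x for x+ = A_i x + B_i u under
# arbitrary switching, via one quadratic Lyapunov function per mode and a
# decrease condition along every transition i -> j (classical switched
# Lyapunov function LMIs, a special case of path-complete conditions).
A = [np.array([[1.1, 0.3], [0.0, 0.9]]), np.array([[0.8, -0.5], [0.2, 1.05]])]
B = [np.array([[1.0], [0.0]]), np.array([[0.0], [1.0]])]
n, m, M = 2, 1, 2

Q = [cp.Variable((n, n), symmetric=True) for _ in range(M)]
Y = [cp.Variable((m, n)) for _ in range(M)]
cons = []
for i in range(M):
    cons.append(Q[i] >> 1e-6 * np.eye(n))
    for j in range(M):  # every edge of the complete switching graph
        ACL = A[i] @ Q[i] + B[i] @ Y[i]
        # Schur complement of (A_i + B_i K_i)^T P_j (A_i + B_i K_i) - P_i < 0
        # with Q_i = P_i^{-1} and Y_i = K_i Q_i.
        cons.append(cp.bmat([[Q[i], ACL.T], [ACL, Q[j]]]) >> 1e-6 * np.eye(2 * n))
cp.Problem(cp.Minimize(0), cons).solve()
K = [Y[i].value @ np.linalg.inv(Q[i].value) for i in range(M)]
print(K)  # gains stabilizing the closed loop under any switching signal
```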

6. Regularized Robust MDPs and Risk-Sensitive MDPs: Equivalence, Policy Gradient, and Sample Complexity

Authors: Runyu Zhang, Yang Hu, Na Li

Abstract: This paper focuses on reinforcement learning for the regularized robust Markov decision process (MDP) problem, an extension of the robust MDP framework. We first introduce the risk-sensitive MDP and establish the equivalence between risk-sensitive MDPs and regularized robust MDPs. This equivalence offers an alternative perspective on the regularized robust MDP and enables the design of efficient learning algorithms. Building on this equivalence, we derive the policy gradient theorem for the regularized robust MDP problem and prove global convergence of the exact policy gradient method in the tabular setting with direct parameterization. We also propose a sample-based offline learning algorithm, namely the robust fitted-Z iteration (RFZI), for a specific regularized robust MDP problem with a KL-divergence regularization term, and analyze the sample complexity of the algorithm. Our results are also supported by numerical simulations.
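
The equivalence the abstract invokes can be seen in one line of code: for a KL-regularized adversary, the robust Bellman backup has the closed form of a risk-sensitive (exponential-utility) expectation, by the Donsker-Varadhan duality. The sketch below runs value iteration with this backup on a toy random MDP; the MDP data and parameters are illustrative assumptions.

```python
import numpy as np

def kl_robust_backup(P0, R, V, gamma=0.9, lam=1.0):
    # One Bellman backup of a KL-regularized robust MDP. The adversary's
    # inner problem min_p E_p[V] + lam * KL(p || p0) has the closed form
    # -lam * log E_{p0}[exp(-V / lam)]  (Donsker-Varadhan), i.e. exactly a
    # risk-sensitive, exponential-utility expectation.
    W = -lam * np.log(np.einsum('sat,t->sa', P0, np.exp(-V / lam)))
    return (R + gamma * W).max(axis=1)   # greedy maximization over actions

# Toy random MDP; robust value iteration.
rng = np.random.default_rng(2)
S, A = 4, 2
P0 = rng.random((S, A, S)); P0 /= P0.sum(axis=2, keepdims=True)
R = rng.random((S, A))
V = np.zeros(S)
for _ in range(200):
    V = kl_robust_backup(P0, R, V)
print(V)  # robust values; lam -> infinity recovers the nominal MDP
```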

7. Last-Iterate Convergent Policy Gradient Primal-Dual Methods for Constrained MDPs

Authors: Dongsheng Ding, Chen-Yu Wei, Kaiqing Zhang, Alejandro Ribeiro

Abstract: We study the problem of computing an optimal policy of an infinite-horizon discounted constrained Markov decision process (constrained MDP). Despite the popularity of Lagrangian-based policy search methods in practice, the oscillation of policy iterates in these methods has not been fully understood, raising issues such as constraint violation and sensitivity to hyper-parameters. To fill this gap, we employ the Lagrangian method to cast a constrained MDP as a constrained saddle-point problem in which the max/min players correspond to primal/dual variables, respectively, and develop two single-time-scale policy-based primal-dual algorithms with non-asymptotic convergence of their policy iterates to an optimal constrained policy. Specifically, we first propose a regularized policy gradient primal-dual (RPG-PD) method that simultaneously updates the policy using an entropy-regularized policy gradient and the dual variable via quadratic-regularized gradient ascent. We prove that the policy primal-dual iterates of RPG-PD converge at a sublinear rate to a regularized saddle point, while the policy iterates converge sublinearly to an optimal constrained policy. We further instantiate RPG-PD in large state or action spaces by including function approximation in the policy parametrization, and establish similar sublinear last-iterate policy convergence. Second, we propose an optimistic policy gradient primal-dual (OPG-PD) method that employs the optimistic gradient method to update primal and dual variables simultaneously. We prove that the policy primal-dual iterates of OPG-PD converge at a linear rate to a saddle point that contains an optimal constrained policy. To the best of our knowledge, this is the first non-asymptotic last-iterate policy convergence result for single-time-scale algorithms in constrained MDPs.
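
A heavily simplified tabular rendering of the regularized primal-dual idea is sketched below: a softmax policy-gradient step on the entropy-regularized Lagrangian and a quadratic-regularized dual ascent step, taken simultaneously at a single time scale. The step rules, parameters, and toy constrained MDP are illustrative assumptions, not the paper's exact RPG-PD.

```python
import numpy as np

def eval_pi(P, r, pi, gamma):
    # Exact tabular evaluation of V^pi and Q^pi for reward r.
    S, _ = r.shape
    Ppi = np.einsum('sa,sat->st', pi, P)
    V = np.linalg.solve(np.eye(S) - gamma * Ppi, (pi * r).sum(axis=1))
    return V, r + gamma * np.einsum('sat,t->sa', P, V)

def occupancy(P, pi, rho, gamma):
    # Discounted state-occupancy measure d_rho^pi.
    S = P.shape[0]
    Ppi = np.einsum('sa,sat->st', pi, P)
    return (1 - gamma) * np.linalg.solve(np.eye(S) - gamma * Ppi.T, rho)

def rpgpd_sketch(P, r, u, b, rho, gamma=0.9, tau=0.01, xi=0.01,
                 eta=1.0, iters=2000):
    # Simplified single-time-scale regularized primal-dual iteration for
    # max_pi V_r(rho)  s.t.  V_u(rho) >= b: softmax policy gradient on the
    # entropy-regularized Lagrangian plus quadratic-regularized dual ascent.
    S, A = r.shape
    theta, lam = np.zeros((S, A)), 0.0
    for _ in range(iters):
        pi = np.exp(theta - theta.max(axis=1, keepdims=True))
        pi /= pi.sum(axis=1, keepdims=True)
        rL = r + lam * u - tau * np.log(pi)        # entropy-regularized reward
        V, Q = eval_pi(P, rL, pi, gamma)
        d = occupancy(P, pi, rho, gamma)
        theta += eta * d[:, None] * pi * (Q - V[:, None])   # softmax PG step
        Vu = eval_pi(P, u, pi, gamma)[0] @ rho
        lam = max(0.0, lam - eta * (Vu - b - xi * lam))     # regularized dual
    return pi, lam

rng = np.random.default_rng(3)
S, A = 4, 2
P = rng.random((S, A, S)); P /= P.sum(axis=2, keepdims=True)
r, u = rng.random((S, A)), rng.random((S, A))
rho = np.ones(S) / S
pi, lam = rpgpd_sketch(P, r, u, b=4.0, rho=rho)  # b: loose toy threshold
print(pi.round(3), round(lam, 3))
```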

8. Closed-form expressions for the pure time delay in terms of the input and output Laguerre spectra

Authors: Alexander Medvedev

Abstract: The pure time delay operator is considered in continuous and discrete time under the assumption that the input signal is square-integrable (square-summable). By making use of a discrete convolution operator with polynomial Markov parameters, a common framework for handling the continuous and discrete cases is established. Closed-form expressions for the delay value are derived in terms of the Laguerre spectra of the output and input signals. The expressions hold for any feasible value of the Laguerre parameter and can be utilized, e.g., for building time-delay estimators that allow for non-persistent input. A simulation example is provided to illustrate the principle of Laguerre-domain time-delay modeling and analysis.
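
The Laguerre spectra in question are inner products of a signal with the discrete Laguerre basis, which can be generated by a first-order low-pass filter followed by repeated all-pass sections. The sketch below computes the spectra of an input and its delayed copy; the closed-form delay expressions themselves are derived in the paper and are not reproduced here.

```python
import numpy as np
from scipy.signal import lfilter

def laguerre_basis(a, K, N):
    # Impulse responses of the first K discrete Laguerre filters with pole a
    # (|a| < 1): a low-pass front end sqrt(1 - a^2)/(1 - a z^-1) followed by
    # repeated all-pass sections (z^-1 - a)/(1 - a z^-1).
    delta = np.zeros(N); delta[0] = 1.0
    basis = np.empty((K, N))
    basis[0] = lfilter([np.sqrt(1.0 - a**2)], [1.0, -a], delta)
    for k in range(1, K):
        basis[k] = lfilter([-a, 1.0], [1.0, -a], basis[k - 1])
    return basis

def laguerre_spectrum(x, basis):
    # Laguerre coefficients c_k = <x, l_k> of a square-summable signal.
    return basis[:, :len(x)] @ x

rng = np.random.default_rng(4)
N, d = 512, 7
w = np.zeros(N); w[:64] = rng.standard_normal(64)   # non-persistent input
y = np.roll(w, d)                                   # output: pure delay of d
B = laguerre_basis(a=0.5, K=8, N=N)
cu, cy = laguerre_spectrum(w, B), laguerre_spectrum(y, B)
print(cu.round(3)); print(cy.round(3))
```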

9. Projection-Free Methods for Solving Nonconvex-Concave Saddle Point Problems

Authors: Morteza Boroun, Erfan Yazdandoost Hamedani, Afrooz Jalilzadeh

Abstract: In this paper, we investigate a class of constrained saddle point (SP) problems where the objective function is nonconvex-concave and smooth. This class of problems has wide applicability in machine learning, including robust multi-class classification and dictionary learning. Several projection-based primal-dual methods have been developed for tackling this problem; however, methods with projection-free oracles remain scarce. To address this gap, we propose efficient single-loop projection-free methods that rely on first-order information. In particular, using regularization and nested approximation techniques, we propose a primal-dual conditional gradient method that solely employs linear minimization oracles to handle constraints. Assuming that the constraint set in the maximization is strongly convex, our method achieves an $\epsilon$-stationary solution within $\mathcal{O}(\epsilon^{-6})$ iterations. When the projection onto the constraint set of the maximization is easy to compute, we propose a one-sided projection-free method that achieves an $\epsilon$-stationary solution within $\mathcal{O}(\epsilon^{-4})$ iterations. Moreover, we present improved iteration complexities of our methods under a strong concavity assumption. To the best of our knowledge, our proposed algorithms are among the first projection-free methods with convergence guarantees for solving nonconvex-concave SP problems.
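
The projection-free primitive underlying such methods is the linear minimization oracle (LMO) of conditional-gradient (Frank-Wolfe) schemes. The sketch below shows a plain Frank-Wolfe iteration over an l1-ball, where the LMO returns a signed vertex; the paper's primal-dual scheme for saddle points builds on such oracle calls, and the toy objective is an illustrative assumption.

```python
import numpy as np

def lmo_l1(g, radius=1.0):
    # Linear minimization oracle over the l1-ball:
    # argmin_{||s||_1 <= radius} <g, s> is a signed coordinate vertex.
    s = np.zeros_like(g)
    i = np.argmax(np.abs(g))
    s[i] = -radius * np.sign(g[i])
    return s

def frank_wolfe(grad_f, x0, lmo, iters=500):
    # Plain conditional-gradient (Frank-Wolfe) iteration: every step calls
    # only the LMO, never a projection.
    x = x0.copy()
    for t in range(iters):
        s = lmo(grad_f(x))
        x = x + 2.0 / (t + 2.0) * (s - x)    # standard step size 2/(t + 2)
    return x

# Toy smooth objective f(x) = 0.5 * ||x - z||^2 over the unit l1-ball.
z = np.array([0.8, -0.3, 0.1])
x = frank_wolfe(lambda x: x - z, np.zeros(3), lmo_l1)
print(x)  # approaches the l1-ball projection of z, about (0.73, -0.23, 0.03)
```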