arXiv daily: Optimization and Control (math.OC)

Thu, 14 Sep 2023

1.Chemotherapy planning and multi-appointment scheduling: formulations, heuristics and bounds

Authors:Giuliana Carello, Mauro Passacantando, Elena Tanfani

Abstract: The number of new cancer cases is expected to increase by about 50% in the next 20 years, and the need for chemotherapy treatments will increase accordingly. Chemotherapy treatments are usually performed in outpatient cancer centers where patients affected by different types of tumors are treated. The treatment delivery must be carefully planned to optimize the use of limited resources, such as drugs, medical and nursing staff, consultation and exam rooms, and chairs and beds for the drug infusion. Planning and scheduling chemotherapy treatments involve different problems at different decision levels. In this work, we focus on the patient chemotherapy multi-appointment planning and scheduling problem at an operational level, namely the problem of determining the day and starting time of the oncologist visit and drug infusion for a set of patients to be scheduled along a short-term planning horizon. We use a per-pathology paradigm, where the days of the week in which patients can be treated, depending on their pathology, are known. We consider different metrics and formulate the problem as a multi-objective optimization problem tackled by sequentially solving three problems in a lexicographic multi-objective fashion. The ultimate aim is to minimize the patient's discomfort. The problems turn out to be computationally challenging, thus we propose bounds and ad-hoc approaches, exploiting alternative problem formulations, decomposition, and $k$-opt search. The approaches are tested on real data from an Italian outpatient cancer center and outperform state-of-the-art solvers.

2.Bilinear control of semilinear elliptic PDEs: Convergence of a semismooth Newton method

2309.07554

Authors:Eduardo Casas, Konstantinos Chrysafinos, Mariano Mateos

Abstract: In this paper, we carry out the analysis of the semismooth Newton method for bilinear control problems related to semilinear elliptic PDEs. We prove existence, uniqueness and regularity for the solution of the state equation, as well as differentiability properties of the control to state mapping. Then, first and second order optimality conditions are obtained. Finally, we prove the superlinear convergence of the semismooth Newton method to local solutions satisfying no-gap second order sufficient optimality conditions as well as a strict complementarity condition.

3.Online Mixed Discrete and Continuous Optimization: Algorithms, Regret Analysis and Applications

2309.07630

Authors:Lintao Ye, Ming Chi, Zhi-Wei Liu, Xiaoling Wang, Vijay Gupta

Abstract: We study an online mixed discrete and continuous optimization problem where a decision maker interacts with an unknown environment for a number of $T$ rounds. At each round, the decision maker needs to first jointly choose a discrete and a continuous action and then receives a reward associated with the chosen actions. The goal for the decision maker is to maximize the accumulative reward after $T$ rounds. We propose algorithms to solve the online mixed discrete and continuous optimization problem and prove that the algorithms yield sublinear regret in $T$. We show that a wide range of applications in practice fit into the framework of the online mixed discrete and continuous optimization problem, and apply the proposed algorithms to solve these applications with regret guarantees. We validate our theoretical results with numerical experiments.

4.Tulipa Energy Model: Mathematical Formulation

2309.07711

Authors:Diego A. Tejada-Arango, Germán Morales-España, Lauren Clisby, Ni Wang, Abel S. Siqueira, Ali Subayu, Laurent Soucasse, Zhi Gao

Abstract: Tulipa Energy Model aims to optimise the investment and operation of the electricity market, considering its coupling with other sectors, such as hydrogen and heat, that can also be electrified. The problem is analysed from the perspective of a central planner who determines the expansion plan that is most beneficial for the system as a whole, either by maximising social welfare or by minimising total costs. The formulation provides a general description of the objective function and constraints in the optimisation model based on the concept of energy assets representing any element in the model. The model uses subsets and specific methods to determine the constraints that apply to a particular technology or network, allowing more flexibility in the code to consider new technologies and constraints with different levels of detail in the future.

5.Optimal inexactness schedules for Tunable Oracle based Methods

2309.07787

Authors:Guillaume Van Dessel, François Glineur

Abstract: Several recent works address the impact of inexact oracles in the convergence analysis of modern first-order optimization techniques, e.g. Bregman Proximal Gradient and Prox-Linear methods as well as their accelerated variants, extending their field of applicability. In this paper, we consider situations where the oracle's inexactness can be chosen upon demand, with more precision coming at a higher computational price. Our main motivations arise from oracles requiring the solving of auxiliary subproblems or the inexact computation of involved quantities, e.g. a mini-batch stochastic gradient as a full-gradient estimate. We propose optimal inexactness schedules according to presumed oracle cost models and patterns of worst-case guarantees, covering among others convergence results of the aforementioned methods in the presence of inexactness. Specifically, we detail how to choose the level of inexactness at each iteration to obtain the best trade-off between convergence and computational investments. Furthermore, we highlight the benefits one can expect by tuning those oracles' quality instead of keeping it constant throughout. Finally, we provide extensive numerical experiments that support the practical interest of our approach, both in offline and online settings, applied to the Fast Gradient algorithm.

6.Learning to Warm-Start Fixed-Point Optimization Algorithms

2309.07835

Authors:Rajiv Sambharya, Georgina Hall, Brandon Amos, Bartolomeo Stellato

Abstract: We introduce a machine-learning framework to warm-start fixed-point optimization algorithms. Our architecture consists of a neural network mapping problem parameters to warm starts, followed by a predefined number of fixed-point iterations. We propose two loss functions designed to either minimize the fixed-point residual or the distance to a ground truth solution. In this way, the neural network predicts warm starts with the end-to-end goal of minimizing the downstream loss. An important feature of our architecture is its flexibility, in that it can predict a warm start for fixed-point algorithms run for any number of steps, without being limited to the number of steps it has been trained on. We provide PAC-Bayes generalization bounds on unseen data for common classes of fixed-point operators: contractive, linearly convergent, and averaged. Applying this framework to well-known applications in control, statistics, and signal processing, we observe a significant reduction in the number of iterations and solution time required to solve these problems, through learned warm starts.
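
To illustrate why a good warm start matters for a fixed-point algorithm, here is a minimal sketch in plain Python. It is not the authors' architecture: the operator `T`, the data `D` and `B`, and the "warm start" (the true solution plus a small error, standing in for a neural-network prediction) are all hypothetical choices made for this example.

```python
def run_fixed_point(T, z0, tol=1e-8, max_iter=10_000):
    """Iterate z <- T(z) until the residual max_i |T(z)_i - z_i| falls below tol.

    Returns the final iterate and the number of applications of T.
    """
    z = list(z0)
    for k in range(1, max_iter + 1):
        z_next = T(z)
        residual = max(abs(a - b) for a, b in zip(z_next, z))
        z = z_next
        if residual < tol:
            return z, k
    return z, max_iter


# Hypothetical contractive operator: a gradient step on the separable quadratic
# 0.5 * sum(d_i * z_i**2) - sum(b_i * z_i), whose fixed point is z_i = b_i / d_i.
D = [1.0, 4.0, 10.0]
B = [1.0, 2.0, 3.0]
STEP = 0.1  # per-coordinate contraction factors are 1 - STEP * d_i


def T(z):
    return [zi - STEP * (di * zi - bi) for zi, di, bi in zip(z, D, B)]


z_star = [bi / di for bi, di in zip(B, D)]
cold_start = [0.0, 0.0, 0.0]
# Stand-in for a learned prediction: the true solution plus a small error.
warm_start = [zi + 0.01 for zi in z_star]

z_cold, cold_iters = run_fixed_point(T, cold_start)
z_warm, warm_iters = run_fixed_point(T, warm_start)
print("cold-start iterations:", cold_iters)
print("warm-start iterations:", warm_iters)
```

Because the operator is contractive, the residual shrinks geometrically from wherever the iteration starts, so a starting point closer to the fixed point reaches the tolerance in fewer steps. The residual used as the stopping test here is the same quantity that one of the two loss functions in the abstract is said to minimize.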

7.Mean-field games of speedy information access with observation costs

2309.07877

Authors:Dirk Becherer, Christoph Reisinger, Jonathan Tam

Abstract: We investigate a mean-field game (MFG) in which agents can exercise control actions that affect their speed of access to information. The agents can dynamically decide to receive observations with less delay by paying higher observation costs. Agents seek to exploit their active information gathering by making further decisions to influence their state dynamics to maximize rewards. In the mean field equilibrium, each generic agent solves individually a partially observed Markov decision problem in which the way partial observations are obtained is itself also subject to dynamic control actions by the agent. Based on a finite characterisation of the agents' belief states, we show how the mean field game with controlled costly information access can be formulated as an equivalent standard mean field game on a suitably augmented but finite state space. We prove that with sufficient entropy regularisation, a fixed point iteration converges to the unique MFG equilibrium and yields an approximate $\epsilon$-Nash equilibrium for a large but finite population size. We illustrate our MFG by an example from epidemiology, where medical testing results at different speeds and costs can be chosen by the agents.

8.Acceleration by Stepsize Hedging I: Multi-Step Descent and the Silver Stepsize Schedule

2309.07879

Authors:Jason M. Altschuler, Pablo A. Parrilo

Abstract: Can we accelerate convergence of gradient descent without changing the algorithm -- just by carefully choosing stepsizes? Surprisingly, we show that the answer is yes. Our proposed Silver Stepsize Schedule optimizes strongly convex functions in $k^{\log_{\rho} 2} \approx k^{0.7864}$ iterations, where $\rho=1+\sqrt{2}$ is the silver ratio and $k$ is the condition number. This is intermediate between the textbook unaccelerated rate $k$ and the accelerated rate $\sqrt{k}$ due to Nesterov in 1983. The non-strongly convex setting is conceptually identical, and standard black-box reductions imply an analogous accelerated rate $\varepsilon^{-\log_{\rho} 2} \approx \varepsilon^{-0.7864}$. We conjecture and provide partial evidence that these rates are optimal among all possible stepsize schedules. The Silver Stepsize Schedule is constructed recursively in a fully explicit way. It is non-monotonic, fractal-like, and approximately periodic of period $k^{\log_{\rho} 2}$. This leads to a phase transition in the convergence rate: initially super-exponential (acceleration regime), then exponential (saturation regime).
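
The exponent quoted in the abstract can be checked with a few lines of Python. This sketch only reproduces the arithmetic $\log_{\rho} 2$ with $\rho = 1+\sqrt{2}$ and compares the three scalings for a few condition numbers; constants and logarithmic factors are ignored, and the function name `iteration_scalings` is introduced here purely for illustration.

```python
import math

# Silver ratio and the exponent log_rho(2) from the abstract.
rho = 1 + math.sqrt(2)
exponent = math.log(2, rho)
print(f"log_rho(2) = {exponent:.4f}")


def iteration_scalings(cond):
    """Return (unaccelerated, silver, accelerated) scalings for condition number cond."""
    return cond, cond ** exponent, math.sqrt(cond)


for cond in (10, 100, 1000):
    plain, silver, accelerated = iteration_scalings(cond)
    print(f"cond={cond:5d}  k={plain:8.1f}  k^{exponent:.4f}={silver:8.1f}  sqrt(k)={accelerated:8.1f}")
```

For any condition number above 1 the silver scaling sits strictly between the unaccelerated and the accelerated one, which matches the "intermediate" claim in the abstract.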

Wed, 13 Sep 2023

1.Maximum Principle for Mean Field Type Control Problems with General Volatility Functions

2309.06736

Authors:Alain Bensoussan, Ziyu Huang, Sheung Chi Phillip Yam

Abstract: In this paper, we study the maximum principle of mean field type control problems when the volatility function depends on the state and its measure and also the control, by using our recently developed method. Our method is to embed the mean field type control problem into a Hilbert space to bypass the evolution in the Wasserstein space. We here give a necessary condition and a sufficient condition for these control problems in Hilbert spaces, and we also derive a system of forward-backward stochastic differential equations.

2.Nonlinear network identifiability: The static case

2309.06854

Authors:Renato Vizuete, Julien M. Hendrickx

Abstract: We analyze the problem of network identifiability with nonlinear functions associated with the edges. We consider a static model for the output of each node and by assuming a perfect identification of the function associated with the measurement of a node, we provide conditions for the identifiability of the edges in a specific class of functions. First, we analyze the identifiability conditions in the class of all nonlinear functions and show that even for a path graph, it is necessary to measure all the nodes except the source. Then, we consider analytic functions satisfying $f(0)=0$ and we provide conditions for the identifiability of paths and trees. Finally, by restricting the problem to a smaller class of functions where none of the functions is linear, we derive conditions for the identifiability of directed acyclic graphs. Some examples are presented to illustrate the results.

3.Barzilai-Borwein Descent Methods for Multiobjective Optimization Problems with Variable Trade-off Metrics

2309.06929

Authors:Jian Chen, Liping Tang, Xinmin Yang

Abstract: The imbalances and conditioning of the objective functions influence the performance of first-order methods for multiobjective optimization problems (MOPs). The latter is related to the metric selected in the direction-finding subproblems. Unlike single-objective optimization problems, capturing the curvature of all objective functions with a single Hessian matrix is impossible. On the other hand, second-order methods for MOPs use different metrics for objectives in direction-finding subproblems, leading to a high per-iteration cost. To balance per-iteration cost and better curvature exploration, we propose a Barzilai-Borwein descent method with variable metrics (BBDMO\_VM). In the direction-finding subproblems, we employ a variable metric to explore the curvature of all objectives. Subsequently, Barzilai-Borwein's method relative to the variable metric is applied to tune objectives, which mitigates the effect of imbalances. We investigate the convergence behaviour of the BBDMO\_VM, confirming fast linear convergence for well-conditioned problems relative to the variable metric. In particular, we establish linear convergence for problems that involve some linear objectives. These convergence results emphasize the importance of metric selection, motivating us to approximate the trade-off of Hessian matrices to better capture the geometry of the problem. Comparative numerical results confirm the efficiency of the proposed method, even when applied to large-scale and ill-conditioned problems.
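
For readers unfamiliar with the Barzilai-Borwein idea, the sketch below shows the classical single-objective BB stepsize $\alpha_k = s_k^\top s_k / s_k^\top y_k$, with $s_k = x_k - x_{k-1}$ and $y_k = g_k - g_{k-1}$, on an ill-conditioned quadratic. It is not the multiobjective BBDMO\_VM method of the paper; the test function, the initial stepsize `alpha0`, and all names are assumptions chosen for this example.

```python
def bb_gradient_descent(grad, x0, alpha0=1e-3, tol=1e-8, max_iter=500):
    """Gradient descent with the classical (long) Barzilai-Borwein stepsize.

    Returns the final iterate and the number of gradient steps taken.
    """
    x = list(x0)
    g = grad(x)
    alpha = alpha0  # the first step has no history, so use a fixed small stepsize
    for k in range(max_iter):
        if max(abs(gi) for gi in g) < tol:
            return x, k
        x_new = [xi - alpha * gi for xi, gi in zip(x, g)]
        g_new = grad(x_new)
        s = [a - b for a, b in zip(x_new, x)]  # change in iterate
        y = [a - b for a, b in zip(g_new, g)]  # change in gradient
        sy = sum(si * yi for si, yi in zip(s, y))
        if sy > 0:  # keep the previous stepsize if the curvature estimate is unusable
            alpha = sum(si * si for si in s) / sy
        x, g = x_new, g_new
    return x, max_iter


# Hypothetical ill-conditioned quadratic f(x) = 0.5 * (x_1**2 + 100 * x_2**2).
CURVATURES = [1.0, 100.0]


def grad(x):
    return [c * xi for c, xi in zip(CURVATURES, x)]


x_final, n_steps = bb_gradient_descent(grad, [1.0, 1.0])
print("steps:", n_steps, "final iterate:", x_final)
```

The BB stepsize is a scalar approximation of inverse curvature along the most recent step, which is why it adapts to conditioning without forming a Hessian. The resulting iteration is generally non-monotone in the objective value.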

4.On the Intelligent Proportional Controller Applied to Linear Systems

2309.06992

Authors:Mohamed Camil Belhadjoudja, Mohamed Maghenem, Emmanuel Witrant

Abstract: We analyze in this paper the effect of the well known intelligent proportional controller on the stability of linear control systems. Inspired by the literature on neutral time delay systems and advanced type systems, we derive sufficient conditions on the order of the control system, under which the controller used fails to achieve exponential stability. Furthermore, we obtain conditions, relating the system's and the control parameters, such that the closed-loop system is either unstable or not exponentially stable. After that, we provide cases where the intelligent proportional controller achieves exponential stability. The obtained results are illustrated via numerical simulations, and on an experimental benchmark that consists of an electronic throttle valve.

5.Dynamical convergence analysis for nonconvex linearized proximal ADMM algorithms

2309.07008

Authors:Jiahong Guo, Xiao Wang, Xiantao Xiao

Abstract: The convergence analysis of optimization algorithms using continuous-time dynamical systems has received much attention in recent years. In this paper, we investigate applications of these systems to analyze the convergence of linearized proximal ADMM algorithms for nonconvex composite optimization, whose objective function is the sum of a continuously differentiable function and a composition of a possibly nonconvex function with a linear operator. We first derive a first-order differential inclusion for the linearized proximal ADMM algorithm, LP-ADMM. Both the global convergence and the convergence rates of the generated trajectory are established using the Kurdyka-Łojasiewicz (KL) property. Then, a stochastic variant, LP-SADMM, is investigated for finite-sum nonconvex composite problems. Under mild conditions, we obtain the stochastic differential equation corresponding to LP-SADMM, and demonstrate the almost sure global convergence of the generated trajectory by leveraging the KL property. Based on the almost sure convergence of the trajectory, we construct a stochastic process that converges almost surely to an approximate critical point of the objective function, and derive the expected convergence rates associated with this stochastic process. Moreover, we propose an accelerated LP-SADMM that incorporates Nesterov's acceleration technique. The continuous-time dynamical system of this algorithm is modeled as a second-order stochastic differential equation. Within the context of the KL property, we explore the related almost sure convergence and expected convergence rates.

6.Absorbing Markov Decision Processes

2309.07059

Authors:François Dufour, Tomás Prieto-Rumeau

Abstract: In this paper, we study discrete-time absorbing Markov Decision Processes (MDP) with measurable state space and Borel action space with a given initial distribution. For such models, solutions to the characteristic equation that are not occupation measures may exist. Several necessary and sufficient conditions are provided to guarantee that any solution to the characteristic equation is an occupation measure. Under the so-called continuity-compactness conditions, it is shown that the set of occupation measures is compact in the weak-strong topology if and only if the model is uniformly absorbing. Finally, it is shown that the occupation measures are characterized by the characteristic equation and an additional condition. Several examples are provided to illustrate our results.

7.Complexity analysis of regularization methods for implicitly constrained least squares

2309.07086

Authors:Akwum Onwunta, Clément W. Royer

Abstract: Optimization problems constrained by partial differential equations (PDEs) naturally arise in scientific computing, as those constraints often model physical systems or the simulation thereof. In an implicitly constrained approach, the constraints are incorporated into the objective through a reduced formulation. To this end, a numerical procedure is typically applied to solve the constraint system, and efficient numerical routines with quantifiable cost have long been developed. Meanwhile, the field of complexity in optimization, that estimates the cost of an optimization algorithm, has received significant attention in the literature, with most of the focus being on unconstrained or explicitly constrained problems. In this paper, we analyze an algorithmic framework based on quadratic regularization for implicitly constrained nonlinear least squares. By leveraging adjoint formulations, we can quantify the worst-case cost of our method to reach an approximate stationary point of the optimization problem. Our definition of such points exploits the least-squares structure of the objective, leading to an efficient implementation. Numerical experiments conducted on PDE-constrained optimization problems demonstrate the efficiency of the proposed framework.

8.Optimal adaptive control with separable drift uncertainty

2309.07091

Authors:Samuel N. Cohen, Christoph Knochenhauer, Alexander Merkel

Abstract: We consider a problem of stochastic optimal control with separable drift uncertainty in strong formulation on a finite horizon. The drift coefficient of the state $Y^{u}$ is multiplicatively influenced by an unknown random variable $\lambda$, while admissible controls $u$ are required to be adapted to the observation filtration. Choosing a control actively influences the state and information acquisition simultaneously and comes with a learning effect. The problem, initially non-Markovian, is embedded into a higher-dimensional Markovian, full information control problem with control-dependent filtration and noise. To that problem, we apply the stochastic Perron method to characterize the value function as the unique viscosity solution to the HJB equation, explicitly construct $\varepsilon$-optimal controls and show that the values of strong and weak formulations agree. Numerical illustrations show a significant difference between the adaptive control and the certainty equivalence control.

Tue, 12 Sep 2023

1.Relating Electric Vehicle Charging to Speed Scaling with Job-Specific Speed Limits

2309.06174

Authors:Leoni Winschermann, Marco E. T. Gerards, Antonios Antoniadis, Gerwin Hoogsteen, Johann Hurink

Abstract: Due to the ongoing electrification of transport in combination with limited power grid capacities, efficient ways to schedule electric vehicles (EVs) are needed for intraday operation of, for example, large parking lots. Common approaches like model predictive control repeatedly solve a corresponding offline problem. In this work, we present and analyze the Flow-based Offline Charging Scheduler (FOCS), an offline algorithm to derive an optimal EV charging schedule for a fleet of EVs that minimizes an increasing, convex and differentiable function of the corresponding aggregated power profile. To this end, we relate EV charging to mathematical speed scaling models with job-specific speed limits. We prove our algorithm to be optimal. Furthermore, we derive necessary and sufficient conditions for any EV charging profile to be optimal.

2.Inexact Decentralized Dual Gradient Tracking for Constraint-Coupled Optimization

2309.06330

Authors:Jingwang Li, Housheng Su

Abstract: We propose an inexact decentralized dual gradient tracking method (iDDGT) for distributed optimization problems with a globally coupled equality constraint. Unlike existing algorithms that rely on either the exact dual gradient or an inexact one obtained through single-step gradient descent, iDDGT introduces a new approach: utilizing an inexact dual gradient with controllable levels of inexactness. Numerical experiments demonstrate that iDDGT achieves significantly higher computational efficiency compared to state-of-the-art methods. Furthermore, it is proved that iDDGT can achieve linear convergence over directed graphs without imposing any conditions on the constraint matrix. This expands its applicability beyond existing algorithms that require the constraint matrix to have full row rank and undirected graphs for achieving linear convergence.

3.Stochastic Bridges over Ensemble of Linear Systems

2309.06350

Authors:Daniel Owusu Adu, Yongxin Chen

Abstract: We consider particles that are conditioned to initial and final states. The trajectory of these particles is uniquely shaped by the intricate interplay of internal and external sources of randomness. The internal randomness is aptly modelled through a parameter varying over a deterministic set, thereby giving rise to an ensemble of systems. Concurrently, the external randomness is introduced through the inclusion of white noise. Within this context, our primary objective is to effectively generate the stochastic bridge through the optimization of a random differential equation. As a deviation from the literature, we show that the optimal control mechanism, pivotal in the generation of the bridge, does not conform to the typical Markov strategy. Instead, it adopts a non-Markovian strategy, which can be more precisely classified as a stochastic feedforward control input. This unexpected divergence from the established strategies underscores the complex interrelationships present in the dynamics of the system under consideration.

4.Symmetric Stair Preconditioning of Linear Systems for Parallel Trajectory Optimization

2309.06427

Authors:Xueyi Bu, Brian Plancher

Abstract: There has been a growing interest in parallel strategies for solving trajectory optimization problems. One key step in many algorithmic approaches to trajectory optimization is the solution of moderately-large and sparse linear systems. Iterative methods are particularly well-suited for parallel solves of such systems. However, fast and stable convergence of iterative methods is reliant on the application of a high-quality preconditioner that reduces the spread and increases the clustering of the eigenvalues of the target matrix. To improve the performance of these approaches, we present a new parallel-friendly symmetric stair preconditioner. We prove that our preconditioner has advantageous theoretical properties when used in conjunction with iterative methods for trajectory optimization, such as a more clustered eigenvalue spectrum. Numerical experiments with typical trajectory optimization problems reveal that, compared to the best alternative parallel preconditioner from the literature, our symmetric stair preconditioner provides up to a 34% reduction in condition number and up to a 25% reduction in the number of resulting linear system solver iterations.

Mon, 11 Sep 2023

1.Optimization Method Based On Optimal Control

2309.05280

Authors:Yeming Xu, Ziyuan Guo, Hongxia Wang, Huanshui Zhang

Abstract: In this paper, we focus on a method based on optimal control to address the optimization problem. The objective is to find the optimal solution that minimizes the objective function. We transform the optimization problem into optimal control by designing an appropriate cost function. Using Pontryagin's Maximum Principle and the associated forward-backward difference equations (FBDEs), we derive the iterative update gain for the optimization. The steady system state can be considered as the solution to the optimization problem. Finally, we discuss the compelling characteristics of our method and further demonstrate its high precision, low oscillation, and applicability for finding different local minima of non-convex functions through several simulation examples.

2.Simba: A Scalable Bilevel Preconditioned Gradient Method for Fast Evasion of Flat Areas and Saddle Points

2309.05309

Authors:Nick Tsipinakis, Panos Parpas

Abstract: The convergence behaviour of first-order methods can be severely slowed down when applied to high-dimensional non-convex functions due to the presence of saddle points. If, additionally, the saddles are surrounded by large plateaus, it is highly likely that the first-order methods will converge to sub-optimal solutions. In machine learning applications, sub-optimal solutions mean poor generalization performance. They are also related to the issue of hyper-parameter tuning, since, in the pursuit of solutions that yield lower errors, a tremendous amount of time is required for selecting the hyper-parameters appropriately. A natural way to tackle the limitations of first-order methods is to employ the Hessian information. However, methods that incorporate the Hessian do not scale or, if they do, they are very slow for modern applications. Here, we propose Simba, a scalable preconditioned gradient method, to address the main limitations of the first-order methods. The method is very simple to implement. It maintains a single precondition matrix that is constructed as the outer product of the moving average of the gradients. To significantly reduce the computational cost of forming and inverting the preconditioner, we draw links with the multilevel optimization methods. These links enable us to construct preconditioners in a randomized manner. Our numerical experiments verify the scalability of Simba as well as its efficacy near saddles and flat areas. Further, we demonstrate that Simba offers a satisfactory generalization performance on standard benchmark residual networks. We also analyze Simba and show its linear convergence rate for strongly convex functions.

3.Computing Wasserstein Barycenter via operator splitting: the method of averaged marginals

2309.05315

Authors:D. Mimouni (IFPEN), P. Malisani (IFPEN), J. Zhu (IFPEN), W. de Oliveira (CMA)

Abstract: The Wasserstein barycenter (WB) is an important tool for summarizing sets of probabilities. It finds applications in applied probability, clustering, image processing, etc. When the probability supports are finite and fixed, the problem of computing a WB is formulated as a linear optimization problem whose dimensions generally exceed standard solvers' capabilities. For this reason, the WB problem is often replaced with a simpler nonlinear optimization model constructed via an entropic regularization function so that specialized algorithms can be employed to compute an approximate WB efficiently. Contrary to such a widespread inexact scheme, we propose an exact approach based on the Douglas-Rachford splitting method applied directly to the WB linear optimization problem for applications requiring accurate WB. Our algorithm, which has the interesting interpretation of being built upon averaging marginals, performs a series of simple (and exact) projections that can be parallelized and even randomized, making it suitable for large-scale datasets. As a result, our method achieves good performance in terms of speed while still attaining accuracy. Furthermore, the same algorithm can be applied to compute generalized barycenters of sets of measures with different total masses by allowing for mass creation and destruction upon setting an additional parameter. Our contribution to the field lies in the development of an exact and efficient algorithm for computing barycenters, enabling its wider use in practical applications. The approach's mathematical properties are examined, and the method is benchmarked against the state-of-the-art methods on several data sets from the literature.

4.Dynamic Pricing in an Energy Community Providing Capacity Limitation Services

2309.05363

Authors:Bennevis Crowley, Jalal Kazempour, Lesia Mitridati

Abstract: This paper proposes a mathematical framework for dynamic pricing in an energy community to enable the provision of capacity limitation services to the distribution grid. In this framework, the energy community complies with a time-variant limit on its maximum power import from the distribution grid in exchange for grid tariff discounts. A bi-level optimization model is developed to implicitly coordinate the energy usage of prosumers within the community. In the upper-level problem, the community manager minimizes the total operational cost of the community based on reduced grid tariffs and power capacity limits by setting time-variant and prosumer-specific prices. In the lower-level problem, each prosumer subsequently adjusts their energy usage over a day to minimize their individual operational cost. This framework allows the community manager to maintain central economic market properties such as budget balance and individual rationality for prosumers. We show how the community benefits can be allocated to prosumers either in an equal or a proportional manner. The proposed model is eventually reformulated into a mixed integer second-order cone program and thereafter applied to a distribution grid case study.

5.Convergence analysis of the semismooth Newton method for sparse control problems governed by semilinear elliptic equations

2309.05393

Authors:Eduardo Casas, Mariano Mateos

Abstract: We show that a second order sufficient condition for local optimality, along with a strict complementarity condition, is enough to get the super-linear convergence of the semismooth Newton method for an optimal control problem governed by a semilinear elliptic equation. The objective functional may include a sparsity promoting term and we allow for box control constraints. We also obtain quadratic convergence under quite natural assumptions on the data of the control problem.

6.Turnpike and dissipativity in generalized discrete-time stochastic linear-quadratic optimal control

2309.05422

Authors:Jonas Schießl, Ruchuan Ou, Timm Faulwasser, Michael Heinrich Baumann, Lars Grüne

Abstract: We investigate different turnpike phenomena of generalized discrete-time stochastic linear-quadratic optimal control problems. Our analysis is based on a novel strict dissipativity notion for such problems, in which a stationary stochastic process replaces the optimal steady state of the deterministic setting. We show that from this time-varying dissipativity notion, we can conclude turnpike behaviors concerning different objects like distributions, moments, or sample paths of the stochastic system and that the distributions of the stationary pair can be characterized by a stationary optimization problem. The analytical findings are illustrated by numerical simulations.

7.Algorithms for DC Programming via Polyhedral Approximations of Convex Functions

2309.05487

Authors:Fahaar Mansoor Pirani, Firdevs Ulus

Abstract: There is an existing exact algorithm that solves DC programming problems if one component of the DC function is polyhedral convex (Loehne, Wagner, 2017). Motivated by this, first, we consider two cutting-plane algorithms for generating an $\epsilon$-polyhedral underestimator of a convex function g. The algorithms start with a polyhedral underestimator of g and the epigraph of the current underestimator is intersected with either a single halfspace (Algorithm 1) or with possibly multiple halfspaces (Algorithm 2) in each iteration to obtain a better approximation. We prove the correctness and finiteness of both algorithms, establish the convergence rate of Algorithm 1, and show that after obtaining an $\epsilon$-polyhedral underestimator of the first component of a DC function, the algorithm from (Loehne, Wagner, 2017) can be applied to compute an $\epsilon$-solution of the DC programming problem without further computational effort. We then propose an algorithm (Algorithm 3) for solving DC programming problems by iteratively generating a (not necessarily $\epsilon$-) polyhedral underestimator of g. We prove that Algorithm 3 stops after finitely many iterations and returns an $\epsilon$-solution to the DC programming problem. Moreover, the sequence $\{x_k\}_{k\geq 0}$ produced by Algorithm 3 converges to a global minimizer of the DC problem when $\epsilon$ is set to zero. Computational results based on some test instances from the literature are provided.
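
The single-cut idea can be illustrated with a minimal one-dimensional sketch. This is not the authors' Algorithm 1: it assumes a differentiable convex g on an interval, uses tangent lines as cuts, and the names `tangent_cuts` and `underestimate` are hypothetical. On each segment between adjacent tangent points the gap between g and the current underestimator is largest where the two tangents intersect, so one new cut is added at the worst such breakpoint per iteration.

```python
def tangent_cuts(g, dg, lo, hi, eps, max_iter=1000):
    """Return tangent points whose tangent lines underestimate g within eps on [lo, hi]."""
    pts = [lo, hi]  # start with tangents at both endpoints
    for _ in range(max_iter):
        worst_gap, worst_x = 0.0, None
        for a, b in zip(pts, pts[1:]):
            sa, sb = dg(a), dg(b)
            if sa == sb:
                continue  # g is affine between a and b, so the gap is zero
            # Intersection of the tangent lines at a and b.
            x = (g(b) - g(a) + sa * a - sb * b) / (sa - sb)
            gap = g(x) - (g(a) + sa * (x - a))
            if gap > worst_gap:
                worst_gap, worst_x = gap, x
        if worst_gap <= eps:
            return pts
        pts.append(worst_x)  # one new cut (halfspace) per iteration
        pts.sort()
    raise RuntimeError("eps not reached within max_iter iterations")


def underestimate(g, dg, pts, x):
    """Evaluate the polyhedral underestimator: the maximum of the tangent lines."""
    return max(g(p) + dg(p) * (x - p) for p in pts)


# Example: g(x) = x^2 on [-1, 1] with tolerance 0.01.
cuts = tangent_cuts(lambda x: x * x, lambda x: 2 * x, -1.0, 1.0, 0.01)
```

Each added tangent intersects the epigraph of the current underestimator with one more halfspace, which is the mechanism the abstract describes; the breakpoint selection rule here is an assumption made for the sketch.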

8.Energy-optimal Timetable Design for Sustainable Metro Railway Networks

2309.05489

Authors:Shuvomoy Das Gupta, Bart P. G. Van Parys, J. Kevin Tobin

Abstract: We present our collaboration with Thales Canada Inc, the largest provider of communication-based train control (CBTC) systems worldwide. We study the problem of designing energy-optimal timetables in metro railway networks to minimize the effective energy consumption of the network, which corresponds to simultaneously minimizing total energy consumed by all the trains and maximizing the transfer of regenerative braking energy from suitable braking trains to accelerating trains. We propose a novel data-driven linear programming model that minimizes the total effective energy consumption in a metro railway network, capable of computing the optimal timetable in real-time, even for some of the largest CBTC systems in the world. In contrast with existing works, which are either NP-hard or involve multiple stages requiring extensive simulation, our model is a single linear programming model capable of computing the energy-optimal timetable subject to the constraints present in the railway network. Furthermore, our model can predict the total energy consumption of the network without requiring time-consuming simulations, making it suitable for widespread use in managerial settings. We apply our model to Shanghai Railway Network's Metro Line 8 -- one of the largest and busiest railway services in the world -- and empirically demonstrate that our model computes energy-optimal timetables for thousands of active trains spanning an entire service period of one day in real-time (solution time less than one second on a standard desktop), achieving energy savings between approximately 20.93% and 28.68%. Given the compelling advantages, our model is in the process of being integrated into Thales Canada Inc's industrial timetable compiler.

9.Safe Adaptive Control of Hyperbolic PDE-ODE Cascades

2309.05596

Authors:Ji Wang, Miroslav Krstic

Abstract: Adaptive safe control employing conventional continuous infinite-time adaptation requires that the initial conditions be restricted to a subset of the safe set due to parametric uncertainty, where the safe set is shrunk in inverse proportion to the adaptation gain. The recent regulation-triggered adaptive control approach with batch least-squares identification (BaLSI, pronounced "ballsy") completes perfect parameter identification in finite time and offers a previously unforeseen advantage in adaptive safe control, which we elucidate in this paper. Since the true challenge of safe control is exhibited for CBF of a high relative degree, we undertake a safe BaLSI design in this paper for a class of systems that possess a particularly extreme relative degree: ODE-PDE-ODE sandwich systems. Such sandwich systems arise in various applications, including delivery UAV with a cable-suspended load. Collision avoidance of the payload with the surrounding environment is required. The considered class of plants is $2\times2$ hyperbolic PDEs sandwiched by a strict-feedback nonlinear ODE and a linear ODE, where the unknown coefficients, whose bounds are known and arbitrary, are associated with the PDE in-domain coupling terms that can cause instability and with the input signal of the distal ODE. This is the first safe adaptive control design for PDEs, where we introduce the concept of PDE CBF whose non-negativity as well as the ODE CBF's non-negativity are ensured with a backstepping-based safety filter. Our safe adaptive controller is explicit and operates in the entire original safe set.

10.A distributionally robust index tracking model with the CVaR penalty: tractable reformulation

2309.05597

Authors:Ruyu Wang, Yaozhong Hu, Chao Zhang

Abstract: We propose a distributionally robust index tracking model with the conditional value-at-risk (CVaR) penalty. The model combines the idea of distributionally robust optimization for data uncertainty and the CVaR penalty to avoid large tracking errors. The probability ambiguity is described through a confidence region based on the first-order and second-order moments of the random vector involved. We reformulate the model in the form of a min-max-min optimization into an equivalent nonsmooth minimization problem. We further give an approximate discretization scheme of the possible continuous random vector of the nonsmooth minimization problem, whose objective function involves the maximum of numerous but finite nonsmooth functions. The convergence of the discretization scheme to the equivalent nonsmooth reformulation is shown under mild conditions. A smoothing projected gradient (SPG) method is employed to solve the discretization scheme. Any accumulation point is shown to be a global minimizer of the discretization scheme. Numerical results on the NASDAQ index dataset from January 2008 to July 2023 demonstrate the effectiveness of our proposed model and the efficiency of the SPG method, compared with several state-of-the-art models and corresponding methods for solving them.
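
For readers unfamiliar with the CVaR penalty, the following minimal sketch computes the empirical CVaR of a sample of losses (for instance, tracking errors) through the Rockafellar-Uryasev formula CVaR_alpha(L) = min_t { t + E[(L - t)_+] / (1 - alpha) }. It only illustrates the risk measure, not the paper's distributionally robust model; the function name `empirical_cvar` is hypothetical, and equal sample weights with 0 < alpha < 1 are assumed.

```python
def empirical_cvar(losses, alpha):
    """Empirical CVaR at level alpha (0 < alpha < 1) of equally weighted losses."""
    n = len(losses)
    best = float("inf")
    # The objective is convex and piecewise linear in t with kinks at the
    # sample values, so it is enough to evaluate it at each sample.
    for t in losses:
        excess = sum(max(loss - t, 0.0) for loss in losses)
        best = min(best, t + excess / ((1.0 - alpha) * n))
    return best


# With four equally likely losses and alpha = 0.5, CVaR is the mean of the
# two largest losses.
print(empirical_cvar([1.0, 2.0, 3.0, 4.0], 0.5))  # → 3.5
```

The quadratic-time loop keeps the sketch short; sorting the sample and averaging the upper tail would be the usual choice for large samples.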

11.An exact algorithm for linear optimization problem subject to max-product fuzzy relational inequalities with fuzzy constraints

2309.05624

Authors:Amin Ghodousian, Romina Omidi

Abstract: Fuzzy relational inequalities with fuzzy constraints (FRI-FC) are the generalized form of fuzzy relational inequalities (FRI) in which fuzzy inequality replaces ordinary inequality in the constraints. Fuzzy constraints enable us to attain optimal points (called super-optima) that are better solutions than those resulting from the resolution of similar problems with ordinary inequality constraints. This paper considers the linear objective function optimization with respect to max-product FRI-FC problems. It is proved that there is a set of optimization problems equivalent to the primal problem. Based on the algebraic structure of the primal problem and its equivalent forms, some simplification operations are presented to convert the main problem into a simpler one. Finally, by some appropriate mathematical manipulations, the main problem is transformed into an optimization model whose constraints are linear. The proposed linearization method not only provides a super-optimum (that is, a better solution than ordinary feasible optimal solutions) but also finds the best super-optimum for the main problem. The current approach is compared with our previous work and some well-known heuristic algorithms by applying them to random test problems of different sizes.

Fri, 08 Sep 2023digest

1.Optimal strategies for mosquitoes replacement techniques: influence of the carrying capacity on spatial releases

2309.04192

Authors:Luis Almeida, Jesús Bellver Arnau, Gwenaël Peltier, Nicolas Vauchelet

Abstract: This work is devoted to the mathematical study of an optimization problem regarding control strategies of mosquito population in a heterogeneous environment. Mosquitoes are well known to be vectors of diseases, but, in some cases, they have a reduced vector capacity when carrying the endosymbiotic bacterium Wolbachia. We consider a mathematical model of a replacement strategy, consisting in rearing and releasing Wolbachia-infected mosquitoes to replace the wild population. We investigate the question of optimizing the release protocol to have the most effective replacement when the environment is heterogeneous. In other words we focus on the question: where to release, given an inhomogeneous environment, in order to maximize the replacement across the domain. To do so, we consider a simple scalar model in which we assume that the carrying capacity is space dependent. Then, we investigate the existence of an optimal release profile and prove some interesting properties. In particular, neglecting the mobility of mosquitoes and under some assumptions on the biological parameters, we characterize the optimal releasing strategy for a short time horizon, and provide a way to reduce to a one-dimensional optimization problem the case of a long time horizon. Our theoretical results are illustrated with several numerical simulations.

2.A hybrid physics-informed neural network based multiscale solver as a partial differential equation constrained optimization problem

2309.04439

Authors:Michael Hintermüller, Denis Korolev

Abstract: In this work, we study physics-informed neural networks (PINNs) constrained by partial differential equations (PDEs) and their application in approximating multiscale PDEs. From a continuous perspective, our formulation corresponds to a non-standard PDE-constrained optimization problem with a PINN-type objective. From a discrete standpoint, the formulation represents a hybrid numerical solver that utilizes both neural networks and finite elements. We propose a function space framework for the problem and develop an algorithm for its numerical solution, combining an adjoint-based technique from optimal control with automatic differentiation. The multiscale solver is applied to a heat transfer problem with oscillating coefficients, where the neural network approximates a fine-scale problem, and a coarse-scale problem constrains the learning process. We show that incorporating coarse-scale information into the neural network training process through our modelling framework acts as a preconditioner for the low-frequency component of the fine-scale PDE, resulting in improved convergence properties and accuracy of the PINN method. The relevance and potential applications of the hybrid solver to computational homogenization and material science are discussed.

Tue, 05 Sep 2023digest

1.Local properties and augmented Lagrangians in fully nonconvex composite optimization

2309.01980

Authors:Alberto De Marchi, Patrick Mehlitz

Abstract: A broad class of optimization problems can be cast in composite form, that is, considering the minimization of the composition of a lower semicontinuous function with a differentiable mapping. This paper discusses the versatile template of composite optimization without any convexity assumptions. First- and second-order optimality conditions are discussed, advancing the variational analysis of compositions. We highlight the difficulties that stem from the lack of convexity when dealing with necessary conditions in a Lagrangian framework and when considering error bounds. Building upon these characterizations, a local convergence analysis is delineated for a recently developed augmented Lagrangian method, deriving rates of convergence in the fully nonconvex setting.

2.An optimal control approach for the treatment of hepatitis C patients

2309.01993

Authors:Anh-Tuan Nguyen, Hien Tran

Abstract: In this article, we study the feasibility of using optimal control theory to develop control-theoretic methods for personalized treatment of HCV patients. The mathematical model for HCV progression includes compartments for healthy hepatocytes, infected hepatocytes, infectious virions and noninfectious virions. Methodologies from optimal control theory are used to design and synthesize an open-loop control-based treatment regimen for HCV dynamics.

3.PROMISE: Preconditioned Stochastic Optimization Methods by Incorporating Scalable Curvature Estimates

2309.02014

Authors:Zachary Frangella, Pratik Rathore, Shipu Zhao, Madeleine Udell

Abstract: This paper introduces PROMISE ($\textbf{Pr}$econditioned Stochastic $\textbf{O}$ptimization $\textbf{M}$ethods by $\textbf{I}$ncorporating $\textbf{S}$calable Curvature $\textbf{E}$stimates), a suite of sketching-based preconditioned stochastic gradient algorithms for solving large-scale convex optimization problems arising in machine learning. PROMISE includes preconditioned versions of SVRG, SAGA, and Katyusha; each algorithm comes with a strong theoretical analysis and effective default hyperparameter values. In contrast, traditional stochastic gradient methods require careful hyperparameter tuning to succeed, and degrade in the presence of ill-conditioning, a ubiquitous phenomenon in machine learning. Empirically, we verify the superiority of the proposed algorithms by showing that, using default hyperparameter values, they outperform or match popular tuned stochastic gradient optimizers on a test bed of $51$ ridge and logistic regression problems assembled from benchmark machine learning repositories. On the theoretical side, this paper introduces the notion of quadratic regularity in order to establish linear convergence of all proposed methods even when the preconditioner is updated infrequently. The speed of linear convergence is determined by the quadratic regularity ratio, which often provides a tighter bound on the convergence rate compared to the condition number, both in theory and in practice, and explains the fast global linear convergence of the proposed methods.

4.A Prescriptive Trilevel Equilibrium Model for Optimal Emissions Pricing and Sustainable Energy Systems Development

2309.02032

Authors:Olli Herrala, Steven A. Gabriel, Fabricio Oliveira, Tommi Ekholm

Abstract: We explore the class of trilevel equilibrium problems with a focus on energy-environmental applications. In particular, we apply this trilevel framework to a power market model, exploring the possibilities of an international policymaker in reducing emissions of the system. We present two alternative solution methods for such problems and a comparison of the resulting model sizes. The first method is based on a reformulation of the bottom-level solution set, and the second one uses strong duality. The first approach results in optimality conditions that are both necessary and sufficient, while the second one results in a model with fewer constraints but only sufficient optimality conditions. Using the proposed methods, we are able to obtain globally optimal solutions for a realistic five-node case study representing the Nordic countries and assess the impact of a carbon tax on the electricity production portfolio.

5.Backward error analysis and the qualitative behaviour of stochastic optimization algorithms: Application to stochastic coordinate descent

2309.02082

Authors:Stefano Di Giovacchino, Desmond J. Higham, Konstantinos Zygalakis

Abstract: Stochastic optimization methods have been hugely successful in making large-scale optimization problems feasible when computing the full gradient is computationally prohibitive. Using the theory of modified equations for numerical integrators, we propose a class of stochastic differential equations that approximate the dynamics of general stochastic optimization methods more closely than the original gradient flow. Analyzing a modified stochastic differential equation can reveal qualitative insights about the associated optimization method. Here, we study mean-square stability of the modified equation in the case of stochastic coordinate descent.

6.Finite dimensional backstepping controller design

2309.02196

Authors:Varga Kalantarov, Türker Özsarı, Kemal Cem Yılmaz

Abstract: We introduce a finite dimensional version of backstepping controller design for stabilizing solutions of PDEs from boundary. Our controller uses only a finite number of Fourier modes of the state of solution, as opposed to the classical backstepping controller which uses all (infinitely many) modes. We apply our method to the reaction-diffusion equation, which serves only as a canonical example but the method is applicable also to other PDEs whose solutions can be decomposed into a slow finite-dimensional part and a fast tail, where the former dominates the evolution in large time. One of the main goals is to estimate the sufficient number of modes needed to stabilize the plant at a prescribed rate. In addition, we find the minimal number of modes that guarantee the stabilization at a certain (unprescribed) decay rate. Theoretical findings are supported with numerical solutions.

7.Lifting functionals defined on maps to measure-valued maps via optimal transport

2309.02260

Authors:Hugo Lavenant

Abstract: How can one lift a functional defined on maps from a space X to a space Y into a functional defined on maps from X into P(Y) the space of probability distributions over Y? Looking at measure-valued maps can be interpreted as knowing a classical map with uncertainty, and from an optimization point of view the main gain is the convexification of Y into P(Y). We will explain why trying to single out the largest convex lifting amounts to solve an optimal transport problem with an infinity of marginals which can be interesting by itself. Moreover we will show that, to recover previously proposed liftings for functionals depending on the Jacobian of the map, one needs to add a restriction of additivity to the lifted functional.

8.An Efficient Semi-Real-Time Algorithm for Path Planning in the Hamilton-Jacobi Formulation

2309.02357

Authors:Christian Parkinson, Kyle Polage

Abstract: We present a semi-real-time algorithm for minimal-time optimal path planning based on optimal control theory, dynamic programming, and Hamilton-Jacobi (HJ) equations. Partial differential equation (PDE) based optimal path planning methods are well-established in the literature, and provide an interpretable alternative to black-box machine learning algorithms. However, due to the computational burden of grid-based PDE solvers, many previous methods do not scale well to high dimensional problems and are not applicable in real-time scenarios even for low dimensional problems. We present a semi-real-time algorithm for optimal path planning in the HJ formulation, using grid-free numerical methods based on Hopf-Lax formulas. In doing so, we retain the interpretability of PDE based path planning, but because the numerical method is grid-free, it is efficient and does not suffer from the curse of dimensionality, and thus can be applied in semi-real-time and account for realistic concerns like obstacle discovery. This represents a significant step in averting the tradeoff between interpretability and efficiency. We present the algorithm with application to synthetic examples of isotropic motion planning in two dimensions, though with slight adjustments, it could be applied to many other problems.

9.First and zeroth-order implementations of the regularized Newton method with lazy approximated Hessians

2309.02412

Authors:Nikita Doikov, Geovani Nunes Grapiglia

Abstract: In this work, we develop first-order (Hessian-free) and zeroth-order (derivative-free) implementations of the Cubically regularized Newton method for solving general non-convex optimization problems. For that, we employ finite difference approximations of the derivatives. We use a special adaptive search procedure in our algorithms, which simultaneously fits both the regularization constant and the parameters of the finite difference approximations. It makes our schemes free from the need to know the actual Lipschitz constants. Additionally, we equip our algorithms with the lazy Hessian update that reuses a previously computed Hessian approximation matrix for several iterations. Specifically, we prove the global complexity bound of $\mathcal{O}( n^{1/2} \epsilon^{-3/2})$ function and gradient evaluations for our new Hessian-free method, and a bound of $\mathcal{O}( n^{3/2} \epsilon^{-3/2} )$ function evaluations for the derivative-free method, where $n$ is the dimension of the problem and $\epsilon$ is the desired accuracy for the gradient norm. These complexity bounds significantly improve the previously known ones in terms of the joint dependence on $n$ and $\epsilon$, for the first-order and zeroth-order non-convex optimization.
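
The finite-difference building blocks mentioned in the abstract can be sketched as follows. This is a generic illustration under stated assumptions, not the authors' scheme: the step size `h` is fixed here, whereas the paper fits it adaptively, and the names `fd_gradient` and `fd_hessian` are hypothetical. The gradient is approximated from n + 1 function values (the derivative-free setting) and the Hessian from n + 1 gradient evaluations (the Hessian-free setting).

```python
def fd_gradient(f, x, h=1e-6):
    """Forward-difference gradient of f at x using n + 1 function evaluations."""
    fx = f(x)
    grad = []
    for i in range(len(x)):
        xp = list(x)
        xp[i] += h
        grad.append((f(xp) - fx) / h)
    return grad


def fd_hessian(grad, x, h=1e-5):
    """Forward-difference Hessian at x using n + 1 gradient evaluations."""
    n = len(x)
    g0 = grad(x)
    cols = []
    for j in range(n):
        xp = list(x)
        xp[j] += h
        gj = grad(xp)
        cols.append([(gj[i] - g0[i]) / h for i in range(n)])
    # cols[j][i] approximates d^2 f / dx_i dx_j; symmetrize the result.
    return [[0.5 * (cols[j][i] + cols[i][j]) for j in range(n)] for i in range(n)]


# Example on f(x) = x0^2 + 3 x0 x1 + 2 x1^2 at the point (1, 2).
f = lambda x: x[0] ** 2 + 3 * x[0] * x[1] + 2 * x[1] ** 2
df = lambda x: [2 * x[0] + 3 * x[1], 3 * x[0] + 4 * x[1]]
g_approx = fd_gradient(f, [1.0, 2.0])   # close to [8, 11]
H_approx = fd_hessian(df, [1.0, 2.0])   # close to [[2, 3], [3, 4]]
```

A lazy Hessian update in this setting would simply keep `H_approx` for several iterations instead of recomputing it at every step.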

10.Crack propagation in anisotropic brittle materials: from a phase-field model to a shape optimization approach

2309.02431

Authors:Tim Suchan, Chaitanya Kandekar, Wolfgang E. Weber, Kathrin Welker

Abstract: The phase-field method is based on the energy minimization principle which is a geometric method for modeling diffusive cracks that are popularly implemented with irreversibility based on Griffith's criterion. This method requires a length-scale parameter that smooths the sharp discontinuity, which influences the diffuse band and results in mesh-sensitive fracture propagation results. Recently, a novel approach based on the optimization on Riemannian shape spaces has been proposed, where the crack path is realized by techniques from shape optimization. This approach requires the shape derivative, which is derived in a continuous sense and used for a gradient-based algorithm to minimize the energy of the system. Due to the continuous derivation of the shape derivative, this approach yields mesh-independent results. In this paper, the novel approach based on shape optimization is presented, followed by an assessment of the predicted crack path in anisotropic brittle material using numerical calculations from a phase-field model.

Fri, 01 Sep 2023digest

1.Model Predictive Control using MATLAB

2309.00293

Authors:Midhun T. Augustine

Abstract: This tutorial consists of a brief introduction to the modern control approach called model predictive control (MPC) and its numerical implementation using MATLAB. We discuss the basic concepts and numerical implementation of the two major classes of MPC: Linear MPC (LMPC) and Nonlinear MPC (NMPC). This includes the various aspects of MPC such as formulating the optimization problem, constraints handling, feasibility, stability, and optimality.

2.Convergence Analysis of the Best Response Algorithm for Time-Varying Games

2309.00307

Authors:Zifan Wang, Yi Shen, Michael M. Zavlanos, Karl H. Johansson

Abstract: This paper studies a class of strongly monotone games involving non-cooperative agents that optimize their own time-varying cost functions. We assume that the agents can observe other agents' historical actions and choose actions that best respond to other agents' previous actions; we call this a best response scheme. We start by analyzing the convergence rate of this best response scheme for standard time-invariant games. Specifically, we provide a sufficient condition on the strong monotonicity parameter of the time-invariant games under which the proposed best response algorithm achieves exponential convergence to the static Nash equilibrium. We further illustrate that this best response algorithm may oscillate when the proposed sufficient condition fails to hold, which indicates that this condition is tight. Next, we analyze this best response algorithm for time-varying games where the cost functions of each agent change over time. Under similar conditions as for time-invariant games, we show that the proposed best response algorithm stays asymptotically close to the evolving equilibrium. We do so by analyzing both the equilibrium tracking error and the dynamic regret. Numerical experiments on economic market problems are presented to validate our analysis.

3.Urban Logistics in Amsterdam: A Modal Shift from Roadways to Waterway

2309.00345

Authors:Nadia Pourmohammad-Zia, Mark van Koningsveld

Abstract: The efficiency of urban logistics is vital for economic prosperity and quality of life in cities. However, rapid urbanization poses significant challenges, such as congestion, emissions, and strained infrastructure. This paper addresses these challenges by proposing an optimal urban logistic network that integrates urban waterways and last-mile delivery in Amsterdam. The study highlights the untapped potential of inland waterways in addressing logistical challenges in the city center. The problem is formulated as a two-echelon location routing problem with time windows, and a hybrid solution approach is developed to solve it effectively. The proposed algorithm consistently outperforms existing approaches, demonstrating its effectiveness in solving existing benchmarks and newly developed instances. Through a comprehensive case study, the advantages of implementing a waterway-based distribution chain are assessed, revealing substantial cost savings (approximately 28%) and reductions in vehicle weight (about 43%) and travel distances (roughly 80%) within the city center. The incorporation of electric vehicles further contributes to environmental sustainability. Sensitivity analysis underscores the importance of managing transshipment location establishment costs as a key strategy for cost efficiencies and reducing reliance on delivery vehicles and road traffic congestion. This study provides valuable insights and practical guidance for managers seeking to enhance operational efficiency, reduce costs, and promote sustainable transportation practices. Further analysis is warranted to fully evaluate the feasibility and potential benefits, considering infrastructural limitations and canal characteristics.

4.Enhancing PGA Tour Performance: Leveraging ShotlinkTM Data for Optimization and Prediction

2309.00485

Authors:Matthieu Guillot, Gautier Stauffer

Abstract: In this study, we demonstrate how data from the PGA Tour, combined with stochastic shortest path models (MDPs), can be employed to refine the strategies of professional golfers and predict future performances. We present a comprehensive methodology for this objective, proving its computational feasibility. This sets the stage for more in-depth exploration into leveraging data available to professionals and amateurs for strategic optimization and forecasting performance in golf. For the replicability of our results, and to adapt and extend the methodology and prototype solution, we provide access to all our codes and analyses (R and C++).

5.Directional Tykhonov well-posedness for optimization problems and variational inequalities

2309.00515

Authors:Vo Ke Hoang, Vo Si Trong Long

Abstract: By using the so-called minimal time function, we propose and study a novel notion of directional Tykhonov well-posedness for optimization problems, which is an extension of the widely acknowledged notion of Tykhonov. In this way, we first provide some characterizations of this notion in terms of the diameter of level sets and admissible functions. Then, we investigate relationships between the level sets and admissible functions mentioned above. Finally, we apply the technology developed before to study directional Tykhonov well-posedness for variational inequalities. Several examples are presented as well to illustrate the applicability of our results.

6.Integral Quadratic Constraints with Infinite-Dimensional Channels

2309.00516

Authors:Aleksandr Talitckii, Peter Seiler, Matthew M. Peet

Abstract: Modern control theory provides us with a spectrum of methods for studying the interconnection of dynamic systems using input-output properties of the interconnected subsystems. Perhaps the most advanced framework for such input-output analysis is the use of Integral Quadratic Constraints (IQCs), which considers the interconnection of a nominal linear system with an unmodelled nonlinear or uncertain subsystem with known input-output properties. Although these methods are widely used for Ordinary Differential Equations (ODEs), there have been fewer attempts to extend IQCs to infinite-dimensional systems. In this paper, we present an IQC-based framework for Partial Differential Equations (PDEs) and Delay Differential Equations (DDEs). First, we introduce infinite-dimensional signal spaces, operators, and feedback interconnections. Next, in the main result, we propose a formulation of hard IQC-based input-output stability conditions, allowing for infinite-dimensional multipliers. We then show how to test hard IQC conditions with infinite-dimensional multipliers on a nominal linear PDE or DDE system via the Partial Integral Equation (PIE) state-space representation using a sufficient version of the Kalman-Yakubovich-Popov lemma (KYP). The results are then illustrated using four example problems with uncertainty and nonlinearity.

7.Online Distributed Learning over Random Networks

2309.00520

Authors:Nicola Bastianello, Diego Deplano, Mauro Franceschelli, Karl H. Johansson

Abstract: The recent deployment of multi-agent systems in a wide range of scenarios has enabled the solution of learning problems in a distributed fashion. In this context, agents are tasked with collecting local data and then cooperatively training a model, without directly sharing the data. While distributed learning offers the advantage of preserving agents' privacy, it also poses several challenges in terms of designing and analyzing suitable algorithms. This work focuses specifically on the following challenges motivated by practical implementation: (i) online learning, where the local data change over time; (ii) asynchronous agent computations; (iii) unreliable and limited communications; and (iv) inexact local computations. To tackle these challenges, we introduce the Distributed Operator Theoretical (DOT) version of the Alternating Direction Method of Multipliers (ADMM), which we call the DOT-ADMM Algorithm. We prove that it converges with a linear rate for a large class of convex learning problems (e.g., linear and logistic regression problems) toward a bounded neighborhood of the optimal time-varying solution, and characterize how the neighborhood depends on (i)--(iv). We corroborate the theoretical analysis with numerical simulations comparing the DOT-ADMM Algorithm with other state-of-the-art algorithms, showing that only the proposed algorithm exhibits robustness to (i)--(iv).

Thu, 31 Aug 2023digest

1.Optimal Stopping of BSDEs with Constrained Jumps and Related Zero-Sum Games

2308.16504

Authors:Magnus Perninge

Abstract: In this paper, we introduce a non-linear Snell envelope which at each time represents the maximal value that can be achieved by stopping a BSDE with constrained jumps. We establish the existence of the Snell envelope by employing a penalization technique and the primary challenge we encounter is demonstrating the regularity of the limit for the scheme. Additionally, we relate the Snell envelope to a finite horizon, zero-sum stochastic differential game, where one player controls a path-dependent stochastic system by invoking impulses, while the opponent is given the opportunity to stop the game prematurely. Importantly, by developing new techniques within the realm of control randomization, we demonstrate that the value of the game exists and is precisely characterized by our non-linear Snell envelope.

2.Interior point methods in optimal control problems of affine systems: Convergence results and solving algorithms

2308.16554

Authors:Paul Malisani (IFPEN)

Abstract: This paper presents an interior point method for pure-state and mixed-constrained optimal control problems in which the dynamics, mixed constraints, and cost function are all affine in the control variable. This method relies on solving a sequence of two-point boundary value problems of differential and algebraic equations. This paper establishes a convergence result for primal and dual variables of the optimal control problem. A primal and a primal-dual solving algorithm are presented, and a challenging numerical example is treated for illustration. Accepted for publication at SIAM SICON 2023.

3.Investigating Sparse Reconfigurable Intelligent Surfaces (SRIS) via Maximum Power Transfer Efficiency Method Based on Convex Relaxation

2308.16658

Authors:Hans-Dieter Lang, Michel A. Nyffenegger, Heinz Mathis, Xingqi Zhang

Abstract: Reconfigurable intelligent surfaces (RISs) are widely considered to become an integral part of future wireless communication systems. Various methodologies exist to design such surfaces; however, most consider or require a very large number of tunable components. This not only raises system complexity, but also significantly increases power consumption. Sparse RISs (SRISs) consider using a smaller or even minimal number of tunable components to improve overall efficiency while maintaining sufficient RIS capability. The versatile semidefinite relaxation-based optimization method previously applied to transmit array antennas is adapted and applied accordingly, to evaluate the potential of different SRIS configurations. Because the relaxation is tight in all cases, the maximum possible performance is found reliably. Hence, with this approach, the trade-off between performance and sparseness of SRIS can be analyzed. Preliminary results show that even a much smaller number of reconfigurable elements, e.g. only 50%, can still have a significant impact.

4.On solving a rank regularized minimization problem via equivalent factorized column-sparse regularized models

2308.16690

Authors:Wenjing Li, Wei Bian, Kim-Chuan Toh

Abstract: The rank regularized minimization problem is an ideal model for the low-rank matrix completion/recovery problem. The matrix factorization approach can transform the high-dimensional rank regularized problem into a low-dimensional factorized column-sparse regularized problem. The latter can greatly facilitate fast computations in applicable algorithms, but needs to overcome the simultaneous non-convexity of the loss and regularization functions. In this paper, we consider the factorized column-sparse regularized model. Firstly, we optimize this model with bound constraints, and establish a certain equivalence between the optimized factorization problem and the rank regularized problem. Further, we strengthen the optimality condition for stationary points of the factorization problem and define the notion of strong stationary point. Moreover, we establish the equivalence between the factorization problem and its nonconvex relaxation in the sense of global minimizers and strong stationary points. To solve the factorization problem, we design two types of algorithms and give an adaptive method to reduce their computational cost. The first algorithm is from the relaxation point of view, and its iterates possess some properties of global minimizers of the factorization problem after finitely many iterations. We give some analysis on the convergence of its iterates to the strong stationary point. The second algorithm is designed for directly solving the factorization problem. We improve the PALM algorithm introduced by Bolte et al. (Math Program Ser A 146:459-494, 2014) for the factorization problem and give its improved convergence results. Finally, we conduct numerical experiments to show the promising performance of the proposed model and algorithms for low-rank matrix completion.

5.An Efficient Framework for Global Non-Convex Polynomial Optimization over the Hypercube

2308.16731

Authors:Pierre-David Letourneau, Dalton Jones, Matthew Morse, M. Harper Langston

Abstract: We present a novel efficient theoretical and numerical framework for solving global non-convex polynomial optimization problems. We analytically demonstrate that such problems can be efficiently reformulated using a non-linear objective over a convex set; further, these reformulated problems possess no spurious local minima (i.e., every local minimum is a global minimum). We introduce an algorithm for solving these resulting problems using the augmented Lagrangian and the method of Burer and Monteiro. We show through numerical experiments that polynomial scaling in dimension and degree is achievable for computing the optimal value and location of previously intractable global polynomial optimization problems in high dimension.

6.Moreau Envelope ADMM for Decentralized Weakly Convex Optimization

2308.16752

Authors:Reza Mirzaeifard, Naveen K. D. Venkategowda, Alexander Jung, Stefan Werner

Abstract: This paper proposes a proximal variant of the alternating direction method of multipliers (ADMM) for distributed optimization. Although current versions of the ADMM algorithm provide promising numerical results in producing solutions that are close to optimal for many convex and non-convex optimization problems, it remains unclear whether they can converge to a stationary point for weakly convex and locally non-smooth functions. Through our analysis using the Moreau envelope function, we demonstrate that MADM can indeed converge to a stationary point under mild conditions. Our analysis also includes computing bounds on the amount of change in the dual variable update step by relating the gradient of the Moreau envelope function to the proximal function. Furthermore, the results of our numerical experiments indicate that our method is faster and more robust than widely used approaches.
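
For orientation, the standard textbook definitions behind the "gradient of the Moreau envelope" relation are sketched below. They are general facts about weakly convex functions, not a restatement of the paper's analysis; the symbols $\lambda$ (envelope parameter) and $\rho$ (weak-convexity modulus) are notation assumed here.

```latex
% Moreau envelope and proximal map of f with parameter \lambda > 0
e_{\lambda} f(x) = \min_{y} \Big\{ f(y) + \tfrac{1}{2\lambda}\,\|y - x\|^{2} \Big\},
\qquad
\operatorname{prox}_{\lambda f}(x) = \operatorname*{arg\,min}_{y} \Big\{ f(y) + \tfrac{1}{2\lambda}\,\|y - x\|^{2} \Big\}.

% If f is \rho-weakly convex and 0 < \lambda < 1/\rho, the proximal map is
% single-valued and the envelope is continuously differentiable with
\nabla e_{\lambda} f(x) = \frac{1}{\lambda}\,\big( x - \operatorname{prox}_{\lambda f}(x) \big).
```

A small value of $\|\nabla e_{\lambda} f(x)\|$ therefore means that $x$ is close to its proximal point, which is the usual near-stationarity measure for weakly convex, non-smooth problems.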

7.A Divide and Conquer Approximation Algorithm for Partitioning Rectangles

2308.16899

Authors:Reyhaneh Mohammadi, Mehdi Behroozi

Abstract: Given a rectangle $R$ with area $A$ and a set of areas $L=\{A_1,...,A_n\}$ with $\sum_{i=1}^n A_i = A$, we consider the problem of partitioning $R$ into $n$ sub-regions $R_1,...,R_n$ with areas $A_1,...,A_n$ in a way that the total perimeter of all sub-regions is minimized. The goal is to create square-like sub-regions, which are often more desired. We propose a divide and conquer algorithm for this problem that finds factor $1.2$--approximate solutions in $\mathcal{O}(n\log n)$ time.

Wed, 30 Aug 2023digest

1.Variational Analysis of Kurdyka-Lojasiewicz Property by Way of Outer Limiting Subgradients

2308.15760

Authors:Minghua Li, Kaiwen Meng, Xiaoqi Yang

Abstract: In this paper, for a function $f$ locally lower semicontinuous at a stationary point $\bar{x}$, we obtain complete characterizations of the Kurdyka-{\L}ojasiewicz (for short, K{\L}) property and the exact estimate of the K{\L} modulus via the outer limiting subdifferential of an auxiliary function, and obtain a sufficient condition for verifying sharpness of the K{\L} exponent. By introducing a $\frac{1}{1-\theta}$-th subderivative $h$ for $f$ at $\bar{x}$, we show that the K{\L} property of $f$ at $\bar{x}$ with exponent $\theta\in [0, 1)$ can be inherited by $h$ at $0$ with the same exponent $\theta$, and that the K{\L} modulus of $f$ at $\bar{x}$ is bounded above by that of $(1-\theta)h$ at $0$. When $\theta=\frac12$, we obtain the reverse results under the strong metric subregularity of the subgradient mapping for the class of prox-regular, twice epi-differentiable and subdifferentially continuous functions by virtue of Moreau envelopes. We apply the obtained results to establish the K{\L} property with exponent $\frac12$ and to provide calculations of the K{\L} modulus for smooth functions, the pointwise max of finitely many smooth functions and the $\ell_p$ ($0<p\leq 1$) regularized functions respectively. It is worth noting that these functions often appear in structured optimization problems.

2.A Note on Linear Quadratic Regulator and Kalman Filter

2308.15798

Authors:Midhun T. Augustine

Abstract: Two central problems in modern control theory are the controller design problem, which deals with designing a control law for the dynamical system, and the state estimation problem (observer design problem), which deals with computing an estimate of the states of the dynamical system. The Linear Quadratic Regulator (LQR) and Kalman Filter (KF) solve these problems, respectively, for linear dynamical systems in an optimal manner, i.e., LQR is an optimal state feedback controller and KF is an optimal state estimator. In this note, we discuss the basic concepts, derivation, steady-state analysis, and numerical implementation of the LQR and KF.
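
As an illustration of what a steady-state LQR computation can look like, here is a minimal sketch. It assumes the discrete-time system $x_{k+1} = A x_k + B u_k$ with cost $\sum_k (x_k^\top Q x_k + u_k^\top R u_k)$ and the feedback $u_k = -K x_k$. The function name `dlqr`, the double-integrator matrices, and the fixed-point Riccati iteration are choices made for this example, not taken from the note itself.

```python
import numpy as np

def dlqr(A, B, Q, R, iters=10000, tol=1e-10):
    """Steady-state discrete-time LQR gain via fixed-point iteration on the
    Riccati equation P = Q + A'P(A - BK), with K = (R + B'PB)^{-1} B'PA."""
    P = Q.copy()
    for _ in range(iters):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P_next = Q + A.T @ P @ (A - B @ K)
        converged = np.max(np.abs(P_next - P)) < tol
        P = P_next
        if converged:
            break
    # Recompute the gain from the final P so that K and P are consistent.
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    return K, P

# Hypothetical example: double integrator sampled with step 0.1.
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.005], [0.1]])
Q = np.eye(2)
R = np.array([[1.0]])

K, P = dlqr(A, B, Q, R)
# Spectral radius of the closed loop A - BK; below 1 means the loop is stable.
closed_loop_radius = max(abs(np.linalg.eigvals(A - B @ K)))
```

By duality, the same iteration applied to $(A^\top, C^\top)$ with the process and measurement noise covariances in place of $Q$ and $R$ gives the steady-state Kalman filter covariance. For production use, a dedicated Riccati solver is usually preferable to this plain iteration.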

3.Design of Coherent Passive Quantum Equalizers Using Robust Control Theory

2308.15805

Authors:V. Ugrinovskii, M. R. James

Abstract: The paper develops a methodology for the design of coherent equalizing filters for quantum communication channels. Given a linear quantum system model of a quantum communication channel, the aim is to obtain another quantum system which, when coupled with the original system, mitigates degrading effects of the environment. The main result of the paper is a systematic equalizer synthesis algorithm which relies on methods of state-space robust control design via semidefinite programming.

4.Riemannian Optimistic Algorithms

2308.16004

Authors:Xi Wang, Deming Yuan, Yiguang Hong, Zihao Hu, Lei Wang, Guodong Shi

Abstract: In this paper, we consider Riemannian online convex optimization with dynamic regret. First, we propose two novel algorithms, namely the Riemannian Online Optimistic Gradient Descent (R-OOGD) and the Riemannian Adaptive Online Optimistic Gradient Descent (R-AOOGD), which combine the advantages of classical optimistic algorithms with the rich geometric properties of Riemannian manifolds. We analyze the dynamic regrets of the R-OOGD and R-AOOGD in terms of regularity of the sequence of cost functions and comparators. Next, we apply the R-OOGD to Riemannian zero-sum games, leading to the Riemannian Optimistic Gradient Descent Ascent algorithm (R-OGDA). We analyze the average iterate and best iterate of the R-OGDA in seeking a Nash equilibrium for two-player, zero-sum, g-convex-concave games. We also prove the last-iterate convergence of the R-OGDA for g-strongly convex-strongly concave problems. Our theoretical analysis shows that all proposed algorithms achieve results in regret and convergence that match their counterparts in Euclidean spaces. Finally, we conduct several experiments to verify our theoretical findings.

5.Quasioptimal alternating projections and their use in low-rank approximation of matrices and tensors

2308.16097

Authors:Stanislav Budzinskiy

Abstract: We study the convergence of specific inexact alternating projections for two non-convex sets in a Euclidean space. The $\sigma$-quasioptimal metric projection ($\sigma \geq 1$) of a point $x$ onto a set $A$ consists of points in $A$ the distance to which is at most $\sigma$ times larger than the minimal distance $\mathrm{dist}(x,A)$. We prove that quasioptimal alternating projections, when one or both projections are quasioptimal, converge locally and linearly under the usual regularity assumptions on the two sets and their intersection. The theory is motivated by the successful application of alternating projections to low-rank matrix and tensor approximation. We focus on two problems -- nonnegative low-rank approximation and low-rank approximation in the maximum norm -- and develop fast alternating-projection algorithms for matrices and tensor trains based on cross approximation and acceleration techniques. The numerical experiments confirm that the proposed methods are efficient and suggest that they can be used to regularise various low-rank computational routines.
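
To make the alternating-projection idea concrete, here is a minimal sketch for nonnegative low-rank approximation. It uses exact projections: a truncated SVD onto the set of matrices of rank at most $r$, and entrywise clipping onto the nonnegative matrices. This corresponds to $\sigma = 1$. It does not reproduce the quasioptimal, cross-approximation-based projections or the acceleration techniques mentioned in the abstract, and the function names and test matrix are assumptions of this example.

```python
import numpy as np

def project_rank(X, r):
    """Best rank-r approximation in the Frobenius norm (truncated SVD)."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U[:, :r] * s[:r]) @ Vt[:r]

def project_nonneg(X):
    """Exact projection onto the set of entrywise nonnegative matrices."""
    return np.maximum(X, 0.0)

def alternating_projections(X0, r, iters=200):
    """Alternate between the nonnegative set and the rank-r set,
    returning a matrix of rank at most r."""
    Y = project_rank(X0, r)
    for _ in range(iters):
        Y = project_rank(project_nonneg(Y), r)
    return Y

rng = np.random.default_rng(0)
# Hypothetical test matrix: nonnegative with many zero entries.
X = np.maximum(rng.standard_normal((30, 20)), 0.0)

Y0 = project_rank(X, 3)                      # plain truncated SVD
Y = alternating_projections(X, r=3)

# Frobenius distance to the nonnegative set, before and after.
neg_before = np.linalg.norm(np.minimum(Y0, 0.0))
neg_after = np.linalg.norm(np.minimum(Y, 0.0))
```

With exact projections, the distance from the low-rank iterate to the nonnegative set cannot increase from one sweep to the next, so `neg_after` is no larger than `neg_before`.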

6.The Bus Rapid Transit Investment Problem

2308.16104

Authors:Rowan Hoogervorst, Evelien van der Hurk, Philine Schiewe, Anita Schöbel, Reena Urban

Abstract: Bus Rapid Transit (BRT) systems can provide a fast and reliable service to passengers at low investment costs compared to tram, metro and train systems. Therefore, they can be of great value to attract more passengers to use public transport. This paper thus focuses on the BRT investment problem: Which segments of a single bus line should be upgraded such that the number of newly attracted passengers is maximized? Motivated by the construction of a new BRT line around Copenhagen, we consider a setting in which multiple parties are responsible for different segments of the line. As each party has a limited willingness to invest, we solve a bi-objective problem to quantify the trade-off between the number of attracted passengers and the investment budget. We model different problem variations: First, we consider two potential passenger responses to upgrades on the line. Second, to prevent scattered upgrades along the line, we consider different restrictions on the number of upgraded connected components on the line. We propose an epsilon-constraint-based algorithm to enumerate the complete set of non-dominated points and investigate the complexity of this problem. Moreover, we perform extensive numerical experiments on artificial instances and a case study based on the BRT line around Copenhagen. Our results show that we can generate the full Pareto front for real-life instances and that the resulting trade-off between investment budget and attracted passengers depends both on the origin-destination demand and on the passenger response to upgrades. Moreover, we illustrate how the generated Pareto plots can assist decision makers in selecting from a set of geographical route alternatives in our case study.
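
The epsilon-constraint idea can be sketched on a toy instance as follows. The sketch assumes that each segment has an integer upgrade cost and attracts a fixed, additive number of passengers, which is far simpler than the passenger responses modeled in the paper. The inner problem is solved by brute force, and all names and numbers are hypothetical.

```python
from itertools import combinations

def best_under_budget(costs, gains, budget):
    """Maximize total gain subject to total cost <= budget (brute force).
    Ties in gain are broken in favor of the lower cost."""
    best = (0, 0)  # (gain, cost) of upgrading nothing
    n = len(costs)
    for k in range(1, n + 1):
        for subset in combinations(range(n), k):
            c = sum(costs[i] for i in subset)
            if c > budget:
                continue
            g = sum(gains[i] for i in subset)
            if g > best[0] or (g == best[0] and c < best[1]):
                best = (g, c)
    return best

def epsilon_constraint(costs, gains):
    """Enumerate all non-dominated (cost, gain) points by repeatedly
    tightening the budget just below the last cost found."""
    points = []
    budget = sum(costs)
    while budget >= 0:
        g, c = best_under_budget(costs, gains, budget)
        points.append((c, g))
        budget = c - 1  # valid because costs are integers
    return points[::-1]  # sorted by increasing cost

# Hypothetical instance: four segments.
costs = [3, 4, 2, 5]
gains = [40, 50, 15, 55]
front = epsilon_constraint(costs, gains)
```

Each pass yields one non-dominated point, so the number of inner solves equals the size of the Pareto front. In a realistic setting the brute-force search would be replaced by an integer program.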

Tue, 29 Aug 2023digest

1.A Geometric Algorithm for Maximizing the Distance over an Intersection of Balls to a Given Point

2308.15054

Authors:Marius Costandin, Beniamin Costandin

Abstract: In this paper the authors propose a polynomial algorithm which allows the computation of the farthest point in an intersection of balls to a given point under three additional hypotheses: the farthest point is unique, the distance to it is known and its magnitude is known. As a use case the authors analyze the subset sum problem SSP(S,T) for a given $S\in \mathbb{R}^n$ and $T \in \mathbb{R}$. The proposed approach is to write the SSP as a distance maximization over an intersection of balls. It is shown that the SSP has a solution if and only if the maximum value of the distance has a predefined value. This, together with the fact that a solution is a corner of the unit hypercube, allows the authors to apply the proposed geometry results to find a solution to the SSP under the hypothesis that it is unique.

2.Frequency-domain criterion on the stabilizability for infinite-dimensional linear control systems

2308.15082

Authors:Karl Kunisch, Gengsheng Wang, Huaiqiang Yu

Abstract: A quantitative frequency-domain condition related to the exponential stabilizability for infinite-dimensional linear control systems is presented. It is proven that this condition is necessary and sufficient for the stabilizability of special systems, while it is a necessary condition for the stabilizability in general. Applications are provided.

3.The Agricultural Spraying Vehicle Routing Problem With Splittable Edge Demands

2308.15108

Authors:Qian Wan, Rodolfo García-Flores, Simon A. Bowly, Philip Kilby, Andreas T. Ernst

Abstract: In horticulture, spraying applications occur multiple times throughout any crop year. This paper presents a splittable agricultural chemical spraying vehicle routing problem and formulates it as a mixed integer linear program. The main difference from the classical capacitated arc routing problem (CARP) is that our problem allows us to split the demand on a single demand edge amongst robotic sprayers. We use theoretical insights about the optimal solution structure to improve the formulation and provide two different formulations of the splittable capacitated arc routing problem (SCARP): a basic spray formulation and a large-edge-demands formulation for problems with large edge demands. This study presents solution methods consisting of lazy constraints, symmetry elimination constraints, and a heuristic repair method. Computational experiments on a set of valuable data based on the properties of real-world agricultural orchard fields reveal that the proposed methods can solve the SCARP with different properties. We also report computational results on classical benchmark sets from previous CARP literature. The results indicate that the SCARP model can provide cheaper solutions in some instances when compared with the classical CARP literature. In addition, the heuristic repair method significantly improves the quality of the solution by decreasing the upper bound when solving large-scale problems.

4.Limited memory gradient methods for unconstrained optimization

2308.15145

Authors:Giulia Ferrandi, Michiel E. Hochstenbach

Abstract: The limited memory steepest descent method (Fletcher, 2012) for unconstrained optimization problems stores a few past gradients to compute multiple stepsizes at once. We review this method and propose new variants. For strictly convex quadratic objective functions, we study the numerical behavior of different techniques to compute new stepsizes. In particular, we introduce a method to improve the use of harmonic Ritz values. We also show the existence of a secant condition associated with LMSD, where the approximating Hessian is projected onto a low-dimensional space. In the general nonlinear case, we propose two new alternatives to Fletcher's method: first, the addition of symmetry constraints to the secant condition valid for the quadratic case; second, a perturbation of the last differences between consecutive gradients, to satisfy multiple secant equations simultaneously. We show that Fletcher's method can also be interpreted from this viewpoint.

5.Uniform Turnpike Property and Singular Limits

2308.15257

Authors:Martin Hernandez, Enrique Zuazua

Abstract: Motivated by singular limits for long-time optimal control problems, we investigate a class of parameter-dependent parabolic equations. First, we prove a turnpike result, uniform with respect to the parameters within a suitable regularity class and under appropriate bounds. The main ingredient of our proof is the justification of the uniform exponential stabilization of the corresponding Riccati equations, which is derived from the uniform null control properties of the model. Then, we focus on a heat equation with rapidly oscillating coefficients. In the one-dimensional setting, we obtain a uniform turnpike property with respect to the highly oscillatory heterogeneous medium. Afterward, we establish the homogenization of the turnpike property. Finally, our results are validated by numerical experiments.

6.Energy Space Newton Differentiability for Solution Maps of Unilateral and Bilateral Obstacle Problems

2308.15289

Authors:Constantin Christof, Gerd Wachsmuth

Abstract: We prove that the solution operator of the classical unilateral obstacle problem on a nonempty open bounded set $\Omega \subset \mathbb{R}^d$, $d \in \mathbb{N}$, is Newton differentiable as a function from $L^p(\Omega)$ to $H_0^1(\Omega)$ whenever $\max(1, 2d/(d+2)) < p \leq \infty$. By exploiting this Newton differentiability property, results on angled subspaces in $H^{-1}(\Omega)$, and a formula for orthogonal projections onto direct sums, we further show that the solution map of the classical bilateral obstacle problem is Newton differentiable as a function from $L^p(\Omega)$ to $H_0^1(\Omega)\cap L^q(\Omega)$ whenever $\max(1, d/2) < p \leq \infty$ and $1 \leq q <\infty$. For both the unilateral and the bilateral case, we provide explicit formulas for the Newton derivative. As a concrete application example for our results, we consider the numerical solution of an optimal control problem with $H_0^1(\Omega)$-controls and box-constraints by means of a semismooth Newton method.

7.Second-order methods for quartically-regularised cubic polynomials, with applications to high-order tensor methods

2308.15336

Authors:Coralia Cartis, Wenqi Zhu

Abstract: There has been growing interest in high-order tensor methods for nonconvex optimization, with adaptive regularization, as they possess better/optimal worst-case evaluation complexity globally and faster convergence asymptotically. These algorithms crucially rely on repeatedly minimizing nonconvex multivariate Taylor-based polynomial sub-problems, at least locally. Finding efficient techniques for the solution of these sub-problems, beyond the second-order case, has been an open question. This paper proposes a second-order method, Quadratic Quartic Regularisation (QQR), for efficiently minimizing nonconvex quartically-regularized cubic polynomials, such as the AR$p$ sub-problem [3] with $p=3$. Inspired by [35], QQR approximates the third-order tensor term by a linear combination of quadratic and quartic terms, yielding (possibly nonconvex) local models that are solvable to global optimality. In order to achieve accuracy $\epsilon$ in the first-order criticality of the sub-problem, we show that the error in the QQR method decreases either linearly or by at least $\mathcal{O}(\epsilon^{4/3})$ for locally convex iterations, while in the sufficiently nonconvex case, by at least $\mathcal{O}(\epsilon)$; thus improving, on these types of iterations, the general cubic-regularization bound. Preliminary numerical experiments indicate that two QQR variants perform competitively with state-of-the-art approaches such as ARC (also known as AR$p$ with $p=2$), achieving either a lower objective value or lower iteration counts.

8.Gauss-Newton oriented greedy algorithms for the reconstruction of operators in nonlinear dynamics

2308.15450

Authors:S. Buchwald, G. Ciaramella, J. Salomon

Abstract: This paper is devoted to the development and convergence analysis of greedy reconstruction algorithms based on the strategy presented in [Y. Maday and J. Salomon, Joint Proceedings of the 48th IEEE Conference on Decision and Control and the 28th Chinese Control Conference, 2009, pp. 375--379]. These procedures allow the design of a sequence of control functions that ease the identification of unknown operators in nonlinear dynamical systems. The original strategy of greedy reconstruction algorithms is based on an offline/online decomposition of the reconstruction process and an ansatz for the unknown operator obtained by an a priori chosen set of linearly independent matrices. In the previous work [S. Buchwald, G. Ciaramella and J. Salomon, SIAM J. Control Optim., 59(6), pp. 4511-4537], convergence results were obtained in the case of linear identification problems. We tackle here the more general case of nonlinear systems. More precisely, we introduce a new greedy algorithm based on the linearized system. Then, we show that the controls obtained with this new algorithm lead to the local convergence of the classical Gauss-Newton method applied to the online nonlinear identification problem. We then extend this result to the controls obtained on nonlinear systems where a local convergence result is also proved. The main convergence results are obtained for the reconstruction of drift operators in dynamical systems with linear and bilinear control structures.

Mon, 28 Aug 2023digest

1.The Nesterov-Spokoiny Acceleration: $o(1/k^2)$ Convergence without Proximal Operations

2308.14314

Authors:Weibin Peng, Tianyu Wang

Abstract: This paper studies a variant of an accelerated gradient algorithm of Nesterov and Spokoiny. We call this algorithm the Nesterov-Spokoiny Acceleration (NSA). The NSA algorithm satisfies the following properties for smooth convex programs, 1. The sequence $\{ \mathbf{x}_k \}_{k \in \mathbb{N}} $ governed by the NSA satisfies $ \limsup\limits_{k \to \infty } k^2 ( f (\mathbf{x}_k ) - f^* ) = 0 $, where $f^* > -\infty$ is the minimum of the smooth convex function $f$. 2. The sequence $\{ \mathbf{x}_k \}_{k \in \mathbb{N}} $ governed by the NSA satisfies $ \liminf\limits_{k \to \infty } k^2 \log k \log\log k ( f (\mathbf{x}_k ) - f^* ) = 0 $. 3. The sequence $\{ \mathbf{y}_k \}_{k \in \mathbb{N}} $ governed by NSA satisfies $ \liminf\limits_{k \to \infty } k^3 \log k \log\log k \| \nabla f ( \mathbf{y}_k ) \|^2 = 0 $. Item 1 above is perhaps more important than items 2 and 3: For general smooth convex programs, NSA is the first gradient algorithm that achieves $o(k^{-2})$ convergence rate without proximal operations. Some extensions of the NSA algorithm are also studied. Also, our study on a zeroth-order variant of NSA shows that $o(1/k^2)$ convergence can be achieved via estimated gradient.

2.General Discrete-Time Fokker-Planck Control by Power Moments

2308.14315

Authors:Guangyu Wu, Anders Lindquist

Abstract: In this paper, we address the so-called general Fokker-Planck control problem for discrete-time first-order linear systems. Unlike conventional treatments, we don't assume the distributions of the system states to be Gaussian. Instead, we only assume the existence and finiteness of the first several order power moments of the distributions. It is proved in the literature that there doesn't exist a solution, which has a form of conventional feedback control, to this problem. We propose a moment representation of the system to turn the original problem into a finite-dimensional one. Then a novel feedback control term, which is a mixture of a feedback term and a Markovian transition kernel term is proposed to serve as the control input of the moment system. The states of the moment system are obtained by maximizing the smoothness of the state transition. The power moments of the transition kernels are obtained by a convex optimization problem, of which the solution is proved to exist and be unique. Then they are mapped back to the probability distributions. The control inputs to the original system are then obtained by sampling from the realized distributions. Simulation results are provided to validate our algorithm in treating the general discrete-time Fokker-Planck control problem.

3.Calculation of Dispatchable Region for Renewables with Advanced Computational Techniques

2308.14330

Authors:Bin Liu, Thomas Brinsmead, Stefan Westerlund, Robert Davy

Abstract: The dispatchable region for renewables (DRR) depicts a space for renewables that a power system operator can manage by dispatching controllable resources. The DRR can be used to evaluate the distance from an operating point to a secure boundary and identify ramping events with the highest risk. However, existing approaches based on MILP reformulation or iteration-based LP algorithms may be computationally challenging. This paper investigates whether advanced computation techniques, including high-performance computing and parallel computing techniques, can improve the computational performance.

4.On the identification of ARMA graphical models

2308.14384

Authors:Mattia Zorzi

Abstract: The paper considers the problem of estimating a graphical model corresponding to an autoregressive moving-average (ARMA) Gaussian stochastic process. We propose a new maximum entropy covariance and cepstral extension problem and we show that the problem admits an approximate solution which represents an ARMA graphical model whose topology is determined by the selected entries of the covariance lags considered in the extension problem. Then, we show how the corresponding dual problem is connected with the maximum likelihood principle. This connection allows us to design a Bayesian model and characterize an approximate maximum a posteriori estimator of the ARMA graphical model in the case where the graph topology is unknown. We test the performance of the proposed method through some numerical experiments.

5.An iterative conditional dispatch algorithm for the dynamic dispatch waves problem

2308.14476

Authors:Leon Lan, Jasper van Doorn, Niels A. Wouda, Arpan Rijal, Sandjai Bhulai

Abstract: A challenge in same-day delivery operations is that delivery requests are typically not known beforehand, but are instead revealed dynamically during the day. This uncertainty introduces a trade-off between dispatching vehicles to serve requests as soon as they are revealed to ensure timely delivery, and delaying the dispatching decision to consolidate routing decisions with future, currently unknown requests. In this paper we study the dynamic dispatch waves problem, a same-day delivery problem in which vehicles are dispatched at fixed decision moments. At each decision moment, the system operator must decide which of the known requests to dispatch, and how to route these dispatched requests. The operator's goal is to minimize the total routing cost while ensuring all requests are served on time. We propose iterative conditional dispatch (ICD), an iterative solution construction procedure based on a sample scenario approach. ICD iteratively solves sample scenarios to classify requests to be dispatched, postponed, or undecided. The set of undecided requests shrinks in each iteration until a final dispatching decision is made in the last iteration. We develop two variants of ICD: one variant based on thresholds, and another variant based on similarity. A significant strength of ICD is that it is conceptually simple and easy to implement. This simplicity does not harm performance: through rigorous numerical experiments, we show that both variants efficiently navigate the large state and action spaces of the dynamic dispatch waves problem and quickly converge to a high-quality solution. In particular, the threshold-based ICD variant improves over a greedy myopic strategy by 27.2% on average, and outperforms methods from the literature by 0.8% on average, and up to 1.5% in several cases.

6.On the interplay between pricing, competition and QoS in ride-hailing

2308.14496

Authors:Tushar Shankar Walunj, Shiksha Singhal, Jayakrishnan Nair, Veeraruna Kavitha

Abstract: We analyse a non-cooperative game between two competing ride-hailing platforms, each of which is modeled as a two-sided queueing system, where drivers (with a limited level of patience) are assumed to arrive according to a Poisson process at a fixed rate, while the arrival process of (price-sensitive) passengers is split across the two platforms based on Quality of Service (QoS) considerations. As a benchmark, we also consider a monopolistic scenario, where each platform gets half the market share irrespective of its pricing strategy. The key novelty of our formulation is that the total market share is fixed across the platforms. The game thus captures the competition between the platforms over market share, with pricing being the lever used by each platform to influence its share of the market. The market share split is modeled via two different QoS metrics: (i) probability that an arriving passenger gets a ride (driver availability), and (ii) probability that an arriving passenger gets an acceptable ride (driver availability and acceptable price). The platform aims to maximize the rate of revenue generated from matching drivers and passengers. In each of the above settings, we analyse the equilibria associated with the game in a certain limiting regime, where driver patience is scaled to infinity. We also show that these equilibria remain relevant in the more practically meaningful `pre-limit,' where drivers are highly (but not infinitely) patient. Interestingly, under the second QoS metric, we show that for a certain range of system parameters, no pure Nash equilibrium exists. Instead, we demonstrate a novel solution concept called an \textit{equilibrium cycle}, which has interesting dynamic connotations. Our results highlight the interplay between competition, passenger-side price sensitivity, and passenger/driver arrival rates.

7.Stochastic optimal control problems with delays in the state and in the control via viscosity solutions and an economical application

2308.14506

Authors:Filippo de Feo

Abstract: In this manuscript we consider optimal control problems of deterministic and stochastic differential equations with delays in the state and in the control. First we prove an equivalent Markovian reformulation on Hilbert spaces of the state equation. Then, using the dynamic programming approach for infinite-dimensional systems, we prove that the value function is the unique viscosity solution of the infinite-dimensional Hamilton-Jacobi-Bellman equation. Finally, we apply this result to a stochastic optimal advertising problem with delays in the state and in the control.

8.Strict Dissipativity and turnpike for LQ Optimal Control Problems with Possibly Boundary Reference

2308.14609

Authors:Zhuqing Li, Roberto Guglielmi

Abstract: In this paper we investigate the turnpike property for constrained LQ optimal control problems in connection with dissipativity of the control system. We determine sufficient conditions to ensure the turnpike property in the case of a turnpike reference possibly occurring on the boundary of the state constraint set.

9.A real moment-HSOS hierarchy for complex polynomial optimization with real coefficients

2308.14631

Authors:Jie Wang, Victor Magron

Abstract: This paper proposes a real moment-HSOS hierarchy for complex polynomial optimization problems with real coefficients. We show that this hierarchy provides the same sequence of lower bounds as the complex analogue, yet is much cheaper to solve. In addition, we prove that, when a sphere constraint is present, global optimality is achieved if the ranks of the moment matrix and of a certain submatrix both equal two, and as a consequence, the complex polynomial optimization problem has either two real optimal solutions or a pair of conjugate optimal solutions. A simple procedure for extracting a pair of conjugate optimal solutions is given in the latter case. Various numerical examples are presented to demonstrate the efficiency of this new hierarchy, and an application to polyphase code design is also provided.

10.Minimizing Quasi-Self-Concordant Functions by Gradient Regularization of Newton Method

2308.14742

Authors:Nikita Doikov

Abstract: We study the composite convex optimization problems with a Quasi-Self-Concordant smooth component. This problem class naturally interpolates between classic Self-Concordant functions and functions with Lipschitz continuous Hessian. Previously, the best complexity bounds for this problem class were associated with trust-region schemes and implementations of a ball-minimization oracle. In this paper, we show that for minimizing Quasi-Self-Concordant functions we can use instead the basic Newton Method with Gradient Regularization. For unconstrained minimization, it only involves a simple matrix inversion operation (solving a linear system) at each step. We prove a fast global linear rate for this algorithm, matching the complexity bound of the trust-region scheme, while our method remains especially simple to implement. Then, we introduce the Dual Newton Method, and based on it, develop the corresponding Accelerated Newton Scheme for this problem class, which further improves the complexity factor of the basic method. As a direct consequence of our results, we establish fast global linear rates of simple variants of the Newton Method applied to several practical problems, including Logistic Regression, Soft Maximum, and Matrix Scaling, without requiring additional assumptions on strong or uniform convexity for the target objective.
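
The idea of a Newton step with gradient regularization can be sketched in one dimension. The block below is only an illustration under stated assumptions: the update rule x <- x - f'(x) / (f''(x) + sigma * |f'(x)|), the constant `sigma`, the test function and all names are hypothetical choices made here, not taken from the paper, whose exact regularization rule may differ. The test function is a one-dimensional logistic-type loss, a class the abstract lists among its applications.

```python
import math


def sigmoid(t):
    """Logistic function 1 / (1 + exp(-t))."""
    return 1.0 / (1.0 + math.exp(-t))


def grad_reg_newton_1d(x0, target=0.3, sigma=1.0, iters=50):
    """Minimize f(x) = log(1 + exp(x)) - target * x in one dimension.

    Assumed update (hypothetical form, not quoted from the paper):
        x <- x - f'(x) / (f''(x) + sigma * |f'(x)|)
    The regularization term sigma * |f'(x)| vanishes at the minimizer,
    so the step approaches a plain Newton step near the solution.
    """
    x = x0
    for _ in range(iters):
        p = sigmoid(x)
        g = p - target         # first derivative f'(x)
        h = p * (1.0 - p)      # second derivative f''(x), always positive
        x -= g / (h + sigma * abs(g))
    return x


# The exact minimizer solves sigmoid(x) = target, i.e. x = log(0.3 / 0.7).
x_star = grad_reg_newton_1d(10.0)
print(x_star, math.log(0.3 / 0.7))
```

From the starting point x0 = 10 the second derivative is about 4.5e-5, so an unregularized Newton step would have length on the order of 10^4 and overshoot badly. With the regularization term, each step in this example has length at most 1/sigma.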

11.Matheuristic for Vehicle Routing Problem with Multiple Synchronization Constraints and Variable Service Time

2308.14744

Authors:Faisal Alkaabneh, Rabiatu Bonku

Abstract: This paper considers an extension of the vehicle routing problem with synchronization constraints and introduces the vehicle routing problem with multiple synchronization constraints and variable service time. This important problem is motivated by a real-world problem faced by one of the largest agricultural companies in the world providing precision agriculture services to their clients who are farmers and growers. The solution to this problem impacts the performance of farm spraying operations and can help design policies to improve spraying operations in large-scale farming. We propose a Mixed Integer Programming (MIP) model for this challenging problem, along with problem-specific valid inequalities. A powerful three-phase matheuristic, enhanced with a novel local search method, is proposed to solve large instances. We conduct extensive numerical analysis using realistic data. Results show that our matheuristic is fast and efficient in terms of solution quality and computational time compared to the state-of-the-art MIP solver. Using real-world data, we demonstrate the importance of considering an optimization approach to solve the problem, showing that the policy implemented in practice overestimates the costs by 15-20%. Finally, we compare and contrast the impact of various decision-maker preferences on several key performance metrics by comparing different mathematical models.

Fri, 25 Aug 2023digest

1.Optimal Planning in Habit Formation Models with Multiple Goods

2308.13470

Authors:Mauro Bambi, Daria Ghilli, Fausto Gozzi, Marta Leocata

Abstract: In this paper, along the lines of, e.g., [COW00], we investigate a model with habit formation and two types of substitute goods. Such families of models, even in the case of one good, are difficult to study since their utility function is not concave in the interesting cases (see e.g. [BG20]), hence the first order conditions are not sufficient. We introduce and explain the model and provide some first results using the dynamic programming approach. Such results will form a solid ground over which a deep study of the features of the solutions can be performed.

2.A Fast Minimization Algorithm for the Euler Elastica Model Based on a Bilinear Decomposition

2308.13471

Authors:Zhifang Liu, Baochen Sun, Xue-Cheng Tai, Qi Wang, Huibin Chang

Abstract: The Euler Elastica (EE) model with surface curvature can generate artifact-free results compared with the traditional total variation regularization model in image processing. However, strong nonlinearity and singularity due to the curvature term in the EE model pose a great challenge for one to design fast and stable algorithms for the EE model. In this paper, we propose a new, fast, hybrid alternating minimization (HALM) algorithm for the EE model based on a bilinear decomposition of the gradient of the underlying image and prove the global convergence of the minimizing sequence generated by the algorithm under mild conditions. The HALM algorithm comprises three sub-minimization problems and each is either solved in the closed form or approximated by fast solvers making the new algorithm highly accurate and efficient. We also discuss the extension of the HALM strategy to deal with general curvature-based variational models, especially with a Lipschitz smooth functional of the curvature. A host of numerical experiments are conducted to show that the new algorithm produces good results with much-improved efficiency compared to other state-of-the-art algorithms for the EE model. As one of the benchmarks, we show that the average running time of the HALM algorithm is at most one-quarter of that of the fast operator-splitting-based Deng-Glowinski-Tai algorithm.

Thu, 24 Aug 2023digest

1.Convex envelopes of bounded monomials on two-variable cones

2308.12650

Authors:Pietro Belotti

Abstract: We consider an $n$-variate monomial function that is restricted both in value by lower and upper bounds and in domain by two homogeneous linear inequalities. Such functions are building blocks of several problems that are found in practical applications and fall under the class of Mixed Integer Nonlinear Optimization. We show that the upper envelope of the function in the given domain, for $n\ge 2$, is given by a conic inequality. We also present the lower envelope for $n=2$. To assess the applicability of branching rules based on homogeneous linear inequalities, we also derive the volume of the convex hull for $n=2$.

2.A Distributed Linear Quadratic Discrete-Time Game Approach to Formation Control with Collision Avoidance

2308.12775

Authors:Prima Aditya, Herbert Werner

Abstract: Formation control problems can be expressed as linear quadratic discrete-time games (LQDTG) for which Nash equilibrium solutions are sought. However, solving such problems requires solving coupled Riccati equations, which cannot be done in a distributed manner. A recent study showed that a distributed implementation is possible for a consensus problem when fictitious agents are associated with edges in the network graph rather than nodes. This paper proposes an extension of this approach to formation control with collision avoidance, where collision is precluded by including appropriate penalty terms on the edges. To address the problem, a state-dependent Riccati equation needs to be solved since the collision avoidance term in the cost function leads to a state-dependent weight matrix. This solution provides relative control inputs associated with the edges of the network graph. These relative inputs then need to be mapped to the physical control inputs applied at the nodes; this can be done in a distributed manner by iterating over a gradient descent search between neighbors in each sampling interval. Unlike inter-sample iteration frequently used in distributed MPC, only a matrix-vector multiplication is needed for each iteration step here, instead of an optimization problem to be solved. This approach can be implemented in a receding horizon manner, which is demonstrated through a numerical example.

Wed, 23 Aug 2023digest

1.Solving Elliptic Optimal Control Problems using Physics Informed Neural Networks

2308.11925

Authors:Bangti Jin, Ramesh Sau, Luowei Yin, Zhi Zhou

Abstract: In this work, we present and analyze a numerical solver for optimal control problems (without / with box constraint) for linear and semilinear second-order elliptic problems. The approach is based on a coupled system derived from the first-order optimality system of the optimal control problem, and applies physics informed neural networks (PINNs) to solve the coupled system. We present an error analysis of the numerical scheme, and provide $L^2(\Omega)$ error bounds on the state, control and adjoint state in terms of deep neural network parameters (e.g., depth, width, and parameter bounds) and the number of sampling points in the domain and on the boundary. The main tools in the analysis include offset Rademacher complexity and boundedness and Lipschitz continuity of neural network functions. We present several numerical examples to illustrate the approach and compare it with three existing approaches.

2.Non-ergodic linear convergence property of the delayed gradient descent under the strongly convexity and the Polyak-Łojasiewicz condition

2308.11984

Authors:Hyung Jun Choi, Woocheol Choi, Jinmyoung Seok

Abstract: In this work, we establish the linear convergence estimate for the gradient descent involving the delay $\tau\in\mathbb{N}$ when the cost function is $\mu$-strongly convex and $L$-smooth. This result improves upon the well-known estimates in Arjevani et al. \cite{ASS} and Stich-Karimireddy \cite{SK} in the sense that it is non-ergodic and is still established in spite of a weaker constraint on the cost function. Also, the range of the learning rate $\eta$ can be extended from $\eta\leq 1/(10L\tau)$ to $\eta\leq 1/(4L\tau)$ for $\tau =1$ and $\eta\leq 3/(10L\tau)$ for $\tau \geq 2$, where $L >0$ is the Lipschitz continuity constant of the gradient of the cost function. Furthermore, we show the linear convergence of the cost function under the Polyak-{\L}ojasiewicz\,(PL) condition, for which the available choice of learning rate is further improved to $\eta\leq 9/(10L\tau)$ for large delay $\tau$. Finally, some numerical experiments are provided in order to confirm the reliability of the analyzed results.
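
As a rough illustration of the setting, the sketch below runs gradient descent with a stale gradient on a small strongly convex quadratic, using the step-size bounds quoted in the abstract. The iteration form x_{k+1} = x_k - eta * grad f(x_{k - tau}), the choice to fill the history before k = 0 with the starting point, the test function, and all names are assumptions made here for illustration and are not taken from the paper.

```python
MU = 1.0         # strong convexity constant (hypothetical test value)
L_SMOOTH = 10.0  # gradient Lipschitz constant (hypothetical test value)


def f(x):
    """Quadratic test function that is MU-strongly convex and L_SMOOTH-smooth."""
    return 0.5 * (MU * x[0] ** 2 + L_SMOOTH * x[1] ** 2)


def grad_f(x):
    """Gradient of the quadratic test function."""
    return [MU * x[0], L_SMOOTH * x[1]]


def step_size(lipschitz, tau):
    """Largest step size allowed by the bounds quoted in the abstract."""
    if tau == 1:
        return 1.0 / (4.0 * lipschitz * tau)
    return 3.0 / (10.0 * lipschitz * tau)


def delayed_gd(grad, x0, eta, tau, iters):
    """Iterate x_{k+1} = x_k - eta * grad(x_{k - tau}).

    Assumption: the iterates before k = 0 are all equal to x0.
    """
    history = [list(x0)] * (tau + 1)  # holds x_{k - tau}, ..., x_k
    for _ in range(iters):
        stale = history[0]     # x_{k - tau}, where the gradient is evaluated
        current = history[-1]  # x_k, the point being updated
        g = grad(stale)
        new = [xi - eta * gi for xi, gi in zip(current, g)]
        history = history[1:] + [new]
    return history[-1]


x_final = delayed_gd(grad_f, [1.0, 1.0], step_size(L_SMOOTH, 1), tau=1, iters=500)
print(f(x_final))
```

On this example the last iterate itself, not an average of iterates, drives the cost toward zero, which is the sense of a non-ergodic estimate.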

3.An Accelerated Block Proximal Framework with Adaptive Momentum for Nonconvex and Nonsmooth Optimization

2308.12126

Authors:Weifeng Yang, Wenwen Min

Abstract: We propose an accelerated block proximal linear framework with adaptive momentum (ABPL$^+$) for nonconvex and nonsmooth optimization. We analyze the potential causes of the extrapolation step failing in some algorithms, and resolve this issue by enhancing the comparison process that evaluates the trade-off between the proximal gradient step and the linear extrapolation step in our algorithm. Furthermore, we extend our algorithm to any scenario involving updating block variables with positive integers, allowing each cycle to randomly shuffle the update order of the variable blocks. Additionally, under mild assumptions, we prove that ABPL$^+$ can monotonically decrease the function value without strictly restricting the extrapolation parameters and step size, demonstrate the viability and effectiveness of updating these blocks in a random order, and also demonstrate more directly and intuitively that the derivative set of the sequence generated by our algorithm is a critical point set. Moreover, we demonstrate the global convergence as well as the linear and sublinear convergence rates of our algorithm by utilizing the Kurdyka-{\L}ojasiewicz (K{\L}) condition. To enhance the effectiveness and flexibility of our algorithm, we also expand the study to the imprecise version of our algorithm and construct an adaptive extrapolation parameter strategy, which improves its overall performance. We apply our algorithm to multiple non-negative matrix factorization with the $\ell_0$ norm and nonnegative tensor decomposition with the $\ell_0$ norm, and perform extensive numerical experiments to validate its effectiveness and efficiency.

4.Data-driven decision-focused surrogate modeling

2308.12161

Authors:Rishabh Gupta, Qi Zhang

Abstract: We introduce the concept of decision-focused surrogate modeling for solving computationally challenging nonlinear optimization problems in real-time settings. The proposed data-driven framework seeks to learn a simpler, e.g. convex, surrogate optimization model that is trained to minimize the decision prediction error, which is defined as the difference between the optimal solutions of the original and the surrogate optimization models. The learning problem, formulated as a bilevel program, can be viewed as a data-driven inverse optimization problem to which we apply a decomposition-based solution algorithm from previous work. We validate our framework through numerical experiments involving the optimization of common nonlinear chemical processes such as chemical reactors, heat exchanger networks, and material blending systems. We also present a detailed comparison of decision-focused surrogate modeling with standard data-driven surrogate modeling methods and demonstrate that our approach is significantly more data-efficient while producing simple surrogate models with high decision prediction accuracy.

5.Funnel MPC for nonlinear systems with arbitrary relative degree

2308.12217

Authors:Thomas Berger, Dario Dennstädt

Abstract: The Model Predictive Control (MPC) scheme Funnel MPC enables output tracking of smooth reference signals with prescribed error bounds for nonlinear multi-input multi-output systems with stable internal dynamics. Earlier works achieved the control objective for systems with relative degree restricted to one or incorporated additional feasibility constraints in the optimal control problem. Here we resolve these limitations by introducing a modified stage cost function relying on a weighted sum of the tracking error derivatives. The weights need to be sufficiently large and we state explicit lower bounds. Under these assumptions we are able to prove initial and recursive feasibility of the novel Funnel MPC scheme for systems with arbitrary relative degree, without requiring any terminal conditions, a sufficiently long prediction horizon or additional output constraints.

Tue, 22 Aug 2023digest

1.Distorted optimal transport

2308.11238

Authors:Haiyan Liu, Bin Wang, Ruodu Wang, Sheng Chao Zhuang

Abstract: Classic optimal transport theory is built on minimizing the expected cost between two given distributions. We propose the framework of distorted optimal transport by minimizing a distorted expected cost. This new formulation is motivated by concrete problems in decision theory, robust optimization, and risk management, and it has many distinct features compared to the classic theory. We choose simple cost functions and study different distortion functions and their implications on the optimal transport plan. We show that on the real line, the comonotonic coupling is optimal for the distorted optimal transport problem when the distortion function is convex and the cost function is submodular and monotone. Some forms of duality and uniqueness results are provided. For inverse-S-shaped distortion functions and linear cost, we obtain the unique form of optimal coupling for all marginal distributions, which turns out to have an interesting ``first comonotonic, then counter-monotonic" dependence structure; for S-shaped distortion functions a similar structure is obtained. Our results highlight several challenges and features in distorted optimal transport, offering a new mathematical bridge between the fields of probability, decision theory, and risk management.

2.A Tight Formulation for the Dial-a-Ride Problem

2308.11285

Authors:Daniela Gaul, Kathrin Klamroth, Christian Pfeiffer, Arne Schulz, Michael Stiglmayr

Abstract: Ridepooling services play an increasingly important role in modern transportation systems. With soaring demand and growing fleet sizes, the underlying route planning problems become increasingly challenging. In this context, we consider the dial-a-ride problem (DARP): Given a set of transportation requests with pick-up and delivery locations, passenger numbers, time windows, and maximum ride times, an optimal routing for a fleet of vehicles, including an optimized passenger assignment, needs to be determined. We present tight mixed-integer linear programming (MILP) formulations for the DARP by combining two state-of-the-art models into novel location-augmented-event-based formulations. Strong valid inequalities and lower and upper bounding techniques are derived to further improve the formulations. We then demonstrate the theoretical and computational superiority of the new model: First, the formulation is tight in the sense that, if time windows shrink to a single point in time, the linear programming relaxation yields integer (and hence optimal) solutions. Second, extensive numerical experiments on benchmark instances show that computational times are on average reduced by 49.7% compared to state-of-the-art event-based approaches.

3.Reproducing kernel approach to linear quadratic mean field control problems

2308.11435

Authors:Pierre-Cyril Aubin-Frankowski, Alain Bensoussan

Abstract: Mean-field control problems have received continuous interest over the last decade. Despite being more intricate than in classical optimal control, the linear-quadratic setting can still be tackled through Riccati equations. Remarkably, we demonstrate that another significant attribute extends to the mean-field case: the existence of an intrinsic reproducing kernel Hilbert space associated with the problem. Our findings reveal that this Hilbert space not only encompasses deterministic controlled push-forward mappings but can also represent stochastic dynamics. Specifically, incorporating Brownian noise affects the deterministic kernel through a conditional expectation, to make the trajectories adapted. Introducing reproducing kernels allows us to rewrite the mean-field control problem as optimizing over a Hilbert space of trajectories rather than controls. This framework even accommodates nonlinear terminal costs, without resorting to adjoint processes or Pontryagin's maximum principle, further highlighting the versatility of the proposed methodology.

4.Iterative risk-constrained model predictive control: A data-driven distributionally robust approach

2308.11510

Authors:Alireza Zolanvari, Ashish Cherukuri

Abstract: This paper proposes an iterative distributionally robust model predictive control (MPC) scheme to solve a risk-constrained infinite-horizon optimal control problem. In each iteration, the algorithm generates a trajectory from the starting point to the target equilibrium state with the aim of respecting risk constraints with high probability (that encodes safe operation of the system) and improving the cost of the trajectory as compared to previous iterations. At the end of each iteration, the visited states and observed samples of the uncertainty are stored and accumulated with the previous observations. For each iteration, the states stored previously are considered as terminal constraints of the MPC scheme, and samples obtained thus far are used to construct distributionally robust risk constraints. As iterations progress, more data is obtained and the environment is explored progressively to ensure better safety and cost optimality. We prove that the MPC scheme in each iteration is recursively feasible and the resulting trajectories converge asymptotically to the target while ensuring safety with high probability. We identify conditions under which the cost-to-go reduces as iterations progress. For systems with locally one-step reachable target, we specify scenarios that ensure finite-time convergence of iterations. We provide computationally tractable reformulations of the risk constraints for total variation and Wasserstein distance-based ambiguity sets. A simulation example illustrates the application of our results in finding a risk-constrained path for two mobile robots facing an uncertain obstacle.

5.Risk-Minimizing Two-Player Zero-Sum Stochastic Differential Game via Path Integral Control

2308.11546

Authors:Apurva Patil, Yujing Zhou, David Fridovich-Keil, Takashi Tanaka

Abstract: This paper addresses a continuous-time risk-minimizing two-player zero-sum stochastic differential game (SDG), in which each player aims to minimize its probability of failure. Failure occurs in the event when the state of the game enters into predefined undesirable domains, and one player's failure is the other's success. We derive a sufficient condition for this game to have a saddle-point equilibrium and show that it can be solved via a Hamilton-Jacobi-Isaacs (HJI) partial differential equation (PDE) with Dirichlet boundary condition. Under certain assumptions on the system dynamics and cost function, we establish the existence and uniqueness of the saddle-point of the game. We provide explicit expressions for the saddle-point policies which can be numerically evaluated using path integral control. This allows us to solve the game online via Monte Carlo sampling of system trajectories. We implement our control synthesis framework on two classes of risk-minimizing zero-sum SDGs: a disturbance attenuation problem and a pursuit-evasion game. Simulation studies are presented to validate the proposed control synthesis framework.

6.Decision-Making for Land Conservation: A Derivative-Free Optimization Framework with Nonlinear Inputs

2308.11549

Authors:Cassidy K. Buhler, Hande Y. Benson

Abstract: Protected areas (PAs) are designated spaces where human activities are restricted to preserve critical habitats. Decision-makers are challenged with balancing a trade-off of financial feasibility with ecological benefit when establishing PAs. Given the long-term ramifications of these decisions and the constantly shifting environment, it is crucial that PAs are carefully selected with long-term viability in mind. Using AI tools like simulation and optimization is common for designating PAs, but current decision models are primarily linear. In this paper, we propose a derivative-free optimization framework paired with a nonlinear component, population viability analysis (PVA). Formulated as a mixed integer nonlinear programming (MINLP) problem, our model allows for linear and nonlinear inputs. Connectivity, competition, crowding, and other similar concerns are handled by the PVA software, rather than expressed as constraints of the optimization model. In addition, we present numerical results that serve as a proof of concept, showing our models yield PAs with similar expected risk to that of preserving every parcel in a habitat, but at a significantly lower cost. The overall goal is to promote interdisciplinary work by providing a new mathematical programming tool for conservationists that allows for nonlinear inputs and can be paired with existing ecological software.

Mon, 21 Aug 2023digest

1.A relaxation method for binary orthogonal optimization problems with its applications

2308.10506

Authors:Lianghai Xiao, Yitian Qian, Shaohua Pan

Abstract: This paper focuses on a class of binary orthogonal optimization problems frequently arising in semantic hashing. Considering that this class of problems may have an empty feasible set, rendering them not well-defined, we introduce an equivalent model involving a restricted Stiefel manifold and a matrix box set, and then investigate its penalty problems induced by the $\ell_1$-distance from the box set and its Moreau envelope. The two penalty problems are always well-defined, and moreover, they serve as the global exact penalties provided that the original model is well-defined. Notably, the penalty problem induced by the Moreau envelope is a smooth optimization over an embedded submanifold with a favorable structure. We develop a retraction-based nonmonotone line-search Riemannian gradient method to address this penalty problem to achieve a desirable solution for the original binary orthogonal problems. Finally, the proposed method is applied to supervised and unsupervised hashing tasks and is compared with several popular methods on the MNIST and CIFAR-10 datasets. The numerical comparisons reveal that our algorithm is significantly superior to other solvers in terms of feasibility violation, and it is comparable or even superior to others in terms of evaluation metrics related to the Hamming distance.

2.Universal Approximation of Parametric Optimization via Neural Networks with Piecewise Linear Policy Approximation

2308.10534

Authors:Hyunglip Bae, Jang Ho Kim, Woo Chang Kim

Abstract: Parametric optimization solves a family of optimization problems as a function of parameters. It is a critical component in situations where optimal decision making is repeatedly performed for updated parameter values, but computation becomes challenging when complex problems need to be solved in real-time. Therefore, in this study, we present theoretical foundations on approximating the optimal policy of parametric optimization problems through Neural Networks and derive conditions that allow the Universal Approximation Theorem to be applied to parametric optimization problems by constructing a piecewise linear policy approximation explicitly. This study fills the gap on formally analyzing the constructed piecewise linear approximation in terms of feasibility and optimality and shows that Neural Networks (with ReLU activations) can be a valid approximator for this approximation in terms of generalization and approximation error. Furthermore, based on theoretical results, we propose a strategy to improve feasibility of the approximated solution and discuss training with suboptimal solutions.

3.The Unique Solvability Conditions for the Generalized Absolute Value Equations

2308.10536

Authors:Shubham Kumar, Deepmala

Abstract: This paper investigates the conditions that guarantee unique solvability and unsolvability for the generalized absolute value equations (GAVE) given by $Ax - B \vert x \vert = b$. Further, these conditions are also valid to determine the unique solution of the generalized absolute value matrix equations (GAVME) $AX - B \vert X \vert =F$. Finally, certain aspects related to the solvability and unsolvability of the absolute value equations (AVE) have been deliberated upon.
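
To make the GAVE $Ax - B \vert x \vert = b$ concrete, the following sketch solves a 2x2 instance by the fixed-point (Picard) iteration x <- A^{-1}(B|x| + b). This is a standard contraction argument used here for illustration, not the conditions derived in the paper: the map is a contraction in the Euclidean norm whenever the smallest singular value of A exceeds the largest singular value of B, which gives a unique solution. The matrices, the right-hand side and all function names are hypothetical.

```python
def solve2(A, r):
    """Solve the 2x2 linear system A y = r by Cramer's rule."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [
        (r[0] * A[1][1] - A[0][1] * r[1]) / det,
        (A[0][0] * r[1] - A[1][0] * r[0]) / det,
    ]


def picard_gave(A, B, b, iters=60):
    """Fixed-point iteration x <- A^{-1} (B |x| + b) for A x - B |x| = b."""
    x = [0.0, 0.0]
    for _ in range(iters):
        ax = [abs(v) for v in x]  # componentwise absolute value |x|
        rhs = [B[i][0] * ax[0] + B[i][1] * ax[1] + b[i] for i in range(2)]
        x = solve2(A, rhs)
    return x


# Hypothetical instance: A is symmetric with eigenvalues of about 2.38 and 4.62,
# B is the identity (largest singular value 1), so the iteration contracts.
A = [[4.0, 1.0], [1.0, 3.0]]
B = [[1.0, 0.0], [0.0, 1.0]]
b = [1.0, -7.0]  # chosen so that x = (1, -2) satisfies A x - B |x| = b

x = picard_gave(A, B, b)
print(x)
```

The right-hand side can be checked by hand: with x = (1, -2), A x = (2, -5) and B|x| = (1, 2), so A x - B|x| = (1, -7).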

4.Decentralized Riemannian Conjugate Gradient Method on the Stiefel Manifold

2308.10547

Authors:Jun Chen, Haishan Ye, Mengmeng Wang, Tianxin Huang, Guang Dai, Ivor W. Tsang, Yong Liu

Abstract: The conjugate gradient method is a crucial first-order optimization method that generally converges faster than the steepest descent method, and its computational cost is much lower than that of second-order methods. However, while various types of conjugate gradient methods have been studied in Euclidean spaces and on Riemannian manifolds, there has been little study of those in distributed scenarios. This paper proposes a decentralized Riemannian conjugate gradient descent (DRCGD) method that aims at minimizing a global function over the Stiefel manifold. The optimization problem is distributed among a network of agents, where each agent is associated with a local function, and communication between agents occurs over an undirected connected graph. Since the Stiefel manifold is a non-convex set, a global function is represented as a finite sum of possibly non-convex (but smooth) local functions. The proposed method is free from expensive Riemannian geometric operations such as retractions, exponential maps, and vector transports, thereby reducing the computational complexity required by each agent. To the best of our knowledge, DRCGD is the first decentralized Riemannian conjugate gradient algorithm to achieve global convergence over the Stiefel manifold.

5.Restricted inverse optimal value problem on linear programming under weighted $l_1$ norm

2308.10563

Authors:Junhua Jia, Xiucui Guan, Xinqiang Qian, Panos M. Pardalos

Abstract: We study the restricted inverse optimal value problem on linear programming under weighted $l_1$ norm (RIOVLP$_1$). Given a linear programming problem $LP_c: \min \{cx|Ax=b,x\geq 0\}$ with a feasible solution $x^0$ and a value $K$, we aim to adjust the vector $c$ to $\bar{c}$ such that $x^0$ becomes an optimal solution of the problem LP$_{\bar c}$ whose objective value $\bar{c}x^0$ equals $K$. The objective is to minimize the distance $\|\bar c - c\|_1=\sum_{j=1}^nd_j|\bar c_j-c_j|$ under weighted $l_1$ norm. Firstly, we formulate the problem (RIOVLP$_1$) as a linear programming problem by dual theories. Secondly, we construct a sub-problem $(D^z)$, which has the same form as $LP_c$, of the dual (RIOVLP$_1$) problem corresponding to a given value $z$. Thirdly, when the coefficient matrix $A$ is unimodular, we design a binary search algorithm to calculate the critical value $z^*$ corresponding to an optimal solution of the problem (RIOVLP$_1$). Finally, we solve the (RIOV) problems on Hitchcock and shortest path problem, respectively, in $O(T_{MCF}\log\max\{d_{max},x^0_{max},n\})$ time, where we solve a sub-problem $(D^z)$ by minimum cost flow in $T_{MCF}$ time in each iteration. The values $d_{max},x^0_{max}$ are the maximum values of $d$ and $x^0$, respectively.

6.Feedback rectifiable pairs and stabilization of switched linear systems

2308.10591

Authors:Maria C. Honecker, Hannes Gernandt, Kai Wulff, Carsten Trunk, Johann Reger

Abstract: We address the feedback design problem for switched linear systems. In particular we aim to design a switched state-feedback such that the resulting closed-loop switched system is in upper triangular form. To this effect we formulate and analyse the feedback rectification problem for pairs of matrices. We present necessary and sufficient conditions for the feedback rectifiability of pairs for two subsystems and give a constructive procedure to design stabilizing state-feedback for a class of switched systems. Several examples illustrate the characteristics of the problem considered and the application of the proposed constructive procedure.

7.A Homogenization Approach for Gradient-Dominated Stochastic Optimization

2308.10630

Authors:Jiyuan Tan, Chenyu Xue, Chuwen Zhang, Qi Deng, Dongdong Ge, Yinyu Ye

Abstract: The gradient dominance property is a condition weaker than strong convexity, yet it suffices to ensure global convergence for first-order methods even in non-convex optimization. This property finds application in various machine learning domains, including matrix decomposition, linear neural networks, and policy-based reinforcement learning (RL). In this paper, we study the stochastic homogeneous second-order descent method (SHSODM) for gradient-dominated optimization with $\alpha \in [1, 2]$ based on a recently proposed homogenization approach. Theoretically, we show that SHSODM achieves a sample complexity of $O(\epsilon^{-7/(2 \alpha) +1})$ for $\alpha \in [1, 3/2)$ and $\tilde{O}(\epsilon^{-2/\alpha})$ for $\alpha \in [3/2, 2]$. We further provide an SHSODM with a variance reduction technique enjoying an improved sample complexity of $O( \epsilon ^{-( 7-3\alpha ) /( 2\alpha )})$ for $\alpha \in [1,3/2)$. Our results match the state-of-the-art sample complexity bounds for stochastic gradient-dominated optimization without cubic regularization. Since the homogenization approach only relies on solving extremal eigenvector problems instead of Newton-type systems, our methods gain the advantage of cheaper iterations and robustness in ill-conditioned problems. Numerical experiments on several RL tasks demonstrate the efficiency of SHSODM compared to other off-the-shelf methods.
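For orientation, the following is a sketch of how the gradient dominance condition is commonly stated, together with a quick arithmetic check of the exponents quoted in the abstract. The constant $C$ and the exact form of the inequality are assumptions here, since the abstract does not spell them out.

```latex
% Gradient dominance with exponent \alpha \in [1, 2] (assumed form; C > 0 is a hypothetical constant):
f(x) - f^{*} \;\le\; C \,\|\nabla f(x)\|^{\alpha} \quad \text{for all } x .
% \alpha = 2 is the Polyak--Lojasiewicz inequality.

% The two stated complexity regimes agree at the breakpoint \alpha = 3/2:
-\frac{7}{2\alpha} + 1 \,\Big|_{\alpha = 3/2} = -\frac{7}{3} + 1 = -\frac{4}{3},
\qquad
-\frac{2}{\alpha} \,\Big|_{\alpha = 3/2} = -\frac{4}{3}.

% At \alpha = 1 the variance-reduced exponent improves on the plain one:
-\frac{7}{2} + 1 = -\frac{5}{2}
\quad \text{versus} \quad
-\frac{7 - 3}{2} = -2 .
```

So, taking the stated bounds at face value, the rate is continuous across $\alpha = 3/2$, and at $\alpha = 1$ variance reduction lowers the sample complexity from $O(\epsilon^{-5/2})$ to $O(\epsilon^{-2})$.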

Fri, 18 Aug 2023digest

1.Geometric characterizations for strong minima with applications to nuclear norm minimization problems

2308.09224

Authors:Jalal Fadili, Tran T. A. Nghia, Duy Nhat Phan

Abstract: In this paper, we introduce several geometric characterizations for strong minima of optimization problems. Applying these results to nuclear norm minimization problems allows us to obtain new necessary and sufficient quantitative conditions for this important property. Our characterizations for strong minima are weaker than the Restricted Injectivity and Nondegenerate Source Condition, which are usually used to identify solution uniqueness of nuclear norm minimization problems. Consequently, we obtain the minimum (tight) bound on the number of measurements for (strong) exact recovery of low-rank matrices.

Thu, 17 Aug 2023digest

1.Convex Optimization-Based Model Predictive Control for the Guidance of Active Debris Removal Transfers

2308.08783

Authors:Minduli Wijayatunga, Roberto Armellin, Harry Holt, Laura Pirovano, Claudio Bombardelli

Abstract: Active debris removal (ADR) missions have garnered significant interest as means of mitigating collision risks in space. This work proposes a convex optimization-based model predictive control (MPC) approach to provide guidance for such missions. While convex optimization can obtain optimal solutions in polynomial time, it relies on the successive convexification of nonconvex dynamics, leading to inaccuracies. Here, the need for successive convexification is eliminated by using near-linear Generalized Equinoctial Orbital Elements (GEqOE) and by updating the reference trajectory through a new split-Edelbaum approach. The solution accuracy is then measured relative to a high-fidelity dynamics model, showing that the MPC-convex method can generate accurate solutions without iterations.

2.Learning the hub graphical Lasso model with the structured sparsity via an efficient algorithm

2308.08852

Authors:Chengjing Wang, Peipei Tang, Wenling He, Meixia Lin

Abstract: Graphical models have exhibited their performance in numerous tasks ranging from biological analysis to recommender systems. However, graphical models with hub nodes are computationally difficult to fit, particularly when the dimension of the data is large. To efficiently estimate the hub graphical models, we introduce a two-phase algorithm. The proposed algorithm first generates a good initial point via a dual alternating direction method of multipliers (ADMM), and then warm starts a semismooth Newton (SSN) based augmented Lagrangian method (ALM) to compute a solution that is accurate enough for practical tasks. The sparsity structure of the generalized Jacobian ensures that the algorithm can obtain a good solution very efficiently. Comprehensive experiments on both synthetic data and real data show that it clearly outperforms the existing state-of-the-art algorithms. In particular, in some high dimensional tasks, it can save more than 70% of the execution time while still achieving a high-quality estimation.

3.Stabilizability for nonautonomous linear parabolic equations with actuators as distributions

2308.08932

Authors:Karl Kunisch, Sérgio S. Rodrigues, Daniel Walter

Abstract: The stabilizability of a general class of abstract parabolic-like equations is investigated, with a finite number of actuators. This class includes the case of actuators given as delta distributions located at given points in the spatial domain of concrete parabolic equations. A stabilizing feedback control operator is constructed and given in explicit form. Then, an associated optimal control is considered and the corresponding Riccati feedback is investigated. Results of simulations are presented showing the stabilizing performance of both explicit and Riccati feedbacks.

4.Hitting the High-Dimensional Notes: An ODE for SGD learning dynamics on GLMs and multi-index models

2308.08977

Authors:Elizabeth Collins-Woodfin, Courtney Paquette, Elliot Paquette, Inbar Seroussi

Abstract: We analyze the dynamics of streaming stochastic gradient descent (SGD) in the high-dimensional limit when applied to generalized linear models and multi-index models (e.g. logistic regression, phase retrieval) with general data-covariance. In particular, we demonstrate a deterministic equivalent of SGD in the form of a system of ordinary differential equations that describes a wide class of statistics, such as the risk and other measures of sub-optimality. This equivalence holds with overwhelming probability when the model parameter count grows proportionally to the number of data. This framework allows us to obtain learning rate thresholds for stability of SGD as well as convergence guarantees. In addition to the deterministic equivalent, we introduce an SDE with a simplified diffusion coefficient (homogenized SGD) which allows us to analyze the dynamics of general statistics of SGD iterates. Finally, we illustrate this theory on some standard examples and show numerical simulations which give an excellent match to the theory.

5.Progressively Strengthening and Tuning MIP Solvers for Reoptimization

2308.08986

Authors:Krunal Kishor Patel

Abstract: This paper explores reoptimization techniques for solving sequences of similar mixed integer programs (MIPs) more effectively. Traditionally, these MIPs are solved independently, without capitalizing on information from previously solved instances. Our approach focuses on primal bound improvements by reusing the solutions of the previously solved instances as well as dual bound improvements by reusing the branching history and automating parameter-tuning. We also describe ways to improve the solver performance by extending ideas from reliability branching to generate better pseudocosts. Our reoptimization approach, which we developed for the computational competition of the MIP 2023 workshop, earned us the first prize. In this paper, we thoroughly analyze the performance of each technique and their combined impact on the solver's performance. Finally, we present ways to extend our techniques in practice for further improvements.

6.Derivative-Free Global Minimization in One Dimension: Relaxation, Monte Carlo, and Sampling

2308.09050

Authors:Alexandra A. Gomes, Diogo A. Gomes

Abstract: We introduce a derivative-free global optimization algorithm that efficiently computes minima for various classes of one-dimensional functions, including non-convex and non-smooth functions. This algorithm numerically approximates the gradient flow of a relaxed functional, integrating strategies such as Monte Carlo methods, rejection sampling, and adaptive techniques. These strategies enhance performance in solving a diverse range of optimization problems while significantly reducing the number of required function evaluations compared to established methods. We present a proof of the convergence of the algorithm and illustrate its performance by comprehensive benchmarking. The proposed algorithm offers substantial potential for real-world models. It is particularly advantageous in situations requiring computationally intensive objective function evaluations.

7.A non-convex relaxed version of minimax theorems

2308.09111

Authors:M. I. A. Ghitri, A. Hantoute

Abstract: Given a subset $A\times B$ of a locally convex space $X\times Y$ (with $A$ compact) and a function $f:A\times B\rightarrow\overline{\mathbb{R}}$ such that $f(\cdot,y),$ $y\in B,$ are concave and upper semicontinuous, the minimax inequality $\max_{x\in A} \inf_{y\in B} f(x,y) \geq \inf_{y\in B} \sup_{x\in A_{0}} f(x,y)$ is shown to hold provided that $A_{0}$ is the set of $x\in A$ such that $f(x,\cdot)$ is proper, convex and lower semicontinuous. Moreover, if in addition $A\times B\subset f^{-1}(\mathbb{R})$, then we can take as $A_{0}$ the set of $x\in A$ such that $f(x,\cdot)$ is convex. The relation to Moreau's biconjugate representation theorem is discussed, and some applications to convex duality are provided. Key words. Minimax theorem, Moreau theorem, conjugate function, convex optimization.

8.A DPG method for linear quadratic optimal control problems

2308.09169

Authors:Thomas Führer, Francisco Fuica

Abstract: The DPG method with optimal test functions for solving linear quadratic optimal control problems with control constraints is studied. We prove existence of a unique optimal solution of the nonlinear discrete problem and characterize it through first order optimality conditions. Furthermore, we systematically develop a priori as well as a posteriori error estimates. Our proposed method can be applied to a wide range of constrained optimal control problems subject to, e.g., scalar second-order PDEs and the Stokes equations. Numerical experiments that illustrate our theoretical findings are presented.

9.Linear Parameter Varying Power Regulation of Variable Speed Pitch Manipulated Wind Turbine in the Full Load Regime

2308.09190

Authors:T. Shaqarin, Mahmoud M. S. Al-Suod

Abstract: In a wind energy conversion system (WECS), changing the pitch angle of the wind turbine blades is a typical practice to regulate the electrical power generation in the full-load regime. Due to the turbulent nature of the wind and the large variations of the mean wind speed during the day, the rotary elements of the WECS are subjected to significant mechanical stresses and fatigue, potentially resulting in mechanical failures and higher maintenance costs. Consequently, it is imperative to design a control system capable of handling continuous wind changes. In this work, a Linear Parameter Varying (LPV) $H_\infty$ controller is used to cope with wind variations and turbulent winds with a turbulence intensity greater than 10%. The proposed controller is designed to regulate the rotational rotor speed and generator torque, thus regulating the output power via pitch angle manipulations. In addition, a PI-Fuzzy control system is designed to be compared with the proposed control system. The closed-loop simulations of both controllers established the robustness and stability of the suggested LPV controller under large wind velocity variations, with minute power fluctuations compared to the PI-Fuzzy controller. The results show that in the presence of turbulent wind speed variations, the proposed LPV controller achieves improved transient and steady-state performance along with reduced mechanical loads in the above-rated wind speed region.

Wed, 16 Aug 2023digest

1.Stochastic Controlled Averaging for Federated Learning with Communication Compression

2308.08165

Authors:Xinmeng Huang, Ping Li, Xiaoyun Li

Abstract: Communication compression, a technique aiming to reduce the information volume to be transmitted over the air, has gained great interest in Federated Learning (FL) for the potential of alleviating its communication overhead. However, communication compression brings forth new challenges in FL due to the interplay of compression-incurred information distortion and inherent characteristics of FL such as partial participation and data heterogeneity. Despite the recent development, the performance of compressed FL approaches has not been fully exploited. The existing approaches either cannot accommodate arbitrary data heterogeneity or partial participation, or require stringent conditions on compression. In this paper, we revisit the seminal stochastic controlled averaging method by proposing an equivalent but more efficient/simplified formulation with halved uplink communication costs. Building upon this implementation, we propose two compressed FL algorithms, SCALLION and SCAFCOM, to support unbiased and biased compression, respectively. Both the proposed methods outperform the existing compressed FL methods in terms of communication and computation complexities. Moreover, SCALLION and SCAFCOM accommodate arbitrary data heterogeneity and do not make any additional assumptions on compression errors. Experiments show that SCALLION and SCAFCOM can match the performance of corresponding full-precision FL approaches with substantially reduced uplink communication, and outperform recent compressed FL methods under the same communication budget.

2.Learning to Pivot as a Smart Expert

2308.08171

Authors:Tianhao Liu, Shanwen Pu, Dongdong Ge, Yinyu Ye

Abstract: Linear programming has been practically solved mainly by simplex and interior point methods. Compared with the weakly polynomial complexity obtained by the interior point methods, the existence of strongly polynomial bounds for the length of the pivot path generated by the simplex methods remains a mystery. In this paper, we propose two novel pivot experts that leverage both global and local information of the linear programming instances for the primal simplex method and show their excellent performance numerically. The experts can be regarded as a benchmark to evaluate the performance of classical pivot rules, although they are hard to directly implement. To tackle this challenge, we employ a graph convolutional neural network model, trained via imitation learning, to mimic the behavior of the pivot expert. Our pivot rule, learned empirically, displays a significant advantage over conventional methods in various linear programming problems, as demonstrated through a series of rigorous experiments.

3.A Joint Electricity and Carbon Pricing Method

2308.08195

Authors:Yue Chen, Changhong Zhao

Abstract: The joint electricity and carbon pricing (JECP) problem is crucial for the low-carbon energy system transition. It is also challenging due to requirements such as providing incentives that can motivate market participants to follow the dispatch schedule and minimizing the impact on affected parties compared to when they were in the traditional electricity market. This letter proposes a novel JECP method based on partial carbon tax and primal-dual optimality conditions. Several nice properties of the proposed method are proven. Tests on different systems show its advantages over the two existing pricing methods.

4.Norm and time optimal control problems of stochastic heat equations

2308.08202

Authors:Yuanhang Liu, Donghui Yang, Jie Zhong

Abstract: This paper investigates the norm and time optimal control problems for stochastic heat equations. We begin by presenting a characterization of the norm optimal control, followed by a discussion of its properties. We then explore the equivalence between the norm optimal control and time optimal control, and subsequently establish the bang-bang property of the time optimal control. These problems, to the best of our knowledge, are among the first to be discussed in the stochastic case.

5.SCQPTH: an efficient differentiable splitting method for convex quadratic programming

2308.08232

Authors:Andrew Butler

Abstract: We present SCQPTH: a differentiable first-order splitting method for convex quadratic programs. The SCQPTH framework is based on the alternating direction method of multipliers (ADMM) and the software implementation is motivated by the state-of-the-art solver OSQP: an operator splitting solver for convex quadratic programs (QPs). The SCQPTH software is made available as an open-source Python package and contains many similar features including efficient reuse of matrix factorizations, infeasibility detection, automatic scaling and parameter selection. The forward pass algorithm performs operator splitting in the dimension of the original problem space and is therefore suitable for large scale QPs with $100-1000$ decision variables and thousands of constraints. Backpropagation is performed by implicit differentiation of the ADMM fixed-point mapping. Experiments demonstrate that for large scale QPs, SCQPTH can provide a $1\times - 10\times$ improvement in computational efficiency in comparison to existing differentiable QP solvers.

6.Global solution and optimal control of an epidemic propagation with a heterogeneous diffusion

2308.08251

Authors:Pierluigi Colli, Gianni Gilardi, Gabriela Marinoschi

Abstract: In this paper, we explore the solvability and the optimal control problem for a compartmental model based on reaction-diffusion partial differential equations describing a transmissible disease. The nonlinear model takes into account the disease spreading due to the human social diffusion, under a dynamic heterogeneity in infection risk. The analysis of the resulting system provides the existence proof for a global solution and determines the conditions of optimality to reduce the concentration of the infected population in certain spatial areas.

7.A Framework for Data-Driven Explainability in Mathematical Optimization

2308.08309

Authors:Kevin-Martin Aigner, Marc Goerigk, Michael Hartisch, Frauke Liers, Arthur Miehlich

Abstract: Advancements in mathematical programming have made it possible to efficiently tackle large-scale real-world problems that were deemed intractable just a few decades ago. However, provably optimal solutions may not be accepted due to the perception of optimization software as a black box. Although well understood by scientists, this lacks easy accessibility for practitioners. Hence, we advocate for introducing the explainability of a solution as another evaluation criterion, next to its objective value, which enables us to find trade-off solutions between these two criteria. Explainability is attained by comparing against (not necessarily optimal) solutions that were implemented in similar situations in the past. Thus, solutions are preferred that exhibit similar features. Although we prove that already in simple cases the explainable model is NP-hard, we characterize relevant polynomially solvable cases such as the explainable shortest-path problem. Our numerical experiments on both artificial as well as real-world road networks show the resulting Pareto front. It turns out that the cost of enforcing explainability can be very small.

8.Digital twinning of cardiac electrophysiology models from the surface ECG: a geodesic backpropagation approach

2308.08410

Authors:Thomas Grandits, Jan Verhülsdonk, Gundolf Haase, Alexander Effland, Simone Pezzuto

Abstract: The eikonal equation has become an indispensable tool for modeling cardiac electrical activation accurately and efficiently. In principle, by matching clinically recorded and eikonal-based electrocardiograms (ECGs), it is possible to build patient-specific models of cardiac electrophysiology in a purely non-invasive manner. Nonetheless, the fitting procedure remains a challenging task. The present study introduces a novel method, Geodesic-BP, to solve the inverse eikonal problem. Geodesic-BP is well-suited for GPU-accelerated machine learning frameworks, allowing us to optimize the parameters of the eikonal equation to reproduce a given ECG. We show that Geodesic-BP can reconstruct a simulated cardiac activation with high accuracy in a synthetic test case, even in the presence of modeling inaccuracies. Furthermore, we apply our algorithm to a publicly available dataset of a rabbit model, with very positive results. Given the future shift towards personalized medicine, Geodesic-BP has the potential to help in future functionalizations of cardiac models meeting clinical time constraints while maintaining the physiological accuracy of state-of-the-art cardiac models.

9.Constrained Global Optimization by Smoothing

2308.08422

Authors:Vladimir Norkin, Alois Pichler, Anton Kozyriev

Abstract: This paper proposes a novel technique called "successive stochastic smoothing" that optimizes nonsmooth and discontinuous functions while considering various constraints. Our methodology enables local and global optimization, making it a powerful tool for many applications. First, a constrained problem is reduced to an unconstrained one by the exact nonsmooth penalty function method, which does not assume the existence of the objective function outside the feasible area and does not require the selection of the penalty coefficient. This reduction is exact in the case of minimization of a lower semicontinuous function under convex constraints. Then the resulting objective function is sequentially smoothed by the kernel method starting from relatively strong smoothing and with a gradually vanishing degree of smoothing. The finite difference stochastic gradient descent with trajectory averaging minimizes each smoothed function locally. Finite differences over stochastic directions sampled from the kernel estimate the stochastic gradients of the smoothed functions. We investigate the convergence rate of such stochastic finite-difference method on convex optimization problems. The "successive smoothing" algorithm uses the results of previous optimization runs to select the starting point for optimizing a consecutive, less smoothed function. Smoothing provides the "successive smoothing" method with some global properties. We illustrate the performance of the "successive stochastic smoothing" method on constrained optimization test problems from the literature.

10.Differentiable Robust Model Predictive Control

2308.08426

Authors:Alex Oshin, Evangelos A. Theodorou

Abstract: Deterministic model predictive control (MPC), while powerful, is often insufficient for effectively controlling autonomous systems in the real-world. Factors such as environmental noise and model error can cause deviations from the expected nominal performance. Robust MPC algorithms aim to bridge this gap between deterministic and uncertain control. However, these methods are often excessively difficult to tune for robustness due to the nonlinear and non-intuitive effects that controller parameters have on performance. To address this challenge, a unifying perspective on differentiable optimization for control is presented, which enables derivation of a general, differentiable tube-based MPC algorithm. The proposed approach facilitates the automatic and real-time tuning of robust controllers in the presence of large uncertainties and disturbances.

11.Episodic Bayesian Optimal Control with Unknown Randomness Distributions

2308.08478

Authors:Alexander Shapiro, Enlu Zhou, Yifan Lin, Yuhao Wang

Abstract: Stochastic optimal control with unknown randomness distributions has been studied for a long time, encompassing robust control, distributionally robust control, and adaptive control. We propose a new episodic Bayesian approach that incorporates Bayesian learning with optimal control. In each episode, the approach learns the randomness distribution with a Bayesian posterior and subsequently solves the corresponding Bayesian average estimate of the true problem. The resulting policy is exercised during the episode, while additional data/observations of the randomness are collected to update the Bayesian posterior for the next episode. We show that the resulting episodic value functions and policies converge almost surely to their optimal counterparts of the true problem if the parametrized model of the randomness distribution is correctly specified. We further show that the asymptotic convergence rate of the episodic value functions is of the order $O(N^{-1/2})$. We develop an efficient computational method based on stochastic dual dynamic programming for a class of problems that have convex value functions. Our numerical results on a classical inventory control problem verify the theoretical convergence results and demonstrate the effectiveness of the proposed computational method.

12.Generalizing the Min-Max Regret Criterion using Ordered Weighted Averaging

2308.08522

Authors:Werner Baak, Marc Goerigk, Adam Kasperski, Paweł Zieliński

Abstract: In decision making under uncertainty, several criteria have been studied to aggregate the performance of a solution over multiple possible scenarios, including the ordered weighted averaging (OWA) criterion and min-max regret. This paper introduces a novel generalization of min-max regret, leveraging the modeling power of OWA to enable a more nuanced expression of preferences in handling regret values. This new OWA regret approach is studied both theoretically and numerically. We derive several properties, including polynomially solvable and hard cases, and introduce an approximation algorithm. Through computational experiments using artificial and real-world data, we demonstrate the advantages of our OWAR method over the conventional min-max regret approach, alongside the effectiveness of the proposed clustering heuristics.
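As a rough illustration of how ordered weighted averaging can generalize min-max regret, the sketch below computes scenario regrets for a few candidate solutions and aggregates them with OWA. The cost table, the solution names, and the convention that weights apply to values sorted in non-increasing order are assumptions made for this example; the abstract does not fix them.

```python
def owa(values, weights):
    """Ordered weighted average: weights[0] multiplies the largest value, and so on."""
    ordered = sorted(values, reverse=True)
    return sum(w * v for w, v in zip(weights, ordered))


def regrets(costs, solution):
    """Regret of `solution` per scenario: its cost minus the best achievable cost."""
    n_scenarios = len(costs[solution])
    best = [min(costs[x][s] for x in costs) for s in range(n_scenarios)]
    return [costs[solution][s] - best[s] for s in range(n_scenarios)]


# Hypothetical toy data: costs[x][s] is the cost of solution x under scenario s.
costs = {
    "a": [10, 4, 7],
    "b": [6, 8, 7],
    "c": [9, 5, 3],
}

max_weights = [1.0, 0.0, 0.0]   # all weight on the largest regret: min-max regret
mixed_weights = [0.5, 0.5, 0.0]  # average of the two largest regrets

for name in costs:
    r = regrets(costs, name)
    print(name, r, owa(r, max_weights), owa(r, mixed_weights))

best_minmax = min(costs, key=lambda x: owa(regrets(costs, x), max_weights))
print("min-max regret choice:", best_minmax)
```

With all weight on the first position the OWA of the regrets is the maximum regret, so the classical criterion appears as a special case; other weight vectors trade off worst-case against more typical regret.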

Tue, 15 Aug 2023digest

1.Q-Learning for Continuous State and Action MDPs under Average Cost Criteria

2308.07591

Authors:Ali Devran Kara, Serdar Yuksel

Abstract: For infinite-horizon average-cost criterion problems, we present several approximation and reinforcement learning results for Markov Decision Processes with standard Borel spaces. Toward this end, (i) we first provide a discretization based approximation method for fully observed Markov Decision Processes (MDPs) with continuous spaces under average cost criteria, and we provide error bounds for the approximations when the dynamics are only weakly continuous under certain ergodicity assumptions. In particular, we relax the total variation condition given in prior work to weak continuity as well as Wasserstein continuity conditions. (ii) We provide synchronous and asynchronous Q-learning algorithms for continuous spaces via quantization, and establish their convergence. (iii) We show that the convergence is to the optimal Q values of the finite approximate models constructed via quantization. Our Q-learning convergence results and their convergence to near optimality are new for continuous spaces, and the proof method is new even for finite spaces, to our knowledge.

2.Entropic Model Predictive Optimal Transport for Underactuated Linear Systems

2308.07599

Authors:Kaito Ito, Kenji Kashima

Abstract: This letter investigates dynamical optimal transport of underactuated linear systems over an infinite time horizon. In our previous work, we proposed to integrate model predictive control and the celebrated Sinkhorn algorithm to perform efficient dynamical transport of agents. However, the proposed method requires the invertibility of input matrices, which severely limits its applicability. To resolve this issue, we extend the method to (possibly underactuated) controllable linear systems. In addition, we ensure the convergence properties of the method for general controllable linear systems. The effectiveness of the proposed method is demonstrated by a numerical example.
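The Sinkhorn algorithm mentioned above can be sketched in its basic static form as follows. This is only the textbook entropic optimal transport iteration between two discrete distributions, not the model predictive scheme of the letter; the marginals, the cost matrix, and the regularization parameter `eps` are hypothetical values chosen for illustration.

```python
import math


def sinkhorn(a, b, cost, eps=0.5, n_iter=200):
    """Entropic OT plan between discrete marginals a and b via Sinkhorn scaling."""
    n, m = len(a), len(b)
    # Gibbs kernel K_ij = exp(-C_ij / eps).
    K = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iter):
        # Alternately rescale rows and columns to match the marginals.
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    # Transport plan P = diag(u) K diag(v).
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]


# Hypothetical toy problem: two source points, two target points.
a = [0.5, 0.5]
b = [0.25, 0.75]
cost = [[0.0, 1.0], [1.0, 0.0]]
plan = sinkhorn(a, b, cost)

row_sums = [sum(row) for row in plan]
col_sums = [sum(plan[i][j] for i in range(len(a))) for j in range(len(b))]
print(row_sums, col_sums)
```

Each iteration only needs matrix-vector products with the kernel, which is what makes the scheme attractive for repeated use inside a control loop.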

3.Quantile Optimization via Multiple Timescale Local Search for Black-box Functions

2308.07607

Authors:Jiaqiao Hu, Meichen Song, Michael C. Fu

Abstract: We consider quantile optimization of black-box functions that are estimated with noise. We propose two new iterative three-timescale local search algorithms. The first algorithm uses an appropriately modified finite-difference-based gradient estimator that requires $2d + 1$ samples of the black-box function per iteration of the algorithm, where $d$ is the number of decision variables (dimension of the input vector). For higher-dimensional problems, this algorithm may not be practical if the black-box function estimates are expensive. The second algorithm employs a simultaneous-perturbation-based gradient estimator that uses only three samples for each iteration regardless of problem dimension. Under appropriate conditions, we show the almost sure convergence of both algorithms. In addition, for the class of strongly convex functions, we further establish their (finite-time) convergence rate through a novel fixed-point argument. Simulation experiments indicate that the algorithms work well on a variety of test problems and compare well with recently proposed alternative methods.
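To make the dimension dependence concrete, here is a minimal sketch contrasting a coordinate-wise finite-difference gradient estimate with a simultaneous-perturbation one on an ordinary noiseless function. These are the generic textbook estimators, not the quantile-specific ones of the paper, so the evaluation counts here ($2d$ and $2$) differ from the $2d + 1$ and three samples quoted above; `sphere`, `Counted`, and the step size `c` are hypothetical names and values.

```python
import random


class Counted:
    """Wraps a function and counts how many times it is evaluated."""

    def __init__(self, f):
        self.f = f
        self.calls = 0

    def __call__(self, x):
        self.calls += 1
        return self.f(x)


def sphere(x):
    """Toy objective: sum of squares, with gradient 2 * x."""
    return sum(xi * xi for xi in x)


def central_difference_gradient(f, x, c=1e-4):
    """Coordinate-wise central differences: 2 * d evaluations of f."""
    grad = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += c
        xm[i] -= c
        grad.append((f(xp) - f(xm)) / (2 * c))
    return grad


def simultaneous_perturbation_gradient(f, x, c=1e-4, rng=random):
    """SPSA-style estimate: 2 evaluations of f regardless of dimension."""
    delta = [rng.choice((-1.0, 1.0)) for _ in x]  # random sign per coordinate
    xp = [xi + c * di for xi, di in zip(x, delta)]
    xm = [xi - c * di for xi, di in zip(x, delta)]
    diff = (f(xp) - f(xm)) / (2 * c)
    return [diff / di for di in delta]


x = [1.0, -2.0, 3.0]
fd_f, sp_f = Counted(sphere), Counted(sphere)
fd_grad = central_difference_gradient(fd_f, x)
sp_grad = simultaneous_perturbation_gradient(sp_f, x, rng=random.Random(0))
print(fd_f.calls, sp_f.calls)
```

A single simultaneous-perturbation estimate is noisy in more than one dimension; it is its average over random directions that tracks the gradient, which is why such estimators are paired with diminishing step sizes.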

4.60 years of cyclic monotonicity: a survey

2308.07682

Authors:A. Kausamo, L. De Pascale, K. Wyczesany

Abstract: The primary purpose of this note is to provide an instructional summary of the state of the art regarding cyclic monotonicity and related notions. We will also present how these notions are tied to optimality in the optimal transport (or Monge-Kantorovich) problem.

5.High-order proximal point algorithm for the monotone variational inequality problem and its application

2308.07689

Authors:Jingyu Gao, Xiurui Geng

Abstract: The proximal point algorithm (PPA) has been developed to solve the monotone variational inequality problem. It provides a theoretical foundation for some methods, such as the augmented Lagrangian method (ALM) and the alternating direction method of multipliers (ADMM). This paper generalizes the PPA to the $p$th-order ($p\geq 1$) and proves its convergence rate $O \left(1/k^{p/2}\right)$. Additionally, the $p$th-order ALM is proposed based on the $p$th-order PPA. Some numerical experiments are presented to demonstrate the performance of the $p$th-order ALM.

6.A Fast Smoothing Newton Method for Bilevel Hyperparameter Optimization for SVC with Logistic Loss

2308.07734

Authors:Yixin Wang, Qingna Li

Abstract: Support Vector Classification with logistic loss has excellent theoretical properties in classification problems where the label values are not continuous. In this paper, we reformulate the hyperparameter selection for SVC with logistic loss as a bilevel optimization problem in which the upper-level problem and the lower-level problem are both based on logistic loss. The resulting bilevel optimization model is converted to a single-level nonlinear programming (NLP) problem based on the KKT conditions of the lower-level problem. Such NLP contains a set of nonlinear equality constraints and a simple lower bound constraint. The second-order sufficient condition is characterized, which guarantees that the strict local optimizers are obtained. To solve such NLP, we apply the smoothing Newton method proposed in \cite{Liang} to solve the KKT conditions, which contain one pair of complementarity constraints. We show that the smoothing Newton method has a superlinear convergence rate. Extensive numerical results verify the efficiency of the proposed approach and strict local minimizers can be achieved both numerically and theoretically. In particular, our algorithm can achieve competitive results while consuming less time than other methods.

7.Optimization of piecewise smooth shapes under uncertainty using the example of Navier-Stokes flow

2308.07742

Authors:Caroline Geiersbach, Tim Suchan, Kathrin Welker

Abstract: We investigate a complex system involving multiple shapes to be optimized in a domain, taking into account geometric constraints on the shapes and uncertainty appearing in the physics. We connect the differential geometry of product shape manifolds with multi-shape calculus, which provides a novel framework for the handling of piecewise smooth shapes. This multi-shape calculus is applied to a shape optimization problem where shapes serve as obstacles in a system governed by steady state incompressible Navier-Stokes flow. Numerical experiments use our recently developed stochastic augmented Lagrangian method and we investigate the choice of algorithmic parameters using the example of this application.

8.An efficient sieving based secant method for sparse optimization problems with least-squares constraints

2308.07812

Authors:Qian Li, Defeng Sun, Yancheng Yuan

Abstract: In this paper, we propose an efficient sieving based secant method to address the computational challenges of solving sparse optimization problems with least-squares constraints. A level-set method has been introduced in [X. Li, D.F. Sun, and K.-C. Toh, SIAM J. Optim., 28 (2018), pp. 1842--1866] that solves these problems by using the bisection method to find a root of a univariate nonsmooth equation $\varphi(\lambda) = \varrho$ for some $\varrho > 0$, where $\varphi(\cdot)$ is the value function computed by a solution of the corresponding regularized least-squares optimization problem. When the objective function in the constrained problem is a polyhedral gauge function, we prove that (i) for any positive integer $k$, $\varphi(\cdot)$ is piecewise $C^k$ in an open interval containing the solution $\lambda^*$ to the equation $\varphi(\lambda) = \varrho$; (ii) the Clarke Jacobian of $\varphi(\cdot)$ is always positive. These results allow us to establish the essential ingredients of the fast convergence rates of the secant method. Moreover, an adaptive sieving technique is incorporated into the secant method to effectively reduce the dimension of the level-set subproblems for computing the value of $\varphi(\cdot)$. The high efficiency of the proposed algorithm is demonstrated by extensive numerical results.
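As a rough illustration of the root-finding step, a plain secant iteration for $\varphi(\lambda) = \varrho$ might look as follows. The function `phi` below is a hypothetical smooth stand-in; in the paper $\varphi$ is a nonsmooth value function obtained by solving a regularized least-squares problem, and the adaptive sieving technique is not shown.

```python
def secant_root(phi, rho, lam0, lam1, tol=1e-12, max_iter=100):
    """Secant iteration for phi(lam) = rho, started from lam0 and lam1."""
    g0, g1 = phi(lam0) - rho, phi(lam1) - rho
    for _ in range(max_iter):
        # Stop on a small residual, or when the secant slope degenerates.
        if abs(g1) <= tol or g1 == g0:
            break
        lam2 = lam1 - g1 * (lam1 - lam0) / (g1 - g0)
        lam0, g0 = lam1, g1
        lam1, g1 = lam2, phi(lam2) - rho
    return lam1


# Hypothetical stand-in for the value function: phi(lam) = lam**3 + lam,
# so phi(lam) = 10 has the root lam = 2.
root = secant_root(lambda lam: lam**3 + lam, rho=10.0, lam0=1.0, lam1=3.0)
```

Unlike bisection, the secant update reuses the two most recent function values, so each iteration costs a single evaluation of `phi`.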

Mon, 14 Aug 2023digest

1.Non-Myopic Sensor Control for Target Search and Track Using a Sample-Based GOSPA Implementation

2308.07088

Authors:Marcel Hernandez, Angel Garcia-Fernandez, Simon Maskell

Abstract: This paper is concerned with sensor management for target search and track using the generalised optimal subpattern assignment (GOSPA) metric. Utilising the GOSPA metric to predict future system performance is computationally challenging, because of the need to account for uncertainties within the scenario, notably the number of targets, the locations of targets, and the measurements generated by the targets subsequent to performing sensing actions. In this paper, efficient sample-based techniques are developed to calculate the predicted mean square GOSPA metric. These techniques allow for missed detections and false alarms, and thereby enable the metric to be exploited in scenarios more complex than those previously considered. Furthermore, the GOSPA methodology is extended to perform non-myopic (i.e. multi-step) sensor management via the development of a Bellman-type recursion that optimises a conditional GOSPA-based metric. Simulations for scenarios with missed detections, false alarms, and planning horizons of up to three time steps demonstrate the approach, in particular showing that optimal plans align with an intuitive understanding of how taking into account the opportunity to make future observations should influence the current action. It is concluded that the GOSPA-based, non-myopic search and track algorithm offers a powerful mechanism for sensor management.

2.Existence of Markov equilibrium control in discrete time

2308.07227

Authors:Erhan Bayraktar, Bingyan Han

Abstract: For time-inconsistent stochastic controls in discrete time and finite horizon, an open problem in Björk and Murgoci (Finance Stoch, 2014) is the existence of an equilibrium control. A nonrandomized Borel measurable Markov equilibrium policy exists if the objective is inf-compact in every time step. We provide a sufficient condition for the inf-compactness and thus existence, with costs that are lower semicontinuous (l.s.c.) and bounded from below and transition kernels that are continuous in controls under given states. The control spaces need not be compact.

3.Self-Healing First-Order Distributed Optimization with Packet Loss

2308.07246

Authors:Israel L. Donato Ridgley, Randy A. Freeman, Kevin M. Lynch

Abstract: We describe SH-SVL, a parameterized family of first-order distributed optimization algorithms that enable a network of agents to collaboratively calculate a decision variable that minimizes the sum of cost functions at each agent. These algorithms are self-healing in that their convergence to the correct optimizer can be guaranteed even if they are initialized randomly, agents join or leave the network, or local cost functions change. We also present simulation evidence that our algorithms are self-healing in the case of dropped communication packets. Our algorithms are the first single-Laplacian methods for distributed convex optimization to exhibit all of these characteristics. We achieve self-healing by sacrificing internal stability, a fundamental trade-off for single-Laplacian methods.

4.Vibrational Stabilization of Cluster Synchronization in Oscillator Networks

2308.07302

Authors:Yuzhen Qin, Alberto Maria Nobili, Danielle S. Bassett, Fabio Pasqualetti

Abstract: Cluster synchronization is of paramount importance for the normal functioning of numerous technological and natural systems. Deviations from normal cluster synchronization patterns are closely associated with various malfunctions, such as neurological disorders in the brain. Therefore, it is crucial to restore normal system functions by stabilizing the appropriate cluster synchronization patterns. Most existing studies focus on designing controllers based on state measurements to achieve system stabilization. However, in many real-world scenarios, measuring system states, such as neuronal activity in the brain, poses significant challenges, rendering the stabilization of such systems difficult. To overcome this challenge, in this paper, we employ an open-loop control strategy, vibrational control, which does not require any state measurements. We establish some sufficient conditions under which vibrational inputs stabilize cluster synchronization. Further, we provide a tractable approach to design vibrational control. Finally, numerical experiments are conducted to demonstrate our theoretical findings.

Fri, 11 Aug 2023digest

1.Comparison of Dynamic Tomato Growth Models for Optimal Control in Greenhouses

2308.06031

Authors:Michael Fink, Annalena Daniels, Cheng Qian, Víctor Martínez Velásquez, Sahil Salotra, Dirk Wollherr

Abstract: As global demand for efficiency in agriculture rises, there is a growing interest in high-precision farming practices. Greenhouses in particular play a critical role in ensuring a year-round supply of fresh produce. In order to maximize efficiency and productivity while minimizing resource use, mathematical techniques such as optimal control have been employed. However, selecting appropriate models for optimal control requires domain expertise. This study aims to compare three established tomato models for their suitability in an optimal control framework. Results show that all three models have similar yield predictions and accuracy, but only two models are currently applicable for optimal control due to implementation limitations. The two remaining models each have advantages in terms of economic yield and computation times, but the differences in optimal control strategies suggest that they require more accurate parameter identification and calibration tailored to greenhouses.

Thu, 10 Aug 2023digest

1.Existence theorems for optimal solutions in semi-algebraic optimization

2308.05349

Authors:Jae Hyoung Lee, Gue Myung Lee, Tien Son Pham

Abstract: Consider the problem of minimizing a lower semi-continuous semi-algebraic function $f \colon \mathbb{R}^n \to \mathbb{R} \cup \{+\infty\}$ on an unbounded closed semi-algebraic set $S \subset \mathbb{R}^n.$ Employing adequate tools of semi-algebraic geometry, we first establish some properties of the tangency variety of the restriction of $f$ on $S.$ Then we derive verifiable necessary and sufficient conditions for the existence of optimal solutions of the problem as well as the boundedness from below and coercivity of the restriction of $f$ on $S.$ We also present a computable formula for the optimal value of the problem.

2.Optimal Control of Dynamic District Heating Networks

2308.05376

Authors:Christian Jäkle, Lena Reichle, Stefan Volkwein

Abstract: In the present paper an optimal control problem for a system of differential-algebraic equations (DAEs) is considered. This problem arises in the dynamic optimization of unsteady district heating networks. Based on the Carathéodory theory, existence of a unique solution to the DAE system is proved using specific properties of the district heating network model. Moreover, it is shown that the optimal control problem possesses optimal solutions. For the numerical experiments different networks are considered, including data from a real district heating network.

3.A Generalized Primal-Dual Correction Method for Saddle-Point Problems with the Nonlinear Coupling Operator

2308.05388

Authors:Sai Wang, Yi Gong

Abstract: Recently, the generalized primal-dual (GPD) method was developed for saddle-point problems (SPPs) with a linear coupling operator. However, the coupling operator in many engineering applications is nonlinear. In this letter, we propose a generalized primal-dual correction method (GPD-CM) to handle SPPs with a nonlinear coupling operator. To achieve this, we customize the proximal matrix and corrective matrix by adjusting the values of regularization factors. By the unified framework, the convergence of GPD-CM is directly obtained. Numerical results on an SPP with an exponential coupling operator support the theoretical analysis.

4.Communication-efficient distributed optimization with adaptability to system heterogeneity

2308.05395

Authors:Ziyi Yu, Nikolaos M. Freris

Abstract: We consider the setting of agents cooperatively minimizing the sum of local objectives plus a regularizer on a graph. This paper proposes a primal-dual method in consideration of three distinctive attributes of real-life multi-agent systems, namely: (i) expensive communication, (ii) lack of synchronization, and (iii) system heterogeneity. Specifically, we propose a distributed asynchronous algorithm with minimal communication cost, in which users commit variable amounts of local work on their respective sub-problems. We illustrate this both theoretically and experimentally in the machine learning setting, where the agents hold private data and use a stochastic Newton method as the local solver. Under standard assumptions on Lipschitz continuous gradients and strong convexity, our analysis establishes linear convergence in expectation and characterizes the dependency of the rate on the number of local iterations. We proceed a step further to propose a simple means for tuning agents' hyperparameters locally, so as to adjust to heterogeneity and accelerate the overall convergence. Last, we validate our proposed method on a benchmark machine learning dataset to illustrate the merits in terms of computation, communication, and run-time saving as well as adaptability to heterogeneity.

5.Unifying Distributionally Robust Optimization via Optimal Transport Theory

2308.05414

Authors:Jose Blanchet, Daniel Kuhn, Jiajin Li, Bahar Taskesen

Abstract: In the past few years, there has been considerable interest in two prominent approaches for Distributionally Robust Optimization (DRO): Divergence-based and Wasserstein-based methods. The divergence approach models misspecification in terms of likelihood ratios, while the latter models it through a measure of distance or cost in actual outcomes. Building upon these advances, this paper introduces a novel approach that unifies these methods into a single framework based on optimal transport (OT) with conditional moment constraints. Our proposed approach, for example, makes it possible for optimal adversarial distributions to simultaneously perturb likelihood and outcomes, while producing an optimal (in an optimal transport sense) coupling between the baseline model and the adversarial model. Additionally, the paper investigates several duality results and presents tractable reformulations that enhance the practical applicability of this unified framework.

6.Bounding the Difference between the Values of Robust and Non-Robust Markov Decision Problems

2308.05520

Authors:Ariel Neufeld, Julian Sester

Abstract: In this note we provide an upper bound for the difference between the value function of a distributionally robust Markov decision problem and the value function of a non-robust Markov decision problem, where the ambiguity set of probability kernels of the distributionally robust Markov decision process is described by a Wasserstein-ball around some reference kernel whereas the non-robust Markov decision process behaves according to a fixed probability kernel contained in the ambiguity set. Our derived upper bound for the difference between the value functions is dimension-free and depends linearly on the radius of the Wasserstein-ball.

7.Learning (With) Distributed Optimization

2308.05548

Authors:Aadharsh Aadhithya A, Abinesh S, Akshaya J, Jayanth M, Vishnu Radhakrishnan, Sowmya V, Soman K. P

Abstract: This paper provides an overview of the historical progression of distributed optimization techniques, tracing their development from early duality-based methods pioneered by Dantzig, Wolfe, and Benders in the 1960s to the emergence of the Augmented Lagrangian Alternating Direction Inexact Newton (ALADIN) algorithm. The initial focus on Lagrangian relaxation for convex problems and decomposition strategies led to the refinement of methods like the Alternating Direction Method of Multipliers (ADMM). The resurgence of interest in distributed optimization in the late 2000s, particularly in machine learning and imaging, demonstrated ADMM's practical efficacy and its unifying potential. This overview also highlights the emergence of the proximal center method and its applications in diverse domains. Furthermore, the paper underscores the distinctive features of ALADIN, which offers convergence guarantees for non-convex scenarios without introducing auxiliary variables, differentiating it from traditional augmentation techniques. In essence, this work encapsulates the historical trajectory of distributed optimization and underscores the promising prospects of ALADIN in addressing non-convex optimization challenges.

8.Disturbance attenuation in the Euler-Bernoulli beam using piezoelectric actuators

2308.05551

Authors:Anton Selivanov, Emilia Fridman

Abstract: We consider a simply-supported Euler-Bernoulli beam with viscous and Kelvin-Voigt damping. Our objective is to attenuate the effect of an unknown distributed disturbance using one piezoelectric actuator. We show how to design a suitable $H_\infty$ state-feedback controller based on a finite number of dominating modes. If the remaining (infinitely many) modes are ignored, the calculated $L^2$ gain is wrong. This happens because of the spillover phenomenon that occurs when the effect of the control on truncated modes is not accounted for in the feedback design. We propose a simple modification of the $H_\infty$ cost that prevents spillover. The key idea is to treat the control as a disturbance in the truncated modes and find the corresponding $L^2$ gains using the bounded real lemma. These $L^2$ gains are added to the control weight in the $H_\infty$ cost for the dominating modes, which prevents spillover. A numerical simulation of an aluminum beam with realistic parameters demonstrates the effectiveness of the proposed method.

9.Intercept Function and Quantity Bidding in Two-stage Electricity Market with Market Power Mitigation

2308.05570

Authors:Rajni Kant Bansal, Yue Chen, Pengcheng You, Enrique Mallada

Abstract: Electricity markets typically operate in two stages, day-ahead and real-time. Despite best efforts to achieve efficiency, evidence of price manipulation has called for system-level market power mitigation (MPM) initiatives that substitute noncompetitive bids with default bids. Implementing these policies with a limited understanding of participant behavior may lead to unintended economic losses. In this paper, we model the competition between generators and inelastic loads in a two-stage market with stage-wise MPM policies. The loss of Nash equilibrium and lack of guarantee of stable market outcome in the case of conventional supply function bidding motivate the use of an alternative market mechanism where generators bid an intercept function. A Nash equilibrium analysis for a day-ahead MPM policy leads to a Stackelberg-Nash game with loads exercising market power at the expense of generators. A comparison of the resulting equilibrium with the standard market (not implementing any MPM policy) shows that a day-ahead policy completely mitigates the market power of generators. On the other hand, the real-time MPM policy increases demand allocation to real-time, contrary to current market practice with most electricity trades in the day-ahead market. Numerical studies illustrate the impact of the slope of the intercept function on the standard market.

Wed, 09 Aug 2023digest

1.Expected decrease for derivative-free algorithms using random subspaces

2308.04734

Authors:Warren Hare, Lindon Roberts, Clément W. Royer

Abstract: Derivative-free algorithms seek the minimum of a given function based only on function values queried at appropriate points. Although these methods are widely used in practice, their performance is known to worsen as the problem dimension increases. Recent advances in developing randomized derivative-free techniques have tackled this issue by working in low-dimensional subspaces that are drawn at random in an iterative fashion. The connection between the dimension of these random subspaces and the algorithmic guarantees has yet to be fully understood. In this paper, we develop an analysis for derivative-free algorithms (both direct-search and model-based approaches) employing random subspaces. Our results leverage linear local approximations of smooth functions to obtain understanding of the expected decrease achieved per function evaluation. Although the quantities of interest involve multidimensional integrals with no closed-form expression, a relative comparison for different subspace dimensions suggests that low dimension is preferable. Numerical computation of the quantities of interest confirms the benefit of operating in low-dimensional subspaces.
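To make the setting concrete, a minimal direct-search loop that polls along a few random directions per iteration could be sketched as below. This is an assumption-laden toy, not the algorithms analyzed in the paper: it uses a simple-decrease acceptance test, opportunistic polling, and step halving, and the names `random_subspace_direct_search` and `subspace_dim` are hypothetical.

```python
import math
import random


def random_subspace_direct_search(f, x, subspace_dim=1, step=1.0,
                                  iters=200, seed=0):
    """Poll +/- random unit directions; halve the step when no poll improves."""
    rng = random.Random(seed)
    n = len(x)
    fx = f(x)
    for _ in range(iters):
        improved = False
        # The polled directions span a random subspace of dimension subspace_dim.
        for _ in range(subspace_dim):
            d = [rng.gauss(0.0, 1.0) for _ in range(n)]
            norm = math.sqrt(sum(v * v for v in d))
            d = [v / norm for v in d]
            for sign in (1.0, -1.0):
                trial = [xi + sign * step * di for xi, di in zip(x, d)]
                f_trial = f(trial)
                if f_trial < fx:  # simple decrease; only function values are used
                    x, fx, improved = trial, f_trial, True
                    break
            if improved:
                break
        if not improved:
            step *= 0.5
    return x, fx


def sphere(v):
    return sum(vi * vi for vi in v)


x0 = [3.0] * 5
x_best, f_best = random_subspace_direct_search(sphere, x0, subspace_dim=1)
```

Each iteration spends at most `2 * subspace_dim` function evaluations, which is the per-evaluation cost that the choice of subspace dimension trades against the decrease achieved.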

2.Modelling and Simulation of District Heating Networks

2308.04790

Authors:Christian Jäkle, Lena Reichle, Stefan Volkwein

Abstract: In the present paper a detailed mathematical model is derived for district heating networks. After semidiscretization of the convective heat equation and introducing coupling conditions at the nodes of the network, one gets a high-dimensional system of differential-algebraic equations (DAEs). When temporal changes of the water velocity in the pipes are neglected, the numerical solutions do not change significantly and the DAEs have index one. Numerical experiments illustrate that the model describes the real situation very well.

3.Periodic optimal control of a plug flow reactor model with an isoperimetric constraint

2308.04804

Authors:Yevgeniia Yevgenieva, Alexander Zuyev, Peter Benner, Andreas Seidel-Morgenstern

Abstract: We study a class of nonlinear hyperbolic partial differential equations with boundary control. This class describes chemical reactions of the type ``$A \to$ product'' carried out in a plug flow reactor (PFR) in the presence of an inert component. An isoperimetric optimal control problem with periodic boundary conditions and input constraints is formulated for the considered mathematical model in order to maximize the mean amount of product over the period. For the single-input system, the optimality of a bang-bang control strategy is proved in the class of bounded measurable inputs. The case of controlled flow rate input is also analyzed by exploiting the method of characteristics. A case study is performed to illustrate the performance of the reaction model under different control strategies.

4.Fourier series and sidewise profile control of 1-d waves

2308.04906

Authors:E. Zuazua

Abstract: We discuss the sidewise control properties of 1-d waves. In analogy with classical control and inverse problems for wave propagation, the problem consists in controlling the behaviour of waves on part of the boundary of the domain where they propagate, by means of control actions localised on a different subset of the boundary. In contrast with classical problems, the goal is not to control the dynamics of the waves on the interior of the domain, but rather their boundary traces. It is therefore a goal-oriented controllability problem. We propose a duality method that reduces the problem to suitable new observability inequalities, which consist of estimating the boundary traces of waves on part of the boundary from boundary measurements done on another subset of the boundary. These inequalities lead to novel questions that do not seem to be treatable by the classical techniques employed in the field, such as Carleman inequalities, non-harmonic Fourier series, microlocal analysis and multipliers. We propose a genuinely 1-d solution method, based on sidewise energy propagation estimates yielding a complete sharp solution. The obtained observability results can be reinterpreted in terms of Fourier series. This leads to new non-standard questions in the context of non-harmonic Fourier series.

5.How to induce regularization in generalized linear models: A guide to reparametrizing gradient flow

2308.04921

Authors:Hung-Hsu Chou, Johannes Maly, Dominik Stöger

Abstract: In this work, we analyze the relation between reparametrizations of gradient flow and the induced implicit bias on general linear models, which encompass various basic classification and regression tasks. In particular, we aim at understanding the influence of the model parameters - reparametrization, loss, and link function - on the convergence behavior of gradient flow. Our results provide user-friendly conditions under which the implicit bias can be well-described and convergence of the flow is guaranteed. We furthermore show how to use these insights for designing reparametrization functions that lead to specific implicit biases like $\ell_p$- or trigonometric regularizers.

6.Comparative analysis of mathematical formulations for the two-dimensional guillotine cutting problem

2308.04965

Authors:Henrique Becker, Mateus Martin, Olinto Araujo, Luciana S. Buriol, Reinaldo Morabito

Abstract: About ten years ago, a paper proposed the first integer linear programming formulation for the constrained two-dimensional guillotine cutting problem (with unlimited cutting stages). Since then, six other formulations have followed, five of them in the last two years. This spike of interest gave no opportunity for a comprehensive comparison between the formulations. We review each formulation and compare their empirical results over instance datasets of the literature. We adapt most formulations to allow for piece rotation. The possibility of adaptation was already predicted but not realized by the prior work. The results show the dominance of pseudo-polynomial formulations until the point instances become intractable by them, while more compact formulations keep achieving good primal solutions. Our study also reveals a small but consistent advantage of the Gurobi solver over the CPLEX solver in our context; that the choice of solver hardly benefits one formulation over another; and a mistake in the generation of the T instances, which should have the same optima with or without guillotine cuts. Our study also proposes hybridising the most recent formulation with a prior formulation for a restricted version of the problem. The hybridisations show a reduction of about 20% of the branch-and-bound time thanks to the symmetries broken by the hybridisation.

7.Boosting Data-Driven Mirror Descent with Randomization, Equivariance, and Acceleration

2308.05045

Authors:Hong Ye Tan, Subhadip Mukherjee, Junqi Tang, Carola-Bibiane Schönlieb

Abstract: Learning-to-optimize (L2O) is an emerging research area in large-scale optimization for data science applications. Very recently, researchers have proposed a novel L2O framework called learned mirror descent (LMD), based on the classical mirror descent (MD) algorithm, with learnable mirror maps parameterized by input-convex neural networks. The LMD approach has been shown to significantly accelerate convex solvers while inheriting the convergence properties of the classical MD algorithm. Despite the initial successes in small-/mid-scale optimization problems demonstrating the potential of this framework, there is still a long way to go to make this scheme scalable and practical for high-dimensional problems. In this work, we provide several practical extensions of the LMD algorithm. We first propose accelerated and stochastic variants of LMD, leveraging classical momentum-based acceleration and stochastic optimization techniques for improving the convergence rate and per-iteration complexity. Moreover, for the particular application of training neural networks, we derive and propose a novel and efficient parameterization for the mirror potential, exploiting the equivariant structure of the training problems to significantly reduce the dimensionality of the underlying problem. We provide theoretical convergence guarantees for our schemes under standard assumptions, and demonstrate their effectiveness in various computational imaging and machine learning applications such as image inpainting and the training of SVMs.

8.A Nesterov type algorithm with double Tikhonov regularization: fast convergence of the function values and strong convergence to the minimal norm solution

2308.05056

Authors:Mikhail Karapetyants, Szilárd Csaba László

Abstract: We investigate the strong convergence properties of a Nesterov type algorithm with two Tikhonov regularization terms in connection to the minimization problem of a smooth convex function $f.$ We show that the generated sequences converge strongly to the minimal norm element from $\argmin f$. We also show that from a practical point of view the Tikhonov regularization does not affect Nesterov's optimal convergence rate of order $\mathcal{O}(n^{-2})$ for the potential energies $f(x_n)-\min f$ and $f(y_n)-\min f$, where $(x_n),\,(y_n)$ are the sequences generated by our algorithm. Further, we obtain fast convergence to zero of the discrete velocity, but also some estimates concerning the value of the gradient of the objective function in the generated sequences.

9.Impact of environmental constraints in hydrothermal energy planning

2308.05091

Authors:Luís Felipe Bueno, André Luiz Diniz, Rafael Durbano Lobato, Claudia Sagastizábal, Kenny Vinente

Abstract: As a follow-up of the industrial problems dealt with in 2018, 2019, 2021 and 2022, in partnership with CCEE and CEPEL, in 2023 the study group Energy planning and environmental constraints focused on the impact that prioritizing multiple uses of water has on the electric energy production systems, especially in predominantly hydro systems, which is the case of Brazil. In order to model environmental constraints in the long-term hydrothermal generation planning problem, the resulting large-scale multi-stage linear programming problem was modelled in JuMP and solved by stochastic dual dynamic programming. To assess if the development represented well the behavior of the Brazilian power system, the Julia formulation was first benchmarked with Brazil's official model, Newave. Environmental constraints were introduced in this problem by two different approaches, one that represents the multiple uses of water by means of 0-1 variables, and another one that makes piecewise linear approximations of the relevant constraints. Numerical results show that penalties of slack variables strongly affect the obtained water values.

10.Optimal design of vaccination policies: A case study for Newfoundland and Labrador

2308.05204

Authors:Faraz Khoshbakhtian, Hamidreza Validi, Mario Ventresca, Dionne Aleman

Abstract: This paper proposes pandemic mitigation vaccination policies for Newfoundland and Labrador (NL) based on two compact mixed integer programming (MIP) models of the distance-based critical node detection problem (DCNDP). Our main focus is on two variants of the DCNDP that seek to minimize the number of connections with lengths of at most one (1-DCNDP) and two (2-DCNDP). A polyhedral study for the 1-DCNDP is conducted, and new aggregated inequalities are provided for the 2-DCNDP. The computational experiments show that the 2-DCNDP with aggregated inequalities outperforms the one with disaggregated inequalities for graphs with a density of at least 0.5%. We also study the strategic vaccine allocation problem as a real-world application of the DCNDP and conduct a set of computational experiments on a simulated contact network of NL. Our computational results demonstrate that the DCNDP-based strategies can have a better performance in comparison with the real-world strategies implemented during COVID-19.

11.Improving preference disaggregation in multicriteria decision making: incorporating time series analysis and a multi-objective approach

2308.05259

Authors:Betania S. C. Campello, Sarah BenAmor, Leonardo T. Duarte, João Marcos Travassos Romano

Abstract: Preference disaggregation analysis (PDA) is a widely used approach in multicriteria decision analysis that aims to extract preferential information from holistic judgments provided by decision makers. This paper presents an original methodological framework for PDA that addresses two significant challenges in this field. Firstly, it considers the multidimensional structure of data to capture decision makers' preferences based on descriptive measures of the criteria time series, such as trend and average. This novel approach enables an understanding of decision makers' preferences in decision-making scenarios involving time series analysis, which is common in medium- to long-term impact decisions. Secondly, the paper addresses the robustness issue commonly encountered in PDA methods by proposing a multi-objective and Monte Carlo simulation approach. This approach enables the consideration of multiple preference models and provides a mechanism to converge towards the most likely preference model. The proposed method is evaluated using real data, demonstrating its effectiveness in capturing preferences based on criteria and time series descriptive measures. The multi-objective analysis highlights the generation of multiple solutions, and, under specific conditions, reveals the possibility of achieving convergence towards a single solution that represents the decision maker's preferences.

Tue, 08 Aug 2023digest

1.Symplectic Discretization Approach for Developing New Proximal Point Algorithms

2308.03986

Authors:Ya-xiang Yuan, Yi Zhang

Abstract: Proximal point algorithms have found numerous applications in the field of convex optimization, and their accelerated forms have also been proposed. However, the most commonly used accelerated proximal point algorithm was first introduced in 1967, and recent studies on accelerating proximal point algorithms are relatively scarce. In this paper, we propose high-resolution ODEs for the proximal point operators for both closed proper convex functions and maximally monotone operators, and present a Lyapunov function framework to demonstrate that the trajectories of our high-resolution ODEs exhibit accelerated behavior. Subsequently, by symplectically discretizing our high-resolution ODEs, we obtain new proximal point algorithms known as symplectic proximal point algorithms. By decomposing the continuous-time Lyapunov function into its elementary components, we demonstrate that symplectic proximal point algorithms possess $O(1/k^2)$ convergence rates.

Mon, 07 Aug 2023digest

1.Non-Convex Bilevel Optimization with Time-Varying Objective Functions

2308.03811

Authors:Sen Lin, Daouda Sow, Kaiyi Ji, Yingbin Liang, Ness Shroff

Abstract: Bilevel optimization has become a powerful tool in a wide variety of machine learning problems. However, the current nonconvex bilevel optimization considers an offline dataset and static functions, which may not work well in emerging online applications with streaming data and time-varying functions. In this work, we study online bilevel optimization (OBO) where the functions can be time-varying and the agent continuously updates the decisions with online streaming data. To deal with the function variations and the unavailability of the true hypergradients in OBO, we propose a single-loop online bilevel optimizer with window averaging (SOBOW), which updates the outer-level decision based on a window average of the most recent hypergradient estimations stored in the memory. Compared to existing algorithms, SOBOW is computationally efficient and does not need to know previous functions. To handle the unique technical difficulties rooted in single-loop update and function variations for OBO, we develop a novel analytical technique that disentangles the complex couplings between decision variables, and carefully controls the hypergradient estimation error. We show that SOBOW can achieve a sublinear bilevel local regret under mild conditions. Extensive experiments across multiple domains corroborate the effectiveness of SOBOW.
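
The window-averaging idea can be illustrated on a much simpler problem than the one in the paper. The sketch below is a single-level toy, not SOBOW: it keeps the most recent gradient estimates in a fixed-size memory and steps along their average. The function name, the scalar decision variable, and the quadratic losses are all assumptions made for illustration.

```python
from collections import deque


def window_averaged_descent(grad_fns, x0, step, window):
    """Online descent that steps along the average of the last `window` stored gradients.

    grad_fns[t] is the gradient of the (possibly time-varying) loss revealed at round t.
    """
    x = x0
    recent = deque(maxlen=window)  # memory of the most recent gradient estimates
    history = [x]
    for grad in grad_fns:
        recent.append(grad(x))             # estimate at the current decision
        avg = sum(recent) / len(recent)    # window average over stored estimates
        x = x - step * avg
        history.append(x)
    return history


# Toy time-varying losses f_t(x) = 0.5 * (x - c_t)^2 with a slowly drifting target c_t.
targets = [1.0 + 0.01 * t for t in range(50)]
grads = [lambda x, c=c: x - c for c in targets]
path = window_averaged_descent(grads, x0=0.0, step=0.5, window=3)
print(path[-1])
```

Averaging over a window smooths out variation between rounds at the cost of using slightly stale information; the paper's analysis of this trade-off in the bilevel setting is not reproduced here.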

2.Multi-criteria scheduling of realistic flexible job shop: a novel approach for integrating simulation modelling and multi-criteria decision making

2308.03379

Authors:M. Thenarasu, K. Rameshkumar, M. Di Mascolo, S. P. Anbuudayasankar

Abstract: Increased flexibility in job shops leads to more complexity in decision-making for shop floor engineers. Partial Flexible Job Shop Scheduling (PFJSS) is a subset of Job shop problems and has substantial application in the real world. Priority Dispatching Rules (PDRs) are simple and easy to implement for making quick decisions in real-time. The present study proposes a novel method of integrating Multi-Criteria Decision Making (MCDM) methods and the Discrete Event Simulation (DES) Model to define job priorities in large-scale problems involving multiple criteria. DES approach is employed to model the PFJSS to evaluate Makespan, Flow Time, and Tardiness-based measures considering static and dynamic job arrivals. The proposed approach is implemented in a benchmark problem and large-scale PFJSS. The integration of MCDM methods and simulation models offers the flexibility to choose the parameters that need to govern the ranking of jobs. The solution given by the proposed methods is tested with the best-performing Composite Dispatching Rules (CDR), combining several PDR, which are available in the literature. Proposed MCDM approaches perform well for Makespan, Flow Time, and Tardiness-based measures for large-scale real-world problems. The proposed methodology integrated with the DES model is easy to implement in a real-time shop floor environment.

3.Optimal Design of Line Replaceable Units

2308.03388

Authors:Joni Driessen, Joost de Kruijf, Joachim Arts, Geert-Jan van Houtum

Abstract: A Line Replaceable Unit (LRU) is a collection of connected parts in a system that is replaced when any part of the LRU fails. Companies use LRUs as a mechanism to reduce downtime of systems following a failure. The design of LRUs determines how fast a replacement is performed, so a smart design reduces replacement and downtime cost. A firm must purchase/repair an LRU upon failure, and large LRUs are more expensive to purchase/repair. Hence, a firm seeks to design LRUs such that the average costs per time unit are minimized. We formalize this problem in a new model that captures how parts in a system are connected, and how they are disassembled from the system. Our model optimizes the design of LRUs such that the replacement (and downtime) costs and LRU purchase/repair costs are minimized. We present a set partitioning formulation for which we prove a rare result: the optimal solution is integer, despite a non-integral feasible polyhedron. Secondly, we formulate our problem as a binary linear program. The paper concludes by numerically comparing the computation times of both formulations and illustrates the effects of various parameters on the model's outcome.

4.Approximate propagation of normal distributions for stochastic optimal control of nonsmooth systems

2308.03431

Authors:Florian Messerer, Katrin Baumgärtner, Armin Nurkanović, Moritz Diehl

Abstract: We present a method for the approximate propagation of mean and covariance of a probability distribution through ordinary differential equations (ODE) with discontinuous right-hand side. For piecewise affine systems, a normalization of the propagated probability distribution at every time step allows us to analytically compute the expectation integrals of the mean and covariance dynamics while explicitly taking into account the discontinuity. This leads to a natural smoothing of the discontinuity such that for relevant levels of uncertainty the resulting ODE can be integrated directly with standard schemes and it is neither necessary to prespecify the switching sequence nor to use a switch detection method. We then show how this result can be employed in the more general case of piecewise smooth functions based on a structure preserving linearization scheme. The resulting dynamics can be straightforwardly used within standard formulations of stochastic optimal control problems with chance constraints.

5.Feasible approximation of matching equilibria for large-scale matching for teams problems

2308.03550

Authors:Ariel Neufeld, Qikun Xiang

Abstract: We propose a numerical algorithm for computing approximately optimal solutions of the matching for teams problem. Our algorithm is efficient for problems involving a large number of agent categories and allows for the measures describing the agent types to be non-discrete. Specifically, we parametrize the so-called transfer functions and develop a parametric version of the dual formulation. Our algorithm tackles this parametric formulation and produces feasible and approximately optimal solutions for the primal and dual formulations of the matching for teams problem. These solutions also yield upper and lower bounds for the optimal value, and the difference between the upper and lower bounds provides a direct sub-optimality estimate of the computed solutions. Moreover, we are able to control a theoretical upper bound on the sub-optimality to be arbitrarily close to 0 under mild conditions. We subsequently prove that the approximate primal and dual solutions converge when the sub-optimality goes to 0 and their limits constitute a true matching equilibrium. Thus, the outputs of our algorithm are regarded as an approximate matching equilibrium. We also analyze the theoretical computational complexity of our parametric formulation as well as the sparsity of the resulting approximate matching equilibrium. Through numerical experiments, we showcase that the proposed algorithm can produce high-quality approximate matching equilibria and is applicable to versatile settings, including a high-dimensional setting involving 100 agent categories.

6.A Branch-and-Cut-and-Price Algorithm for Cutting Stock and Related Problems

2308.03595

Authors:Renan F. F. da Silva, Rafael C. S. Schouery

Abstract: We present a branch-and-cut-and-price framework to solve Cutting Stock Problems with strong relaxations using Set Covering (Partition) Formulations, which are solved by column generation. We propose an extended Ryan-Foster branching scheme for non-binary models, a pricing algorithm that converges in a few iterations, and a variable selection algorithm based on branching history. These strategies are combined with subset-row cuts and custom primal heuristics to create a framework that overcomes the current state-of-the-art for the following problems: Cutting Stock, Skiving Stock, Ordered Open-End Bin Packing, Class-Constrained Bin Packing, and Identical Parallel Machines Scheduling with Minimum Makespan. Additionally, a new challenging benchmark for Cutting Stock is introduced.

7.RIP-based Performance Guarantee for Low Rank Matrix Recovery via $L_{*-F}$ Minimization

2308.03642

Authors:Yan Li, Liping Zhang

Abstract: In the underdetermined linear system $\bm{b}=\mathcal{A}(\bm{X})+\bm{s}$, vector $\bm{b}$ and operator $\mathcal{A}$ are the known measurements and $\bm{s}$ is the unknown noise. In this paper, we investigate sufficient conditions for exactly reconstructing a desired matrix $\bm{X}$ that is low-rank or approximately low-rank. We use the difference of the nuclear norm and the Frobenius norm ($L_{*-F}$) as a surrogate for the rank function and establish a new nonconvex relaxation of such low rank matrix recovery, called the $L_{*-F}$ minimization, in order to approximate the rank function more closely. For such nonconvex and nonsmooth constrained $L_{*-F}$ minimization problems, we give upper bounds on the recovery error, treating separately the cases where the noise level is $0$ and where it is not. In particular, in the noise-free case, one sufficient condition for exact recovery is presented. If the linear operator $\mathcal{A}$ satisfies the restricted isometry property with $\delta_{4r}<\frac{\sqrt{2r}-1}{\sqrt{2r}-1+\sqrt{2}(\sqrt{2r}+1)}$, then a rank-$r$ matrix $\bm{X}$ can be exactly recovered without other assumptions. In addition, we also examine the regularized $L_{*-F}$ minimization model, since such a regularized model is more widely used in algorithm design. We provide a recovery error estimate for this regularized $L_{*-F}$ minimization model via the RIP. To our knowledge, this is the first result on exact reconstruction of low rank matrices via regularized $L_{*-F}$ minimization.

8.Almost-sure convergence of iterates and multipliers in stochastic sequential quadratic optimization

2308.03687

Authors:Frank E. Curtis, Xin Jiang, Qi Wang

Abstract: Stochastic sequential quadratic optimization (SQP) methods for solving continuous optimization problems with nonlinear equality constraints have attracted attention recently, such as for solving large-scale data-fitting problems subject to nonconvex constraints. However, for a recently proposed subclass of such methods that is built on the popular stochastic-gradient methodology from the unconstrained setting, convergence guarantees have been limited to the asymptotic convergence of the expected value of a stationarity measure to zero. This is in contrast to the unconstrained setting in which almost-sure convergence guarantees (of the gradient of the objective to zero) can be proved for stochastic-gradient-based methods. In this paper, new almost-sure convergence guarantees for the primal iterates, Lagrange multipliers, and stationarity measures generated by a stochastic SQP algorithm in this subclass of methods are proved. It is shown that the error in the Lagrange multipliers can be bounded by the distance of the primal iterate to a primal stationary point plus the error in the latest stochastic gradient estimate. It is further shown that, subject to certain assumptions, this latter error can be made to vanish by employing a running average of the Lagrange multipliers that are computed during the run of the algorithm. The results of numerical experiments are provided to demonstrate the proved theoretical guarantees.

9.Quadratic-exponential coherent feedback control of linear quantum stochastic systems

2308.03918

Authors:Igor G. Vladimirov, Ian R. Petersen

Abstract: This paper considers a risk-sensitive optimal control problem for a field-mediated interconnection of a quantum plant with a coherent (measurement-free) quantum controller. The plant and the controller are multimode open quantum harmonic oscillators governed by linear quantum stochastic differential equations, which are coupled to each other and driven by multichannel quantum Wiener processes modelling the external bosonic fields. The control objective is to internally stabilize the closed-loop system and minimize the infinite-horizon asymptotic growth rate of a quadratic-exponential functional which penalizes the plant variables and the controller output. We obtain first-order necessary conditions of optimality for this problem by computing the partial Frechet derivatives of the cost functional with respect to the energy and coupling matrices of the controller in frequency domain and state space. An infinitesimal equivalence between the risk-sensitive and weighted coherent quantum LQG control problems is also established. In addition to variational methods, we employ spectral factorizations and infinite cascades of auxiliary classical systems. Their truncations are applicable to numerical optimization algorithms (such as the gradient descent) for coherent quantum risk-sensitive feedback synthesis.

Fri, 04 Aug 2023digest

1.Optimization on Pareto sets: On a theory of multi-objective optimization

2308.02145

Authors:Abhishek Roy, Geelon So, Yi-An Ma

Abstract: In multi-objective optimization, a single decision vector must balance the trade-offs between many objectives. Solutions achieving an optimal trade-off are said to be Pareto optimal: these are decision vectors for which improving any one objective must come at a cost to another. But as the set of Pareto optimal vectors can be very large, we further consider a more practically significant Pareto-constrained optimization problem, where the goal is to optimize a preference function constrained to the Pareto set. We investigate local methods for solving this constrained optimization problem, which poses significant challenges because the constraint set is (i) implicitly defined, and (ii) generally non-convex and non-smooth, even when the objectives are. We define notions of optimality and stationarity, and provide an algorithm with a last-iterate convergence rate of $O(K^{-1/2})$ to stationarity when the objectives are strongly convex and Lipschitz smooth.
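
The notion of Pareto optimality and of optimizing a preference function over the Pareto set can be made concrete on a finite candidate set. The sketch below is only a discrete illustration under the assumption that all objectives are minimized; the paper treats continuous decision vectors and local methods, which this does not capture. The function names, sample points, and preference function are hypothetical.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better in one (minimization)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points):
    """Points of a finite set that are not dominated by any other point of the set."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def best_on_pareto(points, preference):
    """Minimize a preference function restricted to the Pareto front."""
    return min(pareto_front(points), key=preference)


# Hypothetical objective vectors (f1, f2) of five candidate decisions.
candidates = [(1, 5), (2, 2), (3, 3), (5, 1), (4, 4)]


def preference(p):
    """Hypothetical preference: distance to the target objective values (3, 3)."""
    return abs(p[0] - 3) + abs(p[1] - 3)


print(pareto_front(candidates))                # [(1, 5), (2, 2), (5, 1)]
print(min(candidates, key=preference))         # (3, 3): unconstrained minimizer, but dominated
print(best_on_pareto(candidates, preference))  # (2, 2): minimizer over the Pareto set
```

The last two lines show why the constraint matters: the unconstrained minimizer of the preference function is dominated by (2, 2), so it is excluded once the search is restricted to the Pareto set.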

2.Completely Abstract Dynamic Programming

2308.02148

Authors:Thomas J. Sargent, John Stachurski

Abstract: We introduce a completely abstract dynamic programming framework in which dynamic programs are sets of policy operators acting on a partially ordered space. We provide an optimality theory based on high-level assumptions. We then study symmetric and asymmetric relationships between dynamic programs, and show how these relationships transmit optimality properties. Our formulation includes and extends applications of dynamic programming across many fields.

3.Optimal Control of Stationary Doubly Diffusive Flows on Two and Three Dimensional Bounded Lipschitz Domains: A Theoretical Study

2308.02178

Authors:Jai Tushar, Arbaz Khan, Manil T. Mohan

Abstract: In this work, a theoretical framework is developed to study the control constrained distributed optimal control of a stationary double diffusion model presented in [Burger, Mendez, Ruiz-Baier, SINUM (2019), 57:1318-1343]. For the control problem, as the source term belongs to a weaker space, a new solvability analysis of the governing equation is presented using Faedo-Galerkin approximation techniques. Some new minimal regularity results for the governing equation are established on two and three-dimensional bounded Lipschitz domains and are of independent interest. Moreover, we show the existence of an optimal control with quadratic type cost functional, study the Frechet differentiability properties of the control-to-state map and establish the first-order necessary optimality conditions corresponding to the optimal control problem.

4.Adaptive Proximal Gradient Method for Convex Optimization

2308.02261

Authors:Yura Malitsky, Konstantin Mishchenko

Abstract: In this paper, we explore two fundamental first-order algorithms in convex optimization, namely, gradient descent (GD) and proximal gradient method (ProxGD). Our focus is on making these algorithms entirely adaptive by leveraging local curvature information of smooth functions. We propose adaptive versions of GD and ProxGD that are based on observed gradient differences and, thus, have no added computational costs. Moreover, we prove convergence of our methods assuming only local Lipschitzness of the gradient. In addition, the proposed versions allow for even larger stepsizes than those initially suggested in [MM20].
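
The abstract does not state the stepsize rule, so the sketch below should be read as an illustration of the general idea only: gradient descent whose stepsize is set from observed gradient differences rather than from a known Lipschitz constant. The specific rule used is the one commonly associated with the earlier adaptive gradient descent method that the abstract cites as [MM20], written from memory and therefore an assumption; it is not the larger-stepsize rule proposed in this paper. The function name, the initial stepsize `lam0`, and the test problem are hypothetical.

```python
import math


def adaptive_gd(grad, x0, steps, lam0=1e-6):
    """Gradient descent with a stepsize estimated from observed gradient differences.

    Assumed rule: lam_k = min( sqrt(1 + theta_{k-1}) * lam_{k-1},
                               ||x_k - x_{k-1}|| / (2 * ||g_k - g_{k-1}||) ),
    with theta_k = lam_k / lam_{k-1}.
    """
    x_prev = list(x0)
    g_prev = grad(x_prev)
    lam_prev = lam0
    theta = float("inf")  # no growth cap on the very first adaptive step
    x = [xi - lam_prev * gi for xi, gi in zip(x_prev, g_prev)]
    for _ in range(steps):
        g = grad(x)
        dx = math.dist(x, x_prev)
        dg = math.dist(g, g_prev)
        growth = math.sqrt(1.0 + theta) * lam_prev    # limits how fast the stepsize may grow
        local = dx / (2.0 * dg) if dg > 0 else float("inf")  # inverse local curvature estimate
        lam = min(growth, local)
        if math.isinf(lam):
            break  # gradient did not change between iterates; nothing left to estimate
        x_prev, g_prev = x, g
        x = [xi - lam * gi for xi, gi in zip(x, g)]
        theta = lam / lam_prev
        lam_prev = lam
    return x


def quad_grad(x):
    """Gradient of the hypothetical test function f(x) = 0.5 * (x[0]**2 + 10 * x[1]**2)."""
    return [x[0], 10.0 * x[1]]


print(adaptive_gd(quad_grad, x0=[1.0, 1.0], steps=200))
```

The only extra work per iteration is two norm computations on quantities the method already has, which is the sense in which such schemes add no extra gradient or function evaluations.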

5.Blessing of High-Order Dimensionality: from Non-Convex to Convex Optimization for Sensor Network Localization

2308.02278

Authors:Mingyu Lei, Jiayu Zhang, Yinyu Ye

Abstract: This paper investigates the Sensor Network Localization (SNL) problem, which seeks to determine sensor locations based on known anchor locations and partially given anchors-sensors and sensors-sensors distances. Two primary methods for solving the SNL problem are analyzed: the low-dimensional method that directly minimizes a loss function, and the high-dimensional semi-definite relaxation (SDR) method that reformulates the SNL problem as an SDP (semi-definite programming) problem. The paper primarily focuses on the intrinsic non-convexity of the loss function of the low-dimensional method, which is shown in our main theorem. The SDR method, via second-order dimension augmentation, is discussed in the context of its ability to transform non-convex problems into convex ones; while the first-order direct dimension augmentation fails. Additionally, we will show that more edges don't necessarily contribute to the better convexity of the loss function. Moreover, we provide an explanation for the success of the SDR+GD (gradient descent) method which uses the SDR solution as a warm-start of the minimization of the loss function by gradient descent. The paper also explores the parallels among SNL, max-cut, and neural networks in terms of the blessing of high-order dimension augmentation.

6.Approximation of deterministic mean field type control systems

2308.02301

Authors:Yurii Averboukh

Abstract: The paper is concerned with the approximation of the deterministic mean field type control system by a mean field Markov chain. It turns out that the dynamics of the distribution in the approximating system is described by a system of ordinary differential equations. Given a strategy for the Markov chain, we explicitly construct a control in the deterministic mean field type control system. Our method is a realization of the model predictive approach. The converse construction is also presented. These results lead to an estimate of the Hausdorff distance between the bundles of motions in the deterministic mean field type control system and the mean field Markov chain. In particular, we pay attention to the case when one can approximate the bundle of motions in the mean field type system by solutions of a finite system of ODEs.

Thu, 03 Aug 2023digest

1.Optimal Distributed Control for a Cahn-Hilliard-Darcy System with Mass Sources, Unmatched Viscosities and Singular Potential

2308.01569

Authors:Marco Abatangelo, Cecilia Cavaterra, Maurizio Grasselli, Hao Wu

Abstract: We study a Cahn-Hilliard-Darcy system in two dimensions with mass sources, unmatched viscosities and singular potential. This system is equipped with no-flux boundary conditions for the (volume) averaged velocity $\mathbf{u}$, the difference of the volume fractions $\varphi$, and the chemical potential $\mu$, along with an initial condition for $\varphi$. The resulting initial boundary value problem can be considered as a basic, though simplified, model for the evolution of solid tumor growth. The source term in the Cahn-Hilliard equation contains a control $R$ that can be thought of, for instance, as a drug or a nutrient. Our goal is to study an optimal control problem with a tracking type cost functional given by the sum of three $L^2$ norms involving $\varphi(T)$ ($T>0$ is the final time), $\varphi$ and $R$. We first prove the existence and uniqueness of a global strong solution with $\varphi$ being strictly separated from the pure phases $\pm 1$. Thanks to this result, we are able to analyze the control-to-state mapping $\mathcal{S}: R \mapsto \varphi$, obtaining the existence of an optimal control, the Fréchet differentiability of $\mathcal{S}$ and first-order necessary optimality conditions expressed through a suitable variational inequality for the adjoint variables. Finally, we show the differentiability of the control-to-costate operator and establish a second-order sufficient condition for the strict local optimality.

2.Efficiency of First-Order Methods for Low-Rank Tensor Recovery with the Tensor Nuclear Norm Under Strict Complementarity

2308.01677

Authors:Dan Garber, Atara Kaplan

Abstract: We consider convex relaxations for recovering low-rank tensors based on constrained minimization over a ball induced by the tensor nuclear norm, recently introduced in \cite{tensor_tSVD}. We build on a recent line of results that considered convex relaxations for the recovery of low-rank matrices and established that under a strict complementarity condition (SC), both the convergence rate and per-iteration runtime of standard gradient methods may improve dramatically. We develop the appropriate strict complementarity condition for the tensor nuclear norm ball and obtain the following main results under this condition: 1. When the objective to minimize is of the form $f(\mathbf{X})=g(\mathcal{A}\mathbf{X})+\langle \mathbf{C},\mathbf{X}\rangle$, where $g$ is strongly convex and $\mathcal{A}$ is a linear map (e.g., least squares), a quadratic growth bound holds, which implies linear convergence rates for standard projected gradient methods, despite the fact that $f$ need not be strongly convex. 2. For a smooth objective function, when initialized in certain proximity of an optimal solution which satisfies SC, standard projected gradient methods only require SVD computations (for projecting onto the tensor nuclear norm ball) of rank that matches the tubal rank of the optimal solution. In particular, when the tubal rank is constant, this implies nearly linear (in the size of the tensor) runtime per iteration, as opposed to super linear without further assumptions. 3. For a nonsmooth objective function which admits a popular smooth saddle-point formulation, we derive similar results to the latter for the well known extragradient method. An additional contribution, which may be of independent interest, is the rigorous extension of many basic results regarding tensors of arbitrary order, which were previously obtained only for third-order tensors.

3.Topology Optimization for Uniform Flow Distribution in Electrolysis Cells

2308.01826

Authors:Leon Baeck, Sebastian Blauth, Christian Leithäuser, René Pinnau, Kevin Sturm

Abstract: In this paper we consider the topology optimization for a bipolar plate of a hydrogen electrolysis cell. We present a model for the bipolar plate using the Stokes equation with an additional drag term, which models the influence of fluid and solid regions. Furthermore, we derive a criterion for a uniform flow distribution in the bipolar plate. To obtain shapes that are well-manufacturable, we introduce a novel smoothing technique for the fluid velocity. Finally, we present some numerical results and investigate the influence of the smoothing on the obtained shapes.

4.Subspace-Constrained Continuous Methane Leak Monitoring and Optimal Sensor Placement

2308.01836

Authors:Kashif Rashid, Lukasz Zielinski, Junyi Yuan, Andrew Speck

Abstract: This work presents a procedure that can quickly identify and isolate methane emission sources leading to expedient remediation. Minimizing the time required to identify a leak and the subsequent time to dispatch repair crews can significantly reduce the amount of methane released into the atmosphere. The procedure developed utilizes permanently installed low-cost methane sensors at an oilfield facility to continuously monitor leaked gas concentration above background levels. The methods developed for optimal sensor placement and leak inversion in consideration of predefined subspaces and restricted zones are presented. In particular, subspaces represent regions comprising one or more equipment items that may leak, and restricted zones define regions in which a sensor may not be placed due to site restrictions by design. Thus, subspaces constrain the inversion problem to specified locales, while restricted zones constrain sensor placement to feasible zones. The development of synthetic wind models, and those based on historical data, are also presented as a means to accommodate optimal sensor placement under wind uncertainty. The wind models serve as realizations for planning purposes, with the aim of maximizing the mean coverage measure for a given number of sensors. Once the optimal design is established, continuous real-time monitoring permits localization and quantification of a methane leak source. The necessary methods, mathematical formulation and demonstrative test results are presented.

5.Energy System Optimisation using (Mixed Integer) Linear Programming

2308.01882

Authors:Sebastian Miehling, Andreas Hanel, Jerry Lambert, Sebastian Fendt, Hartmut Spliethoff

Abstract: Although energy system optimisation based on linear optimisation is often used for influential energy outlooks and studies for political decision-makers, the underlying background still needs to be described in the scientific literature in a concise and general form. This study presents the main equations and advanced ideas and explains further possibilities mixed integer linear programming offers in energy system optimisation. Furthermore, the equations are shown using an example system to present a more practical point of view. Therefore, this study is aimed at researchers trying to understand the background of studies using energy system optimisation and researchers building their implementation into a new framework. This study describes how to build a standard model, how to implement advanced equations using linear programming, and how to implement advanced equations using mixed integer linear programming, and it presents a small exemplary system.
- Presentation of the OpTUMus energy system optimisation framework
- Set of equations for a fully functional energy system model
- Example of a simple energy system model

Wed, 02 Aug 2023digest

1.Accelerated Benders Decomposition for Variable-Height Transport Packaging Optimisation

2308.01104

Authors:Alain Lehmann, Wilhelm Kleiminger, Hakim Invernizzi, Aurel Gautschi

Abstract: This paper tackles the problem of finding optimal variable-height transport packaging. The goal is to reduce the empty space left in a box when shipping goods to customers, thereby saving on filler and reducing waste. We cast this problem as a large-scale mixed integer problem (with over seven billion variables) and demonstrate various acceleration techniques to solve it efficiently in about three hours on a laptop. We present a KD-Tree algorithm to avoid exhaustive grid evaluation of the 3D-bin-packing, provide analytical transformations to accelerate the Benders decomposition, and give an efficient implementation of the Benders subproblem that yields significant memory savings and a runtime speedup of three orders of magnitude.

2.Multiobjective Optimization of Non-Smooth PDE-Constrained Problems

2308.01113

Authors:Marco Bernreuther, Michael Dellnitz, Bennet Gebken, Georg Müller, Sebastian Peitz, Konstantin Sonntag, Stefan Volkwein

Abstract: Multiobjective optimization plays an increasingly important role in modern applications, where several criteria are often of equal importance. The task in multiobjective optimization and multiobjective optimal control is therefore to compute the set of optimal compromises (the Pareto set) between the conflicting objectives. The advances in algorithms and the increasing interest in Pareto-optimal solutions have led to a wide range of new applications related to optimal and feedback control - potentially with non-smoothness both on the level of the objectives or in the system dynamics. This results in new challenges such as dealing with expensive models (e.g., governed by partial differential equations (PDEs)) and developing dedicated algorithms handling the non-smoothness. Since in contrast to single-objective optimization, the Pareto set generally consists of an infinite number of solutions, the computational effort can quickly become challenging, which is particularly problematic when the objectives are costly to evaluate or when a solution has to be presented very quickly. This article gives an overview of recent developments in the field of multiobjective optimization of non-smooth PDE-constrained problems. In particular we report on the advances achieved within Project 2 "Multiobjective Optimization of Non-Smooth PDE-Constrained Problems - Switches, State Constraints and Model Order Reduction" of the DFG Priority Programm 1962 "Non-smooth and Complementarity-based Distributed Parameter Systems: Simulation and Hierarchical Optimization".

3.Optimal Mixed Strategies to the Zero-sum Linear Differential Game

2308.01144

Authors:Tao Xu, Wang Xi, Jianping He

Abstract: This paper exploits the weak approximation method to study a zero-sum linear differential game under mixed strategies. The stochastic nature of mixed strategies poses challenges in evaluating the game value and deriving the optimal strategies. To overcome these challenges, we first define the mixed strategy based on time discretization given the control period $\delta$. Then, we design a stochastic differential equation (SDE) to approximate the discretized game dynamic with a small approximation error of scale $\mathcal{O}(\delta^2)$ in the weak sense. Moreover, we prove that the game payoff is also approximated in the same order of accuracy. Next, we solve the optimal mixed strategies and game values for the linear quadratic differential games. The effect of the control period is explicitly analyzed when the payoff is a terminal cost. Our results provide the first implementable form of the optimal mixed strategies for a zero-sum linear differential game. Finally, we provide numerical examples to illustrate and elaborate on our results.

4.Stochastic smoothing accelerated gradient method for nonsmooth convex composite optimization

2308.01252

Authors:Ruyu Wang, Chao Zhang

Abstract: We propose a novel stochastic smoothing accelerated gradient (SSAG) method for general constrained nonsmooth convex composite optimization, and analyze the convergence rates. The SSAG method allows various smoothing techniques, and can deal with a nonsmooth term whose proximal operator is not easy to compute, or that does not have the linear max structure. To the best of our knowledge, it is the first stochastic approximation type method with a solid convergence result for solving the convex composite optimization problem whose nonsmooth term is the maximization of numerous nonlinear convex functions. We prove that the SSAG method achieves the best-known complexity bounds in terms of the stochastic first-order oracle ($\mathcal{SFO}$), using either diminishing smoothing parameters or a fixed smoothing parameter. We give two applications of our results to distributionally robust optimization problems. Numerical results on the two applications demonstrate the effectiveness and efficiency of the proposed SSAG method.

5.Revitalizing Public Transit in Low Ridership Areas: An Exploration of On-Demand Multimodal Transit Systems

2308.01298

Authors:Jiawei Lu, Connor Riley, Krishna Murthy Gurumurthy, Pascal Van Hentenryck

Abstract: Public transit plays an essential role in mitigating traffic congestion, reducing emissions, and enhancing travel accessibility and equity. One of the critical challenges in designing public transit systems is distributing finite service supplies temporally and spatially to accommodate time-varying and space-heterogeneous travel demands. Particularly, for regions with low or scattered ridership, there is a dilemma in designing traditional transit lines and corresponding service frequencies. Dense transit lines and high service frequency increase operation costs, while sparse transit lines and low service frequency result in poor accessibility and long passenger waiting time. In the coming era of Mobility-as-a-Service, the aforementioned challenge is expected to be addressed by on-demand services. In this study, we design an On-Demand Multimodal Transit System (ODMTS) for regions with low or scattered travel demands, in which some low-ridership bus lines are replaced with flexible on-demand ride-sharing shuttles. In the proposed ODMTS, riders within service regions can request shuttles to finish their trips or to connect to fixed-route services such as bus, metro, and light rail. Leveraging the integrated transportation system modeling platform, POLARIS, a simulation-based case study is conducted to assess the effectiveness of this system in Austin, Texas.

Tue, 01 Aug 2023digest

1.Practical asymptotic stability of data-driven model predictive control using extended DMD

2308.00296

Authors:Lea Bold, Lars Grüne, Manuel Schaller, Karl Worthmann

Abstract: The extended Dynamic Mode Decomposition (eDMD) is a very popular method to obtain data-driven surrogate models for nonlinear (control) systems governed by ordinary and stochastic differential equations. Its theoretical foundation is the Koopman framework, in which one propagates observable functions of the state to obtain a linear representation in an infinite-dimensional space. In this work, we prove practical asymptotic stability of a (controlled) equilibrium for eDMD-based model predictive control, in which the optimization step is conducted using the data-based surrogate model. To this end, we derive error bounds that converge to zero if the state approaches the desired equilibrium. Further, we show that, if the underlying system is cost controllable, then this stabilizability property is preserved. We conduct numerical simulations, which illustrate the proven practical asymptotic stability.

2.Threshold-aware Learning to Generate Feasible Solutions for Mixed Integer Programs

2308.00327

Authors:Taehyun Yoon, Jinwon Choi, Hyokun Yun, Sungbin Lim

Abstract: Finding a high-quality feasible solution to a combinatorial optimization (CO) problem in a limited time is challenging due to its discrete nature. Recently, there has been an increasing number of machine learning (ML) methods for addressing CO problems. Neural diving (ND) is one of the learning-based approaches to generating partial discrete variable assignments in Mixed Integer Programs (MIP), a framework for modeling CO problems. However, a major drawback of ND is a large discrepancy between the ML and MIP objectives, i.e., variable value classification accuracy over primal bound. Our study shows that a specific range of variable assignment rates (coverage) yields high-quality feasible solutions, and we suggest that optimizing the coverage bridges the gap between the learning and MIP objectives. Consequently, we introduce a post-hoc method and a learning-based approach for optimizing the coverage. A key idea of our approach is to jointly learn to restrict the coverage search space and to predict the coverage in the learned search space. Experimental results demonstrate that learning a deep neural network to estimate the coverage for finding high-quality feasible solutions achieves state-of-the-art performance on the NeurIPS ML4CO datasets. In particular, our method shows outstanding performance on the workload apportionment dataset, achieving an optimality gap of 0.45%, a ten-fold improvement over SCIP within the one-minute time limit.

3.Linear-Quadratic Optimal Control Problem for Mean-Field Stochastic Differential Equations with a Type of Random Coefficients

2308.00335

Authors:Hongwei Mei, Qingmeng Wei, Jiongmin Yong

Abstract: Motivated by linear-quadratic optimal control problems (LQ problems, for short) for mean-field stochastic differential equations (SDEs, for short) with the coefficients containing regime switching governed by a Markov chain, we consider an LQ problem for an SDE with the coefficients being adapted to a filtration independent of the Brownian motion driving the control system. The classical approach of completing the square is applied to the current problem and its obvious shortcomings are indicated. Open-loop and closed-loop solvability are introduced and characterized.

4.An Efficient Algorithm for Computational Protein Design Problem

2308.00360

Authors:Yukai Zheng, Weikun Chen, Qingna Li

Abstract: A protein is a sequence of basic blocks called amino acids, and it plays an important role in animals and human beings. The computational protein design (CPD) problem is to identify a protein that could perform some given functions. The CPD problem can be formulated as a quadratic semi-assignment problem (QSAP) and is extremely challenging due to its combinatorial properties over different amino acid sequences. In this paper, we first show that the QSAP is equivalent to its continuous relaxation problem, the RQSAP, in the sense that the QSAP and RQSAP share the same optimal solution. Then we design an efficient quadratic penalty method to solve large-scale RQSAP. Numerical results on benchmark instances verify the superior performance of our approach over the state-of-the-art branch-and-cut solvers. In particular, our proposed algorithm outperforms the state-of-the-art solvers by three orders of magnitude in CPU time in most cases while returning a high-quality solution.

5.Maneuvering tracking algorithm for reentry vehicles with guaranteed prescribed performance

2308.00367

Authors:Zongyi Guo, Xiyu Gu, Yonglin Han, Jianguo Guo, Thomas Berger

Abstract: This paper presents a prescribed performance-based tracking control strategy for the atmospheric reentry flight of space vehicles subject to rapid maneuvers during flight mission. A time-triggered non-monotonic performance funnel is proposed with the aim of constraints violation avoidance in the case of sudden changes of the reference trajectory. Compared with traditional prescribed performance control methods, the novel funnel boundary is adaptive with respect to the reference path and is capable of achieving stability under disturbances. A recursive control structure is introduced which does not require any knowledge of specific system parameters. By a stability analysis we show that the tracking error evolves within the prescribed error margin under a condition which represents a trade-off between the reference signal and the performance funnel. The effectiveness of the proposed control scheme is verified by simulations.

6.Adaptive Methods for Variational Inequalities with Relatively Smooth and Relatively Strongly Monotone Operators

2308.00468

Authors:S. S. Ablaev, F. S. Stonyakin, M. S. Alkousa, D. A. Pasechnyuk

Abstract: The article is devoted to some adaptive methods for variational inequalities with relatively smooth and relatively strongly monotone operators. Starting from the recently proposed proximal variant of the extragradient method for this class of problems, we investigate in detail the method with adaptively selected parameter values. An estimate of the convergence rate of this method is proved. The result is generalized to a class of variational inequalities with relatively strongly monotone generalized smooth variational inequality operators. Numerical experiments have been performed for the problem of ridge regression and variational inequality associated with box-simplex games.

7.Robust Railway Network Design based on Strategic Timetables

2308.00483

Authors:Tim Sander, Nadine Friesen, Karl Nachtigall, Nils Nießen

Abstract: Using strategic timetables as input for railway network design has become increasingly popular among western European railway infrastructure operators. Although both railway timetabling and railway network design on their own are well covered by academic research, there is still a gap in the literature concerning timetable-based network design. Therefore, we propose a mixed-integer linear program to design railway infrastructure so that the demand derived from a strategic timetable can be satisfied with minimal infrastructure costs. The demand is given by a list of trains, each featuring start and destination nodes as well as time bounds and a set of frequency and transfer constraints that capture the strategic timetable's main characteristics. During the optimization, the solver decides which railway lines need to be built or expanded and whether travel or headway times must be shortened to meet the demand. Since strategic timetables are subject to uncertainty, we expand the optimization model to a robust version. Uncertain timetables are modelled as discrete scenarios, while uncertain freight train demand is modelled using optional trains, which can be inserted into the resulting timetable if they do not require additional infrastructure. We present computational results for both the deterministic and the robust case and give an outlook on further research.

8.On damping a control system with global aftereffect on quantum graphs

2308.00496

Authors:Sergey Buterin

Abstract: This paper naturally connects the theory of quantum graphs, the control theory and the theory of functional-differential equations. Specifically, we study the problem of damping a control system described by first-order equations on an arbitrary tree graph with global delay. The latter means that the constant delay imposed starting from the initial moment of time propagates through all internal vertices of the graph. By minimizing the energy functional, we arrive at the corresponding variational problem and then prove its equivalence to a self-adjoint boundary value problem on the tree for second-order equations involving both the global delay and the global advance. It is remarkable that the resulting problem acquires Kirchhoff's conditions at the internal vertices of the graph, which often appear in the theory of quantum graphs as well as various applications. The unique solvability of this boundary value problem is proved.

9.On the properties of the linear conjugate gradient method

2308.00598

Authors:Zexian Liu, Qiao Li

Abstract: The linear conjugate gradient method is an efficient iterative method for the convex quadratic minimization problems $ \mathop {\min }\limits_{x \in { \mathbb R^n}} f(x) =\dfrac{1}{2}x^TAx+b^Tx $, where $ A \in \mathbb R^{n \times n} $ is symmetric and positive definite and $ b \in \mathbb R^n $. It is generally agreed that the gradients $ g_k $ are not conjugate with respect to $ A $ in the linear conjugate gradient method (see page 111 in Numerical optimization (2nd, Springer, 2006) by Nocedal and Wright). In the paper we prove the conjugacy of the gradients $ g_k $ generated by the linear conjugate gradient method, namely, $$g_k^TAg_i=0, \; i=0,1,\cdots, k-2.$$ In addition, a new way is exploited to derive the linear conjugate gradient method based on the conjugacy of the search directions and the orthogonality of the gradients, rather than the conjugacy of the search directions and the exact stepsize.
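The conjugacy relation stated in the abstract above can be checked numerically on a small example. The following is a minimal sketch in plain Python, not the authors' code: the helper names (`matvec`, `dot`, `linear_cg`), the 4x4 test matrix, and the vector `b` are all hypothetical choices made for illustration. It runs the textbook linear conjugate gradient iteration on $f(x) = \frac{1}{2}x^TAx + b^Tx$ and records the gradients $g_k = Ax_k + b$.

```python
def matvec(A, x):
    """Dense matrix-vector product for a list-of-rows matrix."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def dot(u, v):
    """Euclidean inner product."""
    return sum(a * b for a, b in zip(u, v))


def linear_cg(A, b, x0, tol=1e-24):
    """Textbook linear CG for min 0.5*x^T A x + b^T x, with A symmetric
    positive definite. Returns the final iterate and all gradients
    g_k = A x_k + b. `tol` is a threshold on the squared gradient norm."""
    x = list(x0)
    g = [ax + bi for ax, bi in zip(matvec(A, x), b)]
    d = [-gi for gi in g]                      # first direction: steepest descent
    grads = [g]
    for _ in range(len(b)):                    # at most n steps in exact arithmetic
        if dot(g, g) <= tol:
            break
        Ad = matvec(A, d)
        alpha = dot(g, g) / dot(d, Ad)         # exact step size along d
        x = [xi + alpha * di for xi, di in zip(x, d)]
        g_new = [gi + alpha * adi for gi, adi in zip(g, Ad)]
        beta = dot(g_new, g_new) / dot(g, g)   # Fletcher-Reeves form of beta
        d = [-gn + beta * di for gn, di in zip(g_new, d)]
        g = g_new
        grads.append(g)
    return x, grads


# Hypothetical test problem: a strictly diagonally dominant symmetric matrix.
A = [[4.0, 1.0, 0.0, 0.0],
     [1.0, 3.0, 1.0, 0.0],
     [0.0, 1.0, 3.0, 1.0],
     [0.0, 0.0, 1.0, 5.0]]
b = [1.0, 2.0, 3.0, 4.0]
x_min, grads = linear_cg(A, b, [0.0] * 4)

# Largest |g_k^T A g_i| over all pairs with i <= k-2.
worst = max(abs(dot(grads[k], matvec(A, grads[i])))
            for k in range(len(grads)) for i in range(k - 1))
print("largest |g_k^T A g_i| for i <= k-2:", worst)
```

The check deliberately excludes the pairs with $i = k-1$, for which the product $g_k^TAg_{k-1}$ is nonzero in general; this matches the index range $i = 0, 1, \cdots, k-2$ given in the abstract.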

10.Hierarchical Space Exploration Campaign Schedule Optimization With Ambiguous Programmatic Requirements

2308.00632

Authors:Nick Gollins, Koki Ho

Abstract: Space exploration plans are becoming increasingly complex as public agencies and private companies target deep-space locations, such as cislunar space and beyond, which require long-duration missions and many supporting systems and payloads. Optimizing multi-mission exploration campaigns is challenging due to the large number of required launches as well as their sequencing and compatibility requirements, making the conventional space logistics formulations not scalable. To tackle this challenge, this paper proposes an alternative approach that leverages a two-level hierarchical optimization algorithm: an evolutionary algorithm is used to explore the campaign scheduling solution space, and each of the solutions is then evaluated using a time-expanded multi-commodity flow mixed-integer linear program. A number of case studies, focusing on the Artemis lunar exploration program, demonstrate how the method can be used to analyze potential campaign architectures. The method enables a potential mission planner to study the sensitivity of a campaign to program-level parameters such as logistics vehicle availability and performance, payload launch windows, and in-situ resource utilization infrastructure efficiency.

11.Krylov Solvers for Interior Point Methods with Applications in Radiation Therapy

2308.00637

Authors:Felix Liu, Albin Fredriksson, Stefano Markidis

Abstract: Interior point methods are widely used for different types of mathematical optimization problems. Many implementations of interior point methods in use today rely on direct linear solvers to solve systems of equations in each iteration. The need to solve ever larger optimization problems more efficiently and the rise of hardware accelerators for general purpose computing have led to a large interest in using iterative linear solvers instead, with the major issue being inevitable ill-conditioning of the linear systems arising as the optimization progresses. We investigate the use of Krylov solvers for interior point methods in solving optimization problems from radiation therapy. We implement a prototype interior point method using a so-called doubly augmented formulation of the Karush-Kuhn-Tucker (KKT) linear system of equations, originally proposed by Forsgren and Gill, and evaluate its performance on real optimization problems from radiation therapy. Crucially, our implementation uses a preconditioned conjugate gradient method with Jacobi preconditioning internally. Our measurements of the conditioning of the linear systems indicate that the Jacobi preconditioner improves the conditioning of the systems to a degree that they can be solved iteratively, but there is room for further improvement in that regard. Furthermore, profiling of our prototype code shows that it is suitable for GPU acceleration, which may further improve its performance in practice. Overall, our results indicate that our method can find solutions of acceptable accuracy in reasonable time, even with a simple Jacobi preconditioner.

12.Increasing Supply Chain Resiliency Through Equilibrium Pricing and Stipulating Transportation Quota Regulation

2308.00681

Authors:Mostafa Pazoki, Hamed Samarghandi, Mehdi Behroozi

Abstract: Supply chain disruption can occur for a variety of reasons, including natural disasters or market dynamics. If the disruption is profound and with dire consequences for the economy, the regulators may decide to intervene to minimize the impact for the betterment of the society. This paper investigates the minimum quota regulation on transportation amounts, stipulated by the government in a market where transportation capacity is below total production and profitability levels differ significantly among different products. In North America, an interesting example arises in the rail transportation market, where the rail capacity is used for a variety of products and commodities such as oil and grains. This research assumes that there is a shipping company with limited capacity which will ship a group of products with heterogeneous transportation and production costs and prices. Mathematical problems for the market players as well as the government are presented, and solutions are proposed and implemented in a framed Canadian case study. Subsequently, the conditions that justify government intervention are identified, and an algorithm to obtain the optimum minimum quota is presented.

Mon, 31 Jul 2023digest

1.Multiobjective optimization approach to shape and topology optimization of plane trusses with various aspect ratios

2307.16473

Authors:Makoto Ohsaki, Saku Aoyagi, Kazuki Hayashi

Abstract: A multiobjective optimization method is proposed for obtaining the optimal plane trusses simultaneously for various aspect ratios of the initial ground structure as a set of Pareto optimal solutions generated through a single optimization process. The shape and topology are optimized simultaneously to minimize the compliance under a constraint on the total structural volume. The strain energy of each member is divided into components of two coordinate directions on the plane. The force density method is used for alleviating difficulties due to the existence of coalescent or melting nodes. It is shown in the numerical example that sufficiently accurate optimal solutions are obtained by comparison with those obtained by the linear weighted sum approach that requires solving a single-objective optimization problem many times.

2.Thermo-mechanical level-set topology optimization of a load carrying battery pack for electric aircraft

2307.16521

Authors:Alexandre T. R. Guibert, Murtaza Bookwala, Ashley Cronk, Y. Shirley Meng, H. Alicia Kim

Abstract: A persistent challenge with the development of electric vertical take-off and landing vehicles (eVTOL) to meet flight power and energy demands is the mass of the load and thermal management systems for batteries. One possible strategy to overcome this problem is to employ optimization techniques to obtain a lightweight battery pack while satisfying structural and thermal requirements. In this work, a structural battery pack with high-energy-density cylindrical cells is optimized using the level-set topology optimization method. The heat generated by the batteries is predicted using a high-fidelity electrochemical model for a given eVTOL flight profile. The worst-case scenario for the battery's heat generation is then considered as a source term in the weakly coupled steady-state thermomechanical finite element model used for optimization. The objective of the optimization problem is to minimize the weighted sum of thermal compliance and structural compliance subject to a volume constraint. The methodology is demonstrated with numerical examples for different sets of weights. The optimized results due to different weights are compared, discussed, and evaluated with thermal and structural performance indicators. The optimized pack topologies are subjected to a transient thermal finite element analysis to assess the battery pack's thermal response.

3.Cooperative Multi-Agent Constrained POMDPs: Strong Duality and Primal-Dual Reinforcement Learning with Approximate Information States

2307.16536

Authors:Nouman Khan, Vijay Subramanian

Abstract: We study the problem of decentralized constrained POMDPs in a team-setting where the multiple non-strategic agents have asymmetric information. Strong duality is established for the setting of infinite-horizon expected total discounted costs when the observations lie in a countable space, the actions are chosen from a finite space, and the immediate cost functions are bounded. Following this, connections with the common-information and approximate information-state approaches are established. The approximate information-states are characterized independent of the Lagrange-multipliers vector so that adaptations of the multiplier (during learning) will not necessitate new representations. Finally, a primal-dual multi-agent reinforcement learning (MARL) framework based on centralized training distributed execution (CTDE) and three time-scale stochastic approximation is developed with the aid of recurrent and feedforward neural-networks as function-approximators.

4.Line Search for Convex Minimization

2307.16560

Authors:Laurent Orseau, Marcus Hutter

Abstract: Golden-section search and bisection search are the two main principled algorithms for 1d minimization of quasiconvex (unimodal) functions. The first one only uses function queries, while the second one also uses gradient queries. Other algorithms exist under much stronger assumptions, such as Newton's method. However, to the best of our knowledge, there is no principled exact line search algorithm for general convex functions -- including piecewise-linear and max-compositions of convex functions -- that takes advantage of convexity. We propose two such algorithms: $\Delta$-Bisection is a variant of bisection search that uses (sub)gradient information and convexity to speed up convergence, while $\Delta$-Secant is a variant of golden-section search and uses only function queries. While bisection search reduces the $x$ interval by a factor 2 at every iteration, $\Delta$-Bisection reduces the (sometimes much) smaller $x^*$-gap $\Delta^x$ (the $x$ coordinates of $\Delta$) by at least a factor 2 at every iteration. Similarly, $\Delta$-Secant also reduces the $x^*$-gap by at least a factor 2 every second function query. Moreover, the $y^*$-gap $\Delta^y$ (the $y$ coordinates of $\Delta$) also provides a refined stopping criterion, which can also be used with other algorithms. Experiments on a few convex functions confirm that our algorithms are always faster than their quasiconvex counterparts, often by more than a factor 2. We further design a quasi-exact line search algorithm based on $\Delta$-Secant. It can be used with gradient descent as a replacement for backtracking line search, for which some parameters can be finicky to tune -- and we provide examples to this effect, on strongly-convex and smooth functions. We provide convergence guarantees, and confirm the efficiency of quasi-exact line search on a few single- and multivariate convex functions.
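For reference, the two classical baselines named in the abstract above can be sketched in a few lines of Python. This is only an illustration of standard golden-section search and derivative-sign bisection; it is not the $\Delta$-Bisection or $\Delta$-Secant algorithm from the paper. The function names, tolerances, and the piecewise-linear test function are assumptions chosen for the example.

```python
import math


def golden_section(f, lo, hi, tol=1e-6):
    """Golden-section search on [lo, hi]; uses function values only."""
    inv_phi = (math.sqrt(5) - 1) / 2          # 1/phi, about 0.618
    c = hi - inv_phi * (hi - lo)              # left interior point
    d = lo + inv_phi * (hi - lo)              # right interior point
    fc, fd = f(c), f(d)
    while hi - lo > tol:
        if fc < fd:                           # a minimizer lies in [lo, d]
            hi, d, fd = d, c, fc              # old c becomes the new right point
            c = hi - inv_phi * (hi - lo)
            fc = f(c)
        else:                                 # a minimizer lies in [c, hi]
            lo, c, fc = c, d, fd              # old d becomes the new left point
            d = lo + inv_phi * (hi - lo)
            fd = f(d)
    return (lo + hi) / 2


def gradient_bisection(df, lo, hi, tol=1e-6):
    """Bisection on the sign of a derivative (or subgradient) df."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if df(mid) > 0:                       # function increasing at mid
            hi = mid
        else:                                 # function non-increasing at mid
            lo = mid
    return (lo + hi) / 2


# Hypothetical piecewise-linear convex test function, minimized at x = 1.
def f(x):
    return max(-2.0 * x + 1.0, 0.5 * x - 1.5)


def df(x):
    """One valid subgradient of f at x."""
    return -2.0 if x < 1.0 else 0.5


x_gs = golden_section(f, -5.0, 5.0)
x_bis = gradient_bisection(df, -5.0, 5.0)
print(x_gs, x_bis)
```

Golden-section search reuses one interior function value per iteration, so each step costs a single new function query, while bisection costs one (sub)gradient query per halving of the interval.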

5.Differentially Private and Communication-Efficient Distributed Nonconvex Optimization Algorithms

2307.16656

Authors:Antai Xie, Xinlei Yi, Xiaofan Wang, Ming Cao, Xiaoqiang Ren

Abstract: This paper studies the privacy-preserving distributed optimization problem under limited communication, where each agent aims to keep its cost function private while minimizing the sum of all agents' cost functions. To this end, we propose two differentially private distributed algorithms under compressed communication. We show that the proposed algorithms achieve sublinear convergence for smooth (possibly nonconvex) cost functions and linear convergence when the global cost function additionally satisfies the Polyak--Lojasiewicz condition, even for a general class of compressors with bounded relative compression error. Furthermore, we rigorously prove that the proposed algorithms ensure $\epsilon$-differential privacy. Note that the definition of $\epsilon$-differential privacy is stricter than the definition of ($\epsilon$, $\delta$)-differential privacy used in the literature. Simulations are presented to demonstrate the effectiveness of our proposed approach.

6.Learning-based Improvement in State Estimation for Unobservable Systems

2307.16822

Authors:J. G. De la Varga, S. Pineda, J. M. Morales, Á. Porras

Abstract: The task of state estimation faces a major challenge due to the inherent lack of real-time observability, as certain measurements can only be acquired with a delay. As a result, power systems are essentially unobservable in real time, indicating the existence of multiple states that result in identical values for the available measurements. Certain existing approaches utilize historical data to infer the relationship between real-time available measurements and the state. Other learning-based methods aim at generating the pseudo-measurements required to make the system observable. Our paper presents a methodology that utilizes the outcome of an unobservable state estimator to exploit information on the joint probability distribution between real-time available measurements and delayed ones. Through numerical simulations conducted on a realistic electricity network with insufficient real-time measurements, the proposed procedure showcases superior performance compared to existing state forecasting approaches and those relying on inferred pseudo-measurements.

7.Accelerating Optimal Power Flow with GPUs: SIMD Abstraction of Nonlinear Programs and Condensed-Space Interior-Point Methods

2307.16830

Authors:Sungho Shin, François Pacaud, Mihai Anitescu

Abstract: This paper introduces a novel computational framework for solving alternating current optimal power flow (AC OPF) problems using graphics processing units (GPUs). While GPUs have demonstrated remarkable performance in various computing domains, their application in AC OPF has been limited due to challenges associated with porting sparse automatic differentiation (AD) and sparse linear solver routines to GPUs. We aim to address these issues with two key strategies. First, we utilize a single-instruction, multiple-data (SIMD) abstraction of nonlinear programs (NLP). This approach enables the specification of model equations while preserving their parallelizable structure, and in turn, facilitates the implementation of AD routines that can exploit such structure. Second, we employ a condensed-space interior-point method (IPM) with an inequality relaxation strategy. This technique involves relaxing equality constraints to inequalities and condensing the Karush-Kuhn-Tucker system into a much smaller positive definite system. This strategy offers the key advantage of being able to factorize the KKT matrix without numerical pivoting, which in the past has hampered the parallelization of the IPM algorithm. By combining these two strategies, we can perform the majority of operations on GPUs while keeping the data residing in the device memory only. Comprehensive numerical benchmark results showcase the substantial computational advantage of our approach. Remarkably, for solving large-scale AC OPF problems to a moderate accuracy, our implementations -- MadNLP.jl and ExaModels.jl -- running on NVIDIA GPUs achieve an order of magnitude speedup compared to state-of-the-art tools running on contemporary CPUs.

8.Multi-year Investment Modelling in Energy Systems

2307.16842

Authors:Diego A. Tejada-Arango

Abstract: This paper summarises the main multi-year investment modelling approaches in energy planning models. We go from a simple (basic) formulation to a more complex (general) one to explain the different levels of detail, and include examples to make the concepts more accessible.

Fri, 28 Jul 2023digest

1.Modeling Nonlinear Control Systems via Koopman Control Family: Universal Forms and Subspace Invariance Proximity

2307.15368

Authors:Masih Haseli, Jorge Cortés

Abstract: This paper introduces the Koopman Control Family (KCF), a mathematical framework for modeling general discrete-time nonlinear control systems with the aim of providing a solid theoretical foundation for the use of Koopman-based methods in systems with inputs. We demonstrate that the concept of KCF can completely capture the behavior of nonlinear control systems on a (potentially infinite-dimensional) function space. By employing a generalized notion of subspace invariance under the KCF, we establish a universal form for finite-dimensional models, which encompasses the commonly used linear, bilinear, and linear switched models as specific instances. In cases where the subspace is not invariant under the KCF, we propose a method for approximating models in general form and characterize the model's accuracy using the concept of invariance proximity. The proposed framework naturally lends itself to the incorporation of data-driven methods in modeling and control.

2.On new generalized differentials with respect to a set and their applications

2307.15389

Authors:Xiaolong Qin, Vo Duc Thinh, Jen-Chih Yao

Abstract: The notions and certain fundamental characteristics of the proximal and limiting normal cones with respect to a set are first presented in this paper. We present the ideas of the limiting coderivative and subdifferential with respect to a set of multifunctions and singleton mappings, respectively, based on these normal cones. The necessary and sufficient conditions for the Aubin property with respect to a set of multifunctions are then described by using the limiting coderivative with respect to a set. As a result of the limiting subdifferential with respect to a set, we offer the requisite optimality criteria for local solutions to optimization problems. In addition, we also provide examples to demonstrate the outcomes.

3.Minimal error momentum Bregman-Kaczmarz

2307.15435

Authors:Dirk A. Lorenz, Maximilian Winkler

Abstract: The Bregman-Kaczmarz method is an iterative method which can solve strongly convex problems with linear constraints and uses only one or a selected number of rows of the system matrix in each iteration, thereby making it amenable for large-scale systems. To speed up convergence, we investigate acceleration by heavy ball momentum in the so-called dual update. Heavy ball acceleration of the Kaczmarz method with constant parameters has turned out to be difficult to analyze; in particular, to the best of our knowledge, no accelerated convergence for the L2-error of the iterates has been proven. Here we propose a way to adaptively choose the momentum parameter by a minimal-error principle similar to a recently proposed method for the standard randomized Kaczmarz method. The momentum parameter can be chosen to exactly minimize the error in the next iterate or to minimize a relaxed version of the minimal error principle. The former choice leads to a theoretically optimal step while the latter is cheaper to compute. We prove improved convergence results compared to the non-accelerated method. Numerical experiments show that the proposed methods can accelerate convergence in practice, also for matrices which arise from applications such as computational tomography.
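To illustrate the row-action idea that the abstract above builds on, here is a minimal sketch of the standard randomized Kaczmarz method for a consistent linear system $Ax = b$, with rows sampled proportionally to their squared norms. It is not the Bregman-Kaczmarz method and contains no momentum term; the function name, iteration count, seed, and the small test system are assumptions made for illustration.

```python
import random


def randomized_kaczmarz(A, b, iters=2000, seed=0):
    """Standard randomized Kaczmarz for a consistent system A x = b.
    Each step projects the iterate onto the hyperplane of one row,
    sampled with probability proportional to its squared norm."""
    rng = random.Random(seed)                       # seeded for reproducibility
    m, n = len(A), len(A[0])
    row_norms = [sum(a * a for a in row) for row in A]
    x = [0.0] * n
    for _ in range(iters):
        i = rng.choices(range(m), weights=row_norms)[0]
        residual = b[i] - sum(a * xj for a, xj in zip(A[i], x))
        step = residual / row_norms[i]
        x = [xj + step * a for xj, a in zip(x, A[i])]  # only row i is touched
    return x


# Hypothetical consistent system whose exact solution is x = (1, -1).
A = [[1.0, 2.0],
     [3.0, 1.0],
     [1.0, -1.0]]
b = [-1.0, 2.0, 2.0]
x_hat = randomized_kaczmarz(A, b)
print(x_hat)
```

Each iteration reads a single row of `A`, which is the property that makes methods of this family attractive for large systems; the paper's contribution concerns an adaptive momentum term added on top of a Bregman generalization of this scheme.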

4.Symmetric separable convex resource allocation problems with structured disjoint interval bound constraints

2307.15459

Authors:Martijn H. H. Schoot Uiterkamp

Abstract: Motivated by the problem of scheduling electric vehicle (EV) charging with a minimum charging threshold in smart distribution grids, we introduce the resource allocation problem (RAP) with a symmetric separable convex objective function and disjoint interval bound constraints. In this RAP, the aim is to allocate an amount of resource over a set of $n$ activities, where each individual allocation is restricted to a disjoint collection of $m$ intervals. This is a generalization of classical RAPs studied in the literature where in contrast each allocation is only restricted by simple lower and upper bounds, i.e., $m=1$. We propose an exact algorithm that, for four special cases of the problem, returns an optimal solution in $O \left(\binom{n+m-2}{m-2} (n \log n + nF) \right)$ time, where the term $nF$ represents the number of flops required for one evaluation of the separable objective function. In particular, the algorithm runs in polynomial time when the number of intervals $m$ is fixed. Moreover, we show how this algorithm can be adapted to also output an optimal solution to the problem with integer variables without increasing its time complexity. Computational experiments demonstrate the practical efficiency of the algorithm for small values of $m$ and in particular for solving EV charging problems.

5.Convex Optimization of PV-Battery System Sizing and Operation with Non-Linear Loss Models

2307.15507

Authors:Jolien Despeghel, Jeroen Tant, Johan Driesen

Abstract: In the literature, when optimizing the sizing and operation of a residential PV system in combination with a battery energy storage system, the efficiency of the battery and the converter is generally assumed constant, which corresponds to a linear loss model that can be readily integrated in an optimization model. However, this assumption does not always represent the impact of the losses accurately. For this reason, an approach is presented that includes non-linear converter and battery loss models by applying convex relaxations to the non-linear constraints. The relaxed convex formulation is equivalent to the original non-linear formulation and can be solved more efficiently. The difference between the optimization model with non-linear loss models and linear loss models is illustrated for a residential DC-coupled PV-battery system. The linear loss model is shown to result in an underestimation of the battery size and cost as well as a lower utilization of the battery. The proposed method is useful to accurately model the impact of losses on the optimal sizing and operation in exchange for a slightly higher computational time compared to linear loss models, though far below that of solving the non-relaxed non-linear problem.

6.Nonlinear conjugate gradient method for vector optimization on Riemannian manifolds with retraction and vector transport

2307.15515

Authors:Kangming Chen, Ellen H. Fukuda, Hiroyuki Sato

Abstract: In this paper, we propose nonlinear conjugate gradient methods for vector optimization on Riemannian manifolds. The concepts of Wolfe and Zoutendijk conditions are extended for Riemannian manifolds. Specifically, we establish the existence of intervals of step sizes that satisfy the Wolfe conditions. The convergence analysis covers the vector extensions of the Fletcher--Reeves, conjugate descent, and Dai--Yuan parameters. Under some assumptions, we prove that the sequence obtained by the algorithm can converge to a Pareto stationary point. Moreover, we also discuss several other choices of the parameter. Numerical experiments illustrating the practical behavior of the methods are presented.

7.Be greedy and learn: efficient and certified algorithms for parametrized optimal control problems

2307.15590

Authors:Hendrik Kleikamp, Martin Lazar, Cesare Molinari

Abstract: We consider parametrized linear-quadratic optimal control problems and provide their online-efficient solutions by combining greedy reduced basis methods and machine learning algorithms. To this end, we first extend the greedy control algorithm, which builds a reduced basis for the manifold of optimal final time adjoint states, to the setting where the objective functional consists of a penalty term measuring the deviation from a desired state and a term describing the control energy. Afterwards, we apply machine learning surrogates to accelerate the online evaluation of the reduced model. The error estimates proven for the greedy procedure are further transferred to the machine learning models and thus allow for efficient a posteriori error certification. We discuss the computational costs of all considered methods in detail and show by means of two numerical examples the tremendous potential of the proposed methodology.

8.Inexact proximal methods for weakly convex functions

2307.15596

Authors:Pham Duy Khanh, Boris Mordukhovich, Dat Ba Tran

Abstract: This paper proposes and develops inexact proximal methods for finding stationary points of the sum of a smooth function and a nonsmooth weakly convex one, where an error is present in the calculation of the proximal mapping of the nonsmooth term. A general framework for finding zeros of a continuous mapping is derived from our previous paper on this subject to establish convergence properties of the inexact proximal point method when the smooth term vanishes and of the inexact proximal gradient method when the smooth term satisfies a descent condition. The inexact proximal point method achieves global convergence with constructive convergence rates when the Moreau envelope of the objective function satisfies the Kurdyka-Lojasiewicz (KL) property. Meanwhile, when the smooth term is twice continuously differentiable with a Lipschitz continuous gradient and a differentiable approximation of the objective function satisfies the KL property, the inexact proximal gradient method achieves the global convergence of iterates with constructive convergence rates.

9.$\ell_p$-sphere covering and approximating nuclear $p$-norm

2307.15616

Authors:Jiewen Guan, Simai He, Bo Jiang, Zhening Li

Abstract: The spectral $p$-norm and nuclear $p$-norm of matrices and tensors appear in various applications albeit both are NP-hard to compute. The former sets a foundation of $\ell_p$-sphere constrained polynomial optimization problems and the latter has been found in many rank minimization problems in machine learning. We study approximation algorithms of the tensor nuclear $p$-norm with an aim to establish the approximation bound matching the best one of its dual norm, the tensor spectral $p$-norm. Driven by the application of sphere covering to approximate both tensor spectral and nuclear norms ($p=2$), we propose several types of hitting sets that approximately represent $\ell_p$-sphere with adjustable parameters for different levels of approximations and cardinalities, providing an independent toolbox for decision making on $\ell_p$-spheres. Using the idea in robust optimization and second-order cone programming, we obtain the first polynomial-time algorithm with an $\Omega(1)$-approximation bound for the computation of the matrix nuclear $p$-norm when $p\in(2,\infty)$ is a rational, paving a way for applications in modeling with the matrix nuclear $p$-norm. These two new results enable us to propose various polynomial-time approximation algorithms for the computation of the tensor nuclear $p$-norm using tensor partitions, convex optimization and duality theory, attaining the same approximation bound to the best one of the tensor spectral $p$-norm. We believe the ideas of $\ell_p$-sphere covering with its applications in approximating nuclear $p$-norm would be useful to tackle optimization problems on other sets such as the binary hypercube with its applications in graph theory and neural networks, the nonnegative sphere with its applications in copositive programming and nonnegative matrix factorization.

10.Convergence of Augmented Lagrangian Methods for Composite Optimization Problems

2307.15627

Authors:Nguyen T. V. Hang, Ebrahim Sarabi

Abstract: Local convergence analysis of the augmented Lagrangian method (ALM) is established for a large class of composite optimization problems with nonunique Lagrange multipliers under a second-order sufficient condition. We present a new second-order variational property, called the semi-stability of second subderivatives, and demonstrate that it is widely satisfied for numerous classes of functions, important for applications in constrained and composite optimization problems. Using the latter condition and a certain second-order sufficient condition, we are able to establish Q-linear convergence of the primal-dual sequence for an inexact version of the ALM for composite programs.
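
As a rough illustration of the linear multiplier convergence discussed in this abstract, the sketch below runs a textbook augmented Lagrangian iteration on a toy problem chosen here for illustration (minimize x1^2 + x2^2 subject to x1 + x2 = 1). It is not the paper's composite setting and the multiplier is unique in this toy case; the function name, the penalty value `rho`, and the closed-form inner solve are all assumptions made for this example.

```python
def augmented_lagrangian(rho=1.0, iters=30):
    """Textbook ALM for: min x1^2 + x2^2  subject to  x1 + x2 = 1.

    Returns (x1, x2, multiplier, history), where history stores the
    distance of the multiplier to its known optimal value -1.
    """
    lam = 0.0
    history = []
    t = 0.0
    for _ in range(iters):
        # Inner step: minimize x1^2 + x2^2 + lam*c + 0.5*rho*c^2 with
        # c = x1 + x2 - 1. By symmetry x1 = x2 = t, and stationarity
        # 2t + lam + rho*(2t - 1) = 0 gives the closed form below.
        t = (rho - lam) / (2.0 + 2.0 * rho)
        c = 2.0 * t - 1.0          # constraint residual
        lam += rho * c             # multiplier update
        history.append(abs(lam + 1.0))
    return t, t, lam, history
```

For this toy problem the multiplier error contracts by the factor 1/(1 + rho) at every iteration, which is a simple instance of Q-linear convergence.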

Thu, 27 Jul 2023digest

1.Adjoint-based optimal control of contractile elastic bodies. Application to limbless locomotion on frictional substrates

2307.14681

Authors:Ashutosh Bijalwan, Jose J Munoz

Abstract: In nature, limbless locomotion is adopted by a wide range of organisms at various length scales. Interestingly, undulatory, crawling and inching/looping gaits constitute a fundamental class of limbless locomotion and are often observed in many species such as caterpillars, earthworms, leeches, larvae, and \emph{C. elegans}, to name a few. In this work, we developed a computationally efficient 3D Finite Element (FE) based unified framework for the locomotion of limbless organisms on soft substrates. Muscle activity is simulated with a multiplicative decomposition of deformation gradient, which allows mimicking a broad range of locomotion patterns in 3D solids on frictional substrates. In particular, a two-field FE formulation based on positions and velocities is proposed. Governing partial differential equations are transformed into equivalent time-continuous differential-algebraic equations (DAEs). Next, the optimal locomotion strategies are studied in the framework of optimal control theory. We resort to adjoint-based methods and deduce the first-order optimality conditions, which yield a system of DAEs with two-point end conditions. Hidden symplectic structure and Symplectic Euler time integration of optimality conditions have been discussed. The resulting discrete first-order optimality conditions form a non-linear programming problem that is solved efficiently with the Forward Backwards Sweep Method. Finally, some numerical examples are provided to demonstrate the comprehensiveness of the proposed computational framework and investigate the energy-efficient optimal limbless locomotion strategy out of distinct locomotion patterns adopted by limbless organisms.

2.Optimality of Split Covariance Intersection Fusion

2307.14741

Authors:Colin Cros, Pierre-Olivier Amblard, Christophe Prieur, Jean-François Da Rocha

Abstract: Linear fusion is a cornerstone of estimation theory. Optimal linear fusion was derived by Bar-Shalom and Campo in the 1980s. It requires knowledge of the cross-covariances between the errors of the estimators. In distributed or cooperative systems, these cross-covariances are difficult to compute. To avoid an underestimation of the errors when these cross-covariances are unknown, conservative fusions must be performed. A conservative fusion provides a fused estimator with a covariance bound which is guaranteed to be larger than the true (but not computable) covariance of the error. Previous research by Reinhardt et al. proved that, if no additional assumption is made about the errors of the estimators, the minimal bound for fusing two estimators is given by a fusion called Covariance Intersection (CI). In practice, the errors of the estimators often have an uncorrelated component, because the dynamic or measurement noise is assumed to be independent. In this context, CI is no longer the optimal method and an adaptation called Split Covariance Intersection (SCI) has been designed to take advantage of these uncorrelated components. The contribution of this paper is to prove that SCI is the optimal fusion rule for two estimators under the assumption that they have an uncorrelated component. It is proved that SCI provides the optimal covariance bound with respect to any increasing cost function. To prove the result, a minimal volume that should contain all conservative bounds is derived, and the SCI bounds are proved to be the only bounds that tightly circumscribe this minimal volume.
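
For readers unfamiliar with the baseline that this abstract builds on, here is a minimal sketch of plain Covariance Intersection (not the split variant studied in the paper) for two 2-dimensional estimates. The fused information matrix is the convex combination w * P1^{-1} + (1 - w) * P2^{-1}. The helper names, the 2x2 restriction, and the grid search that minimizes the trace of the fused bound are assumptions made for this example.

```python
def inv2(m):
    """Inverse of a 2x2 matrix given as ((a, b), (c, d))."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return ((d / det, -b / det), (-c / det, a / det))


def mat_vec(m, v):
    """Product of a 2x2 matrix with a 2-vector."""
    return (m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1])


def covariance_intersection(x1, p1, x2, p2, w):
    """Fuse (x1, p1) and (x2, p2) with weight w in [0, 1]."""
    i1, i2 = inv2(p1), inv2(p2)
    # Fused information matrix: convex combination of the two inverses.
    info = tuple(
        tuple(w * i1[r][c] + (1.0 - w) * i2[r][c] for c in range(2))
        for r in range(2)
    )
    p = inv2(info)
    v1, v2 = mat_vec(i1, x1), mat_vec(i2, x2)
    x = mat_vec(p, (w * v1[0] + (1.0 - w) * v2[0],
                    w * v1[1] + (1.0 - w) * v2[1]))
    return x, p


def best_weight(p1, p2, steps=100):
    """Grid search for the weight minimizing the trace of the fused bound."""
    def trace_at(w):
        _, p = covariance_intersection((0.0, 0.0), p1, (0.0, 0.0), p2, w)
        return p[0][0] + p[1][1]
    return min((i / steps for i in range(steps + 1)), key=trace_at)
```

With P1 = diag(1, 4) and P2 = diag(4, 1), the trace-minimizing weight is 0.5 by symmetry and the fused bound is diag(1.6, 1.6). The trace is only one possible increasing cost function; other choices such as the determinant could be swapped in.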

3.A Variance-Reduced Aggregation Based Gradient Tracking method for Distributed Optimization over Directed Networks

2307.14776

Authors:Shengchao Zhao, Siyuan Song, Yongchao Liu

Abstract: This paper studies the distributed optimization problem over directed networks with noisy information-sharing. To resolve the imperfect communication issue over directed networks, a series of noise-robust variants of Push-Pull/AB method have been developed. These methods improve the robustness of Push-Pull method against the information-sharing noise through adding small factors on weight matrices and replacing the global gradient tracking with the cumulative gradient tracking. Based on the two techniques, we propose a new variant of the Push-Pull method by presenting a novel mechanism of inter-agent information aggregation, named variance-reduced aggregation (VRA). VRA helps us to relax some conditions on the objective function and networks. When the objective function is convex and the sharing-information noise is variance-unbounded, it can be shown that the proposed method converges to the optimal solution almost surely. When the objective function is strongly convex and the sharing-information noise is variance-bounded, the proposed method achieves the convergence rate of $\mathcal{O}\left(k^{-(1-\epsilon)}\right)$ in the mean square sense, where $\epsilon$ can be arbitrarily close to 0. Simulated experiments on ridge regression problems verify the effectiveness of the proposed method.

4.On the robustness of networks of heterogeneous semi-passive systems interconnected over directed graphs

2307.14868

Authors:Anes Lazri, Elena Panteley, Antonio Loria

Abstract: In this short note we provide a proof of boundedness of solutions for a network system composed of heterogeneous nonlinear autonomous systems interconnected over a directed graph. The sole assumptions imposed are that the systems are semi-passive [1] and the graph contains a spanning tree.

5.Feedback and Open-Loop Nash Equilibria for LQ Infinite-Horizon Discrete-Time Dynamic Games

2307.14898

Authors:A. Monti, B. Nortmann, T. Mylvaganam, M. Sassano

Abstract: We consider dynamic games defined over an infinite horizon, characterized by linear, discrete-time dynamics and quadratic cost functionals. Considering such linear-quadratic (LQ) dynamic games, we focus on their solutions in terms of Nash equilibrium strategies. Both Feedback (F-NE) and Open-Loop (OL-NE) Nash equilibrium solutions are considered. The contributions of the paper are threefold. First, our detailed study reveals some interesting structural insights in relation to F-NE solutions. Second, as a stepping stone towards our consideration of OL-NE strategies, we consider a specific infinite-horizon discrete-time (single-player) optimal control problem, wherein the dynamics are influenced by a known exogenous input and draw connections between its solution obtained via Dynamic Programming and Pontryagin's Minimum Principle. Finally, we exploit the latter result to provide a characterization of OL-NE strategies of the class of infinite-horizon dynamic games. The results and key observations made throughout the paper are illustrated via a numerical example.

6.A Stochastic Gradient Tracking Algorithm for Decentralized Optimization With Inexact Communication

2307.14942

Authors:Suhail M. Shah, Raghu Bollapragada

Abstract: Decentralized optimization is typically studied under the assumption of noise-free transmission. However, real-world scenarios often involve the presence of noise due to factors such as additive white Gaussian noise channels or probabilistic quantization of transmitted data. These sources of noise have the potential to degrade the performance of decentralized optimization algorithms if not effectively addressed. In this paper, we focus on the noisy communication setting and propose an algorithm that bridges the performance gap caused by communication noise while also mitigating other challenges like data heterogeneity. We establish theoretical results of the proposed algorithm that quantify the effect of communication noise and gradient noise on the performance of the algorithm. Notably, our algorithm achieves the optimal convergence rate for minimizing strongly convex, smooth functions in the context of inexact communication and stochastic gradients. Finally, we illustrate the superior performance of the proposed algorithm compared to its state-of-the-art counterparts on machine learning problems using MNIST and CIFAR-10 datasets.
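
To make the gradient tracking idea concrete, the following sketch implements the standard noise-free, exact-gradient gradient tracking recursion on scalar quadratics. It is a baseline only, not the noise-robust algorithm proposed in the paper. The local objectives f_i(x) = 0.5 * (x - a_i)^2, the mixing matrix, the step size, and the function name are assumptions chosen for this example.

```python
def gradient_tracking(targets, weights, step=0.1, iters=300):
    """Noise-free gradient tracking on f_i(x) = 0.5 * (x - targets[i])**2.

    `weights` is a doubly stochastic mixing matrix (list of rows).
    Returns the list of local iterates, one per agent.
    """
    n = len(targets)

    def grad(i, x):
        return x - targets[i]

    x = [0.0] * n
    # Each tracker starts at the agent's own local gradient.
    y = [grad(i, x[i]) for i in range(n)]
    for _ in range(iters):
        # Consensus step on x followed by a move along the tracked gradient.
        x_new = [
            sum(weights[i][j] * x[j] for j in range(n)) - step * y[i]
            for i in range(n)
        ]
        # Tracker update: mix, then add the change in the local gradient.
        y = [
            sum(weights[i][j] * y[j] for j in range(n))
            + grad(i, x_new[i]) - grad(i, x[i])
            for i in range(n)
        ]
        x = x_new
    return x
```

With targets 1, 2 and 6, all agents should agree on the minimizer of the sum of the local objectives, which is the mean 3. Adding noise to the exchanged values is exactly the regime where this plain recursion is known to degrade, which motivates the paper's setting.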

Wed, 26 Jul 2023digest

1.Efficient Algorithm for QCQP problem with Multiple Quadratic Constraints

2307.13998

Authors:Huang Yin

Abstract: Starting from a classic financial optimization problem, we first propose a cutting plane algorithm for this problem. Then we use spectral decomposition to transform the problem into an equivalent D.C. programming problem, and the corresponding upper bound estimate is given by the SCO algorithm; then the corresponding lower bound convex relaxation is given by McCormick envelope. Based on this, we propose a global algorithm for this problem and establish the convergence of the algorithms. Moreover, the algorithm is still valid for QCQP with multiple quadratic constraints and quadratic matrix in general form.

2.Stabilization of uncertain linear dynamics: an offline-online strategy

2307.14090

Authors:Philipp A. Guth, Karl Kunisch, Sérgio S. Rodrigues

Abstract: A strategy is proposed for adaptive stabilization of linear systems, depending on an uncertain parameter. Offline, the Riccati stabilizing feedback input control operators, corresponding to parameters in a finite training set of chosen candidates for the uncertain parameter, are solved and stored in a library. A uniform partition of the infinite time interval is chosen. In each of these subintervals, the input is given by one of the stored parameter dependent Riccati feedback operators. This parameter is updated online, at the end of each subinterval, based on input and output data, where the true data, corresponding to the true parameter, is compared to fictitious data that one would obtain in case the parameter was in a selected subset of the training set. The auxiliary data can be computed in parallel, so that the parameter update can be performed in real time. The focus is put on the case that the unknown parameter is constant and that the free dynamics is time-periodic. The stabilizing performance of the input obtained by the proposed strategy is illustrated by numerical simulations, for both constant and switching parameters.

3.Gradient-Type Method for Optimization Problems with Polyak-Lojasiewicz Condition: Relative Inexactness in Gradient and Adaptive Parameters Setting

2307.14101

Authors:Sergei M. Puchinin, Fedor S. Stonyakin

Abstract: We consider minimization problems with the well-known Polyak-Lojasiewicz condition and Lipschitz-continuous gradient. Such problems occur in different places in machine learning and related fields. Furthermore, we assume that a gradient is available with some relative inexactness. We propose an adaptive gradient-type algorithm, where the adaptivity takes place with respect to the smoothness parameter and the level of the gradient inexactness. The theoretical estimate of the quality of the output point is obtained and backed up by experimental results.
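
The sketch below illustrates the general flavor of such a scheme on a standard nonconvex test function that satisfies the Polyak-Lojasiewicz condition, f(x) = x^2 + 3 sin^2(x). It is not the paper's algorithm. The relative gradient error is simulated by scaling the true derivative by (1 ± alpha), and the backtracking acceptance test with constant 0.25 is an assumption made for this example, as are all names and default values.

```python
import math


def f(x):
    """Nonconvex test function satisfying the PL condition."""
    return x * x + 3.0 * math.sin(x) ** 2


def grad(x):
    """Exact derivative of f."""
    return 2.0 * x + 3.0 * math.sin(2.0 * x)


def adaptive_gradient(x0, alpha=0.1, l0=1.0, iters=500, tol=1e-10):
    """Gradient method with a backtracked smoothness estimate.

    The gradient is corrupted by a relative error of size alpha
    (hypothetical noise model: alternating over- and underestimation).
    """
    x, l_est = x0, l0
    for k in range(iters):
        delta = alpha if k % 2 == 0 else -alpha
        g = (1.0 + delta) * grad(x)      # inexact gradient
        if abs(g) < tol:
            break
        l_est = max(l_est / 2.0, 1e-8)   # optimistic decrease of the estimate
        while True:
            x_new = x - g / l_est
            # Accept the step once a sufficient decrease is observed.
            if f(x_new) <= f(x) - 0.25 * g * g / l_est:
                break
            l_est *= 2.0                 # estimate too small: double it
        x = x_new
    return x
```

Starting from x0 = 3.0 the iterates should approach the global minimizer x = 0, since the only stationary point of this function is its global minimum.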

4.Improving Conflict Analysis in MIP Solvers by Pseudo-Boolean Reasoning

2307.14166

Authors:Gioni Mexi, Timo Berthold, Ambros Gleixner, Jakob Nordström

Abstract: Conflict analysis has been successfully generalized from Boolean satisfiability (SAT) solving to mixed integer programming (MIP) solvers, but although MIP solvers operate with general linear inequalities, the conflict analysis in MIP has been limited to reasoning with the more restricted class of clausal constraints. This is in contrast to how conflict analysis is performed in so-called pseudo-Boolean solving, where solvers can reason directly with 0-1 integer linear inequalities rather than with clausal constraints extracted from such inequalities. In this work, we investigate how pseudo-Boolean conflict analysis can be integrated in MIP solving, focusing on 0-1 integer linear programs (0-1 ILPs). Phrased in MIP terminology, conflict analysis can be understood as a sequence of linear combinations and cuts. We leverage this perspective to design a new conflict analysis algorithm based on mixed integer rounding (MIR) cuts, which theoretically dominates the state-of-the-art division-based method in pseudo-Boolean solving. We also report results from a first proof-of-concept implementation of different pseudo-Boolean conflict analysis methods in the open-source MIP solver SCIP. When evaluated on a large and diverse set of 0-1 ILP instances from MIPLIB 2017, our new MIR-based conflict analysis outperforms both previous pseudo-Boolean methods and the clause-based method used in MIP. Our conclusion is that pseudo-Boolean conflict analysis in MIP is a promising research direction that merits further study, and that it might also make sense to investigate the use of such conflict analysis to generate stronger no-goods in constraint programming.

5.Convex semi-infinite programming algorithms with inexact separation oracles

2307.14181

Authors:Antoine Oustry, Martina Cerulli

Abstract: Solving convex Semi-Infinite Programming (SIP) problems is challenging when the separation problem, i.e., the problem of finding the most violated constraint, is computationally hard. We propose to tackle this difficulty by solving the separation problem approximately, i.e., by using an inexact oracle. Our focus lies in two algorithms for SIP, namely the Cutting-Planes (CP) and the Inner-Outer Approximation (IOA) algorithms. We prove the CP convergence rate to be in O(1/k), where k is the number of calls to the limited-accuracy oracle, if the objective function is strongly convex. Compared to the CP algorithm, the advantage of the IOA algorithm is the feasibility of its iterates. In the case of a semi-infinite program with Quadratically Constrained Quadratic Programming separation problem, we prove the convergence of the IOA algorithm toward an optimal solution of the SIP problem despite the oracle's inexactness.

6.Optimisation and monotonicity of the second Robin eigenvalue on a planar exterior domain

2307.14286

Authors:David Krejcirik, Vladimir Lotoreichik

Abstract: We consider the Laplace operator in the exterior of a compact set in the plane, subject to Robin boundary conditions. If the boundary coupling is sufficiently negative, there are at least two discrete eigenvalues below the essential spectrum. We state a general conjecture that the second eigenvalue is maximised by the exterior of a disk under isochoric or isoperimetric constraints. We prove an isoelastic version of the conjecture for the exterior of convex domains. Finally, we establish a monotonicity result for the second eigenvalue under the condition that the compact set is strictly star-shaped and centrally symmetric.

7.Robust Regret Optimal Control

2307.14297

Authors:Jietian Liu, Peter Seiler

Abstract: This paper presents a synthesis method for robust, regret optimal control. The plant is modeled in discrete-time by an uncertain linear time-invariant (LTI) system. An optimal non-causal controller is constructed using the nominal plant model and given full knowledge of the disturbance. Robust regret is defined relative to the performance of this optimal non-causal control. It is shown that a controller achieves robust regret if and only if it satisfies a robust H-infinity performance condition. DK-iteration can be used to synthesize a controller that satisfies this condition and hence achieve a given level of robust regret. The approach is demonstrated via two examples.

8.Parameter-Free FISTA by Adaptive Restart and Backtracking

2307.14323

Authors:Jean-François Aujol, Luca Calatroni, Charles Dossal, Hippolyte Labarrière, Aude Rondepierre

Abstract: We consider a combined restarting and adaptive backtracking strategy for the popular Fast Iterative Shrinkage-Thresholding Algorithm frequently employed for accelerating the convergence speed of large-scale structured convex optimization problems. Several variants of FISTA enjoy a provable linear convergence rate for the function values $F(x_n)$ of the form $\mathcal{O}( e^{-K\sqrt{\mu/L}~n})$ under the prior knowledge of problem conditioning, i.e. of the ratio between the (\L ojasiewicz) parameter $\mu$ determining the growth of the objective function and the Lipschitz constant $L$ of its smooth component. These parameters are nonetheless hard to estimate in many practical cases. Recent works address the problem by estimating either parameter via suitable adaptive strategies. In our work both parameters can be estimated at the same time by means of an algorithmic restarting scheme where, at each restart, a non-monotone estimation of $L$ is performed. For this scheme, theoretical convergence results are proved, showing that a $\mathcal{O}( e^{-K\sqrt{\mu/L}n})$ convergence speed can still be achieved along with quantitative estimates of the conditioning. The resulting Free-FISTA algorithm is therefore parameter-free. Several numerical results are reported to confirm the practical interest of its use in many exemplar problems.
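
As a point of reference for what the paper makes parameter-free, here is a minimal sketch of classical FISTA with a known Lipschitz constant and no restart, applied to a separable toy problem: minimize the sum over i of 0.5 * d_i * (x_i - b_i)^2 + lam * |x_i|. This is the textbook scheme, not Free-FISTA. The problem data, function names, and iteration count are assumptions chosen for this example.

```python
import math


def soft_threshold(v, t):
    """Proximal operator of t * |.| evaluated at v."""
    return math.copysign(max(abs(v) - t, 0.0), v)


def fista(d, b, lam, iters=300):
    """Classical FISTA for sum_i 0.5*d[i]*(x[i]-b[i])**2 + lam*|x[i]|.

    Uses the known Lipschitz constant L = max(d) of the smooth part.
    """
    lip = max(d)
    n = len(b)
    x = [0.0] * n
    y = x[:]
    t = 1.0
    for _ in range(iters):
        # Forward-backward step taken from the extrapolated point y.
        x_next = [
            soft_threshold(y[i] - d[i] * (y[i] - b[i]) / lip, lam / lip)
            for i in range(n)
        ]
        # Standard momentum sequence and extrapolation.
        t_next = (1.0 + math.sqrt(1.0 + 4.0 * t * t)) / 2.0
        y = [
            x_next[i] + (t - 1.0) / t_next * (x_next[i] - x[i])
            for i in range(n)
        ]
        x, t = x_next, t_next
    return x
```

For this separable problem the minimizer is available in closed form as soft_threshold(b_i, lam / d_i), which gives a convenient check; with d = [1, 4], b = [3, -0.1] and lam = 1 the solution is (2, 0). Note that the routine needs L as input, which is the dependence the restart and backtracking scheme in the abstract is designed to remove.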

Tue, 25 Jul 2023digest

1.Federated K-Means Clustering via Dual Decomposition-based Distributed Optimization

2307.13267

Authors:Vassilios Yfantis, Achim Wagner, Martin Ruskowski

Abstract: The use of distributed optimization in machine learning can be motivated either by the resulting preservation of privacy or the increase in computational efficiency. On the one hand, training data might be stored across multiple devices. Training a global model within a network where each node only has access to its confidential data requires the use of distributed algorithms. Even if the data is not confidential, sharing it might be prohibitive due to bandwidth limitations. On the other hand, the ever-increasing amount of available data leads to large-scale machine learning problems. By splitting the training process across multiple nodes its efficiency can be significantly increased. This paper aims to demonstrate how dual decomposition can be applied for distributed training of $ K $-means clustering problems. After an overview of distributed and federated machine learning, the mixed-integer quadratically constrained programming-based formulation of the $ K $-means clustering training problem is presented. The training can be performed in a distributed manner by splitting the data across different nodes and linking these nodes through consensus constraints. Finally, the performance of the subgradient method, the bundle trust method, and the quasi-Newton dual ascent algorithm are evaluated on a set of benchmark problems. While the mixed-integer programming-based formulation of the clustering problems suffers from weak integer relaxations, the presented approach can potentially be used to enable an efficient solution in the future, both in a central and distributed setting.

2.Finding the spectral radius of a nonnegative irreducible symmetric tensor via DC programming

2307.13287

Authors:Xueli Bai, Dong-Hui Li, Lei Wu, Jiefeng Xu

Abstract: The Perron-Frobenius theorem says that the spectral radius of an irreducible nonnegative tensor is the unique positive eigenvalue corresponding to a positive eigenvector. With this in mind, the purpose of this paper is to find the spectral radius and its corresponding positive eigenvector of an irreducible nonnegative symmetric tensor. By transferring the eigenvalue problem into an equivalent problem of minimizing a concave function on a closed convex set, which is typically a DC (difference of convex functions) programming, we derive a simpler and cheaper iterative method. The proposed method is well-defined. Furthermore, we show that both sequences of the eigenvalue estimates and the eigenvector evaluations generated by the method $Q$-linearly converge to the spectral radius and its corresponding eigenvector, respectively. To accelerate the method, we introduce a line search technique. The improved method retains the same convergence property as the original version. Preliminary numerical results show that the improved method performs quite well.

3.DecisionProgramming.jl --A framework for modelling decision problems using mathematical programming

2307.13299

Authors:Juho Andelmin, Jaan Tollander de Balsch, Helmi Hankimaa, Olli Herrala, Fabricio Oliveira

Abstract: We present DecisionProgramming.jl, a new Julia package for modelling decision problems as mixed-integer programming (MIP) equivalents. The package allows the user to pose decision problems as influence diagrams which are then automatically converted to an equivalent MIP formulation. This MIP formulation is implemented using JuMP.jl, a Julia package providing an algebraic syntax for formulating mathematical programming problems. In this paper, we show novel MIP formulations used in the package, which considerably improve the computational performance of the MIP solver. We also present a novel heuristic that can be employed to warm start the solution, as well as providing heuristic solutions to more computationally challenging problems. Lastly, we describe a novel case study showcasing decision programming as an alternative framework for modelling multi-stage stochastic dynamic programming problems.

4.Computational Guarantees for Doubly Entropic Wasserstein Barycenters via Damped Sinkhorn Iterations

2307.13370

Authors:Lénaïc Chizat, Tomas Vaškevičius

Abstract: We study the computation of doubly regularized Wasserstein barycenters, a recently introduced family of entropic barycenters governed by inner and outer regularization strengths. Previous research has demonstrated that various regularization parameter choices unify several notions of entropy-penalized barycenters while also revealing new ones, including a special case of debiased barycenters. In this paper, we propose and analyze an algorithm for computing doubly regularized Wasserstein barycenters. Our procedure builds on damped Sinkhorn iterations followed by exact maximization/minimization steps and guarantees convergence for any choice of regularization parameters. An inexact variant of our algorithm, implementable using approximate Monte Carlo sampling, offers the first non-asymptotic convergence guarantees for approximating Wasserstein barycenters between discrete point clouds in the free-support/grid-free setting.

5.A new Lagrangian approach to control affine systems with a quadratic Lagrange term

2307.13402

Authors:Sigrid Leyendecker, Sofya Maslovskaya, Sina Ober-Blobaum, Rodrigo T. Sato Martin de Almagro, Flora Orsolya Szemenyei

Abstract: In this work, we consider optimal control problems for mechanical systems on vector spaces with fixed initial and free final state and a quadratic Lagrange term. Specifically, the dynamics is described by a second order ODE containing an affine control term and we allow linear coordinate changes in the configuration space. Classically, Pontryagin's maximum principle gives necessary optimality conditions for the optimal control problem. For smooth problems, alternatively, a variational approach based on an augmented objective can be followed. Here, we propose a new Lagrangian approach leading to equivalent necessary optimality conditions in the form of Euler-Lagrange equations. Thus, the differential geometric structure (similar to classical Lagrangian dynamics) can be exploited in the framework of optimal control problems. In particular, the formulation enables the symplectic discretisation of the optimal control problem via variational integrators in a straightforward way.

6.Multiple Lyapunov Functions and Memory: A Symbolic Dynamics Approach to Systems and Control

2307.13543

Authors:Matteo Della Rossa, Raphaël M. Jungers

Abstract: We propose a novel framework for the Lyapunov analysis of a large class of hybrid systems, inspired by the theory of symbolic dynamics and earlier results on the restricted class of switched systems. This new framework allows us to leverage language theory tools in order to provide a universal characterization of Lyapunov stability for this class of systems. We establish, in particular, a formal connection between multiple Lyapunov functions and techniques based on memorization and/or prediction of the discrete part of the state. This allows us to provide an equivalent (single) Lyapunov function, for any given multiple-Lyapunov criterion. By leveraging our Language-theoretic formalism, a new class of stability conditions is then obtained when considering both memory and future values of the state in a joint fashion, providing new numerical schemes that outperform existing techniques. Our techniques are then illustrated on numerical examples.

7.Assortment Optimization with Visibility Constraints

2307.13656

Authors:Theo Barre, Omar El Housni, Andrea Lodi

Abstract: Motivated by applications in e-retail and online advertising, we study the problem of assortment optimization under visibility constraints, which we refer to as APV. We are given a universe of substitutable products and a stream of T customers. The objective is to determine the optimal assortment of products to offer to each customer in order to maximize the total expected revenue, subject to the constraint that each product is required to be shown to a minimum number of customers. The minimum display requirement for each product is given exogenously and we refer to these constraints as visibility constraints. We assume that customer choices follow a Multinomial Logit model (MNL). We provide a characterization of the structure of the optimal assortments and present an efficient polynomial time algorithm for solving APV. To accomplish this, we introduce a novel function called the "expanded revenue" of an assortment and establish its supermodularity. Our algorithm takes advantage of this structural property. Additionally, we demonstrate that APV can be formulated as a compact linear program. We also examine the revenue loss resulting from the enforcement of visibility constraints, comparing it to the unconstrained version of the problem. To offset this loss, we propose a novel strategy to distribute the loss among the products subject to visibility constraints. Each vendor is charged an amount proportional to their product's contribution to the revenue loss. Finally, we present the results of our numerical experiments providing illustration of the obtained outcomes, and we discuss some preliminary results on the extension of the problem to accommodate cardinality constraints.

8.Reduced Control Systems on Symmetric Lie Algebras

2307.13664

Authors:Emanuel Malvetti, Gunther Dirr, Frederik vom Ende, Thomas Schulte-Herbrüggen

Abstract: For a symmetric Lie algebra $\mathfrak g=\mathfrak k\oplus\mathfrak p$ we consider a class of bilinear or more general control-affine systems on $\mathfrak p$ defined by a drift vector field $X$ and control vector fields $\mathrm{ad}_{k_i}$ for $k_i\in\mathfrak k$ such that one has fast and full control on the corresponding compact group $\mathbf K$. We show that under quite general assumptions on $X$ such a control system is essentially equivalent to a natural reduced system on a maximal Abelian subspace $\mathfrak a\subseteq\mathfrak p$, and likewise to related differential inclusions defined on $\mathfrak a$. We derive a number of general results for such systems and as an application we prove a simulation result with respect to the preorder induced by the Weyl group action.

9.On structural contraction of biological interaction networks

2307.13678

Authors:M. Ali Al-Radhawi, David Angeli, Eduardo Sontag

Abstract: In previous work, we have developed an approach for characterizing the long-term dynamics of classes of Biological Interaction Networks (BINs), based on "rate-dependent Lyapunov functions". In this work, we show that stronger notions of convergence can be established by proving structural contractivity with respect to non-standard norms. We illustrate our theory with examples from signaling pathways.

Mon, 24 Jul 2023digest

1.Decentralized Optimization Over Slowly Time-Varying Graphs: Algorithms and Lower Bounds

2307.12562

Authors:Dmitry Metelev, Aleksandr Beznosikov, Alexander Rogozin, Alexander Gasnikov, Anton Proskurnikov

Abstract: We consider a decentralized convex unconstrained optimization problem, where the cost function can be decomposed into a sum of strongly convex and smooth functions, associated with individual agents, interacting over a static or time-varying network. Our main concern is the convergence rate of first-order optimization algorithms as a function of the network's graph, more specifically, of the condition numbers of gossip matrices. We are interested in the case when the network is time-varying but the rate of changes is restricted. We study two cases: a randomly changing network satisfying the Markov property and a network changing in a deterministic manner. For the random case, we propose a decentralized optimization algorithm with accelerated consensus. For the deterministic scenario, we show that if the graph is changing in a worst-case way, accelerated consensus is not possible even if only two edges are changed at each iteration. The fact that such a low rate of network changes is sufficient to make accelerated consensus impossible is novel and improves the previous results in the literature.

2.Finite-sum optimization: Adaptivity to smoothness and loopless variance reduction

2307.12615

Authors:Bastien Batardière, Julien Chiquet, Joon Kwon

Abstract: For finite-sum optimization, variance-reduced gradient methods (VR) compute at each iteration the gradient of a single function (or of a mini-batch), and yet achieve faster convergence than SGD thanks to a carefully crafted lower-variance stochastic gradient estimator that reuses past gradients. Another important line of research of the past decade in continuous optimization is adaptive algorithms such as AdaGrad, which dynamically adjust the (possibly coordinate-wise) learning rate to past gradients and thereby adapt to the geometry of the objective function. Variants such as RMSprop and Adam demonstrate outstanding practical performance that has contributed to the success of deep learning. In this work, we present AdaVR, which combines the AdaGrad algorithm with variance-reduced gradient estimators such as SAGA or L-SVRG. We assess that AdaVR inherits both good convergence properties from VR methods and the adaptive nature of AdaGrad: in the case of $L$-smooth convex functions we establish a gradient complexity of $O(n+(L+\sqrt{nL})/\varepsilon)$ without prior knowledge of $L$. Numerical experiments demonstrate the superiority of AdaVR over state-of-the-art methods. Moreover, we empirically show that the RMSprop and Adam algorithms combined with variance-reduced gradient estimators achieve even faster convergence.
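
As a rough illustration of the combination described in this abstract, the sketch below feeds a SAGA gradient estimator into a coordinate-wise AdaGrad step. It is not the authors' AdaVR: the function name `adagrad_saga`, the parameters `eta` and `eps`, and the toy problem are assumptions made for this example only.

```python
import random


def adagrad_saga(grad_i, n, dim, steps, eta=1.0, eps=1e-8, seed=0):
    """Coordinate-wise AdaGrad steps driven by a SAGA gradient estimator."""
    rng = random.Random(seed)
    x = [0.0] * dim
    # SAGA keeps the last gradient seen for every component function.
    table = [grad_i(i, x) for i in range(n)]
    avg = [sum(g[k] for g in table) / n for k in range(dim)]
    accum = [0.0] * dim  # AdaGrad accumulator of squared estimates
    for _ in range(steps):
        i = rng.randrange(n)
        g_new = grad_i(i, x)
        # SAGA estimator: fresh gradient minus stored gradient plus table mean.
        g = [g_new[k] - table[i][k] + avg[k] for k in range(dim)]
        # Refresh the table entry and its running mean.
        avg = [avg[k] + (g_new[k] - table[i][k]) / n for k in range(dim)]
        table[i] = g_new
        # AdaGrad step: the learning rate adapts to past estimates, no L needed.
        for k in range(dim):
            accum[k] += g[k] * g[k]
            x[k] -= eta * g[k] / (accum[k] ** 0.5 + eps)
    return x


# Toy finite sum (hypothetical): f_i(x) = 0.5 * (x - a_i)^2, minimized at mean(a).
targets = [1.0, 2.0, 3.0, 6.0]
solution = adagrad_saga(lambda i, x: [x[0] - targets[i]], n=4, dim=1, steps=2000)
```

On this toy problem the iterate should settle near the mean of `targets`, which is 3.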

3.Simultaneous Optimization of Launch Vehicle Stage and Trajectory Considering Operational Safety Constraints

2307.12642

Authors:Jaeyoul Ko, Jaewoo Kim, Jimin Choi, Jaemyung Ahn

Abstract: A conceptual design of a launch vehicle involves the optimization of trajectory and stages considering its launch operations. This process encompasses various disciplines, such as structural design, aerodynamics, propulsion systems, flight control, and stage sizing. Traditional approaches used for the conceptual design of a launch vehicle conduct the stage and trajectory designs sequentially, often leading to high computational complexity and suboptimal results. This paper presents an optimization framework that addresses both trajectory optimization and staging in an integrated way. The proposed framework aims to maximize the payload-to-liftoff mass ratio while satisfying the constraints required for safe launch operations (e.g., the impact points of burnt stages and fairing). A case study demonstrates the advantage of the proposed framework compared to the traditional sequential optimization approach.

4.Dissipative State and Output Estimation of Systems with General Delays

2307.12694

Authors:Qian Feng, Feng Xiao, Xiaoyu Wang

Abstract: Dissipative state and output estimation for continuous time-delay systems poses a significant challenge when an unlimited number of pointwise and general distributed delays (DDs) are concerned. We propose an effective solution to this open problem using the Krasovski\u{\i} functional (KF) framework in conjunction with a quadratic supply rate function, where both the plant and the estimator can accommodate an unlimited number of pointwise and general DDs. All DDs can contain an unlimited number of square-integrable kernel functions, which are treated by an equivalent decomposition-approximation scheme. This novel approach allows for the factorization or approximation of any kernel function without introducing conservatism, and facilitates the construction of a complete-type KF with integral kernels that can encompass any number of differentiable (weak derivatives) and linearly independent functions. Our proposed solution is expressed as convex semidefinite programs presented in two theorems along with an iterative algorithm, which eliminates the need for nonlinear solvers. We demonstrate the effectiveness of our method using two challenging numerical experiments, including a system stabilized by a non-smooth controller.

5.Accelerated Zero-Order SGD Method for Solving the Black Box Optimization Problem under "Overparametrization" Condition

2307.12725

Authors:Aleksandr Lobanov, Alexander Gasnikov

Abstract: This paper is devoted to solving a convex stochastic optimization problem in an overparameterization setup for the case where the original gradient computation is not available, but an objective function value can be computed. For this class of problems we provide a novel gradient-free algorithm, which is constructed by applying a gradient approximation with $l_2$ randomization instead of a gradient oracle in the biased Accelerated SGD algorithm, which generalizes the convergence results of the AC-SA algorithm to the case where the gradient oracle returns a noisy (inexact) objective function value. We also perform a detailed analysis to find the maximum admissible level of adversarial noise at which we can guarantee to achieve the desired accuracy. We verify the theoretical results of convergence using a model example.
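
A gradient approximation with $l_2$ randomization is commonly written as a two-point estimator along a direction drawn uniformly from the Euclidean unit sphere. The sketch below assumes that common form; the abstract does not state the exact estimator, and the smoothing parameter `tau`, the helper names and the test function are hypothetical.

```python
import math
import random


def sample_unit_sphere(dim, rng):
    """Draw a direction uniformly from the Euclidean unit sphere."""
    while True:
        z = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(v * v for v in z))
        if norm > 0.0:
            return [v / norm for v in z]


def l2_gradient_estimate(f, x, tau, rng):
    """Two-point estimate: dim * (f(x + tau*e) - f(x - tau*e)) / (2*tau) * e."""
    dim = len(x)
    e = sample_unit_sphere(dim, rng)
    f_plus = f([xi + tau * ei for xi, ei in zip(x, e)])
    f_minus = f([xi - tau * ei for xi, ei in zip(x, e)])
    scale = dim * (f_plus - f_minus) / (2.0 * tau)
    return [scale * ei for ei in e]


def averaged_estimate(f, x, tau, samples, seed=0):
    """Average many estimates; uses only function values, never gradients."""
    rng = random.Random(seed)
    total = [0.0] * len(x)
    for _ in range(samples):
        g = l2_gradient_estimate(f, x, tau, rng)
        total = [t + gi for t, gi in zip(total, g)]
    return [t / samples for t in total]


# Hypothetical smooth test function with gradient equal to x itself.
def half_squared_norm(x):
    return 0.5 * sum(v * v for v in x)


point = [1.0, -2.0, 0.5]
estimate = averaged_estimate(half_squared_norm, point, tau=1e-3, samples=20000)
```

A single estimate is noisy; averaging only serves to show that the estimator points, in expectation, along the true gradient of a smooth function.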

6.Open Problem: Polynomial linearly-convergent method for geodesically convex optimization?

2307.12743

Authors:Christopher Criscitiello, David Martínez-Rubio, Nicolas Boumal

Abstract: Let $f \colon \mathcal{M} \to \mathbb{R}$ be a Lipschitz and geodesically convex function defined on a $d$-dimensional Riemannian manifold $\mathcal{M}$. Does there exist a first-order deterministic algorithm which (a) uses at most $O(\mathrm{poly}(d) \log(\epsilon^{-1}))$ subgradient queries to find a point with target accuracy $\epsilon$, and (b) requires only $O(\mathrm{poly}(d))$ arithmetic operations per query? In convex optimization, the classical ellipsoid method achieves this. After detailing related work, we provide an ellipsoid-like algorithm with query complexity $O(d^2 \log^2(\epsilon^{-1}))$ and per-query complexity $O(d^2)$ for the limited case where $\mathcal{M}$ has constant curvature (hemisphere or hyperbolic space). We then detail possible approaches and corresponding obstacles for designing an ellipsoid-like method for general Riemannian manifolds.

7.Impulsive optimal control problems with time delays in the drift term

2307.12806

Authors:Giovanni Fusco, Monica Motta

Abstract: We introduce a notion of bounded variation solution for a new class of nonlinear control systems with ordinary and impulsive controls, in which the drift function depends not only on the state, but also on its past history, through a finite number of time delays. After proving the well posedness of such solutions and the continuity of the corresponding input output map with respect to suitable topologies, we establish necessary optimality conditions for an associated optimal control problem. The approach, which involves approximating the problem by a non impulsive optimal control problem with time delays and using Ekeland principle combined with a recent, nonsmooth version of the Maximum Principle for conventional delayed systems, allows us to deal with mild regularity assumptions and a general endpoint constraint.

8.Optimal Algorithm with Complexity Separation for Strongly Convex-Strongly Concave Composite Saddle Point Problems

2307.12946

Authors:Ekaterina Borodich, Georgiy Kormakov, Dmitry Kovalev, Aleksandr Beznosikov, Alexander Gasnikov

Abstract: In this work, we focus on the following saddle point problem $\min_x \max_y p(x) + R(x,y) - q(y)$ where $R(x,y)$ is $L_R$-smooth, $\mu_x$-strongly convex, $\mu_y$-strongly concave and $p(x), q(y)$ are convex and $L_p, L_q$-smooth respectively. We present a new algorithm with optimal overall complexity $\mathcal{O}\left(\left(\sqrt{\frac{L_p}{\mu_x}} + \frac{L_R}{\sqrt{\mu_x \mu_y}} + \sqrt{\frac{L_q}{\mu_y}}\right)\log \frac{1}{\varepsilon}\right)$ and separation of oracle calls in the composite and saddle part. This algorithm requires $\mathcal{O}\left(\left(\sqrt{\frac{L_p}{\mu_x}} + \sqrt{\frac{L_q}{\mu_y}}\right) \log \frac{1}{\varepsilon}\right)$ oracle calls for $\nabla p(x)$ and $\nabla q(y)$ and $\mathcal{O} \left( \max\left\{\sqrt{\frac{L_p}{\mu_x}}, \sqrt{\frac{L_q}{\mu_y}}, \frac{L_R}{\sqrt{\mu_x \mu_y}} \right\}\log \frac{1}{\varepsilon}\right)$ oracle calls for $\nabla R(x,y)$ to find an $\varepsilon$-solution of the problem. To the best of our knowledge, we are the first to develop an optimal algorithm with complexity separation in the case $\mu_x \not = \mu_y$. Also, we apply this algorithm to a bilinear saddle point problem and obtain the optimal complexity for this class of problems.

Fri, 21 Jul 2023digest

1.Robust stabilization of $2 \times 2$ first-order hyperbolic PDEs with uncertain input delay

2307.11424

Authors:Jing Zhang, Jie Qi

Abstract: A backstepping-based compensator design is developed for a system of $2\times2$ first-order linear hyperbolic partial differential equations (PDE) in the presence of an uncertain long input delay at the boundary. We introduce a transport PDE to represent the delayed input, which leads to three coupled first-order hyperbolic PDEs. A novel backstepping transformation, composed of two Volterra transformations and an affine Volterra transformation, is introduced for the predictive control design. The resulting kernel equations from the affine Volterra transformation are two coupled first-order PDEs, each with two boundary conditions, which brings challenges to the well-posedness analysis. We address this challenge by using the method of characteristics and successive approximation. To analyze the sensitivity of the closed-loop system to uncertain input delay, we introduce a neutral system which captures the control effect resulting from the delay uncertainty. It is proved that the proposed control is robust to small delay variations. Numerical examples illustrate the performance of the proposed compensator.

2.Second-order optimality conditions for bilevel programs

2307.11427

Authors:Xiang Liu, Mengwei Xu, Liwei Zhang

Abstract: Second-order optimality conditions of bilevel programming problems depend on the second-order directional derivatives of the value functions or the solution mappings of the lower level problems under some regular conditions, which cannot be calculated or evaluated. To overcome this difficulty, we propose the notion of the bi-local solution. Under the Jacobian uniqueness conditions for the lower level problem, we prove that the bi-local solution is a local minimizer of some one-level minimization problem. Based on this property, the first-order necessary optimality conditions and second-order necessary and sufficient optimality conditions for the bi-local optimal solution of a given bilevel program are established. The second-order optimality conditions proposed here only involve second-order derivatives of the defining functions of the bilevel problem. The second-order sufficient optimality conditions are used to derive the Q-linear convergence rate of the classical augmented Lagrangian method.

3.Neural Operators for Delay-Compensating Control of Hyperbolic PIDEs

2307.11436

Authors:Jie Qi, Jing Zhang, Miroslav Krstic

Abstract: The recently introduced DeepONet operator-learning framework for PDE control is extended from the results for basic hyperbolic and parabolic PDEs to an advanced hyperbolic class that involves delays on both the state and the system output or input. The PDE backstepping design produces gain functions that are outputs of a nonlinear operator, mapping functions on a spatial domain into functions on a spatial domain, and where this gain-generating operator's inputs are the PDE's coefficients. The operator is approximated with a DeepONet neural network to a degree of accuracy that is provably arbitrarily tight. Once we produce this approximation-theoretic result in infinite dimension, with it we establish stability in closed loop under feedback that employs approximate gains. In addition to supplying such results under full-state feedback, we also develop DeepONet-approximated observers and output-feedback laws and prove their own stabilizing properties under neural operator approximations. With numerical simulations we illustrate the theoretical results and quantify the numerical effort savings, which are of two orders of magnitude, thanks to replacing the numerical PDE solving with the DeepONet.

4.Note on Steepest Descent Algorithm for Quasi L$^{\natural}$-convex Function Minimization

2307.11491

Authors:Kazuo Murota, Akiyoshi Shioura

Abstract: We define a class of discrete quasi convex functions, called semi-strictly quasi L$^{\natural}$-convex functions, and show that the steepest descent algorithm for L$^{\natural}$-convex function minimization also works for this class of quasi convex functions. The analysis of the exact number of iterations is also extended, revealing the so-called geodesic property of the steepest descent algorithm when applied to semi-strictly quasi L$^{\natural}$-convex functions.

5.Forward Completeness and Applications to Control of Automated Vehicles

2307.11515

Authors:Iasson Karafyllis, Dionysis Theodosis, Markos Papageorgiou

Abstract: Forward complete systems are guaranteed to have solutions that exist globally for all positive time. In this paper, a relaxed Lyapunov-like condition for forward completeness is presented for finite-dimensional systems defined on open sets that does not require boundedness of the Lyapunov-like function along the solutions of the system. The corresponding condition is then exploited for the design of autonomous two-dimensional movement, with focus on lane-free cruise controllers for automated vehicles described by the bicycle kinematic model. The derived feedback laws (cruise controllers) are decentralized and can account for collision avoidance, roads of variable width, on-ramps and off-ramps as well as different desired speed for each vehicle.

6.Further Remarks on the Sampled-Data Feedback Stabilization Problem

2307.11517

Authors:John Tsinias, Dionysis Theodosis

Abstract: The paper deals with the problem of the sampled data feedback stabilization for autonomous nonlinear systems. The corresponding results extend those obtained in earlier works by the same authors. The sufficient conditions we establish are based on the existence of discontinuous control Lyapunov functions and the corresponding results are applicable to a class of nonlinear affine in the control systems.

7.Simultaneous Planning of Liner Ship Speed Optimization, Fleet Deployment, Scheduling and Cargo Allocation with Container Transshipment

2307.11583

Authors:Jasashwi Mandal, Adrijit Goswami, Lakshman Thakur, Manoj Kumar Tiwari

Abstract: Due to substantial growth in world waterborne trade volumes and drastic changes in the global climate attributed to CO2 emissions, shipping companies need to increase their operational and energy efficiency. Therefore, a multi-objective mixed-integer non-linear programming (MINLP) model is proposed in this study to simultaneously determine the optimal service schedule, number of vessels in a fleet serving each route, vessel speed between two ports of call, and flow of cargo considering transshipment operations for each pair of origin-destination. This MINLP model presents a trade-off between economic and environmental aspects considering total shipping time and overall shipping cost as the two conflicting objectives. The shipping cost comprises CO2 emission, fuel consumption and several operational costs, where fuel consumption is determined using speed and load. Two efficient evolutionary algorithms, Nondominated Sorting Genetic Algorithm II (NSGA-II) and Online Clustering-based Evolutionary Algorithm (OCEA), are applied to attain the near-optimal solution of the proposed problem. Furthermore, six problem instances of different sizes are solved using these algorithms to validate the proposed model.

8.A more efficient reformulation of complex SDP as real SDP

2307.11599

Authors:Jie Wang

Abstract: This note proposes a novel reformulation of complex semidefinite programs (SDPs) as real SDPs by using Lagrange duality. As an application, we present an economical reformulation of complex SDP relaxations of complex polynomial optimization problems as real SDPs and derive some further reductions by exploiting the structure of the complex SDP relaxations. Various numerical examples demonstrate that our new reformulation runs several times (one order of magnitude in some cases) faster than the usual popular reformulation.
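
For context, the usual reformulation that this abstract uses as its baseline is presumably the standard real embedding of a Hermitian matrix, recalled below. This is a textbook fact stated for orientation and an assumption about what "usual" means here; the note's new duality-based reformulation is not given in the abstract and is not reproduced.

```latex
% Write a Hermitian matrix H as H = A + iB with A, B real n x n,
% A symmetric and B skew-symmetric. Then
\[
  H = A + \mathrm{i}B \succeq 0
  \quad\Longleftrightarrow\quad
  \begin{pmatrix} A & -B \\ B & A \end{pmatrix} \succeq 0 ,
  \qquad A = A^{\top},\; B = -B^{\top}.
\]
% The real matrix on the right has size 2n x 2n, so each n x n complex
% positive semidefinite constraint becomes a real one of twice the dimension.
```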

9.Vector-borne disease outbreak control via instant releases

2307.11614

Authors:Luis Almeida, Jesús Bellver Arnau, Yannick Privat, Carlota Rebelo

Abstract: This paper is devoted to the study of optimal release strategies to control vector-borne diseases, such as dengue, Zika, chikungunya and malaria. Two techniques are considered: the sterile insect one (SIT), which consists in releasing sterilized males among wild vectors in order to perturb their reproduction, and the Wolbachia one (presently used mainly for mosquitoes), which consists in releasing vectors, that are infected with a bacterium limiting their vector capacity, in order to replace the wild population by one with reduced vector capacity. In each case, the time dynamics of the vector population is modeled by a system of ordinary differential equations in which the releases are represented by linear combinations of Dirac measures with positive coefficients determining their intensity. We introduce optimal control problems that we solve numerically using ad-hoc algorithms, based on writing first-order optimality conditions characterizing the best combination of Dirac measures. We then discuss the results obtained, focusing in particular on the complexity and efficiency of optimal controls and comparing the strategies obtained. Mathematical modeling can help testing a great number of scenarios that are potentially interesting in future interventions (even those that are orthogonal to the present strategies) but that would be hard, costly or even impossible to test in the field in present conditions.

10.About the Blaschke-Santalo diagram of area, perimeter and moment of inertia

2307.11658

Authors:Raphael Gastaldello, Antoine Henrot, Ilaria Lucardesi

Abstract: We study the Blaschke-Santal\'o diagram associated to the area, the perimeter, and the moment of inertia. We work in dimension 2, under two assumptions on the shapes: convexity and the presence of two orthogonal axes of symmetry. We discuss topological and geometrical properties of the diagram. As a by-product we address a conjecture by P\'olya, in the simplified setting of double symmetry.

11.A Sampling-Based Method for Gittins Index Approximation

2307.11713

Authors:Stef Baas, Richard J. Boucherie, Aleida Braaksma

Abstract: A sampling-based method is introduced to approximate the Gittins index for a general family of alternative bandit processes. The approximation consists of a truncation of the optimization horizon and support for the immediate rewards, an optimal stopping value approximation, and a stochastic approximation procedure. Finite-time error bounds are given for the three approximations, leading to a procedure to construct a confidence interval for the Gittins index using a finite number of Monte Carlo samples, as well as an epsilon-optimal policy for the Bayesian multi-armed bandit. Proofs are given for almost sure convergence and convergence in distribution for the sampling based Gittins index approximation. In a numerical study, the approximation quality of the proposed method is verified for the Bernoulli bandit and Gaussian bandit with known variance, and the method is shown to significantly outperform Thompson sampling and the Bayesian Upper Confidence Bound algorithms for a novel random effects multi-armed bandit.

Thu, 20 Jul 2023digest

1.A Generalized Pell's equation for a class of multivariate orthogonal polynomials

2307.10668

Authors:Jean-Bernard Lasserre LAAS-POP, Yuan Xu

Abstract: We extend the polynomial Pell's equation satisfied by univariate Chebyshev polynomials on [-1, 1] from one variable to several variables, using orthogonal polynomials on regular domains that include cubes, balls, and simplexes of arbitrary dimension. Moreover, we show that such an equation is strongly connected (i) to a certificate of positivity (from real algebraic geometry) on the domain, as well as (ii) to the Christoffel functions of the equilibrium measure on the domain. In addition, the solution to Pell's equation reflects an extremal property of orthonormal polynomials associated with an entropy-like criterion.
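
The univariate starting point of this abstract is the classical identity $T_n(x)^2 - (x^2-1)\,U_{n-1}(x)^2 = 1$ for Chebyshev polynomials of the first and second kind. The sketch below checks it numerically through the three-term recurrence; the function names are chosen for this example, and the multivariate extension of the paper is not covered.

```python
def chebyshev_T_U(n, x):
    """Return (T_n(x), U_{n-1}(x)) for n >= 1 via the three-term recurrence."""
    t_prev, t_cur = 1.0, x    # T_0, T_1
    u_prev, u_cur = 0.0, 1.0  # U_{-1} (taken as 0), U_0
    for _ in range(n - 1):
        t_prev, t_cur = t_cur, 2.0 * x * t_cur - t_prev
        u_prev, u_cur = u_cur, 2.0 * x * u_cur - u_prev
    return t_cur, u_cur


def pell_residual(n, x):
    """T_n(x)^2 - (x^2 - 1) * U_{n-1}(x)^2 - 1, which should vanish."""
    t, u = chebyshev_T_U(n, x)
    return t * t - (x * x - 1.0) * u * u - 1.0


# Evaluate the residual on a few sample points of [-1, 1].
residuals = [
    pell_residual(n, x)
    for n in range(1, 9)
    for x in (-1.0, -0.5, 0.0, 0.3, 1.0)
]
```

Up to floating-point rounding, every entry of `residuals` should be zero.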

2.Gotta catch 'em all: Modeling All Discrete Alternatives for Industrial Energy System Transitions

2307.10687

Authors:Hendrik Schricker, Benedikt Schuler, Christiane Reinert, Niklas von der Aßen

Abstract: Industrial decision-makers often base decisions on mathematical optimization models to achieve cost-efficient design solutions in energy transitions. However, since a model can only approximate reality, the optimal solution is not necessarily the best real-world energy system. Exploring near-optimal design spaces, e.g., by the Modeling All Alternatives (MAA) method, provides a more holistic view of decision alternatives beyond the cost-optimal solution. However, the MAA method misses out on discrete investment decisions. Incorporating such discrete investment decisions is crucial when modeling industrial energy systems. Our work extends the MAA method by integrating discrete design decisions. We optimize the design and operation of an industrial energy system transformation using a mixed-integer linear program. First, we explore the continuous, near-optimal design space by applying the MAA method. Thereafter, we sample all discrete design alternatives from the continuous, near-optimal design space. In a case study, we apply our method to identify all near-optimal design alternatives of an industrial energy system. We find 128 near-optimal design alternatives where costs are allowed to increase to a maximum of one percent, offering decision-makers more flexibility in their investment decisions. Our work enables the analysis of discrete design alternatives for industrial energy transitions and supports the decision-making process for investments in energy infrastructure.

3.A unified observability result for non-autonomous observation problems

2307.10716

Authors:Fabian Gabel, Albrecht Seelmann

Abstract: A final-state observability result in the Banach space setting for non-autonomous observation problems is obtained that covers and extends all previously known results in this context, while providing a streamlined proof that follows the established Lebeau-Robbiano strategy.

4.Quantifying low rank approximations of third order symmetric tensors

2307.10855

Authors:Shenglong Hu, Defeng Sun, Kim-Chuan Toh

Abstract: In this paper, we present a method to certify the approximation quality of a low rank tensor to a given third order symmetric tensor. Under mild assumptions, a best low rank approximation is attained if a control parameter is zero, or a quantified quasi-optimal low rank approximation is obtained if the control parameter is positive. This is based on a primal-dual method for computing a low rank approximation for a given tensor. The certification is derived from the global optimality of the primal and dual problems, and is characterized by easily checkable relations between the primal and the dual solutions together with another rank condition. The theory is verified theoretically for orthogonally decomposable tensors as well as numerically through examples in the general case.

5.Decentralized conditional gradient method over time-varying graphs

2307.10978

Authors:Roman Vedernikov, Alexander Rogozin, Alexander Gasnikov

Abstract: In this paper we study a generalization of the distributed conditional gradient method to time-varying network architectures. We theoretically analyze convergence properties of the algorithm and provide numerical experiments. The time-varying network is modeled as a deterministic or a stochastic sequence of graphs.

Wed, 19 Jul 2023digest

1.On the Bredies-Chenchene-Lorenz-Naldi algorithm

2307.09747

Authors:Heinz H. Bauschke, Walaa M. Moursi, Shambhavi Singh, Xianfu Wang

Abstract: Monotone inclusion problems occur in many areas of optimization and variational analysis. Splitting methods, which utilize resolvents or proximal mappings of the underlying operators, are often applied to solve these problems. In 2022, Bredies, Chenchene, Lorenz, and Naldi introduced a new elegant algorithmic framework that encompasses various well known algorithms including Douglas-Rachford and Chambolle-Pock. They obtained powerful weak and strong convergence results, where the latter type relies on additional strong monotonicity assumptions. In this paper, we complement the analysis by Bredies et al. by relating the projections of the fixed point sets of the underlying operators that generate the (reduced and original) preconditioned proximal point sequences. We also obtain strong convergence results in the case of linear relations. Various examples are provided to illustrate the applicability of our results.

2.Stopping Rules for Gradient Method for Saddle Point Problems with Twoside Polyak-Lojasievich Condition

2307.09921

Authors:Muratidi A. Ya., Stonyakin F. S

Abstract: The paper considers approaches to saddle point problems with a two-sided variant of the Polyak-Lojasievich condition based on the gradient method with inexact information and proposes a stopping rule based on the smallness of the norm of the inexact gradient of the external subproblem. Achieving this rule in combination with a suitable accuracy of solving the auxiliary subproblem ensures that the quality of the original saddle point problem is acceptable. The results of numerical experiments for various saddle point problems are discussed to illustrate the effectiveness of the proposed method, including the comparison with proven convergence rate estimates.

3.Information Structures in AC/DC Grids

2307.09922

Authors:Josh A. Taylor

Abstract: The converters in an AC/DC grid form actuated boundaries between the AC and DC subgrids. We show how in both simple linear and balanced dq-frame models, the states on either side of these boundaries are coupled only by control inputs. This topological property imparts all AC/DC grids with poset-causal information structures. A practical benefit is that certain decentralized control problems that are hard in general are tractable for poset-causal systems. We also show that special cases like multi-terminal DC grids can have coordinated and leader-follower information structures.

4.Inexact Direct-Search Methods for Bilevel Optimization Problems

2307.09924

Authors:Youssef Diouane, Vyacheslav Kungurtsev, Francesco Rinaldi, Damiano Zeffiro

Abstract: In this work, we introduce new direct search schemes for the solution of bilevel optimization (BO) problems. Our methods rely on a fixed accuracy black box oracle for the lower-level problem, and deal both with smooth and potentially nonsmooth true objectives. We thus analyze for the first time in the literature direct search schemes in these settings, giving convergence guarantees to approximate stationary points, as well as complexity bounds in the smooth case. We also propose the first adaptation of mesh adaptive direct search schemes for BO. Some preliminary numerical results on a standard set of bilevel optimization problems show the effectiveness of our new approaches.

5.A non-monotone extra-gradient trust-region method with noisy oracles

2307.10038

Authors:Natasa Krejic, Natasa Krklec Jerinkic, Angeles Martinez, Mahsa Yousefi

Abstract: In this work, we introduce a novel stochastic second-order method, within the framework of a non-monotone trust-region approach, for solving the unconstrained, nonlinear, and non-convex optimization problems arising in the training of deep neural networks. The proposed algorithm makes use of subsampling strategies which yield noisy approximations of the finite sum objective function and its gradient. To effectively control the resulting approximation error, we introduce an adaptive sample size strategy based on inexpensive additional sampling. Depending on the estimated progress of the algorithm, this can yield sample size scenarios ranging from mini-batch to full sample functions. We provide convergence analysis for all possible scenarios and show that the proposed method achieves almost sure convergence under standard assumptions for the trust-region framework. We report numerical experiments showing that the proposed algorithm outperforms its state-of-the-art counterpart in deep neural network training for image classification and regression tasks while requiring a significantly smaller number of gradient evaluations.

6.Convergence Guarantees for Stochastic Subgradient Methods in Nonsmooth Nonconvex Optimization

2307.10053

Authors:Nachuan Xiao, Xiaoyin Hu, Kim-Chuan Toh

Abstract: In this paper, we investigate the convergence properties of the stochastic gradient descent (SGD) method and its variants, especially in training neural networks built from nonsmooth activation functions. We develop a novel framework that assigns different timescales to stepsizes for updating the momentum terms and variables, respectively. Under mild conditions, we prove the global convergence of our proposed framework in both single-timescale and two-timescale cases. We show that our proposed framework encompasses a wide range of well-known SGD-type methods, including heavy-ball SGD, SignSGD, Lion, normalized SGD and clipped SGD. Furthermore, when the objective function adopts a finite-sum formulation, we prove the convergence properties for these SGD-type methods based on our proposed framework. In particular, we prove that these SGD-type methods find the Clarke stationary points of the objective function with randomly chosen stepsizes and initial points under mild assumptions. Preliminary numerical experiments demonstrate the high efficiency of our analyzed SGD-type methods.

7.An Operator-Splitting Approach for Variational Optimal Control Formulations for Diffeomorphic Shape Matching

2307.10114

Authors:Andreas Mang, Jiwen He, Robert Azencott

Abstract: We present formulations and numerical algorithms for solving diffeomorphic shape matching problems. We formulate shape matching as a variational problem governed by a dynamical system that models the flow of diffeomorphism $f_t \in \operatorname{diff}(\mathbb{R}^3)$. We overview our contributions in this area, and present an improved, matrix-free implementation of an operator splitting strategy for diffeomorphic shape matching. We showcase results for diffeomorphic shape matching of real clinical cardiac data in $\mathbb{R}^3$ to assess the performance of our methodology.

Tue, 18 Jul 2023digest

1.Solution of the Optimal Control Problem for the Cahn-Hilliard Equation Using Finite Difference Approximation

2307.09016

Authors:Gobinda Garai, Bankim C. Mandal

Abstract: This paper is concerned with designing, analyzing and implementing linear and nonlinear discretization schemes for the distributed optimal control problem (OCP) with the Cahn-Hilliard (CH) equation as a constraint. We propose three difference schemes to approximate and investigate the solution behaviour of the OCP for the CH equation. We present the convergence analysis of the proposed discretization. We verify our findings by presenting numerical experiments.

2.Harnessing the mathematics of matrix decomposition to solve planted and maximum clique problem

2307.09022

Authors:Salma Omer, Montaz Ali

Abstract: We consider the problem of identifying a maximum clique in a given graph. We have proposed a mathematical model for this problem. The model resembles the matrix decomposition of the adjacency matrix of a given graph. The objective function of the mathematical model includes a weighted $\ell_{1}$-norm of the sparse matrix of the decomposition, which has an advantage over the known $\ell_{1}$-norm in reducing the error. The use of dynamically changing weights for the $\ell_{1}$-norm has been motivated. We have used proximal operators within the iterates of the ADMM (alternating direction method of multipliers) algorithm to solve the optimization problem. Convergence of the proposed ADMM algorithm has been provided. The theoretical guarantee of the maximum clique in the form of the low-rank matrix has also been established using the golfing scheme to construct approximate dual certificates. We have constructed conditions that guarantee the recovery and uniqueness of the solution, as well as a tight bound on the dual matrix that validates optimality conditions. Numerical results for planted cliques are presented showing clear advantages of our model when compared with two recent mathematical models. Results are also presented for randomly generated graphs with minimal errors. These errors are found using a formula we have proposed based on the size of the clique. Moreover, we have applied our algorithm to real-world graphs for which cliques have been recovered successfully. The validity of these clique sizes comes from the decomposition of the input graph into a rank-one matrix (corresponding to the clique) and a sparse matrix.

3.Globally solving the Gromov-Wasserstein problem for point clouds in low dimensional Euclidean spaces

2307.09057

Authors:Martin Ryner, Jan Kronqvist, Johan Karlsson

Abstract: This paper presents a framework for computing the Gromov-Wasserstein problem between two sets of points in low dimensional spaces, where the discrepancy is the squared Euclidean norm. The Gromov-Wasserstein problem is a generalization of the optimal transport problem that finds the assignment between two sets preserving pairwise distances as much as possible. This can be used to quantify the similarity between two formations or shapes, a common problem in AI and machine learning. The problem can be formulated as a Quadratic Assignment Problem (QAP), which is in general computationally intractable even for small problems. Our framework addresses this challenge by reformulating the QAP as an optimization problem with a low-dimensional domain, leveraging the fact that the problem can be expressed as a concave quadratic optimization problem with low rank. The method scales well with the number of points, and it can be used to find the global solution for large-scale problems with thousands of points. We compare the computational complexity of our approach with state-of-the-art methods on synthetic problems and apply it to a near-symmetrical problem which is of particular interest in computational biology.

4.Decentralized Stochastic Linear-Quadratic Optimal Control with Risk Constraint and Partial Observation

2307.09152

Authors:Jia Hui, Yuan-Hua Ni

Abstract: This paper addresses a risk-constrained decentralized stochastic linear-quadratic optimal control problem with one remote controller and one local controller, where the risk constraint is posed on the cumulative state weighted variance in order to reduce the oscillation of the system trajectory. In this model, the local controller can only partially observe the system state and sends its state estimate to the remote controller through an unreliable channel, whereas the channel from the remote controller to the local controller is perfect. For the considered constrained optimization problem, we first absorb the risk constraint into the cost function through the Lagrange multiplier method; the resulting augmented cost function includes a quadratic mean-field term of the state. Then, for any fixed multiplier, explicit solutions to the finite-horizon and infinite-horizon mean-field decentralized linear-quadratic problems are derived, together with a necessary and sufficient condition for the mean-square stability of the optimal system. Next, an approach to finding the optimal Lagrange multiplier is presented based on the bisection method. Finally, two numerical examples are given to show the efficiency of the obtained results.
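The bisection step mentioned in this abstract can be sketched generically. The code below is an assumption-laden illustration, not the paper's procedure: `risk_of` is a hypothetical callable returning the constraint value attained by the optimal controller for a given multiplier, and it is assumed to be nonincreasing in the multiplier. The toy function `1 / (1 + lam)` merely stands in for that map.

```python
def bisect_multiplier(risk_of, budget, lam_hi=1.0, tol=1e-8, max_doublings=60):
    """Smallest multiplier lam >= 0 with risk_of(lam) <= budget.

    Assumes risk_of is nonincreasing in lam (a hypothetical property here).
    """
    if risk_of(0.0) <= budget:
        return 0.0  # constraint already satisfied without a penalty
    # Grow the upper end until the constraint holds there.
    for _ in range(max_doublings):
        if risk_of(lam_hi) <= budget:
            break
        lam_hi *= 2.0
    else:
        raise ValueError("no feasible multiplier found in the searched range")
    lam_lo = 0.0
    # Invariant: risk_of(lam_lo) > budget >= risk_of(lam_hi).
    while lam_hi - lam_lo > tol:
        mid = 0.5 * (lam_lo + lam_hi)
        if risk_of(mid) <= budget:
            lam_hi = mid
        else:
            lam_lo = mid
    return lam_hi

# Toy stand-in for the risk attained at multiplier lam.
toy_risk = lambda lam: 1.0 / (1.0 + lam)
lam_star = bisect_multiplier(toy_risk, budget=0.25)
print(lam_star)
```

In the toy case the constraint becomes tight where `1 / (1 + lam) = 0.25`, so the search should settle near `lam = 3`.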

5.A Sweeping Process Control Problem Subject To Mixed Constraints

2307.09164

Authors:Karla L. Cortez, Nathalie T. Khalil, Julio E. Solis

Abstract: In this study, we investigate optimal control problems that involve sweeping processes with a drift term and mixed inequality constraints. Our goal is to establish necessary optimality conditions for these problems. We address the challenges that arise due to the combination of sweeping processes and inequality mixed constraints in two contexts: regular and non-regular. This requires working with different types of multipliers, such as finite positive Radon measures for the sweeping term and integrable functions for regular mixed constraints. For non-regular mixed constraints, the multipliers correspond to purely finitely additive set functions.

6.BOP-Elites, a Bayesian Optimisation Approach to Quality Diversity Search with Black-Box descriptor functions

2307.09326

Authors:Paul Kent, Adam Gaier, Jean-Baptiste Mouret, Juergen Branke

Abstract: Quality Diversity (QD) algorithms such as MAP-Elites are a class of optimisation techniques that attempt to find many high performing points that all behave differently according to a user-defined behavioural metric. In this paper we propose the Bayesian Optimisation of Elites (BOP-Elites) algorithm. Designed for problems with expensive black-box fitness and behaviour functions, it is able to return a QD solution-set with excellent final performance already after a relatively small number of samples. BOP-Elites models both fitness and behavioural descriptors with Gaussian Process (GP) surrogate models and uses Bayesian Optimisation (BO) strategies for choosing points to evaluate in order to solve the quality-diversity problem. In addition, BOP-Elites produces high quality surrogate models which can be used after convergence to predict solutions with any behaviour in a continuous range. An empirical comparison shows that BOP-Elites significantly outperforms other state-of-the-art algorithms without the need for problem-specific parameter tuning.

7.Disturbance decoupled functional observers for fault estimation in nonlinear systems

2307.09359

Authors:Sunjeev Venkateswaran, Costas Kravaris

Abstract: This work deals with the problem of designing disturbance decoupled observers for the estimation of a function of the states in nonlinear systems. Necessary and sufficient conditions for the existence of lower order disturbance decoupled functional observers with linear dynamics and linear output map are derived. Based on this methodology, a fault-estimation scheme based on disturbance decoupled observers is presented. Throughout the paper, the application of the results is illustrated through a chemical reactor case study.

8.Grid-Forming Hybrid Angle Control: Behavior, Stability, Variants and Verification

2307.09398

Authors:Ali Tayyebi, Denis Vettoretti, Adolfo Anta, Florian Dörfler

Abstract: This work explores the stability, behavior, variants, and a controller-hardware-in-the-loop (C-HiL) verification of the recently proposed grid-forming (GFM) hybrid angle control (HAC). We revisit the foundation of GFM HAC, and highlight its behavioral properties in relation to the conventional synchronous machine (SM). Next, we introduce the required complementary controls to be combined with the HAC to realize a GFM behavior. The characterization of the analytical operating point and nonlinear energy-based stability analysis of a grid-connected converter under the HAC is presented. Further, we consider various output filter configurations and derive an approximation for the original control proposal. Moreover, we provide details on the integration of GFM HAC into a complex converter control architecture and introduce several variants of the standard HAC. Finally, the performance of GFM HAC is verified by several test scenarios in a C-HiL setup to test its behavior against real-world effects such as noise and delays.

9.Jointly Improving the Sample and Communication Complexities in Decentralized Stochastic Minimax Optimization

2307.09421

Authors:Xuan Zhang, Gabriel Mancino-Ball, Necdet Serhat Aybat, Yangyang Xu

Abstract: We propose a novel single-loop decentralized algorithm called DGDA-VR for solving the stochastic nonconvex strongly-concave minimax problem over a connected network of $M$ agents. By using stochastic first-order oracles to estimate the local gradients, we prove that our algorithm finds an $\epsilon$-accurate solution with $\mathcal{O}(\epsilon^{-3})$ sample complexity and $\mathcal{O}(\epsilon^{-2})$ communication complexity, both of which are optimal and match the lower bounds for this class of problems. Unlike competitors, our algorithm does not require multiple communications for the convergence results to hold, making it applicable to a broader computational environment setting. To the best of our knowledge, this is the first such algorithm to jointly optimize the sample and communication complexities for the problem considered here.

Mon, 17 Jul 2023digest

1.Convex Bi-Level Optimization Problems with Non-smooth Outer Objective Function

2307.08245

Authors:Roey Merchav, Shoham Sabach

Abstract: In this paper, we propose the Bi-Sub-Gradient (Bi-SG) method, which is a generalization of the classical sub-gradient method to the setting of convex bi-level optimization problems. This is a first-order method that is very easy to implement in the sense that it requires only a computation of the associated proximal mapping or a sub-gradient of the outer non-smooth objective function, in addition to a proximal gradient step on the inner optimization problem. We show, under very mild assumptions, that Bi-SG tackles bi-level optimization problems and achieves sub-linear rates both in terms of the inner and outer objective functions. Moreover, if the outer objective function is additionally strongly convex (still could be non-smooth), the outer rate can be improved to a linear rate. Last, we prove that the distance of the generated sequence to the set of optimal solutions of the bi-level problem converges to zero.

2.Global convergence of a BFGS-type algorithm for nonconvex multiobjective optimization problems

2307.08429

Authors:L. F. Prudente, D. R. Souza

Abstract: We propose a modified BFGS algorithm for multiobjective optimization problems with global convergence, even in the absence of convexity assumptions on the objective functions. Furthermore, we establish the superlinear convergence of the method under usual conditions. Our approach employs Wolfe step sizes and ensures that the Hessian approximations are updated and corrected at each iteration to address the lack of convexity assumption. Numerical results show that the introduced modifications preserve the practical efficiency of the BFGS method.

3.Robust Combinatorial Optimization Problems Under Budgeted Interdiction Uncertainty

2307.08525

Authors:Marc Goerigk, Mohammad Khosravi

Abstract: In robust combinatorial optimization, we would like to find a solution that performs well under all realizations of an uncertainty set of possible parameter values. How we model this uncertainty set has a decisive influence on the complexity of the corresponding robust problem. For this reason, budgeted uncertainty sets are often studied, as they enable us to decompose the robust problem into easier subproblems. We propose a variant of discrete budgeted uncertainty for cardinality-based constraints or objectives, where a weight vector is applied to the budget constraint. We show that while the adversarial problem can be solved in linear time, the robust problem becomes NP-hard and not approximable. We discuss different possibilities to model the robust problem and show experimentally that despite the hardness result, some models scale relatively well in the problem size.

Fri, 14 Jul 2023digest

1.Conic cancellation laws and some applications

2307.07185

Authors:Marius Durea, Elena-Andreea Florea

Abstract: We discuss, on finite and infinite dimensional normed vector spaces, some versions of the Rådström cancellation law (or lemma) that are suited for applications to set optimization problems. In this sense, we call our results "conic" variants of the celebrated result of Rådström, since they involve the presence of an ordering cone on the underlying space. Several adaptations to this context of some topological properties of sets are studied and some applications to subdifferential calculus associated to set-valued maps and to necessary optimality conditions for constrained set optimization problems are given. Finally, a stability problem is considered.

2.Stable domains for higher order elliptic operators

2307.07217

Authors:Jean-François Grosjean, Antoine Lemenant, Rémy Mougenot

Abstract: This paper is devoted to proving that any domain satisfying a $(\delta_0,r_0)$-capacity condition of first order is automatically $(m,p)$-stable for all $m\geqslant 1$ and $p\geqslant 1$, and for any dimension $N\geqslant 1$. In particular, this includes regular enough domains such as $\mathscr{C}^1$-domains, Lipschitz domains, Reifenberg flat domains, but is weak enough to also include cusp points. Our result extends some of the results of Hayouni and Pierre valid only for $N=2,3$, and extends also the results of Bucur and Zolesio for higher order operators, with a different and simpler proof.

3.Stability analysis of the Navier-Stokes velocity tracking problem with bang-bang controls

2307.07283

Authors:Alberto Domínguez Corella, Nicolai Jork, Šárka Nečasová, John Sebastian H. Simon

Abstract: This paper focuses on the stability of solutions for a velocity-tracking problem associated with the two-dimensional Navier-Stokes equations. The considered optimal control problem does not possess any regularizer in the cost, and hence bang-bang solutions can be expected. We investigate perturbations that account for uncertainty in the tracking data and the initial condition of the state, and analyze the convergence rate of solutions when the original problem is regularized by the Tikhonov term. The stability analysis relies on the Hölder subregularity of the optimality mapping, which stems from the necessary conditions of the problem.

4.Projection onto a Capped Rotated Second-Order Cone with Applications to Sparse Regression Relaxations

2307.07290

Authors:Noam Goldberg, Ishy Zagdoun

Abstract: This paper establishes a closed-form expression for projecting onto a capped rotated second-order cone. This special object is a convex set that arises as a part of the feasible region of the perspective relaxation of mixed-integer nonlinear programs (MINLP) with binary indicator variables. The rapid computation of the projection onto this convex set enables the development of effective methods for solving the continuous relaxation of MINLPs whose feasible region may involve a Cartesian product of a large number of such sets. As a proof of concept for the applicability of our projection method, we develop a projected gradient method and specialize a general form of FISTA to use our projection technique in order to effectively solve the continuous perspective relaxation of a sparse regression problem with $L_0$ and $L_2$ penalties. We also generalize the basic sparse regression formulation and solution method to support group sparsity. In experiments we first demonstrate that the projection problem is solved faster and more accurately with our closed-form expression than with an interior-point solver, and also that, when solving sparse regression problems, our methods that apply our projection formula can outperform a state-of-the-art interior-point solver while nearly matching its solution accuracy.

5.A Unified Distributed Method for Constrained Networked Optimization via Saddle-Point Dynamics

2307.07318

Authors:Yi Huang, Ziyang Meng, Jian Sun, Wei Ren

Abstract: This paper develops a unified distributed method for solving two classes of constrained networked optimization problems, i.e., optimal consensus problem and resource allocation problem with non-identical set constraints. We first transform these two constrained networked optimization problems into a unified saddle-point problem framework with set constraints. Subsequently, two projection-based primal-dual algorithms via Optimistic Gradient Descent Ascent (OGDA) method and Extra-gradient (EG) method are developed for solving constrained saddle-point problems. It is shown that the developed algorithms achieve exact convergence to a saddle point with an ergodic convergence rate $O(1/k)$ for general convex-concave functions. Based on the proposed primal-dual algorithms via saddle-point dynamics, we develop unified distributed algorithm design and convergence analysis for these two networked optimization problems. Finally, two numerical examples are presented to demonstrate the theoretical results.
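As a minimal illustration of a projection-based extragradient (EG) iteration for a constrained saddle-point problem, the sketch below runs EG on the toy bilinear function f(x, y) = x*y over the box [-1, 1] x [-1, 1], whose saddle point is (0, 0). Everything in it is an assumption made for illustration: the test function, the box, the step size `eta = 0.5` and the function names do not come from the paper, and the distributed and OGDA variants are not shown.

```python
def project_box(z, lo=-1.0, hi=1.0):
    # Euclidean projection onto the box [lo, hi]^2.
    return tuple(min(hi, max(lo, v)) for v in z)

def saddle_operator(z):
    # For f(x, y) = x * y: (df/dx, -df/dy) = (y, -x).
    x, y = z
    return (y, -x)

def extragradient(z0, eta=0.5, iters=300):
    z = z0
    for _ in range(iters):
        # Extrapolation step: evaluate the operator at the current point.
        g = saddle_operator(z)
        z_half = project_box((z[0] - eta * g[0], z[1] - eta * g[1]))
        # Update step: re-evaluate at the extrapolated point, step from z.
        g_half = saddle_operator(z_half)
        z = project_box((z[0] - eta * g_half[0], z[1] - eta * g_half[1]))
    return z

x_star, y_star = extragradient((1.0, 1.0))
print(x_star, y_star)
```

Plain simultaneous gradient descent-ascent is known to cycle or diverge on this bilinear example, which is why the extra evaluation at the extrapolated point matters.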

6.A Context-Aware Cutting Plane Selection Algorithm for Mixed-Integer Programming

2307.07322

Authors:Mark Turner, Timo Berthold, Mathieu Besançon

Abstract: The current cut selection algorithm used in mixed-integer programming solvers has remained largely unchanged since its creation. In this paper, we propose a set of new cut scoring measures, cut filtering techniques, and stopping criteria, extending the current state-of-the-art algorithm and obtaining a 4\% performance improvement for SCIP over the MIPLIB 2017 benchmark set.

7.Strict pseudocontractions and demicontractions, their properties and applications

2307.07337

Authors:Andrzej Cegielski

Abstract: We give properties of strict pseudocontractions and demicontractions defined on a Hilbert space, which constitute wide classes of operators that arise in iterative methods for solving fixed point problems. In particular, we give necessary and sufficient conditions under which a convex combination and composition of strict pseudocontractions as well as demicontractions that share a common fixed point is again a strict pseudocontraction or a demicontraction, respectively. Moreover, we introduce a generalized relaxation of a composition of demicontractions and give its properties. We apply these properties to prove the weak convergence of a class of algorithms that is wider than the Douglas-Rachford algorithm and projected Landweber algorithms. We also present two numerical examples, where we compare the behavior of the presented methods with the Douglas-Rachford method.

8.Inverse Optimization for Routing Problems

2307.07357

Authors:Pedro Zattoni Scroccaro, Piet van Beek, Peyman Mohajerin Esfahani, Bilge Atasoy

Abstract: We propose a method for learning decision-makers' behavior in routing problems using Inverse Optimization (IO). The IO framework falls into the supervised learning category and builds on the premise that the target behavior is an optimizer of an unknown cost function. This cost function is to be learned through historical data, and in the context of routing problems, can be interpreted as the routing preferences of the decision-makers. In this view, the main contributions of this study are to propose an IO methodology with a hypothesis function, loss function, and stochastic first-order algorithm tailored to routing problems. We further test our IO approach in the Amazon Last Mile Routing Research Challenge, where the goal is to learn models that replicate the routing preferences of human drivers, using thousands of real-world routing examples. Our final IO-learned routing model achieves a score that ranks 2nd compared with the 48 models that qualified for the final round of the challenge. Our results showcase the flexibility and real-world potential of the proposed IO methodology to learn from decision-makers' decisions in routing problems.

Thu, 13 Jul 2023digest

1.Efficient KKT reformulations for bilevel linear programming

2307.06639

Authors:Christoph Buchheim

Abstract: It is a well-known result that bilevel linear programming is NP-hard. In many publications, reformulations as mixed-integer linear programs are proposed, which suggests that the decision version of the problem belongs to NP. However, to the best of our knowledge, a rigorous proof of membership in NP has never been published, so we close this gap by reporting a simple but not entirely trivial proof. A related question is whether a large enough "big M" for the classical KKT-based reformulation can be computed efficiently, which we answer in the affirmative. In particular, our big M has polynomial encoding length in the original problem data.
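For readers unfamiliar with the "classical KKT-based reformulation", a standard textbook form is sketched below. The notation ($c$, $d$, $f$, $A$, $B$, $b$, $\lambda$, $z$, $M$) is chosen here for illustration and need not match the paper's; upper-level constraints are omitted for brevity.

```latex
% Bilevel LP (optimistic form), follower has m constraints:
\min_{x,\,y}\; c^\top x + d^\top y
\quad \text{s.t.} \quad
y \in \arg\min_{y'} \bigl\{ f^\top y' \;:\; A x + B y' \le b \bigr\}.

% Replace the lower level by its KKT conditions and linearize
% complementarity with binaries z and a constant M:
\begin{aligned}
\min_{x,\,y,\,\lambda,\,z}\quad & c^\top x + d^\top y \\
\text{s.t.}\quad & A x + B y \le b, \\
& B^\top \lambda = -f, \qquad \lambda \ge 0, \\
& \lambda_i \le M z_i, \qquad i = 1,\dots,m, \\
& (b - A x - B y)_i \le M (1 - z_i), \qquad i = 1,\dots,m, \\
& z \in \{0,1\}^m .
\end{aligned}
```

The reformulation is valid only if $M$ is at least as large as some optimal dual values and slacks, which is why the question of computing a sufficiently large $M$ efficiently arises.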

2.Weighted tardiness minimization for unrelated machines with sequence-dependent and resource-constrained setups

2307.06671

Authors:Ioannis Avgerinos, Ioannis Mourtos, Stavros Vatikiotis, Georgios Zois

Abstract: Motivated by the need of quick job (re-)scheduling, we examine an elaborate scheduling environment under the objective of total weighted tardiness minimization. The examined problem variant moves well beyond existing literature, as it considers unrelated machines, sequence-dependent and machine-dependent setup times and a renewable resource constraint on the number of simultaneous setups. For this variant, we provide a relaxed MILP to calculate lower bounds, thus estimating a worst-case optimality gap. As a fast exact approach appears not plausible for instances of practical importance, we extend known (meta-)heuristics to deal with the problem at hand, coupling them with a Constraint Programming (CP) component - vital to guarantee the non-violation of the problem's constraints - which optimally allocates resources with respect to tardiness minimization. The validity and versatility of employing different (meta-)heuristics exploiting a relaxed MILP as a quality measure is revealed by our extensive experimental study, which shows that the methods deployed have complementary strengths depending on the instance parameters. Since the problem description has been obtained from a textile manufacturer where jobs of diverse size arrive continuously under tight deadlines, we also discuss the practical impact of our approach in terms of both tardiness decrease and broader managerial insights.

3.Hypergraph-Based Fast Distributed AC Power Flow Optimization

2307.06728

Authors:Xinliang Dai, Yingzhao Lian, Yuning Jiang, Colin N. Jones, Veit Hagenmeyer

Abstract: This paper presents a novel distributed approach for solving AC power flow (PF) problems. The optimization problem is reformulated into a distributed form using a communication structure corresponding to a hypergraph, by which complex relationships between subgrids can be expressed as hyperedges. Then, a hypergraph-based distributed sequential quadratic programming (HDSQP) approach is proposed to handle the reformulated problems, and hypergraph-based distributed quadratic programming (HDQ) is used as the inner algorithm to solve the corresponding QP subproblems, which are respectively condensed using Schur complements with respect to coupling variables defined by hyperedges. Furthermore, we rigorously establish the convergence guarantee of the proposed algorithm with a locally quadratic rate and the one-step convergence of the inner algorithm when using the Levenberg-Marquardt regularization. Our analysis also demonstrates that the computational complexity of the proposed algorithm is much lower than that of the state-of-the-art distributed algorithm. We implement the proposed algorithm in an open-source toolbox, i.e., rapidPF, and conduct numerical tests that validate the proof and demonstrate the great potential of the proposed distributed algorithm in terms of communication effort and computational speed.

4.Linear programming sensitivity measured by the optimal value worst-case analysis

2307.06733

Authors:Milan Hladík

Abstract: This paper introduces a concept of a derivative of the optimal value function in linear programming (LP). Basically, it is the worst-case optimal value of an interval LP problem when the nominal data are inflated to intervals according to given perturbation patterns. By definition, the derivative expresses how the optimal value can worsen when the data are subject to variation. In addition, it also gives a certain sensitivity measure or condition number of an LP problem. If the LP problem is nondegenerate, the derivatives are easy to calculate from the computed primal and dual optimal solutions. For degenerate problems, the computation is more difficult. We propose an upper bound and some kind of characterization, but there are many open problems remaining. We carried out numerical experiments with specific LP problems and with real LP data from the Netlib repository. They show that the derivatives give a suitable sensitivity measure of LP problems. It remains an open problem how to efficiently and rigorously handle degenerate problems.

5.Sharpness and well-conditioning of nonsmooth convex formulations in statistical signal recovery

2307.06873

Authors:Lijun Ding, Alex L. Wang

Abstract: We study a sample complexity vs. conditioning tradeoff in modern signal recovery problems where convex optimization problems are built from sampled observations. We begin by introducing a set of condition numbers related to sharpness in $\ell_p$ or Schatten-p norms ($p\in[1,2]$) based on nonsmooth reformulations of a class of convex optimization problems, including sparse recovery, low-rank matrix sensing, covariance estimation, and (abstract) phase retrieval. In each of the recovery tasks, we show that the condition numbers become dimension independent constants once the sample size exceeds some constant multiple of the recovery threshold. Structurally, this result ensures that the inaccuracy in the recovered signal due to both observation noise and optimization error is well-controlled. Algorithmically, such a result ensures that a new first-order method for solving the class of sharp convex functions in a given $\ell_p$ or Schatten-p norm, when applied to the nonsmooth formulations, achieves nearly-dimension-independent linear convergence.

Wed, 12 Jul 2023digest

1.Outlier detection in regression: conic quadratic formulations

2307.05975

Authors:Andrés Gómez, José Neto

Abstract: In many applications, when building linear regression models, it is important to account for the presence of outliers, i.e., corrupted input data points. Such problems can be formulated as mixed-integer optimization problems involving cubic terms, each given by the product of a binary variable and a quadratic term of the continuous variables. Existing approaches in the literature, typically relying on the linearization of the cubic terms using big-M constraints, suffer from weak relaxation and poor performance in practice. In this work we derive stronger second-order conic relaxations that do not involve big-M constraints. Our computational experiments indicate that the proposed formulations are several orders-of-magnitude faster than existing big-M formulations in the literature for this problem.
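The "cubic terms" and the big-M linearization mentioned in this abstract can be written out as follows. This is one common form from the robust regression literature, given here as an assumption; the symbols ($\beta$, $z$, $w$, $k$, $M$) are illustrative and the paper's exact model, for example any regularization term, may differ.

```latex
% Data (x_i, y_i), i = 1..n; z_i = 1 flags point i as an outlier,
% with at most k outliers allowed.
\min_{\beta,\; z \in \{0,1\}^n}\;
\sum_{i=1}^{n} (1 - z_i)\,\bigl(y_i - x_i^\top \beta\bigr)^2
\quad \text{s.t.} \quad \sum_{i=1}^{n} z_i \le k .

% Big-M linearization: w_i absorbs the residual of a flagged point.
\min_{\beta,\; w,\; z \in \{0,1\}^n}\;
\sum_{i=1}^{n} \bigl(y_i - x_i^\top \beta - w_i\bigr)^2
\quad \text{s.t.} \quad
-M z_i \le w_i \le M z_i \;\; (i = 1,\dots,n), \qquad
\sum_{i=1}^{n} z_i \le k .
```

When $z_i = 0$ the constraint forces $w_i = 0$ and the usual squared residual is paid; when $z_i = 1$ and $M$ is large enough, $w_i$ can cancel the residual so the point contributes nothing.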

2.Online Inventory Problems: Beyond the i.i.d. Setting with Online Convex Optimization

2307.06048

Authors:Massil Hihat, Stéphane Gaïffas, Guillaume Garrigos, Simon Bussy

Abstract: We study multi-product inventory control problems where a manager makes sequential replenishment decisions based on partial historical information in order to minimize its cumulative losses. Our motivation is to consider general demands, losses and dynamics to go beyond standard models which usually rely on newsvendor-type losses, fixed dynamics, and unrealistic i.i.d. demand assumptions. We propose MaxCOSD, an online algorithm that has provable guarantees even for problems with non-i.i.d. demands and stateful dynamics, including for instance perishability. We consider what we call non-degeneracy assumptions on the demand process, and argue that they are necessary to allow learning.

3.On the sharp Makai inequality

2307.06086

Authors:Francesca Prinari, Anna Chiara Zagati

Abstract: On a convex bounded open set, we prove that Poincaré-Sobolev constants for functions vanishing at the boundary can be bounded from below in terms of the norm of the distance function in a suitable Lebesgue space. This generalizes a result shown, in the planar case, by E. Makai, for the torsional rigidity. In addition, we compare the sharp Makai constants obtained in the class of convex sets with the optimal constants defined in other classes of open sets. Finally, an alternative proof of the Hersch-Protter inequality for convex sets is given.

4.Integrated supervisory control and fixed path speed trajectory generation for hybrid electric ships via convex optimization

2307.06184

Authors:Antti Ritari, Niklas Katzenburg, Fabricio Oliveira, Kari Tammi

Abstract: Battery-hybrid power source architectures can reduce fuel consumption and emissions for ships with diverse operation profiles. However, conventional control strategies may fail to improve performance if the future operation profile is unknown to the controller. This paper proposes a guidance, navigation, and control (GNC) function that integrates trajectory generation and hybrid power source supervisory control. We focus on time and fuel optimal path-constrained trajectory planning. This problem is a nonlinear and nonconvex optimal control problem, which means that it is not readily amenable to efficient and reliable solution onboard. We propose a nonlinear change of variables and constraint relaxations that transform the nonconvex planning problem into a convex optimal control problem. The nonconvex three-degree-of-freedom dynamics, hydrodynamic forces, fixed pitch propeller, battery, and general energy converter (e.g., fuel cell or generating set) dissipation constraints are expressed in convex functional form. A condition derived from Pontryagin's Minimum Principle guarantees that, when satisfied, the solution of the relaxed problem provides the solution to the original problem. The validity and effectiveness of this approach are numerically illustrated for a battery-hybrid vessel in model scale. First, the convex hydrodynamic hull and rudder force models are validated with towing tank test data. Second, optimal trajectories and supervisory control schemes are evaluated under varying mission requirements. The convexification scheme in this work lays the path for the employment of mature, computationally robust convex optimization methods and creates a novel possibility for real-time optimization onboard future smart and unmanned surface vehicles.

5.A preliminary model for optimal control of moisture content in unsaturated soils

2307.06217

Authors:Marco Berardi, Fabio V. Difonzo, Roberto Guglielmi

Abstract: In this paper we introduce an optimal control approach to Richards' equation in an irrigation framework, aimed at minimizing water consumption while maximizing root water uptake. We first describe the physics of the nonlinear model under consideration, and then develop the first-order necessary optimality conditions of the associated boundary control problem. We show that our model provides a promising framework to support optimized irrigation strategies, thus facing water scarcity in irrigation. The characterization of the optimal control in terms of a suitable relation with the adjoint state of the optimality conditions is then used to develop numerical simulations on different hydrological settings, which support the analytical findings of the paper.

6.Further techniques on a polynomial positivity question of Collins, Dykema, and Torres-Ayala

2307.06311

Authors:Nathaniel K. Green, Edward D. Kim

Abstract: We prove that the coefficient of $t^2$ in $\mathsf{trace}((A+tB)^6)$ is a sum of squares in the entries of the symmetric matrices $A$ and $B$.
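
A small numerical sanity check of this statement can be sketched as follows. Expanding the sixth power and grouping the 15 words with four A's and two B's by cyclic rotation gives the coefficient 6 tr(A^4 B^2) + 6 tr(A^3 B A B) + 3 tr(A^2 B A^2 B); this grouping and the helper names are our own illustration, not taken from the paper. A sum of squares is nonnegative, so the value should never be negative for symmetric inputs. The check only illustrates the claim and proves nothing.

```python
import numpy as np

def t2_coefficient(A, B):
    """Coefficient of t^2 in trace((A + t*B)^6), grouped by cyclic word class."""
    A2 = A @ A
    A3 = A2 @ A
    A4 = A2 @ A2
    adjacent = np.trace(A4 @ B @ B)        # 6 words cyclically equal to AAAABB
    one_apart = np.trace(A3 @ B @ A @ B)   # 6 words cyclically equal to AAABAB
    opposite = np.trace(A2 @ B @ A2 @ B)   # 3 words cyclically equal to AABAAB
    return 6.0 * adjacent + 6.0 * one_apart + 3.0 * opposite

def t2_coefficient_by_fit(A, B):
    """Independent cross-check: interpolate the degree-6 polynomial in t."""
    ts = np.linspace(-1.0, 1.0, 7)
    values = [np.trace(np.linalg.matrix_power(A + t * B, 6)) for t in ts]
    coeffs = np.polynomial.polynomial.polyfit(ts, values, 6)
    return coeffs[2]

def random_symmetric(rng, n):
    """Hypothetical helper: a random real symmetric n-by-n matrix."""
    M = rng.standard_normal((n, n))
    return (M + M.T) / 2.0
```

Calling `t2_coefficient(random_symmetric(rng, 4), random_symmetric(rng, 4))` with `rng = np.random.default_rng(0)` gives one such value; the interpolation routine should agree with it up to rounding.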

7.Provably Faster Gradient Descent via Long Steps

2307.06324

Authors:Benjamin Grimmer

Abstract: This work establishes provably faster convergence rates for gradient descent via a computer-assisted analysis technique. Our theory allows nonconstant stepsize policies with frequent long steps potentially violating descent by analyzing the overall effect of many iterations at once rather than the typical one-iteration inductions used in most first-order method analyses. We show that long steps, which may increase the objective value in the short term, lead to provably faster convergence in the long term. A conjecture towards proving a faster $O(1/T\log T)$ rate for gradient descent is also motivated along with simple numerical validation.
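
The non-monotone behavior described here can be seen on a toy problem. The sketch below runs gradient descent on a two-dimensional convex quadratic with stepsizes h/L cycling through a short pattern that contains one long step (h > 2). The pattern values, the test function, and all names are illustrative assumptions, not quoted from the abstract. The sketch makes no claim about convergence rates; it only shows that a single long step can raise the objective while a full cycle still lowers it.

```python
import numpy as np

def gradient_descent_with_pattern(eigenvalues, x0, pattern, cycles):
    """GD on f(x) = 0.5 * sum(eigenvalues * x**2) with stepsizes h/L from `pattern`."""
    lam = np.asarray(eigenvalues, dtype=float)
    x = np.asarray(x0, dtype=float).copy()
    L = lam.max()  # smoothness constant of the quadratic

    def f(z):
        return 0.5 * float(np.sum(lam * z ** 2))

    history = [f(x)]
    for _ in range(cycles):
        for h in pattern:
            x = x - (h / L) * (lam * x)  # gradient of f at x is lam * x
            history.append(f(x))
    return x, history

# Illustrative pattern (an assumption): the middle step exceeds the classical 2/L limit.
PATTERN = (1.5, 4.9, 1.5)
_, history = gradient_descent_with_pattern([1.0, 0.1], [1.0, 1.0], PATTERN, cycles=10)
```

Here `history[2]` exceeds `history[1]` because of the long step, while the value at the end of each cycle is below the value at its start.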

Tue, 11 Jul 2023digest

1.Control and estimation of multi-commodity network flow under aggregation

2307.05103

Authors:Yongxin Chen, Tryphon T. Georgiou, Michele Pavon

Abstract: A paradigm put forth by E. Schr\"odinger in 1931/32, known as Schr\"odinger bridges, represents a formalism to pose and solve control and estimation problems seeking a perturbation from an initial control schedule (in the case of control), or from a prior probability law (in the case of estimation), sufficient to reconcile data in the form of marginal distributions and minimal in the sense of relative entropy to the prior. In the same spirit, we consider traffic-flow and apply a Schr\"odinger-type dictum, to perturb minimally with respect to a suitable relative entropy functional a prior schedule/law so as to reconcile the traffic flow with scarce aggregate distributions on families of indistinguishable individuals. Specifically, we consider the problem to regulate/estimate multi-commodity network flow rates based only on empirical distributions of commodities being transported (e.g., types of vehicles through a network, in motion) at two given times. Thus, building on Schr\"odinger's large deviation rationale, we develop a method to identify {\em the most likely flow rates (traffic flow)}, given prior information and aggregate observations. Our method further extends the Schr\"odinger bridge formalism to the multi-commodity setting, allowing commodities to exit or enter the flow field as well (e.g., vehicles to enter and stop and park) at any time. The behavior of entering or exiting the flow field, by commodities or vehicles, is modeled by a Markov chain with killing and creation states. Our method is illustrated with a numerical experiment.

2.Linearization via Ordering Variables in Binary Optimization for Ising Machines

2307.05125

Authors:Kentaro Ohno, Nozomu Togawa

Abstract: Ising machines are next-generation computers expected to efficiently sample near-optimal solutions of combinatorial optimization problems. Combinatorial optimization problems are modeled as quadratic unconstrained binary optimization (QUBO) problems to apply an Ising machine. However, current state-of-the-art Ising machines still often fail to output near-optimal solutions due to the complicated energy landscape of QUBO problems. Furthermore, physical implementation of Ising machines severely restricts the size of QUBO problems to be input as a result of limited hardware graph structures. In this study, we take a new approach to these challenges by injecting auxiliary penalties preserving the optimum, which reduces quadratic terms in QUBO objective functions. The process simultaneously simplifies the energy landscape of QUBO problems, allowing search for near-optimal solutions, and makes QUBO problems sparser, facilitating encoding into Ising machines with restriction on the hardware graph structure. We propose linearization via ordering variables of QUBO problems as an outcome of the approach. By applying the proposed method to synthetic QUBO instances and to multi-dimensional knapsack problems, we empirically validate the effects on enhancing minor embedding of QUBO problems and performance of Ising machines.

3.A regularized Interior Point Method for sparse Optimal Transport on Graphs

2307.05186

Authors:Stefano Cipolla, Jacek Gondzio, Filippo Zanetti

Abstract: In this work, the authors address the Optimal Transport (OT) problem on graphs using a proximal stabilized Interior Point Method (IPM). In particular, strongly leveraging the induced primal-dual regularization, the authors propose to solve large scale OT problems on sparse graphs using a bespoke IPM algorithm able to suitably exploit primal-dual regularization in order to enforce scalability. Indeed, the authors prove that the introduction of the regularization allows the use of sparsified versions of the normal Newton equations to inexpensively generate IPM search directions. A detailed theoretical analysis is carried out showing the polynomial convergence of the inner algorithm in the proposed computational framework. Moreover, the presented numerical results showcase the efficiency and robustness of the proposed approach when compared to network simplex solvers.

4.A stochastic two-step inertial Bregman proximal alternating linearized minimization algorithm for nonconvex and nonsmooth problems

2307.05287

Authors:Chenzheng Guo, Jing Zhao, Qiao-Li Dong

Abstract: In this paper, for solving a broad class of large-scale nonconvex and nonsmooth optimization problems, we propose a stochastic two-step inertial Bregman proximal alternating linearized minimization (STiBPALM) algorithm with variance-reduced stochastic gradient estimators. We also show that SAGA and SARAH are variance-reduced gradient estimators. Under expectation conditions with the Kurdyka-Lojasiewicz property and some suitable conditions on the parameters, we obtain that the sequence generated by the proposed algorithm converges to a critical point. A general convergence rate is also provided. Numerical experiments on sparse nonnegative matrix factorization and blind image-deblurring are presented to demonstrate the performance of the proposed algorithm.

5.Stochastic Nested Compositional Bi-level Optimization for Robust Feature Learning

2307.05384

Authors:Xuxing Chen, Krishnakumar Balasubramanian, Saeed Ghadimi

Abstract: We develop and analyze stochastic approximation algorithms for solving nested compositional bi-level optimization problems. These problems involve a nested composition of $T$ potentially non-convex smooth functions in the upper-level, and a smooth and strongly convex function in the lower-level. Our proposed algorithm does not rely on matrix inversions or mini-batches and can achieve an $\epsilon$-stationary solution with an oracle complexity of approximately $\tilde{O}_T(1/\epsilon^{2})$, assuming the availability of stochastic first-order oracles for the individual functions in the composition and the lower-level, which are unbiased and have bounded moments. Here, $\tilde{O}_T$ hides polylog factors and constants that depend on $T$. The key challenge we address in establishing this result relates to handling three distinct sources of bias in the stochastic gradients. The first source arises from the compositional nature of the upper-level, the second stems from the bi-level structure, and the third emerges due to the utilization of Neumann series approximations to avoid matrix inversion. To demonstrate the effectiveness of our approach, we apply it to the problem of robust feature learning for deep neural networks under covariate shift, showcasing the benefits and advantages of our methodology in that context.

6.Reliable optimal controls for SEIR models in epidemiology

2307.05415

Authors:Simone Cacace, Alessio Oliviero

Abstract: We present and compare two different optimal control approaches applied to SEIR models in epidemiology, which allow us to obtain some policies for controlling the spread of an epidemic. The first approach uses Dynamic Programming to characterise the value function of the problem as the solution of a partial differential equation, the Hamilton-Jacobi-Bellman equation, and derive the optimal policy in feedback form. The second is based on Pontryagin's maximum principle and directly gives open-loop controls, via the solution of an optimality system of ordinary differential equations. This method, however, may not converge to the optimal solution. We propose a combination of the two methods in order to obtain high-quality and reliable solutions. Several simulations are presented and discussed.

7.Stability and genericity of bang-bang controls in affine problems

2307.05418

Authors:Alberto Domínguez Corella, Gerd Wachsmuth

Abstract: We analyse the role of the bang-bang property in affine optimal control problems. We show that many essential stability properties of affine problems are only satisfied when minimizers have the bang-bang property. Moreover, we prove that almost any perturbation in an affine optimal control problem leads to a bang-bang strict global minimizer. We work in an abstract framework that covers many problems in the optimal control literature, including problems constrained by partial and ordinary differential equations. We give examples that show the applicability of our results to specific optimal control problems.

Mon, 10 Jul 2023digest

1.Invex Programs: First Order Algorithms and Their Convergence

2307.04456

Authors:Adarsh Barik, Suvrit Sra, Jean Honorio

Abstract: Invex programs are a special kind of non-convex problems which attain global minima at every stationary point. While classical first-order gradient descent methods can solve them, they converge very slowly. In this paper, we propose new first-order algorithms to solve the general class of invex problems. We identify sufficient conditions for convergence of our algorithms and provide rates of convergence. Furthermore, we go beyond unconstrained problems and provide a novel projected gradient method for constrained invex programs with convergence rate guarantees. We compare and contrast our results with existing first-order algorithms for a variety of unconstrained and constrained invex problems. To the best of our knowledge, our proposed algorithm is the first algorithm to solve constrained invex programs.

2.Tropical convexity in location problems

2307.04465

Authors:Andrei Comăneci

Abstract: We investigate location problems whose optimum lies in the tropical convex hull of the input points. Firstly, we study geodesically star-convex sets under the asymmetric tropical distance and introduce the class of tropically quasiconvex functions whose sub-level sets have this shape. The latter are related to monotonic functions. Then we show that location problems whose distances are measured by tropically quasiconvex functions as before give an optimum in the tropical convex hull of the input points. We also show that a similar result holds if we replace the input points by tropically convex sets. Finally, we focus on applications to phylogenetics presenting properties of consensus methods arising from our class of location problems.

3.An Algorithm with Optimal Dimension-Dependence for Zero-Order Nonsmooth Nonconvex Stochastic Optimization

2307.04504

Authors:Guy Kornowski, Ohad Shamir

Abstract: We study the complexity of producing $(\delta,\epsilon)$-stationary points of Lipschitz objectives which are possibly neither smooth nor convex, using only noisy function evaluations. Recent works proposed several stochastic zero-order algorithms that solve this task, all of which suffer from a dimension-dependence of $\Omega(d^{3/2})$ where $d$ is the dimension of the problem, which was conjectured to be optimal. We refute this conjecture by providing a faster algorithm that has complexity $O(d\delta^{-1}\epsilon^{-3})$, which is optimal (up to numerical constants) with respect to $d$ and also optimal with respect to the accuracy parameters $\delta,\epsilon$, thus solving an open question due to Lin et al. (NeurIPS'22). Moreover, the convergence rate achieved by our algorithm is also optimal for smooth objectives, proving that in the nonconvex stochastic zero-order setting, nonsmooth optimization is as easy as smooth optimization. We provide algorithms that achieve the aforementioned convergence rate in expectation as well as with high probability. Our analysis is based on a simple yet powerful geometric lemma regarding the Goldstein-subdifferential set, which allows utilizing recent advancements in first-order nonsmooth nonconvex optimization.
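
For orientation, a standard building block of zero-order methods is the two-point randomized gradient estimator sketched below. It is not the algorithm of this paper; the function names, the sample count, and the noise-free test function are assumptions made for illustration. For a direction u drawn uniformly from the unit sphere in dimension d, the quantity (d / (2 delta)) (f(x + delta u) - f(x - delta u)) u is an unbiased estimate of the gradient of the delta-smoothed version of f, using function values only.

```python
import numpy as np

def two_point_gradient_estimate(f, x, delta, rng):
    """One randomized two-point estimate built from two function evaluations."""
    d = x.shape[0]
    u = rng.standard_normal(d)
    u /= np.linalg.norm(u)  # uniform direction on the unit sphere
    return (d / (2.0 * delta)) * (f(x + delta * u) - f(x - delta * u)) * u

def averaged_estimate(f, x, delta, num_samples, seed=0):
    """Average many independent estimates to reduce the variance."""
    rng = np.random.default_rng(seed)
    total = np.zeros_like(x, dtype=float)
    for _ in range(num_samples):
        total += two_point_gradient_estimate(f, x, delta, rng)
    return total / num_samples
```

For a linear function f(z) = a @ z the smoothed gradient equals a exactly, so `averaged_estimate` should approach `a` as the number of samples grows.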

Fri, 07 Jul 2023digest

1.Randomized subspace gradient method for constrained optimization

2307.03335

Authors:Ryota Nozawa, Pierre-Louis Poirion, Akiko Takeda

Abstract: We propose randomized subspace gradient methods for high-dimensional constrained optimization. While there have been similarly purposed studies on unconstrained optimization problems, there have been few on constrained optimization problems due to the difficulty of handling constraints. Our algorithms project gradient vectors onto a subspace that is a random projection of the subspace spanned by the gradients of active constraints. We determine the worst-case iteration complexity under linear and nonlinear settings and theoretically confirm that our algorithms can take a larger step size than their deterministic version. From the advantages of taking longer steps and randomized subspace gradients, we show that our algorithms are especially efficient in view of time complexity when gradients cannot be obtained easily. Numerical experiments show that they tend to find better solutions because of the randomness of their subspace selection. Furthermore, they perform well in cases where gradients could not be obtained directly, and instead, gradients are obtained using directional derivatives.

2.Scylla: a matrix-free fix-propagate-and-project heuristic for mixed-integer optimization

2307.03466

Authors:Gioni Mexi, Mathieu Besançon, Suresh Bolusani, Antonia Chmiela, Ambros Gleixner, Alexander Hoen

Abstract: We introduce Scylla, a primal heuristic for mixed-integer optimization problems. It exploits approximate solves of the Linear Programming relaxations through the matrix-free Primal-Dual Hybrid Gradient algorithm with specialized termination criteria, and derives integer-feasible solutions via fix-and-propagate procedures and feasibility-pump-like updates to the objective function. Computational experiments show that the method is particularly suited to instances with hard linear relaxations.

3.Finite Elements with Switch Detection for Numerical Optimal Control of Nonsmooth Dynamical Systems with Set-Valued Step Functions

2307.03482

Authors:Armin Nurkanović, Anton Pozharskiy, Jonathan Frey, Moritz Diehl

Abstract: This paper develops high-accuracy methods for numerically solving optimal control problems subject to nonsmooth differential equations with set-valued step functions. A notable subclass of these systems are Filippov systems. The set-valued step functions are here written as the solution map of a linear program. Using the optimality conditions of this problem we rewrite the initial nonsmooth system into an equivalent dynamic complementarity system (DCS). We extend the Finite Elements with Switch Detection (FESD) method [Nurkanovi\'c et al., 2022], initially developed for Filippov systems transformed via Stewart's reformulation into DCS [Stewart, 1990], to the class of nonsmooth systems with set-valued step functions. The key ideas are to start with a standard Runge-Kutta method for the obtained DCS and to let the integration step sizes be degrees of freedom. Next, we introduce additional conditions to enable implicit but exact switch detection and to remove possible spurious degrees of freedom if no switches occur. The theoretical properties of the method are studied. Its favorable properties are illustrated on numerical simulation and optimal control examples. All methods introduced in this paper are implemented in the open-source software package NOSNOC.

4.Absolute value linear programming

2307.03510

Authors:Milan Hladík, David Hartman

Abstract: We deal with linear programming problems involving absolute values in their formulations, so that they are no longer expressible as standard linear programs. The presence of absolute values causes the problems to be nonconvex and nonsmooth, and hence hard to solve. In this paper, we study fundamental properties of the topology and the geometric shape of the solution set, and also conditions for convexity, connectedness, boundedness and integrality of the vertices. Further, we address various complexity issues, showing that many basic questions are NP-hard to solve. We show that the feasible set is a (nonconvex) polyhedral set and, more importantly, every nonconvex polyhedral set can be described by means of absolute value constraints. We also provide a necessary and sufficient condition for when a KKT point of a nonconvex quadratic programming reformulation solves the original problem.

5.Parallel drone scheduling vehicle routing problems with collective drones

2307.03523

Authors:Roberto Montemanni, Mauro Dell'Amico, Andrea Corsini

Abstract: We study last-mile delivery problems where trucks and drones collaborate to deliver goods to final customers. In particular, we focus on problem settings where either a single truck or a fleet with several homogeneous trucks work in parallel to drones, and drones have the capability of collaborating on delivery missions. This cooperative behaviour of the drones, which are able to connect to each other and work together for some delivery tasks, enhances their potential, since connected drones have increased lifting capabilities and can fly at higher speed, overcoming the main limitations of the setting where the drones can only work independently. In this work, we contribute a Constraint Programming model and a valid inequality for the version of the problem with one truck, namely the \emph{Parallel Drone Scheduling Traveling Salesman Problem with Collective Drones} and we introduce for the first time the variant with multiple trucks, called the \emph{Parallel Drone Scheduling Vehicle Routing Problem with Collective Drones}. For the latter variant, we propose two Constraint Programming models and a Mixed Integer Linear Programming model. An extensive experimental campaign leads to state-of-the-art results for the problem with one truck and some understanding of the presented models' behaviour on the version with multiple trucks. Some insights about future research are finally discussed.

6.The generalized Nash game proposed by Rosen

2307.03532

Authors:Carlos Calderón, John Cotrina

Abstract: We deal with the generalized Nash game proposed by Rosen, which is a game with strategy sets that are coupled across players through a shared constraint. A reduction to a classical game is shown, and as a consequence, Rosen's result can be deduced from the one given by Arrow and Debreu. We also establish necessary and sufficient conditions for a point to be a generalized Nash equilibrium employing the variational inequality approach. Finally, some existence results are given in the non-compact case under coerciveness conditions.

7.Time-dependent parameter identification in a Fokker-Planck equation based magnetization model of large ensembles of nanoparticles

2307.03560

Authors:Hannes Albers, Tobias Kluth

Abstract: In this article, we consider a model motivated by large ensembles of nanoparticles' magnetization dynamics using the Fokker-Planck equation and analyze the underlying parabolic PDE being defined on a smooth, compact manifold without boundary with respect to time-dependent parameter identification using regularization schemes. In the context of magnetic particle imaging, possible fields of application can be found including calibration procedures improved by time-dependent particle parameters and dynamic tracking of nanoparticle orientation. This results in reconstructing different parameters of interest, such as the applied magnetic field and the particles' easy axis. These problems are in particular addressed in the accompanying numerical study.

8.Absorbing games with irrational values

2307.03570

Authors:Miquel Oliu-Barton

Abstract: Can an absorbing game with rational data have an irrational limit value? Yes: In this note we provide the simplest examples where this phenomenon arises. That is, the following $3\times 3$ absorbing game \[ A = \begin{bmatrix} 1^* & 1^* & 2^* \\ 1^* & 2^* & 0\phantom{^*} \\ 2^* & 0\phantom{^*} & 1^* \end{bmatrix}, \] and a sequence of $2\times 2$ absorbing games whose limit values are $\sqrt{k}$, for all integer $k$. Finally, we conjecture that any algebraic number can be represented as the limit value of an absorbing game.

9.Accelerated Optimization Landscape of Linear-Quadratic Regulator

2307.03590

Authors:Lechen Feng, Yuan-Hua Ni

Abstract: Linear-quadratic regulator (LQR) is a landmark problem in the field of optimal control, which is the concern of this paper. Generally, LQR is classified into state-feedback LQR (SLQR) and output-feedback LQR (OLQR) based on whether the full state is obtained. It has been suggested in existing literature that both the SLQR and the OLQR could be viewed as \textit{constrained nonconvex matrix optimization} problems in which the only variable to be optimized is the feedback gain matrix. In this paper, we introduce a first-order accelerated optimization framework for handling the LQR problem, and give its convergence analysis for the cases of SLQR and OLQR, respectively. Specifically, a Lipschitz Hessian property of LQR performance criterion is presented, which turns out to be a crucial property for the application of modern optimization techniques. For the SLQR problem, a continuous-time hybrid dynamic system is introduced, whose solution trajectory is shown to converge exponentially to the optimal feedback gain with Nesterov-optimal order $1-\frac{1}{\sqrt{\kappa}}$ ($\kappa$ the condition number). Then, the symplectic Euler scheme is utilized to discretize the hybrid dynamic system, and a Nesterov-type method with a restarting rule is proposed that preserves the continuous-time convergence rate, i.e., the discretized algorithm admits the Nesterov-optimal convergence order. For the OLQR problem, a Hessian-free accelerated framework is proposed, which is a two-procedure method consisting of semiconvex function optimization and negative curvature exploitation. In a time $\mathcal{O}(\epsilon^{-7/4}\log(1/\epsilon))$, the method can find an $\epsilon$-stationary point of the performance criterion; this entails that the method improves upon the $\mathcal{O}(\epsilon^{-2})$ complexity of vanilla gradient descent. Moreover, our method provides the second-order guarantee of stationary point.

10.A second order dynamical system method for solving a maximal comonotone inclusion problem

2307.03596

Authors:Zengzhen Tan, Rong Hu, Yaping Fang

Abstract: In this paper a second order dynamical system model is proposed for computing a zero of a maximal comonotone operator in Hilbert spaces. Under mild conditions, we prove existence and uniqueness of a strong global solution of the proposed dynamical system. A proper tuning of the parameters can allow us to establish fast convergence properties of the trajectories generated by the dynamical system. The weak convergence of the trajectory to a zero of the maximal comonotone operator is also proved. Furthermore, a discrete version of the dynamical system is considered and convergence properties matching those of the dynamical system are established under the same framework. Finally, the validity of the proposed dynamical system and its discrete version is demonstrated by two numerical examples.

11.Optimal Solutions for a Class of Set-Valued Evolution Problems

2307.03599

Authors:Stefano Bianchini, Alberto Bressan, Maria Teresa Chiri

Abstract: The paper is concerned with a class of optimization problems for moving sets $t\mapsto\Omega(t)\subset\mathbb{R}^2$, motivated by the control of invasive biological populations. Assuming that the initial contaminated set $\Omega_0$ is convex, we prove that a strategy is optimal if and only if at each given time $t\in [0,T]$ the control is active along the portion of the boundary $\partial \Omega(t)$ where the curvature is maximal. In particular, this implies that $\Omega(t)$ is convex for all $t\geq 0$. The proof relies on the analysis of a one-step constrained optimization problem, obtained by a time discretization.

12.Cascading Failures in the Global Financial System: A Dynamical Model

2307.03604

Authors:Leonardo Stella, Dario Bauso, Franco Blanchini, Patrizio Colaneri

Abstract: In this paper, we propose a dynamical model to capture cascading failures among interconnected organizations in the global financial system. Failures can take the form of bankruptcies, defaults, and other insolvencies. The network that underpins the financial interdependencies between different organizations constitutes the backbone of the financial system. A failure in one or more of these organizations can lead to the propagation of the financial collapse onto other organizations in a domino effect. Paramount importance is therefore given to the mitigation of these failures. Motivated by the relevance of this problem and recent prominent events connected to it, we develop a framework that allows us to investigate under what conditions organizations remain healthy or are involved in the propagation of the failures in the network. The contribution of this paper is the following: i) we develop a dynamical model that describes the equity values of financial organizations and their evolution over time given an initial condition; ii) we characterize the equilibria for this model by proving the existence and uniqueness of these equilibria, and by providing an explicit expression for them; and iii) we provide a computational method via sign-space iteration to analyze the propagation of failures and the attractive equilibrium point.

13.Tikhonov regularized second-order plus first-order primal-dual dynamical systems with asymptotically vanishing damping for linear equality constrained convex optimization problems

2307.03612

Authors:Ting Ting Zhu, Rong Hu, Ya Ping Fang

Abstract: In this paper, in the setting of Hilbert spaces, we consider a Tikhonov regularized second-order plus first-order primal-dual dynamical system with asymptotically vanishing damping for a linear equality constrained convex optimization problem. The convergence properties of the proposed dynamical system depend heavily upon the choice of the Tikhonov regularization parameter. When the Tikhonov regularization parameter decreases rapidly to zero, we establish the fast convergence rates of the primal-dual gap, the objective function error, the feasibility measure, and the gradient norm of the objective function along the trajectory generated by the system. When the Tikhonov regularization parameter tends slowly to zero, we prove that the primal trajectory of the Tikhonov regularized dynamical system converges strongly to the minimal norm solution of the linear equality constrained convex optimization problem. Numerical experiments are performed to illustrate the efficiency of our approach.

14.On the Geometry and Refined Rate of Primal-Dual Hybrid Gradient for Linear Programming

2307.03664

Authors:Haihao Lu, Jinwen Yang

Abstract: We study the convergence behaviors of primal-dual hybrid gradient (PDHG) for solving linear programming (LP). PDHG is the base algorithm of a new general-purpose first-order method LP solver, PDLP, which aims to scale up LP by taking advantage of modern computing architectures. Despite its numerical success, the theoretical understanding of PDHG for LP is still very limited; the previous complexity result relies on the global Hoffman constant of the KKT system, which is known to be very loose and uninformative. In this work, we aim to develop a fundamental understanding of the convergence behaviors of PDHG for LP and to develop a refined complexity rate that does not rely on the global Hoffman constant. We show that there are two major stages of PDHG for LP: in Stage I, PDHG identifies active variables and the length of the first stage is driven by a certain quantity which measures how close the non-degeneracy part of the LP instance is to degeneracy; in Stage II, PDHG effectively solves a homogeneous linear inequality system, and the complexity of the second stage is driven by a well-behaved local sharpness constant of the system. This finding is closely related to the concept of partial smoothness in non-smooth optimization, and it is the first complexity result of finite time identification without the non-degeneracy assumption. An interesting implication of our results is that degeneracy itself does not slow down the convergence of PDHG for LP, but near-degeneracy does.
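
As a point of reference, the basic PDHG iteration for an equality-form LP, min c^T x subject to A x = b, x >= 0, can be sketched in a few lines. This is the plain textbook iteration, not PDLP itself, which layers further enhancements on top of it. The tiny test problem, the step sizes (chosen so that tau * sigma * ||A||^2 < 1), and all names are assumptions made for illustration.

```python
import numpy as np

def pdhg_lp(c, A, b, tau, sigma, iterations):
    """Plain PDHG for: minimize c @ x subject to A @ x = b, x >= 0."""
    x = np.zeros(A.shape[1])  # primal iterate
    y = np.zeros(A.shape[0])  # dual iterate (multipliers of A @ x = b)
    for _ in range(iterations):
        # Primal step: gradient step on the Lagrangian, then projection onto x >= 0.
        x_next = np.maximum(x - tau * (c - A.T @ y), 0.0)
        # Dual step: uses the extrapolated primal point 2 * x_next - x.
        y = y + sigma * (b - A @ (2.0 * x_next - x))
        x = x_next
    return x, y

# Toy LP (an assumption): minimize x1 + 2*x2 subject to x1 + x2 = 1, x >= 0.
c = np.array([1.0, 2.0])
A = np.array([[1.0, 1.0]])
b = np.array([1.0])
x, y = pdhg_lp(c, A, b, tau=0.5, sigma=0.5, iterations=2000)
```

On this toy instance the optimal primal solution is x = (1, 0) and the optimal dual value is y = 1. The iterates spiral toward that pair without ever forming or factorizing a matrix.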

15.Bilateral boundary control of an input delayed 2-D reaction-diffusion equation

2307.03727

Authors:Dandan Guan, Yanmei Chen, Jie Qi, Linglong Du

Abstract: In this paper, a delay compensation design method based on PDE backstepping is developed for a two-dimensional reaction-diffusion partial differential equation (PDE) with bilateral input delays. The PDE is defined in a rectangular domain, and the bilateral control is imposed on a pair of opposite sides of the rectangle. To represent the delayed bilateral inputs, we introduce two 2-D transport PDEs that form a cascade system with the original PDE. A novel set of backstepping transformations is proposed for delay compensator design, including one Volterra integral transformation and two affine Volterra integral transformations. Unlike the kernel equation for 1-D PDE systems with delayed boundary input, the resulting kernel equations for the 2-D system have singular initial conditions governed by the Dirac Delta function. Consequently, the kernel solutions are written as a double trigonometric series with singularities. To address the challenge of stability analysis posed by the singularities, we prove a set of inequalities by using the Cauchy-Schwarz inequality, the 2-D Fourier series, and Parseval's theorem. A numerical simulation illustrates the effectiveness of the proposed delay-compensation method.

16.Symmetry reduction and recovery of trajectories of optimal control problems via measure relaxations

2307.03787

Authors:Nicolas Augier, Didier Henrion, Milan Korda, Victor Magron

Abstract: We address the problem of symmetry reduction of optimal control problems under the action of a finite group from a measure relaxation viewpoint. We propose a method based on the moment-SOS aka Lasserre hierarchy which allows one to significantly reduce the computation time and memory requirements compared to the case without symmetry reduction. We show that the recovery of optimal trajectories boils down to solving a symmetric parametric polynomial system. Then we illustrate our method on the symmetric integrator and the time-optimal inversion of qubits.

Thu, 06 Jul 2023digest

1.A generalized Routh-Hurwitz criterion for the stability analysis of polynomials with complex coefficients: application to the PI-control of vibrating structures

2307.02823

Authors:Anthony Hastir, Riccardo Muolo

Abstract: The classical Routh-Hurwitz criterion is one of the most popular methods to study the stability of polynomials with real coefficients, given its simplicity and ductility. However, when moving to polynomials with complex coefficients, a generalization exists but it is rather cumbersome and not as easy to apply. In this paper, we make such generalization clear and understandable for a wider public and develop an algorithm to apply it. After having explained the method, we demonstrate its use to determine the external stability of a system consisting of the interconnection between a rotating shaft and a PI-regulator. The extended Routh-Hurwitz criterion gives then necessary and sufficient conditions on the gains of the PI-regulator to achieve stabilization of the system together with regulation of the output. This illustrative example makes our formulation of the extended Routh-Hurwitz criterion ready to be used in several other applications.
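
For orientation, the stability property that a Routh-Hurwitz-type test certifies can be checked by brute force on small examples. The sketch below is not the algorithm from the paper: it simply computes the roots numerically with `numpy.roots` and tests whether all of them lie in the open left half-plane. The function name `is_hurwitz` and the tolerance `tol` are assumptions made for illustration.

```python
import numpy as np

def is_hurwitz(coeffs, tol=1e-9):
    """Return True if every root of the polynomial has real part < -tol.

    coeffs: polynomial coefficients (real or complex), highest degree first.
    This is a numerical root check, not a Routh-Hurwitz table.
    """
    roots = np.roots(np.asarray(coeffs, dtype=complex))
    return bool(np.all(roots.real < -tol))

# (s + 1)(s + 1 + 1j) = s^2 + (2 + 1j) s + (1 + 1j): both roots in the left half-plane
stable = is_hurwitz([1, 2 + 1j, 1 + 1j])

# (s - 1)(s + 1 + 1j) = s^2 + 1j s - (1 + 1j): one root at s = 1
unstable = is_hurwitz([1, 1j, -(1 + 1j)])

print(stable, unstable)
```

A root check of this kind only works for fixed numerical coefficients; an algebraic criterion is what yields conditions on symbolic parameters such as controller gains.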

2.Benign landscapes of low-dimensional relaxations for orthogonal synchronization on general graphs

2307.02941

Authors:Andrew D. McRae, Nicolas Boumal

Abstract: Orthogonal group synchronization is the problem of estimating $n$ elements $Z_1, \ldots, Z_n$ from the orthogonal group $\mathrm{O}(r)$ given some relative measurements $R_{ij} \approx Z_i^{}Z_j^{-1}$. The least-squares formulation is nonconvex. To avoid its local minima, a Shor-type convex relaxation squares the dimension of the optimization problem from $O(n)$ to $O(n^2)$. Burer--Monteiro-type nonconvex relaxations have generic landscape guarantees at dimension $O(n^{3/2})$. For smaller relaxations, the problem structure matters. It has been observed in the robotics literature that nonconvex relaxations of only slightly increased dimension seem sufficient for SLAM problems. We partially explain this. This also has implications for Kuramoto oscillators. Specifically, we minimize the least-squares cost function in terms of estimators $Y_1, \ldots, Y_n$. Each $Y_i$ is relaxed to the Stiefel manifold $\mathrm{St}(r, p)$ of $r \times p$ matrices with orthonormal rows. The available measurements implicitly define a (connected) graph $G$ on $n$ vertices. In the noiseless case, we show that second-order critical points are globally optimal as soon as $p \geq r+2$ for all connected graphs $G$. (This implies that Kuramoto oscillators on $\mathrm{St}(r, p)$ synchronize for all $p \geq r + 2$.) This result is the best possible for general graphs; the previous best known result requires $2p \geq 3(r + 1)$. For $p > r + 2$, our result is robust to modest amounts of noise (depending on $p$ and $G$). When local minima remain, they still achieve minimax-optimal error rates. Our proof uses a novel randomized choice of tangent direction to prove (near-)optimality of second-order critical points. Finally, we partially extend our noiseless landscape results to the complex case (unitary group), showing that there are no spurious local minima when $2p \geq 3r$.

3.Stochastic Approximation for Expectation Objective and Expectation Inequality-Constrained Nonconvex Optimization

2307.02943

Authors:Francisco Facchinei, Vyacheslav Kungurtsev

Abstract: Stochastic Approximation has been a prominent set of tools for solving problems with noise and uncertainty. Increasingly, it becomes important to solve optimization problems wherein there is noise in both a set of constraints that a practitioner requires the system to adhere to, as well as the objective, which typically involves some empirical loss. We present the first stochastic approximation approach for solving this class of problems using the Ghost framework of incorporating penalty functions for analysis of a sequential convex programming approach together with a Monte Carlo estimator of nonlinear maps. We provide almost sure convergence guarantees and demonstrate the performance of the procedure on some representative examples.

4.Constraint Programming models for the parallel drone scheduling vehicle routing problem

2307.02980

Authors:Roberto Montemanni, Mauro Dell'Amico

Abstract: Drones are currently seen as a viable way for improving the distribution of parcels in urban and rural environments, while working in coordination with traditional vehicles like trucks. In this paper we consider the parallel drone scheduling vehicle routing problem, where the service of a set of customers requiring a delivery is split between a fleet of trucks and a fleet of drones. We consider two variations of the problem. In the first one the problem is more theoretical, and the target is the minimization of the time required to complete the service and have all the vehicles back to the depot. In the second variant, more realistic constraints involving operating costs, capacity limitation and workload balance are considered, and the target is to minimize the total operational costs. We propose several constraint programming models to deal with the two problems. An experimental campaign on the instances previously adopted in the literature is presented to validate the new solving methods. The results show that on top of being a viable way to solve problems to optimality, the models can also be used to derive effective heuristic solutions and high-quality lower bounds for the optimal cost, if the execution is interrupted before its natural end.

5.Convergence rate of entropy-regularized multi-marginal optimal transport costs

2307.03023

Authors:Luca Nenna, Paul Pegon

Abstract: We investigate the convergence rate of multi-marginal optimal transport costs that are regularized with the Boltzmann-Shannon entropy, as the noise parameter $\varepsilon$ tends to $0$. We establish lower and upper bounds on the difference with the unregularized cost of the form $C\varepsilon\log(1/\varepsilon)+O(\varepsilon)$ for some explicit dimensional constants $C$ depending on the marginals and on the ground cost, but not on the optimal transport plans themselves. Upper bounds are obtained for Lipschitz costs or locally semi-concave costs for a finer estimate, and lower bounds for $\mathcal{C}^2$ costs satisfying some signature condition on the mixed second derivatives that may include degenerate costs, thus generalizing results previously in the two marginals case and for non-degenerate costs. We obtain in particular matching bounds in some typical situations where the optimal plan is deterministic.

6.Exploratory mean-variance portfolio selection with Choquet regularizers

2307.03026

Authors:Junyi Guo, Xia Han, Hao Wang

Abstract: In this paper, we study a continuous-time exploratory mean-variance (EMV) problem under the framework of reinforcement learning (RL), and the Choquet regularizers are used to measure the level of exploration. By applying the classical Bellman principle of optimality, the Hamilton-Jacobi-Bellman equation of the EMV problem is derived and solved explicitly via maximizing statically a mean-variance constrained Choquet regularizer. In particular, the optimal distributions form a location-scale family, whose shape depends on the choices of the Choquet regularizer. We further reformulate the continuous-time Choquet-regularized EMV problem using a variant of the Choquet regularizer. Several examples are given under specific Choquet regularizers that generate broadly used exploratory samplers such as exponential, uniform and Gaussian. Finally, we design an RL algorithm to simulate and compare results under the two different forms of regularizers.

7.Distributed Interior Point Methods for Optimization in Energy Networks

2307.03040

Authors:Alexander Engelmann, Michael Kaupmann, Timm Faulwasser

Abstract: This note discusses an essentially decentralized interior point method, which is well suited for optimization problems arising in energy networks. Advantages of the proposed method are guaranteed and fast local convergence also for problems with non-convex constraints. Moreover, our method exhibits a small communication footprint and it achieves a comparably high solution accuracy with a limited number of iterations, whereby the local subproblems are of low computational complexity. We illustrate the performance of the proposed method on a problem from energy systems, i.e., we consider an optimal power flow problem with 708 buses.

8.Convergence Properties of Newton's Method for Globally Optimal Free Flight Trajectory Optimization

2307.03046

Authors:Ralf Borndörfer, Fabian Danecker, Martin Weiser

Abstract: The algorithmic efficiency of Newton-based methods for Free Flight Trajectory Optimization is heavily influenced by the size of the domain of convergence. We provide numerical evidence that the convergence radius is much larger in practice than what the theoretical worst case bounds suggest. The algorithm can be further improved by a convergence-enhancing domain decomposition.

9.Multiplicative Updates for Online Convex Optimization over Symmetric Cones

2307.03136

Authors:Ilayda Canyakmaz, Wayne Lin, Georgios Piliouras, Antonios Varvitsiotis

Abstract: We study online convex optimization where the possible actions are trace-one elements in a symmetric cone, generalizing the extensively-studied experts setup and its quantum counterpart. Symmetric cones provide a unifying framework for some of the most important optimization models, including linear, second-order cone, and semidefinite optimization. Using tools from the field of Euclidean Jordan Algebras, we introduce the Symmetric-Cone Multiplicative Weights Update (SCMWU), a projection-free algorithm for online optimization over the trace-one slice of an arbitrary symmetric cone. We show that SCMWU is equivalent to Follow-the-Regularized-Leader and Online Mirror Descent with symmetric-cone negative entropy as regularizer. Using this structural result we show that SCMWU is a no-regret algorithm, and verify our theoretical results with extensive experiments. Our results unify and generalize the analysis for the Multiplicative Weights Update method over the probability simplex and the Matrix Multiplicative Weights Update method over the set of density matrices.
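
As background, the classical Multiplicative Weights Update over the probability simplex, which the abstract says SCMWU generalizes, can be sketched as follows. This is the textbook experts-setting update, not SCMWU itself; the function name `multiplicative_weights`, the step size `eta`, and the synthetic loss matrix are assumptions for illustration.

```python
import numpy as np

def multiplicative_weights(losses, eta=0.1):
    """Run the classical MWU update on a (T, n) array of per-expert losses.

    Returns the sequence of distributions played and the final distribution.
    """
    T, n = losses.shape
    w = np.full(n, 1.0 / n)           # start from the uniform distribution
    played = []
    for t in range(T):
        played.append(w.copy())
        w = w * np.exp(-eta * losses[t])  # exponentially down-weight lossy experts
        w = w / w.sum()                   # renormalize: stay on the simplex, no projection step
    return np.array(played), w

# Synthetic example: expert 0 never incurs loss, the others always lose 1.
losses = np.ones((200, 4))
losses[:, 0] = 0.0
played, w_final = multiplicative_weights(losses)
print(w_final)
```

The normalization step is what makes the update projection-free on the simplex; the mass concentrates on the expert with the smallest cumulative loss.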

10.Extreme occupation measures in Markov decision processes with a cemetery

2307.03158

Authors:Alexey Piunovskiy, Yi Zhang

Abstract: In this paper, we consider a Markov decision process (MDP) with a Borel state space $\textbf{X}\cup\{\Delta\}$, where $\Delta$ is an absorbing state (cemetery), and a Borel action space $\textbf{A}$. We consider the space of finite occupation measures restricted on $\textbf{X}\times \textbf{A}$, and the extreme points in it. It is possible that some strategies have infinite occupation measures. Nevertheless, we prove that every finite extreme occupation measure is generated by a deterministic stationary strategy. Then, for this MDP, we consider a constrained problem with total undiscounted criteria and $J$ constraints, where the cost functions are nonnegative. By assumption, the strategies inducing infinite occupation measures are not optimal. Then, our second main result is that, under mild conditions, the solution to this constrained MDP is given by a mixture of no more than $J+1$ occupation measures generated by deterministic stationary strategies.

11.Convergence of the momentum method for semi-algebraic functions with locally Lipschitz gradients

2307.03331

Authors:Cédric Josz, Lexiao Lai, Xiaopeng Li

Abstract: We propose a new length formula that governs the iterates of the momentum method when minimizing differentiable semi-algebraic functions with locally Lipschitz gradients. It enables us to establish local convergence, global convergence, and convergence to local minimizers without assuming global Lipschitz continuity of the gradient, coercivity, and a global growth condition, as is done in the literature. As a result, we provide the first convergence guarantee of the momentum method starting from arbitrary initial points when applied to principal component analysis, matrix sensing, and linear neural networks.
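
To make the setting concrete, here is a minimal sketch of a heavy-ball momentum iteration on $f(x,y)=(xy-1)^2$, a semi-algebraic function whose gradient is locally but not globally Lipschitz. The heavy-ball form, the step size `alpha`, the momentum parameter `beta`, and the starting point are assumptions chosen for illustration and may differ from the exact variant and conditions analyzed in the paper.

```python
import numpy as np

def grad_f(z):
    """Gradient of f(x, y) = (x*y - 1)^2."""
    x, y = z
    r = x * y - 1.0
    return np.array([2.0 * r * y, 2.0 * r * x])

def heavy_ball(z0, alpha=0.05, beta=0.5, iters=2000):
    """Heavy-ball iteration: z_{k+1} = z_k - alpha * grad f(z_k) + beta * (z_k - z_{k-1})."""
    z_prev = np.array(z0, dtype=float)
    z = z_prev.copy()                 # zero initial velocity
    for _ in range(iters):
        z_next = z - alpha * grad_f(z) + beta * (z - z_prev)
        z_prev, z = z, z_next
    return z

z = heavy_ball([1.5, 0.2])
print(z, z[0] * z[1])
```

The minimizers of this function form the hyperbola $xy=1$, so the product of the two coordinates is a convenient quantity to monitor.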

Wed, 05 Jul 2023digest

1.A Mini-Batch Quasi-Newton Proximal Method for Constrained Total-Variation Nonlinear Image Reconstruction

2307.02043

Authors:Tao Hong, Thanh-an Pham, Irad Yavneh, Michael Unser

Abstract: Over the years, computational imaging with accurate nonlinear physical models has drawn considerable interest due to its ability to achieve high-quality reconstructions. However, such nonlinear models are computationally demanding. A popular choice for solving the corresponding inverse problems is accelerated stochastic proximal methods (ASPMs), with the caveat that each iteration is expensive. To overcome this issue, we propose a mini-batch quasi-Newton proximal method (BQNPM) tailored to image-reconstruction problems with total-variation regularization. It involves an efficient approach that computes a weighted proximal mapping at a cost similar to that of the proximal mapping in ASPMs. However, BQNPM requires fewer iterations than ASPMs to converge. We assess the performance of BQNPM on three-dimensional inverse-scattering problems with linear and nonlinear physical models. Our results on simulated and real data show the effectiveness and efficiency of BQNPM.

2.Mixed Leader-Follower Dynamics

2307.02510

Authors:Hsin-Lun Li

Abstract: The original Leader-Follower (LF) model partitions all agents whose opinion is a number in $[-1,1]$ to a follower group, a leader group with a positive target opinion in $[0,1]$ and a leader group with a negative target opinion in $[-1,0]$. A leader group agent has a constant degree to its target and mixes it with the average opinion of its group neighbors at each update. A follower has a constant degree to the average opinion of the opinion neighbors of each leader group and mixes it with the average opinion of its group neighbors at each update. In this paper, we consider a variant of the LF model, namely the mixed model, in which the degrees can vary over time, the opinions can be high dimensional, and the number of leader groups can be more than two. We investigate circumstances under which all agents achieve a consensus. In particular, a few leaders can dominate the whole population.

3.Ill-posed linear inverse problems with box constraints: A new convex optimization approach

2307.03680

Authors:Henryk Gzyl

Abstract: Consider the linear equation $\mathbf{A}\mathbf{x}=\mathbf{y}$, where $\mathbf{A}$ is an $M\times N$-matrix, $\mathbf{x}\in\mathcal{K}\subset \mathbb{R}^N$ and $\mathbf{y}\in\mathbb{R}^M$ a given vector. When $\mathcal{K}$ is a convex set and $M\not= N$ this is a typical ill-posed, linear inverse problem with convex constraints. Here we propose a new way to solve this problem when $\mathcal{K} = \prod_j[a_j,b_j]$. It consists of regarding $\mathbf{A}\mathbf{x}=\mathbf{y}$ as the constraint of a convex minimization problem, in which the objective (cost) function is the dual of a moment generating function. This leads to a nice minimization problem and some interesting comparison results. More importantly, the method provides a solution that lies in the interior of the constraint set $\mathcal{K}$. We also analyze the dependence of the solution on the data and relate it to the Le Chatelier principle.

4.From NeurODEs to AutoencODEs: a mean-field control framework for width-varying Neural Networks

2307.02279

Authors:Cristina Cipriani, Massimo Fornasier, Alessandro Scagliotti

Abstract: In our work, we build upon the established connection between Residual Neural Networks (ResNets) and continuous-time control systems known as NeurODEs. By construction, NeurODEs have been limited to constant-width layers, making them unsuitable for modeling deep learning architectures with width-varying layers. In this paper, we propose a continuous-time Autoencoder, which we call AutoencODE, and we extend to this case the mean-field control framework already developed for usual NeurODEs. In this setting, we tackle the case of low Tikhonov regularization, resulting in potentially non-convex cost landscapes. While the global results obtained for high Tikhonov regularization may not hold globally, we show that many of them can be recovered in regions where the loss function is locally convex. Inspired by our theoretical findings, we develop a training method tailored to this specific type of Autoencoders with residual connections, and we validate our approach through numerical experiments conducted on various examples.

5.Extended team orienteering problem: Algorithms and applications

2307.02397

Authors:Wen Ji, Ke Han, Qian Ge

Abstract: The team orienteering problem (TOP) determines a set of routes, each within a time or distance budget, which collectively visit a set of points of interest (POIs) such that the total score collected at those visited points is maximized. This paper proposes an extension of the TOP (ETOP) by allowing the POIs to be visited multiple times to accumulate scores. Such an extension is necessary for application scenarios like urban sensing where each POI needs to be continuously monitored, or disaster relief where certain locations need to be repeatedly covered. We present two approaches to solve the ETOP, one based on the adaptive large neighborhood search (ALNS) algorithm and the other being a bi-level matheuristic method. Sensitivity analyses are performed to fine-tune the algorithm parameters. Test results on complete graphs with different problem sizes show that: (1) both algorithms significantly outperform a greedy heuristic, with improvements ranging from 9.43% to 27.68%; and (2) while the ALNS-based algorithm slightly outperforms the matheuristic in terms of solution optimality, the latter is far more computationally efficient, being 11 to 385 times faster. Finally, a real-world case study of VOCs sensing is presented and formulated as ETOP on a road network (incomplete graph), where the ALNS is outperformed by the matheuristic in terms of optimality, as the destroy and repair operators yield limited perturbation of existing solutions when constrained by a road network.

6.QUBO.jl: A Julia Ecosystem for Quadratic Unconstrained Binary Optimization

2307.02577

Authors:Pedro Maciel Xavier, Pedro Ripper, Tiago Andrade, Joaquim Dias Garcia, Nelson Maculan, David E. Bernal Neira

Abstract: We present QUBO.jl, an end-to-end Julia package for working with QUBO (Quadratic Unconstrained Binary Optimization) instances. This tool aims to convert a broad range of JuMP problems for straightforward application in many physics and physics-inspired solution methods whose standard optimization form is equivalent to the QUBO. These methods include quantum annealing, quantum gate-circuit optimization algorithms (Quantum Optimization Alternating Ansatz, Variational Quantum Eigensolver), other hardware-accelerated platforms, such as Coherent Ising Machines and Simulated Bifurcation Machines, and more traditional methods such as simulated annealing. Besides working with reformulations, QUBO.jl allows its users to interface with the aforementioned hardware, sending QUBO models in various file formats and retrieving results for subsequent analysis. QUBO.jl was written as a JuMP / MathOptInterface (MOI) layer that automatically maps between the input and output frames, thus providing a smooth modeling experience.
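
For readers unfamiliar with the problem class, a QUBO instance asks for a binary vector $x\in\{0,1\}^n$ minimizing $x^\top Q x$. The sketch below enumerates all assignments for a tiny instance. It is written in Python, does not use or imitate the QUBO.jl API, and the function name `solve_qubo_bruteforce` and the example matrix are assumptions for illustration only.

```python
import itertools
import numpy as np

def solve_qubo_bruteforce(Q):
    """Minimize x^T Q x over x in {0,1}^n by exhaustive enumeration.

    Only practical for very small n, since the loop visits 2^n assignments.
    """
    Q = np.asarray(Q, dtype=float)
    n = Q.shape[0]
    best_x, best_val = None, float("inf")
    for bits in itertools.product((0, 1), repeat=n):
        x = np.array(bits)
        val = float(x @ Q @ x)
        if val < best_val:            # keep the first strict improvement
            best_x, best_val = x, val
    return best_x, best_val

# Objective: -x1 - x2 - x3 + 2*x1*x2 + 2*x2*x3 (upper-triangular encoding)
Q = [[-1, 2, 0],
     [0, -1, 2],
     [0, 0, -1]]
x_opt, val_opt = solve_qubo_bruteforce(Q)
print(x_opt, val_opt)
```

Exhaustive search is useful mainly as a ground-truth check on small instances before handing a model to an annealer or heuristic sampler.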

7.AI4OPT: AI Institute for Advances in Optimization

2307.02671

Authors:Pascal Van Hentenryck, Kevin Dalmeijer

Abstract: This article is a short introduction to AI4OPT, the NSF AI Institute for Advances in Optimization. AI4OPT fuses AI and Optimization, inspired by end-use cases in supply chains, energy systems, chip design and manufacturing, and sustainable food systems. AI4OPT also applies its "teaching the teachers" philosophy to provide longitudinal educational pathways in AI for engineering.

Tue, 04 Jul 2023digest

1.Accelerated stochastic approximation with state-dependent noise

2307.01497

Authors:Sasila Ilandarideva, Anatoli Juditsky, Guanghui Lan, Tianjiao Li

Abstract: We consider a class of stochastic smooth convex optimization problems under rather general assumptions on the noise in the stochastic gradient observation. As opposed to the classical problem setting in which the variance of noise is assumed to be uniformly bounded, herein we assume that the variance of stochastic gradients is related to the "sub-optimality" of the approximate solutions delivered by the algorithm. Such problems naturally arise in a variety of applications, in particular, in the well-known generalized linear regression problem in statistics. However, to the best of our knowledge, none of the existing stochastic approximation algorithms for solving this class of problems attain optimality in terms of the dependence on accuracy, problem parameters, and mini-batch size. We discuss two non-Euclidean accelerated stochastic approximation routines--stochastic accelerated gradient descent (SAGD) and stochastic gradient extrapolation (SGE)--which carry a particular duality relationship. We show that both SAGD and SGE, under appropriate conditions, achieve the optimal convergence rate, attaining the optimal iteration and sample complexities simultaneously. However, corresponding assumptions for the SGE algorithm are more general; they allow, for instance, for efficient application of the SGE to statistical estimation problems under heavy tail noises and discontinuous score functions. We also discuss the application of the SGE to problems satisfying quadratic growth conditions, and show how it can be used to recover sparse solutions. Finally, we report on some simulation experiments to illustrate numerical performance of our proposed algorithms in high-dimensional settings.

2.Exponential stability of Euler-Bernoulli beam under boundary controls in rotation and angular velocity

2307.01518

Authors:Alemdar Hasanov

Abstract: This paper addresses the analysis of a boundary feedback system involving a non-homogeneous Euler-Bernoulli beam governed by the equation $m(x)u_{tt}+\mu(x)u_{t}+\left(r(x)u_{xx}\right)_{xx}=0$, subject to the initial conditions $u(x,0)=u_0(x)$, $u_t(x,0)=v_0(x)$ and boundary conditions $u(0,t)=0$, $\left (-r(x)u_{xx}(x,t)\right )_{x=0}=-k^{-}_r u_{x}(0,t)-k^{-}_a u_{xt}(0,t)$, $u(\ell,t)=0$, $\left (-r(x)u_{xx}(x,t)\right )_{x=\ell}=-k^{+}_r u_{x}(\ell,t)-k^{+}_a u_{xt}(\ell,t)$, with boundary control at both ends resulting from the rotation and angular velocity. The approach proposed in this study relies on the utilization of regular weak solutions, energy identity, and a physically motivated Lyapunov function. By imposing natural assumptions concerning physical parameters and other inputs, which ensure the existence of a regular weak solution, we successfully derive a uniform exponential decay estimate for the system's energy. The decay rate constant featured in this estimate is solely dependent on the physical and geometric properties of the beam. These properties encompass crucial parameters such as the viscous external damping coefficient $\mu(x)$, as well as the boundary springs $k^{-}_r,k^+_r $ and dampers $k^{-}_a,k^+_a$. To illustrate the practical effectiveness of our theoretical findings, numerical examples are provided. These examples serve to demonstrate the applicability and relevance of our derived results in real-world scenarios.

3.Strong stability of convexity with respect to the perimeter

2307.01633

Authors:Alessio Figalli, Yi Ru-Ya Zhang

Abstract: Let $E\subset \mathbb R^n$, $n\ge 2$, be a set of finite perimeter with $|E|=|B|$, where $B$ denotes the unit ball. When $n=2$, since convexification decreases perimeter (in the class of open connected sets), it is easy to prove the existence of a convex set $F$, with $|E|=|F|$, such that $$ P(E) - P(F) \ge c\,|E\Delta F|, \qquad c>0. $$ Here we prove that, when $n\ge 3$, there exists a convex set $F$, with $|E|=|F|$, such that $$ P(E) - P(F) \ge c(n) \,f\big(|E\Delta F|\big), \qquad c(n)>0,\qquad f(t)=\frac{t}{|\log t|} \text{ for }t \ll 1. $$ Moreover, one can choose $F$ to be a small $C^2$-deformation of the unit ball. Furthermore, this estimate is essentially sharp as we can show that the inequality above fails for $f(t)=t.$ Interestingly, the proof of our result relies on a new stability estimate for Alexandrov's Theorem on constant mean curvature sets.

4.Decentralized optimization with affine constraints over time-varying networks

2307.01655

Authors:Demyan Yarmoshik, Alexander Rogozin, Alexander Gasnikov

Abstract: The decentralized optimization paradigm assumes that each term of a finite-sum objective is privately stored by the corresponding agent. Agents are only allowed to communicate with their neighbors in the communication graph. We consider the case when the agents additionally have local affine constraints and the communication graph can change over time. We provide the first linearly convergent decentralized algorithm for time-varying networks by generalizing the optimal decentralized algorithm ADOM to the case of affine constraints. We show that its rate of convergence is optimal for first-order methods by providing the lower bounds for the number of communications and oracle calls.

5.Wasserstein medians: robustness, PDE characterization and numerics

2307.01765

Authors:Guillaume Carlier, Enis Chenchene, Katharina Eichinger

Abstract: We investigate the notion of Wasserstein median as an alternative to the Wasserstein barycenter, which has become popular but may be sensitive to outliers. In terms of robustness to corrupted data, we indeed show that Wasserstein medians have a breakdown point of approximately $\frac{1}{2}$. We give explicit constructions of Wasserstein medians in dimension one which enable us to obtain $L^p$ estimates (which do not hold in higher dimensions). We also address dual and multimarginal reformulations. In convex subsets of $\mathbb{R}^d$, we connect Wasserstein medians to a minimal (multi) flow problem à la Beckmann and a system of PDEs of Monge-Kantorovich-type, for which we propose a $p$-Laplacian approximation. Our analysis eventually leads to a new numerical method to compute Wasserstein medians, which is based on a Douglas-Rachford scheme applied to the minimal flow formulation of the problem.

6.Assessing the impact of Higher Order Network Structure on Tightness of OPF Relaxation

2307.01931

Authors:Nafis Sadik, Mohammad Rasoul Narimani

Abstract: AC optimal power flow (AC OPF) is a fundamental problem in power system operation and control. Accurately modeling the network physics via the AC power flow equations makes AC OPF a challenging nonconvex problem that results in significant computational challenges. To search for global optima, recent research has developed a variety of convex relaxations to bound the optimal objective values of AC OPF problems. However, the quality of these bounds varies for different test cases, suggesting that OPF problems exhibit a range of difficulties. Understanding this range of difficulty is helpful for improving relaxation algorithms. Power grids are naturally represented as graphs, with buses as nodes and power lines as edges. Graph theory offers various methods to measure power grid graphs, enabling researchers to characterize system structure and optimize algorithms. Leveraging graph theory-based algorithms, this paper presents an empirical study aiming to find correlations between optimality gaps and local structures in the underlying test case's graph. Network graphlets, which are induced subgraphs of a network, are used to investigate the correlation between power system topology and OPF relaxation tightness. Specifically, this paper examines how the existence of particular graphlets that are either too frequent or infrequent in the power system graph affects the tightness of the OPF convex relaxation. Numerous test cases are analyzed from a local structural perspective to establish a correlation between their topology and their OPF convex relaxation tightness.

7.Impact of Higher-Order Structures in Power Grids' Graph on Line Outage Distribution Factor

2307.01949

Authors:Nafis Sadik, Mohammad Rasoul Narimani

Abstract: Power systems often include a specific set of lines that are crucial for the regular operations of the grid. Identifying the reasons behind the criticality of these lines is an important challenge in power system studies. When a line fails, the line outage distribution factor (LODF) quantifies the changes in power flow on the remaining lines. This paper proposes a network analysis from a local structural perspective to investigate the impact of local structural patterns in the underlying graph of power systems on the LODF of individual lines. In particular, we focus on graphlet analysis to determine the local structural properties of each line. This research analyzes potential connections between specific graphlets and the most critical lines based on their LODF. In this regard, we investigate N-1 and N-2 contingency analysis for various test cases and identify the lines that have the greatest impact on the LODFs of other lines. We then determine which subgraphs contain the most significant lines. Our findings reveal that the most critical lines often belong to subgraphs with a less meshed but more radial structure. These findings are further validated through various test cases. Particularly, it is observed that networks with a higher percentage of ring or meshed subgraphs on their most important line (based on LODF) experience a lower LODF when that critical line is subject to an outage. Additionally, we investigate how the LODF of the most critical line varies among different test cases and examine the subgraph characteristics of those critical lines.

Mon, 03 Jul 2023digest

1.Quantifying Distributional Model Risk in Marginal Problems via Optimal Transport

2307.00779

Authors:Yanqin Fan, Hyeonseok Park, Gaoqian Xu

Abstract: This paper studies distributional model risk in marginal problems, where each marginal measure is assumed to lie in a Wasserstein ball centered at a fixed reference measure with a given radius. Theoretically, we establish several fundamental results including strong duality, finiteness of the proposed Wasserstein distributional model risk, and the existence of an optimizer at each radius. In addition, we show continuity of the Wasserstein distributional model risk as a function of the radius. Using strong duality, we extend the well-known Makarov bounds for the distribution function of the sum of two random variables with given marginals to Wasserstein distributionally robust Makarov bounds. Practically, we illustrate our results on four distinct applications when the sample information comes from multiple data sources and only some marginal reference measures are identified. They are: partial identification of treatment effects; externally valid treatment choice via robust welfare functions; Wasserstein distributionally robust estimation under data combination; and evaluation of the worst aggregate risk measures.

2.Variational theory and algorithms for a class of asymptotically approachable nonconvex problems

2307.00780

Authors:Hanyang Li, Ying Cui

Abstract: We investigate a class of composite nonconvex functions, where the outer function is the sum of univariate extended-real-valued convex functions and the inner function is the limit of difference-of-convex functions. A notable feature of this class is that the inner function can be merely lower semicontinuous instead of continuous. It covers a range of important yet challenging applications, including the composite value functions of nonlinear programs, the weighted value-at-risk for continuously distributed random variables, and composite rank functions. We propose an asymptotic decomposition of the composite function that guarantees epi-convergence to the original function, leading to necessary optimality conditions for the corresponding minimization problems. The proposed decomposition also enables us to design a numerical algorithm that is provably convergent to a point satisfying the newly introduced optimality conditions. These results expand on the study of so-called amenable functions introduced by Poliquin and Rockafellar in 1992, which are compositions of convex functions with smooth maps, and the prox-linear methods for their minimization.

3.Monte Carlo Policy Gradient Method for Binary Optimization

2307.00783

Authors:Cheng Chen, Ruitao Chen, Tianyou Li, Ruichen Ao, Zaiwen Wen

Abstract: Binary optimization has a wide range of applications in combinatorial optimization problems such as MaxCut, MIMO detection, and MaxSAT. However, these problems are typically NP-hard due to the binary constraints. We develop a novel probabilistic model to sample the binary solution according to a parameterized policy distribution. Specifically, minimizing the KL divergence between the parameterized policy distribution and the Gibbs distributions of the function value leads to a stochastic optimization problem whose policy gradient can be derived explicitly similar to reinforcement learning. For coherent exploration in discrete spaces, parallel Markov Chain Monte Carlo (MCMC) methods are employed to sample from the policy distribution with diversity and approximate the gradient efficiently. We further develop a filter scheme to replace the original objective function by the one with the local search technique to broaden the horizon of the function landscape. Convergence to stationary points in expectation of the policy gradient method is established based on the concentration inequality for MCMC. Numerical results show that this framework is very promising to provide near-optimal solutions for quite a few binary optimization problems.

4.Global stabilization of sterile insect technique model by feedback laws

2307.00846

Authors:Kala Agbo Bidi (LJLL), Luis Almeida (LJLL), Jean-Michel Coron (LJLL)

Abstract: The Sterile Insect Technique or SIT is presently one of the most ecological methods for controlling insect pests responsible for disease transmission or crop destruction worldwide. This technique consists of releasing sterile males into the insect pest population. This approach aims at reducing fertility in the population and, consequently, significantly reducing the native insect population after a few generations. In this work, we study the global stabilization of a pest population at extinction equilibrium by the SIT method. We construct explicit feedback laws that stabilize the model and perform numerical simulations to show the efficiency of our feedback laws. The different feedback laws are also compared taking into account their possible implementation in field interventions.

5.Minimal-time nonlinear control via semi-infinite programming

2307.00857

Authors:Antoine Oustry (OptimiX, LIX, ENPC), Matteo Tacchi (GIPSA-lab)

Abstract: We address the problem of computing a control for a time-dependent nonlinear system to reach a target set in a minimal time. To solve this minimal time control problem, we introduce a hierarchy of linear semi-infinite programs, the values of which converge to the value of the control problem. These semi-infinite programs are increasing restrictions of the dual of the nonlinear control problem, which is a maximization problem over the subsolutions of the Hamilton-Jacobi-Bellman (HJB) equation. Our approach is compatible with generic dynamical systems and state constraints. Specifically, we use an oracle that, for a given differentiable function, returns a point at which the function violates the HJB inequality. We solve the semi-infinite programs using a classical convex optimization algorithm with a convergence rate of O(1/k), where k is the number of calls to the oracle. This algorithm yields subsolutions of the HJB equation that approximate the value function and provide a lower bound on the optimal time. We study the closed-loop control built on the obtained approximate value functions, and we give theoretical guarantees on its performance depending on the approximation error for the value function. We show promising numerical results for three non-polynomial systems with up to 6 state variables and 5 control variables.

6.Coefficient Control of Variational Inequalities

2307.00869

Authors:Andreas Hehl, Denis Khimin, Ira Neitzel, Nicolai Simon, Thomas Wick, Winnifried Wollner

Abstract: Within this chapter, we discuss control in the coefficients of an obstacle problem. Utilizing tools from H-convergence, we show existence of optimal solutions. First order necessary optimality conditions are obtained after deriving directional differentiability of the coefficient to solution mapping for the obstacle problem. Further, considering a regularized obstacle problem as a constraint yields a limiting optimality system after proving strong convergence of the regularized control and state variables. Numerical examples underline convergence with respect to the regularization. Finally, some numerical experiments highlight the possible extension of the results to coefficient control in phase-field fracture.

7.On the stochastic inventory problem under order capacity constraints

2307.00942

Authors:Roberto Rossi, Zhen Chen, S. Armagan Tarim

Abstract: We consider the single-item single-stocking location stochastic inventory system under a fixed ordering cost component. A long-standing problem is that of determining the structure of the optimal control policy when this system is subject to order quantity capacity constraints; to date, only partial characterisations of the optimal policy have been discussed. An open question is whether a policy with a single continuous interval over which ordering is prescribed is optimal for this problem. Under the so-called "continuous order property" conjecture, we show that the optimal policy takes the modified multi-$(s,S)$ form. Moreover, we provide a numerical counterexample in which the continuous order property is violated, and hence show that a modified multi-$(s,S)$ policy is not optimal in general. However, in an extensive computational study, we show that instances violating the continuous order property are extremely rare in practice, and that the plans generated by a modified multi-$(s,S)$ policy can therefore be considered, for all practical purposes, optimal. Finally, we show that a modified $(s,S)$ policy also performs well in practice.
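As a rough illustration of the simplest policy named in this abstract, the sketch below implements a modified $(s,S)$ ordering rule: when inventory falls below the reorder point s, order up to S or as close to S as the order capacity allows. The function names, the zero lead time, the backlogging of unmet demand, and the deterministic demand sequence are all assumptions made for the example and are not taken from the paper; the multi-$(s,S)$ form is not covered.

```python
def modified_sS_order(inventory, s, S, capacity):
    """Order quantity under a capacity-constrained (s, S) rule (illustrative)."""
    if inventory >= s:
        return 0  # at or above the reorder point: do not order
    # Below the reorder point: raise inventory toward S, limited by capacity.
    return min(S - inventory, capacity)


def simulate(initial_inventory, demands, s, S, capacity):
    """Apply the rule period by period, assuming zero lead time and backlogging."""
    inventory = initial_inventory
    orders = []
    for demand in demands:
        quantity = modified_sS_order(inventory, s, S, capacity)
        orders.append(quantity)
        inventory = inventory + quantity - demand  # negative value = backlog
    return orders, inventory


# Hypothetical instance: reorder point 5, order-up-to level 20, capacity 10.
orders, final_inventory = simulate(0, [4, 6, 3, 8], s=5, S=20, capacity=10)
print(orders, final_inventory)
```

In this instance the capacity binds in every ordering period, so the order-up-to level S is never reached; that is the situation in which the capacitated problem departs from the classical uncapacitated one.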

8.Fast Convergence of Inertial Multiobjective Gradient-like Systems with Asymptotic Vanishing Damping

2307.00975

Authors:Konstantin Sonntag, Sebastian Peitz

Abstract: We present a new gradient-like dynamical system related to unconstrained convex smooth multiobjective optimization which involves inertial effects and asymptotic vanishing damping. To the best of our knowledge, this system is the first inertial gradient-like system for multiobjective optimization problems including asymptotic vanishing damping, expanding the ideas laid out in [H. Attouch and G. Garrigos, Multiobjective optimization: an inertial approach to Pareto optima, preprint, arXiv:1506.02823, 201]. We prove existence of solutions to this system in finite dimensions and further prove that its bounded solutions converge weakly to weakly Pareto optimal points. In addition, we obtain a convergence rate of order $O(t^{-2})$ for the function values measured with a merit function. This approach presents a good basis for the development of fast gradient methods for multiobjective optimization.

9.Feasibility problems via paramonotone operators in a convex setting

2307.00979

Authors:J. Camacho, M. J. Cánovas, J. E. Martínez-Legaz, J. Parra

Abstract: This paper is focused on some properties of paramonotone operators on Banach spaces and their application to certain feasibility problems for convex sets in a Hilbert space and convex systems in the Euclidean space. In particular, it shows that operators that are simultaneously paramonotone and bimonotone are constant on their domains, and this fact is applied to tackle two particular situations. The first one, closely related to simultaneous projections, deals with a finite amount of convex sets with an empty intersection and tackles the problem of finding the smallest perturbations (in the sense of translations) of these sets to reach a nonempty intersection. The second is focused on the distance to feasibility; specifically, given an inconsistent convex inequality system, our goal is to compute/estimate the smallest right-hand side perturbations that reach feasibility. We advance that this work derives lower and upper estimates of such a distance, which become the exact value when confined to linear systems.

10.Stochastic Recursive Optimal Control of McKean-Vlasov Type: A Viscosity Solution Approach

2307.00983

Authors:Liangquan Zhang

Abstract: In this paper, we study a kind of optimal control problem for forward-backward stochastic differential equations (FBSDEs for short) of McKean--Vlasov type via the dynamic programming principle (DPP for short) motivated by studying the infinite dimensional Hamilton--Jacobi--Bellman (HJB for short) equation derived from the decoupling field of the FBSDEs posed by Carmona and Delarue (\emph{Ann Probab}, 2015, \cite{cd15}). At the beginning, by considering the cost functional defined by the backward component of the solution of the controlled FBSDEs alluded to earlier, on one hand, we can prove the value function is a deterministic function with respect to the initial random variable; on the other hand, we can show that the value function is \emph{law-invariant}, i.e., it depends on the initial random variable only via its distribution, by virtue of the BSDE property. Afterward, utilizing the notion of differentiability with respect to probability measures introduced by P.L. Lions \cite{Lions2012}, we are able to establish a DPP for the value function in the Wasserstein space of probability measures based on the application of the BSDE approach, particularly, employing the notion of stochastic \emph{backward semigroups} associated with stochastic optimal control problems and It\^{o}'s formula along a flow property of the conditional law of the controlled forward state process. We prove that the value function is the unique viscosity solution of the associated generalized HJB equations in some separable Hilbert space. Finally, as an application, we formulate an optimal control problem for linear stochastic differential equations with quadratic cost functionals of McKean-Vlasov type under nonlinear expectation, $g$-expectation introduced by Peng \cite{Peng04} and derive the optimal feedback control explicitly by means of several groups of Riccati equations.

11.Incomplete Information Linear-Quadratic Mean-Field Games and Related Riccati Equations

2307.01005

Authors:Min Li, Tianyang Nie, Shunjun Wang, Ke Yan

Abstract: We study a class of linear-quadratic mean-field games with incomplete information. For each agent, the state is given by a linear forward stochastic differential equation with common noise. Moreover, both the state and control variables can enter the diffusion coefficients of the state equation. We deduce the open-loop adapted decentralized strategies and feedback decentralized strategies by mean-field forward-backward stochastic differential equation and Riccati equations, respectively. The well-posedness of the corresponding consistency condition system is obtained and the limiting state-average turns out to be the solution of a mean-field stochastic differential equation driven by common noise. We also verify the $\varepsilon$-Nash equilibrium property of the decentralized control strategies. Finally, a network security problem is studied to illustrate our results as an application.

12.Hoffman constant of the argmin mapping in linear optimization

2307.01034

Authors:J. Camacho, M. J. Cánovas, H. Gfrerer, J. Parra

Abstract: The main contribution of this paper consists of providing an explicit formula to compute the Hoffman constant of the argmin mapping in linear optimization. The work is developed in the context of right-hand side perturbations of the constraint system as the Hoffman constant is always infinite when we perturb the objective function coefficients, unless the left-hand side of the constraints reduces to zero. In our perturbation setting, the argmin mapping is a polyhedral mapping whose graph is the union of convex polyhedral sets which assemble in such a nice way that global measures of the stability (Hoffman constants) can be computed through semilocal and local ones (such as Lipschitz upper semicontinuity and calmness moduli, whose computation has been developed in previous works). Indeed, we isolate this nice behavior of the graph in the concept of well-connected polyhedral mappings and, in a first step, the paper focuses on the Hoffman constant for these multifunctions. When confined to the optimal set, some specifics on directional stability are also presented.

13.Synthesizing Control Laws from Data using Sum-of-Squares Optimization

2307.01089

Authors:Jason J. Bramburger, Steven Dahdah, James Richard Forbes

Abstract: The control Lyapunov function (CLF) approach to nonlinear control design is well established. Moreover, when the plant is control affine and polynomial, sum-of-squares (SOS) optimization can be used to find a polynomial controller as a solution to a semidefinite program. This letter considers the use of data-driven methods to design a polynomial controller by leveraging Koopman operator theory, CLFs, and SOS optimization. First, Extended Dynamic Mode Decomposition (EDMD) is used to approximate the Lie derivative of a given CLF candidate with polynomial lifting functions. Then, the polynomial Koopman model of the Lie derivative is used to synthesize a polynomial controller via SOS optimization. The result is a flexible data-driven method that skips the intermediary process of system identification and can be applied widely to control problems. The proposed approach is used to successfully synthesize a controller to stabilize an inverted pendulum on a cart.

14.Analyzing and Improving Greedy 2-Coordinate Updates for Equality-Constrained Optimization via Steepest Descent in the 1-Norm

2307.01169

Authors:Amrutha Varshini Ramesh, Aaron Mishkin, Mark Schmidt, Yihan Zhou, Jonathan Wilder Lavington, Jennifer She

Abstract: We consider minimizing a smooth function subject to a summation constraint over its variables. By exploiting a connection between the greedy 2-coordinate update for this problem and equality-constrained steepest descent in the 1-norm, we give a convergence rate for greedy selection under a proximal Polyak-Lojasiewicz assumption that is faster than random selection and independent of the problem dimension $n$. We then consider minimizing with both a summation constraint and bound constraints, as arises in the support vector machine dual problem. Existing greedy rules for this setting either guarantee trivial progress only or require $O(n^2)$ time to compute. We show that bound- and summation-constrained steepest descent in the L1-norm guarantees more progress per iteration than previous rules and can be computed in only $O(n \log n)$ time.
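A minimal sketch of the greedy 2-coordinate update for the summation-constrained case may help make the first part of this abstract concrete. It assumes a standard form of the rule: pick the coordinates with the largest and smallest partial derivatives and move mass between them with a step sized by a smoothness constant L. The function names, the quadratic test objective, and the value L = 1 are illustrative assumptions, and the bound-constrained variant with its $O(n \log n)$ rule is not shown.

```python
import numpy as np


def greedy_two_coordinate_step(x, grad, L):
    """One greedy update that keeps sum(x) unchanged (illustrative sketch)."""
    i = int(np.argmax(grad))  # coordinate to decrease
    j = int(np.argmin(grad))  # coordinate to increase
    # Minimizes the quadratic upper bound along the direction e_j - e_i.
    delta = (grad[i] - grad[j]) / (2.0 * L)
    x_new = x.copy()
    x_new[i] -= delta
    x_new[j] += delta
    return x_new


def minimize_with_sum_constraint(grad_f, x0, L, iters=200):
    """Repeat the greedy step; the starting point fixes the value of sum(x)."""
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(iters):
        x = greedy_two_coordinate_step(x, grad_f(x), L)
    return x


# Hypothetical test problem: f(x) = 0.5 * ||x - c||^2 subject to sum(x) = 0.
c = np.array([3.0, -1.0, 0.5, 2.0])
x_star = minimize_with_sum_constraint(lambda x: x - c, np.zeros(4), L=1.0)
print(x_star, x_star.sum())
```

For this quadratic the constrained minimizer is c minus its mean, so the iterates can be checked against a closed form; each step equalizes the two selected partial derivatives while leaving the sum of the variables untouched.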

15.A numerical algorithm for attaining the Chebyshev bound in optimal learning

2307.01304

Authors:Pradyumna Paruchuri, Debasish Chatterjee

Abstract: Given a compact subset of a Banach space, the Chebyshev center problem consists of finding a minimal circumscribing ball containing the set. In this article we establish a numerically tractable algorithm for solving the Chebyshev center problem in the context of optimal learning from a finite set of data points. For a hypothesis space realized as a compact but not necessarily convex subset of a finite-dimensional subspace of some underlying Banach space, this algorithm computes the Chebyshev radius and the Chebyshev center of the hypothesis space, thereby solving the problem of optimal recovery of functions from data. The algorithm itself is based on, and significantly extends, recent results for near-optimal solutions of convex semi-infinite problems by means of targeted sampling, and it is of independent interest. Several examples of numerical computations of Chebyshev centers are included in order to illustrate the effectiveness of the algorithm.

16.A geometric framework for discrete time port-Hamiltonian systems

2307.01351

Authors:Karim Cherifi, Hannes Gernandt, Dorothea Hinsen, Volker Mehrmann

Abstract: Port-Hamiltonian systems provide an energy-based formulation with a model class that is closed under structure preserving interconnection. For continuous-time systems these interconnections are constructed by geometric objects called Dirac structures. In this paper, we derive this geometric formulation and the interconnection properties for scattering passive discrete-time port-Hamiltonian systems.

Fri, 30 Jun 2023digest

1.Calm local optimality for nonconvex-nonconcave minimax problems

2306.17443

Authors:Xiaoxiao Ma, Wei Yao, Jane J. Ye, Jin Zhang

Abstract: Nonconvex-nonconcave minimax problems have found numerous applications in various fields including machine learning. However, questions remain about what is a good surrogate for local minimax optimum and how to characterize the minimax optimality. Recently Jin, Netrapalli, and Jordan (ICML 2020) introduced a concept of local minimax point and derived optimality conditions for the smooth and unconstrained case. In this paper, we introduce the concept of calm local minimax point, which is a local minimax point with a calm radius function. With the extra calmness property we obtain first and second-order sufficient and necessary optimality conditions for a very general class of nonsmooth nonconvex-nonconcave minimax problems. Moreover we show that the calm local minimax optimality and the local minimax optimality coincide under a weak sufficient optimality condition for the maximization problem. This equivalence allows us to derive stronger optimality conditions under weaker assumptions for local minimax optimality.

2.Impulse control with generalised discounting

2306.17448

Authors:Damian Jelito, Łukasz Stettner

Abstract: In this paper, we investigate the effects of applying generalised (non-exponential) discounting on a long-run impulse control problem for a Feller-Markov process. We show that the optimal value of the discounted problem is the same as the optimal value of its undiscounted version. Next, we prove that an optimal strategy for the undiscounted discrete time functional is also optimal for the discrete-time discounted criterion and nearly optimal for the continuous-time discounted one. This shows that the discounted problem, being time-inconsistent in nature, admits a time-consistent solution. Also, instead of a complex time-dependent Bellman equation one may consider its simpler time-independent version.

3.An Oblivious Stochastic Composite Optimization Algorithm for Eigenvalue Optimization Problems

2306.17470

Authors:Clément Lezane, Cristóbal Guzmán, Alexandre d'Aspremont

Abstract: In this work, we revisit the problem of solving large-scale semidefinite programs using randomized first-order methods and stochastic smoothing. We introduce two oblivious stochastic mirror descent algorithms based on a complementary composite setting. One algorithm is designed for non-smooth objectives, while an accelerated version is tailored for smooth objectives. Remarkably, both algorithms work without prior knowledge of the Lipschitz constant or smoothness of the objective function. For the non-smooth case with $\mathcal{M}-$bounded oracles, we prove a convergence rate of $ O( {\mathcal{M}}/{\sqrt{T}} ) $. For the $L$-smooth case with a feasible set bounded by $D$, we derive a convergence rate of $ O( {L^2 D^2}/{(T^{2}\sqrt{T})} + {(D_0^2+\sigma^2)}/{\sqrt{T}} )$, where $D_0$ is the starting distance to an optimal solution, and $ \sigma^2$ is the stochastic oracle variance. These rates had only been obtained so far by either assuming prior knowledge of the Lipschitz constant or the starting distance to an optimal solution. We further show how to extend our framework to relative scale and demonstrate the efficiency and robustness of our methods on large scale semidefinite programs.

4.Convergence property of the Quantized Distributed Gradient descent with constant stepsizes and an effective strategy for the stepsize selection

2306.17481

Authors:Woocheol Choi, Myeong-Su Lee

Abstract: In this paper, we establish new convergence results for the quantized distributed gradient descent and suggest a novel strategy for choosing the stepsizes to improve the performance of the algorithm. Under the strong convexity assumption on the aggregate cost function and the smoothness assumption on each local cost function, we prove that the algorithm converges exponentially fast to a small neighborhood of the optimizer whose radius depends on the stepsizes. Based on our convergence result, we suggest an effective selection of stepsizes which repeatedly diminishes the stepsizes after a specific number of iterations. Both the convergence results and the effectiveness of the suggested stepsize selection are also verified by numerical experiments.
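The sketch below illustrates one common form of distributed gradient descent with quantized communication and a constant stepsize; it is not claimed to be the exact scheme analyzed in the paper. Each agent averages the quantized states of its neighbors through a doubly stochastic matrix W and then takes a local gradient step. The uniform rounding quantizer, the ring network, the scalar quadratic local costs, and all names and parameter values are assumptions made for the example.

```python
import numpy as np


def quantize(x, step):
    """Uniform quantizer: round each entry to the nearest multiple of `step`."""
    return step * np.round(x / step)


def quantized_dgd(a, W, alpha, q_step, iters):
    """Agent i holds the local cost f_i(x) = 0.5 * (x - a[i])**2 (illustrative)."""
    x = np.zeros_like(a, dtype=float)
    for _ in range(iters):
        # Mix quantized neighbor states, then step along the local gradient x - a.
        x = W @ quantize(x, q_step) - alpha * (x - a)
    return x


# Hypothetical 4-agent ring with a symmetric, doubly stochastic mixing matrix.
a = np.array([1.0, 2.0, 4.0, 7.0])
W = np.array([
    [0.50, 0.25, 0.00, 0.25],
    [0.25, 0.50, 0.25, 0.00],
    [0.00, 0.25, 0.50, 0.25],
    [0.25, 0.00, 0.25, 0.50],
])
x_final = quantized_dgd(a, W, alpha=0.1, q_step=0.01, iters=500)
print(x_final, x_final.mean())
```

In this toy setting the network average contracts toward the aggregate minimizer, the mean of a, up to an error of at most q_step / (2 * alpha). The individual agent states stay in a stepsize-dependent neighborhood rather than reaching exact consensus, which matches the neighborhood convergence described in the abstract.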

5.Homogeneous Second-Order Descent Framework: A Fast Alternative to Newton-Type Methods

2306.17516

Authors:Chang He, Yuntian Jiang, Chuwen Zhang, Dongdong Ge, Bo Jiang, Yinyu Ye

Abstract: This paper proposes a homogeneous second-order descent framework (HSODF) for nonconvex and convex optimization based on the generalized homogeneous model (GHM). In comparison to the Newton steps, the GHM can be solved by extremal symmetric eigenvalue procedures and thus grants an advantage in ill-conditioned problems. Moreover, GHM extends the ordinary homogeneous model (OHM) to allow adaptiveness in the construction of the aggregated matrix. Consequently, HSODF is able to recover some well-known second-order methods such as trust-region methods and gradient regularized methods while maintaining comparable iteration complexity bounds. We also study two specific realizations of HSODF. One is adaptive HSODM, which has a parameter-free $O(\epsilon^{-3/2})$ global complexity bound for nonconvex second-order Lipschitz continuous functions. The other one is homotopy HSODM, which is proven to have a global linear rate of convergence without strong convexity. The efficiency of our approach on ill-conditioned and high-dimensional problems is justified by some preliminary numerical results.

6.The risk-sensitive optimal stopping problem: geometric solution and algorithms

2306.17623

Authors:Tomasz Kosmala, John Moriarty

Abstract: We use the geometry of functions associated with martingales under a risk measure to solve risk-sensitive Markovian optimal stopping problems. Generalising the risk-neutral case due to Dynkin and Yushkievich (1969), the risk-sensitive value function is the pointwise infimum of those functions which dominate the gain function. The functions are not required to be differentiable, can explode to infinity, and form a three-dimensional set, and in the differentiable case the smooth fit principle holds. Only elementary properties of the driving Markov process $X$ are used. Algorithms are provided to construct the value function, with the computational cost of a two-dimensional search.

7.Convex quartic problems: homogenized gradient method and preconditioning

2306.17683

Authors:Radu-Alexandru Dragomir, Yurii Nesterov

Abstract: We consider a convex minimization problem for which the objective is the sum of a homogeneous polynomial of degree four and a linear term. Such a task arises as a subproblem in algorithms for quadratic inverse problems with a difference-of-convex structure. We design a first-order method called Homogenized Gradient, along with an accelerated version, which enjoy fast convergence rates of respectively $\mathcal{O}(\kappa^2/K^2)$ and $\mathcal{O}(\kappa^2/K^4)$ in relative accuracy, where $K$ is the iteration counter. The constant $\kappa$ is the quartic condition number of the problem. Then, we show that for a certain class of problems, it is possible to compute a preconditioner for which this condition number is $\sqrt{n}$, where $n$ is the problem dimension. To establish this, we study the more general problem of finding the best quadratic approximation of an $\ell_p$ norm composed with a quadratic map. Our construction involves a generalization of the so-called Lewis weights.

8.Algorithms for Shipping Container Delivery Scheduling

2306.17789

Authors:Anna Collins, Dimitrios Letsios, Gueorgui Mihaylov

Abstract: Motivated by distribution problems arising in the supply chain of Haleon, we investigate a discrete optimization problem that we call the "container delivery scheduling problem". The problem models a supplier dispatching ordered products with shipping containers from manufacturing sites to distribution centers, where orders are collected by the buyers at agreed due times. The supplier may expedite or delay item deliveries to reduce transshipment costs at the price of increasing inventory costs, as measured by the number of containers and distribution center storage/backlog costs, respectively. The goal is to compute a delivery schedule attaining good trade-offs between the two. This container delivery scheduling problem is a temporal variant of classic bin packing problems, where the item sizes are not fixed, but depend on the item due times and delivery times. An approach for solving the problem should specify a batching policy for container consolidation and a scheduling policy for deciding when each container should be delivered. Based on the available item due times, we develop algorithms with sequential and nested batching policies as well as on-time and delay-tolerant scheduling policies. We elaborate on the problem's hardness and substantiate the proposed algorithms with positive and negative approximation bounds, including the derivation of an algorithm achieving an asymptotically tight 2-approximation ratio.

9.Accelerating Inexact HyperGradient Descent for Bilevel Optimization

2307.00126

Authors:Haikuo Yang, Luo Luo, Chris Junchi Li, Michael I. Jordan

Abstract: We present a method for solving general nonconvex-strongly-convex bilevel optimization problems. Our method -- the \emph{Restarted Accelerated HyperGradient Descent} (\texttt{RAHGD}) method -- finds an $\epsilon$-first-order stationary point of the objective with $\tilde{\mathcal{O}}(\kappa^{3.25}\epsilon^{-1.75})$ oracle complexity, where $\kappa$ is the condition number of the lower-level objective and $\epsilon$ is the desired accuracy. We also propose a perturbed variant of \texttt{RAHGD} for finding an $\big(\epsilon,\mathcal{O}(\kappa^{2.5}\sqrt{\epsilon}\,)\big)$-second-order stationary point within the same order of oracle complexity. Our results achieve the best-known theoretical guarantees for finding stationary points in bilevel optimization and also improve upon the existing upper complexity bound for finding second-order stationary points in nonconvex-strongly-concave minimax optimization problems, setting a new state-of-the-art benchmark. Empirical studies are conducted to validate the theoretical results in this paper.

10.Convex Optimization in Legged Robots

2307.00156

Authors:Prathamesh Saraf, Mustafa Shaikh, Myron Phan

Abstract: Convex optimization is crucial in controlling legged robots, where stability and optimal control are vital. Many control problems can be formulated as convex optimization problems, with a convex cost function and constraints capturing system dynamics. Our review focuses on active balancing problems and presents a general framework for formulating them as second-order cone programming (SOCP) for robustness and efficiency with existing interior point algorithms. We then discuss some prior work around the Zero Moment Point stability criterion, Linear Quadratic Regulator Control, and then the feedback model predictive control (MPC) approach to improve prediction accuracy and reduce computational costs. Finally, these techniques are applied to stabilize the robot for jumping and landing tasks. Further research in convex optimization of legged robots can have a significant societal impact. It can lead to improved gait planning and active balancing which enhances their ability to navigate complex environments, assist in search and rescue operations and perform tasks in hazardous environments. These advancements have the potential to revolutionize industries and help humans in daily life.

11.Optimal Control of Chromate Removal via Enhanced Modeling using the Method of Moments

2307.00172

Authors:Fred Ghanem, Kirti M. Yenkie

Abstract: Single-use anion-exchange resins can reduce hazardous chromates to safe levels in drinking water. However, since most process control strategies monitor effluent concentrations, detection of any chromate leakage leads to premature resin replacement. Furthermore, variations in the inlet chromate concentration and other process conditions make process control a challenging step. In this work, we capture the uncertainty of the process conditions by applying the Ito process of Brownian motion with drift into a stochastic optimal control strategy. The ion exchange process is modeled using the method of moments which helps capture the process dynamics, later formulated into mathematical objectives representing desired chromate removal. We then solved our developed models as an optimal control problem via Pontryagin's maximum principle. The objectives enabled a successful control via flow rate adjustments leading to higher chromate extraction. Such an approach maximized the capacity of the resin and column efficiency to remove toxic compounds from water while capturing deviations in the process conditions.

Thu, 29 Jun 2023digest

1.Moreau Envelope Based Difference-of-weakly-Convex Reformulation and Algorithm for Bilevel Programs

2306.16761

Authors:Lucy L. Gao, Jane J. Ye, Haian Yin, Shangzhi Zeng, Jin Zhang

Abstract: Recently, Ye et al. (Mathematical Programming 2023) designed an algorithm for solving a specific class of bilevel programs with an emphasis on applications related to hyperparameter selection, utilizing the difference of convex algorithm based on the value function approach reformulation. The proposed algorithm is particularly powerful when the lower level problem is fully convex, such as a support vector machine model or a least absolute shrinkage and selection operator model. In this paper, to suit more applications related to machine learning and statistics, we substantially weaken the underlying assumption from lower level full convexity to weak convexity. Accordingly, we propose a new reformulation using the Moreau envelope of the lower level problem and demonstrate that this reformulation is a difference of weakly convex program. Subsequently, we develop a sequentially convergent algorithm for solving this difference of weakly convex program. To evaluate the effectiveness of our approach, we conduct numerical experiments on the bilevel hyperparameter selection problem from elastic net, sparse group lasso, and RBF kernel support vector machine models.

2.Sampling-Based Approaches for Multimarginal Optimal Transport Problems with Coulomb Cost

2306.16763

Authors:Yukuan Hu, Mengyu Li, Xin Liu, Cheng Meng

Abstract: The multimarginal optimal transport problem with Coulomb cost arises in quantum physics and is vital in understanding strongly correlated quantum systems. Its intrinsic curse of dimensionality can be overcome with a Monge-like ansatz. A nonconvex quadratic programming problem then emerges after employing discretization and $\ell_1$ penalty. To globally solve this nonconvex problem, we adopt a grid refinements-based framework, in which a local solver is heavily invoked and hence significantly determines the overall efficiency. The block structure of this nonconvex problem suggests taking block coordinate descent-type methods as the local solvers, while the existing ones can get seriously afflicted with the poor scalability induced by the associated sparse-dense matrix multiplications. In this work, borrowing the tools from optimal transport, we develop novel methods that favor highly scalable schemes for subproblems and are completely free of the full matrix multiplications after introducing entrywise sampling. Convergence and asymptotic properties are built on the theory of random matrices. The numerical results on several typical physical systems corroborate the effectiveness and better scalability of our approach, which also allows the first visualization for the approximate optimal transport maps between electrons in three-dimensional contexts.

3.Approximate controllability of a 2D linear system related to the motion of two fluids with surface tension

2306.16908

Authors:Sebastien Court

Abstract: We consider a coupled system of partial differential equations describing the interactions between a closed free interface and two viscous incompressible fluids. The fluids are assumed to satisfy the incompressible Navier-Stokes equations in time-dependent domains that are determined by the free interface. The mean curvature of the interface induces a surface tension force that creates a jump of the Cauchy stress tensor on both sides. It influences the behavior of the surrounding fluids, and therefore the deformation of this interface via the equality of velocities. In dimension 2, the steady states correspond to immobile interfaces that are circles with all the same volume. Considering small displacements of steady states, we are led to consider a linearized version of this system. We prove that the latter is approximately controllable to a given steady state for any time $T>0$ by means of additional surface tension type forces, provided that the radius of the circle of reference does not coincide with a scaled zero of the Bessel function of the first kind.

4.A Low-Power Hardware-Friendly Optimisation Algorithm With Absolute Numerical Stability and Convergence Guarantees

2306.16935

Authors:Anis Hamadouche, Yun Wu, Andrew M. Wallace, Joao F. C. Mota

Abstract: We propose Dual-Feedback Generalized Proximal Gradient Descent (DFGPGD) as a new, hardware-friendly, operator splitting algorithm. We then establish convergence guarantees under approximate computational errors and we derive theoretical criteria for the numerical stability of DFGPGD based on absolute stability of dynamical systems. We also propose a new generalized proximal ADMM that can be used to instantiate most existing proximal-based composite optimization solvers. We implement DFGPGD and ADMM on an FPGA ZCU106 board and compare them in light of the FPGA's timing as well as resource utilization and power efficiency. We also perform a full-stack, application-to-hardware comparison between approximate versions of DFGPGD and ADMM based on the dynamic power/error rate trade-off, which is a new hardware-application combined metric.

5.A Counterexample to D. J. White's Theorem on a Vector-valued Extension of the Optimality Equations of a Markov Decision Process

2306.16937

Authors:Anas Mifrani

Abstract: It is well known that under the expected total reward criterion, the optimal value of a finite-horizon Markov decision process can be determined by solving a set of recursively defined equations backward in time. An extension of those equations to vector-valued processes was proposed by D. J. White in 1982. By means of a counterexample, we show that the assumptions underlying this extension are insufficient to guarantee its validity. A strong assumption on state dynamics is introduced to resolve this issue.
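The recursively defined equations mentioned above are, in the scalar case, the standard backward induction $V_t(s) = \max_a \{ r(s,a) + \sum_{s'} p(s' \mid s,a) V_{t+1}(s') \}$. The sketch below implements that scalar recursion on a small hypothetical two-state example. It does not implement White's vector-valued extension or the counterexample from the paper; all names and the toy data are assumptions made for illustration.

```python
def backward_induction(horizon, states, actions, reward, transition, terminal):
    """Solve a finite-horizon scalar MDP backward in time.

    reward(s, a) -> float, transition(s, a) -> {next_state: probability},
    terminal(s) -> float. Returns (V, policy) with V[t][s] the optimal
    expected total reward from state s at time t.
    """
    V = [dict() for _ in range(horizon + 1)]
    policy = [dict() for _ in range(horizon)]
    for s in states:
        V[horizon][s] = terminal(s)
    for t in range(horizon - 1, -1, -1):
        for s in states:
            best_action, best_value = None, float("-inf")
            for a in actions:
                q = reward(s, a) + sum(
                    p * V[t + 1][s2] for s2, p in transition(s, a).items()
                )
                if q > best_value:
                    best_action, best_value = a, q
            V[t][s] = best_value
            policy[t][s] = best_action
    return V, policy


# Hypothetical toy problem: staying in state 1 pays 1, everything else pays 0.
states = [0, 1]
actions = ["stay", "move"]
reward = lambda s, a: 1.0 if (s == 1 and a == "stay") else 0.0
transition = lambda s, a: {s: 1.0} if a == "stay" else {1 - s: 1.0}
terminal = lambda s: 0.0

V, policy = backward_induction(3, states, actions, reward, transition, terminal)
```

In this toy problem the optimal first action from state 0 is to move to state 1 and then collect the reward for the remaining two periods.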

6.Improved Convergence Bounds For Operator Splitting Algorithms With Rare Extreme Errors

2306.16964

Authors:Anis Hamadouche, Andrew M. Wallace, Joao F. C. Mota

Abstract: In this paper, we improve upon our previous work [24,22] and establish convergence bounds on the objective function values of approximate proximal-gradient descent (AxPGD), approximate accelerated proximal-gradient descent (AxAPGD) and approximate proximal ADMM (AxWLM-ADMM) schemes. We consider approximation errors that manifest rare extreme events and we propagate their effects through iterations. We establish probabilistic asymptotic and non-asymptotic convergence bounds as functions of the range (upper/lower bounds) and variance of approximation errors. We use the derived bound to assess AxPGD in a sparse model predictive control of a spacecraft system and compare its accuracy with previously derived bounds.

7.Robust Time-inconsistent Linear-Quadratic Stochastic Controls: A Stochastic Differential Game Approach

2306.16982

Authors:Bingyan Han, Chi Seng Pun, Hoi Ying Wong

Abstract: This paper studies robust time-inconsistent (TIC) linear-quadratic stochastic control problems, formulated by stochastic differential games. By a spike variation approach, we derive sufficient conditions for achieving the Nash equilibrium, which corresponds to a time-consistent (TC) robust policy, under mild technical assumptions. To illustrate our framework, we consider two scenarios of robust mean-variance analysis, namely with state- and control-dependent ambiguity aversion. We find numerically that with time inconsistency haunting the dynamic optimal controls, the ambiguity aversion enhances the effective risk aversion faster than the linear, implying that the ambiguity in the TIC cases is more impactful than that under the TC counterparts, e.g., expected utility maximization problems.

8.Consistency of sample-based stationary points for infinite-dimensional stochastic optimization

2306.17032

Authors:Johannes Milz

Abstract: We consider stochastic optimization problems with possibly nonsmooth integrands posed in Banach spaces and approximate these stochastic programs via sample-based approaches. We establish the consistency of approximate Clarke stationary points of the sample-based approximations. Our framework is applied to risk-averse semilinear PDE-constrained optimization using the average value-at-risk and to risk-neutral bilinear PDE-constrained optimization.

9.Pupil-driven quantitative differential phase contrast imaging

2306.17088

Authors:Shuhe Zhang, Hao Wu, Tao Peng, Zeyu Ke, Meng Shao, Tos T. J. M. Berendschot, Jinhua Zhou

Abstract: In this research, we reveal an inherent but hitherto ignored property of quantitative differential phase contrast (qDPC) imaging: the phase transfer function is an edge detection filter. Inspired by this, we highlight the duality of qDPC between optics and pattern recognition, and propose a simple and effective qDPC reconstruction algorithm, termed Pupil-Driven qDPC (pd-qDPC), to improve the phase reconstruction quality for the family of qDPC-based phase reconstruction algorithms. We form a new cost function in which a modified L0-norm is used to represent the pupil-driven edge sparsity, and the qDPC convolution operator is duplicated in the data fidelity term to achieve automatic background removal. Further, we develop iterative reweighted soft-threshold algorithms based on the split Bregman method to solve this modified L0-norm problem. We tested pd-qDPC on both simulated and experimental data and compared it against state-of-the-art (SOTA) methods including L2-norm, total variation regularization (TV-qDPC), isotropic-qDPC, and Retinex qDPC algorithms. Results show that our proposed model is superior in terms of phase reconstruction quality and implementation efficiency, significantly increasing the experimental robustness while maintaining the data fidelity. In general, pd-qDPC enables high-quality qDPC reconstruction without any modification of the optical system. It reduces the system complexity and benefits the qDPC community and beyond, including but not limited to cell segmentation and PTF learning based on the edge filtering property.

10.PANTR: A proximal algorithm with trust-region updates for nonconvex constrained optimization

2306.17119

Authors:Alexander Bodard, Pieter Pas, Panagiotis Patrinos

Abstract: This work presents PANTR, an efficient solver for nonconvex constrained optimization problems, that is well-suited as an inner solver for an augmented Lagrangian method. The proposed scheme combines forward-backward iterations with solutions to trust-region subproblems: the former ensures global convergence, whereas the latter enables fast update directions. We discuss how the algorithm is able to exploit exact Hessian information of the smooth objective term through a linear Newton approximation, while benefiting from the structure of box-constraints or l1-regularization. An open-source C++ implementation of PANTR is made available as part of the NLP solver library ALPAQA. Finally, the effectiveness of the proposed method is demonstrated in nonlinear model predictive control applications.

11.The Boosted Double-Proximal Subgradient Algorithm for Nonconvex Optimization

2306.17144

Authors:Francisco J. Aragón-Artacho, Pedro Pérez-Aros, David Torregrosa-Belén

Abstract: In this paper we introduce the Boosted Double-proximal Subgradient Algorithm (BDSA), a novel splitting algorithm designed to address general structured nonsmooth and nonconvex mathematical programs expressed as sums and differences of composite functions. BDSA exploits the combined nature of subgradients from the data and proximal steps, and integrates a line-search procedure to enhance its performance. While BDSA encompasses existing schemes proposed in the literature, it extends its applicability to more diverse problem domains. We establish the convergence of BDSA under the Kurdyka--Lojasiewicz property and provide an analysis of its convergence rate. To evaluate the effectiveness of BDSA, we introduce a novel family of challenging test functions with an abundance of critical points. We conduct comparative evaluations demonstrating its ability to effectively escape non-optimal critical points. Additionally, we present two practical applications of BDSA for testing its efficacy, namely, a constrained minimum-sum-of-squares clustering problem and a nonconvex generalization of Heron's problem.

Wed, 28 Jun 2023digest

1.Stochastic Trip Planning in High Dimensional Public Transit Network

2306.15941

Authors:Raashid Altaf, Pravesh Biyani

Abstract: This paper proposes a generalised framework for density estimation in large networks with measurable spatiotemporal variance in edge weights. We solve the stochastic shortest path problem for a large network by estimating the density of the edge weights in the network and analytically finding the distribution of a path. In this study, we employ Gaussian Processes to model the edge weights. This approach not only reduces the analytical complexity associated with computing the stochastic shortest path but also yields satisfactory performance. We also provide an online version of the model that yields a 30 times speedup in the algorithm's runtime while retaining equivalent performance. As an application of the model, we design a real-time trip planning system to find the stochastic shortest path between locations in the public transit network of Delhi. Our observations show that different paths have different likelihoods of being the shortest path at any given time in a public transit network. We demonstrate that choosing the stochastic shortest path over a deterministic shortest path leads to savings in travel time of up to 40%. Thus, our model takes a significant step towards creating a reliable trip planner and increasing the confidence of the general public in developing countries to take up public transit as a primary mode of transportation.
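To make the idea of an analytic path distribution concrete, the following minimal sketch assumes that edge travel times are independent Gaussians, so that a path's travel time is Gaussian with summed means and variances. It also assumes the two compared paths share no edges. These independence assumptions, the function names and the numbers are illustrative only and are not taken from the paper, which models edge weights with Gaussian Processes.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def path_distribution(edges):
    """Mean and variance of a path whose edges are independent Gaussians.

    edges is a list of (mean, variance) pairs, one per edge.
    """
    mean = sum(m for m, _ in edges)
    variance = sum(v for _, v in edges)
    return mean, variance


def prob_a_faster(path_a, path_b):
    """P(travel time of A < travel time of B) for edge-disjoint paths."""
    mean_a, var_a = path_distribution(path_a)
    mean_b, var_b = path_distribution(path_b)
    # B - A is Gaussian with mean (mean_b - mean_a) and variance (var_a + var_b).
    return normal_cdf((mean_b - mean_a) / math.sqrt(var_a + var_b))


# Hypothetical data: path A is two minutes faster on average than path B.
path_a = [(4.0, 1.0), (6.0, 1.0)]
path_b = [(5.0, 1.0), (7.0, 1.0)]
p = prob_a_faster(path_a, path_b)
```

Even though path A is faster on average, the probability that it is actually the faster one stays below one, which is the sense in which different paths have different likelihoods of being the shortest.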

2.Auction algorithm sensitivity for multi-robot task allocation

2306.16032

Authors:Katie Clinch, Tony A. Wood, Chris Manzie

Abstract: We consider the problem of finding a low-cost allocation and ordering of tasks between a team of robots in a d-dimensional, uncertain, landscape, and the sensitivity of this solution to changes in the cost function. Various algorithms have been shown to give a 2-approximation to the MinSum allocation problem. By analysing such an auction algorithm, we obtain intervals on each cost, such that any fluctuation of the costs within these intervals will result in the auction algorithm outputting the same solution.

3.Guarantees for data-driven control of nonlinear systems using semidefinite programming: A survey

2306.16042

Authors:Tim Martin, Thomas B. Schön, Frank Allgöwer

Abstract: This survey presents recent research on determining control-theoretic properties and designing controllers with rigorous guarantees for nonlinear systems for which no mathematical models but measured trajectories are available. Data-driven control techniques have been developed to circumvent time-consuming modelling by first principles and because of the increasing availability of data. Recently, this research field has gained increased attention through the application of Willems' fundamental lemma, which provides fertile ground for the development of data-driven control schemes with guarantees for linear time-invariant systems. While the fundamental lemma can be generalized to further system classes, there does not exist a comparably comprehensive theory for nonlinear systems. At the same time, nonlinear systems constitute the majority of practical systems. Moreover, they include additional challenges such as nonconvex optimization and data-based surrogate models that prevent end-to-end guarantees. Therefore, a variety of data-driven control approaches has been developed with different required prior insights into the system to ensure a guaranteed inference. In this survey, we will discuss developments in the context of data-driven control for nonlinear systems. In particular, we will focus on approaches providing guarantees from finite data, while the analysis and the controller design are computationally tractable due to semidefinite programming. Thus, these approaches achieve reasonable advances compared to the state-of-the-art system analysis and controller design by models from system identification.

4.Forward-backward algorithm for functions with locally Lipschitz gradient: applications to mean field games

2306.16047

Authors:Luis M. Briceno-Arias XLIM, Francisco José Silva XLIM, Xianjin Yang CALTECH

Abstract: In this paper, we provide a generalization of the forward-backward splitting algorithm for minimizing the sum of a proper convex lower semicontinuous function and a differentiable convex function whose gradient satisfies a locally Lipschitz-type condition. We prove the convergence of our method and derive a linear convergence rate when the differentiable function is locally strongly convex. We recover classical results in the case when the gradient of the differentiable function is globally Lipschitz continuous and an already known linear convergence rate when the function is globally strongly convex. We apply the algorithm to approximate equilibria of variational mean field game systems with local couplings. Compared with some benchmark algorithms to solve these problems, our numerical tests show similar performance in terms of the number of iterations but an important gain in the required computational time.
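For orientation, the classical forward-backward iteration that this paper generalizes alternates a gradient step on the smooth term with a proximal step on the nonsmooth term: $x_{k+1} = \mathrm{prox}_{\gamma g}(x_k - \gamma \nabla f(x_k))$. The sketch below is the textbook version with a fixed step size and a globally Lipschitz gradient. It does not reproduce the step-size rule needed for the locally Lipschitz setting of the paper, and the test problem and names are hypothetical.

```python
def forward_backward(grad_f, prox_g, x0, step, iters=200):
    """Classical forward-backward splitting with a fixed step size."""
    x = list(x0)
    for _ in range(iters):
        g = grad_f(x)
        y = [xi - step * gi for xi, gi in zip(x, g)]  # forward (gradient) step
        x = prox_g(y, step)                            # backward (proximal) step
    return x


def soft_threshold(v, tau):
    """Componentwise proximal map of tau * ||.||_1."""
    return [max(abs(vi) - tau, 0.0) * (1.0 if vi > 0 else -1.0) for vi in v]


# Hypothetical problem: minimize 0.5 * ||x - b||^2 + lam * ||x||_1.
b = [3.0, -0.5]
lam = 1.0
grad_f = lambda x: [xi - bi for xi, bi in zip(x, b)]
prox_g = lambda v, step: soft_threshold(v, step * lam)

solution = forward_backward(grad_f, prox_g, [0.0, 0.0], step=0.5)
```

For this separable problem the minimizer is known in closed form (the soft-thresholding of `b` by `lam`), which makes it a convenient sanity check for the iteration.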

5.An optimal hierarchical control scheme for smart generation units: an application to combined steam and electricity generation

2306.16146

Authors:Stefano Spinelli, Marcello Farina, Andrea Ballarino

Abstract: Optimal management of thermal and energy grids with fluctuating demand and prices requires orchestrating the generation units (GU) among all their operating modes. A hierarchical approach is proposed to control coupled nonlinear energy systems. The high level hybrid optimization defines the unit commitment, with the optimal transition strategy, and best production profiles. The low level dynamic model predictive control (MPC), receiving the set-points from the upper layer, safely governs the systems considering process constraints. To enhance the overall efficiency of the system, a method to optimally start up the GU is presented here: a linear parameter varying MPC computes the optimal trajectory in closed-loop by iteratively linearising the system along the previous optimal solution. The introduction of an intermediate equilibrium state as an additional decision variable permits the reduction of the optimization horizon, while a terminal cost term steers the system to the target set-point. Simulation results show the effectiveness of the proposed approach.

6.Alternating minimization for simultaneous estimation of a latent variable and identification of a linear continuous-time dynamic system

2306.16150

Authors:Pierre-Cyril Aubin-Frankowski, Alain Bensoussan, S. Joe Qin

Abstract: We propose an optimization formulation for the simultaneous estimation of a latent variable and the identification of a linear continuous-time dynamic system, given a single input-output pair. We justify this approach based on Bayesian maximum a posteriori estimators. Our scheme takes the form of a convex alternating minimization, over the trajectories and the dynamic model respectively. We prove its convergence to a local minimum which satisfies a two-point boundary problem for the (latent) state variable and a tensor product expression for the optimal dynamics.

7.Equal area partitions of the sphere with diameter bounds, via optimal transport

2306.16239

Authors:Jun Kitagawa, Asuka Takatsu

Abstract: We prove existence of equal area partitions of the unit sphere via optimal transport methods, accompanied by diameter bounds written in terms of Monge--Kantorovich distances. This can be used to obtain bounds on the expectation of the maximum diameter of partition sets, when points are uniformly sampled from the sphere. An application to the computation of sliced Monge--Kantorovich distances is also presented.

8.Theory and applications of the Sum-Of-Squares technique

2306.16255

Authors:Francis Bach, Elisabetta Cornacchia, Luca Pesce, Giovanni Piccioli

Abstract: The Sum-of-Squares (SOS) approximation method is a technique used in optimization problems to derive lower bounds to the optimal value of an objective function. By representing the objective function as a sum of squares in a feature space, the SOS method transforms non-convex global optimization problems into solvable semidefinite programs. This note presents an overview of the SOS method. We start with its application in finite-dimensional feature spaces and, subsequently, we extend it to infinite-dimensional feature spaces using kernels (k-SOS). Additionally, we highlight the utilization of SOS for estimating some relevant quantities in information theory, including the log-partition function.

9.The interdependence between hospital choice and waiting time -- with a case study in urban China

2306.16256

Authors:Joris van de Klundert, Roberto Cominetti, Yun Liu, Qingxia Kong

Abstract: Hospital choice models often employ random utility theory and include waiting time as a choice determinant. When applied to evaluate health system improvement interventions, these models disregard that hospital choice in turn is a determinant of waiting time. We present a novel, general model capturing the endogenous relationship between waiting time and hospital choice, including the choice to opt out, and characterize the unique equilibrium solution of the resulting convex problem. We apply the general model in a case study on the urban Chinese health system, specifying that patient choice follows a multinomial logit (MNL) model and waiting times are determined by M/M/1 queues. The results reveal that analyses which solely rely on MNL models overestimate the effectiveness of present policy interventions and that this effectiveness is limited. We explore alternative, more effective, improvement interventions.
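The feedback loop between choice and waiting time can be illustrated with a small fixed-point computation. The sketch below assumes MNL choice probabilities with a linear waiting-time disutility `beta`, an opt-out alternative, and the M/M/1 mean time in system $W_i = 1/(\mu_i - \lambda_i)$. It uses a damped fixed-point iteration rather than the convex formulation of the paper, and the utility specification, parameter values and function names are all hypothetical.

```python
import math


def mnl_response(total_rate, service_rates, utilities, opt_out_utility, beta, shares):
    """MNL shares implied by the M/M/1 waiting times that `shares` generate."""
    waits = [1.0 / (mu - total_rate * p) for mu, p in zip(service_rates, shares)]
    weights = [math.exp(u - beta * w) for u, w in zip(utilities, waits)]
    denom = math.exp(opt_out_utility) + sum(weights)
    return [w / denom for w in weights]


def mnl_mm1_equilibrium(total_rate, service_rates, utilities,
                        opt_out_utility=0.0, beta=1.0, damping=0.5, iters=500):
    """Damped fixed-point iteration on the hospital choice shares.

    Assumes every queue stays stable, i.e. total_rate * share < service rate.
    """
    n = len(service_rates)
    shares = [1.0 / (n + 1)] * n
    for _ in range(iters):
        target = mnl_response(total_rate, service_rates, utilities,
                              opt_out_utility, beta, shares)
        shares = [(1.0 - damping) * p + damping * t for p, t in zip(shares, target)]
    return shares


# Hypothetical data: two equally attractive hospitals, the second one is faster.
shares = mnl_mm1_equilibrium(1.0, [2.0, 3.0], [1.0, 1.0])
```

At the fixed point the shares reproduce themselves through the waiting times they cause, and the remaining probability mass is the opt-out share.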

10.An optimization approach to study the phase changing behavior of multi-component mixtures

2306.16327

Authors:Gustavo E. O. Celis, Reza Arefidamghani, Hamidreza Anbarlooei, Daniel O. A. Cruz

Abstract: The appropriate design, construction, and operation of carbon capture and storage (CCS) and enhanced oil recovery (EOR) processes require a deep understanding of the resulting phase behavior in hydrocarbon-CO$_2$ multi-component mixtures under reservoir conditions. To model this behavior, a nonlinear system consisting of the equation of state and some mixing rules (for each component) needs to be solved simultaneously. The mixing usually requires modeling the binary interaction between the components of the mixture. This work employs optimization techniques to enhance the predictions of such a model by optimizing the binary interaction parameters. The results show that the optimized parameters, although obtained mathematically, are in physical ranges and can successfully reproduce the experimental observations, especially for multi-component hydrocarbon systems containing carbon dioxide at reservoir temperatures and pressures.

11.Upper bounds on maximum admissible noise in zeroth-order optimisation

2306.16371

Authors:Dmitry A. Pasechnyuk, Aleksandr Lobanov, Alexander Gasnikov

Abstract: In this paper, based on an information-theoretic upper bound on noise in convex Lipschitz continuous zeroth-order optimisation, we provide corresponding upper bounds for strongly-convex and smooth classes of problems using non-constructive proofs through optimal reductions. Also, we show that based on a one-dimensional grid-search optimisation algorithm one can construct an algorithm for simplex-constrained optimisation with an upper bound on noise better than that for the ball-constrained and asymptotic in dimensionality case.

Tue, 27 Jun 2023digest

1.Automating Steady and Unsteady Adjoints: Efficiently Utilizing Implicit and Algorithmic Differentiation

2306.15243

Authors:Andrew Ning, Taylor McDonnell

Abstract: Algorithmic differentiation (AD) has become increasingly capable and straightforward to use. However, AD is inefficient when applied directly to solvers, a feature of most engineering analyses. We can leverage implicit differentiation to define a general AD rule, making adjoints automatic. Furthermore, we can leverage the structure of differential equations to automate unsteady adjoints in a memory efficient way. We also derive a technique to speed up explicit differential equation solvers, which have no iterative solver to exploit. All of these techniques are demonstrated on problems of various sizes, showing order of magnitude speed-ups with minimal code changes. Thus, we can enable users to easily compute accurate derivatives across complex analyses with internal solvers, or in other words, automate adjoints using a combination of AD and implicit differentiation.
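The implicit-differentiation idea can be shown on a scalar toy problem. If a solver returns $y(x)$ satisfying a residual equation $r(x, y) = 0$, then $dy/dx = -(\partial r/\partial y)^{-1} \, \partial r/\partial x$, so the derivative never has to be propagated through the solver iterations. The residual, the Newton solver and the function names below are hypothetical and are not the authors' implementation.

```python
def solve_state(x, tol=1e-12, max_iter=50):
    """Newton solve of the residual r(x, y) = y**3 + y - x = 0 for y."""
    y = 0.0
    for _ in range(max_iter):
        r = y**3 + y - x
        if abs(r) < tol:
            break
        y -= r / (3.0 * y**2 + 1.0)
    return y


def dy_dx_implicit(x):
    """Derivative of the solver output via the implicit function theorem."""
    y = solve_state(x)
    dr_dy = 3.0 * y**2 + 1.0   # partial derivative of r with respect to y
    dr_dx = -1.0               # partial derivative of r with respect to x
    return -dr_dx / dr_dy


def dy_dx_finite_difference(x, h=1e-6):
    """Central finite difference through the solver, used only as a check."""
    return (solve_state(x + h) - solve_state(x - h)) / (2.0 * h)


derivative = dy_dx_implicit(2.0)
```

Only the converged state and two partial derivatives of the residual are needed, which is why this rule is cheaper than differentiating every solver iteration.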

2.Parameterized Complexity of Chordal Conversion for Sparse Semidefinite Programs with Small Treewidth

2306.15288

Authors:Richard Y. Zhang

Abstract: If a sparse semidefinite program (SDP), specified over $n\times n$ matrices and subject to $m$ linear constraints, has an aggregate sparsity graph $G$ with small treewidth, then chordal conversion will frequently allow an interior-point method to solve the SDP in just $O(m+n)$ time per-iteration. This is a significant reduction over the minimum $\Omega(n^{3})$ time per-iteration for a direct solution, but a definitive theoretical explanation was previously unknown. Contrary to popular belief, the speedup is not guaranteed by a small treewidth in $G$, as a diagonal SDP would have treewidth zero but can still necessitate up to $\Omega(n^{3})$ time per-iteration. Instead, we construct an extended aggregate sparsity graph $\overline{G}\supseteq G$ by forcing each constraint matrix $A_{i}$ to be its own clique in $G$. We prove that a small treewidth in $\overline{G}$ does indeed guarantee that chordal conversion will solve the SDP in $O(m+n)$ time per-iteration, to $\epsilon$-accuracy in at most $O(\sqrt{m+n}\log(1/\epsilon))$ iterations. For classical SDPs like the MAX-$k$-CUT relaxation and the Lovasz Theta problem, the two sparsity graphs coincide $G=\overline{G}$, so our results provide a complete characterization for the complexity of chordal conversion, showing that a small treewidth is both necessary and sufficient for $O(m+n)$ time per-iteration. Real-world SDPs like the AC optimal power flow relaxation have different graphs $G\subseteq\overline{G}$ with similar small treewidths; while chordal conversion is already widely used on a heuristic basis, in this paper we provide the first rigorous guarantee that it solves such SDPs in $O(m+n)$ time per-iteration. [Supporting code at https://github.com/ryz-codes/chordalConv/]

3.Topology optimization of transient vibroacoustic problems for broadband filter design using cut elements

2306.15325

Authors:Cetin B. Dilgen, Niels Aage

Abstract: The focus of this article is on shape and topology optimization of transient vibroacoustic problems. The main contribution is a transient problem formulation that enables optimization over wide ranges of frequencies with complex signals, which are often of interest in industry. The work employs time domain methods to realize wide band optimization in the frequency domain. To this end, the objective function is defined in frequency domain where the frequency response of the system is obtained through a fast Fourier transform (FFT) algorithm on the transient response of the system. The work utilizes a parametric level set approach to implicitly define the geometry in which the zero level describes the interface between acoustic and structural domains. A cut element method is used to capture the geometry on a fixed background mesh through utilization of a special integration scheme that accurately resolves the interface. This allows for accurate solutions to strongly coupled vibroacoustic systems without having to re-mesh at each design update. The present work relies on efficient gradient based optimizers where the discrete adjoint method is used to calculate the sensitivities of objective and constraint functions. A thorough explanation of the consistent sensitivity calculation is given involving the FFT operation needed to define the objective function in frequency domain. Finally, the developed framework is applied to various vibroacoustic filter designs and the optimization results are verified using commercial finite element software with a steady state time-harmonic formulation.

4.Convergence aspects for sets of measures with divergences and boundary conditions

2306.15366

Authors:Nicholas Chisholm, Carlos N. Rautenberg

Abstract: In this paper we study set convergence aspects for Banach spaces of vector-valued measures with divergences (represented by measures or by functions) and applications. We consider a form of normal trace characterization to establish subspaces of measures that directionally vanish in parts of the boundary, and present examples constructed with binary trees. Subsequently we study convex sets with total variation bounds and their convergence properties together with applications to the stability of optimization problems.

5.Quality Control in Particle Precipitation via Robust Optimization

2306.15432

Authors:Martina Kuchlbauer, Jana Dienstbier, Adeel Muneer, Hanna Hedges, Michael Stingl, Frauke Liers, Lukas Pflug

Abstract: In this work, we propose a robust optimization approach to mitigate the impact of uncertainties in particle precipitation. Our model incorporates partial differential equations, more particularly nonlinear and nonlocal population balance equations, to describe particle synthesis. The goal of the optimization problem is to design products with desired size distributions. Recognizing the impact of uncertainties, we extend the model to hedge against them. We emphasize the importance of robust protection to ensure the production of high-quality particles. To solve the resulting robust problem, we enhance a novel adaptive bundle framework for nonlinear robust optimization that integrates the exact method of moments approach for solving the population balance equations. Computational experiments performed with the integrated algorithm focus on uncertainties in the total mass of the system, as it greatly influences the quality of the resulting product. Using realistic parameter values for quantum dot synthesis, we demonstrate the efficiency of our integrated algorithm. Furthermore, we find that the unprotected process fails to achieve the desired particle characteristics, even for small uncertainties, which highlights the necessity of the robust process. The latter consistently outperforms the unprotected process in the quality of the obtained product, in particular in perturbed scenarios.

6.Limited-Memory Greedy Quasi-Newton Method with Non-asymptotic Superlinear Convergence Rate

2306.15444

Authors:Zhan Gao, Aryan Mokhtari, Alec Koppel

Abstract: Non-asymptotic convergence analysis of quasi-Newton methods has gained attention with a landmark result establishing an explicit superlinear rate of O$((1/\sqrt{t})^t)$. The methods that obtain this rate, however, exhibit a well-known drawback: they require the storage of the previous Hessian approximation matrix or instead storing all past curvature information to form the current Hessian inverse approximation. Limited-memory variants of quasi-Newton methods such as the celebrated L-BFGS alleviate this issue by leveraging a limited window of past curvature information to construct the Hessian inverse approximation. As a result, their per iteration complexity and storage requirement is O$(\tau d)$ where $\tau \le d$ is the size of the window and $d$ is the problem dimension reducing the O$(d^2)$ computational cost and memory requirement of standard quasi-Newton methods. However, to the best of our knowledge, there is no result showing a non-asymptotic superlinear convergence rate for any limited-memory quasi-Newton method. In this work, we close this gap by presenting a limited-memory greedy BFGS (LG-BFGS) method that achieves an explicit non-asymptotic superlinear rate. We incorporate displacement aggregation, i.e., decorrelating projection, in post-processing gradient variations, together with a basis vector selection scheme on variable variations, which greedily maximizes a progress measure of the Hessian estimate to the true Hessian. Their combination allows past curvature information to remain in a sparse subspace while yielding a valid representation of the full history. Interestingly, our established non-asymptotic superlinear convergence rate demonstrates a trade-off between the convergence speed and memory requirement, which to our knowledge, is the first of its kind. Numerical results corroborate our theoretical findings and demonstrate the effectiveness of our method.

7.Demand-side management via optimal production scheduling in power-intensive industries: The case of metal casting process

2306.15499

Authors:Danial Ramin, Stefano Spinelli, Alessandro Brusaferri

Abstract: The increasing challenges to grid stability posed by the penetration of renewable energy resources urge a more active role for demand response programs as viable alternatives to a further expansion of peak power generators. This work presents a methodology to exploit the demand flexibility of energy-intensive industries under Demand-Side Management programs in the energy and reserve markets. To this end, we propose a novel scheduling model for a multi-stage multi-line process, which incorporates both the critical manufacturing constraints and the technical requirements imposed by the market. Using a mixed integer programming approach, two optimization problems are formulated to sequentially minimize the cost in a day-ahead energy market and maximize the reserve provision when participating in the ancillary market. The effectiveness of the day-ahead scheduling model has been verified for the case of a real metal casting plant in the Nordic market, where a significant reduction of energy cost is obtained. Furthermore, the reserve provision is shown to be a potential tool for capitalizing on the reserve market as a secondary revenue stream.

Mon, 26 Jun 2023digest

1.Open-loop and closed-loop solvabilities for discrete-time mean-field stochastic linear quadratic optimal control problems

2306.14496

Authors:Teng Song, Bin Liu

Abstract: This paper discusses discrete-time mean-field stochastic linear quadratic optimal control problems whose weighting matrices in the cost functional are not assumed to be definite. The open-loop solvability is characterized by the existence of the solution to a mean-field forward-backward stochastic difference equation with a convexity condition and a stationary condition. The closed-loop solvability is presented by virtue of the existence of the regular solution to the generalized Riccati equations and the solution to the linear recursive equation, which is also shown by the uniform convexity of the cost functional. Moreover, based on a family of uniformly convex cost functionals, the finiteness of the problem is characterized. Also, it turns out that there exists a minimizing sequence whose convergence is equivalent to the open-loop solvability of the problem. Finally, some examples are given to illustrate the theory developed.

2.Nonconvex Stochastic Bregman Proximal Gradient Method with Application to Deep Learning

2306.14522

Authors:Kuangyu Ding, Jingyang Li, Kim-Chuan Toh

Abstract: The widely used stochastic gradient methods for minimizing nonconvex composite objective functions require the Lipschitz smoothness of the differentiable part. But this requirement does not hold for problem classes including quadratic inverse problems and training neural networks. To address this issue, we investigate a family of stochastic Bregman proximal gradient (SBPG) methods, which only require smooth adaptivity of the differentiable part. SBPG replaces the upper quadratic approximation used in SGD with the Bregman proximity measure, resulting in a better approximation model that captures the non-Lipschitz gradients of the nonconvex objective. We formulate the vanilla SBPG and establish its convergence properties in the nonconvex setting without finite-sum structure. Experimental results on quadratic inverse problems attest to the robustness of SBPG. Moreover, we propose a momentum-based version of SBPG (MSBPG) and prove it has improved convergence properties. We apply MSBPG to the training of deep neural networks with a polynomial kernel function, which ensures the smooth adaptivity of the loss function. Experimental results on representative benchmarks demonstrate the effectiveness and robustness of MSBPG in training neural networks. Since the additional computation cost of MSBPG compared with SGD is negligible in large-scale optimization, MSBPG can potentially be employed as a universal open-source optimizer in the future.

3.The Implicit Rigid Tube Model Predictive Control

2306.14543

Authors:Saša V. Raković

Abstract: A computationally efficient reformulation of the rigid tube model predictive control is developed. A unique feature of the derived formulation is the utilization of the implicit set representations. This novel formulation does not require any set algebraic operations to be performed explicitly, and its implementation requires merely the use of the standard optimization solvers.

4.Optimal control of a parabolic equation with a nonlocal nonlinearity

2306.14559

Authors:Cyrille Kenne, Landry Djomegne, Gisèle Mophou

Abstract: This paper proposes an optimal control problem for a parabolic equation with a nonlocal nonlinearity. The system is described by a parabolic equation involving a nonlinear term that depends on the solution and its integral over the domain. We prove the existence and uniqueness of the solution to the system and the boundedness of the solution. Regularity results for the control-to-state operator, the cost functional and the adjoint state are also proved. We show the existence of optimal solutions and derive first-order necessary optimality conditions. In addition, second-order necessary and sufficient conditions for optimality are established.

5.Stability of optimal shapes and convergence of thresholding algorithms in linear and spectral optimal control problems

2306.14577

Authors:Antonin Chambolle, Idriss Mazari-Fouquer, Yannick Privat

Abstract: We prove the convergence of the fixed-point (also called thresholding) algorithm in three optimal control problems under large volume constraints. This algorithm was introduced by Céa, Gioan and Michel, and is in constant use in the simulation of $L^\infty-L^1$ optimal control problems. In this paper we consider the optimisation of the Dirichlet energy, of Dirichlet eigenvalues and of certain non-energetic problems. Our proofs rely on a new diagonalisation procedure for shape hessians in optimal control problems, which leads to local stability estimates.

6.Sum-of-squares relaxations for polynomial min-max problems over simple sets

2306.14607

Authors:Francis Bach (SIERRA)

Abstract: We consider min-max optimization problems for polynomial functions, where a multivariate polynomial is maximized with respect to a subset of variables, and the resulting maximal value is minimized with respect to the remaining variables. When the variables belong to simple sets (e.g., a hypercube, the Euclidean hypersphere, or a ball), we derive a sum-of-squares formulation based on a primal-dual approach. In the simplest setting, we provide a convergence proof when the degree of the relaxation tends to infinity and observe empirically that it can be finitely convergent in several situations. Moreover, our formulation leads to an interesting link with feasibility certificates for polynomial inequalities based on Putinar's Positivstellensatz.

7.Generalized Scaling for the Constrained Maximum-Entropy Sampling Problem

2306.14661

Authors:Zhongzhu Chen, Marcia Fampa, Jon Lee

Abstract: The best practical techniques for exact solution of instances of the constrained maximum-entropy sampling problem, a discrete-optimization problem arising in the design of experiments, are via a branch-and-bound framework, working with a variety of concave continuous relaxations of the objective function. A standard and computationally-important bound-enhancement technique in this context is (ordinary) scaling, via a single positive parameter. Scaling adjusts the shape of continuous relaxations to reduce the gaps between the upper bounds and the optimal value. We extend this technique to generalized scaling, employing a positive vector of parameters, which allows much more flexibility and thus significantly reduces the gaps further. We give mathematical results aimed at supporting algorithmic methods for computing optimal generalized scalings, and we give computational results demonstrating the performance of generalized scaling on benchmark problem instances.

8.Gain Confidence, Reduce Disappointment: A New Approach to Cross-Validation for Sparse Regression

2306.14851

Authors:Ryan Cory-Wright, Andrés Gómez

Abstract: Ridge regularized sparse regression involves selecting a subset of features that explains the relationship between a design matrix and an output vector in an interpretable manner. To select the sparsity and robustness of linear regressors, techniques like leave-one-out cross-validation are commonly used for hyperparameter tuning. However, cross-validation typically increases the cost of sparse regression by several orders of magnitude. Additionally, validation metrics are noisy estimators of the test-set error, with different hyperparameter combinations giving models with different amounts of noise. Therefore, optimizing over these metrics is vulnerable to out-of-sample disappointment, especially in underdetermined settings. To address this, we make two contributions. First, we leverage the generalization theory literature to propose confidence-adjusted variants of leave-one-out that display less propensity to out-of-sample disappointment. Second, we leverage ideas from the mixed-integer literature to obtain computationally tractable relaxations of confidence-adjusted leave-one-out, thereby minimizing it without solving as many MIOs. Our relaxations give rise to an efficient coordinate descent scheme which allows us to obtain significantly lower leave-one-out errors than via other methods in the literature. We validate our theory by demonstrating that we obtain solutions that are significantly sparser than, and comparably accurate to, those of popular methods like GLMNet, and that suffer from less out-of-sample disappointment. On synthetic datasets, our confidence adjustment procedure generates significantly fewer false discoveries, and improves out-of-sample performance by 2-5% compared to cross-validating without confidence adjustment. Across a suite of 13 real datasets, a calibrated version of our procedure improves the test set error by an average of 4% compared to cross-validating without confidence adjustment.

9.Near-Optimal Fully First-Order Algorithms for Finding Stationary Points in Bilevel Optimization

2306.14853

Authors:Lesi Chen, Yaohua Ma, Jingzhao Zhang

Abstract: Bilevel optimization has various applications such as hyper-parameter optimization and meta-learning. Designing theoretically efficient algorithms for bilevel optimization is more challenging than standard optimization because the lower-level problem defines the feasibility set implicitly via another optimization problem. One tractable case is when the lower-level problem permits strong convexity. Recent works show that second-order methods can provably converge to an $\epsilon$-first-order stationary point of the problem at a rate of $\tilde{\mathcal{O}}(\epsilon^{-2})$, yet these algorithms require a Hessian-vector product oracle. Kwon et al. (2023) resolved the problem by proposing a first-order method that can achieve the same goal at a slower rate of $\tilde{\mathcal{O}}(\epsilon^{-3})$. In this work, we provide an improved analysis demonstrating that the first-order method can also find an $\epsilon$-first-order stationary point within $\tilde {\mathcal{O}}(\epsilon^{-2})$ oracle complexity, which matches the upper bounds for second-order methods in the dependency on $\epsilon$. Our analysis further leads to simple first-order algorithms that can achieve similar near-optimal rates in finding second-order stationary points and in distributed bilevel problems.

Fri, 23 Jun 2023digest

1.An Approximate Projection onto the Tangent Cone to the Variety of Third-Order Tensors of Bounded Tensor-Train Rank

2306.13360

Authors:Charlotte Vermeylen, Guillaume Olikier, Marc Van Barel

Abstract: An approximate projection onto the tangent cone to the variety of third-order tensors of bounded tensor-train rank is proposed and proven to satisfy a better angle condition than the one proposed by Kutschan (2019). Such an approximate projection makes it possible, e.g., to compute gradient-related directions in the tangent cone, as required by algorithms aiming at minimizing a continuously differentiable function on the variety, a problem appearing notably in tensor completion. A numerical experiment is presented which indicates that, in practice, the angle condition satisfied by the proposed approximate projection is better than both the one satisfied by the approximate projection introduced by Kutschan and the proven theoretical bound.

2.Solving the Train Dispatching Problem in Large Networks by Column Generation

2306.13431

Authors:Maik Schälicke, Karl Nachtigall

Abstract: Disruptions in the operational flow of rail traffic can lead to conflicts between train movements, such that a scheduled timetable can no longer be realised. This is where dispatching is applied: existing conflicts are resolved and a dispatching timetable is provided. In the process, train paths are varied in their spatio-temporal course. This is called the train dispatching problem (TDP), which consists of selecting conflict-free train paths with minimum delay. Starting from a path-oriented formulation of the TDP, a binary linear decision model is introduced. For each possible train path, a binary decision variable indicates whether the train path is used by the request or not. Such a train path is constructed from a set of predefined path parts (speed profiles) within a time-space network. Instead of modelling pairwise conflicts, a stronger MIP formulation is achieved by a clique formulation. The combinatorics of speed profiles and different departure times results in a large number of possible train paths, so that the column generation method is used here. New train paths within the pricing problem can be calculated using shortest path techniques. Here, the shadow prices of conflict cliques must be taken into account. When constructing a new train path, it must be determined whether this train path belongs to a clique or not. This problem is tackled by a MIP. The methodology is tested on practical-size instances from a dispatching area in Germany. Numerical results show that the presented method achieves acceptable computation times with good solution quality while meeting the requirements for real-time dispatching.

3.Computational investigations of a two-class traffic flow model: mean-field and microscopic dynamics

2306.13543

Authors:Abderrahmane Habbal, Imad Kissami, Amal Machtalay, Ahmed Ratnani

Abstract: We address a multi-class traffic model, for which we computationally assess the ability of mean-field games (MFG) to yield approximate Nash equilibria for traffic flow games with an intractably large but finite number of players. We introduce a two-class traffic framework, following and extending the single-class lines of \cite{huang_game-theoretic_2020}. We extend the numerical methodologies, with recourse to techniques such as HPC and regularization of LGMRES solvers. The developed apparatus allows us to perform simulations at significantly larger space and time discretization scales. For three generic scenarios of cars and trucks, and three cost functionals, we provide numerous numerical results related to autonomous vehicle (AV) traffic dynamics, which corroborate for the multi-class case the effectiveness of the approach emphasized in \cite{huang_game-theoretic_2020}. We additionally provide several original comparisons of macroscopic Nash mean-field speeds with their microscopic versions, allowing us to computationally validate the so-called $\epsilon$-Nash approximation, with a rate slightly better than theoretically expected.

4.Synchronous dynamic game on system observability considering one or two steps optimality

2306.13570

Authors:Yueyue Xu, Xiaoming Hu, Lin Wang

Abstract: This paper studies a system security problem in the context of observability based on a two-party non-cooperative asynchronous dynamic game. A system is assumed to be secure if it is not observable. Both the defender and the attacker have means to modify the dimension of the unobservable subspace, which is set as the value function. Utilizing tools from geometric control, we construct the best response set under one-step or two-step optimality to minimize or maximize the value function. We find that the best response sets under one-step optimality are not single-valued maps, resulting in a variety of game outcomes. In the dynamic game considering two-step optimality, the definitions and existence conditions of the lock and oscillation game modes are given. Finally, the best response under two-step optimality and the Stackelberg game equilibrium are compared.

5.Fast Approximation of Unbalanced Optimal Transport and Maximum Mean Discrepancies

2306.13618

Authors:Rajmadan Lakshmanan, Alois Pichler

Abstract: This contribution presents significant computational accelerations to prominent schemes, which enable the comparison of measures, even with varying masses. Concisely, we employ the nonequispaced fast Fourier transform to accelerate the radial kernel convolution in unbalanced optimal transport approximation, building on the Sinkhorn algorithm. Accelerated schemes are presented as well for the maximum mean discrepancies involving kernels based on distances. By employing the nonequispaced fast Fourier transform, our approaches significantly reduce the arithmetic operations to compute the distances from $\mathcal O(n^2)$ to $\mathcal O(n\log n)$, which enables access to large and high-dimensional data sets. Furthermore, we show a robust relation between the Wasserstein distance and maximum mean discrepancies. Numerical experiments using synthetic data and real datasets demonstrate the computational acceleration and numerical precision.
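
To make the cost being accelerated concrete, the following is a minimal sketch of a dense Sinkhorn iteration, not the authors' implementation. The function name `sinkhorn` and the parameters `eps` (entropic regularization) and `rho` (marginal relaxation) are illustrative assumptions. The exponent `rho / (rho + eps)` is the standard form of the unbalanced update under Kullback-Leibler marginal penalties, and `rho=None` gives the balanced case. The two matrix-vector products in the loop are the $\mathcal O(n^2)$ kernel convolutions that a fast transform would replace.

```python
import math

def sinkhorn(a, b, cost, eps, rho=None, iters=200):
    """Dense Sinkhorn scaling (illustrative sketch, hypothetical names).

    a, b : nonnegative weight lists (masses may differ if rho is set)
    cost : cost matrix as a list of rows
    eps  : entropic regularization strength
    rho  : KL marginal penalty (assumption); None means balanced transport
    """
    n, m = len(a), len(b)
    # Gibbs kernel K_ij = exp(-c_ij / eps)
    K = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    power = 1.0 if rho is None else rho / (rho + eps)
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(iters):
        # O(n*m) kernel convolution K v
        Kv = [sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        u = [(a[i] / Kv[i]) ** power for i in range(n)]
        # O(n*m) kernel convolution K^T u
        Ktu = [sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
        v = [(b[j] / Ktu[j]) ** power for j in range(m)]
    # Transport plan diag(u) K diag(v)
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]

# Balanced toy problem: three points on a line, squared-distance cost.
xs = [0.0, 0.5, 1.0]
cost = [[(p - q) ** 2 for q in xs] for p in xs]
a = [0.2, 0.5, 0.3]
b = [0.4, 0.4, 0.2]
plan = sinkhorn(a, b, cost, eps=0.5)
row_sums = [sum(row) for row in plan]
col_sums = [sum(plan[i][j] for i in range(3)) for j in range(3)]
```

In the balanced case the row and column sums of `plan` should reproduce `a` and `b` up to the convergence tolerance; with `rho` set, the marginals are only matched approximately, which is what allows measures of different total mass to be compared.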

6.Optimal Sensor Placement with Adaptive Constraints for Nuclear Digital Twins

2306.13637

Authors:Niharika Karnik, Mohammad G. Abdo, Carlos E. Estrada Perez, Jun Soo Yoo, Joshua J. Cogliati, Richard S. Skifton, Pattrick Calderoni, Steven L. Brunton, Krithika Manohar

Abstract: Given harsh operating conditions and physical constraints in reactors, nuclear applications cannot afford to equip the physical asset with a large array of sensors. Therefore, it is crucial to carefully determine the placement of sensors within the given spatial limitations, enabling the reconstruction of reactor flow fields and the creation of nuclear digital twins. Various design considerations are imposed, such as predetermined sensor locations, restricted areas within the reactor, a fixed number of sensors allocated to a specific region, or sensors positioned at a designated distance from one another. We develop a data-driven technique that integrates constraints into an optimization procedure for sensor placement, aiming to minimize reconstruction errors. Our approach employs a greedy algorithm that can optimize sensor locations on a grid, adhering to user-defined constraints. We demonstrate the near optimality of our algorithm by computing all possible configurations for selecting a certain number of sensors for a randomly generated state space system. In this work, the algorithm is demonstrated on the Out-of-Pile Testing and Instrumentation Transient Water Irradiation System (OPTI-TWIST) prototype vessel, which is electrically heated to mimic the neutronics effect of the Transient Reactor Test facility (TREAT) at Idaho National Laboratory (INL). The resulting sensor-based reconstruction of temperature within the OPTI-TWIST minimizes error, provides probabilistic bounds for noise-induced uncertainty and will finally be used for communication between the digital twin and experimental facility.

Thu, 22 Jun 2023digest

1.Rotation Group Synchronization via Quotient Manifold

2306.12730

Authors:Linglingzhi Zhu, Chong Li, Anthony Man-Cho So

Abstract: Rotation group $\mathcal{SO}(d)$ synchronization is an important inverse problem and has attracted intense attention from numerous application fields such as graph realization, computer vision, and robotics. In this paper, we focus on the least-squares estimator of rotation group synchronization with general additive noise models, which is a nonconvex optimization problem with manifold constraints. Unlike the phase/orthogonal group synchronization, there are limited provable approaches for solving rotation group synchronization. First, we derive improved estimation results of the least-squares/spectral estimator, illustrating the tightness and validating the existing relaxation methods of solving rotation group synchronization through the optimum of relaxed orthogonal group version under near-optimal noise level for exact recovery. Moreover, departing from the standard approach of utilizing the geometry of the ambient Euclidean space, we adopt an intrinsic Riemannian approach to study orthogonal/rotation group synchronization. Benefiting from a quotient geometric view, we prove the positive definite condition of quotient Riemannian Hessian around the optimum of orthogonal group synchronization problem, and consequently the Riemannian local error bound property is established to analyze the convergence rate properties of various Riemannian algorithms. As a simple and feasible method, the sequential convergence guarantee of the (quotient) Riemannian gradient method for solving orthogonal/rotation group synchronization problem is studied, and we derive its global linear convergence rate to the optimum with the spectral initialization. All results are deterministic without any probabilistic model.

2.Data-driven Approximation of Distributionally Robust Chance Constraints using Bayesian Credible Intervals

2306.12735

Authors:Zhiping Chen, Wentao Ma, Bingbing Ji

Abstract: The non-convexity and intractability of distributionally robust chance constraints make them challenging to cope with. From a data-driven perspective, we propose formulating it as a robust optimization problem to ensure that the distributionally robust chance constraint is satisfied with high probability. To incorporate available data and prior distribution knowledge, we construct ambiguity sets for the distributionally robust chance constraint using Bayesian credible intervals. We establish the congruent relationship between the ambiguity set in Bayesian distributionally robust chance constraints and the uncertainty set in a specific robust optimization. In contrast to most existing uncertainty set construction methods, which are only applicable for particular settings, our approach provides a unified framework for constructing uncertainty sets under different marginal distribution assumptions, thus making it more flexible and widely applicable. Additionally, under the concavity assumption, our method provides strong finite sample probability guarantees for optimal solutions. The practicality and effectiveness of our approach are illustrated with numerical experiments on portfolio management and queuing system problems. Overall, our approach offers a promising solution to distributionally robust chance constrained problems and has potential applications in other fields.

3.Don't be so Monotone: Relaxing Stochastic Line Search in Over-Parameterized Models

2306.12747

Authors:Leonardo Galli, Holger Rauhut, Mark Schmidt

Abstract: Recent works have shown that line search methods can speed up Stochastic Gradient Descent (SGD) and Adam in modern over-parameterized settings. However, existing line searches may take steps that are smaller than necessary since they require a monotone decrease of the (mini-)batch objective function. We explore nonmonotone line search methods to relax this condition and possibly accept larger step sizes. Despite the lack of a monotonic decrease, we prove the same fast rates of convergence as in the monotone case. Our experiments show that nonmonotone methods improve the speed of convergence and generalization properties of SGD/Adam even beyond the previous monotone line searches. We propose a POlyak NOnmonotone Stochastic (PoNoS) method, obtained by combining a nonmonotone line search with a Polyak initial step size. Furthermore, we develop a new resetting technique that in the majority of the iterations reduces the number of backtracks to zero while still maintaining a large initial step size. To the best of our knowledge, a first runtime comparison shows that the epoch-wise advantage of line-search-based methods gets reflected in the overall computational time.
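
The idea of relaxing the monotone decrease can be sketched as follows. This is a deterministic toy, not the PoNoS method itself: it uses the full gradient of a fixed quadratic rather than mini-batches, a classical max-over-window acceptance rule as the nonmonotone condition (an assumption; the paper's exact condition may differ), and a Polyak-type initial step with a hypothetical constant `c_p` and an assumed known optimal value `f_star = 0`. All function and parameter names are illustrative.

```python
def f(x):
    # Ill-conditioned toy quadratic with minimum value 0 at the origin.
    return 0.5 * (x[0] ** 2 + 10.0 * x[1] ** 2)

def grad(x):
    return (x[0], 10.0 * x[1])

def nonmonotone_gd(x, steps=50, window=5, c=0.1, c_p=0.5, shrink=0.5, f_star=0.0):
    """Gradient descent with a max-over-window backtracking line search."""
    history = [f(x)]
    for _ in range(steps):
        g = grad(x)
        g2 = g[0] ** 2 + g[1] ** 2
        if g2 == 0.0:
            break
        # Reference value: the largest of the last `window` objective values,
        # instead of the current value used by a monotone Armijo search.
        ref = max(history[-window:])
        # Polyak-type initial step size (assumes f_star is known).
        eta = (f(x) - f_star) / (c_p * g2)
        while True:
            trial = (x[0] - eta * g[0], x[1] - eta * g[1])
            if f(trial) <= ref - c * eta * g2:
                break
            eta *= shrink  # backtrack
        x = trial
        history.append(f(x))
    return x, history

x_final, history = nonmonotone_gd((1.0, 1.0))
increases = sum(1 for prev, cur in zip(history, history[1:]) if cur > prev)
```

On this example the objective value is allowed to go up on some iterations (`increases` is positive) because the large initial step is accepted against the window maximum, yet the iterates still converge to the minimizer. Setting `window=1` recovers an ordinary monotone Armijo backtracking search.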

4.A Gradient Descent-Ascent Method for Continuous-Time Risk-Averse Optimal Control

2306.12878

Authors:Gabriel Velho, Jean Auriol, Riccardo Bonalli

Abstract: In this paper, we consider continuous-time stochastic optimal control problems where the cost is evaluated through a coherent risk measure. We provide an explicit gradient descent-ascent algorithm which applies to problems subject to non-linear stochastic differential equations. More specifically, we leverage duality properties of coherent risk measures to relax the problem via a smooth min-max reformulation which induces artificial strong concavity in the max subproblem. We then formulate necessary conditions of optimality for this relaxed problem which we leverage to prove convergence of the gradient descent-ascent algorithm to candidate solutions of the original problem. Finally, we showcase the efficiency of our algorithm through numerical simulations involving trajectory tracking problems and highlight the benefit of favoring risk measures over classical expectation.

5.The chain control set of a linear control system

2306.12936

Authors:Adriano Da Silva

Abstract: In this paper, we analyze the chain control sets of linear control systems on connected Lie groups. Our main result shows that the compactness of the central subgroup associated with the drift is a necessary and sufficient condition to assure the uniqueness and compactness of the chain control set.

Wed, 21 Jun 2023digest

1.Distributed Random Reshuffling Methods with Improved Convergence

2306.12037

Authors:Kun Huang, Linli Zhou, Shi Pu

Abstract: This paper proposes two distributed random reshuffling methods, namely Gradient Tracking with Random Reshuffling (GT-RR) and Exact Diffusion with Random Reshuffling (ED-RR), to solve the distributed optimization problem over a connected network, where a set of agents aim to minimize the average of their local cost functions. Both algorithms invoke a random reshuffling (RR) update for each agent, inherit favorable characteristics of RR for minimizing smooth nonconvex objective functions, and improve the performance of previous distributed random reshuffling methods both theoretically and empirically. Specifically, both GT-RR and ED-RR achieve the convergence rate of $O(1/[(1-\lambda)^{1/3}m^{1/3}T^{2/3}])$ in driving the (minimum) expected squared norm of the gradient to zero, where $T$ denotes the number of epochs, $m$ is the sample size for each agent, and $1-\lambda$ represents the spectral gap of the mixing matrix. When the objective functions further satisfy the Polyak-Łojasiewicz (PL) condition, we show GT-RR and ED-RR both achieve $O(1/[(1-\lambda)mT^2])$ convergence rate in terms of the averaged expected differences between the agents' function values and the global minimum value. Notably, both results are comparable to the convergence rates of centralized RR methods (up to constant factors depending on the network topology) and outperform those of previous distributed random reshuffling algorithms. Moreover, we support the theoretical findings with a set of numerical experiments.

2.A Novel Sensor Design for a Cantilevered Mead-Marcus-type Sandwich Beam Model by the Order-reduction Technique

2306.12065

Authors:Ahmet Ozkan Ozer, Ahmet Kaan Aydin

Abstract: A novel space-discretized Finite Differences-based model reduction, introduced in (Liu, Guo, 2020), is extended to the partial differential equations (PDE) model of a multi-layer Mead-Marcus-type sandwich beam with clamped-free boundary conditions. The PDE model describes transverse vibrations for a sandwich beam whose alternating outer elastic layers constrain viscoelastic core layers, which allow transverse shear. The major goal of this project is to design a single tip velocity sensor to control the overall dynamics on the beam. Since the spectrum of the PDE cannot be constructed analytically, the so-called multipliers approach is adopted to prove that the PDE model is exactly observable with sub-optimal observation time. Next, the PDE model is reduced by the "order-reduced" Finite-Differences technique. This method does not require any type of filtering, though the exact observability as $h\to 0$ is achieved by a constraint on the material constants. The main challenge here is the strong coupling of the shear dynamics of the middle layer with overall bending dynamics. This complicates the absorption of coupling terms in the discrete energy estimates. This is sharply different from a single-layer (Euler-Bernoulli) beam.

3.Optimal Algorithms for Stochastic Bilevel Optimization under Relaxed Smoothness Conditions

2306.12067

Authors:Xuxing Chen, Tesi Xiao, Krishnakumar Balasubramanian

Abstract: Stochastic bilevel optimization usually involves minimizing an upper-level (UL) function that is dependent on the arg-min of a strongly-convex lower-level (LL) function. Several algorithms utilize Neumann series to approximate certain matrix inverses involved in estimating the implicit gradient of the UL function (hypergradient). The state-of-the-art StOchastic Bilevel Algorithm (SOBA) [16] instead uses stochastic gradient descent steps to solve the linear system associated with the explicit matrix inversion. This modification enables SOBA to match the lower bound of sample complexity for the single-level counterpart in non-convex settings. Unfortunately, the current analysis of SOBA relies on the assumption of higher-order smoothness for the UL and LL functions to achieve optimality. In this paper, we introduce a novel fully single-loop and Hessian-inversion-free algorithmic framework for stochastic bilevel optimization and present a tighter analysis under standard smoothness assumptions (first-order Lipschitzness of the UL function and second-order Lipschitzness of the LL function). Furthermore, we show that by a slight modification of our approach, our algorithm can handle a more general multi-objective robust bilevel optimization problem. For this case, we obtain the state-of-the-art oracle complexity results demonstrating the generality of both the proposed algorithmic and analytic frameworks. Numerical experiments demonstrate the performance gain of the proposed algorithms over existing ones.

4.Comparing the Methods of Alternating and Simultaneous Projections for Two Subspaces

2306.12219

Authors:Simeon Reich, Rafał Zalas

Abstract: We study the well-known methods of alternating and simultaneous projections when applied to two nonorthogonal linear subspaces of a real Euclidean space. Assuming that both of the methods have a common starting point chosen from either one of the subspaces, we show that the method of alternating projections converges significantly faster than the method of simultaneous projections. On the other hand, we provide examples of subspaces and starting points, where the method of simultaneous projections outperforms the method of alternating projections.
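
A small planar experiment illustrates the first claim. The sketch below is only an illustration under simple assumptions (two lines through the origin in the plane at angle `theta`, with the starting point on the first line); the names `alternating`, `simultaneous`, and `project_onto_line` are hypothetical helpers, not notation from the paper.

```python
import math

def project_onto_line(x, d):
    """Orthogonal projection of x onto the line spanned by the unit vector d."""
    t = x[0] * d[0] + x[1] * d[1]
    return (t * d[0], t * d[1])

def alternating(x, u, v, iters):
    # x <- P_V(P_U(x)), repeated
    for _ in range(iters):
        x = project_onto_line(project_onto_line(x, u), v)
    return x

def simultaneous(x, u, v, iters):
    # x <- (P_U(x) + P_V(x)) / 2, repeated
    for _ in range(iters):
        pu = project_onto_line(x, u)
        pv = project_onto_line(x, v)
        x = ((pu[0] + pv[0]) / 2.0, (pu[1] + pv[1]) / 2.0)
    return x

def norm(x):
    return math.hypot(x[0], x[1])

theta = math.pi / 4                      # angle between the two lines
u = (1.0, 0.0)                           # direction of the first subspace U
v = (math.cos(theta), math.sin(theta))   # direction of the second subspace V
x0 = (1.0, 0.0)                          # common starting point, chosen in U

# The two lines intersect only at the origin, so the norm is the error.
err_alt = norm(alternating(x0, u, v, 10))
err_sim = norm(simultaneous(x0, u, v, 10))
```

In this setting each alternating sweep shrinks the error by a factor of $\cos^2\theta$, while the simultaneous iteration contracts asymptotically by $(1+\cos\theta)/2$, so `err_alt` ends up far below `err_sim` after the same number of iterations.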

5.Stability Analysis of Trajectories on Manifolds with Applications to Observer and Controller Design

2306.12256

Authors:Dongjun Wu, Bowen Yi, Anders Rantzer

Abstract: This paper examines the local exponential stability (LES) of trajectories for nonlinear systems on Riemannian manifolds. We present necessary and sufficient conditions for LES of a trajectory on a Riemannian manifold by analyzing the complete lift of the system along the given trajectory. These conditions are coordinate-free and reveal fundamental relationships between exponential stability and incremental stability in a local sense. We then apply these results to design tracking controllers and observers for Euler-Lagrangian systems on manifolds; a notable advantage of our design is that it visibly reveals the effect of curvature on system dynamics and hence suggests compensation terms in the controller and observer. Additionally, we revisit some well-known intrinsic observer problems using our proposed method, which largely simplifies the analysis compared to existing results.

Tue, 20 Jun 2023digest

1.A Lagrangian-Based Method with "False Penalty" for Linearly Constrained Nonconvex Composite Optimization

2306.11299

Authors:Jong Gwang Kim

Abstract: We introduce a primal-dual framework for solving linearly constrained nonconvex composite optimization problems. Our approach is based on a newly developed Lagrangian, which incorporates false penalty and dual smoothing terms. This new Lagrangian enables us to develop a simple first-order algorithm that converges to a stationary solution under standard assumptions. We further establish global convergence, provided that the objective function satisfies the Kurdyka-Łojasiewicz property. Our method provides several advantages: it simplifies the treatment of constraints by effectively bounding the multipliers without boundedness assumptions on the dual iterates; it guarantees global convergence without requiring the surjectivity assumption on the linear operator; and it is a single-loop algorithm that does not involve solving penalty subproblems, achieving an iteration complexity of $\mathcal{O}(1/\epsilon^2)$ to find an $\epsilon$-stationary solution. Preliminary experiments on test problems demonstrate the practical efficiency and robustness of our method.

2.A gradient projection method for semi-supervised hypergraph clustering problems

2306.11323

Authors:Jingya Chang, Dongdong Liu, Min Xi

Abstract: Semi-supervised clustering problems focus on clustering data with labels. In this paper, we consider semi-supervised hypergraph clustering problems. We use the hypergraph-related tensor to construct an orthogonally constrained optimization model. The optimization problem is solved by a retraction method, which employs the polar decomposition to map the gradient direction in the tangent space to the Stiefel manifold. A nonmonotone curvilinear search is implemented to guarantee reduction in the objective function value. Convergence analysis demonstrates that the first-order optimality condition is satisfied at the accumulation point. Experiments on synthetic hypergraphs and hypergraphs built from real data demonstrate the effectiveness of our method.

3.A Passivity-Based Method for Accelerated Convex Optimisation

2306.11474

Authors:Namhoon Cho, Hyo-Sang Shin

Abstract: This study presents a constructive methodology for designing accelerated convex optimisation algorithms in the continuous-time domain. The two key enablers are the classical concept of passivity in control theory and the time-dependent change of variables that maps the output of the internal dynamic system to the optimisation variables. The Lyapunov function associated with the optimisation dynamics is obtained as a natural consequence of specifying the internal dynamics that drives the state evolution as a passive linear time-invariant system. The passivity-based methodology provides a general framework that has the flexibility to generate convex optimisation algorithms with guarantees of different convergence rate bounds on the objective function value. The same principle applies to the design of online parameter update algorithms for adaptive control by re-defining the output of the internal dynamics to allow for the feedback interconnection with tracking error dynamics.

4.Stabilization and Spill-Free Transfer of Viscous Liquid in a Tank

2306.11543

Authors:Iasson Karafyllis, Miroslav Krstic

Abstract: Flow control occupies a special place in the fields of partial differential equations (PDEs) and control theory, where the complex behavior of solutions of nonlinear dynamics in very high dimension is not just to be understood but also to be assigned specific desired properties, by feedback control. Among several benchmark problems in flow control, the liquid-tank problem is particularly attractive as a research topic. In the liquid-tank problem the objective is to move a tank filled with liquid, suppress the nonlinear oscillations of the liquid in the process, bring the tank and liquid to rest, and avoid liquid spillage in the process. In other words, this is a problem of nonlinear PDE stabilization subject to state constraints. This review article focuses only on recent results on liquid-tank stabilization for viscous liquids. All possible cases are studied: with and without friction from the tank walls, with and without surface tension. Moreover, novel results are provided for the linearization of the tank-liquid system. The linearization of the tank-liquid system gives a high-order PDE which is a combination of a wave equation with Kelvin-Voigt damping and an Euler-Bernoulli beam equation. The feedback design methodology presented in the article is based on Control Lyapunov Functionals (CLFs), suitably extended from the CLF methodology for ODEs to the infinite-dimensional case. The CLFs proposed are modifications and augmentations of the total energy functionals for the tank-liquid system, so that the dissipative effects of viscosity, friction, and surface tension are captured and additional dissipation by feedback is made relatively easy. The article closes with an extensive list of open problems.

5.Graph-Based Conditions for Feedback Stabilization of Switched and LPV Systems

2306.11548

Authors:Matteo Della Rossa, Thiago Alves Lima, Marc Jungers, Raphaël M. Jungers

Abstract: This paper presents novel stabilizability conditions for switched linear systems with arbitrary and uncontrollable underlying switching signals. We distinguish and study two particular settings: i) the \emph{robust} case, in which the active mode is completely unknown and unobservable, and ii) the \emph{mode-dependent} case, in which the controller depends on the current active switching mode. The technical developments are based on graph-theory tools, relying in particular on the path-complete Lyapunov functions framework. The main idea is to use directed and labeled graphs to encode Lyapunov inequalities to design robust and mode-dependent piecewise linear state-feedback controllers. This results in novel and flexible conditions, with the particular feature of being in the form of linear matrix inequalities (LMIs). Our technique thus provides a first controller-design strategy allowing piecewise linear feedback maps and piecewise quadratic (control) Lyapunov functions by means of semidefinite programming. Numerical examples illustrate the application of the proposed techniques, the relations between the graph order, the robustness, and the performance of the closed loop.

6.Regularized Robust MDPs and Risk-Sensitive MDPs: Equivalence, Policy Gradient, and Sample Complexity

2306.11626

Authors:Runyu Zhang, Yang Hu, Na Li

Abstract: This paper focuses on reinforcement learning for the regularized robust Markov decision process (MDP) problem, an extension of the robust MDP framework. We first introduce the risk-sensitive MDP and establish the equivalence between risk-sensitive MDP and regularized robust MDP. This equivalence offers an alternative perspective for addressing the regularized RMDP and enables the design of efficient learning algorithms. Given this equivalence, we further derive the policy gradient theorem for the regularized robust MDP problem and prove the global convergence of the exact policy gradient method under the tabular setting with direct parameterization. We also propose a sample-based offline learning algorithm, namely the robust fitted-Z iteration (RFZI), for a specific regularized robust MDP problem with a KL-divergence regularization term and analyze the sample complexity of the algorithm. Our results are also supported by numerical simulations.

7.Last-Iterate Convergent Policy Gradient Primal-Dual Methods for Constrained MDPs

2306.11700

Authors:Dongsheng Ding, Chen-Yu Wei, Kaiqing Zhang, Alejandro Ribeiro

Abstract: We study the problem of computing an optimal policy of an infinite-horizon discounted constrained Markov decision process (constrained MDP). Despite the popularity of Lagrangian-based policy search methods used in practice, the oscillation of policy iterates in these methods has not been fully understood, bringing out issues such as violation of constraints and sensitivity to hyper-parameters. To fill this gap, we employ the Lagrangian method to cast a constrained MDP into a constrained saddle-point problem in which max/min players correspond to primal/dual variables, respectively, and develop two single-time-scale policy-based primal-dual algorithms with non-asymptotic convergence of their policy iterates to an optimal constrained policy. Specifically, we first propose a regularized policy gradient primal-dual (RPG-PD) method that updates the policy using an entropy-regularized policy gradient, and the dual via a quadratic-regularized gradient ascent, simultaneously. We prove that the policy primal-dual iterates of RPG-PD converge to a regularized saddle point with a sublinear rate, while the policy iterates converge sublinearly to an optimal constrained policy. We further instantiate RPG-PD in large state or action spaces by including function approximation in policy parametrization, and establish similar sublinear last-iterate policy convergence. Second, we propose an optimistic policy gradient primal-dual (OPG-PD) method that employs the optimistic gradient method to update primal/dual variables, simultaneously. We prove that the policy primal-dual iterates of OPG-PD converge to a saddle point that contains an optimal constrained policy, with a linear rate. To the best of our knowledge, this work appears to be the first non-asymptotic policy last-iterate convergence result for single-time-scale algorithms in constrained MDPs.

8.Closed-form expressions for the pure time delay in terms of the input and output Laguerre spectra

2306.11805

Authors:Alexander Medvedev

Abstract: The pure time delay operator is considered in continuous and discrete time under the assumption that the input signal is square-integrable (square-summable). By making use of a discrete convolution operator with polynomial Markov parameters, a common framework for handling the continuous and discrete cases is set. Closed-form expressions for the delay value are derived in terms of the Laguerre spectra of the output and input signals. The expressions hold for any feasible value of the Laguerre parameter and can be utilized, e.g., for building time-delay estimators that allow for non-persistent input. A simulation example is provided to illustrate the principle of Laguerre-domain time delay modeling and analysis.

9.Projection-Free Methods for Solving Nonconvex-Concave Saddle Point Problems

2306.11944

Authors:Morteza Boroun, Erfan Yazdandoost Hamedani, Afrooz Jalilzadeh

Abstract: In this paper, we investigate a class of constrained saddle point (SP) problems where the objective function is nonconvex-concave and smooth. This class of problems has wide applicability in machine learning, including robust multi-class classification and dictionary learning. Several projection-based primal-dual methods have been developed for tackling this problem; however, the availability of methods with projection-free oracles remains limited. To address this gap, we propose efficient single-loop projection-free methods reliant on first-order information. In particular, using regularization and nested approximation techniques, we propose a primal-dual conditional gradient method that solely employs linear minimization oracles to handle constraints. Assuming that the constraint set in the maximization is strongly convex, our method achieves an $\epsilon$-stationary solution within $\mathcal{O}(\epsilon^{-6})$ iterations. When the projection onto the constraint set of maximization is easy to compute, we propose a one-sided projection-free method that achieves an $\epsilon$-stationary solution within $\mathcal{O}(\epsilon^{-4})$ iterations. Moreover, we present improved iteration complexities of our methods under a strong concavity assumption. To the best of our knowledge, our proposed algorithms are among the first projection-free methods with convergence guarantees for solving nonconvex-concave SP problems.

Fri, 16 Jun 2023digest

1.Randomized Robust Price Optimization

2306.09659

Authors:Xinyi Guan, Velibor V. Mišić

Abstract: The robust multi-product pricing problem is to determine the prices of a collection of products so as to maximize the worst-case revenue, where the worst case is taken over an uncertainty set of demand models that the firm expects could be realized in practice. A tacit assumption in this approach is that the pricing decision is a deterministic decision: the prices of the products are fixed and do not vary. In this paper, we consider a randomized approach to robust pricing, where a decision maker specifies a distribution over potential price vectors so as to maximize its worst-case revenue over an uncertainty set of demand models. We formally define this problem -- the randomized robust price optimization problem -- and analyze when a randomized price scheme performs as well as a deterministic price vector, and identify cases in which it can yield a benefit. We also propose two solution methods for obtaining an optimal randomization scheme over a discrete set of candidate price vectors based on constraint generation and double column generation, respectively, and show how these methods are applicable for common demand models, such as the linear, semi-log and log-log demand models. We numerically compare the randomized approach against the deterministic approach on a variety of synthetic and real problem instances; on synthetic instances, we show that the improvement in worst-case revenue can be as much as 1300%, while on real data instances derived from a grocery retail scanner dataset, the improvement can be as high as 92%.

2.Linear convergence of Nesterov-1983 with the strong convexity

2306.09694

Authors:Bowen Li, Bin Shi, Ya-xiang Yuan

Abstract: For modern gradient-based optimization, a developmental landmark is Nesterov's accelerated gradient descent method, which was proposed in [Nesterov, 1983] and is abbreviated here as Nesterov-1983. Afterward, one important advance was its proximal generalization, named the fast iterative shrinkage-thresholding algorithm (FISTA), which is widely used in image science and engineering. However, it is unknown whether Nesterov-1983 and FISTA converge linearly on strongly convex functions, which has been listed as an open problem in the comprehensive review [Chambolle and Pock, 2016, Appendix B]. In this paper, we answer this question by the use of the high-resolution differential equation framework. Along with the phase-space representation previously adopted, the key difference here in constructing the Lyapunov function is that the coefficient of the kinetic energy varies with the iteration. Furthermore, we point out that the linear convergence of both algorithms above has no dependence on the parameter $r$ on strongly convex functions. Meanwhile, it is also obtained that the proximal subgradient norm converges linearly.
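
Illustrative sketch (not taken from the paper): the iteration below shows the general shape of a Nesterov-type accelerated gradient scheme on a strongly convex quadratic. The momentum coefficient k/(k + r + 1), the test function, and the names `nesterov`, `grad`, and `f` are assumptions chosen for illustration; the paper's exact parameterization of $r$ may differ.

```python
# Hypothetical example: accelerated gradient descent on
# f(x) = 0.5 * sum(a_i * x_i^2), which is strongly convex when all a_i > 0.

def f(x, a):
    return 0.5 * sum(ai * xi * xi for ai, xi in zip(a, x))

def grad(x, a):
    return [ai * xi for ai, xi in zip(a, x)]

def nesterov(x0, a, step, r=2, iters=200):
    """Gradient step at the extrapolated point y, then momentum extrapolation."""
    x_prev = list(x0)
    y = list(x0)
    values = []
    for k in range(iters):
        g = grad(y, a)
        # Gradient step taken from the extrapolated point.
        x = [yi - step * gi for yi, gi in zip(y, g)]
        # Assumed momentum coefficient; it tends to 1 as k grows.
        momentum = k / (k + r + 1)
        y = [xi + momentum * (xi - xpi) for xi, xpi in zip(x, x_prev)]
        x_prev = x
        values.append(f(x, a))
    return x_prev, values
```

Calling `nesterov([1.0, 1.0], [1.0, 10.0], step=0.1)` uses the step size 1/L with L = 10; the recorded objective values can then be inspected to see how quickly they decay.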

3.On the finitary content of Dykstra's cyclic projections algorithm

2306.09791

Authors:Pedro Pinto

Abstract: We study the asymptotic behaviour of the well-known Dykstra's algorithm through the lens of proof-theoretical techniques. We provide an elementary proof for the convergence of Dykstra's algorithm in which the standard argument is stripped to its central features and where the original compactness principles are circumvented, additionally providing highly uniform primitive recursive rates of metastability in a fully general setting. Moreover, under an additional assumption, we are even able to obtain effective general rates of convergence. We argue that such an additional condition is actually necessary for the existence of general uniform rates of convergence.
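
Illustrative sketch (not taken from the paper): for readers unfamiliar with the algorithm, the code below shows the standard two-set form of Dykstra's cyclic projections in the plane. The two half-planes and all function names are assumptions chosen so that the example is easy to check by hand; the correction terms `p` and `q` are what distinguish Dykstra's method from plain alternating projections.

```python
# Hypothetical example: project a point onto the intersection of
# C1 = {x2 <= 0} and C2 = {x1 + x2 <= 0} with Dykstra's algorithm.

def project_lower_half_plane(v):
    # Projection onto C1 = {(x1, x2) : x2 <= 0}.
    return (v[0], min(v[1], 0.0))

def project_diagonal_half_plane(v):
    # Projection onto C2 = {(x1, x2) : x1 + x2 <= 0}.
    excess = max(v[0] + v[1], 0.0) / 2.0
    return (v[0] - excess, v[1] - excess)

def dykstra(point, proj_a, proj_b, iters=50):
    x = tuple(point)
    p = (0.0, 0.0)  # correction term associated with the first set
    q = (0.0, 0.0)  # correction term associated with the second set
    for _ in range(iters):
        y = proj_a((x[0] + p[0], x[1] + p[1]))
        p = (x[0] + p[0] - y[0], x[1] + p[1] - y[1])
        x = proj_b((y[0] + q[0], y[1] + q[1]))
        q = (y[0] + q[0] - x[0], y[1] + q[1] - x[1])
    return x
```

For the starting point (2, 1), the nearest point of the intersection is (0.5, -0.5), whereas a single pass of alternating projections stops at (1, -1), which is feasible but not nearest; this is the reason the correction terms are needed.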

4.Barzilai-Borwein Proximal Gradient Methods for Multiobjective Composite Optimization Problems with Improved Linear Convergence

2306.09797

Authors:Jian Chen, Liping Tang, Xinmin Yang

Abstract: Over the past two decades, multiobjective gradient descent methods have received increasing attention due to the seminal work of Fliege and Svaiter. Recently, Chen et al. pointed out that imbalances among objective functions can lead to a small stepsize in Fliege and Svaiter's method, which significantly decelerates the convergence. To address the issue, Chen et al. proposed the Barzilai-Borwein descent method for multiobjective optimization (BBDMO). Their work demonstrated that BBDMO achieves better stepsize and performance compared to Fliege and Svaiter's method. However, a theoretical explanation for the superiority of BBDMO over the previous method has remained open. In this paper, we extend Chen et al.'s method to composite cases and propose two types of Barzilai-Borwein proximal gradient methods (BBPGMO). Moreover, we prove that the convergence rates of BBPGMO are $O(\frac{1}{\sqrt{k}})$, $O(\frac{1}{k})$, and $O(r^{k})(0<r<1)$ for non-convex, convex, and strongly convex problems, respectively. Notably, the linear rate $r$ in our proposed method is smaller than the previous rates of first-order methods for multiobjective optimization, which directly indicates its improved performance. We further validate these theoretical results through numerical experiments.
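
Illustrative sketch (not taken from the paper): the code below shows the classical single-objective Barzilai-Borwein stepsize $\alpha = s^\top s / s^\top y$, with $s$ the change in the iterate and $y$ the change in the gradient. It is not BBDMO or BBPGMO; the badly scaled quadratic, the safeguard on $s^\top y$, and the names `bb_gradient_descent` and `quad_grad` are assumptions chosen to show how the stepsize adapts to imbalanced curvature.

```python
# Hypothetical example: gradient descent with the BB1 stepsize on
# f(x) = 0.5 * (x1^2 + 100 * x2^2), a badly scaled quadratic.

def quad_grad(x):
    return [1.0 * x[0], 100.0 * x[1]]

def bb_gradient_descent(grad, x0, alpha0=1e-3, iters=200, tol=1e-12):
    x = list(x0)
    g = grad(x)
    alpha = alpha0  # initial stepsize before any BB information exists
    for _ in range(iters):
        if sum(gi * gi for gi in g) ** 0.5 < tol:
            break
        x_new = [xi - alpha * gi for xi, gi in zip(x, g)]
        g_new = grad(x_new)
        s = [a - b for a, b in zip(x_new, x)]  # change in iterate
        y = [a - b for a, b in zip(g_new, g)]  # change in gradient
        sy = sum(si * yi for si, yi in zip(s, y))
        if sy > 0.0:
            # BB1 stepsize; kept unchanged if the curvature estimate is unusable.
            alpha = sum(si * si for si in s) / sy
        x, g = x_new, g_new
    return x
```

The objective need not decrease at every iteration under this stepsize, which is why BB-type methods are usually paired with a nonmonotone safeguard in practice.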

5.Version 2.0 -- cashocs: A Computational, Adjoint-Based Shape Optimization and Optimal Control Software

2306.09828

Authors:Sebastian Blauth

Abstract: In this paper, we present version 2.0 of cashocs. Our software automates the solution of PDE constrained optimization problems for design optimization and optimal control. Since its inception, many new features and useful tools have been added to cashocs, making it even more flexible and efficient. The most significant additions are a framework for space mapping, the ability to solve topology optimization problems with a level-set approach, the support for parallelism via MPI, and the ability to handle additional (state) constraints. In this software update, we describe the key additions to cashocs, which is now even better-suited for solving complex PDE constrained optimization problems.

6.Distributionally Robust Airport Ground Holding Problem under Wasserstein Ambiguity Sets

2306.09836

Authors:Haochen Wu, Max Z. Li

Abstract: The airport ground holding problem seeks to minimize flight delay costs due to reductions in the capacity of airports. However, the critical input of future airport capacities is often difficult to predict, presenting a challenging yet realistic setting. Even when capacity predictions provide a distribution of possible capacity scenarios, such distributions may themselves be uncertain (e.g., distribution shifts). To address the problem of designing airport ground holding policies under distributional uncertainty, we formulate and solve the airport ground holding problem using distributionally robust optimization (DRO). We address the uncertainty in the airport capacity distribution by defining ambiguity sets based on the Wasserstein distance metric. We propose reformulations which integrate the ambiguity sets into the airport ground holding problem structure, and discuss discretization properties of the proposed model. We discuss comparisons (via numerical experiments) between ground holding policies and optimized costs derived through the deterministic, stochastic, and distributionally robust airport ground holding problems. Our experiments show that the DRO model outperforms the stochastic models when there is a significant difference between the empirical airport capacity distribution and the realized airport capacity distribution. We note that DRO can be a valuable tool for decision-makers seeking to design airport ground holding policies, particularly when the available data regarding future airport capacities are highly uncertain.

7.On integrality in semidefinite programming for discrete optimization

2306.09865

Authors:Frank de Meijer, Renata Sotirov

Abstract: It is well known that by adding integrality constraints to the semidefinite programming (SDP) relaxation of the max-cut problem, the resulting integer semidefinite program is an exact formulation of the problem. In this paper we show similar results for a wide variety of discrete optimization problems for which SDP relaxations have been derived. Based on a comprehensive study on discrete positive semidefinite matrices, we follow a generic approach to derive mixed-integer semidefinite programming (MISDP) formulations of binary quadratically constrained quadratic programs and binary quadratic matrix programs. Applying a problem-specific approach, we derive more compact MISDP formulations of several problems, such as the quadratic assignment problem, the graph partition problem and the integer matrix completion problem. We also show that several structured problems allow for novel compact MISDP formulations through the notion of association schemes. Complementary to the recent advances on algorithmic aspects related to MISDP, this work opens new perspectives on solution approaches for the problems considered here.

8.A Distributed Optimization Framework to Regulate the Electricity Consumption of a Residential Neighborhood

2306.09954

Authors:Erhan Can Ozcan, Ioannis Ch. Paschalidis

Abstract: Increased variability of electricity generation due to renewable sources requires either large amounts of stand-by production capacity or some form of demand response. For residential loads, space heating and cooling, water heating, electric vehicle charging, and routine appliances make up the bulk of the electricity consumption. Controlling these loads can reduce the peak load of a residential neighborhood and facilitate matching supply with demand. However, maintaining user comfort is important for ensuring user participation in such a program. This paper formulates a novel mixed integer linear programming problem to control the overall electricity consumption of a residential neighborhood by considering the users' comfort. To efficiently solve the problem for communities involving a large number of homes, a distributed optimization framework based on the Dantzig-Wolfe decomposition technique is developed. We demonstrate the load shaping capacity and the computational performance of the proposed optimization framework in a simulated environment.

Thu, 15 Jun 2023digest

1.Optimization on product manifolds under a preconditioned metric

2306.08873

Authors:Bin Gao, Renfeng Peng, Ya-xiang Yuan

Abstract: Since optimization on Riemannian manifolds relies on the chosen metric, it is appealing to know how the performance of a Riemannian optimization method varies with different metrics and how to carefully construct a metric such that a method can be accelerated. To this end, we propose a general framework for optimization problems on product manifolds where the search space is endowed with a preconditioned metric, and we develop the Riemannian gradient descent and Riemannian conjugate gradient methods under this metric. Specifically, the metric is constructed by an operator that aims to approximate the diagonal blocks of the Riemannian Hessian of the cost function, which has a preconditioning effect. We explain the relationship between the proposed methods and the variable metric methods, and show that various existing methods, e.g., the Riemannian Gauss--Newton method, can be interpreted by the proposed framework with specific metrics. In addition, we tailor new preconditioned metrics and adapt the proposed Riemannian methods to the canonical correlation analysis and the truncated singular value decomposition problems, and we propose the Gauss--Newton method to solve the tensor ring completion problem. Numerical results among these applications verify that a delicate metric does accelerate the Riemannian optimization methods.

2.Optimal control of port-Hamiltonian systems: energy, entropy, and exergy

2306.08914

Authors:Friedrich Philipp, Manuel Schaller, Karl Worthmann, Timm Faulwasser, Bernhard Maschke

Abstract: We consider irreversible and coupled reversible-irreversible nonlinear port-Hamiltonian systems and the respective sets of thermodynamic equilibria. In particular, we are concerned with optimal state transitions and output stabilization on finite-time horizons. We analyze a class of optimal control problems, where the performance functional can be interpreted as a linear combination of energy supply, entropy generation, or exergy supply. Our results establish the integral turnpike property towards the set of thermodynamic equilibria providing a rigorous connection of optimal system trajectories to optimal steady states. Throughout the paper, we illustrate our findings by means of two examples: a network of heat exchangers and a gas-piston system.

3.iNALM: An inexact Newton Augmented Lagrangian Method for Zero-One Composite Optimization

2306.08991

Authors:Penghe Zhang, Naihua Xiu, Hou-Duo Qi

Abstract: Zero-One Composite Optimization (0/1-COP) is a prototype of nonsmooth, nonconvex optimization problems and it has attracted much attention recently. The augmented Lagrangian Method (ALM) has stood out as a leading methodology for such problems. The main purpose of this paper is to extend the classical theory of ALM from smooth problems to 0/1-COP. We propose, for the first time, second-order optimality conditions for 0/1-COP. In particular, under a second-order sufficient condition (SOSC), we prove the R-linear convergence rate of the proposed ALM. In order to identify the subspace used in SOSC, we employ the proximal operator of the 0/1-loss function, leading to an active-set identification technique. Built around this identification process, we design practical stopping criteria for any algorithm to be used for the subproblem of ALM. We justify that Newton's method is an ideal candidate for the subproblem and it enjoys both global and local quadratic convergence. Those considerations result in an inexact Newton ALM (iNALM). The method of iNALM is unique in the sense that it is active-set based, it is inexact (hence more practical), and SOSC plays an important role in its R-linear convergence analysis. The numerical results on both simulated and real datasets show the fast running speed and high accuracy of iNALM when compared with several leading solvers.

4.Distributionally Robust Stratified Sampling for Stochastic Simulations with Multiple Uncertain Input Models

2306.09020

Authors:Seung Min Baik, Eunshin Byon, Young Myoung Ko

Abstract: This paper presents a robust version of the stratified sampling method when multiple uncertain input models are considered for stochastic simulation. Various variance reduction techniques have demonstrated their superior performance in accelerating simulation processes. Nevertheless, they often use a single input model and further assume that the input model is exactly known and fixed. We consider more general cases in which it is necessary to assess a simulation's response to a variety of input models, such as when evaluating the reliability of wind turbines under nonstationary wind conditions or the operation of a service system when the distribution of customer inter-arrival time is heterogeneous at different times. Moreover, the estimation variance may be considerably impacted by uncertainty in input models. To address such nonstationary and uncertain input models, we offer a distributionally robust (DR) stratified sampling approach with the goal of minimizing the maximum of worst-case estimator variances among plausible but uncertain input models. Specifically, we devise a bi-level optimization framework for formulating DR stochastic problems with different ambiguity set designs, based on the $L_2$-norm, the 1-Wasserstein distance, a parametric family of distributions, and distribution moments. In order to cope with the non-convexity of the objective function, we present a solution approach that uses Bayesian optimization. Numerical experiments and the wind turbine case study demonstrate the robustness of the proposed approach.

5.Kinetic based optimization enhanced by genetic dynamics

2306.09199

Authors:Giacomo Albi, Federica Ferrarese, Claudia Totzeck

Abstract: We propose and analyse a variant of the recently introduced kinetic based optimization method that incorporates ideas like survival-of-the-fittest and mutation strategies well known from genetic algorithms. Thus, we provide a first attempt to reach out from the class of consensus/kinetic-based algorithms towards genetic metaheuristics. Different generations of genetic algorithms are represented via two species identified with different labels; binary interactions are prescribed on the particle level, and we then derive a mean-field approximation in order to analyse the method in terms of convergence. Numerical results underline the feasibility of the approach and show in particular that the genetic dynamics improves the efficiency of this class of global optimization methods in terms of computational cost.

6.Two sided ergodic singular control and mean field game for diffusions

2306.09263

Authors:Sören Christensen, Ernesto Mordecki, Facundo Oliú Eguren

Abstract: Consider two independent controlled linear diffusions with the same dynamics and the same ergodic controls, the first corresponding to an individual player, the second to the market. Let us also consider a cost function that depends on the first diffusion and the expectation of the second one. In this framework, we study the mean-field game consisting in finding the equilibrium points where the controls chosen by the player to minimize an ergodic integrated cost coincide with the market controls. We first show that in the control problem, without market dependence, the best policy is to reflect the process within two boundaries. We use these results to get criteria for the optimal and market controls to coincide (i.e., equilibrium existence), and give a pair of nonlinear equations to find these equilibrium points. We also get criteria for the existence and uniqueness of equilibrium points for the mean-field games under study. These results are illustrated through several examples where the existence and uniqueness of the equilibrium points depend on the values of the parameters defining the underlying diffusion.

7.A Score-based Nonlinear Filter for Data Assimilation

2306.09282

Authors:Feng Bao, Zezhong Zhang, Guannan Zhang

Abstract: We introduce a score-based generative sampling method for solving the nonlinear filtering problem with robust accuracy. A major drawback of existing nonlinear filtering methods, e.g., particle filters, is their low stability. To overcome this issue, we adopt the diffusion model framework to solve the nonlinear filtering problem. Instead of storing the information of the filtering density in a finite number of Monte Carlo samples, in the score-based filter we store the information of the filtering density in the score model. Then, via the reverse-time diffusion sampler, we can generate unlimited samples to characterize the filtering density. Moreover, with the powerful expressive capabilities of deep neural networks, it has been demonstrated that a well-trained score in a diffusion model can produce samples from complex target distributions in very high dimensional spaces. Extensive numerical experiments show that our score-based filter could potentially address the curse of dimensionality in very high dimensional problems.

Tue, 13 Jun 2023digest

1.Equitable Optimization of Patient Re-allocation and Temporary Facility Placement to Maximize Critical Care System Resilience in Disasters

2306.07545

Authors:Chia-Fu Liu, Ali Mostafavi

Abstract: End-stage renal disease patients face a complicated sociomedical situation and rely on various forms of infrastructure for life-sustaining treatment. Disruption of these infrastructures during disasters poses a major threat to their lives. To improve patient access to dialysis treatment, there is a need to assess the potential threat to critical care facilities from hazardous events. In this study, we propose optimization models to solve critical care system resilience problems including patient and medical resource allocation. We use human mobility data in the context of Harris County (Texas) to assess patient access to critical care facilities (dialysis centers in this study) under the simulated hazard impacts, and we propose models for patient re-allocation and temporary medical facility placement to improve critical care system resilience in an equitable manner. The results show (1) the capability of the optimization model in efficient patient re-allocation to alleviate disrupted access to dialysis facilities; (2) the importance of large facilities in maintaining the functioning of the system: the critical care system, particularly the network of dialysis centers, is heavily reliant on a few larger facilities, making it susceptible to targeted disruption; (3) that the consideration of equity in the optimization model formulation reduces access loss for vulnerable populations in the simulated scenarios; and (4) that the proposed temporary facilities placement could improve access for the vulnerable population, thereby improving the equity of access to critical care facilities in disaster. The proposed patient re-allocation model and temporary facilities placement can serve as a data-driven and analytics-based decision support tool for public health and emergency management plans to reduce the loss of access and disrupted access to critical care facilities, and would reduce the dire social costs.

2.Efficient Algorithm for Solving Hyperbolic Programs

2306.07587

Authors:Yichuan Deng, Zhao Song, Lichen Zhang, Ruizhe Zhang

Abstract: Hyperbolic polynomials are a class of real-rooted polynomials that have a wide range of applications in theoretical computer science. Each hyperbolic polynomial also induces a hyperbolic cone that is of particular interest in optimization due to its generality, as by choosing the polynomial properly, one can easily recover classic optimization problems such as linear programming and semidefinite programming. In this work, we develop efficient algorithms for hyperbolic programming, the problem in which one wants to minimize a linear objective under a system of linear constraints, and the solution must be in the hyperbolic cone induced by the hyperbolic polynomial. Our algorithm is an instance of the interior point method (IPM) that, instead of following the central path, follows the central Swath, which is a generalization of the central path. To implement the IPM efficiently, we utilize a relaxation of the hyperbolic program to a quadratic program, coupled with the first four moments of the hyperbolic eigenvalues that are crucial to update the optimization direction. We further show that, given an evaluation oracle of the polynomial, our algorithm only requires $O(n^2d^{2.5})$ oracle calls, where $n$ is the number of variables and $d$ is the degree of the polynomial, with extra $O((n+m)^3 d^{0.5})$ arithmetic operations, where $m$ is the number of constraints.

3.Two-step inertial Bregman proximal alternating linearized minimization algorithm for nonconvex and nonsmooth problems

2306.07614

Authors:Chenzheng Guo, Jing Zhao

Abstract: In this paper, we study an algorithm for solving a class of nonconvex and nonsmooth nonseparable optimization problems. Based on proximal alternating linearized minimization (PALM), we propose a new iterative algorithm which combines two-step inertial extrapolation and Bregman distance. By constructing an appropriate benefit function, and with the help of the Kurdyka-Łojasiewicz property, we establish the convergence of the whole sequence generated by the proposed algorithm. We apply the algorithm to signal recovery and quadratic fractional programming problems and show the effectiveness of the proposed algorithm.

4.Convergence to consensus results for Hegselmann-Krause type models with attractive-lacking interaction

2306.07658

Authors:Elisa Continelli, Cristina Pignotti

Abstract: In this paper, we analyze a Hegselmann-Krause opinion formation model with attractive-lacking interaction. More precisely, we investigate the situation in which the individuals involved in an opinion formation process interact among themselves but may suspend the exchange of information with each other at some times. Under quite general assumptions, we prove exponential convergence to consensus for the Hegselmann-Krause model in the presence of a possible lack of interaction. We then extend the analysis to an analogous model in the presence of time delays.
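
The mechanism described in this abstract can be illustrated with a small simulation. The sketch below is not the authors' model: it assumes the classic discrete-time, bounded-confidence Hegselmann-Krause update (the paper works in continuous time), and the names `hk_step`, `simulate`, `diameter`, the confidence `radius`, and the on/off `schedule` are all hypothetical choices made for illustration. It only shows how opinions can still contract while interaction is suspended on some steps.

```python
def hk_step(opinions, radius, interacting=True):
    """One synchronous Hegselmann-Krause update (discrete-time assumption).

    Each agent moves to the average of all opinions within `radius` of its
    own (itself included). If `interacting` is False, the exchange of
    information is suspended and the opinions are returned unchanged.
    """
    if not interacting:
        return list(opinions)
    updated = []
    for x_i in opinions:
        neighbours = [x_j for x_j in opinions if abs(x_j - x_i) <= radius]
        updated.append(sum(neighbours) / len(neighbours))
    return updated


def simulate(opinions, radius, schedule):
    """Apply one update per entry of `schedule` (True means interaction on)."""
    for interacting in schedule:
        opinions = hk_step(opinions, radius, interacting)
    return opinions


def diameter(opinions):
    """Spread of the opinions; zero means consensus."""
    return max(opinions) - min(opinions)


if __name__ == "__main__":
    start = [0.0, 0.2, 0.5, 0.7, 1.0]
    # Hypothetical schedule: interaction is switched off on every third step.
    schedule = [step % 3 != 2 for step in range(30)]
    final = simulate(start, radius=0.4, schedule=schedule)
    print("initial diameter:", diameter(start))
    print("final diameter:", diameter(final))
```

In this toy setting the opinion diameter never grows during a suspended step, so the contraction achieved while agents interact is preserved; the exponential rate and the precise assumptions on the interaction gaps are the subject of the paper itself.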

5.Adaptive Stochastic Optimization Algorithms for Problems with Biased Oracles

2306.07810

Authors:Yin Liu, Sam Davanloo Tajbakhsh

Abstract: Motivated by multiple emerging applications in machine learning, we consider an optimization problem in a general form where the gradient of the objective is only available through a biased stochastic oracle. We assume the bias magnitude can be controlled by a parameter; however, lower bias requires more computation/samples. For instance, for two applications on stochastic composition optimization and policy optimization for infinite-horizon Markov decision processes, we show that the bias follows a power law and exponential decay, respectively, as functions of their corresponding bias control parameters. For problems with such gradient oracles, the paper proposes two stochastic algorithms that adaptively adjust the bias control parameter throughout the iterations. We analyze the nonasymptotic performance of the proposed algorithms in the nonconvex regime and establish $\mathcal{O}(\epsilon^{-4})$ and (optimal) $\mathcal{O}(\epsilon^{-3})$ sample complexity to obtain an $\epsilon$-stationary point. Finally, we numerically evaluate the performance of the proposed algorithms over the two applications.

6.Galerkin-like method for integro-differential inclusions with application to state-dependent sweeping processes

2306.07821

Authors:Pedro Pérez-Aros, Manuel Torres-Valdebenito, Emilio Vilches

Abstract: In this paper, we develop the Galerkin-like method to deal with first-order integro-differential inclusions. Under compactness or monotonicity conditions, we obtain new results for the existence of solutions for this class of problems, which generalize existing results in the literature and give new insights for differential inclusions with an unbounded right-hand side. The effectiveness of the proposed approach is illustrated by providing new results for nonconvex state-dependent integro-differential sweeping processes, where the right-hand side is unbounded and the classical theory of differential inclusions is not applicable. It is the first result of this kind. The paper ends with an application to the existence of solutions for an optimal control problem governed by an integro-differential inclusion in finite dimensions.

7.Globally convergent homotopies for discrete-time optimal control

2306.07852

Authors:Willem Esterhuizen, Kathrin Flaßkamp, Matthias Hoffmann, Karl Worthmann

Abstract: Homotopy methods are attractive due to their capability of solving difficult optimization and optimal control problems. The underlying idea is to construct a homotopy, which may be considered as a continuous (zero) curve between the difficult original problem and a related, comparatively easy one. Then, the solution of the easier one is continuously perturbed along the zero curve towards the desired solution of the difficult problem. We propose a methodology for the systematic construction of such zero curves for discrete-time optimal control problems, drawing upon the theory of globally convergent homotopies for nonlinear programs. This framework ensures that for almost every easy solution there exists a suitable homotopy path that is, in addition, numerically tractable. We demonstrate the results by solving a difficult path planning problem.

8.Symmetry & Critical Points for Symmetric Tensor Decomposition Problems

2306.07886

Authors:Yossi Arjevani, Gal Vinograd

Abstract: We consider the non-convex optimization problem associated with the decomposition of a real symmetric tensor into a sum of rank one terms. Use is made of the rich symmetry structure to derive Puiseux series representations of families of critical points, and so obtain precise analytic estimates on the critical values and the Hessian spectrum. The sharp results make possible an analytic characterization of various geometric obstructions to local optimization methods, revealing in particular a complex array of saddles and local minima which differ by their symmetry, structure and analytic properties. A desirable phenomenon, occurring for all critical points considered, concerns the index of a point, i.e., the number of negative Hessian eigenvalues, increasing with the value of the objective function. Lastly, a Newton polytope argument is used to give a complete enumeration of all critical points of fixed symmetry, and it is shown that, contrary to the set of global minima, which remains invariant under different choices of tensor norms, certain families of non-global minima emerge while others disappear.

Mon, 12 Jun 2023digest

1.Convergence Rates of the Regularized Optimal Transport: Disentangling Suboptimality and Entropy

2306.06940

Authors:Hugo Malamut (CEREMADE), Maxime Sylvestre (CEREMADE)

Abstract: We study the convergence of the transport plans $\gamma_\epsilon$ towards $\gamma_0$ as well as the cost of the entropy-regularized optimal transport $(c, \gamma_\epsilon)$ towards $(c, \gamma_0)$ as the regularization parameter $\epsilon$ vanishes in the setting of finite entropy marginals. We show that under the assumption of infinitesimally twisted cost and compactly supported marginals the distance $W_2(\gamma_\epsilon, \gamma_0)$ is asymptotically greater than $C\sqrt{\epsilon}$ and the suboptimality $(c, \gamma_\epsilon) - (c, \gamma_0)$ is of order $\epsilon$. In the quadratic cost case the compactness assumption is relaxed into a moment of order $2 + \delta$ assumption. Moreover, in the case of a Lipschitz transport map for the non-regularized problem, the distance $W_2(\gamma_\epsilon, \gamma_0)$ converges to 0 at rate $\sqrt{\epsilon}$. Finally, if in addition the marginals have finite Fisher information, we prove $(c, \gamma_\epsilon) - (c, \gamma_0) \sim d\epsilon/2$ and we provide a companion expansion of $H(\gamma_\epsilon)$. These results are achieved by disentangling the role of the cost and the entropy in the regularized problem.

2.Sensitivity Analysis in Parametric Convex Vector Optimization

2306.06947

Authors:Duong Thi Viet An, Le Thanh Tung

Abstract: In this paper, sensitivity analysis of the efficient sets in parametric convex vector optimization is considered. Namely, the perturbation, weak perturbation, and proper perturbation maps are defined as set-valued maps. We establish formulas for computing the Fréchet coderivative of the profile of the above three kinds of perturbation maps. Because of the convexity assumptions, the conditions imposed are fairly simple compared to those in the general case. In addition, our conditions are stated directly on the data of the problem. It is worth emphasizing that our approach is based on convex analysis tools which are different from those in the general case.

3.Towards continuous-time MPC: a novel trajectory optimization algorithm

2306.07107

Authors:Souvik Das, Siddhartha Ganguly, Muthyala Anjali, Debasish Chatterjee

Abstract: This article introduces a numerical algorithm that serves as a preliminary step toward solving continuous-time model predictive control (MPC) problems directly without explicit time-discretization. The chief ingredients of the underlying optimal control problem (OCP) are a linear time-invariant system, quadratic instantaneous and terminal cost functions, and convex path constraints. The thrust of the method involves finitely parameterizing the admissible space of control trajectories and solving the OCP satisfying the given constraints at every time instant in a tractable manner without explicit time-discretization. The ensuing OCP turns out to be a convex semi-infinite program (SIP), and some recently developed results are employed to obtain an optimal solution to this convex SIP. Numerical illustrations on some benchmark models are included to show the efficacy of the algorithm.

4.An agent-based decentralized threshold policy finding the constrained shortest paths

2306.07139

Authors:Francesca Rosset, Raffaele Pesenti, Franco Blanchini

Abstract: We consider a problem where autonomous agents enter a dynamic and unknown environment described by a network of weighted arcs. These agents move within the network from node to node according to a decentralized policy using only local information, with the goal of finding a path to an unknown sink node to leave the network. This policy makes each agent move to some adjacent node or stop at the current node. The transition along an arc is allowed or denied based on a threshold mechanism that takes into account the number of agents already accumulated in the arc's end nodes and the arc's weight. We show that this policy ensures path-length optimality in the sense that, in finite time, all new agents entering the network reach the closest sinks by the shortest paths. Our approach is later extended to support constraints on the paths that agents can follow.

5.On the Computation-Communication Trade-Off with A Flexible Gradient Tracking Approach

2306.07159

Authors:Yan Huang, Jinming Xu

Abstract: We propose a flexible gradient tracking approach with adjustable computation and communication steps for solving distributed stochastic optimization problems over networks. The proposed method allows each node to perform multiple local gradient updates and multiple inter-node communications in each round, aiming to strike a balance between computation and communication costs according to the properties of objective functions and network topology in non-i.i.d. settings. Leveraging a properly designed Lyapunov function, we derive both the computation and communication complexities for achieving arbitrary accuracy on smooth and strongly convex objective functions. Our analysis demonstrates sharp dependence of the convergence performance on graph topology and properties of objective functions, highlighting the trade-off between computation and communication. Numerical experiments are conducted to validate our theoretical findings.

6.Analysis of the vanishing discount limit for optimal control problems in continuous and discrete time

2306.07234

Authors:Piermarco Cannarsa, Stephane Gaubert, Cristian Mendico, Marc Quincampoix

Abstract: A classical problem in ergodic continuous time control consists of studying the limit behavior of the optimal value of a discounted cost functional with infinite horizon as the discount factor $\lambda$ tends to zero. In the literature, this problem has been addressed under various controllability or ergodicity conditions ensuring that the rescaled value function converges uniformly to a constant limit. In this case the limit can be characterized as the unique constant such that a suitable Hamilton-Jacobi equation has at least one continuous viscosity solution. In this paper, we study this problem without such conditions, so that the aforementioned limit need not be constant. Our main result characterizes the uniform limit (when it exists) as the maximal subsolution of a system of Hamilton-Jacobi equations. Moreover, when such a subsolution is a viscosity solution, we obtain the convergence of optimal values as well as a rate of convergence. This mirrors the analysis of the discrete time case, where we characterize the uniform limit as the supremum over a set of sub-invariant half-lines of the dynamic programming operator. The emerging structure in both discrete and continuous time models shows that the supremum over sub-invariant half-lines with respect to the Lax-Oleinik semigroup/dynamic programming operator captures the behavior of the limit cost as the discount vanishes.

Fri, 09 Jun 2023digest

1.An Accelerated Stochastic ADMM for Nonconvex and Nonsmooth Finite-Sum Optimization

2306.05899

Authors:Yuxuan Zeng, Zhiguo Wang, Jianchao Bai, Xiaojing Shen

Abstract: The nonconvex and nonsmooth finite-sum optimization problem with linear constraint has attracted much attention in the fields of artificial intelligence, computer science, and mathematics, due to its wide applications in machine learning and the lack of efficient algorithms with convincing convergence theories. A popular approach to solve it is the stochastic Alternating Direction Method of Multipliers (ADMM), but most stochastic ADMM-type methods focus on convex models. In addition, variance reduction (VR) and acceleration techniques are useful tools in the development of stochastic methods due to their simplicity and practicability in providing acceleration characteristics for various machine learning models. However, it remains unclear whether the accelerated SVRG-ADMM algorithm (ASVRG-ADMM), which extends SVRG-ADMM by incorporating momentum techniques, exhibits a comparable acceleration characteristic or convergence rate in the nonconvex setting. To fill this gap, we consider a general nonconvex nonsmooth optimization problem and study the convergence of ASVRG-ADMM. By utilizing a well-defined potential energy function, we establish its sublinear convergence rate $O(1/T)$, where $T$ denotes the iteration number. Furthermore, under the additional Kurdyka-Łojasiewicz (KL) property, which is less stringent than the conditions frequently used to show linear convergence rates, such as strong convexity, we show that the ASVRG-ADMM sequence has finite length and converges to a stationary solution at a linear convergence rate. Several experiments on solving the graph-guided fused lasso problem and the regularized logistic regression problem validate that the proposed ASVRG-ADMM performs better than state-of-the-art methods.

2.Robust Data-driven Prescriptiveness Optimization

2306.05937

Authors:Mehran Poursoltani, Erick Delage, Angelos Georghiou

Abstract: The abundance of data has led to the emergence of a variety of optimization techniques that attempt to leverage available side information to provide more anticipative decisions. The wide range of methods and contexts of application has motivated the design of a universal unitless measure of performance known as the coefficient of prescriptiveness. This coefficient was designed to quantify both the quality of contextual decisions compared to a reference one and the prescriptive power of side information. To identify policies that maximize the former in a data-driven context, this paper introduces a distributionally robust contextual optimization model where the coefficient of prescriptiveness substitutes for the classical empirical risk minimization objective. We present a bisection algorithm to solve this model, which relies on solving a series of linear programs when the distributional ambiguity set has an appropriate nested form and polyhedral structure. Studying a contextual shortest path problem, we evaluate the robustness of the resulting policies against alternative methods when the out-of-sample dataset is subject to varying amounts of distribution shift.

3.Lifting partial smoothing to solve HJB equations and stochastic control problems

2306.06016

Authors:Fausto Gozzi, Federica Masiero

Abstract: We study a family of stochastic control problems arising in typical applications (such as boundary control and control of delay equations with delay in the control) with the ultimate aim of finding solutions of the associated HJB equations that are regular enough to yield optimal feedback controls. These problems are difficult to treat since the underlying transition semigroups possess neither good smoothing properties nor the so-called "structure condition", which typically allows one to apply the backward equations approach. In the papers [14], [15], and, more recently, [16] we studied such problems, developing new partial smoothing techniques which allowed us to obtain the required regularity in the case when the cost functional is independent of the state variable. This is a somewhat strong restriction which is not satisfied in most applications. In this paper (which can be considered a continuation of the research of the above papers) we develop a new approach to overcome this restriction. We extend the partial smoothing result to a wider class of functions which depend on the whole trajectory of the underlying semigroup, and we use this as a key tool to improve our regularity result for the HJB equation. The fact that such a class depends on trajectories requires nontrivial technical work, as we have to lift the original transition semigroup to a space of trajectories, defining a new "high-level" environment where our problems can be solved.

4.Branching via Cutting Plane Selection: Improving Hybrid Branching

2306.06050

Authors:Mark Turner, Timo Berthold, Mathieu Besançon, Thorsten Koch

Abstract: Cutting planes and branching are two of the most important algorithms for solving mixed-integer linear programs. For both algorithms, disjunctions play an important role, being used both as branching candidates and as the foundation for some cutting planes. We relate branching decisions and cutting planes to each other through the underlying disjunctions that they are based on, with a focus on Gomory mixed-integer cuts and their corresponding split disjunctions. We show that selecting branching decisions based on quality measures of Gomory mixed-integer cuts leads to relatively small branch-and-bound trees, and that the result improves when using cuts that more accurately represent the branching decisions. Finally, we show how the history of previously computed Gomory mixed-integer cuts can be used to improve the performance of the state-of-the-art hybrid branching rule of SCIP. Our results show a 4% decrease in solve time and an 8% decrease in the number of nodes over affected instances of MIPLIB 2017.

Thu, 08 Jun 2023digest

1.Communication-Efficient Gradient Descent-Ascent Methods for Distributed Variational Inequalities: Unified Analysis and Local Updates

2306.05100

Authors:Siqi Zhang, Sayantan Choudhury, Sebastian U Stich, Nicolas Loizou

Abstract: Distributed and federated learning algorithms and techniques are associated primarily with minimization problems. However, with the increase of minimax optimization and variational inequality problems in machine learning, the necessity of designing efficient distributed/federated learning approaches for these problems is becoming more apparent. In this paper, we provide a unified convergence analysis of communication-efficient local training methods for distributed variational inequality problems (VIPs). Our approach is based on a general key assumption on the stochastic estimates that allows us to propose and analyze several novel local training algorithms under a single framework for solving a class of structured non-monotone VIPs. We present the first local gradient descent-ascent algorithms with provably improved communication complexity for solving distributed variational inequalities on heterogeneous data. The general algorithmic framework recovers state-of-the-art algorithms and their sharp convergence guarantees when the setting is specialized to minimization or minimax optimization problems. Finally, we demonstrate the strong performance of the proposed algorithms compared to state-of-the-art methods when solving federated minimax optimization problems.

2.Zero-sum stopper vs. singular-controller games with constrained control directions

2306.05113

Authors:Andrea Bovo, Tiziano De Angelis, Jan Palczewski

Abstract: We consider a class of zero-sum stopper vs. singular-controller games in which the controller can only act on a subset $d_0<d$ of the $d$ coordinates of a controlled diffusion. Due to the constraint on the control directions, these games fall outside the framework of recently studied variational methods. In this paper we develop an approximation procedure based on $L^1$-stability estimates for the controlled diffusion process and almost sure convergence of suitable stopping times. This allows us to prove existence of the game's value and to obtain an optimal strategy for the stopper, under continuity and growth conditions on the payoff functions. This class of games is a natural extension of (single-agent) singular control problems, studied in the literature, with similar constraints on the admissible controls.

3.On the Identification and Optimization of Nonsmooth Superposition Operators in Semilinear Elliptic PDEs

2306.05185

Authors:Constantin Christof, Julia Kowalczyk

Abstract: We study an infinite-dimensional optimization problem that aims to identify the Nemytskii operator in the nonlinear part of a prototypical semilinear elliptic partial differential equation (PDE) which minimizes the distance between the PDE solution and a given desired state. In contrast to previous works, we consider this identification problem in a low-regularity regime in which the function inducing the Nemytskii operator is a priori only known to be an element of $H^1_{loc}(\mathbb{R})$. This makes the studied problem class a suitable point of departure for the rigorous analysis of training problems for learning-informed PDEs in which an unknown superposition operator is approximated by means of a neural network with nonsmooth activation functions (ReLU, leaky-ReLU, etc.). We establish that, despite the low regularity of the controls, it is possible to derive a classical stationarity system for local minimizers and to solve the considered problem by means of a gradient projection method. The convergence of the resulting algorithm is proven in the function space setting. It is also shown that the established first-order necessary optimality conditions imply that locally optimal superposition operators share various characteristic properties with commonly used activation functions: they are always sigmoidal, continuously differentiable away from the origin, and typically possess a distinct kink at zero. The paper concludes with numerical experiments which confirm the theoretical findings.

4.Safe Adaptive Multi-Agent Coverage Control

2306.05187

Authors:Yang Bai, Yujie Wang, Xiaogang Xiong, Mikhail Svinin

Abstract: This paper presents a safe adaptive coverage controller for multi-agent systems with actuator faults and time-varying uncertainties. The centroidal Voronoi tessellation (CVT) is applied to generate an optimal configuration of multi-agent systems for covering an area of interest. As a conventional CVT-based controller cannot prevent collisions between agents with non-zero size, a control barrier function (CBF) based controller is developed to ensure collision avoidance, with a function approximation technique (FAT) based design to deal with system uncertainties. The proposed controller is verified in simulations.

Wed, 07 Jun 2023digest

1.End-to-End Learning for Stochastic Optimization: A Bayesian Perspective

2306.04174

Authors:Yves Rychener, Daniel Kuhn, Tobias Sutter

Abstract: We develop a principled approach to end-to-end learning in stochastic optimization. First, we show that the standard end-to-end learning algorithm admits a Bayesian interpretation and trains a posterior Bayes action map. Building on the insights of this analysis, we then propose new end-to-end learning algorithms for training decision maps that output solutions of empirical risk minimization and distributionally robust optimization problems, two dominant modeling paradigms in optimization under uncertainty. Numerical results for a synthetic newsvendor problem illustrate the key differences between alternative training schemes. We also investigate an economic dispatch problem based on real data to showcase the impact of the neural network architecture of the decision maps on their test performance.

2.Two-step inertial Bregman alternating structure-adapted proximal gradient descent algorithm for nonconvex and nonsmooth problems

2306.04208

Authors:Chenzheng Guo, Jing Zhao

Abstract: In this paper, we introduce several accelerated iterative algorithms for solving the multiple-set split common fixed-point problem of quasi-nonexpansive operators in real Hilbert space. Based on the primal-dual method, we construct several iterative algorithms that combine an inertial technique with a self-adaptive stepsize, so that implementing the algorithms does not require any prior information about the norm of the bounded linear operator. Under suitable assumptions, weak convergence of the proposed algorithms is established. As applications, we obtain corresponding iterative algorithms to solve the multiple-set split feasibility problem. Finally, the performance of the proposed algorithms is illustrated by numerical experiments.

3.Input Rate Control in Stochastic Road Traffic Networks: Effective Bandwidths

2306.04218

Authors:Nikki Levering, Rudesindo Núñez-Queija

Abstract: In road traffic networks, large traffic volumes may lead to extreme delays. These severe delays are caused by the fact that, whenever the maximum capacity of a road is approached, speeds drop rapidly. Therefore, the focus in this paper is on real-time control of traffic input rates, thereby aiming to prevent such detrimental capacity drops. To account for the fact that, due to the heterogeneity within and between traffic streams, the available capacity of a road is subject to randomness, we introduce a stochastic flow model that describes the impact of traffic input streams on the available road capacities. Then, exploiting similarities with traffic control of telecommunication networks, in which the available bandwidth is a stochastic function of the input rate, and in which the use of effective bandwidths has proven an effective input rate control framework, we propose a similar traffic rate control policy based on the concept of effective bandwidths. This policy allows for increased waiting times at the access boundaries of the network, so as to limit the probability of large delays within the network. Numerical examples show that, by applying such a control policy, capacity violations are indeed rare, and that the increased waiting at the boundaries of the network is of much smaller scale than uncontrolled delays in the network.

4.A Decomposition Approach to Last Mile Delivery Using Public Transportation Systems

2306.04219

Authors:Minakshi Punam Mandal, Claudia Archetti

Abstract: This study explores the potential of using public transportation systems for freight delivery, where we intend to utilize the spare capacities of public vehicles like buses, trams, metros, and trains, particularly during off-peak hours, to transport packages within the city instead of using dedicated delivery vehicles. The study contributes to the growing literature on innovative strategies for performing sustainable last mile deliveries. We study an operational level problem called the Three-Tier Delivery Problem on Public Transportation, where packages are first transported from the Consolidation and Distribution Center (CDC) to nearby public vehicle stations by delivery trucks. From there, public vehicles transport them into the city area. The last leg of the delivery is performed to deliver the packages to their respective customers using green vehicles or eco-friendly systems. We propose mixed-integer linear programming formulations to study the transport of packages from the CDC to the customers, use decomposition approaches to solve them, and provide numerical experiments to demonstrate the efficiency and effectiveness of the system. Our results show that this system has the potential to drastically reduce the length of trips performed by dedicated delivery vehicles, thereby reducing the negative social and environmental impacts of existing last mile delivery systems.

5.Distributed accelerated proximal conjugate gradient methods for multi-agent constrained optimization problems

2306.04230

Authors:Anteneh Getachew Gebrie

Abstract: The purpose of this paper is to introduce two new classes of accelerated distributed proximal conjugate gradient algorithms for multi-agent constrained optimization problems, given as the minimization of a function decomposed as a sum of $M$ smooth and $M$ nonsmooth functions over the common fixed points of $M$ nonlinear mappings. Exploiting the special properties of each agent's cost component function and of the nonlinear mapping defining each agent's constraint, new inertial accelerated incremental and parallel distributed algorithms are presented, based on combinations of proximal, conjugate gradient, and Halpern methods. Some numerical experiments and comparisons are given to illustrate our results.

6.A Hierarchical OPF Algorithm with Improved Gradient Evaluation in Three-Phase Networks

2306.04350

Authors:Heng Liang, Xinyang Zhou, Changhong Zhao

Abstract: Linear approximation, commonly used in solving alternating-current optimal power flow (AC-OPF), simplifies the system models but incurs accumulated voltage errors in large power networks. Such errors make primal-dual type gradient algorithms converge to solutions at which the power networks may be exposed to the risk of voltage violation. In this paper, we improve a recent hierarchical OPF algorithm that relies on primal-dual gradients evaluated with a linearized distribution power flow model. Specifically, we propose a more accurate gradient evaluation method based on a three-phase unbalanced nonlinear distribution power flow model to mitigate the errors arising from model linearization. The resultant gradients feature a blocked structure that enables us to further develop an improved hierarchical primal-dual algorithm to solve the OPF problem. Numerical results on the IEEE 123-bus test feeder and a 4,518-node test feeder show that the proposed method can enhance the overall voltage safety while achieving computational efficiency comparable to that of the linearized algorithm.

7.Comparison of SeDuMi and SDPT3 Solvers for Stability of Continuous-time Linear System

2306.04531

Authors:Guangda Xu

Abstract: SeDuMi and SDPT3 are two solvers for Semi-definite Programming (SDP) or Linear Matrix Inequality (LMI) problems. A computational performance comparison of the two is undertaken in this paper for the stability of continuous-time linear systems. The comparison mainly focuses on computational times and memory requirements for different scales of problems. To implement and compare the two solvers on a set of well-posed problems, we employ YALMIP, a widely used toolbox for modeling and optimization in MATLAB. The primary goal of this study is to provide an empirical assessment of the relative computational efficiency of SeDuMi and SDPT3 under varying problem conditions. Our evaluation indicates that SDPT3 performs much better in large-scale, high-precision calculations.
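
For orientation, the abstract does not spell out which LMI is solved. The block below states the standard Lyapunov inequality for stability of a continuous-time linear system, which is assumed here to be the kind of problem handed to the solvers; the symbols $A$, $P$, and $n$ are notation introduced for this note, not taken from the paper.

```latex
% Standard Lyapunov LMI (assumed form; not quoted from the paper).
% The system \dot{x}(t) = A x(t), with A \in \mathbb{R}^{n \times n},
% is asymptotically stable if and only if there exists a symmetric P with
\[
  P \succ 0, \qquad A^{\top} P + P A \prec 0 .
\]
% The unknown P has n(n+1)/2 free entries, so the SDP grows
% quadratically with the state dimension n.
```

Under this assumption, the number of decision variables grows as $n(n+1)/2$, which is one reason computation time and memory requirements are natural axes of comparison as the problem scale increases.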

8.The lifted functional approach to mean field games with common noise

2306.04560

Authors:Mark Cerenzia, Aaron Palmer

Abstract: We introduce a new path-by-path approach to mean field games with common noise that recovers duality at the pathwise level. We verify this perspective by explicitly solving some difficult examples with linear-quadratic data, including control in the volatility coefficient of the common noise as well as the constraint of partial information. As an application, we establish the celebrated separation principle in the latter context. In pursuing this program, we believe we have made a crucial contribution to clarifying the notion of regular solution in the path dependent PDE literature.

Tue, 06 Jun 2023digest

1.New Relaxation Modulus Based Iterative Method for Large and Sparse Implicit Complementarity Problem

2306.03563

Authors:Bharat Kumar, Deepmala, A. K. Das

Abstract: This article presents a class of new relaxation modulus-based iterative methods to process the large and sparse implicit complementarity problem (ICP). Using two positive diagonal matrices, we formulate a fixed-point equation and prove that it is equivalent to ICP. Also, we provide sufficient convergence conditions for the proposed methods when the system matrix is a $P$-matrix or an $H_+$-matrix. Keyword: Implicit complementarity problem, $H_{+}$-matrix, $P$-matrix, matrix splitting, convergence

2.Weak KAM Theory and Aubry-Mather Theory for sub-Riemannian control systems

2306.03808

Authors:Piermarco Cannarsa, Cristian Mendico

Abstract: The aim of this work is to provide a systemic study and generalization of the celebrated weak KAM theory and Aubry-Mather theory in the sub-Riemannian setting, or equivalently, on a Carnot-Carathéodory metric space. In this framework we consider an optimal control problem with state equation of sub-Riemannian type, namely, admissible trajectories are solutions of an ODE that is linear in the control and nonlinear in space. Such a nonlinearity is given by a family of smooth vector fields satisfying the Hörmander condition, which implies the controllability of the system. In this case, the Hamiltonian function associated with the above control problem fails to be coercive and thus the results in the Tonelli setting cannot be applied. In order to overcome this issue, our approach is based on metric properties of the geometry induced on the state space by the sub-Riemannian structure.

3.Characterization of transport optimizers via graphs and applications to Stackelberg-Cournot-Nash equilibria

2306.03843

Authors:Beatrice Acciaio, Berenice Anne Neumann

Abstract: We introduce graphs associated to transport problems between discrete marginals that allow us to characterize the set of all optimizers given one primal optimizer. In particular, we establish that connectivity of those graphs is a necessary and sufficient condition for uniqueness of the dual optimizers. Moreover, we provide an algorithm that can efficiently compute the dual optimizer that is the limit, as the regularization parameter goes to zero, of the dual entropic optimizers. Our results find an application in a Stackelberg-Cournot-Nash game, for which we obtain existence and characterization of the equilibria.

Mon, 05 Jun 2023digest

1.On the convergence of the $k$-point bound for topological packing graphs

2306.02725

Authors:Bram Bekker, Fernando Mário de Oliveira Filho

Abstract: We show that the $k$-point bound of de Laat, Machado, Oliveira, and Vallentin, a hierarchy of upper bounds for the independence number of a topological packing graph derived from the Lasserre hierarchy, converges to the independence number.

2.On the Split Closure of the Periodic Timetabling Polytope

2306.02746

Authors:Niels Lindner, Berenike Masing

Abstract: The Periodic Event Scheduling Problem (PESP) is the central mathematical tool for periodic timetable optimization in public transport. PESP can be formulated in several ways as a mixed-integer linear program with typically general integer variables. We investigate the split closure of these formulations and show that split inequalities are identical with the recently introduced flip inequalities. While split inequalities are a general mixed-integer programming technique, flip inequalities are defined in purely combinatorial terms, namely cycles and arc sets of the digraph underlying the PESP instance. It is known that flip inequalities can be separated in pseudo-polynomial time. We prove that this is best possible unless P $=$ NP, but also observe that the complexity becomes linear-time if the cycle defining the flip inequality is fixed. Moreover, introducing mixed-integer-compatible maps, we compare the split closures of different formulations, and show that reformulation or binarization by subdivision do not lead to stronger split closures. Finally, we estimate computationally how much of the optimality gap of the instances of the benchmark library PESPlib can be closed exclusively by split cuts, and provide better dual bounds for five instances.
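
For orientation, the textbook mixed-integer form of PESP can be written as follows. This is the standard formulation, not necessarily the exact variant analyzed in the paper, and the symbols ($\pi_i$ for event times, $p_a$ for the integer periodic offsets, $T$ for the period, $[\ell_a, u_a]$ for activity bounds, $w_a$ for weights) are notation assumed here.

```latex
\[
\begin{aligned}
\min_{\pi,\,p}\quad & \sum_{a=(i,j)\in A} w_a \,\bigl(\pi_j - \pi_i + T p_a\bigr) \\
\text{s.t.}\quad & \ell_a \le \pi_j - \pi_i + T p_a \le u_a, && a=(i,j)\in A, \\
& 0 \le \pi_i \le T-1, && i \in V, \\
& p_a \in \mathbb{Z}, && a \in A.
\end{aligned}
\]
```

The variables $p_a$ are the general integer variables mentioned in the abstract; split and flip inequalities are cutting planes for formulations of this kind.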

3.Tight Big-Ms for Optimal Transmission Switching

2306.02784

Authors:Salvador Pineda, Juan Miguel Morales, Álvaro Porras, Concepción Domínguez

Abstract: This paper addresses the Optimal Transmission Switching (OTS) problem in electricity networks, which aims to find an optimal power grid topology that minimizes system operation costs while satisfying physical and operational constraints. Existing methods typically convert the OTS problem into a Mixed-Integer Linear Program (MILP) using big-M constants. However, the computational performance of these approaches relies significantly on the tightness of these big-Ms. In this paper, we propose an iterative tightening strategy to strengthen the big-Ms by efficiently solving a series of bounding problems that account for the economics of the OTS objective function through an upper-bound on the generating cost. We also discuss how the performance of the proposed tightening strategy is enhanced if reduced line capacities are considered. Using the 118-bus test system we demonstrate that the proposed methodology outperforms existing approaches, offering tighter bounds and significantly reducing the computational burden of the OTS problem.
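
To make the role of the big-M constants concrete, the usual DC formulation of a switchable line $(i,j)$ is sketched below. The notation is assumed here rather than taken from the paper: $x_{ij}\in\{0,1\}$ is the switching status, $f_{ij}$ the flow, $b_{ij}$ the susceptance, $\theta_i$ the voltage angle, and $\bar{f}_{ij}$ the line capacity.

```latex
\[
\begin{aligned}
b_{ij}(\theta_i - \theta_j) - M_{ij}(1 - x_{ij}) \;\le\; f_{ij} \;&\le\; b_{ij}(\theta_i - \theta_j) + M_{ij}(1 - x_{ij}), \\
-\bar{f}_{ij}\, x_{ij} \;\le\; f_{ij} \;&\le\; \bar{f}_{ij}\, x_{ij}.
\end{aligned}
\]
```

When $x_{ij}=1$ the first pair of inequalities enforces the flow equation; when $x_{ij}=0$ the flow is forced to zero and $M_{ij}$ must be at least the largest value of $b_{ij}\,\lvert\theta_i-\theta_j\rvert$ that can occur. A smaller valid $M_{ij}$ gives a tighter linear relaxation, which is why tightening these constants affects solver performance.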

4.Integer Programming Games: A Gentle Computational Overview

2306.02817

Authors:Margarida Carvalho, Gabriele Dragotto, Andrea Lodi, Sriram Sankaranarayan

Abstract: In this tutorial, we present a computational overview on computing Nash equilibria in Integer Programming Games (IPGs), i.e., how to compute solutions for a class of non-cooperative and nonconvex games where each player solves a mixed-integer optimization problem. IPGs are a broad class of games extending the modeling power of mixed-integer optimization to multi-agent settings. This class of games includes, for instance, any finite game and any multi-agent extension of traditional combinatorial optimization problems. After providing some background motivation and context of applications, we systematically review and classify the state-of-the-art algorithms to compute Nash equilibria. We propose an essential taxonomy of the algorithmic ingredients needed to compute equilibria, and we describe the theoretical and practical challenges associated with equilibria computation. Finally, we quantitatively and qualitatively compare a sequential Stackelberg game with a simultaneous IPG to highlight the different properties of their solutions.

5.Probabilistic Region-of-Attraction Estimation with Scenario Optimization and Converse Theorems

2306.02832

Authors:Torbjørn Cunis

Abstract: The region of attraction characterizes well-behaved and safe operation of a nonlinear system and is hence sought after for verification. In this paper, a framework for probabilistic region of attraction estimation is developed that combines scenario optimization and converse theorems. With this approach, the probability of an unstable condition being included in the estimate is independent of the system's complexity, while convergence in probability to the true region of attraction is proven. Numerical examples demonstrate the effectiveness for optimization-based control applications. Combining systems theory and sampling, the complexity of Monte--Carlo-based verification techniques can be reduced. The results can be extended to arbitrary level sets of which the defining function can be sampled, such as finite-horizon viability. Thus, the proposed approach is applicable and/or adaptable to verification of a wide range of safety-related properties for nonlinear systems including feedback laws based on optimization or learning.

6.Exact Two-Step Benders Decomposition for Two-Stage Stochastic Mixed-Integer Programs

2306.02849

Authors:Sifa Celik, Layla Martin, Albert H. Schrotenboer, Tom Van Woensel

Abstract: Many real-life optimization problems belong to the class of two-stage stochastic mixed-integer programming problems with continuous recourse. This paper introduces Two-Step Benders Decomposition with Scenario Clustering (TBDS) as a general exact solution methodology for solving such stochastic programs to optimality. The method combines and generalizes Benders dual decomposition, partial Benders decomposition, and Scenario Clustering techniques and does so within a novel two-step decomposition along the binary and continuous first-stage decisions. We use TBDS to provide the first exact solutions for the so-called Time Window Assignment Traveling Salesperson problem. This is a canonical optimization problem for service-oriented vehicle routing; it considers jointly assigning time windows to customers and routing a vehicle among them while travel times are stochastic. Extensive experiments show that TBDS is superior to state-of-the-art approaches in the literature. It solves instances with up to 25 customers to optimality. It provides better lower and upper bounds that lead to faster convergence than related methods. For example, Benders dual decomposition cannot solve instances of 10 customers to optimality. We use TBDS to analyze the structure of the optimal solutions. By increasing routing costs only slightly, customer service can be improved tremendously, driven by smartly alternating between high- and low-variance travel arcs to reduce the impact of delay propagation throughout the executed vehicle route.

7.Curvature and complexity: Better lower bounds for geodesically convex optimization

2306.02959

Authors:Christopher Criscitiello, Nicolas Boumal

Abstract: We study the query complexity of geodesically convex (g-convex) optimization on a manifold. To isolate the effect of that manifold's curvature, we primarily focus on hyperbolic spaces. In a variety of settings (smooth or not; strongly g-convex or not; high- or low-dimensional), known upper bounds worsen with curvature. It is natural to ask whether this is warranted, or an artifact. For many such settings, we propose a first set of lower bounds which indeed confirm that (negative) curvature is detrimental to complexity. To do so, we build on recent lower bounds (Hamilton and Moitra, 2021; Criscitiello and Boumal, 2022) for the particular case of smooth, strongly g-convex optimization. Using a number of techniques, we also secure lower bounds which capture dependence on condition number and optimality gap, which was not previously the case. We suspect these bounds are not optimal. We conjecture optimal ones, and support them with a matching lower bound for a class of algorithms which includes subgradient descent, and a lower bound for a related game. Lastly, to pinpoint the difficulty of proving lower bounds, we study how negative curvature influences (and sometimes obstructs) interpolation with g-convex functions.

8.Frequency Regulation with Storage: On Losses and Profits

2306.02987

Authors:Dirk Lauinger, François Vuille, Daniel Kuhn

Abstract: Low-carbon societies will need to store vast amounts of electricity to balance intermittent generation from wind and solar energy, for example, through frequency regulation. Here, we derive an analytical solution to the decision-making problem of storage operators who sell frequency regulation power to grid operators and trade electricity on day-ahead markets. Mathematically, we treat future frequency deviation trajectories as functional uncertainties in a receding horizon robust optimization problem. We constrain the expected terminal state-of-charge to be equal to some target to allow storage operators to make good decisions not only for the present but also the future. Thanks to this constraint, the amount of electricity traded on day-ahead markets is an implicit function of the regulation power sold to grid operators. The implicit function quantifies the amount of power that needs to be purchased to cover the expected energy loss that results from providing frequency regulation. We show how the marginal cost associated with the expected energy loss decreases with roundtrip efficiency and increases with frequency deviation dispersion. We find that the profits from frequency regulation over the lifetime of energy-constrained storage devices are roughly inversely proportional to the length of time for which regulation power must be committed.

9.Explicit feedback synthesis driven by quasi-interpolation for nonlinear robust model predictive control

2306.03027

Authors:Siddhartha Ganguly, Debasish Chatterjee

Abstract: We present QuIFS (Quasi-Interpolation driven Feedback Synthesis) -- an offline feedback synthesis algorithm for explicit nonlinear robust minmax model predictive control (MPC) problems with guaranteed quality of approximation. The underlying technique is driven by a particular type of grid-based quasi-interpolation scheme. The QuIFS algorithm departs drastically from conventional approximation algorithms that are employed in the MPC industry (in particular, it is neither based on multi-parametric programming tools nor does it involve kernel methods), and the essence of their point of departure is encoded in the following challenge-answer approach: Given an error margin $\varepsilon>0$, compute a feasible feedback policy that is uniformly $\varepsilon$-close to the optimal MPC feedback policy for a given nonlinear system subjected to hard constraints and bounded uncertainties. Conditions for closed-loop stability and recursive feasibility under the approximate feedback policy are also established. We provide a library of numerical examples to illustrate our results.

10.Entropic mean-field min-max problems via Best Response and Fisher-Rao flows

2306.03033

Authors:Razvan-Andrei Lascu, Mateusz B. Majka, Łukasz Szpruch

Abstract: We investigate convergence properties of two continuous-time optimization methods, the Mean-Field Best Response and the Fisher-Rao (Mean-Field Birth-Death) flows, for solving convex-concave min-max games with entropy regularization. We introduce suitable Lyapunov functions to establish exponential convergence to the unique mixed Nash equilibrium for both methods, albeit under slightly different conditions. Additionally, we demonstrate the convergence of the fictitious play flow as a by-product of our analysis.

Fri, 02 Jun 2023digest

1.Optimal Control and Approximate controllability of fractional semilinear differential inclusion involving $\psi$-Hilfer fractional derivatives

2306.01352

Authors:Bholanath Kumbhakar, Dwijendra Narain Pandey

Abstract: The current paper first studies the optimal control of linear $\psi$-Hilfer fractional derivatives with state-dependent control constraints and optimal control for a particular type of cost functional. Then, we investigate the approximate controllability of the abstract fractional semilinear differential inclusion involving the $\psi$-Hilfer fractional derivative in reflexive Banach spaces. It is known that the existence, uniqueness, optimal control, and approximate controllability of fractional differential equations or inclusions have been demonstrated for similar types of fractional differential equations or inclusions with different fractional order derivative operators. Hence it is worthwhile to study fractional differential equations with more general fractional operators that incorporate all the specific fractional derivative operators. This motivates us to consider the $\psi$-Hilfer fractional differential inclusion. We assume the compactness of the corresponding semigroup and the approximate controllability of the associated linear control system and define the control with the help of the duality mapping. We observe that convexity is essential in determining the controllability property of semilinear differential inclusions. In the case of Hilbert spaces, there is no issue of convexity as the duality map becomes simply the identity map. In contrast to Hilbert spaces, if we consider reflexive Banach spaces, there is an issue of convexity due to the nonlinear nature of the duality mapping. The novelty of this paper is that we overcome this convexity issue and establish our main result. Finally, we test our outcomes through an example.

2.A Study of Qualitative Correlations Between Crucial Bio-markers and the Optimal Drug Regimen of Type-I Lepra Reaction: A Deterministic Approach

2306.01427

Authors:Dinesh Nayak, A. V. Sangeetha, D. K. K. Vamsi

Abstract: Mycobacterium leprae is a bacterium that causes leprosy (Hansen's disease), which is a neglected tropical disease. More than 200,000 cases are reported per year worldwide. This disease leads to a chronic stage known as Lepra reaction that mainly causes nerve damage in the peripheral nervous system, leading to loss of organs. The early detection of this Lepra reaction through the levels of bio-markers can prevent the reaction from occurring, along with the further disabilities. Motivated by this, we frame a mathematical model considering the pathogenesis of leprosy and the chemical pathways involved in Lepra reactions. The model incorporates the dynamics of the susceptible Schwann cells, infected Schwann cells and the bacterial load, and the concentration levels of the bio-markers IFN-$\gamma$, TNF-$\alpha$, IL-10, IL-12, IL-15 and IL-17. We consider a nine-compartment optimal control problem with the drugs used in Multi Drug Therapy (MDT) as controls. We validate the model using 2D heat plots. We study the correlation between the bio-marker levels and the drugs in MDT and propose an optimal drug regimen through these optimal control studies. We use Newton's Gradient Method for the optimal control studies.

3.The uniform diversification strategy is optimal for expected utility maximization under high model ambiguity

2306.01503

Authors:Laurence Carassus, Johannes Wiesel

Abstract: We investigate an expected utility maximization problem under model uncertainty in a one-period financial market. We capture model uncertainty by replacing the baseline model $\mathbb{P}$ with an adverse choice from a Wasserstein ball of radius $k$ around $\mathbb{P}$ in the space of probability measures and consider the corresponding Wasserstein distributionally robust optimization problem. We show that optimal solutions converge to the uniform diversification strategy when uncertainty is increasingly large, i.e. when the radius $k$ tends to infinity.

4.Load Asymptotics and Dynamic Speed Optimization for the Greenest Path Problem: A Comprehensive Analysis

2306.01687

Authors:Poulad Moradi, Joachim Arts, Josué Velázquez-Martínez

Abstract: We study the effect of using high-resolution elevation data on the selection of the most fuel-efficient (greenest) path for different trucks in various urban environments. We adapt a variant of the Comprehensive Modal Emission Model (CMEM) to show that the optimal speed and the greenest path are slope dependent (dynamic). When there are no elevation changes in a road network, the most fuel-efficient path is the shortest path with a constant (static) optimal speed throughout. However, if the network is not flat, then the shortest path is not necessarily the greenest path, and the optimal driving speed is dynamic. We prove that the greenest path converges to an asymptotic greenest path as the payload approaches infinity and that this limiting path is attained for a finite load. In a set of extensive numerical experiments, we benchmark the CO2 emissions reduction of our dynamic speed and the greenest path policies against policies that ignore elevation data. We use the geo-spatial data of 25 major cities across 6 continents, such as Los Angeles, Mexico City, Johannesburg, Athens, Ankara, and Canberra. Our results show that, on average, traversing the greenest path with a dynamic optimal speed policy can reduce the CO2 emissions by 1.19% to 10.15% depending on the city and truck type for a moderate payload. They also demonstrate that the average CO2 reduction of the optimal dynamic speed policy is between 2% and 4% for most of the cities, regardless of the truck type. We confirm that disregarding elevation data yields sub-optimal paths that are significantly less CO2 efficient than the greenest paths.
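
The claim that the shortest path need not be the greenest one on a non-flat network can be illustrated with a toy graph. The sketch below runs Dijkstra's algorithm twice, once on edge length and once on a made-up fuel cost that penalizes uphill slope. The fuel formula, the node names, and the numbers are hypothetical and are not the CMEM variant used in the paper.

```python
import heapq

def best_path(edges, source, target, weight):
    """Dijkstra on directed edges (u, v, length_km, slope); assumes target is reachable."""
    graph = {}
    for u, v, length, slope in edges:
        graph.setdefault(u, []).append((v, weight(length, slope)))
    best = {source: 0.0}
    prev = {}
    queue = [(0.0, source)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == target:
            break
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry
        for nxt, w in graph.get(node, []):
            new_cost = cost + w
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                prev[nxt] = node
                heapq.heappush(queue, (new_cost, nxt))
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1], best[target]

def length_only(length, slope):
    return length

def toy_fuel(length, slope):
    # Hypothetical model: uphill slope inflates fuel use, downhill gives no credit.
    return length * (1.0 + 10.0 * max(slope, 0.0))

# Direct road A->C is shorter but climbs; the detour via B is longer but flat.
hilly = [("A", "C", 10.0, 0.05), ("A", "B", 6.0, 0.0), ("B", "C", 6.0, 0.0)]
flat = [(u, v, length, 0.0) for u, v, length, _ in hilly]

shortest_hilly, _ = best_path(hilly, "A", "C", length_only)
greenest_hilly, _ = best_path(hilly, "A", "C", toy_fuel)
greenest_flat, _ = best_path(flat, "A", "C", toy_fuel)
print(shortest_hilly, greenest_hilly, greenest_flat)
```

On the flat copy of the network the two criteria agree, which mirrors the flat-network statement in the abstract.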

Thu, 01 Jun 2023digest

1.The Mini-batch Stochastic Conjugate Algorithms with the unbiasedness and Minimized Variance Reduction

2306.00459

Authors:Feifei Gao, Caixia Kou

Abstract: We first propose a new stochastic gradient estimate that is unbiased and has minimized variance. Second, we propose two algorithms, Algorithm 1 and Algorithm 2, which apply the new stochastic gradient estimate to the modern stochastic conjugate gradient algorithms SCGA [7] and CGVR [8]. We then prove that the proposed algorithms attain a linear convergence rate under assumptions of strong convexity and smoothness. Finally, numerical experiments show that the new stochastic gradient estimate can reduce the variance of the stochastic gradient effectively, and that our algorithms converge faster than SCGA and CGVR on a ridge regression model.

2.Optimization Algorithm Synthesis based on Integral Quadratic Constraints: A Tutorial

2306.00565

Authors:Carsten W. Scherer, Christian Ebenbauer, Tobias Holicki

Abstract: We expose in a tutorial fashion the mechanisms which underlie the synthesis of optimization algorithms based on dynamic integral quadratic constraints. We reveal how these tools from robust control allow us to design accelerated gradient descent algorithms with optimal guaranteed convergence rates by solving small-sized convex semi-definite programs. It is shown that this extends to the design of extremum controllers, with the goal of regulating the output of a general linear closed-loop system to the minimum of an objective function. Numerical experiments illustrate that we can not only recover gradient descent and the triple momentum variant of Nesterov's accelerated first order algorithm, but also automatically synthesize optimal algorithms even if the gradient information is passed through non-trivial dynamics, such as time-delays.

3.Robust Exponential Stability and Invariance Guarantees with General Dynamic O'Shea-Zames-Falb Multipliers

2306.00571

Authors:Carsten W. Scherer

Abstract: We propose novel time-domain dynamic integral quadratic constraints with a terminal cost for exponentially weighted slope-restricted gradients of not necessarily convex functions. This extends recent results for subdifferentials of convex functions and their link to so-called O'Shea-Zames-Falb multipliers. The benefit of merging time-domain and frequency-domain techniques is demonstrated for linear saturated systems.

4.Data-driven optimal control under safety constraints using sparse Koopman approximation

2306.00596

Authors:Hongzhe Yu, Joseph Moyalan, Umesh Vaidya, Yongxin Chen

Abstract: In this work we approach the dual optimal reach-safe control problem using sparse approximations of Koopman operator. Matrix approximation of Koopman operator needs to solve a least-squares (LS) problem in the lifted function space, which is computationally intractable for fine discretizations and high dimensions. The state transitional physical meaning of the Koopman operator leads to a sparse LS problem in this space. Leveraging this sparsity, we propose an efficient method to solve the sparse LS problem where we reduce the problem dimension dramatically by formulating the problem using only the non-zero elements in the approximation matrix with known sparsity pattern. The obtained matrix approximation of the operators is then used in a dual optimal reach-safe problem formulation where a linear program with sparse linear constraints naturally appears. We validate our proposed method on various dynamical systems and show that the computation time for operator approximation is greatly reduced with high precision in the solutions.

5.Gauss-Southwell type descent methods for low-rank matrix optimization

2306.00897

Authors:Guillaume Olikier, André Uschmajew, Bart Vandereycken

Abstract: We consider gradient-related methods for low-rank matrix optimization with a smooth cost function. The methods operate on single factors of the low-rank factorization and share aspects of both alternating and Riemannian optimization. Two possible choices for the search directions based on Gauss-Southwell type selection rules are compared: one using the gradient of a factorized non-convex formulation, the other using the Riemannian gradient. While both methods provide gradient convergence guarantees that are similar to the unconstrained case, the version based on Riemannian gradient is significantly more robust with respect to small singular values and the condition number of the cost function, as illustrated by numerical experiments. As a side result of our approach, we also obtain new convergence results for the alternating least squares method.

6.Mean-field limit for stochastic control problems under state constraint

2306.00949

Authors:Samuel Daudin

Abstract: We study the convergence problem of mean-field control theory in the presence of state constraints and non-degenerate idiosyncratic noise. Our main result is the convergence of the value functions associated to stochastic control problems for many interacting particles subject to symmetric, almost-sure constraints toward the value function of a control problem of mean-field type, set on the space of probability measures. The key step of the proof is to show that admissible controls for the limit problem can be turned into admissible controls for the $N$-particle problem up to a correction which vanishes as the number of particles increases. The rest of the proof relies on compactness methods. We also provide optimality conditions for the mean-field problem and discuss the regularity of the optimal controls. Finally we present some applications and connections with large deviations for weakly interacting particle systems.

Wed, 31 May 2023digest

1.On the Linear Convergence of Policy Gradient under Hadamard Parameterization

2305.19575

Authors:Jiacai Liu, Jinchi Chen, Ke Wei

Abstract: The convergence of deterministic policy gradient under the Hadamard parametrization is studied in the tabular setting and the global linear convergence of the algorithm is established. To this end, we first show that the error decreases at an $O(\frac{1}{k})$ rate for all the iterations. Based on this result, we further show that the algorithm has a faster local linear convergence rate after $k_0$ iterations, where $k_0$ is a constant that only depends on the MDP problem and the step size. Overall, the algorithm displays a linear convergence rate for all the iterations, with a looser constant than that for the local linear convergence rate.

2.A converse Lyapunov-type theorem for control systems with regulated cost

2305.19670

Authors:Anna Chiara Lai, Monica Motta

Abstract: Given a nonlinear control system, a target set, a nonnegative integral cost, and a continuous function $W$, we say that the system is globally asymptotically controllable to the target with $W$-regulated cost whenever, starting from any point $z$, among the strategies that achieve classical asymptotic controllability we can select one that also keeps the cost less than $W(z)$. In this paper, assuming mild regularity hypotheses on the data, we prove that a necessary and sufficient condition for global asymptotic controllability with regulated cost is the existence of a special, continuous Control Lyapunov function, called a Minimum Restraint function. The main novelty is the necessity implication, obtained here for the first time. Nevertheless, the sufficiency condition extends previous results based on semiconcavity of the Minimum Restraint function, while we require mere continuity.

3.Bilevel Optimal Control: Theory, Algorithms, and Applications

2305.19786

Authors:Stephan Dempe, Markus Friedemann, Felix Harder, Patrick Mehlitz, Gerd Wachsmuth

Abstract: In this chapter, we are concerned with inverse optimal control problems, i.e., optimization models which are used to identify parameters in optimal control problems from given measurements. Here, we focus on linear-quadratic optimal control problems with control constraints where the reference control plays the role of the parameter and has to be reconstructed. First, it is shown that pointwise M-stationarity, associated with the reformulation of the hierarchical model as a so-called mathematical problem with complementarity constraints (MPCC) in function spaces, provides a necessary optimality condition under some additional assumptions on the data. Second, we review two recently developed algorithms (an augmented Lagrangian method and a nonsmooth Newton method) for the computational identification of M-stationary points of finite-dimensional MPCCs. Finally, a numerical comparison of these methods, based on instances of the appropriately discretized inverse optimal control problem of our interest, is provided.

4.Convergence of the vertical Gradient flow for the Gaussian Monge problem

2305.19788

Authors:Erik Jansson, Klas Modin

Abstract: We investigate a matrix dynamical system related to optimal mass transport in the linear category, namely, the problem of finding an optimal invertible matrix by which two covariance matrices are congruent. We first review the differential geometric structure of the problem in terms of a principal fiber bundle. The dynamical system is a gradient flow restricted to the fibers of the bundle. We prove global existence of solutions to the flow, with convergence to the polar decomposition of the matrix given as initial data. The convergence is illustrated in a numerical example.
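
As background for the Gaussian case, the classical closed-form solution is recalled here; it is a standard result and is not taken from the paper, and the symbols $\Sigma_0$, $\Sigma_1$ (the two covariance matrices, assumed positive definite) and $T$ are notation assumed for this note. Among all invertible matrices $A$ with $A \Sigma_0 A^\top = \Sigma_1$, the optimal transport map between the centered Gaussians is the unique symmetric positive definite one:

```latex
\[
T \;=\; \Sigma_0^{-1/2}\,\bigl(\Sigma_0^{1/2}\,\Sigma_1\,\Sigma_0^{1/2}\bigr)^{1/2}\,\Sigma_0^{-1/2},
\qquad
T\,\Sigma_0\,T \;=\; \Sigma_1 .
\]
```

The congruence can be checked directly: writing $M = \Sigma_0^{1/2}\Sigma_1\Sigma_0^{1/2}$ gives $T\Sigma_0 T = \Sigma_0^{-1/2} M^{1/2} M^{1/2} \Sigma_0^{-1/2} = \Sigma_1$. Symmetric positive definiteness is also what singles out the factor $P$ in a polar decomposition $A = PQ$ with $Q$ orthogonal.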

5.A fresh look at nonsmooth Levenberg--Marquardt methods with applications to bilevel optimization

2305.19870

Authors:Lateef O. Jolaoso, Patrick Mehlitz, Alain B. Zemkoho

Abstract: In this paper, we revisit the classical problem of solving over-determined systems of nonsmooth equations numerically. We suggest a nonsmooth Levenberg--Marquardt method for its solution which, in contrast to the existing literature, does not require local Lipschitzness of the data functions. This is possible when using Newton-differentiability instead of semismoothness as the underlying tool of generalized differentiation. Conditions for fast local convergence of the method are given. Afterwards, in the context of over-determined mixed nonlinear complementarity systems, our findings are applied, and globalized solution methods, based on a residual induced by the maximum and the Fischer--Burmeister function, respectively, are constructed. The assumptions for fast local convergence are worked out and compared. Finally, these methods are applied for the numerical solution of bilevel optimization problems. We recall the derivation of a stationarity condition taking the shape of an over-determined mixed nonlinear complementarity system involving a penalty parameter, formulate assumptions for local fast convergence of our solution methods explicitly, and present results of numerical experiments. Particularly, we investigate whether the treatment of the appearing penalty parameter as an additional variable is beneficial or not.

6.Efficient PDE-Constrained optimization under high-dimensional uncertainty using derivative-informed neural operators

2305.20053

Authors:Dingcheng Luo, Thomas O'Leary-Roseberry, Peng Chen, Omar Ghattas

Abstract: We propose a novel machine learning framework for solving optimization problems governed by large-scale partial differential equations (PDEs) with high-dimensional random parameters. Such optimization under uncertainty (OUU) problems may be computationally prohibitive using classical methods, particularly when a large number of samples is needed to evaluate risk measures at every iteration of an optimization algorithm, where each sample requires the solution of an expensive-to-solve PDE. To address this challenge, we propose a new neural operator approximation of the PDE solution operator that has the combined merits of (1) accurate approximation of not only the map from the joint inputs of random parameters and optimization variables to the PDE state, but also its derivative with respect to the optimization variables, (2) efficient construction of the neural network using reduced basis architectures that are scalable to high-dimensional OUU problems, and (3) requiring only a limited number of training data to achieve high accuracy for both the PDE solution and the OUU solution. We refer to such neural operators as multi-input reduced basis derivative informed neural operators (MR-DINOs). We demonstrate the accuracy and efficiency of our approach through several numerical experiments, namely the risk-averse control of a semilinear elliptic PDE and the steady state Navier--Stokes equations in two and three spatial dimensions, each involving random field inputs. Across the examples, MR-DINOs offer $10^{3}$--$10^{7} \times$ reductions in execution time, and are able to produce OUU solutions of comparable accuracies to those from standard PDE based solutions while being over $10 \times$ more cost-efficient after factoring in the cost of construction.

7.Alternating Minimization for Regression with Tropical Rational Functions

2305.20072

Authors:Alex Dunbar, Lars Ruthotto

Abstract: We propose an alternating minimization heuristic for regression over the space of tropical rational functions with fixed exponents. The method alternates between fitting the numerator and denominator terms via tropical polynomial regression, which is known to admit a closed form solution. We demonstrate the behavior of the alternating minimization method experimentally. Experiments demonstrate that the heuristic provides a reasonable approximation of the input data. Our work is motivated by applications to ReLU neural networks, a popular class of network architectures in the machine learning community which are closely related to tropical rational functions.
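To make the closed-form building block concrete, here is a minimal sketch of max-plus (tropical) polynomial regression in the sup-norm, assuming the max-plus convention $f(x) = \max_i (a_i^\top x + b_i)$ with fixed exponent vectors $a_i$. The function names and the test data are hypothetical, and the paper's exact formulation may differ. The fit takes the greatest coefficients whose polynomial stays below the data, then shifts them up by half of the largest remaining gap.

```python
import numpy as np

def tropical_poly_eval(X, exponents, b):
    """Evaluate f(x) = max_i (a_i . x + b_i) for every row x of X."""
    return np.max(X @ exponents.T + b[None, :], axis=1)

def tropical_poly_fit(X, y, exponents):
    """Sup-norm fit of the coefficients b for fixed exponents (closed form)."""
    A = X @ exponents.T                       # A[j, i] = a_i . x_j
    b_sub = np.min(y[:, None] - A, axis=0)    # greatest coefficients with f <= y
    gap = np.max(y - np.max(A + b_sub[None, :], axis=1))
    return b_sub + gap / 2.0                  # shift to balance the sup-norm error

# Hypothetical 1-D data generated by max(0, x - 1, 2x - 4).
X = np.linspace(-1.0, 5.0, 25)[:, None]
exponents = np.array([[0.0], [1.0], [2.0]])
y = tropical_poly_eval(X, exponents, np.array([0.0, -1.0, -4.0]))

b_fit = tropical_poly_fit(X, y, exponents)
y_fit = tropical_poly_eval(X, exponents, b_fit)
```

An alternating scheme for a tropical rational function (a difference of two such polynomials) could call a fit of this kind on the numerator with the denominator held fixed, and vice versa.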

Tue, 30 May 2023digest

1.Stochastic Control/Stopping Problem with Expectation Constraints

2305.18664

Authors:Erhan Bayraktar, Song Yao

Abstract: We study a stochastic control/stopping problem with a series of inequality-type and equality-type expectation constraints in a general non-Markovian framework. We demonstrate that the stochastic control/stopping problem with expectation constraints (CSEC) is independent of a specific probability setting and is equivalent to the constrained stochastic control/stopping problem in weak formulation (an optimization over joint laws of Brownian motion, state dynamics, diffusion controls and stopping rules on an enlarged canonical space). Using a martingale-problem formulation of controlled SDEs in the spirit of \cite{Stroock_Varadhan}, we characterize the probability classes in weak formulation by countably many actions of canonical processes, and thus obtain the upper semi-analyticity of the CSEC value function. Then we employ a measurable selection argument to establish a dynamic programming principle (DPP) in weak formulation for the CSEC value function, in which the conditional expected costs act as additional states for constraint levels at the intermediate horizon. This article extends the results of \cite{Elk_Tan_2013b} to the expectation-constraint case. We extend our previous work \cite{OSEC_stopping} to the more complicated setting where the diffusion is controlled. Compared to that paper, the topological properties of diffusion-control spaces and the corresponding measurability are more technically involved, which complicates the arguments, especially for the measurable selection on the super-solution side of the DPP in the weak formulation.

2.Blockwise Stochastic Variance-Reduced Methods with Parallel Speedup for Multi-Block Bilevel Optimization

2305.18730

Authors:Quanqi Hu, Zi-Hao Qiu, Zhishuai Guo, Lijun Zhang, Tianbao Yang

Abstract: In this paper, we consider non-convex multi-block bilevel optimization (MBBO) problems, which involve $m\gg 1$ lower level problems and have important applications in machine learning. Designing a stochastic gradient and controlling its variance is more intricate due to the hierarchical sampling of blocks and data and the unique challenge of estimating the hyper-gradient. We aim to achieve three nice properties for our algorithm: (a) matching the state-of-the-art complexity of standard BO problems with a single block; (b) achieving parallel speedup by sampling $I$ blocks and sampling $B$ samples for each sampled block per-iteration; (c) avoiding the computation of the inverse of a high-dimensional Hessian matrix estimator. However, achieving all three is non-trivial: existing works achieve only one or two of these properties. To address the involved challenges for achieving (a, b, c), we propose two stochastic algorithms by using advanced blockwise variance-reduction techniques for tracking the Hessian matrices (for low-dimensional problems) or the Hessian-vector products (for high-dimensional problems), and prove an iteration complexity of $O(\frac{m\epsilon^{-3}\mathbb{I}(I<m)}{I\sqrt{I}} + \frac{m\epsilon^{-3}}{I\sqrt{B}})$ for finding an $\epsilon$-stationary point under appropriate conditions. We also conduct experiments to verify the effectiveness of the proposed algorithms compared with existing MBBO algorithms.

3.Around a Farkas type Lemma

2305.18749

Authors:Nguyen Dinh, Miguel A. Goberna, M. Volle

Abstract: The first two authors of this paper asserted in Lemma 4 of "New Farkas-type constraint qualifications in convex infinite programming" (DOI: 10.1051/cocv:2007027) that a given reverse convex inequality is a consequence of a given convex system satisfying the Farkas-Minkowski constraint qualification if and only if a certain set depending on the data contains a particular point of the vertical axis. This paper identifies a hidden assumption in this reverse Farkas lemma which always holds in its applications to nontrivial optimization problems. Moreover, it shows that the statement remains valid when the Farkas-Minkowski constraint qualification fails by replacing the mentioned set by its closure. This hidden assumption is also characterized in terms of the data. Finally, the paper provides some applications to convex infinite systems and to convex infinite optimization problems.

4.Infinite-dimensional moment-SOS hierarchy for nonlinear partial differential equations

2305.18768

Authors:Didier Henrion, Maria Infusino, Salma Kuhlmann, Victor Vinnikov

Abstract: We formulate a class of nonlinear evolution partial differential equations (PDEs) as linear optimization problems on moments of positive measures supported on infinite-dimensional vector spaces. Using sums of squares (SOS) representations of polynomials in these spaces, we can prove convergence of a hierarchy of finite-dimensional semidefinite relaxations that approximately solve these infinite-dimensional optimization problems. As an illustration, we report on numerical experiments for solving the heat equation subject to a nonlinear perturbation.

5.Global minimization of polynomial integral functionals

2305.18801

Authors:Giovanni Fantuzzi, Federico Fuentes

Abstract: We describe a 'discretize-then-relax' strategy to globally minimize integral functionals over functions $u$ in a Sobolev space satisfying prescribed Dirichlet boundary conditions. The strategy applies whenever the integral functional depends polynomially on $u$ and its derivatives, even if it is nonconvex. The 'discretize' step uses a bounded finite-element scheme to approximate the integral minimization problem with a convergent hierarchy of polynomial optimization problems over a compact feasible set, indexed by the decreasing size $h$ of the finite-element mesh. The 'relax' step employs sparse moment-SOS relaxations to approximate each polynomial optimization problem with a hierarchy of convex semidefinite programs, indexed by an increasing relaxation order $\omega$. We prove that, as $\omega\to\infty$ and $h\to 0$, solutions of such semidefinite programs provide approximate minimizers that converge in $L^p$ to the global minimizer of the original integral functional if this is unique. We also report computational experiments that show our numerical strategy works well even when technical conditions required by our theoretical analysis are not satisfied.

6.Policy Gradient Algorithms for Robust MDPs with Non-Rectangular Uncertainty Sets

2305.19004

Authors:Mengmeng Li, Tobias Sutter, Daniel Kuhn

Abstract: We propose a policy gradient algorithm for robust infinite-horizon Markov Decision Processes (MDPs) with non-rectangular uncertainty sets, thereby addressing an open challenge in the robust MDP literature. Indeed, uncertainty sets that display statistical optimality properties and make optimal use of limited data often fail to be rectangular. Unfortunately, the corresponding robust MDPs cannot be solved with dynamic programming techniques and are in fact provably intractable. This prompts us to develop a projected Langevin dynamics algorithm tailored to the robust policy evaluation problem, which offers global optimality guarantees. We also propose a deterministic policy gradient method that solves the robust policy evaluation problem approximately, and we prove that the approximation error scales with a new measure of non-rectangularity of the uncertainty set. Numerical experiments showcase that our projected Langevin dynamics algorithm can escape local optima, while algorithms tailored to rectangular uncertainty fail to do so.

7.Adaptive Quasi-Newton and Anderson Acceleration Framework with Explicit Global (Accelerated) Convergence Rates

2305.19179

Authors:Damien Scieur

Abstract: Despite the impressive numerical performance of quasi-Newton and Anderson/nonlinear acceleration methods, their global convergence rates have remained elusive for over 50 years. This paper addresses this long-standing question by introducing a framework that derives novel and adaptive quasi-Newton or nonlinear/Anderson acceleration schemes. Under mild assumptions, the proposed iterative methods exhibit explicit, non-asymptotic convergence rates that blend those of gradient descent and Cubic Regularized Newton's method. Notably, these rates are achieved adaptively, as the method autonomously determines the optimal step size using a simple backtracking strategy. The proposed approach also includes an accelerated version that improves the convergence rate on convex functions. Numerical experiments demonstrate the efficiency of the proposed framework, even compared to a fine-tuned BFGS algorithm with line search.

8.Fast global convergence of gradient descent for low-rank matrix approximation

2305.19206

Authors:Hengchao Chen, Xin Chen, Mohamad Elmasri, Qiang Sun

Abstract: This paper investigates gradient descent for solving low-rank matrix approximation problems. We begin by establishing the local linear convergence of gradient descent for symmetric matrix approximation. Building on this result, we prove the rapid global convergence of gradient descent, particularly when initialized with small random values. Remarkably, we show that even with moderate random initialization, which includes small random initialization as a special case, gradient descent achieves fast global convergence in scenarios where the top eigenvalues are identical. Furthermore, we extend our analysis to address asymmetric matrix approximation problems and investigate the effectiveness of a retraction-free eigenspace computation method. Numerical experiments strongly support our theory. In particular, the retraction-free algorithm outperforms the corresponding Riemannian gradient descent method, resulting in a significant 29% reduction in runtime.
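The symmetric setting can be sketched in a few lines. The example below runs plain gradient descent on $\frac{1}{4}\|A - XX^\top\|_F^2$ from a small random initialization and compares $XX^\top$ with the best rank-$r$ approximation of $A$. The test matrix, step size, initialization scale and iteration count are all hypothetical choices for illustration, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n, r = 5, 2

# Hypothetical symmetric PSD test matrix with a known spectrum.
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
eigvals = np.array([3.0, 2.0, 1.0, 0.5, 0.1])
A = Q @ np.diag(eigvals) @ Q.T
A = (A + A.T) / 2.0                                  # remove rounding asymmetry
A_best = Q[:, :r] @ np.diag(eigvals[:r]) @ Q[:, :r].T  # best rank-r approximation

# Gradient descent on f(X) = 0.25 * ||A - X X^T||_F^2, small random start.
X = 1e-3 * rng.standard_normal((n, r))
step = 0.1
for _ in range(2000):
    X = X + step * (A - X @ X.T) @ X                 # gradient is -(A - X X^T) X

rel_err = np.linalg.norm(X @ X.T - A_best) / np.linalg.norm(A_best)
```

With distinct eigenvalues, as assumed here, the iterates settle on the top-$r$ eigenspace; the case of identical top eigenvalues discussed in the abstract is not exercised by this sketch.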

9.Learning for Robust Optimization

2305.19225

Authors:Irina Wang, Cole Becker, Bart Van Parys, Bartolomeo Stellato

Abstract: We propose a data-driven technique to automatically learn the uncertainty sets in robust optimization. Our method reshapes the uncertainty sets by minimizing the expected performance across a family of problems while guaranteeing constraint satisfaction. We learn the uncertainty sets using a novel stochastic augmented Lagrangian method that relies on differentiating the solutions of the robust optimization problems with respect to the parameters of the uncertainty set. We show sublinear convergence to stationary points under mild assumptions, and finite-sample probabilistic guarantees of constraint satisfaction using empirical process theory. Our approach is very flexible and can learn a wide variety of uncertainty sets while preserving tractability. Numerical experiments show that our method outperforms traditional approaches in robust and distributionally robust optimization in terms of out of sample performance and constraint satisfaction guarantees. We implemented our method in the open-source package LROPT.

10.Minimal Sparsity for Second-Order Moment-SOS Relaxations of the AC-OPF Problem

2305.19232

Authors:Adrien Le Franc (LAAS-POP), Victor Magron (LAAS-POP, IMT), Jean-Bernard Lasserre (LAAS-POP), Manuel Ruiz, Patrick Panciatici

Abstract: AC-OPF (Alternating Current Optimal Power Flow) aims at minimizing the operating costs of a power grid under physical constraints on voltages and power injections. Its mathematical formulation results in a nonconvex polynomial optimization problem which is hard to solve in general, but that can be tackled by a sequence of SDP (Semidefinite Programming) relaxations corresponding to the steps of the moment-SOS (Sums-Of-Squares) hierarchy. Unfortunately, the size of these SDPs grows drastically in the hierarchy, so that even second-order relaxations exploiting the correlative sparsity pattern of AC-OPF are hardly numerically tractable for large instances with thousands of power buses. Our contribution lies in a new sparsity framework, termed minimal sparsity, inspired by the specific structure of power flow equations. Despite its heuristic nature, numerical examples show that minimal sparsity allows the computation of highly accurate second-order moment-SOS relaxations of AC-OPF, while requiring far less computing time and memory resources than the standard correlative sparsity pattern. Thus, we manage to compute second-order relaxations on test cases with about 6000 power buses, which we believe to be unprecedented.

Mon, 29 May 2023digest

1.Adaptive Localized Cayley Parametrization for Optimization over Stiefel Manifold

2305.17901

Authors:Keita Kume, Isao Yamada

Abstract: We present an adaptive parametrization strategy for optimization problems over the Stiefel manifold by using generalized Cayley transforms to utilize powerful Euclidean optimization algorithms efficiently. The generalized Cayley transform can translate an open dense subset of the Stiefel manifold into a vector space, and the open dense subset is determined according to a tunable parameter called a center point. With the generalized Cayley transform, we recently proposed the naive Cayley parametrization, which reformulates the optimization problem over the Stiefel manifold as that over the vector space. Although this reformulation enables us to transplant powerful Euclidean optimization algorithms, their convergence may become slow under a poor choice of center points. To avoid such slow convergence, in this paper, we propose to estimate adaptively 'good' center points so that the reformulated problem can be solved faster. We also present a unified convergence analysis, regarding the gradient, in cases where fairly standard Euclidean optimization algorithms are employed in the proposed adaptive parametrization strategy. Numerical experiments demonstrate that (i) the proposed strategy succeeds in escaping from the slow convergence observed in the naive Cayley parametrization strategy; (ii) the proposed strategy outperforms the standard strategy which employs a retraction.

2.Communication Efficient Distributed Newton Method with Fast Convergence Rates

2305.17945

Authors:Chengchang Liu, Lesi Chen, Luo Luo, John C. S. Lui

Abstract: We propose a communication and computation efficient second-order method for distributed optimization. For each iteration, our method only requires $\mathcal{O}(d)$ communication complexity, where $d$ is the problem dimension. We also provide theoretical analysis to show that the proposed method has a convergence rate similar to that of classical second-order optimization algorithms. Concretely, our method can find $\big(\epsilon, \sqrt{dL\epsilon}\,\big)$-second-order stationary points for nonconvex problems in $\mathcal{O}\big(\sqrt{dL}\,\epsilon^{-3/2}\big)$ iterations, where $L$ is the Lipschitz constant of the Hessian. Moreover, it enjoys local superlinear convergence under the strong convexity assumption. Experiments on both convex and nonconvex problems show that our proposed method performs significantly better than baselines.

3.A Parameter-Free Conditional Gradient Method for Composite Minimization under Hölder Condition

2305.18181

Authors:Masaru Ito, Zhaosong Lu, Chuan He

Abstract: In this paper we consider a composite optimization problem that minimizes the sum of a weakly smooth function and a convex function with either a bounded domain or a uniformly convex structure. In particular, we first present a parameter-dependent conditional gradient method for this problem, whose step sizes require prior knowledge of the parameters associated with the Hölder continuity of the gradient of the weakly smooth function, and establish its rate of convergence. Given that these parameters could be unknown or known but possibly conservative, such a method may suffer from implementation issues or slow convergence. We therefore propose a parameter-free conditional gradient method whose step size is determined by using a constructive local quadratic upper approximation and an adaptive line search scheme, without using any problem parameter. We show that this method achieves the same rate of convergence as the parameter-dependent conditional gradient method. Preliminary experiments are also conducted and illustrate the superior performance of the parameter-free conditional gradient method over the methods with some other step size rules.

4.Necessary and sufficient conditions for unique solvability of absolute value equations: A Survey

2305.18556

Authors:Shubham Kumar, Deepmala

Abstract: In this survey paper, we focus on the necessary and sufficient conditions for the unique solvability and unsolvability of absolute value equations (AVEs) established during the last twenty years (2004 to 2023). We discuss unique solvability conditions for various types of AVEs, such as the standard absolute value equation (AVE), Generalized AVE (GAVE), New generalized AVE (NGAVE), Triple AVE (TAVE) and a class of NGAVE based on interval matrix, P-matrix, singular value conditions, spectral radius and $\mathcal{W}$-property. Based on the unique solution of AVEs, we also discuss unique solvability conditions for linear complementarity problems (LCP) and horizontal linear complementarity problems (HLCP).
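For the standard AVE $Ax - |x| = b$, one classical sufficient condition of the singular-value type is that every singular value of $A$ exceeds 1, which guarantees a unique solution for every $b$. The sketch below checks that condition on a small hypothetical matrix and then solves the equation with a generalized Newton iteration, $x_{k+1} = (A - D(x_k))^{-1} b$ with $D(x) = \mathrm{diag}(\mathrm{sign}(x))$. The matrix, right-hand side and function name are assumptions chosen for illustration, and the iteration is one well-known solver rather than a method taken from the survey.

```python
import numpy as np

def solve_ave(A, b, max_iter=50, tol=1e-12):
    """Generalized Newton iteration for A x - |x| = b (illustrative sketch)."""
    x = np.zeros_like(b)
    for _ in range(max_iter):
        D = np.diag(np.sign(x))               # generalized Jacobian of |x|
        x_new = np.linalg.solve(A - D, b)
        if np.linalg.norm(x_new - x) <= tol:
            return x_new
        x = x_new
    return x

# Hypothetical test problem with a known solution.
A = np.array([[6.0, 1.0, 0.0],
              [1.0, 6.0, 1.0],
              [0.0, 1.0, 6.0]])
x_true = np.array([1.0, -2.0, 0.5])
b = A @ x_true - np.abs(x_true)

sigma_min = np.linalg.svd(A, compute_uv=False).min()  # > 1 implies unique solvability
x = solve_ave(A, b)
```

Here the smallest singular value is well above 1, so the solution recovered by the iteration is the only one.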

Fri, 26 May 2023digest

1.Stochastic First-Order Algorithms for Constrained Distributionally Robust Optimization

2305.16584

Authors:Hyungki Im, Paul Grigas

Abstract: We consider distributionally robust optimization (DRO) problems, reformulated as distributionally robust feasibility (DRF) problems, with multiple expectation constraints. We propose a generic stochastic first-order meta-algorithm, where the decision variables and uncertain distribution parameters are each updated separately by applying stochastic first-order methods. We then specialize our results to the case of using two specific versions of stochastic mirror descent (SMD): (i) a novel approximate version of SMD to update the decision variables, and (ii) the bandit mirror descent method to update the distribution parameters in the case of $\chi^2$-divergence sets. For this specialization, we demonstrate that the total number of iterations is independent of the dimensions of the decision variables and distribution parameters. Moreover, the cost per iteration to update both sets of variables is nearly independent of the dimension of the distribution parameters, allowing for high dimensional ambiguity sets. Furthermore, we show that the total number of iterations of our algorithm has a logarithmic dependence on the number of constraints. Experiments on logistic regression with fairness constraints, personalized parameter selection in a social network, and the multi-item newsvendor problem verify the theoretical results and show the usefulness of the algorithm, in particular when the dimension of the distribution parameters is large.

Thu, 25 May 2023digest

1.Highly Smoothness Zero-Order Methods for Solving Optimization Problems under PL Condition

2305.15828

Authors:Aleksandr Lobanov, Alexander Gasnikov, Fedor Stonyakin

Abstract: In this paper, we study the black box optimization problem under the Polyak--Lojasiewicz (PL) condition, assuming that the objective function is not just smooth, but has higher smoothness. By using a "kernel-based" approximation instead of the exact gradient in the Stochastic Gradient Descent method, we improve the best known convergence results in the class of gradient-free algorithms solving problems under the PL condition. We generalize our results to the case where a zero-order oracle returns a function value at a point with some adversarial noise. We verify our theoretical results on the example of solving a system of nonlinear equations.
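A kernel-based gradient approximation of the kind mentioned above can be sketched as follows. The estimator averages $\frac{d}{2h}\,\big(f(x + hre) - f(x - hre)\big)\,K(r)\,e$ over random directions $e$ on the unit sphere and scalars $r$ uniform on $[-1, 1]$. The kernel $K(r) = 3r$, the smoothing radius, the sample count and the quadratic test function are assumptions made for this illustration and are not taken from the paper.

```python
import numpy as np

def kernel_grad_estimate(f, x, h, n_samples, rng):
    """Monte Carlo average of kernel-based two-point gradient estimates.

    f must accept an array of shape (n_samples, d) and return n_samples values.
    """
    d = x.size
    e = rng.standard_normal((n_samples, d))
    e /= np.linalg.norm(e, axis=1, keepdims=True)   # uniform directions on the sphere
    r = rng.uniform(-1.0, 1.0, size=n_samples)
    K = 3.0 * r                                     # assumed kernel, E[r K(r)] = 1
    steps = (h * r)[:, None] * e
    diff = f(x + steps) - f(x - steps)              # only function values are used
    return (d / (2.0 * h)) * np.mean((diff * K)[:, None] * e, axis=0)

# Hypothetical quadratic test function f(z) = 0.5 z^T Q z + c^T z.
Q = np.array([[2.0, 0.5, 0.0], [0.5, 1.0, 0.0], [0.0, 0.0, 3.0]])
c = np.array([1.0, -1.0, 0.5])
f = lambda Z: 0.5 * np.sum((Z @ Q) * Z, axis=1) + Z @ c

x0 = np.array([0.5, -1.0, 2.0])
true_grad = Q @ x0 + c
est_grad = kernel_grad_estimate(f, x0, h=1e-2, n_samples=200_000, rng=np.random.default_rng(0))
```

In a gradient-free SGD loop, a single sample (or a small batch) of this estimate would replace the exact gradient at each step.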

2.First Order Methods with Markovian Noise: from Acceleration to Variational Inequalities

2305.15938

Authors:Aleksandr Beznosikov, Sergey Samsonov, Marina Sheshukova, Alexander Gasnikov, Alexey Naumov, Eric Moulines

Abstract: This paper delves into stochastic optimization problems that involve Markovian noise. We present a unified approach for the theoretical analysis of first-order gradient methods for stochastic optimization and variational inequalities. Our approach covers scenarios for both non-convex and strongly convex minimization problems. To achieve an optimal (linear) dependence on the mixing time of the underlying noise sequence, we use the randomized batching scheme, which is based on the multilevel Monte Carlo method. Moreover, our technique allows us to eliminate the limiting assumptions of previous research on Markov noise, such as the need for a bounded domain and uniformly bounded stochastic gradients. Our extension to variational inequalities under Markovian noise is original. Additionally, we provide lower bounds that match the oracle complexity of our method in the case of strongly convex optimization problems.

3.Neural incomplete factorization: learning preconditioners for the conjugate gradient method

2305.16368

Authors:Paul Häusner, Ozan Öktem, Jens Sjölund

Abstract: In this paper, we develop a novel data-driven approach to accelerate solving large-scale linear equation systems encountered in scientific computing and optimization. Our method utilizes self-supervised training of a graph neural network to generate an effective preconditioner tailored to the specific problem domain. By replacing conventional hand-crafted preconditioners used with the conjugate gradient method, our approach, named neural incomplete factorization (NeuralIF), significantly speeds up convergence and improves computational efficiency. At the core of our method is a novel message-passing block, inspired by sparse matrix theory, that aligns with the objective to find a sparse factorization of the matrix. We evaluate our proposed method on both a synthetic and a real-world problem arising from scientific computing. Our results demonstrate that NeuralIF consistently outperforms the most common general-purpose preconditioners, including the incomplete Cholesky method, achieving competitive performance across various metrics even outside the training data distribution.

4.Certificates of Nonexistence for Lyapunov-Based Stability, Stabilizability and Detectability of LPV Systems

2305.15982

Authors:T. J. Meijer, V. S. Dolk, W. P. M. H. Heemels

Abstract: By computing Lyapunov functions of a certain, convenient structure, Lyapunov-based methods guarantee stability properties of the system or, when performing synthesis, of the relevant closed-loop or error dynamics. In doing so, they provide conclusive affirmative answers to many analysis and design questions in systems and control. When these methods fail to produce a feasible solution, however, they often remain inconclusive due to (a) the method being conservative or (b) the fact that there may be multiple causes for infeasibility, such as ill-conditioning, solver tolerances or true infeasibility. To overcome this, we develop LMI-based theorems of alternatives based upon which we can guarantee, by computing a so-called certificate of nonexistence, that no poly-quadratic Lyapunov function exists for a given linear parameter-varying system. We extend these ideas to also certify the nonexistence of controllers and observers for which the corresponding closed-loop/error dynamics admit a poly-quadratic Lyapunov function. Finally, we illustrate our results in some numerical case studies.

5.An Optimal Structured Zeroth-order Algorithm for Non-smooth Optimization

2305.16024

Authors:Marco Rando, Cesare Molinari, Lorenzo Rosasco, Silvia Villa

Abstract: Finite-difference methods are a class of algorithms designed to solve black-box optimization problems by approximating a gradient of the target function on a set of directions. In black-box optimization, the non-smooth setting is particularly relevant since, in practice, differentiability and smoothness assumptions cannot be verified. To cope with nonsmoothness, several authors use a smooth approximation of the target function and show that finite difference methods approximate its gradient. Recently, it has been proved that imposing a structure in the directions allows improving performance. However, only the smooth setting was considered. To close this gap, we introduce and analyze O-ZD, the first structured finite-difference algorithm for non-smooth black-box optimization. Our method exploits a smooth approximation of the target function and we prove that it approximates its gradient on a subset of random orthogonal directions. We analyze the convergence of O-ZD under different assumptions. For non-smooth convex functions, we obtain the optimal complexity. In the non-smooth non-convex setting, we characterize the number of iterations needed to bound the expected norm of the smoothed gradient. For smooth functions, our analysis recovers existing results for structured zeroth-order methods for the convex case and extends them to the non-convex setting. We conclude with numerical simulations where assumptions are satisfied, observing that our algorithm has very good practical performance.
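The idea of a structured finite-difference surrogate on orthogonal directions can be sketched as below. The directions are the orthonormal columns of a QR factor of a Gaussian matrix, used here as a simple stand-in for however the method actually samples them; the forward-difference form, the $d/\ell$ scaling and all names are assumptions for illustration and do not reproduce O-ZD itself.

```python
import numpy as np

def structured_fd_gradient(f, x, h, n_dirs, rng):
    """Forward differences along n_dirs random orthonormal directions (n_dirs <= d)."""
    d = x.size
    P, _ = np.linalg.qr(rng.standard_normal((d, n_dirs)))  # d x n_dirs, orthonormal columns
    fx = f(x)
    g = np.zeros(d)
    for i in range(n_dirs):
        p = P[:, i]
        g += (f(x + h * p) - fx) / h * p        # directional slope times direction
    return (d / n_dirs) * g

# Hypothetical linear test function, for which the full-basis estimate is exact.
c = np.array([1.0, -2.0, 0.5, 3.0])
f_lin = lambda z: float(c @ z) + 1.0

x0 = np.array([0.3, -0.1, 0.7, 0.2])
g_full = structured_fd_gradient(f_lin, x0, h=1e-3, n_dirs=4, rng=np.random.default_rng(0))
```

With fewer directions than dimensions the estimate is a rescaled projection of the slope information onto a random subspace, which is where the per-iteration savings in function evaluations come from.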

6.Hybrid Methods in Polynomial Optimisation

2305.16122

Authors:Johannes Aspman, Gilles Bareilles, Vyacheslav Kungurtsev, Jakub Marecek, Martin Takáč

Abstract: The Moment/Sum-of-squares hierarchy provides a way to compute the global minimizers of polynomial optimization problems (POP), at the cost of solving a sequence of increasingly large semidefinite programs (SDPs). We consider large-scale POPs, for which interior-point methods are no longer able to solve the resulting SDPs. We propose an algorithm that combines a first-order method for solving the SDP relaxation, and a second-order method on a non-convex problem obtained from the POP. The switch from the first to the second-order method is based on a quantitative criterion, whose satisfaction ensures that Newton's method converges quadratically from its first iteration. This criterion leverages the point-estimation theory of Smale and the active-set identification. We illustrate the methodology to obtain global minimizers of large-scale optimal power flow problems.

7.Accelerated Methods for Riemannian Min-Max Optimization Ensuring Bounded Geometric Penalties

2305.16186

Authors:David Martínez-Rubio, Christophe Roux, Christopher Criscitiello, Sebastian Pokutta

Abstract: In this work, we study optimization problems of the form $\min_x \max_y f(x, y)$, where $f(x, y)$ is defined on a product Riemannian manifold $\mathcal{M} \times \mathcal{N}$ and is $\mu_x$-strongly geodesically convex (g-convex) in $x$ and $\mu_y$-strongly g-concave in $y$, for $\mu_x, \mu_y \geq 0$. We design accelerated methods when $f$ is $(L_x, L_y, L_{xy})$-smooth and $\mathcal{M}$, $\mathcal{N}$ are Hadamard. To that aim we introduce new g-convex optimization results, of independent interest: we show global linear convergence for metric-projected Riemannian gradient descent and improve existing accelerated methods by reducing geometric constants. Additionally, we complete the analysis of two previous works applying to the Riemannian min-max case by removing an assumption about iterates staying in a pre-specified compact set.

8.Two-timescale Extragradient for Finding Local Minimax Points

2305.16242

Authors:Jiseok Chae, Kyuwon Kim, Donghwan Kim

Abstract: Minimax problems are notoriously challenging to optimize. However, we demonstrate that the two-timescale extragradient can be a viable solution. By utilizing dynamical systems theory, we show that it converges to points that satisfy the second-order necessary condition of local minimax points, under a mild condition. This work surpasses all previous results as we eliminate a crucial assumption that the Hessian, with respect to the maximization variable, is nondegenerate.
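A toy version of an extragradient step with two step sizes can be written as follows, using the bilinear function $f(x, y) = xy$, whose Hessian with respect to $y$ is degenerate. The choice of which variable receives the smaller step, the values of `eta` and `tau`, and the function name are assumptions for illustration; the paper's scheme and conditions should be taken from the paper itself.

```python
def two_timescale_extragradient(x, y, eta=0.5, tau=10.0, n_iter=2000):
    """Extragradient on f(x, y) = x * y with step eta / tau for x and eta for y."""
    for _ in range(n_iter):
        # Extrapolation step using gradients at the current point.
        gx, gy = y, x                          # grad_x f = y, grad_y f = x
        x_half = x - (eta / tau) * gx          # slow descent in x
        y_half = y + eta * gy                  # fast ascent in y
        # Update step using gradients at the extrapolated point.
        gx_half, gy_half = y_half, x_half
        x, y = x - (eta / tau) * gx_half, y + eta * gy_half
    return x, y

x_final, y_final = two_timescale_extragradient(1.0, 1.0)
```

On this example the iterates spiral into the origin, whereas simultaneous gradient descent-ascent with the same step sizes would spiral outward.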

9.Approaching Collateral Optimization for NISQ and Quantum-Inspired Computing

2305.16395

Authors:Megan Giron, Georgios Korpas, Waqas Parvaiz, Prashant Malik, Johannes Aspman

Abstract: Collateral optimization refers to the systematic allocation of financial assets to satisfy obligations or secure transactions, while simultaneously minimizing costs and optimizing the usage of available resources. This involves assessing a number of characteristics, such as cost of funding and quality of the underlying assets, to ascertain the optimal collateral quantity to be posted to cover exposure arising from a given transaction or a set of transactions. One of the common objectives is to minimise the cost of collateral required to mitigate the risk associated with a particular transaction or a portfolio of transactions while ensuring sufficient protection for the involved parties. Often, this results in a large-scale combinatorial optimization problem. In this study, we initially present a Mixed Integer Linear Programming (MILP) formulation for the collateral optimization problem, followed by a Quadratic Unconstrained Binary Optimization (QUBO) formulation in order to pave the way towards approaching the problem in a hybrid-quantum and NISQ-ready way. We conduct local computational small-scale tests using various Software Development Kits (SDKs) and discuss the behavior of our formulations as well as the potential for performance enhancements. We further survey the recent literature that proposes alternative ways to attack combinatorial optimization problems suitable for collateral optimization.

Wed, 24 May 2023digest

1.Block Coordinate Descent on Smooth Manifolds

2305.14744

Authors:Liangzu Peng, René Vidal

Abstract: Block coordinate descent is an optimization paradigm that iteratively updates one block of variables at a time, making it quite amenable to big data applications due to its scalability and performance. Its convergence behavior has been extensively studied in the (block-wise) convex case, but it is much less explored in the non-convex case. In this paper we analyze the convergence of block coordinate methods on non-convex sets and derive convergence rates on smooth manifolds under natural or weaker assumptions than prior work. Our analysis applies to many non-convex problems (e.g., generalized PCA, optimal transport, matrix factorization, Burer-Monteiro factorization, outlier-robust estimation, alternating projection, maximal coding rate reduction, neural collapse, adversarial attacks, homomorphic sensing), either yielding novel corollaries or recovering previously known results.
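
As a small illustration of block coordinate updates on a product of manifolds, the sketch below maximizes $x^\top A y$ over two unit spheres by alternating exact block updates. This toy problem, the function names, and the choice of exact block maximization are assumptions made for illustration; they are not taken from the paper, which covers a much broader class of methods and problems.

```python
import math


def normalize(v):
    """Project a nonzero vector onto the unit sphere."""
    norm = math.sqrt(sum(t * t for t in v))
    return [t / norm for t in v]


def bcd_on_spheres(A, y0, num_iters=50):
    """Alternately maximize x^T A y over x and over y, each on a unit sphere.

    Illustrative assumptions: num_iters >= 1, and A y and A^T x never vanish.
    """
    n, m = len(A), len(A[0])
    y = normalize(y0)
    for _ in range(num_iters):
        # Block 1: with y fixed, the maximizer over the sphere is A y / ||A y||.
        x = normalize([sum(A[i][j] * y[j] for j in range(m)) for i in range(n)])
        # Block 2: with x fixed, the maximizer over the sphere is A^T x / ||A^T x||.
        y = normalize([sum(A[i][j] * x[i] for i in range(n)) for j in range(m)])
    value = sum(x[i] * A[i][j] * y[j] for i in range(n) for j in range(m))
    return x, y, value


A = [[3.0, 0.0], [0.0, 1.0]]
x, y, value = bcd_on_spheres(A, [1.0, 1.0])
```

Each block update stays on its sphere by construction, so no separate retraction step is needed in this special case.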

2.Accelerated Nonconvex ADMM with Self-Adaptive Penalty for Rank-Constrained Model Identification

2305.14781

Authors:Qingyuan Liu, Zhengchao Huang, Hao Ye, Dexian Huang, Chao Shang

Abstract: The alternating direction method of multipliers (ADMM) has been widely adopted in low-rank approximation and low-order model identification tasks; however, the performance of nonconvex ADMM is highly reliant on the choice of penalty parameter. To accelerate ADMM for solving rank-constrained identification problems, this paper proposes a new self-adaptive strategy for automatic penalty update. Guided by first-order analysis of the increment of the augmented Lagrangian, the self-adaptive penalty updating enables effective and balanced minimization of both primal and dual residuals and thus ensures a stable convergence. Moreover, improved efficiency can be obtained within the Anderson acceleration scheme. Numerical examples show that the proposed strategy significantly accelerates the convergence of nonconvex ADMM while alleviating the critical reliance on tedious tuning of penalty parameters.

3.The Minimization of Piecewise Functions: Pseudo Stationarity

2305.14798

Authors:Ying Cui, Junyi Liu, Jong-Shi Pang

Abstract: There are many significant applied contexts that require the solution of discontinuous optimization problems in finite dimensions. Yet these problems are very difficult, both computationally and analytically. With the functions being discontinuous and a minimizer (local or global) of the problems, even if it exists, being impossible to verifiably compute, a foremost question is what kind of "stationary solutions" one can expect to obtain; these solutions provide promising candidates for minimizers; i.e., their defining conditions are necessary for optimality. Motivated by recent results on sparse optimization, we introduce in this paper such a kind of solution, termed "pseudo B- (for Bouligand) stationary solution", for a broad class of discontinuous piecewise continuous optimization problems with objective and constraint defined by indicator functions of the positive real axis composite with functions that are possibly nonsmooth. We present two approaches for computing such a solution. One approach is based on lifting the problem to a higher dimension via the epigraphical formulation of the indicator functions; this requires the addition of some auxiliary variables. The other approach is based on certain continuous (albeit not necessarily differentiable) piecewise approximations of the indicator functions and the convergence to a pseudo B-stationary solution of the original problem is established. The conditions for convergence are discussed and illustrated by an example.

4.Decentralized Control of Linear Systems with Private Input and Measurement Information

2305.14921

Authors:Juanjuan Xu, Huanshui Zhang

Abstract: In this paper, we study the linear quadratic (LQ) optimal control problem of linear systems with private input and measurement information. The main challenge lies in the unavailability of other regulators' historical input information. To overcome this difficulty, we introduce a novel kind of observer that uses the private input and measurement information, and accordingly design a new kind of decentralized controller. In particular, it is verified that the corresponding cost function under the proposed decentralized controllers is asymptotically optimal in comparison with the optimal cost under the optimal state-feedback controller. To the best of our knowledge, the results presented in this paper are new and represent a fundamental contribution to classical decentralized control.

5.Improved Complexity Analysis of the Sinkhorn and Greenkhorn Algorithms for Optimal Transport

2305.14939

Authors:Jianzhou Luo, Dingchuan Yang, Ke Wei

Abstract: The Sinkhorn algorithm is a widely used method for solving the optimal transport problem, and the Greenkhorn algorithm is one of its variants. While there are modified versions of these two algorithms whose computational complexities are $O({n^2\|C\|_\infty^2\log n}/{\varepsilon^2})$ to achieve an $\varepsilon$-accuracy, the best known complexities for the vanilla versions are $O({n^2\|C\|_\infty^3\log n}/{\varepsilon^3})$. In this paper we fill this gap and show that the complexities of the vanilla Sinkhorn and Greenkhorn algorithms are indeed $O({n^2\|C\|_\infty^2\log n}/{\varepsilon^2})$. The analysis relies on the equicontinuity of the dual variables of the entropic regularized optimal transport problem, which is of independent interest.
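
For readers unfamiliar with the iteration being analyzed, the following is a minimal sketch of the vanilla Sinkhorn scaling loop for entropic regularized optimal transport. The function name, the regularization parameter `eta`, the stopping rule on the row-marginal violation, and the toy data are illustrative assumptions; the rounding step and the parameter choices behind the stated complexity bounds are not reproduced here.

```python
import math


def sinkhorn(C, r, c, eta, tol=1e-6, max_iter=10_000):
    """Vanilla Sinkhorn: alternately rescale rows and columns of K = exp(-C / eta).

    C is the cost matrix, r and c are the target row and column marginals.
    Returns the (approximately feasible) transport plan as a list of lists.
    """
    n, m = len(r), len(c)
    K = [[math.exp(-C[i][j] / eta) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(max_iter):
        # Row scaling: match the row marginals r.
        u = [r[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        # Column scaling: match the column marginals c.
        v = [c[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
        # Columns are now matched; stop once the rows are matched up to tol.
        err = sum(
            abs(u[i] * sum(K[i][j] * v[j] for j in range(m)) - r[i])
            for i in range(n)
        )
        if err < tol:
            break
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]


# Toy instance (assumed data): two sources, two sinks.
P = sinkhorn([[0.0, 1.0], [1.0, 0.0]], [0.3, 0.7], [0.6, 0.4], eta=0.5)
```

Greenkhorn differs in that it rescales only the single row or column with the largest marginal violation at each step, rather than all rows and then all columns.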

6.A discrete-time Pontryagin maximum principle under rate constraints

2305.14940

Authors:Siddhartha Ganguly, Souvik Das, Debasish Chatterjee, Ravi Banavar

Abstract: Limited bandwidth and limited saturation in actuators are practical concerns in control systems. Mathematically, these limitations manifest as constraints being imposed on the control actions, their rates of change, and more generally, the global behavior of their paths. While the problem of actuator saturation has been studied extensively, little attention has been devoted to the problem of actuators having limited bandwidth. While attempts have been made in the direction of incorporating frequency constraints on state-action trajectories before, rate constraints on the control at the design stage have not been studied extensively in the discrete-time regime. This article contributes toward filling this lacuna. In particular, we establish a new discrete-time Pontryagin maximum principle with rate constraints being imposed on the control trajectories, and derive first-order necessary conditions for optimality. A brief discussion on the existence of optimal control is included, and numerical examples are provided to illustrate the results.

7.A note on the computational complexity of the moment-SOS hierarchy for polynomial optimization

2305.14944

Authors:Sander Gribling, Sven Polak, Lucas Slot

Abstract: The moment-sum-of-squares (moment-SOS) hierarchy is one of the most celebrated and widely applied methods for approximating the minimum of an n-variate polynomial over a feasible region defined by polynomial (in)equalities. A key feature of the hierarchy is that, at a fixed level, it can be formulated as a semidefinite program of size polynomial in the number of variables n. Although this suggests that it may therefore be computed in polynomial time, this is not necessarily the case. Indeed, as O'Donnell (2017) and later Raghavendra & Weitz (2017) show, there exist examples where the sos-representations used in the hierarchy have exponential bit-complexity. We study the computational complexity of the moment-SOS hierarchy, complementing and expanding upon earlier work of Raghavendra & Weitz (2017). In particular, we establish algebraic and geometric conditions under which polynomial-time computation is guaranteed to be possible.

8.ReSync: Riemannian Subgradient-based Robust Rotation Synchronization

2305.15136

Authors:Huikang Liu, Xiao Li, Anthony Man-Cho So

Abstract: This work presents ReSync, a Riemannian subgradient-based algorithm for solving the robust rotation synchronization problem, which arises in various engineering applications. ReSync solves a least-unsquared minimization formulation over the rotation group, which is nonsmooth and nonconvex, and aims at recovering the underlying rotations directly. We provide strong theoretical guarantees for ReSync under the random corruption setting. Specifically, we first show that the initialization procedure of ReSync yields a proper initial point that lies in a local region around the ground-truth rotations. We next establish the weak sharpness property of the aforementioned formulation and then utilize this property to derive the local linear convergence of ReSync to the ground-truth rotations. By combining these guarantees, we conclude that ReSync converges linearly to the ground-truth rotations under appropriate conditions. Experimental results demonstrate the effectiveness of ReSync.

9.Approximating Multiobjective Optimization Problems: How exact can you be?

2305.15142

Authors:Cristina Bazgan, Arne Herzel, Stefan Ruzika, Clemens Thielen, Daniel Vanderpooten

Abstract: It is well known that, under very weak assumptions, multiobjective optimization problems admit $(1+\varepsilon,\dots,1+\varepsilon)$-approximation sets (also called $\varepsilon$-Pareto sets) of polynomial cardinality (in the size of the instance and in $\frac{1}{\varepsilon}$). While an approximation guarantee of $1+\varepsilon$ for any $\varepsilon>0$ is the best one can expect for singleobjective problems (apart from solving the problem to optimality), even better approximation guarantees than $(1+\varepsilon,\dots,1+\varepsilon)$ can be considered in the multiobjective case since the approximation might be exact in some of the objectives. Hence, in this paper, we consider partially exact approximation sets that require to approximate each feasible solution exactly, i.e., with an approximation guarantee of $1$, in some of the objectives while still obtaining a guarantee of $1+\varepsilon$ in all others. We characterize the types of polynomial-cardinality, partially exact approximation sets that are guaranteed to exist for general multiobjective optimization problems. Moreover, we study minimum-cardinality partially exact approximation sets concerning (weak) efficiency of the contained solutions and relate their cardinalities to the minimum cardinality of a $(1+\varepsilon,\dots,1+\varepsilon)$-approximation set.

10.Efficiently Constructing Convex Approximation Sets in Multiobjective Optimization Problems

2305.15166

Authors:Stephan Helfrich, Stefan Ruzika, Clemens Thielen

Abstract: Convex approximation sets for multiobjective optimization problems are a well-studied relaxation of the common notion of approximation sets. Instead of approximating each image of a feasible solution by the image of some solution in the approximation set up to a multiplicative factor in each component, a convex approximation set only requires this multiplicative approximation to be achieved by some convex combination of finitely many images of solutions in the set. This makes convex approximation sets efficiently computable for a wide range of multiobjective problems - even for many problems for which (classic) approximation sets are hard to compute. In this article, we propose a polynomial-time algorithm to compute convex approximation sets that builds upon an exact or approximate algorithm for the weighted sum scalarization and is, therefore, applicable to a large variety of multiobjective optimization problems. The provided convex approximation quality is arbitrarily close to the approximation quality of the underlying algorithm for the weighted sum scalarization. In essence, our algorithm can be interpreted as an approximate variant of the dual variant of Benson's Outer Approximation Algorithm. Thus, in contrast to existing convex approximation algorithms from the literature, information on solutions obtained during the approximation process is utilized to significantly reduce both the practical running time and the cardinality of the returned solution sets while still guaranteeing the same worst-case approximation quality. We underpin these advantages by the first comparison of all existing convex approximation algorithms on several instances of the triobjective knapsack problem and the triobjective symmetric metric traveling salesman problem.

11.The Cooperative Maximum Capture Facility Location Problem

2305.15169

Authors:Concepción Domínguez, Ricardo Gázquez, Juan Miguel Morales, Salvador Pineda

Abstract: In the Maximum Capture Facility Location (MCFL) problem with a binary choice rule, a company intends to locate a series of facilities to maximize the captured demand, and customers patronize the facility that maximizes their utility. In this work, we generalize the MCFL problem assuming that the facilities of the decision maker act cooperatively to increase the customers' utility over the company. We propose a utility maximization rule between the captured utility of the decision maker and the opt-out utility of a competitor already installed in the market. Furthermore, we model the captured utility by means of an Ordered Median function (OMf) of the partial utilities of newly open facilities. We name this problem "the Cooperative Maximum Capture Facility Location problem" (CMCFL). The OMf serves as a means to compute the utility of each customer towards the company as an aggregation of ordered partial utilities, and constitutes a unifying framework for CMCFL models. We introduce a multiperiod non-linear bilevel formulation for the CMCFL with an embedded assignment problem characterizing the captured utilities. For this model, two exact resolution approaches are presented: a MILP reformulation with valid inequalities and an effective approach based on Benders' decomposition. Extensive computational experiments are provided to test our results with randomly generated data and an application to the location of charging stations for electric vehicles in the city of Trois-Rivières, Québec, is addressed.

12.Using Scalarizations for the Approximation of Multiobjective Optimization Problems: Towards a General Theory

2305.15173

Authors:Stephan Helfrich, Arne Herzel, Stefan Ruzika, Clemens Thielen

Abstract: We study the approximation of general multiobjective optimization problems with the help of scalarizations. Existing results state that multiobjective minimization problems can be approximated well by norm-based scalarizations. However, for multiobjective maximization problems, only impossibility results are known so far. Countering this, we show that all multiobjective optimization problems can, in principle, be approximated equally well by scalarizations. In this context, we introduce a transformation theory for scalarizations that establishes the following: Suppose there exists a scalarization that yields an approximation of a certain quality for arbitrary instances of multiobjective optimization problems with a given decomposition specifying which objective functions are to be minimized / maximized. Then, for each other decomposition, our transformation yields another scalarization that yields the same approximation quality for arbitrary instances of problems with this other decomposition. In this sense, the existing results about the approximation via scalarizations for minimization problems carry over to any other objective decomposition -- in particular, to maximization problems -- when suitably adapting the employed scalarization. We further provide necessary and sufficient conditions on a scalarization such that its optimal solutions achieve a constant approximation quality. We give an upper bound on the best achievable approximation quality that applies to general scalarizations and is tight for the majority of norm-based scalarizations applied in the context of multiobjective optimization. As a consequence, none of these norm-based scalarizations can induce approximation sets for optimization problems with maximization objectives, which unifies and generalizes the existing impossibility results concerning the approximation of maximization problems.

13.A Privacy-Preserving Finite-Time Push-Sum based Gradient Method for Distributed Optimization over Digraphs

2305.15202

Authors:Xiaomeng Chen, Wei Jiang, Themistoklis Charalambous, Ling Shi

Abstract: This paper addresses the problem of distributed optimization, where a network of agents represented as a directed graph (digraph) aims to collaboratively minimize the sum of their individual cost functions. Existing approaches for distributed optimization over digraphs, such as Push-Pull, require agents to exchange explicit state values with their neighbors in order to reach an optimal solution. However, this can result in the disclosure of sensitive and private information. To overcome this issue, we propose a state-decomposition-based privacy-preserving finite-time push-sum (PrFTPS) algorithm without any global information such as network size or graph diameter. Then, based on PrFTPS, we design a gradient descent algorithm (PrFTPS-GD) to solve the distributed optimization problem. It is proved that under PrFTPS-GD, the privacy of each agent is preserved and the linear convergence rate related to the optimization iteration number is achieved. Finally, numerical simulations are provided to illustrate the effectiveness of the proposed approach.
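
For context, the classical (non-private, asymptotic) push-sum iteration for averaging over a digraph can be sketched as follows. This is not the PrFTPS algorithm of the paper: it has no state decomposition and no finite-time termination. The function name, the equal-split weights, and the three-node graph are assumptions chosen for illustration.

```python
def push_sum(values, out_neighbors, num_iters=200):
    """Classical push-sum (ratio consensus) on a directed graph.

    out_neighbors[j] lists the nodes that j sends to; it is assumed to contain
    j itself, and the graph is assumed to be strongly connected.
    """
    n = len(values)
    x = list(values)   # running "mass" of the quantity to be averaged
    w = [1.0] * n      # running weights that correct for the graph imbalance
    for _ in range(num_iters):
        new_x = [0.0] * n
        new_w = [0.0] * n
        for j in range(n):
            # Node j splits its mass equally among its out-neighbors
            # (a column-stochastic weight choice).
            share = 1.0 / len(out_neighbors[j])
            for i in out_neighbors[j]:
                new_x[i] += share * x[j]
                new_w[i] += share * w[j]
        x, w = new_x, new_w
    # Each ratio x_i / w_i approaches the average of the initial values.
    return [x[i] / w[i] for i in range(n)]


# Assumed toy digraph: 0 -> {0, 1, 2}, 1 -> {1, 2}, 2 -> {2, 0}.
estimates = push_sum([1.0, 2.0, 6.0], [[0, 1, 2], [1, 2], [2, 0]])
```

Note that every node transmits its explicit `x` and `w` values here, which is exactly the kind of disclosure the abstract identifies as a privacy concern.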

14.Error Feedback Shines when Features are Rare

2305.15264

Authors:Peter Richtárik, Elnur Gasanov, Konstantin Burlachenko

Abstract: We provide the first proof that gradient descent $\left({\color{green}\sf GD}\right)$ with greedy sparsification $\left({\color{green}\sf TopK}\right)$ and error feedback $\left({\color{green}\sf EF}\right)$ can obtain better communication complexity than vanilla ${\color{green}\sf GD}$ when solving the distributed optimization problem $\min_{x\in \mathbb{R}^d} {f(x)=\frac{1}{n}\sum_{i=1}^n f_i(x)}$, where $n$ = # of clients, $d$ = # of features, and $f_1,\dots,f_n$ are smooth nonconvex functions. Despite intensive research since 2014 when ${\color{green}\sf EF}$ was first proposed by Seide et al., this problem remained open until now. We show that ${\color{green}\sf EF}$ shines in the regime when features are rare, i.e., when each feature is present in the data owned by a small number of clients only. To illustrate our main result, we show that in order to find a random vector $\hat{x}$ such that $\lVert {\nabla f(\hat{x})} \rVert^2 \leq \varepsilon$ in expectation, ${\color{green}\sf GD}$ with the ${\color{green}\sf Top1}$ sparsifier and ${\color{green}\sf EF}$ requires ${\cal O} \left(\left( L+{\color{blue}r} \sqrt{ \frac{{\color{red}c}}{n} \min \left( \frac{{\color{red}c}}{n} \max_i L_i^2, \frac{1}{n}\sum_{i=1}^n L_i^2 \right) }\right) \frac{1}{\varepsilon} \right)$ bits to be communicated by each worker to the server only, where $L$ is the smoothness constant of $f$, $L_i$ is the smoothness constant of $f_i$, ${\color{red}c}$ is the maximal number of clients owning any feature ($1\leq {\color{red}c} \leq n$), and ${\color{blue}r}$ is the maximal number of features owned by any client ($1\leq {\color{blue}r} \leq d$). Clearly, the communication complexity improves as ${\color{red}c}$ decreases (i.e., as features become more rare), and can be much better than the ${\cal O}({\color{blue}r} L \frac{1}{\varepsilon})$ communication complexity of ${\color{green}\sf GD}$ in the same regime.

15.Mathematical Models and Exact Algorithms for the Colored Bin Packing Problem

2305.15291

Authors:Yulle G. F. Borges, Rafael C. S. Schouery, Flávio K. Miyazawa

Abstract: This paper focuses on exact approaches for the Colored Bin Packing Problem (CBPP), a generalization of the classical one-dimensional Bin Packing Problem in which each item has, in addition to its length, a color, and no two items of the same color can appear consecutively in the same bin. To simplify modeling, we present a characterization of any feasible packing of this problem in a way that does not depend on its ordering. Furthermore, we present four exact algorithms for the CBPP. First, we propose a generalization of Valério de Carvalho's arc flow formulation for the CBPP using a graph with multiple layers, each representing a color. Second, we present an improved arc flow formulation that uses a more compact graph and has the same linear relaxation bound as the first formulation. And finally, we design two exponential set-partition models based on reductions to a generalized vehicle routing problem, which are solved by a branch-cut-and-price algorithm through VRPSolver. To compare the proposed algorithms, a varied benchmark set with 574 instances of the CBPP is presented. Results show that the best model, our improved arc flow formulation, was able to solve over 62% of the proposed instances to optimality, the largest of which has 500 items and 37 colors. While being able to solve fewer instances in total, the set-partition models exceeded their arc flow counterparts in instances with a very small number of colors.

16.Mean field type control with species dependent dynamics via structured tensor optimization

2305.15292

Authors:Axel Ringh, Isabel Haasler, Yongxin Chen, Johan Karlsson

Abstract: In this work we consider mean field type control problems with multiple species that have different dynamics. We formulate the discretized problem using a new type of entropy-regularized multimarginal optimal transport problem where the cost is a decomposable structured tensor. A novel algorithm for solving such problems is derived, using this structure and leveraging recent results in entropy-regularized optimal transport. The algorithm is then demonstrated on a numerical example in a robot coordination problem for search and rescue, where three different types of robots are used to cover a given area at minimal cost.

17.Inverse optimal control for averaged cost per stage linear quadratic regulators

2305.15332

Authors:Han Zhang, Axel Ringh

Abstract: Inverse Optimal Control (IOC) is a powerful framework for learning a behaviour from observations of experts. The framework aims to identify the underlying cost function that the observed optimal trajectories (the experts' behaviour) are optimal with respect to. In this work, we consider the case of identifying the cost and the feedback law from observed trajectories generated by an "average cost per stage" linear quadratic regulator. We show that identifying the cost is in general an ill-posed problem, and give necessary and sufficient conditions for non-identifiability. Moreover, despite the fact that the problem is in general ill-posed, we construct an estimator for the cost function and show that the control gain corresponding to this estimator is a statistically consistent estimator for the true underlying control gain. In fact, the constructed estimator is based on convex optimization, and hence the proved statistical consistency is also observed in practice. We illustrate the latter by applying the method to a simulation example from rehabilitation robotics.

18.Algorithms for the Bin Packing Problem with Scenarios

2305.15351

Authors:Yulle G. F. Borges, Vinícius L. de Lima, Flávio K. Miyazawa, Lehilton L. C. Pedrosa, Thiago A. de Queiroz, Rafael C. S. Schouery

Abstract: This paper presents theoretical and practical results for the bin packing problem with scenarios, a generalization of the classical bin packing problem which considers the presence of uncertain scenarios, of which only one is realized. For this problem, we propose an absolute approximation algorithm whose ratio is bounded by the square root of the number of scenarios times the approximation ratio for an algorithm for the vector bin packing problem. We also show how an asymptotic polynomial-time approximation scheme is derived when the number of scenarios is constant. As a practical study of the problem, we present a branch-and-price algorithm to solve an exponential model and a variable neighborhood search heuristic. To speed up the convergence of the exact algorithm, we also consider lower bounds based on dual feasible functions. Results of these algorithms show the competence of the branch-and-price in obtaining optimal solutions for about 59% of the instances considered, while the combined heuristic and branch-and-price optimally solved 62% of the instances considered.

19.LQG Risk-Sensitive Mean Field Games with a Major Agent: A Variational Approach

2305.15364

Authors:Hanchao Liu, Dena Firoozi, Michèle Breton

Abstract: Risk sensitivity plays an important role in the study of finance and economics as risk-neutral models cannot capture and justify all economic behaviors observed in reality. Risk-sensitive mean field game theory was developed recently for systems where there exists a large number of indistinguishable, asymptotically negligible and heterogeneous risk-sensitive players, who are coupled via the empirical distribution of state across population. In this work, we extend the theory of Linear Quadratic Gaussian risk-sensitive mean-field games to the setup where there exists one major agent as well as a large number of minor agents. The major agent has a significant impact on each minor agent and its impact does not collapse with the increase in the number of minor agents. Each agent is subject to linear dynamics with an exponential-of-integral quadratic cost functional. Moreover, all agents interact via the average state of minor agents (so-called empirical mean field) and the major agent's state. We develop a variational analysis approach to derive the best response strategies of agents in the limiting case where the number of agents goes to infinity. We establish that the set of obtained best-response strategies yields a Nash equilibrium in the limiting case and an $\varepsilon$-Nash equilibrium in the finite player case. We conclude the paper with an illustrative example.

Tue, 23 May 2023digest

1.One-step differentiation of iterative algorithms

2305.13768

Authors:Jérôme Bolte, Edouard Pauwels, Samuel Vaiter

Abstract: In appropriate frameworks, automatic differentiation is transparent to the user at the cost of being a significant computational burden when the number of operations is large. For iterative algorithms, implicit differentiation alleviates this issue but requires custom implementation of Jacobian evaluation. In this paper, we study one-step differentiation, also known as Jacobian-free backpropagation, a method as easy as automatic differentiation and as performant as implicit differentiation for fast algorithms (e.g., superlinear optimization methods). We provide a complete theoretical approximation analysis with specific examples (Newton's method, gradient descent) along with its consequences in bilevel optimization. Several numerical examples illustrate the well-foundedness of the one-step estimator.
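
The idea can be illustrated on a scalar toy problem, chosen here as an assumption and not taken from the paper: computing $\sqrt{\theta}$ by Newton's method with the update $F(x,\theta) = \tfrac{1}{2}(x + \theta/x)$. One-step differentiation runs the iterations without tracking derivatives and then differentiates only the final update with respect to $\theta$, treating the previous iterate as a constant. Since $\partial_x F = \tfrac{1}{2}(1 - \theta/x^2)$ vanishes at the fixed point, the one-step estimate $\partial_\theta F = 1/(2x)$ agrees with the true derivative $1/(2\sqrt{\theta})$ in this example.

```python
def newton_sqrt(theta, x0=1.0, num_iters=20):
    """Newton iteration x <- 0.5 * (x + theta / x) for solving x**2 = theta."""
    x = x0
    for _ in range(num_iters):
        x = 0.5 * (x + theta / x)
    return x


def one_step_derivative(theta, x_last):
    """Differentiate only the last update F(x, theta) = 0.5 * (x + theta / x)
    with respect to theta, holding the incoming iterate x_last fixed."""
    return 0.5 / x_last


theta = 4.0
x_star = newton_sqrt(theta)
d_one_step = one_step_derivative(theta, x_star)
d_exact = 0.5 / theta ** 0.5  # derivative of sqrt(theta), for comparison
```

For slower methods such as gradient descent, $\partial_x F$ does not vanish at the fixed point, so the one-step estimate is in general only an approximation.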

2.Linear Boundary Port-Hamiltonian Systems with Implicitly Defined Energy

2305.13772

Authors:Bernhard Maschke, Arjan van der Schaft

Abstract: In this paper we extend the previously introduced class of boundary port-Hamiltonian systems to boundary control systems where the variational derivative of the Hamiltonian functional is replaced by a pair of reciprocal differential operators. In physical systems modelling, these differential operators naturally represent the constitutive relations associated with the implicitly defined energy of the system and obey Maxwell's reciprocity conditions. On top of the boundary variables associated with the Stokes-Dirac structure, this leads to additional boundary port variables and to the new notion of a Stokes-Lagrange subspace. This extended class of boundary port-Hamiltonian systems is illustrated by a number of examples in the modelling of elastic rods with local and non-local elasticity relations. Finally, it is shown how a Hamiltonian functional on an extended state space can be associated with the Stokes-Lagrange subspace, and how this leads to an energy balance equation involving the boundary variables of the Stokes-Dirac structure as well as of the Stokes-Lagrange subspace.

3.Distributed Inexact Newton Method with Adaptive Step Sizes

2305.13985

Authors:Dusan Jakovetic, Natasa Krejic, Greta Malaspina

Abstract: We consider two formulations for distributed optimization wherein $N$ agents in a generic connected network solve a problem of common interest: distributed personalized optimization and consensus optimization. A new method termed DINAS (Distributed Inexact Newton method with Adaptive Stepsize) is proposed. DINAS employs large adaptively computed step-sizes, requires less knowledge of global parameters than existing alternatives, and can operate without any local Hessian inverse calculations or Hessian communications. When solving personalized distributed learning formulations, DINAS achieves quadratic convergence with respect to computational cost and linear convergence with respect to communication cost, the latter rate being independent of the local functions' condition numbers or of the network topology. When solving consensus optimization problems, DINAS is shown to converge to the global solution. Extensive numerical experiments demonstrate significant improvements of DINAS over existing alternatives. As a result of independent interest, we provide for the first time a convergence analysis of the Newton method with the adaptive Polyak's step-size when the Newton direction is computed inexactly in a centralized environment.

4.The Ensemble Approach of Column Generation for Solving Cutting Stock Problems

2305.14055

Authors:Mingjie Hu, Jie Yan, Liting Chen, Qingwei Lin

Abstract: This paper investigates column generation (CG) for solving cutting stock problems (CSP). The traditional CG method, which repeatedly solves a restricted master problem (RMP), often suffers from two critical issues in practice: the loss of solution quality introduced by linear relaxation of both the feasible domain and the objective, and the high time cost of the last iterations close to convergence. We empirically find that the first issue is common in ordinary CSPs with linear cutting constraints, while the second issue is especially severe in CSPs with nonlinear cutting constraints that are often generated by approximating chance constraints. We propose an alternative approach, ensembles of multiple column generation processes. In particular, we present two methods: multi-column, which returns multiple feasible columns in each RMP iteration, and multi-path, which restarts the RMP iterations from different initialized column sets once the iteration time exceeds a given time limit. The ideas behind them are the same: leverage the multiple column generation paths to compensate for the loss induced by relaxation, and add earlier sub-optimal columns to accelerate convergence of RMP iterations. Besides, we give a theoretical analysis of performance improvement guarantees. Experiments on cutting stock problems demonstrate that compared to traditional CG, our method achieves significant run-time reduction on CSPs with nonlinear constraints, and dramatically improves the ratio of solve-to-optimal on CSPs with linear constraints.

5.An Equivalent Circuit Workflow for Unconstrained Optimization

2305.14061

Authors:Aayushya Agarwal, Carmel Fiscko, Soummya Kar, Larry Pileggi, Bruno Sinopoli

Abstract: We introduce a new workflow for unconstrained optimization whereby objective functions are mapped onto a physical domain to more easily design algorithms that are robust to hyperparameters and achieve fast convergence rates. Specifically, we represent optimization problems as equivalent circuits that are then solved solely as nonlinear circuits using robust solution methods. The equivalent circuit models the trajectory of a component-wise scaled gradient flow problem as the transient response of the circuit, for which the steady state coincides with a critical point of the objective function. The equivalent circuit model leverages circuit domain knowledge to methodically design new optimization algorithms that would likely not be developed without a physical model. We incorporate circuit knowledge into optimization methods by 1) enhancing the underlying circuit model for fast numerical analysis, 2) controlling the optimization trajectory by designing the nonlinear circuit components, and 3) solving for step sizes using well-known methods from circuit simulation. We first establish the necessary conditions that the controls must fulfill for convergence. We show that existing descent algorithms can be re-derived as special cases of this approach and derive new optimization algorithms that are developed with insights from a circuit-based model. The new algorithms can be designed to be robust to hyperparameters, achieve convergence rates comparable to or faster than state-of-the-art methods, and are applicable to optimizing a variety of both convex and nonconvex problems.

6.Revisiting Subgradient Method: Complexity and Convergence Beyond Lipschitz Continuity

2305.14161

Authors:Xiao Li, Lei Zhao, Daoli Zhu, Anthony Man-Cho So

Abstract: The subgradient method is one of the most fundamental algorithmic schemes for nonsmooth optimization. The existing complexity and convergence results for this algorithm are mainly derived for Lipschitz continuous objective functions. In this work, we first extend the typical complexity results for the subgradient method to convex and weakly convex minimization without assuming Lipschitz continuity. Specifically, we establish an $\mathcal{O}(1/\sqrt{T})$ bound in terms of the suboptimality gap ``$f(x) - f^*$'' for the convex case and an $\mathcal{O}(1/{T}^{1/4})$ bound in terms of the gradient of the Moreau envelope function for the weakly convex case. Furthermore, we provide convergence results for non-Lipschitz convex and weakly convex objective functions using proper diminishing rules on the step sizes. In particular, when $f$ is convex, we show an $\mathcal{O}(\log(k)/\sqrt{k})$ rate of convergence in terms of the suboptimality gap. With an additional quadratic growth condition, the rate is improved to $\mathcal{O}(1/k)$ in terms of the squared distance to the optimal solution set. When $f$ is weakly convex, asymptotic convergence is derived. The central idea is that the dynamics of a properly chosen step size rule fully control the movement of the subgradient method, which leads to boundedness of the iterates, and then a trajectory-based analysis can be conducted to establish the desired results. To further illustrate the wide applicability of our framework, we extend the complexity results to the truncated subgradient, the stochastic subgradient, the incremental subgradient, and the proximal subgradient methods for non-Lipschitz functions.
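
As a rough illustration of the basic scheme discussed in this abstract, the sketch below runs the plain subgradient method with the diminishing step size alpha_k = c / sqrt(k + 1) on a convex objective that is not Lipschitz continuous on the whole space. The test function f(x) = ||x||_1 + 0.5 ||x||^2, the constant c, and all function names are assumptions chosen for illustration; they are not taken from the paper.

```python
import math


def subgradient_method(f, subgrad, x0, num_iters, c=1.0):
    """Plain subgradient method with steps alpha_k = c / sqrt(k + 1).

    Returns the best iterate seen and its objective value, because the
    subgradient method is not a descent method.
    """
    x = list(x0)
    best_x, best_f = list(x), f(x)
    for k in range(num_iters):
        g = subgrad(x)
        alpha = c / math.sqrt(k + 1)
        x = [xi - alpha * gi for xi, gi in zip(x, g)]
        fx = f(x)
        if fx < best_f:
            best_x, best_f = list(x), fx
    return best_x, best_f


def f(x):
    # Illustrative objective (an assumption): convex, nonsmooth at 0, and
    # not globally Lipschitz because of the quadratic term. Minimum value 0.
    return sum(abs(xi) + 0.5 * xi * xi for xi in x)


def subgrad(x):
    # One valid subgradient of f: sign(x_i) + x_i per coordinate.
    def sign(t):
        return (t > 0) - (t < 0)
    return [sign(xi) + xi for xi in x]


best_x, best_f = subgradient_method(f, subgrad, [5.0, -3.0], 2000)
print(best_f)
```

Tracking the best iterate rather than the last one matches the usual way suboptimality-gap bounds for the subgradient method are stated.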

Mon, 22 May 2023digest

1.Chain recurrence and Selgrade's theorem for affine flows

2305.12758

Authors:Fritz Colonius, Alexandre J. Santana

Abstract: Affine flows on vector bundles with chain transitive base flow are lifted to linear flows and the decomposition into exponentially separated subbundles provided by Selgrade's theorem is determined. The results are illustrated by an application to affine control systems with bounded control range.

2.Multi-task Combinatorial Optimization: Adaptive Multi-modality Knowledge Transfer by an Explicit Inter-task Distance

2305.12807

Authors:Peng Li, Bo Liu

Abstract: Scheduling problems are often tackled independently, and rarely solved by leveraging the commonalities across problems. Lack of awareness of this inter-task similarity could impede the search efficacy. A quantifiable relationship between scheduling problems is to date rather unclear, how to leverage it in combinatorial optimization remains largely unknown, and its effects on search are also undeterminable. This paper addresses these hard questions by delving into quantifiable useful inter-task relationships and, through leveraging the explicit relationship, presenting a speed-up algorithm. After deriving an analytical inter-task distance metric to quantitatively reveal latent similarity across scheduling problems, an adaptive transfer of multi-modality knowledge is devised to promptly adjust the transfer in forms of explicit and implicit knowledge in response to heterogeneity in the inter-task discrepancy. For faintly related problems with disappearing dependences, a problem transformation function is suggested with a matching-feature-based greedy policy, and the function projects faintly related problems into a latent space where these problems gain similarity in a way that creates search speed-ups. Finally, a multi-task scatter search combinatorial algorithm is formed and a large-scale multi-task benchmark is generated for the purpose of validation. The algorithm exhibits speed-ups of 2~3 orders of magnitude over direct problem solving on strongly related problems and is 3 times faster on weakly related ones, which suggests that leveraging commonality across problems could be successful.

3.Robust data-driven Lyapunov analysis with fixed data

2305.12813

Authors:Yingzhao Lian, Matteo Tacchi, Colin Jones

Abstract: In this era of digitalization, data has widely been used in control engineering. While stability analysis is a mainstay for control science, most stability analysis tools still require explicit knowledge of the model or a high-fidelity simulator. In this work, a new data-driven Lyapunov analysis framework is proposed. Without using the model or its simulator, the proposed approach can learn a piece-wise affine Lyapunov function with a finite and fixed off-line dataset. The learnt Lyapunov function is robust to any dynamics that are consistent with the off-line dataset. Along with the development of the proposed scheme, the Lyapunov stability criterion is generalized. This generalization enables an iterative algorithm to augment the region of attraction.

4.Non-uniform Grid Refinement for the Combinatorial Integral Approximation

2305.12846

Authors:Felix Bestehorn, Christoph Hansknecht, Christian Kirches, Paul Manns

Abstract: The combinatorial integral approximation (CIA) is a solution technique for integer optimal control problems. In order to regularize the solutions produced by CIA, one can minimize switching costs in one of its algorithmic steps. This leads to combinatorial optimization problems, which are called switching cost aware rounding problems (SCARP). They can be solved efficiently on one-dimensional domains but no efficient solution algorithms have been found so far for multi-dimensional domains. The CIA problem formulation depends on a discretization grid. We propose to reduce the number of variables and thus improve the computational tractability of SCARP by means of a non-uniform grid refinement strategy. We prove that the grid refinement preserves the approximation properties of the combinatorial integral approximation. Computational results are offered to show that the proposed approach is able to achieve, within a prescribed time limit, smaller duality gaps than does the uniform approach. For several large instances, a dual bound could only be obtained through adaptivity.

5.Variance Decay Property for Filter Stability

2305.12850

Authors:Jin Won Kim, Prashant G. Mehta

Abstract: This paper is concerned with the problem of nonlinear (stochastic) filter stability of a hidden Markov model (HMM) with white noise observations. The main contribution is the variance decay property which is used to conclude filter stability. The property is closely inspired by the Poincar\'e inequality (PI) in the study of stochastic stability of Markov processes. In this paper, the property is related to both the ergodicity of the Markov process as well as the observability of the HMM. The proofs are based upon a recently discovered minimum variance duality which is used to transform the nonlinear filtering problem into a stochastic optimal control problem for a backward stochastic differential equation (BSDE).

6.An output-polynomial time algorithm to determine all supported efficient solutions for multi-objective integer network flow problems

2305.12867

Authors:David Könen, Michael Stiglmayr

Abstract: This paper addresses the problem of enumerating all supported efficient solutions for a linear multi-objective integer minimum cost flow problem (MOIMCF). First, we highlight an inconsistency in various definitions of supported nondominated vectors for multi-objective integer linear programs (MOILP). Several characterizations for supported nondominated vectors/efficient solutions are used in the literature, which are equivalent in the non-integer case. However, they may lead to different sets of supported nondominated vectors/efficient solutions for MOILPs. This motivates us to summarize equivalent definitions and characterizations for supported efficient solutions and to distinguish between supported and weakly supported efficient solutions. In this paper we derive an output-polynomial time algorithm to determine all supported efficient solutions for MOIMCF problems. This is the first approach that solves this general problem in output-polynomial time. Moreover, we prove that the existence of an output-polynomial time algorithm to determine all weakly supported nondominated vectors (or all weakly supported efficient solutions) for a MOIMCF problem with a fixed number of d>3 objectives can be excluded, unless P = NP.

7."Good Lie Brackets" for Control Affine Systems

2305.12879

Authors:Andrei Agrachev

Abstract: We consider a smooth system of the form $\dot q=f_0(q)+\sum\limits_{i=1}^k u_i f_i(q)$, $q\in M,\ u_i\in\mathbb R,$ and study controllability issues on the group of diffeomorphisms of $M$. It is well-known that the system can arbitrarily well approximate the movement in the direction of any Lie bracket polynomial of $f_1,\ldots,f_k$. Any Lie bracket polynomial of $f_1,\ldots,f_k$ is good in this sense. Moreover, some combinations of Lie brackets which involve the drift term $f_0$ are also good but surely not all of them. In this paper we try to characterize good ones and, in particular, all universal good combinations, which are good for any nilpotent truncation of any system.

8.On the online path extension problem -- Location and routing problems in board games

2305.12898

Authors:Konstantin Kraus, Kathrin Klamroth, Michael Stiglmayr

Abstract: We consider an online version of a longest path problem in an undirected and planar graph that is motivated by a location and routing problem occurring in the board game "Turn & Taxis". Path extensions have to be selected based on only partial knowledge of the order in which nodes become available in later iterations. Besides board games, online path extension problems have applications in disaster relief management when infrastructure has to be rebuilt after natural disasters. For example, flooding may affect large parts of a road network, and parts of the network may become available only iteratively and decisions may have to be made without the possibility of planning ahead. We suggest and analyse selection criteria that identify promising nodes (locations) for path extensions. We introduce the concept of tentacles of paths as an indicator for the future extendability. Different initialization and extension heuristics are suggested and compared to an ideal solution that is obtained by an integer linear programming formulation assuming complete knowledge, i.e., assuming that the complete sequence in which nodes become available is known beforehand. All algorithms are tested and evaluated on the original "Turn & Taxis" graph, and on an extended version of the "Turn & Taxis" graph, with different parameter settings. The numerical results confirm that the number of tentacles is a useful criterion when selecting path extensions, leading to near-optimal paths at relatively low computational costs.

9.Entropy bounds for invariant measure perturbations in stochastic systems with uncertain noise

2305.12936

Authors:Igor G. Vladimirov

Abstract: This paper is concerned with stochastic systems whose state is a diffusion process governed by an Ito stochastic differential equation (SDE). In the framework of a nominal white-noise model, the SDE is driven by a standard Wiener process. For a scenario of statistical uncertainty, where the driving noise acquires a state-dependent drift and thus deviates from its idealised model, we consider the perturbation of the invariant probability density function (PDF) as a steady-state solution of the Fokker-Planck-Kolmogorov equation. We discuss an upper bound on a logarithmic Dirichlet form for the ratio of the invariant PDF to its nominal counterpart in terms of the Kullback-Leibler relative entropy rate of the actual noise distribution with respect to the Wiener measure. This bound is shown to be achievable, provided the PDF ratio is preserved by the nominal steady-state probability flux. The logarithmic Dirichlet form bound is used in order to obtain an upper bound on the relative entropy of the perturbed invariant PDF in terms of quadratic-exponential moments of the noise drift in the uniform ellipticity case. These results are illustrated for perturbations of Gaussian invariant measures in linear stochastic systems involving linear noise drifts.

10.Generalized Polyak Step Size for First Order Optimization with Momentum

2305.12939

Authors:Xiaoyu Wang, Mikael Johansson, Tong Zhang

Abstract: In machine learning applications, it is well known that carefully designed learning rate (step size) schedules can significantly improve the convergence of commonly used first-order optimization algorithms. Therefore how to set step size adaptively becomes an important research question. A popular and effective method is the Polyak step size, which sets step size adaptively for gradient descent or stochastic gradient descent without the need to estimate the smoothness parameter of the objective function. However, there has not been a principled way to generalize the Polyak step size for algorithms with momentum accelerations. This paper presents a general framework to set the learning rate adaptively for first-order optimization methods with momentum, motivated by the derivation of Polyak step size. It is shown that the resulting methods are much less sensitive to the choice of momentum parameter and may avoid the oscillation of the heavy-ball method on ill-conditioned problems. These adaptive step sizes are further extended to the stochastic settings, which are attractive choices for stochastic gradient descent with momentum. Our methods are demonstrated to be more effective for stochastic gradient methods than prior adaptive step size algorithms in large-scale machine learning tasks.
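
For readers unfamiliar with the starting point of this abstract, the sketch below shows the classical Polyak step size for plain gradient descent, alpha_k = (f(x_k) - f*) / ||grad f(x_k)||^2. It does not implement the paper's momentum generalization. The ill-conditioned quadratic, the assumption that the optimal value f* is known, and all function names are illustrative choices.

```python
def polyak_gradient_descent(f, grad, x0, f_star, num_iters):
    """Gradient descent with the classical Polyak step size.

    alpha_k = (f(x_k) - f_star) / ||grad f(x_k)||^2, which needs the
    optimal value f_star but no smoothness constant.
    """
    x = list(x0)
    for _ in range(num_iters):
        g = grad(x)
        g_norm_sq = sum(gi * gi for gi in g)
        if g_norm_sq == 0.0:
            break  # stationary point reached, no step is defined
        alpha = (f(x) - f_star) / g_norm_sq
        x = [xi - alpha * gi for xi, gi in zip(x, g)]
    return x


def quad(x):
    # Illustrative quadratic with condition number 10 (an assumption).
    return 0.5 * (x[0] ** 2 + 10.0 * x[1] ** 2)


def quad_grad(x):
    return [x[0], 10.0 * x[1]]


x_final = polyak_gradient_descent(quad, quad_grad, [1.0, 1.0], 0.0, 100)
print(quad(x_final))
```

The step adapts to the current suboptimality gap, so no learning-rate schedule has to be tuned for this example.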

11.Improved Dynamic Regret of Distributed Online Multiple Frank-Wolfe Convex Optimization

2305.12957

Authors:Wentao Zhang, Yang Shi, Baoyong Zhang, Deming Yuan

Abstract: In this paper, we consider a distributed online convex optimization problem over a time-varying multi-agent network. The goal of this network is to minimize a global loss function through local computation and communication with neighbors. To effectively handle the optimization problem with a high-dimensional and complicated constraint set, we develop a distributed online multiple Frank-Wolfe algorithm to avoid the expensive computational cost of projection operation. The dynamic regret bounds are established as $\mathcal{O}(T^{1-\gamma}+H_T)$ with the linear oracle number $\mathcal{O} (T^{1+\gamma})$, which depends on the horizon (total iteration number) $T$, the function variation $H_T$, and the tuning parameter $0<\gamma<1$. In particular, when the stringent computation requirement is satisfied, the bound can be enhanced to $\mathcal{O} (1+H_T)$. Moreover, we illustrate the significant advantages of the multiple iteration technique and reveal a trade-off between computational cost and dynamic regret bound. Finally, the performance of our algorithm is verified and compared through the distributed online ridge regression problems with two constraint sets.
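
The projection-free idea mentioned in this abstract can be sketched with the classical, centralized, offline Frank-Wolfe method, which replaces projection by a linear minimization oracle over the constraint set. This is not the distributed online multiple Frank-Wolfe algorithm of the paper. The probability simplex as constraint set, the step size 2 / (k + 2), and the quadratic objective are assumptions made for illustration.

```python
def frank_wolfe_simplex(grad, x0, num_iters):
    """Classical Frank-Wolfe over the probability simplex.

    The linear minimization oracle over the simplex returns the vertex
    e_i whose gradient coordinate is smallest, so no projection is needed.
    """
    x = list(x0)
    n = len(x)
    for k in range(num_iters):
        g = grad(x)
        i = min(range(n), key=lambda j: g[j])  # oracle: best vertex e_i
        gamma = 2.0 / (k + 2)                  # standard open-loop step
        x = [(1.0 - gamma) * xj for xj in x]   # move toward e_i
        x[i] += gamma
    return x


b = [0.2, 0.3, 0.5]  # illustrative target inside the simplex


def objective(x):
    return sum((xj - bj) ** 2 for xj, bj in zip(x, b))


def objective_grad(x):
    return [2.0 * (xj - bj) for xj, bj in zip(x, b)]


x_fw = frank_wolfe_simplex(objective_grad, [1.0, 0.0, 0.0], 2000)
print(objective(x_fw))
```

Every iterate is a convex combination of simplex vertices, so feasibility is maintained without any projection step.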

12.Sketch-and-Project Meets Newton Method: Global $\mathcal O(k^{-2})$ Convergence with Low-Rank Updates

2305.13082

Authors:Slavomír Hanzely

Abstract: In this paper, we propose the first sketch-and-project Newton method with fast $\mathcal O(k^{-2})$ global convergence rate for self-concordant functions. Our method, SGN, can be viewed in three ways: i) as a sketch-and-project algorithm projecting updates of Newton method, ii) as a cubically regularized Newton method in sketched subspaces, and iii) as a damped Newton method in sketched subspaces. SGN inherits the best of all three worlds: cheap iteration costs of sketch-and-project methods, state-of-the-art $\mathcal O(k^{-2})$ global convergence rate of full-rank Newton-like methods and the algorithm simplicity of damped Newton methods. Finally, we demonstrate its comparable empirical performance to baseline algorithms.

13.The Minimizer of the Sum of Two Strongly Convex Functions

2305.13134

Authors:Kananart Kuwaranancharoen, Shreyas Sundaram

Abstract: The problem of finding the minimizer of a sum of convex functions is central to the field of optimization. In cases where the functions themselves are not fully known (other than their individual minimizers and convexity parameters), it is of interest to understand the region containing the potential minimizers of the sum based only on those known quantities. Characterizing this region in the case of multivariate strongly convex functions is far more complicated than the univariate case. In this paper, we provide both outer and inner approximations for the region containing the minimizer of the sum of two strongly convex functions, subject to a constraint on the norm of the gradient at the minimizer of the sum. In particular, we explicitly characterize the boundary and interior of both outer and inner approximations. Interestingly, the boundaries as well as the interiors turn out to be identical and we show that the boundary of the region containing the potential minimizers is also identical to that of the outer and inner approximations.

14.SignSVRG: fixing SignSGD via variance reduction

2305.13187

Authors:Evgenii Chzhen, Sholom Schechtman

Abstract: We consider the problem of unconstrained minimization of finite sums of functions. We propose a simple yet practical way to incorporate variance reduction techniques into SignSGD, guaranteeing convergence that is similar to the full sign gradient descent. The core idea is first instantiated on the problem of minimizing sums of convex and Lipschitz functions and is then extended to the smooth case via variance reduction. Our analysis is elementary and much simpler than the typical proof for variance reduction methods. We show that for smooth functions our method gives an $\mathcal{O}(1 / \sqrt{T})$ rate for the expected norm of the gradient and an $\mathcal{O}(1/T)$ rate in the case of smooth convex functions, recovering convergence results of deterministic methods, while preserving computational advantages of SignSGD.
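
The deterministic baseline that this abstract refers to, full sign gradient descent, can be sketched as follows: each coordinate moves by a fixed amount in the direction opposite to the sign of the full gradient. This is not SignSVRG and contains no variance reduction. The finite-sum least-squares objective, the step size eta_k = eta0 / (k + 1), and all names are assumptions for illustration.

```python
def sign(t):
    return (t > 0) - (t < 0)


def sign_gradient_descent(grad, x0, num_iters, eta0=1.0):
    """Full sign gradient descent with diminishing steps eta0 / (k + 1).

    Only the sign of each gradient coordinate is used, which is what makes
    sign-based methods cheap to communicate.
    """
    x = list(x0)
    for k in range(num_iters):
        g = grad(x)
        eta = eta0 / (k + 1)
        x = [xi - eta * sign(gi) for xi, gi in zip(x, g)]
    return x


# Illustrative finite sum: f(x) = (1/n) * sum_i 0.5 * ||x - a_i||^2,
# whose minimizer is the mean of the points a_i, here (2, 5).
data = [(1.0, 4.0), (2.0, 5.0), (3.0, 6.0)]


def full_grad(x):
    n = len(data)
    return [sum(x[j] - a[j] for a in data) / n for j in range(len(x))]


x_sign = sign_gradient_descent(full_grad, [0.0, 0.0], 2000)
print(x_sign)
```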

15.Ground truth clustering is not the optimum clustering

2305.13218

Authors:Lucia Absalom Bautista, Timotej Hrga, Janez Povh, Shudian Zhao

Abstract: The clustering of data is one of the most important and challenging topics in data science. The minimum sum-of-squares clustering (MSSC) problem asks to cluster the data points into $k$ clusters such that the sum of squared distances between the data points and their cluster centers (centroids) is minimized. This problem is NP-hard, but there exist exact solvers that can solve such problems to optimality for small or medium size instances. In this paper, we use a branch-and-bound solver based on semidefinite programming relaxations called SOS-SDP to compute the optimum solutions of the MSSC problem for various $k$ and for multiple datasets, with real and artificial data, for which the data provider has provided ground truth clustering. Next, we use several extrinsic and intrinsic measures to evaluate how well the optimum clustering and ground truth clustering match, and how well these clusterings perform with respect to the criteria underlying the intrinsic measures. Our calculations show that the ground truth clusterings are generally far from the optimum solution to the MSSC problem. Moreover, the intrinsic measures evaluated on the ground truth clusterings are generally significantly worse compared to the optimum clusterings. However, when the ground truth clustering is in the form of convex sets, e.g., ellipsoids, that are well separated from each other, the ground truth clustering comes very close to the optimum clustering.
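
To make the MSSC objective in this abstract concrete, the sketch below evaluates the sum of squared distances to cluster centroids for a given labelling, so that two labellings of the same data can be compared. The helper name and the toy one-dimensional data are assumptions for illustration; the exact solver SOS-SDP is not reproduced here.

```python
def mssc_objective(points, labels):
    """Sum of squared distances from each point to its cluster centroid."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    total = 0.0
    for members in clusters.values():
        dim = len(members[0])
        centroid = [sum(p[d] for p in members) / len(members)
                    for d in range(dim)]
        total += sum((p[d] - centroid[d]) ** 2
                     for p in members for d in range(dim))
    return total


# Toy data (an assumption): two well separated pairs on a line.
points = [(0.0,), (1.0,), (10.0,), (11.0,)]
print(mssc_objective(points, [0, 0, 1, 1]))  # prints 1.0
print(mssc_objective(points, [0, 1, 0, 1]))  # prints 100.0
```

A labelling supplied as ground truth can be scored the same way and compared against the value of an optimum clustering.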

Fri, 19 May 2023digest

1.The Barzilai-Borwein Method for Distributed Optimization over Unbalanced Directed Networks

2305.11469

Authors:Jinhui Hu, Xin Chen, Lifeng Zheng, Ling Zhang, Huaqing Li

Abstract: This paper studies optimization problems over multi-agent systems, in which all agents cooperatively minimize a global objective function expressed as a sum of local cost functions. Each agent in the system uses only local computation and communication in the overall process without leaking its private information. Based on the Barzilai-Borwein (BB) method and multi-consensus inner loops, a distributed algorithm with the availability of larger stepsizes and accelerated convergence, namely ADBB, is proposed. Moreover, owing to employing only row-stochastic weight matrices, ADBB can resolve the optimization problems over unbalanced directed networks without requiring the knowledge of neighbors' out-degree for each agent. Via establishing contraction relationships between the consensus error, the optimality gap, and the gradient tracking error, ADBB is theoretically proved to converge linearly to the globally optimal solution. A real-world data set is used in simulations to validate the correctness of the theoretical analysis.
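
The Barzilai-Borwein step size that this abstract builds on can be sketched in its classical, centralized form: the step is alpha = (s^T s) / (s^T y), where s is the difference of consecutive iterates and y the difference of consecutive gradients. This is not the distributed ADBB algorithm. The quadratic test problem, the initial step alpha0, the stopping tolerance, and all names are assumptions for illustration.

```python
import math


def bb_gradient_descent(grad, x0, alpha0=0.05, max_iter=100, tol=1e-10):
    """Gradient descent with the (long) Barzilai-Borwein step size."""
    x = list(x0)
    g = grad(x)
    alpha = alpha0
    for _ in range(max_iter):
        if math.sqrt(sum(gi * gi for gi in g)) < tol:
            break  # gradient is numerically zero
        x_new = [xi - alpha * gi for xi, gi in zip(x, g)]
        g_new = grad(x_new)
        s = [a - b for a, b in zip(x_new, x)]  # iterate difference
        y = [a - b for a, b in zip(g_new, g)]  # gradient difference
        x, g = x_new, g_new
        sy = sum(si * yi for si, yi in zip(s, y))
        if sy <= 0.0:
            break  # BB step undefined, stop rather than guess a step
        alpha = sum(si * si for si in s) / sy
    return x


def quad_gradient(x):
    # Gradient of the illustrative quadratic 0.5 * (x0^2 + 10 * x1^2).
    return [x[0], 10.0 * x[1]]


x_bb = bb_gradient_descent(quad_gradient, [1.0, 1.0])
print(x_bb)
```

The BB step uses only quantities already computed by gradient descent, which is why it is attractive when each agent has access to local gradients only.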

2.Accelerating Convergence in Global Non-Convex Optimization with Reversible Diffusion

2305.11493

Authors:Ryo Fujino

Abstract: Langevin Dynamics has been extensively employed in global non-convex optimization due to the concentration of its stationary distribution around the global minimum of the potential function at low temperatures. In this paper, we propose to utilize a more comprehensive class of stochastic processes, known as reversible diffusion, and apply the Euler-Maruyama discretization for global non-convex optimization. We design the diffusion coefficient to be larger when distant from the optimum and smaller when near, thus enabling accelerated convergence while regulating discretization error, a strategy inspired by landscape modifications. Our proposed method can also be seen as a time change of Langevin Dynamics, and we prove convergence with respect to KL divergence, investigating the trade-off between convergence speed and discretization error. The efficacy of our proposed method is demonstrated through numerical experiments.

3.Dynamic Routing for the Electric Vehicle Shortest Path Problem with Charging Station Occupancy Information

2305.11773

Authors:Mohsen Dastpak, Fausto Errico, Ola Jabali, Federico Malucelli

Abstract: We study EVs traveling from origin to destination in the shortest time, focusing on long-distance settings with energy requirements exceeding EV autonomy. The EV may charge its battery at public Charging Stations (CSs), which are subject to uncertain waiting times. We model CSs using appropriately defined queues, whose status is revealed upon the EV arrival. However, we consider the availability of real-time binary Occupancy Indicator (OI) information, signaling if a CS is busy or not. At each OI update, we determine the sequence of CSs to visit along with associated charging quantities. We name the resulting problem the Electric Vehicle Shortest Path Problem with charging station Occupancy Indicator information (EVSPP-OI). In this problem, we consider that the EV is allowed to partially charge its battery, and we model charging times via piecewise linear charging functions that depend on the CS technology. We propose an MDP formulation for the EVSPP-OI and develop a reoptimization algorithm that establishes the sequence of CS visits and charging amounts based on system updates. Specifically, we propose a simulation-based approach to estimate the waiting time of the EV at a CS as a function of its arrival time. As the path to a CS may consist of multiple intermediate CS stops, estimating the arrival times at each CS is fairly intricate. To this end, we propose an efficient heuristic that yields approximate lower bounds on the arrival time of the EV at each CS. We use these estimations to define a deterministic EVSPP, which we solve with an existing algorithm. We conduct a comprehensive computational study and compare the performance of our methodology with a benchmark that observes the status of CSs only upon arrival. Results show that our method reduces waiting times and total trip duration by an average of 23.7%-95.4% and 1.4%-18.5%, respectively.

4.Multi-Objective Optimization Using the R2 Utility

2305.11774

Authors:Ben Tu, Nikolas Kantas, Robert M. Lee, Behrang Shafei

Abstract: The goal of multi-objective optimization is to identify a collection of points which describe the best possible trade-offs between the multiple objectives. In order to solve this vector-valued optimization problem, practitioners often appeal to the use of scalarization functions in order to transform the multi-objective problem into a collection of single-objective problems. This set of scalarized problems can then be solved using traditional single-objective optimization techniques. In this work, we formalise this convention into a general mathematical framework. We show how this strategy effectively recasts the original multi-objective optimization problem into a single-objective optimization problem defined over sets. An appropriate class of objective functions for this new problem is the R2 utility function, which is defined as a weighted integral over the scalarized optimization problems. We show that this utility function is a monotone and submodular set function, which can be optimised effectively using greedy optimization algorithms. We analyse the performance of these greedy algorithms both theoretically and empirically. Our analysis largely focusses on Bayesian optimization, which is a popular probabilistic framework for black-box optimization.

5.Small-time global approximate controllability of bilinear wave equations

2305.11794

Authors:Eugenio Pozzoli

Abstract: We consider a bilinear control problem for the wave equation on a torus of arbitrary dimension. We show that the system is globally approximately controllable in arbitrarily small times from a dense family of initial states. The control strategy is explicit, and based on a small-time limit of conjugated dynamics to move along non-directly accessible directions (a.k.a. Lie brackets of the generators).

6.A Theory of First Order Mean Field Type Control Problems and their Equations

2305.11848

Authors:Alain Bensoussan, Tak Kwong Wong, Sheung Chi Phillip Yam, Hongwei Yuan

Abstract: In this article, by using several new crucial {\it a priori} estimates which are still absent in the literature, we provide a comprehensive resolution of the first order generic mean field type control problems and also establish the global-in-time classical solutions of their Bellman and master equations. Rather than developing the analytical approach via tackling the Bellman and master equation directly, we apply the maximum principle approach by considering the induced forward-backward ordinary differential equation (FBODE) system; indeed, we first show the local-in-time unique existence of the solution of the FBODE system for a variety of terminal data by Banach fixed point argument, and then provide crucial a priori estimates of bounding the sensitivity of the terminal data for the backward equation by utilizing a monotonicity condition that can be deduced from the positive definiteness of the Schur complement of the Hessian matrix of the Lagrangian in the lifted version and manipulating first order condition appropriately; this uniform bound over the whole planning horizon $[0, T]$ allows us to partition $[0, T]$ into a number of sub-intervals with a common small length and then glue the consecutive local-in-time solutions together to form the unique global-in-time solution of the FBODE system. The regularity of the global-in-time solution follows from that of the local ones due to the regularity assumptions on the coefficient functions. Moreover, the regularity of the value function will also be shown with the aid of the regularity of the solution couple of the FBODE system and the regularity assumptions on the coefficient functions, with which we can further deduce that this value function and its linear functional derivative satisfy the Bellman and master equations, respectively.

Thu, 18 May 2023digest

1.A New Perspective of Accelerated Gradient Methods: The Controlled Invariant Manifold Approach

2305.10756

Authors:Revati Gunjal, Sushama Wagh, Syed Shadab Nayyer, Alex Stankovic, Navdeep M. Singh

Abstract: Gradient Descent (GD) is a ubiquitous algorithm for finding the optimal solution to an optimization problem. For reduced computational complexity, the optimal solution $\mathrm{x^*}$ of the optimization problem must be attained in a minimum number of iterations. For this objective, the paper proposes a genesis of an accelerated gradient algorithm through the controlled dynamical system perspective. The objective of optimally reaching the optimal solution $\mathrm{x^*}$ where $\mathrm{\nabla f(x^*)=0}$ with a given initial condition $\mathrm{x(0)}$ is achieved through control.

2.On the Geometric Convergence of Byzantine-Resilient Distributed Optimization Algorithms

2305.10810

Authors:Kananart Kuwaranancharoen, Shreyas Sundaram

Abstract: The problem of designing distributed optimization algorithms that are resilient to Byzantine adversaries has received significant attention. For the Byzantine-resilient distributed optimization problem, the goal is to (approximately) minimize the average of the local cost functions held by the regular (non adversarial) agents in the network. In this paper, we provide a general algorithmic framework for Byzantine-resilient distributed optimization which includes some state-of-the-art algorithms as special cases. We analyze the convergence of algorithms within the framework, and derive a geometric rate of convergence of all regular agents to a ball around the optimal solution (whose size we characterize). Furthermore, we show that approximate consensus can be achieved geometrically fast under some minimal conditions. Our analysis provides insights into the relationship among the convergence region, distance between regular agents' values, step-size, and properties of the agents' functions for Byzantine-resilient distributed optimization.

3.Optimization Modeling for Pandemic Vaccine Supply Chain Management: A Review

2305.10942

Authors:Shibshankar Dey, Ali Kaan Kurbanzade, Esma S. Gel, Joseph Mihaljevic, Sanjay Mehrotra

Abstract: During various stages of the COVID-19 pandemic, countries implemented diverse vaccine management approaches, influenced by variations in infrastructure and socio-economic conditions. This article provides a comprehensive overview of optimization models developed by the research community throughout the COVID-19 era, aimed at enhancing vaccine distribution and establishing a standardized framework for future pandemic preparedness. These models address critical issues such as site selection, inventory management, allocation strategies, distribution logistics, and route optimization encountered during the COVID-19 crisis. A unified framework is employed to describe the models, emphasizing their integration with epidemiological models to facilitate a holistic understanding.

4.Maximal workload, minimal workload, maximal workload difference: optimizing all criteria at once

2305.11036

Authors:Sébastien Dechamps, Frédéric Meunier

Abstract: In a simple model of assigning workers to tasks, every solution that minimizes the load difference between the most loaded worker and the least loaded one actually minimizes the maximal load and maximizes the minimal load. This can be seen as a consequence of standard results of optimization over polymatroids. We show that similar phenomena still occur in close models, simple to state, and that do not enjoy any polymatroid structure.

Wed, 17 May 2023digest

1.Solving the problem of batch deletion and insertion members in the Logical Key Hierarchy structure by a DC Programming approach

2305.10131

Authors:Hoai An Le Thi, Thi Tuyet Trinh Nguyen

Abstract: In secure group communications, users of a group share a common group key to prevent eavesdropping and protect the exchanged content. A key server distributes the group key and performs group rekeying whenever the membership changes dynamically. Instead of rekeying after each join or leave request, we use batch rekeying to alleviate the out-of-sync problem and improve the efficiency. In this paper, we propose an optimization approach to the problem of updating the group key in the Logical Key Hierarchy (LKH) structure with batch rekeying. A subtree of new nodes can be appended below a leaf node or can replace the position of a leaving node on the binary key tree. The latter has a lower key updating cost than the former since when a member leaves, all the keys on the path from the root to the deletion node must be updated anyway. We aim to minimize the total rekeying cost, which is the cost of deleting and inserting members, while keeping the tree as balanced as possible. The mentioned problem is represented by a unified (deterministic) optimization model whose objective function contains discontinuous step functions with binary variables. Thanks to an exact penalty technique, the problem is equivalently reformulated as a standard DC (Difference of Convex functions) program that can be solved efficiently by DCA (DC algorithm). Numerical experiments have been carried out extensively to justify the merit of our proposed approach as well as the corresponding DCA.

2.Algorithms for Boolean Matrix Factorization using Integer Programming

2305.10185

Authors:Christos Kolomvakis, Arnaud Vandaele, Nicolas Gillis

Abstract: Boolean matrix factorization (BMF) approximates a given binary input matrix as the product of two smaller binary factors. As opposed to binary matrix factorization which uses standard arithmetic, BMF uses the Boolean OR and Boolean AND operations to perform matrix products, which leads to lower reconstruction errors. BMF is an NP-hard problem. In this paper, we first propose an alternating optimization (AO) strategy that solves the subproblem in one factor matrix in BMF using an integer program (IP). We also provide two ways to initialize the factors within AO. Then, we show how several solutions of BMF can be combined optimally using another IP. This allows us to come up with a new algorithm: it generates several solutions using AO and then combines them in an optimal way. Experiments show that our algorithms (available on gitlab) outperform the state of the art on medium-scale problems.
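The difference between the Boolean product and the standard product mentioned in this abstract can be seen in a minimal NumPy sketch. The factor matrices and the helper name `boolean_product` are illustrative assumptions; they are not taken from the paper or its GitLab code.

```python
import numpy as np

def boolean_product(W, H):
    """Boolean matrix product: entry (i, j) is OR over k of (W[i, k] AND H[k, j])."""
    # On 0/1 data, an integer product followed by a "> 0" test equals OR-of-ANDs.
    return (W.astype(int) @ H.astype(int) > 0).astype(int)

# Hypothetical binary factors with inner dimension 2.
W = np.array([[1, 0],
              [1, 1],
              [0, 1]])
H = np.array([[1, 1, 0],
              [0, 1, 1]])

X_bool = boolean_product(W, H)  # entries stay in {0, 1}
X_std = W @ H                   # standard arithmetic can produce entries above 1
```

Here the middle entry of `X_std` is 2, while the same entry of `X_bool` is 1, so only the Boolean product can reproduce a binary matrix whose rank-one blocks overlap.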

3.Computing Optimal Strategies for a Search Game in Discrete Locations

2305.10342

Authors:Jake Clarkson, Kyle Y Lin

Abstract: Consider a two-person zero-sum search game between a hider and a searcher. The hider hides among $n$ discrete locations, and the searcher successively visits individual locations until finding the hider. Known to both players, a search at location $i$ takes $t_i$ time units and detects the hider -- if hidden there -- independently with probability $\alpha_i$, for $i=1,\ldots,n$. The hider aims to maximize the expected time until detection, while the searcher aims to minimize it. We present an algorithm to compute an optimal strategy for each player. We demonstrate the algorithm's efficiency in a numerical study, in which we also study the characteristics of the optimal hiding strategy.

4.Cost-Aware Bound Tightening for Constraint Screening in AC OPF

2305.10385

Authors:Mohamed Awadalla, François Bouffard

Abstract: The objective of electric power system operators is to determine cost-effective operating points by resolving optimization problems that include physical and engineering constraints. As empirical evidence and operator experience indicate, only a small portion of these constraints are found to be binding during operations. Several optimization-based methods have been developed to screen out redundant constraints in operational planning problems like the optimal power flow (OPF) problem. These elimination procedures primarily focus on the feasible region and ignore the role played by the problem's objective function. This letter addresses the constraint screening problem using the bound tightening technique in the context of the OPF problem formulated with a full ac power flow characterization. Due to the non-convexity of the ac OPF, we investigate line constraint screening under different convex relaxations of the problem, and we evaluate how the economics of the objective function impacts screening outcomes.

Tue, 16 May 2023digest

1.Push-LSVRG-UP: Distributed Stochastic Optimization over Unbalanced Directed Networks with Uncoordinated Triggered Probabilities

2305.09181

Authors:Jinhui Hu, Guo Chen, Huaqing Li, Zixiang Shen, Weidong Zhang

Abstract: Distributed stochastic optimization, arising in the crossing and integration of traditional stochastic optimization, distributed computing and storage, and network science, has advantages of high efficiency and a low per-iteration computational complexity in solving large-scale optimization problems. This paper concentrates on solving a large-scale convex finite-sum optimization problem in a multi-agent system over unbalanced directed networks. To tackle this problem in an efficient way, a distributed consensus optimization algorithm, adopting the push-sum technique and a distributed loopless stochastic variance-reduced gradient (LSVRG) method with uncoordinated triggered probabilities, is developed and named Push-LSVRG-UP. Each agent under this algorithmic framework performs only local computation and communicates only with its neighbors without leaking its private information. The convergence analysis of Push-LSVRG-UP relies on analyzing the contraction relationships between four error terms associated with the multi-agent system. Theoretical results provide an explicit feasible range of the constant step-size, a linear convergence rate, and an iteration complexity of Push-LSVRG-UP when achieving the globally optimal solution. It is shown that Push-LSVRG-UP achieves the superior characteristics of accelerated linear convergence, lower storage costs, and a lower per-iteration computational complexity than most existing works. Meanwhile, the introduction of an uncoordinated probabilistic triggered mechanism allows Push-LSVRG-UP to facilitate the independence and flexibility of agents in computing local batch gradients. In simulations, the practicability and improved performance of Push-LSVRG-UP are demonstrated by solving two distributed learning problems based on real-world datasets.
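The push-sum technique named in this abstract can be illustrated in isolation. The sketch below is plain push-sum averaging over a small unbalanced directed graph, not the Push-LSVRG-UP algorithm itself; the graph, the initial values, and the function name `push_sum_average` are assumptions made for illustration.

```python
import numpy as np

def push_sum_average(values, out_neighbors, iters=200):
    """Estimate the network-wide average with push-sum on a directed graph.

    out_neighbors[j] lists the nodes that node j sends to (self-loop added here).
    """
    n = len(values)
    A = np.zeros((n, n))
    for j, nbrs in enumerate(out_neighbors):
        targets = [j] + list(nbrs)
        # Node j splits its mass equally, so every column of A sums to one.
        A[targets, j] = 1.0 / len(targets)
    x = np.asarray(values, dtype=float)  # running numerators
    w = np.ones(n)                       # running weights correcting the imbalance
    for _ in range(iters):
        x = A @ x
        w = A @ w
    return x / w                         # each ratio approaches the true average

# Hypothetical strongly connected, unbalanced digraph on three nodes.
estimates = push_sum_average([1.0, 5.0, 9.0], [[1, 2], [2], [0]])
```

The mixing matrix is only column-stochastic, so `x` alone would converge to a skewed consensus; dividing by the weights `w` removes that skew, which is why push-sum suits unbalanced directed networks.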

2.Optimal Control of McKean-Vlasov equations with controlled stochasticity

2305.09379

Authors:Luca Di Persio, Peter Kuchling

Abstract: In this article, we analyse the existence of an optimal feedback controller of stochastic optimal control problems governed by SDEs which have the control in the diffusion part. To this end, we consider the underlying Fokker-Planck equation to transform the stochastic optimal control problem into a deterministic problem with open-loop controller.

3.Local well-posedness of the Mortensen observer

2305.09382

Authors:Tobias Breiten, Jesper Schröder

Abstract: The analytical background of nonlinear observers based on minimal energy estimation is discussed. It is shown that locally the derivation of the observer equation based on a trajectory with pointwise minimal energy can be done rigorously. The result is obtained by a local sensitivity analysis of the value function based on Pontryagin's maximum principle and the Hamilton-Jacobi-Bellman equation. The consideration of a differential Riccati equation reveals that locally the second derivative of the value function is a positive definite matrix. The local convexity ensures existence of a trajectory minimizing the energy, which is then shown to satisfy the observer equation.

4.Optimizing over trained GNNs via symmetry breaking

2305.09420

Authors:Shiqiang Zhang, Juan S Campos Salazar, Christian Feldmann, David Walz, Frederik Sandfort, Miriam Mathea, Calvin Tsay, Ruth Misener

Abstract: Optimization over trained machine learning models has applications including: verification, minimizing neural acquisition functions, and integrating a trained surrogate into a larger decision-making problem. This paper formulates and solves optimization problems constrained by trained graph neural networks (GNNs). To circumvent the symmetry issue caused by graph isomorphism, we propose two types of symmetry-breaking constraints: one indexing a node 0 and one indexing the remaining nodes by lexicographically ordering their neighbor sets. To guarantee that adding these constraints will not remove all symmetric solutions, we construct a graph indexing algorithm and prove that the resulting graph indexing satisfies the proposed symmetry-breaking constraints. For the classical GNN architectures considered in this paper, optimizing over a GNN with a fixed graph is equivalent to optimizing over a dense neural network. Thus, we study the case where the input graph is not fixed, implying that each edge is a decision variable, and develop two mixed-integer optimization formulations. To test our symmetry-breaking strategies and optimization formulations, we consider an application in molecular design.

5.A BSDE approach to the asymmetric risk-sensitive optimization and its applications

2305.09430

Authors:Mingshang Hu, Shaolin Ji, Rundong Xu, Xiaole Xue

Abstract: In this paper, we propose a formulation to describe a risk-sensitive criterion involving asymmetric risk attitudes toward different risk sources. The introduced criterion can only be defined through quadratic backward stochastic differential equations (BSDE). Before uncovering the mean-variance representation for the introduced asymmetric risk-sensitive criterion by variational approach, some axioms to characterize a variance decomposition of square integrable random variables are provided for the first time. The control problems under the asymmetric risk-sensitive criterion are characterized as a kind of stochastic recursive control problem that includes quadratic BSDEs. Under bounded and unbounded (linear quadratic case) conditions, the stochastic recursive control problems are investigated.

Mon, 15 May 2023digest

1.Optimal harvesting policy for biological resources with uncertain heterogeneity for application in fisheries management

2305.08361

Authors:Hidekazu Yoshioka

Abstract: Conventional harvesting problems for natural resources often assume physiological homogeneity of the body length/weight among individuals. However, such assumptions generally are not valid in real-world problems, where heterogeneity plays an essential role in the planning of biological resource harvesting. Furthermore, it is difficult to observe heterogeneity directly from the available data. This paper presents a novel optimal control framework for the cost-efficient harvesting of biological resources for application in fisheries management. The heterogeneity is incorporated into the resource dynamics, which is the population dynamics in this case, through a probability density that can be distorted from the reality. Subsequently, the distortion, which is the model uncertainty, is penalized through a divergence, leading to a non-standard dynamic differential game wherein the Hamilton-Jacobi-Bellman-Isaacs (HJBI) equation has a unique nonlinear partial differential term. Here, the existence and uniqueness results of the HJBI equation are presented along with an explicit monotone finite difference method. Finally, the proposed optimal control is applied to a harvesting problem with recreationally, economically, and ecologically important fish species using collected field data.

2.On the Optimal Rate for the Convergence Problem in Mean Field Control

2305.08423

Authors:Samuel Daudin, François Delarue, Joe Jackson

Abstract: The goal of this work is to obtain optimal rates for the convergence problem in mean field control. Our analysis covers cases where the solutions to the limiting problem may be neither unique nor stable. Equivalently, the value function of the limiting problem might not be differentiable on the entire space. Our main result is then to derive sharp rates of convergence in two distinct regimes. When the data is sufficiently regular, we obtain rates proportional to $N^{-1/2}$, with $N$ being the number of particles. When the data is merely Lipschitz and semi-concave with respect to the first Wasserstein distance, we obtain rates proportional to $N^{-2/(3d+6)}$. Noticeably, the exponent $2/(3d+6)$ is close to $1/d$, which is the optimal rate of convergence for uncontrolled particle systems driven by data with a similar regularity. The key argument in our approach consists in mollifying the value function of the limiting problem in order to produce functions that are almost classical sub-solutions to the limiting Hamilton-Jacobi equation (which is a PDE set on the space of probability measures). These sub-solutions can be projected onto finite dimensional spaces and then compared with the value functions associated with the particle systems. In the end, this comparison is used to prove the most demanding bound in the estimates. The key challenge therein is thus to exhibit an appropriate form of mollification. We do so by employing sup-convolution within a convenient functional Hilbert space. To simplify the analysis, we limit ourselves to the periodic setting. We also provide some examples to show that our results are sharp to some extent.

3.Fuzzy multiplier, sum and intersection rules in non-Lipschitzian settings: decoupling approach revisited

2305.08484

Authors:Marián Fabian, Alexander Y. Kruger, Patrick Mehlitz

Abstract: We revisit the decoupling approach widely used (often intuitively) in nonlinear analysis and optimization and initially formalized about a quarter of a century ago by Borwein & Zhu, Borwein & Ioffe and Lassonde. It allows one to streamline proofs of necessary optimality conditions and calculus relations, unify and simplify the respective statements, clarify and in many cases weaken the assumptions. In this paper we study weaker concepts of quasiuniform infimum, quasiuniform lower semicontinuity and quasiuniform minimum, putting them into the context of the general theory developed by the aforementioned authors. On the way, we unify the terminology and notation and fill in some gaps in the general theory. We establish rather general primal and dual necessary conditions characterizing quasiuniform $\varepsilon$-minima of the sum of two functions. The obtained fuzzy multiplier rules are formulated in general Banach spaces in terms of Clarke subdifferentials and in Asplund spaces in terms of Fréchet subdifferentials. The mentioned fuzzy multiplier rules naturally lead to certain fuzzy subdifferential calculus results. An application from sparse optimal control illustrates applicability of the obtained findings.

4.Minimal realizations of input-output behaviors by LPV state-space representations with affine dependency

2305.08508

Authors:Mihály Petreczky, Roland Tóth, Guillaume Mercère

Abstract: The paper makes the first steps towards a behavioral theory of LPV state-space representations with an affine dependency on scheduling, by characterizing minimality of such state-space representations. It is shown that minimality is equivalent to observability, and that minimal realizations of the same behavior are isomorphic. Finally, we establish a formal relationship between minimality of LPV state-space representations with an affine dependence on scheduling and minimality of LPV state-space representations with a dynamic and meromorphic dependence on scheduling.

5.A Note on the KKT Points for the Motzkin-Straus Program

2305.08519

Authors:G. Beretta (Ca' Foscari University of Venice; Polytechnic University of Turin), A. Torcinovich (ETH), M. Pelillo (Ca' Foscari University of Venice)

Abstract: In a seminal 1965 paper, Motzkin and Straus established an elegant connection between the clique number of a graph and the global maxima of a quadratic program defined on the standard simplex. Since then, the result has been the subject of intensive research and has served as the motivation for a number of heuristics and bounds for the maximum clique problem. Most of the studies available in the literature, however, focus typically on the local/global solutions of the program, and little or no attention has been devoted so far to the study of its Karush-Kuhn-Tucker (KKT) points. In contrast, in this paper we study the properties of (a parameterized version of) the Motzkin-Straus program and show that its KKT points can provide interesting structural information and are in fact associated with certain regular sub-structures of the underlying graph.
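The Motzkin-Straus connection can be checked numerically on a small example. The sketch below assumes the unparameterized objective $x^\top A x$ over the standard simplex, with $A$ the adjacency matrix, whose maximum equals $1 - 1/\omega(G)$ for clique number $\omega(G)$; the graph and the helper names are hypothetical, and the parameterized program studied in the paper is not reproduced here.

```python
import numpy as np

def motzkin_straus_value(A, x):
    """Objective x^T A x for adjacency matrix A and a point x of the simplex."""
    return float(x @ A @ x)

def clique_barycenter(n, clique):
    """Uniform distribution supported on the vertices of the given clique."""
    x = np.zeros(n)
    x[list(clique)] = 1.0 / len(clique)
    return x

# Hypothetical graph: triangle {0, 1, 2} plus a pendant vertex 3 attached to 2.
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

x = clique_barycenter(4, [0, 1, 2])
value = motzkin_straus_value(A, x)  # equals 1 - 1/3 for a clique of size 3
```

Since the largest clique of this graph has three vertices, no other point of the simplex can exceed this value.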

6.Delay-agnostic Asynchronous Coordinate Update Algorithm

2305.08535

Authors:Xuyang Wu, Changxin Liu, Sindri Magnusson, Mikael Johansson

Abstract: We propose a delay-agnostic asynchronous coordinate update algorithm (DEGAS) for computing operator fixed points, with applications to asynchronous optimization. DEGAS includes novel asynchronous variants of ADMM and block-coordinate descent as special cases. We prove that DEGAS converges under both bounded and unbounded delays under delay-free parameter conditions. We also validate by theory and experiments that DEGAS adapts well to the actual delays. The effectiveness of DEGAS is demonstrated by numerical experiments on classification problems.

7.A Dynamical Systems Perspective on Discrete Optimization

2305.08536

Authors:Tong Guanchun, Michael Muehlebach

Abstract: We discuss a dynamical systems perspective on discrete optimization. Departing from the fact that many combinatorial optimization problems can be reformulated as finding low energy spin configurations in corresponding Ising models, we derive a penalized rank-two relaxation of the Ising formulation. It turns out that the associated gradient flow dynamics exactly correspond to a type of hardware solvers termed oscillator-based Ising machines. We also analyze the advantage of adding angle penalties by leveraging random rounding techniques. Therefore, our work contributes to a rigorous understanding of oscillator-based Ising machines by drawing connections to the penalty method in constrained optimization and providing a rationale for the introduction of sub-harmonic injection locking. Furthermore, we characterize a class of coupling functions between oscillators, which ensures convergence to discrete solutions. This class of coupling functions avoids explicit penalty terms or rounding schemes, which are prevalent in other formulations.
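As a concrete instance of the Ising reformulation mentioned in this abstract, max-cut on a weighted graph is equivalent to minimizing the energy $\sum_{i<j} W_{ij} s_i s_j$ over spins $s_i \in \{-1, +1\}$. The brute-force sketch below only verifies that equivalence on a tiny, hypothetical graph; it does not implement the rank-two relaxation or the oscillator dynamics from the paper.

```python
import itertools
import numpy as np

def ising_energy(W, s):
    """Energy sum_{i<j} W_ij s_i s_j for symmetric W with zero diagonal."""
    return 0.5 * float(s @ W @ s)

def cut_value(W, s):
    """Total weight of edges whose endpoints carry opposite spins."""
    return 0.25 * float(np.sum(W * (1.0 - np.outer(s, s))))

# Hypothetical example: unweighted 4-cycle 0-1-2-3-0.
W = np.zeros((4, 4))
for i, j in [(0, 1), (1, 2), (2, 3), (3, 0)]:
    W[i, j] = W[j, i] = 1.0

# Enumerate all spin configurations and keep one of lowest energy.
configs = [np.array(c, dtype=float) for c in itertools.product([-1, 1], repeat=4)]
best = min(configs, key=lambda s: ising_energy(W, s))
total_weight = 0.5 * W.sum()
```

For every configuration, `cut_value` equals `0.5 * (total_weight - ising_energy)`, so a lowest-energy spin assignment is also a maximum cut.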

8.Uniqueness of optimal plans for multi-marginal mass transport problems via a reduction argument

2305.08650

Authors:Mohammad Ali Ahmadpoor, Abbas Momeni

Abstract: For a family of probability spaces $\{(X_k,\mathcal{B}_{X_k},\mu_k)\}_{k=1}^N$ and a cost function $c: X_1\times\cdots\times X_N\to \mathbb{R}$ we consider the Monge-Kantorovich problem \begin{align}\label{MONKANT} \inf_{\lambda\in\Pi(\mu_1,\ldots,\mu_N)}\int_{\prod_{k=1}^N X_k}c\,d\lambda. \end{align} Then for each ordered subset $\mathcal{P}=\{i_1,\ldots,i_p\}\subsetneq\{1,\ldots,N\}$ with $p\geq 2$ we create a new cost function $c_\mathcal{P}$ corresponding to the original cost function $c$ defined on $\prod_{k=1}^p X_{i_k}$. This new cost function $c_\mathcal{P}$ enjoys many of the features of the original cost $c$ while it has the property that any optimal plan $\lambda$ of \eqref{MONKANT} restricted to $\prod_{k=1}^p X_{i_k}$ is also an optimal plan to the problem \begin{align}\label{REDMONKANT} \inf_{\tau\in\Pi(\mu_{i_1},\ldots,\mu_{i_p})}\int_{\prod_{k=1}^p X_{i_k}}c_{\mathcal{P}}\,d\tau. \end{align} Our main contribution in this paper is to show that, for appropriate choices of index set $\mathcal{P}$, one can recover the optimal plans of \eqref{MONKANT} from \eqref{REDMONKANT}. In particular, we study situations in which the problem \eqref{MONKANT} admits a unique solution depending on the uniqueness of the solution for the lower marginal problems of the form \eqref{REDMONKANT}. This allows us to prove many uniqueness results for multi-marginal problems when the unique optimal plan is not necessarily induced by a map. To this end, we extensively benefit from disintegration theorems and the $c$-extremality notions. Moreover, by employing this argument, besides recovering many standard results on the subject including the pioneering work of Gangbo-Święch, several new applications will be demonstrated to evince the applicability of this argument.

9.On the connections between optimization algorithms, Lyapunov functions, and differential equations: theory and insights

2305.08658

Authors:Paul Dobson, Jesus Maria Sanz-Serna, Konstantinos Zygalakis

Abstract: We study connections between differential equations and optimization algorithms for $m$-strongly convex and $L$-smooth functions through the use of Lyapunov functions by generalizing the Linear Matrix Inequality framework developed by Fazlyab et al. in 2018. Using the new framework we derive analytically a new (discrete) Lyapunov function for a two-parameter family of Nesterov optimization methods and characterize their convergence rate. This allows us to prove a convergence rate that improves substantially on the previously proven rate of Nesterov's method for the standard choice of coefficients, as well as to characterize the choice of coefficients that yields the optimal rate. We obtain a new Lyapunov function for the Polyak ODE and revisit the connection between this ODE and Nesterov's algorithms. In addition, we discuss a new interpretation of Nesterov's method as an additive Runge-Kutta discretization and explain the structural conditions that discretizations of the Polyak equation should satisfy in order to lead to accelerated optimization algorithms.

10.Model Predictive Control with Reach-avoid Analysis

2305.08712

Authors:Dejin Ren, Wanli Lu, Jidong Lv, Lijun Zhang, Bai Xue

Abstract: In this paper we investigate the optimal controller synthesis problem, so that the system under the controller can reach a specified target set while satisfying given constraints. Existing model predictive control (MPC) methods learn from a set of discrete states visited by previous (sub-)optimized trajectories and thus result in computationally expensive mixed-integer nonlinear optimization. In this paper a novel MPC method is proposed based on reach-avoid analysis to solve the controller synthesis problem iteratively. The reach-avoid analysis is concerned with computing a reach-avoid set which is a set of initial states such that the system can reach the target set successfully. It not only provides terminal constraints, which ensure feasibility of MPC, but also expands discrete states in existing methods into a continuous set (i.e., reach-avoid sets) and thus leads to nonlinear optimization which is more computationally tractable online due to the absence of integer variables. Finally, we evaluate the proposed method and make comparisons with state-of-the-art ones based on several examples.

11.The Non-Strict Projection Lemma

2305.08735

Authors:T. J. Meijer, T. Holicki, S. J. A. M. van den Eijnden, C. W. Scherer, W. P. M. H. Heemels

Abstract: The projection lemma (often also referred to as the elimination lemma) is one of the most powerful and useful tools in the context of linear matrix inequalities for system analysis and control. In its traditional formulation, the projection lemma only applies to strict inequalities; however, in many applications we naturally encounter non-strict inequalities. As such, we present, in this note, a non-strict projection lemma that generalizes both its original strict formulation and an earlier non-strict version. We demonstrate several applications of our result in robust linear-matrix-inequality-based marginal stability analysis and stabilization, a matrix S-lemma, which is useful in (direct) data-driven control applications, and matrix dilation.

12.A Multilevel Low-Rank Newton Method with Super-linear Convergence Rate and its Application to Non-convex Problems

2305.08742

Authors:Nick Tsipinakis, Panagiotis Tigkas, Panos Parpas

Abstract: Second-order methods can address the shortcomings of first-order methods for the optimization of large-scale machine learning models. However, second-order methods have significantly higher computational costs associated with the computation of second-order information. Subspace methods that are based on randomization have addressed some of these computational costs as they compute search directions in lower dimensions. Even though super-linear convergence rates have been empirically observed, it has not been possible to rigorously show that these variants of second-order methods can indeed achieve such fast rates. Also, it is not clear whether subspace methods can be applied to non-convex cases. To address these shortcomings, we develop a link between multigrid optimization methods and low-rank Newton methods that enables us to prove the super-linear rates of stochastic low-rank Newton methods rigorously. Our method does not require any computations in the original model dimension. We further propose a truncated version of the method that is capable of solving high-dimensional non-convex problems. Preliminary numerical experiments show that our method has a better escape rate from saddle points compared to accelerated gradient descent and Adam and thus returns lower training errors.

13.Near-optimal control of nonlinear systems with hybrid inputs and dwell-time constraints

2305.08760

Authors:Ioana Lal, Constantin Morarescu, Jamal Daafouz, Lucian Busoniu

Abstract: We propose two new optimistic planning algorithms for nonlinear hybrid-input systems, in which the input has both a continuous and a discrete component, and the discrete component must respect a dwell-time constraint. Both algorithms select sets of input sequences for refinement at each step, along with a continuous or discrete step to refine (split). The dwell-time constraint means that the discrete splits must keep the discrete mode constant if the required dwell-time is not yet reached. Convergence rate guarantees are provided for both algorithms, which show the dependency between the near-optimality of the sequence returned and the computational budget. The rates depend on a novel complexity measure of the dwell-time constrained problem. We present simulation results for two problems, an adaptive-quantization networked control system and a model for the COVID pandemic.

14.Learning on Manifolds: Universal Approximations Properties using Geometric Controllability Conditions for Neural ODEs

2305.08849

Authors:Karthik Elamvazhuthi, Xuechen Zhang, Samet Oymak, Fabio Pasqualetti

Abstract: In numerous robotics and mechanical engineering applications, among others, data is often constrained to smooth manifolds due to the presence of rotational degrees of freedom. Common data-driven and learning-based methods such as neural ordinary differential equations (ODEs), however, typically fail to satisfy these manifold constraints and perform poorly for these applications. To address this shortcoming, in this paper we study a class of neural ordinary differential equations that, by design, leave a given manifold invariant, and characterize their properties by leveraging the controllability properties of control affine systems. In particular, using a result due to Agrachev and Caponigro on approximating diffeomorphisms with flows of feedback control systems, we show that any map that can be represented as the flow of a manifold-constrained dynamical system can also be approximated using the flow of a manifold-constrained neural ODE, whenever a certain controllability condition is satisfied. Additionally, we show that this universal approximation property holds when the neural ODE has limited width in each layer, thus leveraging the depth of the network instead for approximation. We verify our theoretical findings using numerical experiments in PyTorch for the manifolds $S^2$ and the 3-dimensional orthogonal group SO(3), which are model manifolds for mechanical systems such as spacecraft and satellites. We also compare the performance of the manifold invariant neural ODE with classical neural ODEs that ignore the manifold invariant properties and show the superiority of our approach in terms of accuracy and sample complexity.

Fri, 12 May 2023digest

1.Projected solution for Generalized Nash Games with Non-ordered Preferences

2305.07275

Authors:Asrifa Sultana, Shivani Valecha

Abstract: Any individual's preference represents his choice in the set of available options. It is said to be complete if the person can compare any pair of available options. We aim to initiate the notion of projected solutions for the generalized Nash equilibrium problem with non-ordered (not necessarily complete and transitive) preferences and non-self constraint map. We provide the necessary and sufficient conditions under which projected solutions of a quasi-variational inequality and the considered GNEP coincide. Based on this variational reformulation, we derive the occurrence of projected solutions for the considered GNEP. Alternatively, by using a fixed point result, we ensure the existence of projected solutions for the considered GNEP without requiring the compactness of choice sets.

2.An Ant Colony System for the Team Orienteering Problem with Time Windows

2305.07305

Authors:Roberto Montemanni, Luca Maria Gambardella

Abstract: This paper discusses a heuristic approach for Team Orienteering Problems with Time Windows. The method we propose takes advantage of a solution model based on a hierarchic generalization of the original problem, which is combined with an Ant Colony System algorithm. Computational results on benchmark instances previously adopted in the literature suggest that the algorithm we propose is effective in practice.

3.A shape optimization pipeline for marine propellers by means of reduced order modeling techniques

2305.07515

Authors:Anna Ivagnes, Nicola Demo, Gianluigi Rozza

Abstract: In this paper, we propose a shape optimization pipeline for propeller blades, with naval applications in mind. The geometrical features of a blade are exploited to parametrize it, so that deformed blades can be obtained by perturbing its parameters. The optimization is performed using a genetic algorithm that exploits the computational speed-up of reduced order models to maximize the efficiency of a given propeller. A standard offline-online procedure is exploited to construct the reduced-order model. In an expensive offline phase, the full order model, which reproduces an open water test, is set up in the open-source software OpenFOAM and the same full order setting is used to run the CFD simulations for all the deformed propellers. The collected high-fidelity snapshots and the deformed parameters are used in the online stage to build the non-intrusive reduced-order model. This paper provides a proof of concept of the proposed pipeline, where the optimized propeller improves on the efficiency of the original propeller.

4.Efficient Dynamic Allocation Policy for Robust Ranking and Selection under Stochastic Control Framework

2305.07603

Authors:Hui Xiao, Zhihong Wei

Abstract: This research considers ranking and selection under input uncertainty. The objective is to maximize the posterior probability of correctly selecting the best alternative under a fixed simulation budget, where each alternative is measured by its worst-case performance. We formulate the dynamic simulation budget allocation decision problem as a stochastic control problem under a Bayesian framework. Following the approximate dynamic programming theory, we derive a one-step-ahead dynamic optimal budget allocation policy and prove that this policy achieves consistency and asymptotic optimality. Numerical experiments demonstrate that the proposed procedure can significantly improve performance.

5.Jointly optimization of passenger-route assignment and transfer incentivization scheme for a customized modular bus system

2305.07616

Authors:Jianbiao Wang, Tomio Miwa, Takayuki Morikawa

Abstract: As an emerging travel mode, the modular vehicle system (MVS) is receiving increasing attention. In particular, operators can connect multiple modular vehicles into an assembled bus in response to temporary variations in demand. Therefore, in this study, the MVS is adopted in the context of customized bus design to satisfy passengers' reserved travel demand. In addition, to increase the potential of the customized modular bus system, transfers among buses can be considered to group passengers with the same or nearby destinations. However, passengers will not actively transfer, as transferring is viewed as a disutility. Thus, appropriate incentives should be provided. To this end, we jointly optimize the passenger-route assignment and the transfer incentivization scheme, in which the transfer demand is treated as elastic under different incentive levels. Then, linearization approaches are adopted to transform the original nonlinear model into a mixed integer linear programming model, which can be solved by state-of-the-art solvers. The experiment on a small network reveals that the bus system performs better with an incentivization scheme than without one, and the extent of this advantage depends on the total demand level in the system.

6.On the Partial Convexification for Low-Rank Spectral Optimization: Rank Bounds and Algorithms

2305.07638

Authors:Yongchun Li, Weijun Xie

Abstract: A Low-rank Spectral Optimization Problem (LSOP) minimizes a linear objective subject to multiple two-sided linear matrix inequalities intersected with a low-rank and spectral constrained domain set. Although solving LSOP is, in general, NP-hard, its partial convexification (i.e., replacing the domain set by its convex hull) termed "LSOP-R," is often tractable and yields a high-quality solution. This motivates us to study the strength of LSOP-R. Specifically, we derive rank bounds for any extreme point of the feasible set of LSOP-R and prove their tightness for the domain sets with different matrix spaces. The proposed rank bounds recover two well-known results in the literature from a fresh angle and also allow us to derive sufficient conditions under which the relaxation LSOP-R is equivalent to the original LSOP. To effectively solve LSOP-R, we develop a column generation algorithm with a vector-based convex pricing oracle, coupled with a rank-reduction algorithm, which ensures the output solution satisfies the theoretical rank bound. Finally, we numerically verify the strength of the LSOP-R and the efficacy of the proposed algorithms.

Thu, 11 May 2023digest

1.A Robust Control Approach to Asymptotic Optimality of the Heavy Ball Method for Optimization of Quadratic Functions

2305.06593

Authors:V. Ugrinovskii, I. R. Petersen, I. Shames

Abstract: Among first order optimization methods, Polyak's heavy ball method has long been known to guarantee the asymptotic rate of convergence matching Nesterov's lower bound for functions defined in an infinite-dimensional space. In this paper, we use results on the robust gain margin of linear uncertain feedback control systems to show that the heavy ball method is provably worst-case asymptotically optimal when applied to quadratic functions in a finite dimensional space.
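
As a rough illustration of the method discussed in this abstract (not the authors' analysis), the sketch below runs Polyak's heavy ball iteration x_{k+1} = x_k - alpha * grad f(x_k) + beta * (x_k - x_{k-1}) on a diagonal quadratic f(x) = 0.5 * sum_i d_i x_i^2. The function name `heavy_ball_quadratic`, the diagonal test problem, and the use of the classical step-size and momentum formulas are assumptions made for this example.

```python
import math


def heavy_ball_quadratic(diag, x0, iters):
    """Heavy ball on f(x) = 0.5 * sum_i diag[i] * x[i]**2 (hypothetical helper).

    Assumes all entries of diag are positive. Uses the classical tuning
    based on the smallest (mu) and largest (L) curvature.
    """
    mu, L = min(diag), max(diag)
    sqrt_mu, sqrt_L = math.sqrt(mu), math.sqrt(L)
    alpha = 4.0 / (sqrt_L + sqrt_mu) ** 2                    # step size
    beta = ((sqrt_L - sqrt_mu) / (sqrt_L + sqrt_mu)) ** 2    # momentum

    x_prev = list(x0)
    x = list(x0)
    for _ in range(iters):
        grad = [d * xi for d, xi in zip(diag, x)]
        x_next = [
            xi - alpha * g + beta * (xi - xp)
            for xi, g, xp in zip(x, grad, x_prev)
        ]
        x_prev, x = x, x_next
    return x


if __name__ == "__main__":
    # Condition number 100; the iterates should approach the minimizer 0.
    print(heavy_ball_quadratic([1.0, 10.0, 100.0], [1.0, 1.0, 1.0], 300))
```

With this tuning, the per-iteration contraction factor on a quadratic is asymptotically (sqrt(L) - sqrt(mu)) / (sqrt(L) + sqrt(mu)), which is the rate the abstract refers to as matching the lower bound.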

2.Time-Reversed Dissipation Induces Duality Between Minimizing Gradient Norm and Function Value

2305.06628

Authors:Jaeyeon Kim, Asuman Ozdaglar, Chanwoo Park, Ernest K. Ryu

Abstract: In convex optimization, first-order optimization methods efficiently minimizing function values have been a central subject of study since Nesterov's seminal work of 1983. Recently, however, Kim and Fessler's OGM-G and Lee et al.'s FISTA-G have been presented as alternatives that efficiently minimize the gradient magnitude instead. In this paper, we present H-duality, which represents a surprising one-to-one correspondence between methods efficiently minimizing function values and methods efficiently minimizing gradient magnitude. In continuous-time formulations, H-duality corresponds to reversing the time dependence of the dissipation/friction term. To the best of our knowledge, H-duality is different from Lagrange/Fenchel duality and is distinct from any previously known duality or symmetry relations. Using H-duality, we obtain a clearer understanding of the symmetry between Nesterov's method and OGM-G, derive a new class of methods efficiently reducing gradient magnitudes of smooth convex functions, and find a new composite minimization method that is simpler and faster than FISTA-G.

3.Linear System Analysis and Optimal Control of Natural Gas Dynamics in Pipeline Networks

2305.06658

Authors:Luke S. Baker, Sachin Shivakumar, Dieter Armbruster, Rodrigo B. Platte, Anatoly Zlotnik

Abstract: We derive a linear system of ordinary differential equations (ODEs) to approximate the dynamics of natural gas in pipeline networks. Although a closed-form expression of the eigenvalues of the state matrix does not generally exist, the poles of an irrational transfer function corresponding to the linearized partial differential equations are used to approximate the eigenvalues of the ODE system. Our analysis qualitatively demonstrates that the eigenvalues of the state matrix of the entire network system are "pipeline separable" in the sense that the eigenvalues are dominated by the individual pipeline parameters and not the incidence connectivity of the network graph. The linear system is used as the dynamic constraints of a linear optimal control problem (OCP) to design the control actions of compressor units to minimize the energy that they expend. The motivation of this work is to reduce the computational complexity of optimizing gas dynamics in large networks to meet the unpredictable and highly variable demand from electric generators. The linear and corresponding nonlinear OCPs are discretized in time to obtain linear and nonlinear optimization problems, which are demonstrated on a test network to illustrate the validity of linear programming. Moreover, an analytical bound on the error between the solutions of the linear and nonlinear flow dynamics is presented using Lyapunov functions and verified computationally by plotting the error against the size of the flow variation around the steady-state solution.

4.Regularization properties of dual subgradient flow

2305.06682

Authors:Vassilis Apidopoulos, Cesare Molinari, Lorenzo Rosasco, Silvia Villa

Abstract: Dual gradient descent combined with early stopping represents an efficient alternative to the Tikhonov variational approach when the regularizer is strongly convex. However, for many relevant applications, it is crucial to deal with regularizers which are only convex. In this setting, the dual problem is non smooth, and dual gradient descent cannot be used. In this paper, we study the regularization properties of a subgradient dual flow, and we show that the proposed procedure achieves the same recovery accuracy as penalization methods, while being more efficient from the computational perspective.

5.Lagrange Multipliers in locally convex spaces

2305.06736

Authors:Mohammed Bachir, Joel Blot

Abstract: We give a general Lagrange multiplier rule for mathematical programming problems in a Hausdorff locally convex space. We consider infinitely many inequality and equality constraints. Our results give in particular a generalisation of the result of J. Jahn in \cite{Ja}, replacing Fréchet-differentiability assumptions on the functions by Gateaux-differentiability. Moreover, the closed convex cone with a nonempty interior in the constraints is replaced by a strictly general class of closed subsets introduced in the paper and called "admissible sets". Examples illustrating our results are given.

6.Multiplier rules for Dini-derivatives in a topological vector space

2305.06765

Authors:Mohammed Bachir, Rongzhen Lyu

Abstract: We provide new first-order necessary optimality conditions in the form of John's theorem and in the form of the Karush-Kuhn-Tucker theorem. We establish our results in a topological vector space for problems with inequality constraints and in a Banach space for problems with equality and inequality constraints. Our contribution consists in extending the results known for Fréchet- and Gateaux-differentiable functions, as well as for Clarke's subdifferential of Lipschitz functions, to the more general Dini-differentiable functions. As a consequence, we extend the result of B.H. Pourciau in \cite[Theorem 6, p. 445]{Po} from convexity to "Dini-pseudoconvexity".

7.Alternating mixed-integer programming and neural network training for approximating stochastic two-stage problems

2305.06785

Authors:Jan Kronqvist, Boda Li, Jan Rolfes, Shudian Zhao

Abstract: The presented work addresses two-stage stochastic programs (2SPs), a broadly applicable model to capture optimization problems subject to uncertain parameters with adjustable decision variables. In case the adjustable or second-stage variables contain discrete decisions, the corresponding 2SPs are known to be NP-complete. The standard approach of forming a single-stage deterministic equivalent problem can be computationally challenging even for small instances, as the number of variables and constraints scales with the number of scenarios. To avoid forming a potentially huge MILP problem, we build upon an approach of approximating the expected value of the second-stage problem by a neural network (NN) and encoding the resulting NN into the first-stage problem. The proposed algorithm alternates between optimizing the first-stage variables and retraining the NN. We demonstrate the value of our approach with the example of computing operating points in power systems by showing that the alternating approach provides improved first-stage decisions and a tighter approximation between the expected objective and its neural network approximation.

8.Stochastic Variance-Reduced Majorization-Minimization Algorithms

2305.06848

Authors:Duy-Nhat Phan, Sedi Bartz, Nilabja Guha, Hung M. Phan

Abstract: We study a class of nonconvex nonsmooth optimization problems in which the objective is a sum of two functions: One function is the average of a large number of differentiable functions, while the other function is proper, lower semicontinuous and has a surrogate function that satisfies standard assumptions. Such problems arise in machine learning and regularized empirical risk minimization applications. However, nonconvexity and the large-sum structure are challenging for the design of new algorithms. Consequently, effective algorithms for such scenarios are scarce. We introduce and study three stochastic variance-reduced majorization-minimization (MM) algorithms, combining the general MM principle with new variance-reduced techniques. We prove almost sure subsequential convergence of the generated sequence to a stationary point. We further show that our algorithms possess the best-known complexity bounds in terms of gradient evaluations. We demonstrate the effectiveness of our algorithms on sparse binary classification problems, sparse multi-class logistic regressions, and neural networks by employing several widely-used and publicly available data sets.

Wed, 10 May 2023digest

1.Hermite kernel surrogates for the value function of high-dimensional nonlinear optimal control problems

2305.06122

Authors:Tobias Ehring, Bernard Haasdonk

Abstract: Numerical methods for the optimal feedback control of high-dimensional dynamical systems typically suffer from the curse of dimensionality. In the current presentation, we devise a mesh-free data-based approximation method for the value function of optimal control problems, which partially mitigates the dimensionality problem. The method is based on a greedy Hermite kernel interpolation scheme and incorporates context-knowledge by its structure. Especially, the value function surrogate is elegantly enforced to be 0 in the target state, non-negative and constructed as a correction of a linearized model. The algorithm is proposed in a matrix-free way, which circumvents the large-matrix-problem for multivariate Hermite interpolation. For finite time horizons, both convergence of the surrogate to the value function as well as for the surrogate vs. the optimal controlled dynamical system are proven. Experiments support the effectiveness of the scheme, using among others a new academic model that has a scalable dimension and an explicitly given value function. It may also be useful for the community to validate other optimal control approaches.

2.Two-stage and Lagrangian Dual Decision Rules for Multistage Adaptive Robust Optimization

2305.06190

Authors:Maryam Daryalal, Ayse N. Arslan, Merve Bodur

Abstract: In this work, we design primal and dual bounding methods for multistage adjustable robust optimization (MSARO) problems by adapting two decision rules rooted in the stochastic programming literature. This approach approximates the primal and dual formulations of an MSARO problem with two-stage models. From the primal perspective, this is achieved by applying two-stage decision rules that restrict the functional forms of a certain subset of decision variables. We present sufficient conditions under which the well-known constraint-and-column generation algorithm can be used to solve the primal approximation with finite convergence guarantees. From the dual side, we introduce a distributionally robust dual problem for MSARO models using their nonanticipative Lagrangian dual and then apply linear decision rules on the Lagrangian multipliers. For this dual approximation, we present a monolithic bilinear program valid for continuous recourse problems, and a cutting-plane method for mixed-integer recourse problems. Our framework is general-purpose and does not require strong assumptions such as a stage-wise independent uncertainty set, and can consider integer recourse variables. Computational experiments on newsvendor, location-transportation, and capital budgeting problems show that our bounds yield considerably smaller optimality gaps compared to the existing methods.

Tue, 09 May 2023digest

1.Symmetries in polynomial optimization

2305.05219

Authors:Philippe Moustrou, Cordian Riener, Hugues Verdure

Abstract: This chapter investigates how symmetries can be used to reduce the computational complexity in polynomial optimization problems. A particular focus is placed on the Moment-SOS hierarchy in polynomial optimization, where results from representation theory and invariant theory of groups can be used. In addition, symmetry reduction techniques which are more generally applicable are also presented.

2.Numerical simulation of differential-algebraic equations with embedded global optimization criteria

2305.05288

Authors:Jens Deussen, Jonathan Hüser, Uwe Naumann

Abstract: We consider differential-algebraic equations with embedded optimization criteria (DAEOs) in which the embedded optimization problem is solved by global optimization. This leads to differential inclusions in cases where there are multiple global optimizers at the same time. Jump events from one global optimum to another result in nonsmooth DAEs and thus a reduction of the convergence order of the numerical integrator to first order. Implementation of event detection and location as introduced in this work preserves the higher-order convergence behavior of the integrator. This allows computing discrete tangents and adjoint sensitivities for optimal control problems.

3.More on Projected Type Iteration Method and Linear Complementarity Problem

2305.05341

Authors:Bharat Kumar, Deepmala, A. K. Das

Abstract: In this article, we establish a class of new projected type iteration methods based on matrix splitting for solving the linear complementarity problem. Also, we provide a sufficient condition for the convergence analysis when the system matrix is an $H_+$-matrix. We show the efficiency of the proposed method by using two numerical examples for different parameters. Keywords. Iterative method, Linear complementarity problem, $H_{+}$-matrix, $P$-matrix, Matrix splitting, Convergence.
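
For readers unfamiliar with projected iterations for the linear complementarity problem LCP(q, M) (find z >= 0 with Mz + q >= 0 and z^T (Mz + q) = 0), the sketch below shows the classical projected Gauss-Seidel scheme. This is a textbook baseline and not the new method of the paper; the function name `projected_gauss_seidel` and the small test matrix are assumptions made for illustration, and the sketch assumes M has a positive diagonal and is strictly diagonally dominant.

```python
def projected_gauss_seidel(M, q, iters=200):
    """Classical projected Gauss-Seidel sweep for LCP(q, M) (illustrative only).

    M is a list of rows with positive diagonal entries; q is a list.
    """
    n = len(q)
    z = [0.0] * n
    for _ in range(iters):
        for i in range(n):
            # Residual of row i using the most recent values of z.
            r = q[i] + sum(M[i][j] * z[j] for j in range(n))
            # Gauss-Seidel update followed by projection onto z_i >= 0.
            z[i] = max(0.0, z[i] - r / M[i][i])
    return z


if __name__ == "__main__":
    M = [[4.0, -1.0], [-1.0, 4.0]]
    print(projected_gauss_seidel(M, [-1.0, -2.0]))  # both components active
    print(projected_gauss_seidel(M, [1.0, -2.0]))   # first component clipped to 0
```

The projection step is what distinguishes the scheme from an ordinary Gauss-Seidel solve of Mz = -q: components that would become negative are held at zero, which enforces complementarity at convergence.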

4.On the continuity assumption of "Finite Adaptability in Multistage Linear Optimization" by Bertsimas and Caramanis

2305.05399

Authors:Safia Kedad-Sidhoum, Anton Medvedev, Frédéric Meunier

Abstract: Two-stage robust optimization is a fundamental paradigm for modeling and solving optimization problems with uncertain parameters. A now classical method within this paradigm is finite-adaptability, introduced by Bertsimas and Caramanis (IEEE Transactions on Automatic Control, 2010). In this note, we point out that the continuity assumption they stated to ensure the convergence of the method is not correct, and we propose an alternative assumption for which we prove the desired convergence.

5.Continuous-Time Linear Optimization

2305.05466

Authors:Valeriano Antunes de Oliveira

Abstract: In this work, optimality conditions and classical results from duality theory are derived for continuous-time linear optimization problems with inequality constraints. The optimality conditions are given in the Karush-Kuhn-Tucker form. Weak and strong duality properties, as well as the complementary slackness theorem, are established. A result concerning the existence of solutions is also stated.

6.On adaptive stochastic heavy ball momentum for solving linear systems

2305.05482

Authors:Yun Zeng, Deren Han, Yansheng Su, Jiaxin Xie

Abstract: The stochastic heavy ball momentum (SHBM) method has gained considerable popularity as a scalable approach for solving large-scale optimization problems. However, one limitation of this method is its reliance on prior knowledge of certain problem parameters, such as singular values of a matrix. In this paper, we propose an adaptive variant of the SHBM method for solving stochastic problems that are reformulated from linear systems using a user-defined distribution. Our adaptive SHBM (ASHBM) method utilizes iterative information to update the parameters, addressing an open problem in the literature regarding the adaptive learning of momentum parameters. We prove that our method converges linearly in expectation, with a better convergence rate compared to the basic method. Notably, we demonstrate that the deterministic version of our ASHBM algorithm can be reformulated as a variant of the conjugate gradient (CG) method, inheriting many of its appealing properties, such as finite-time convergence. Consequently, the ASHBM method can be further generalized to develop a brand-new framework of the stochastic CG (SCG) method for solving linear systems. Our theoretical results are supported by numerical experiments.

7.On Measurement Disturbances in Distributed Least Squares Solvers for Linear Equations

2305.05512

Authors:Yutao Tang, Yicheng Zhang, Ruonan Li, Xinghu Wang

Abstract: This paper aims at distributed algorithms for solving a system of linear algebraic equations. Different from most existing formulations for this problem, we assume that the local data at each node is not accurately measured but subject to some disturbances. To be specific, the local measurement consists of two parts: a nominal value and a multiple sinusoidal disturbance. By introducing an identifier-enhanced observer to estimate the disturbance, we present a novel distributed least squares solver for the linear equations using noisy measurements. The proposed solver is proven to be able to recover the least squares solution to the linear equations associated with the nominal values irrespective of any multi-sinusoidal disturbance even with unknown frequencies. We also show the robustness of the distributed solvers under standard conditions against unstructured perturbations. The effectiveness of our design is verified by a numerical example.

8.Signed tropicalization of polars and application to matrix cones

2305.05637

Authors:Marianne Akian, Xavier Allamigeon, Stéphane Gaubert, Sergei Sergeev

Abstract: We study the tropical analogue of the notion of polar of a cone, working over the semiring of tropical numbers with signs and show that the tropical polars of sets of nonnegative tropical vectors are precisely sets of signed vectors that are closed and that are stable by an operation of linear combination. We relate tropical polars with images by the nonarchimedean valuation of classical polars over real closed nonarchimedean fields and show, in particular, that for semi-algebraic sets over such a field, the operation of taking the polar commutes with the operation of signed valuation (keeping track both of the nonarchimedean valuation and sign). We apply these results to characterize images by the signed valuation of classical cones of matrices, including the cones of positive semidefinite matrices, completely positive matrices, completely positive semidefinite matrices, and their polars, including the cone of co-positive matrices. It turns out, in particular, that hierarchies of classical cones collapse under tropicalization. We finally discuss a simple application of these ideas to optimization with signed tropical numbers.

Mon, 08 May 2023digest

1.Nash equilibria for total expected reward absorbing Markov games: the constrained and unconstrained cases

2305.04514

Authors:François Dufour, Tomás Prieto-Rumeau

Abstract: We consider a nonzero-sum N-player Markov game on an abstract measurable state space with compact metric action spaces. The payoff functions are bounded Carathéodory functions and the transitions of the system are assumed to have a density function satisfying some continuity conditions. The optimality criterion of the players is given by a total expected payoff on an infinite discrete-time horizon. Under the condition that the game model is absorbing, we establish the existence of Markov strategies that are a noncooperative equilibrium in the family of all history-dependent strategies of the players for both the constrained and the unconstrained problems. We obtain, as a particular case of our results, the existence of Nash equilibria for discounted constrained and unconstrained game models.

2.Optimal Battery Charge Scheduling For Revenue Stacking Under Operational Constraints Via Energy Arbitrage

2305.04566

Authors:Alban Puech (IP Paris & DEIF), Gorazd Dimitrov (IP Paris), Claudia D'Ambrosio (LIX, CNRS)

Abstract: As the share of variable renewable energy sources increases in the electricity mix, new solutions are needed to build a flexible and reliable grid. Energy arbitrage with battery storage systems supports renewable energy integration into the grid by shifting demand and increasing the overall utilization of power production systems. In this paper, we propose a mixed integer linear programming model for energy arbitrage on the day-ahead market, that takes into account operational and availability constraints of asset owners willing to get an additional revenue stream from their storage asset. This approach optimally schedules the charge and discharge operations associated with the most profitable trading strategy, and achieves between 80% and 90% of the maximum obtainable profits considering one-year time horizons using the prices of electricity in multiple European countries including Germany, France, Italy, Denmark, and Spain.

3.Accelerated Stochastic Optimization Methods under Quasar-convexity

2305.04736

Authors:Qiang Fu, Dongchu Xu, Ashia Wilson

Abstract: Non-convex optimization plays a key role in a growing number of machine learning applications. This motivates the identification of specialized structure that enables sharper theoretical analysis. One such identified structure is quasar-convexity, a non-convex generalization of convexity that subsumes convex functions. Existing algorithms for minimizing quasar-convex functions in the stochastic setting have either high complexity or slow convergence, which prompts us to derive a new class of stochastic methods for optimizing smooth quasar-convex functions. We demonstrate that our algorithms have fast convergence and outperform existing algorithms on several examples, including the classical problem of learning linear dynamical systems. We also present a unified analysis of our newly proposed algorithms and a previously studied deterministic algorithm.

4.Gradient descent with a general cost

2305.04917

Authors:Flavien Léger, Pierre-Cyril Aubin-Frankowski

Abstract: We present a new class of gradient-type optimization methods that extends vanilla gradient descent, mirror descent, Riemannian gradient descent, and natural gradient descent. Our approach involves constructing a surrogate for the objective function in a systematic manner, based on a chosen cost function. This surrogate is then minimized using an alternating minimization scheme. Using optimal transport theory we establish convergence rates based on generalized notions of smoothness and convexity. We provide local versions of these two notions when the cost satisfies a condition known as nonnegative cross-curvature. In particular our framework provides the first global rates for natural gradient descent and Newton's method.

Fri, 05 May 2023digest

1.AdaBiM: An adaptive proximal gradient method for structured convex bilevel optimization

2305.03559

Authors:Puya Latafat, Andreas Themelis, Silvia Villa, Panagiotis Patrinos

Abstract: Bilevel optimization is a comprehensive framework that bridges single- and multi-objective optimization. It encompasses many general formulations, including, but not limited to, standard nonlinear programs. This work demonstrates how elementary proximal gradient iterations can be used to solve a wide class of convex bilevel optimization problems without involving subroutines. Compared to and improving upon existing methods, ours (1) can handle a wider class of problems, including nonsmooth terms in the upper and lower level problems, (2) does not require strong convexity or global Lipschitz gradient continuity assumptions, and (3) provides a systematic adaptive stepsize selection strategy, allowing for the use of large stepsizes while being insensitive to the choice of parameters.

2.Scope Restriction for Scalable Real-Time Railway Rescheduling: An Exploratory Study

2305.03574

Authors:Erik Nygren, Christian Eichenberger, Emma Frejinger

Abstract: With the aim of stimulating future research, we describe an exploratory study of a railway rescheduling problem. A widely used approach in practice and state of the art is to decompose these complex problems by geographical scope. Instead, we propose defining a core problem that restricts a rescheduling problem in response to a disturbance to only trains that need to be rescheduled, hence restricting the scope in both time and space. In this context, the difficulty resides in defining a scoper that can predict a subset of train services that will be affected by a given disturbance. We report preliminary results using the Flatland simulation environment that highlight the potential and challenges of this idea. We provide an extensible playground open-source implementation based on the Flatland railway environment and Answer-Set Programming.

3.Convergence of the Preconditioned Proximal Point Method and Douglas-Rachford Splitting in the Absence of Monotonicity

2305.03605

Authors:Brecht Evens, Pieter Pas, Puya Latafat, Panagiotis Patrinos

Abstract: The proximal point algorithm (PPA) is the most widely recognized method for solving inclusion problems and serves as the foundation for many numerical algorithms. Despite this popularity, its convergence results have been largely limited to the monotone setting. In this work, we study the convergence of (relaxed) preconditioned PPA for a class of nonmonotone problems that satisfy an oblique weak Minty condition. Additionally, we study the (relaxed) Douglas-Rachford splitting (DRS) method in the nonmonotone setting by establishing a connection between DRS and the preconditioned PPA with a positive semidefinite preconditioner. To better characterize the class of problems covered by our analysis, we introduce the class of semimonotone operators, offering a natural extension to (hypo)monotone and co(hypo)monotone operators, and describe some of their properties. Sufficient conditions for global convergence of DRS involving the sum of two semimonotone operators are provided. Notably, it is shown that DRS converges even when the sum of the involved operators (or of their inverses) is nonmonotone. Various example problems are provided, demonstrating the tightness of our convergence results and highlighting the wide range of applications our theory is able to cover.

Thu, 04 May 2023digest

1.L$\infty$/L1 Duality Results In Optimal Control Problems

2305.02585

Authors:Dan Goreac (LAMA), Alain Rapaport (MISTEA)

Abstract: We provide a duality result linking the value function for a control problem with supremum cost H under an isoperimetric inequality G $\le$ gmax, and the value function for the same controlled dynamics with cost G and state constraint H $\le$ hmax. This duality is proven for initial conditions at which lower semi-continuity of the value functions can be guaranteed, and is completed with optimality considerations. Furthermore, we provide structural assumptions on the dynamics under which such regularity can be established. As a by-product, we illustrate the partial equivalence between recent works dealing with non-pharmaceutically controlled epidemics under peak or budget restrictions.

2.Well-posedness for the split equilibrium problem

2305.02696

Authors:Soumitra Dey, V. Vetrivel, Hong-Kun Xu

Abstract: We extend the concept of well-posedness to the split equilibrium problem and establish Furi-Vignoli-type characterizations for the well-posedness. We prove that the well-posedness of the split equilibrium problem is equivalent to the existence and uniqueness of its solution under certain assumptions on the bifunctions involved. We also characterize the generalized well-posedness of the split equilibrium problem via the Kuratowski measure of noncompactness. We illustrate our theoretical results by several examples.

3.Tracking Point Vortices and Circulations via Advected Passive Particles: an Estimation Approach

2305.02737

Authors:Gil Marques, Marco Martins Afonso, Sílvio Gama

Abstract: We present a novel method for estimating the circulations and positions of point vortices using trajectory data of passive particles in the presence of Gaussian noise. The method comprises two algorithms: the first one calculates the vortex circulations, while the second one reconstructs the vortex trajectories. This reconstruction is done thanks to a hierarchy of optimization problems, involving the integration of systems of differential equations, over time sub-intervals all with the same amplitude defined by the autocorrelation function for the advected passive particles' trajectories. Our findings indicate that accurately tracking the position of vortices and determining their circulations is achievable, even when passive particle trajectories are affected by noise.

4.New Accelerated Modulus-Based Iteration Method for Solving Large and Sparse Linear Complementarity Problem

2305.02764

Authors:Bharat Kumar, Deepmala, A. K. Das

Abstract: In this article, we establish a class of new accelerated modulus-based iteration methods for solving the linear complementarity problem. When the system matrix is an $H_+$-matrix, we present appropriate criteria for the convergence analysis. Also, we demonstrate the effectiveness of our proposed method and reduce the number of iterations and CPU time to accelerate the convergence performance by providing two numerical examples for various parameters. Keywords. Linear complementarity problem, Iteration method, $P$-matrix, $H_{+}$-matrix, Convergence analysis, Matrix splitting.

5.On the combined inverse-square effect of multiple point sources in multidimensional space

2305.02912

Authors:Keaton Coletti, Pawel Kalczynski, Zvi Drezner

Abstract: The inverse-square law states that the effect a source has on its surroundings is inversely proportional to the square of the Euclidean distance from that source. Its applicability spans multiple fields including physics, engineering, and computer science. We study the combined effect of multiple point sources surrounding a closed region in multidimensional space. We determine that the maximum effect in D dimensions is on the region's boundary if $D \leq 4$, and the minimum is on the boundary if $D \geq 4$.
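
One way to build intuition for why D = 4 appears as the threshold (this is an illustration, not the authors' proof) is that the Laplacian of 1/r^2 in D dimensions equals 2(4 - D)/r^4, so the combined effect is subharmonic for D <= 4 and superharmonic for D >= 4 away from the sources. The sketch below checks that sign numerically with central finite differences; the function names and the single-source test configuration are assumptions made for this example.

```python
def inverse_square_effect(x, sources):
    """Combined effect at point x: sum of w / ||x - p||^2 over (p, w) pairs."""
    total = 0.0
    for p, w in sources:
        dist_sq = sum((xi - pi) ** 2 for xi, pi in zip(x, p))
        total += w / dist_sq
    return total


def laplacian_fd(f, x, h=1e-3):
    """Central finite-difference approximation of the Laplacian of f at x."""
    fx = f(x)
    lap = 0.0
    for i in range(len(x)):
        xp = list(x)
        xm = list(x)
        xp[i] += h
        xm[i] -= h
        lap += (f(xp) - 2.0 * fx + f(xm)) / (h * h)
    return lap


def laplacian_single_source(dim):
    """Laplacian of 1/r^2 at distance 1 from a unit source at the origin."""
    source = [([0.0] * dim, 1.0)]
    x = [1.0] + [0.0] * (dim - 1)
    return laplacian_fd(lambda y: inverse_square_effect(y, source), x)


if __name__ == "__main__":
    for dim in (3, 4, 5):
        print(dim, laplacian_single_source(dim))
```

At unit distance the analytic value is 2(4 - D), so the printed numbers should be positive for D = 3, near zero for D = 4, and negative for D = 5.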

6.Multi-period Power System Risk Minimization under Wildfire Disruptions

2305.02933

Authors:Hanbin Yang, Noah Rhodes, Haoxiang Yang, Line Roald, Lewis Ntaimo

Abstract: As climate change evolves, wildfire risk is increasing globally, posing a growing threat to power systems, with grid failures fueling the most destructive wildfires. In day-to-day operations, preemptive de-energization of equipment is an effective tool to mitigate the risk and damage of wildfires. However, such power outages have significant impacts on customers. This paper proposes a novel framework for planning preemptive de-energization of power systems to mitigate wildfire risk and damage. We model wildfire disruptions as stochastic disruptions with random magnitude and timing and formulate a two-stage stochastic program that maximizes the electricity delivered while proactively reducing the risk of wildfires by selectively de-energizing components over multiple time periods. We use cellular automata to sample grid failure and wildfire scenarios based on grid-related risks and environmental factors. We develop a decomposition algorithm that can generate adaptive shutoff plans before a disruption occurs. We test our method on an augmented version of the RTS-GLMC test case in Southern California and compare it with two benchmark cases. We find that our method reduces both wildfire damage costs and load-shedding losses over multiple time periods, and our nominal plan is robust against the uncertainty model perturbation.

Wed, 03 May 2023digest

1.A Generalisation of the Secant Criterion

2305.02088

Authors:Richard Pates

Abstract: The cyclic feedback interconnection of $n$ subsystems is the basic building block of control theory. Many robust stability tools have been developed for this interconnection. Two notable examples are the small gain theorem and the Secant Criterion. Both of these conditions guarantee stability if an inequality involving the geometric mean of a set of subsystem indices is satisfied. The indices in each case are designed to capture different core properties; gain in the case of the small gain theorem, and the degree of output-strict-passivity in the Secant Criterion. In this paper we identify entire families of other suitable indices based on mappings of the unit disk. This unifies the small gain theorem and the Secant Criterion, as well as a range of other stability criteria, into a single condition.

2.On a Unified and Simplified Proof for the Ergodic Convergence Rates of PPM, PDHG and ADMM

2305.02165

Authors:Haihao Lu, Jinwen Yang

Abstract: We present a unified viewpoint of the proximal point method (PPM), primal-dual hybrid gradient (PDHG) and the alternating direction method of multipliers (ADMM) for solving convex-concave primal-dual problems. This viewpoint shows the equivalence of these three algorithms up to a norm change, and it leads to a simple four-line proof of their $\mathcal O(1/k)$ ergodic rates.

3.Distributionally robust chance constrained Markov decision process with Kullback-Leibler divergence

2305.02167

Authors:Tian Xia, Jia Liu, Abdel Lisser

Abstract: This paper considers the distributionally robust chance constrained Markov decision process with random reward and ambiguous reward distribution. We consider individual and joint chance constraint cases with Kullback-Leibler divergence based ambiguity sets centered at elliptical distributions or elliptical mixture distributions, respectively. We derive tractable reformulations of the distributionally robust individual chance constrained Markov decision process problems and design a new hybrid algorithm based on the sequential convex approximation and line search method for the joint case. We carry out numerical tests with a machine replacement problem.

Tue, 02 May 2023digest

1.SOS Construction of Compatible Control Lyapunov and Barrier Functions

2305.01222

Authors:Michael Schneeberger, Florian Dörfler, Silvia Mastellone

Abstract: We propose a novel approach to certify closed-loop stability and safety of a constrained polynomial system based on the combination of Control Lyapunov Functions (CLFs) and Control Barrier Functions (CBFs). For polynomial systems that are affine in the control input, both classes of functions can be constructed via Sum Of Squares (SOS) programming. Using two versions of the Positivstellensatz, we derive an SOS formulation seeking a rational controller that, if feasible, results in a compatible CLF and multiple CBFs.

2.Time-Domain Moment Matching for Second-Order Systems

2305.01254

Authors:Xiaodong Cheng, Tudor C. Ionescu, Monica Pătraşcu

Abstract: This paper studies a structure-preserving model reduction problem for large-scale second-order dynamical systems via the framework of time-domain moment matching. The moments of a second-order system are interpreted as the solutions of second-order Sylvester equations, which leads to families of parameterized second-order reduced models that match the moments of an original second-order system at selected interpolation points. Based on this, a two-sided moment matching problem is addressed, providing a unique second-order reduced system that matches two distinct sets of interpolation points. Furthermore, we construct reduced second-order systems that match the moments of both the zeroth- and first-order derivatives of the original second-order system. Finally, the Loewner framework is extended to second-order systems, where two parameterized families of models are presented that retain the second-order structure and interpolate sets of tangential data.

3.The role of individual compensation and acceptance decisions in crowdsourced delivery

2305.01317

Authors:Alim Buğra Çınar, Wout Dullaert, Markus Leitner, Rosario Paradiso, Stefan Waldherr

Abstract: High demand, rising customer expectations, and government regulations are forcing companies to increase the efficiency and sustainability of urban (last-mile) distribution. Consequently, several new delivery concepts have been proposed that increase flexibility for customers and other stakeholders. One of these innovations is crowdsourced delivery, where deliveries are made by occasional drivers who wish to utilize their surplus resources (unused transport capacity) by making deliveries in exchange for some compensation. In addition to reducing delivery costs, the potential benefits of crowdsourced delivery include better utilization of transport capacity, a reduction in overall traffic, and increased flexibility (by scaling up and down delivery capacity as needed). The use of occasional drivers poses new challenges because (unlike traditional couriers) neither their availability nor their behavior in accepting delivery offers is certain. The relationship between the compensation offered to occasional drivers and the probability that they will accept a task has been largely neglected in the scientific literature. Therefore, we consider a setting in which compensation-dependent acceptance probabilities are explicitly considered in the process of assigning delivery tasks to occasional drivers. We propose a mixed-integer nonlinear model that minimizes the expected delivery costs while identifying optimal assignments of tasks to a mix of traditional and occasional drivers and their compensation. We propose exact linearization schemes for two practically relevant probability functions and an approximate linearization scheme for the general case. The results of our computational study show clear advantages of our new approach over existing ones.

4.Projection-Free Online Convex Optimization with Stochastic Constraints

2305.01333

Authors:Duksang Lee, Nam Ho-Nguyen, Dabeen Lee

Abstract: This paper develops projection-free algorithms for online convex optimization with stochastic constraints. We design an online primal-dual projection-free framework that can take any projection-free algorithm developed for online convex optimization with no long-term constraint. With this general template, we deduce sublinear regret and constraint violation bounds for various settings. Moreover, for the case where the loss and constraint functions are smooth, we develop a primal-dual conditional gradient method that achieves $O(\sqrt{T})$ regret and $O(T^{3/4})$ constraint violations. Furthermore, for the setting where the loss and constraint functions are stochastic and strong duality holds for the associated offline stochastic optimization problem, we prove that the constraint violation can be reduced to have the same asymptotic growth as the regret.

5.Random Function Descent

2305.01377

Authors:Felix Benning, Leif Döring

Abstract: While gradient based methods are ubiquitous in machine learning, selecting the right step size often requires "hyperparameter tuning". This is because backtracking procedures like Armijo's rule depend on quality evaluations in every step, which are not available in a stochastic context. Since optimization schemes can be motivated using Taylor approximations, we replace the Taylor approximation with the conditional expectation (the best $L^2$ estimator) and propose "Random Function Descent" (RFD). Under light assumptions common in Bayesian optimization, we prove that RFD is identical to gradient descent, but with calculable step sizes, even in a stochastic context. We beat untuned Adam in synthetic benchmarks. To close the performance gap to tuned Adam, we propose a heuristic extension competitive with tuned Adam.

6.A Novel Approach for Solving Security Constrained Optimal Power Flow Using the Inverse Matrix Modification Lemma and Benders Decomposition

2305.01395

Authors:Matias Vistnes, Vijay Venu Vadlamudi, Sigurd Hofsmo Jakobsen, Oddbjørn Gjerde

Abstract: With the increasing complexity of power systems, faster methods for power system reliability analysis are needed. We propose a novel methodology to solve the security constrained optimal power flow (SCOPF) problem that reduces the computational time by using the Sherman-Morrison-Woodbury identity and Benders decomposition. The case study suggests that in a 500 node system, the run time is reduced by 83.5% while ensuring reliable operation of the system considering short- and long-term post-contingency limits and reducing the operational costs, compared to a preventive 'N-1' strategy.
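
For reference, the rank-one special case of the Sherman-Morrison-Woodbury identity reads $(A + uv^T)^{-1} = A^{-1} - A^{-1}u\,v^T A^{-1} / (1 + v^T A^{-1} u)$, which lets an already known inverse be updated without refactorizing the matrix. The pure-Python sketch below illustrates the identity only. It is not the authors' SCOPF implementation, and the helper names and the 2x2 example are assumptions made for illustration.

```python
def mat_vec(matrix, vector):
    """Return the matrix-vector product as a list."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


def sherman_morrison(a_inv, u, v):
    """Return the inverse of (A + u v^T), given a_inv = A^{-1}.

    Raises ValueError when the rank-one update makes the matrix singular.
    """
    n = len(u)
    a_inv_u = mat_vec(a_inv, u)  # A^{-1} u
    # Row vector v^T A^{-1}.
    v_a_inv = [sum(v[i] * a_inv[i][j] for i in range(n)) for j in range(n)]
    denom = 1.0 + sum(vi * wi for vi, wi in zip(v, a_inv_u))
    if abs(denom) < 1e-12:
        raise ValueError("rank-one update makes the matrix singular")
    return [
        [a_inv[i][j] - a_inv_u[i] * v_a_inv[j] / denom for j in range(n)]
        for i in range(n)
    ]


if __name__ == "__main__":
    # A = diag(2, 4), so A^{-1} = diag(0.5, 0.25).
    a_inv = [[0.5, 0.0], [0.0, 0.25]]
    # Update A with u v^T, where u = e1 and v = e2, giving [[2, 1], [0, 4]].
    print(sherman_morrison(a_inv, [1.0, 0.0], [0.0, 1.0]))
```

The update costs on the order of $n^2$ operations, compared with roughly $n^3$ for inverting the modified matrix from scratch, which is the general reason such identities are attractive when many single-component changes must be evaluated.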

7.Approximation of deterministic mean field games under polynomial growth conditions on the data

2305.01445

Authors:Justina Gianatti, Francisco J. Silva, Ahmad Zorkot

Abstract: We consider a deterministic mean field games problem in which a typical agent solves an optimal control problem where the dynamics is affine with respect to the control and the cost functional has a growth which is polynomial with respect to the state variable. In this framework, we construct a mean field game problem in discrete time and finite state space that approximates equilibria of the original game. Two numerical examples, solved with the fictitious play method, are presented.

8.Optimal control problems for stochastic processes with absorbing regime

2305.01490

Authors:Yaacov Kopeliovich

Abstract: In this paper we formulate and solve an optimal control problem for a stochastic process with an absorbing regime. The solution to this problem is obtained through a system of partial differential equations. The method is applied to obtain an explicit solution for the Merton portfolio problem with log utility when an asset has a default probability.

Mon, 01 May 2023digest

1.Electric Vehicle Supply Equipment Location and Capacity Allocation for Fixed-Route Networks

2305.00806

Authors:Amir Davatgari, Taner Cokyasar, Anirudh Subramanyam, Jeffrey Larson, Abolfazl Mohammadian

Abstract: Electric vehicle (EV) supply equipment location and allocation (EVSELCA) problems for freight vehicles are becoming more important because of the trending electrification shift. Some previous works address EV charger location and vehicle routing problems simultaneously by generating vehicle routes from scratch. Although such routes can be efficient, introducing new routes may violate practical constraints, such as drive schedules, and satisfying electrification requirements can require dramatically altering existing routes. To address the challenges in the prevailing adoption scheme, we approach the problem from a fixed-route perspective. We develop a mixed-integer linear program, a clustering approach, and a metaheuristic solution method using a genetic algorithm (GA) to solve the EVSELCA problem. The clustering approach simplifies the problem by grouping customers into clusters, while the GA generates solutions that are shown to be nearly optimal for small problem cases. A case study examines how charger costs, energy costs, the value of time (VOT), and battery capacity impact the cost of the EVSELCA. Charger costs were found to be the most significant component in the objective function, with an 80% decrease resulting in a 25% cost reduction. VOT costs decrease substantially as energy costs increase. The number of fast chargers increases as VOT doubles. Longer EV ranges decrease total costs up to a certain point, beyond which the decrease in total costs is negligible.

2.Cascading failures: dynamics, stability and control

2305.00838

Authors:Stefanny Ramirez, Maaike Odijk, Dario Bauso

Abstract: We develop a dynamic model of cascading failures in a financial network whereby cross-holdings are viewed as feedback, external asset investments as inputs and failure penalties as static nonlinearities. We provide sufficient conditions, both milder and stronger, for the system to be positive, and study equilibrium points and stability. Stability implies absence of cascades and convergence of market values to constant values. We provide a constructive method for control design to obtain stabilizing market investments in the form of feedback-feedforward control inputs.

3.A new Ordinal Regression procedure for Multiple Criteria Decision Aiding: the case of the space time model for a sustainable Ecovillage

2305.00940

Authors:Maria Barbati, Salvatore Greco, Isabella M. Lami

Abstract: In this paper, we present a methodology based on multiobjective optimization that suggests which facility to implement, in which location, and at which time. In this context, we define a new elicitation procedure to handle Decision Makers' (DMs) preferences, with an intrinsic and more general interest that goes beyond the specific decision problem. In particular, the user's preferences are elicited by combining the deck of cards method with the ordinal regression approach, allowing the DM to provide preference information in terms of rankings and pairwise comparisons with regard to the intensity of preference for some solutions of the optimization problem. Then, the score of the reference solutions obtained through the deck of cards method is used as a basis for an ordinal regression procedure that, to take into account interaction between criteria, represents the DM's multicriteria preferences by means of a value function expressed in terms of a Choquet integral. The obtained value function is then used to define a multiobjective optimization problem. The new feasible solutions obtained by solving the optimization problem are proposed to the DM to gauge the DM's appreciation and to collect further preference information, and the interaction procedure is iterated until the DM is satisfied with the proposed solution. We apply our methodology to a real-world problem: the planning procedure of a sustainable Ecovillage in the province of Turin (Italy). We consider a set of facilities to be distributed in a given space in a proper temporal sequence, which we formulate in terms of the space time model introduced by Barbati et al. (2020). We interact with the President of the cooperative owning the Ecovillage to detail which facilities of the Ecovillage should be selected among the proposed ones, where they should be located, and when they should be planned.

Fri, 28 Apr 2023digest

1.On Underdamped Nesterov's Acceleration

2304.14642

Authors:Shuo Chen, Bin Shi, Ya-xiang Yuan

Abstract: The high-resolution differential equation framework has been proven to be tailor-made for Nesterov's accelerated gradient descent method (NAG) and its proximal correspondence -- the class of faster iterative shrinkage thresholding algorithms (FISTA). However, the theory is still not complete, since the underdamped case ($r < 2$) has not been included. In this paper, based on the high-resolution differential equation framework, we construct new Lyapunov functions for the underdamped case, which are motivated by the power of the time $t^{\gamma}$ or the iteration $k^{\gamma}$ in the mixed term. When the momentum parameter $r$ is $2$, the new Lyapunov functions are identical to the previous ones. These new proofs not only include the convergence rate of the objective value previously obtained according to the low-resolution differential equation framework but also characterize the convergence rate of the minimal gradient norm square. All the convergence rates obtained for the underdamped case are continuously dependent on the parameter $r$. In addition, it is observed that the high-resolution differential equation approximately simulates the convergence behavior of NAG for the critical case $r=-1$, while the low-resolution differential equation degenerates to the conservative Newton's equation. The high-resolution differential equation framework also theoretically characterizes the convergence rates, which are consistent with those obtained for the underdamped case with $r=-1$.

2.A Method for Finding a Design Space as Linear Combinations of Parameter Ranges for Biopharmaceutical Control Strategies

2304.14666

Authors:Thomas Oberleitner, Thomas Zahel, Christoph Herwig

Abstract: According to ICH Q8 guidelines, the biopharmaceutical manufacturer submits a design space (DS) definition as part of the regulatory approval application, in which case process parameter (PP) deviations within this space are not considered a change and do not trigger a regulatory post approval procedure. A DS can be described by non-linear PP ranges, i.e., the range of one PP conditioned on specific values of another. However, independent PP ranges (linear combinations) are often preferred in biopharmaceutical manufacturing due to their operational simplicity. While some statistical software supports the calculation of a DS comprised of linear combinations, such methods are generally based on discretizing the parameter space - an approach that scales poorly as the number of PPs increases. Here, we introduce a novel method for finding linear PP combinations using a numeric optimizer to calculate the largest design space within the parameter space that results in critical quality attribute (CQA) boundaries within acceptance criteria, predicted by a regression model. A precomputed approximation of tolerance intervals is used in inequality constraints to facilitate fast evaluations of this boundary using a single matrix multiplication. Correctness of the method was validated against different ground truths with known design spaces. Compared to state-of-the-art, grid-based approaches, the optimizer-based procedure is more accurate, generally yields a larger DS and enables the calculation in higher dimensions. Furthermore, a proposed weighting scheme can be used to favor certain PPs over others, thereby enabling a more dynamic approach to DS definition and exploration. The increased PP ranges of the larger DS provide greater operational flexibility for biopharmaceutical manufacturers.

3.Design and Operation of Renewable Energy Microgrids under uncertainty towards Green Deal and Minimum Carbon Emissions

2304.14709

Authors:Su Meyra Tatar, Erdal Aydin

Abstract: The regulations regarding the Paris Agreement are planned to be adapted soon to keep the global temperature rise within 2 °C. Additionally, integrating renewable energy-based equipment and adopting new ways of producing energy resources, for example Power to Gas (PtG) technology, becomes essential because of the current environmental and political concerns. Moreover, it is vital to supply the energy demand that grows with the increasing population. Uncertainty must be considered in the transition phase since parameters regarding the electricity demand, carbon tax policies, and renewable energy-based equipment are intermittent in nature. A multi-period two-stage stochastic MILP model is proposed in this work where the wind speed, solar irradiance, temperature, power demand, carbon emission trading (CET) price, and CO2 emission limit are considered uncertain parameters. This model finds one single optimal design for the energy grid while considering several scenarios regarding uncertainties simultaneously. Three stochastic case studies with scenarios including different combinations of the aforementioned uncertain parameters are investigated. Results show that more renewable energy-based equipment with higher rated power values is chosen as the sanctions get stricter. In addition, the optimality of PtG technology is also investigated for a specific location. Implementing the CO2 emission limit as an uncertain parameter instead of including CET price as an uncertain parameter results in lower annual CO2 emission rates and higher net present cost values. Keywords: optimal renewable energy integration, power-to-gas, two-stage stochastic programming, carbon trade, carbon price, green hydrogen

4.Controlling Microgrids Without External Data: A Benchmark of Stochastic Programming Methods

2304.14808

Authors:Alban Puech (SE), Tristan Rigaut (LAAS-POP), Adrien Le Franc (LAAS-POP), William Templier (SE), Jean-Christophe Alais (SE), Maud Tournoud (SE), Victor Bossard (SE), Alejandro Yousef (SE), Elena Stolyarova (SE)

Abstract: Microgrids are local energy systems that integrate energy production, demand, and storage units. They are generally connected to the regional grid to import electricity when local production and storage do not meet the demand. In this context, Energy Management Systems (EMS) are used to ensure the balance between supply and demand, while minimizing the electricity bill, or an environmental criterion. The main implementation challenges for an EMS come from the uncertainties in the consumption, the local renewable energy production, and in the price and the carbon intensity of electricity. Model Predictive Control (MPC) is widely used to implement EMS but is particularly sensitive to the forecast quality, and often requires a subscription to expensive third-party forecast services. We introduce four Multistage Stochastic Control Algorithms relying only on historical data obtained from on-site measurements. We formulate them under the shared framework of Multistage Stochastic Programming and benchmark them against two baselines in 61 different microgrid setups using the EMSx dataset. Our most effective algorithm produces notable cost reductions compared to an MPC that utilizes the same uncertainty model to generate predictions, and it demonstrates similar performance levels to an ideal MPC that relies on perfect forecasts.

5.A Stochastic-Gradient-based Interior-Point Algorithm for Solving Smooth Bound-Constrained Optimization Problems

2304.14907

Authors:Frank E. Curtis, Vyacheslav Kungurtsev, Daniel P. Robinson, Qi Wang

Abstract: A stochastic-gradient-based interior-point algorithm for minimizing a continuously differentiable objective function (that may be nonconvex) subject to bound constraints is presented, analyzed, and demonstrated through experimental results. The algorithm differs from other interior-point methods for solving smooth (nonconvex) optimization problems in that the search directions are computed using stochastic gradient estimates. It is also unique in its use of inner neighborhoods of the feasible region -- defined by a positive and vanishing neighborhood-parameter sequence -- in which the iterates are forced to remain. It is shown that with a careful balance between the barrier, step-size, and neighborhood sequences, the proposed algorithm satisfies convergence guarantees in both deterministic and stochastic settings. The results of numerical experiments show that in both settings the algorithm can outperform a projected-(stochastic)-gradient method.

6.Solving constrained Procrustes problems: a conic optimization approach

2304.14961

Authors:Terézia Fulová, Mária Trnovská

Abstract: Procrustes problems are matrix approximation problems searching for a transformation of the given dataset to fit another dataset. They find applications in numerous areas, such as factor and multivariate analysis, computer vision, multidimensional scaling or finance. The known methods for solving Procrustes problems have been designed to handle specific sub-classes, where the set of feasible solutions has a special structure (e.g. a Stiefel manifold), and the objective function is defined using a specific matrix norm (typically the Frobenius norm). We show that a wide class of Procrustes problems can be formulated and solved as a (rank-constrained) semi-definite program. This includes balanced and unbalanced (weighted) Procrustes problems, possibly to a partially specified target, but also oblique, projection or two-sided Procrustes problems. The proposed approach can handle additional linear, quadratic, or semi-definite constraints, and objective functions defined using the Frobenius norm as well as standard operator norms. The results are demonstrated on a set of numerical experiments and also on real applications.

Thu, 27 Apr 2023digest

1.Solving Data-Driven Newsvendor Pricing Problems with Decision-Dependent Effect

2304.13924

Authors:Wenxuan Liu, Zhihai Zhang

Abstract: This paper investigates the data-driven pricing newsvendor problem, which focuses on maximizing expected profit by deciding on inventory and pricing levels based on historical demand and feature data. We first build an approximate model by assigning weights to historical samples. However, due to decision-dependent effects, the resulting approximate model is complicated and cannot be solved directly. To address this issue, we introduce the concept of approximate gradients and design an Approximate Gradient Descent (AGD) algorithm. We analyze the convergence of the proposed algorithm in both convex and non-convex settings, which correspond to the newsvendor pricing model and its variants respectively. Finally, we perform numerical experiments on both simulated and real-world datasets to demonstrate the efficiency and effectiveness of the AGD algorithm. We find that the AGD algorithm can converge to the local maximum provided that the approximation is effective. We also illustrate the significance of two characteristics of our model: it is distribution-free and decision-dependent. Consideration of the decision-dependent effect is necessary for approximation, and the distribution-free model is preferred when there is little information on the demand distribution and how demand reacts to the pricing decision. Moreover, the proposed model and algorithm are not limited to the newsvendor problem, but can also be used for a wide range of decision-dependent problems.

2.Optimal Transmission Switching with Uncertainties from both Renewable Energy and N-k Contingencies

2304.13944

Authors:Tong Han, David J. Hill, Yue Song

Abstract: This paper focuses on the N-k security-constrained optimal transmission switching (OTS) problem for variable renewable energy (VRE) penetrated power grids. A new three-stage stochastic and distributionally robust OTS model is proposed. The first stage primarily schedules the power generation and network topology based on the forecast of VRE. The second stage controls the power generation and voltage magnitudes of voltage-controlled buses in response to VRE uncertainty, and the third stage additionally reacts to N-k contingencies by line switching and load shedding. The VRE and N-k contingencies, considering different availability of their probability distributions, are tackled by stochastic and distributionally robust optimization, respectively. By adopting stage-wise realization of uncertainties in VRE and contingencies, the associated corrective controls with different mechanisms can be handled separately and properly, which makes the proposed OTS model more realistic than existing two-stage ones. For solving the proposed OTS model, its tractable reformulation is derived, and a solution approach that combines the nested column-and-constraint generation algorithm and Dantzig-Wolfe procedure is developed. Finally, case studies include a simple IEEE network for illustrative purposes and then real system networks to demonstrate the efficacy of the proposed approach.

3.Convergence of Adam Under Relaxed Assumptions

2304.13972

Authors:Haochuan Li, Ali Jadbabaie, Alexander Rakhlin

Abstract: In this paper, we provide a rigorous proof of convergence of the Adaptive Moment Estimation (Adam) algorithm for a wide class of optimization objectives. Despite the popularity and efficiency of the Adam algorithm in training deep neural networks, its theoretical properties are not yet fully understood, and existing convergence proofs require unrealistically strong assumptions, such as globally bounded gradients, to show the convergence to stationary points. In this paper, we show that Adam provably converges to $\epsilon$-stationary points with $\mathcal{O}(\epsilon^{-4})$ gradient complexity under far more realistic conditions. The key to our analysis is a new proof of boundedness of gradients along the optimization trajectory, under a generalized smoothness assumption according to which the local smoothness (i.e., Hessian norm when it exists) is bounded by a sub-quadratic function of the gradient norm. Moreover, we propose a variance-reduced version of Adam with an accelerated gradient complexity of $\mathcal{O}(\epsilon^{-3})$.
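
For readers less familiar with the algorithm, the standard Adam update with bias correction can be sketched in a few lines of Python. This is the textbook scalar form only, not the variance-reduced variant the abstract proposes. The quadratic test objective, the step size, and the function names are illustrative assumptions.

```python
import math


def adam_step(x, grad, m, v, t, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8):
    """One scalar Adam update; t is the 1-based iteration counter."""
    m = beta1 * m + (1.0 - beta1) * grad          # first-moment estimate
    v = beta2 * v + (1.0 - beta2) * grad * grad   # second-moment estimate
    m_hat = m / (1.0 - beta1 ** t)                # bias-corrected first moment
    v_hat = v / (1.0 - beta2 ** t)                # bias-corrected second moment
    x = x - lr * m_hat / (math.sqrt(v_hat) + eps)
    return x, m, v


def minimize_quadratic(steps, x0=0.0):
    """Run Adam on f(x) = (x - 3)^2, a toy objective chosen for illustration."""
    x, m, v = x0, 0.0, 0.0
    for t in range(1, steps + 1):
        grad = 2.0 * (x - 3.0)
        x, m, v = adam_step(x, grad, m, v, t)
    return x


if __name__ == "__main__":
    print(minimize_quadratic(1), minimize_quadratic(500))
```

After bias correction, the very first step has magnitude close to `lr` regardless of the gradient scale, which is one reason analyses of Adam cannot simply reuse arguments written for plain gradient descent.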

4.Modeling the Complexity of City Logistics Systems for Sustainability

2304.13987

Authors:Taiwo Adetiloye, Anjali Awasthi

Abstract: The logistics of urban areas are becoming more sophisticated due to rapid urban population growth. Stakeholders face the challenges of the dynamic complexity of city logistics (CL) systems, characterized by the uncertainty effect together with freight vehicle emissions causing pollution. In this conceptual paper, we present a research methodology for the environmental sustainability of CL systems that can be attained by effective stakeholders' collaboration under non-chaotic situations and the presumption of the human levity tendency. We propose the mathematical axioms of the uncertainty effect while putting forward the notion of condition effectors, and how to assign hypothetical values to them. Finally, we employ a spider network and causal loop diagram to investigate the system's elements and their behavior over time.

5.Propagating Kernel Ambiguity Sets in Nonlinear Data-driven Dynamics Models

2304.14057

Authors:Jia-Jie Zhu

Abstract: This paper provides answers to an open problem: given a nonlinear data-driven dynamical system model, e.g., kernel conditional mean embedding (CME) and Koopman operator, how can one propagate the ambiguity sets forward for multiple steps? This problem is the key to solving distributionally robust control and learning-based control of such learned system models under a data-distribution shift. Different from previous works that use either static ambiguity sets, e.g., fixed Wasserstein balls, or dynamic ambiguity sets under known piece-wise linear (or affine) dynamics, we propose an algorithm that exactly propagates ambiguity sets through nonlinear data-driven models using the Koopman operator and CME, via the kernel maximum mean discrepancy geometry. Through both theoretical and numerical analysis, we show that our kernel ambiguity sets are the natural geometric structure for the learned data-driven dynamical system models.

6.How to avoid ordinal violations in incomplete pairwise comparisons

2304.14111

Authors:László Csató

Abstract: Assume that some ordinal preferences can be represented by a weakly connected directed acyclic graph. The data are collected into an incomplete pairwise comparison matrix, the missing entries are estimated, and the priorities are derived from the optimally filled pairwise comparison matrix. Our paper studies whether these weights are consistent with the partial order given by the underlying graph. According to previous results from the literature, two popular procedures, the incomplete eigenvector and the incomplete logarithmic least squares methods, fail to satisfy the required property. Here, it is shown that the recently introduced lexicographically optimal completion combined with any of these weighting methods avoids ordinal violation in the above setting. This finding provides a powerful argument for using the lexicographically optimal completion to determine the missing elements in an incomplete pairwise comparison matrix.

7.Detection of a very serious error in the paper: "On identifiability of nonlinear ODE models and applications in viral dynamics"

2304.14288

Authors:Agostino Martinelli

Abstract: This erratum highlights a very serious error in a paper published by SIAM Review in 2011. The error is in Section 6.2 of [1]. It is very important to report this error for the following two reasons: (i) [1] is one of the most cited contributions in the field of identifiability of viral dynamics models, and (ii) the error is relevant because, as a result of it, a very popular viral model (perhaps the most popular in the field of HIV dynamics) has been classified as identifiable. In contrast, three of its parameters are not identifiable, even locally. This erratum first proves the non-uniqueness of the three unidentifiable parameters by exhibiting infinitely many distinct but indistinguishable values of them. The non-uniqueness is even local. Then, this erratum details the error made by the authors of [1] which produced the claimed (but false) local identifiability of all the model parameters.

Wed, 26 Apr 2023digest

1.Nonsmooth nonconvex stochastic heavy ball

2304.13328

Authors:Tam Le TSE-R

Abstract: Motivated by the conspicuous use of momentum-based algorithms in deep learning, we study a nonsmooth nonconvex stochastic heavy ball method and show its convergence. Our approach relies on semialgebraic assumptions, commonly met in practical situations, which allow us to combine a conservative calculus with nonsmooth ODE methods. In particular, we can justify the use of subgradient sampling in practical implementations that employ backpropagation or implicit differentiation. Additionally, we provide general conditions for the sample distribution to ensure the convergence of the objective function. As for the stochastic subgradient method, our analysis highlights that subgradient sampling can make the stochastic heavy ball method converge to artificial critical points. We address this concern by showing that these artifacts are almost surely avoided when initializations are randomized.
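
To make the recursion concrete, the following is a minimal sketch of a heavy ball iteration driven by noisy subgradient samples. It is not the authors' algorithm or setting: the objective f(x) = |x|, the step schedule 0.5/(k+1), the momentum 0.5, the Gaussian noise level, and the function names are all assumptions chosen for illustration.

```python
import random


def subgradient_abs(x):
    """Return one element of the subdifferential of f(x) = |x|."""
    if x > 0:
        return 1.0
    if x < 0:
        return -1.0
    return 0.0  # any value in [-1, 1] is a valid choice at the kink


def stochastic_heavy_ball(x0, iters=2000, beta=0.5, noise_std=0.1, seed=0):
    """Heavy ball with noisy subgradients on f(x) = |x| (illustrative only)."""
    rng = random.Random(seed)
    x, velocity = x0, 0.0
    for k in range(iters):
        step = 0.5 / (k + 1)  # vanishing step size (assumed schedule)
        # Noisy subgradient sample: true subgradient plus zero-mean noise.
        g = subgradient_abs(x) + rng.gauss(0.0, noise_std)
        velocity = beta * velocity - step * g  # momentum update
        x = x + velocity
    return x


x_final = stochastic_heavy_ball(2.0)
print(abs(x_final))  # distance to the minimizer x = 0
```

The momentum term reuses a fraction beta of the previous displacement, which is the only difference from a plain stochastic subgradient step.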

2.IML FISTA: Inexact MuLtilevel FISTA for Image Restoration

2304.13329

Authors:Guillaume Lauga OCKHAM, Elisa Riccietti OCKHAM, Nelly Pustelnik Phys-ENS, Paulo Gonçalves OCKHAM

Abstract: This paper presents IML FISTA, a multilevel inertial and inexact forward-backward algorithm, based on the use of the Moreau envelope to build efficient and useful coarse corrections. Such construction is provided for a broad class of composite optimization problems with proximable functions. This approach is supported by strong theoretical guarantees: we prove both the rate of convergence and the convergence of the iterates to a minimum in the convex case, an important result for ill-posed problems. We evaluate our approach on several image reconstruction problems and we show that it considerably accelerates the convergence of classical methods such as FISTA, for large-scale images.
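
As a small illustration of the Moreau envelope mentioned above, the sketch below evaluates the envelope of the absolute value through its proximal operator (soft-thresholding) and compares it with the Huber function, which is its known closed form. The function names and the choice of |x| as the nonsmooth term are assumptions; this is not the coarse-correction construction of the paper.

```python
def prox_abs(x, lam):
    """Proximal operator of lam * |.|, i.e. soft-thresholding."""
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0


def moreau_envelope_abs(x, lam):
    """Moreau envelope of |.| with parameter lam, evaluated via the prox."""
    p = prox_abs(x, lam)
    return abs(p) + (x - p) ** 2 / (2.0 * lam)


def huber(x, lam):
    """Closed form of the same envelope: quadratic near 0, linear far away."""
    if abs(x) <= lam:
        return x * x / (2.0 * lam)
    return abs(x) - lam / 2.0


for x in (-3.0, -0.5, 0.0, 0.5, 3.0):
    print(x, moreau_envelope_abs(x, 1.0), huber(x, 1.0))
```

The envelope is differentiable even though |x| is not, which is what makes it a convenient smooth surrogate.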

3.Semiconcavity for the value function of a minimum time problem with time delay

2304.13569

Authors:Elisa Continelli, Cristina Pignotti

Abstract: In this paper, we deal with a minimum time problem in the presence of a time delay $\tau$. The value function of the considered optimal control problem is no longer defined in a subset of $\mathbb{R}^{n}$, as it happens in the undelayed case, but its domain is a subset of the Banach space $C([-\tau,0];\mathbb{R}^{n})$. For the undelayed minimum time problem, it is known that the value function associated with it is semiconcave in a subset of the reachable set and is a viscosity solution of a suitable Hamilton-Jacobi-Bellman equation. The Hamilton-Jacobi theory for optimal control problems involving time delays has been developed by several authors. Here, we are rather interested in investigating the regularity properties of the minimum time functional. Extending classical arguments, we are able to prove that the minimum time functional is semiconcave in a suitable subset of the reachable set.

4.Polynomial-Time Solvers for the Discrete $\infty$-Optimal Transport Problems

2304.13467

Authors:Meyer Scetbon

Abstract: In this note, we propose polynomial-time algorithms solving the Monge and Kantorovich formulations of the $\infty$-optimal transport problem in the discrete and finite setting. It is the first time, to the best of our knowledge, that efficient numerical methods for these problems have been proposed.

5.Data-driven Piecewise Affine Decision Rules for Stochastic Programming with Covariate Information

2304.13646

Authors:Yiyang Zhang, Junyi Liu, Xiaobo Zhao

Abstract: Focusing on stochastic programming (SP) with covariate information, this paper proposes an empirical risk minimization (ERM) method embedded within a nonconvex piecewise affine decision rule (PADR), which aims to learn the direct mapping from features to optimal decisions. We establish the nonasymptotic consistency result of our PADR-based ERM model for unconstrained problems and asymptotic consistency result for constrained ones. To solve the nonconvex and nondifferentiable ERM problem, we develop an enhanced stochastic majorization-minimization algorithm and establish the asymptotic convergence to (composite strong) directional stationarity along with complexity analysis. We show that the proposed PADR-based ERM method applies to a broad class of nonconvex SP problems with theoretical consistency guarantees and computational tractability. Our numerical study demonstrates the superior performance of PADR-based ERM methods compared to state-of-the-art approaches under various settings, with significantly lower costs, less computation time, and robustness to feature dimensions and nonlinearity of the underlying dependency.

6.Optimal control of a class of semilinear fractional elliptic equations

2304.13853

Authors:Cyrille Kenne, Gisèle Mophou, Mahamadi Warma

Abstract: In this paper, a class of semilinear fractional elliptic equations associated to the spectral fractional Dirichlet Laplace operator is considered. We establish the existence of optimal solutions as well as a minimum principle of Pontryagin type and the first order necessary optimality conditions of associated optimal control problems. Second order conditions for optimality are also obtained for $L^{\infty}$ and $L^2-$ local solutions under some structural assumptions.

7.A minimal face constant rank constraint qualification for reducible conic programming

2304.13881

Authors:Roberto Andreani, Gabriel Haeser, Leonardo M. Mito, Héctor Ramírez

Abstract: In a previous paper [R. Andreani, G. Haeser, L. M. Mito, H. Ramírez, T. P. Silveira. First- and second-order optimality conditions for second-order cone and semidefinite programming under a constant rank condition. Mathematical Programming, 2023. DOI: 10.1007/s10107-023-01942-8] we introduced a constant rank constraint qualification for nonlinear semidefinite and second-order cone programming by considering all faces of the underlying cone. This condition is independent of Robinson's condition and it implies a strong second-order necessary optimality condition which depends on a single Lagrange multiplier instead of the full set of Lagrange multipliers. In this paper we expand on this result in several directions, namely, we consider the larger class of $\mathcal{C}^2-$cone reducible constraints and we show that it is not necessary to consider all faces of the cone; instead a single specific face should be considered (which turns out to be weaker than Robinson's condition) in order for the first order necessary optimality condition to hold. This gives rise to a notion of facial reduction for nonlinear conic programming, that allows locally redefining the original problem only in terms of this specific face instead of the whole cone, providing a more robust formulation of the problem in which Robinson's condition holds. We were also able to prove the strong second-order necessary optimality condition in this context by considering only the subfaces of this particular face, which is a new result even in nonlinear programming.

Tue, 25 Apr 2023digest

1.The logarithmic least squares priorities and ordinal violations in the best-worst method

2304.12626

Authors:László Csató

Abstract: The best-worst method is an increasingly popular approach to solving multi-criteria decision-making problems. Recently, the logarithmic least squares method has been proposed to derive priorities in the best-worst method since it has favourable properties such as the uniqueness of the weights, simple calculation, and the consideration of indirect comparisons. The current paper gives a sufficient condition for the logarithmic least squares method to guarantee the lack of ordinal violations in the best-worst method. It implies that the logarithmic least squares priorities are consistent with the preference order given by the decision-maker if the best alternative is at least two times more desirable than the others and the worst alternative is at least two times less desirable than the others, furthermore, the number of alternatives exceeds three. Our result provides another argument for using the logarithmic least squares priorities in the best-worst method.
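
For orientation, here is a minimal sketch of logarithmic least squares priorities in the simplest case of a complete pairwise comparison matrix, where they are known to equal the normalized row geometric means. The best-worst method studied in the paper uses only a subset of the comparisons, which this sketch does not model; the function name and the example weights are assumptions.

```python
import math


def llsm_weights(matrix):
    """Normalized row geometric means of a complete pairwise comparison matrix."""
    n = len(matrix)
    geo_means = [math.exp(sum(math.log(a) for a in row) / n) for row in matrix]
    total = sum(geo_means)
    return [g / total for g in geo_means]


# A perfectly consistent matrix a_ij = w_i / w_j built from known weights.
true_w = [0.5, 0.3, 0.2]
consistent = [[wi / wj for wj in true_w] for wi in true_w]

weights = llsm_weights(consistent)
print(weights)  # recovers true_w up to rounding error
```

On a consistent matrix the recovered weights reproduce the original ranking, so no ordinal violation can occur in this idealized case.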

2.A Moment-SOS Hierarchy for Robust Polynomial Matrix Inequality Optimization with SOS-Convexity

2304.12628

Authors:Feng Guo, Jie Wang

Abstract: We study a class of polynomial optimization problems with a robust polynomial matrix inequality constraint for which the uncertainty set is defined also by a polynomial matrix inequality (including robust polynomial semidefinite programs as a special case). Under certain SOS-convexity assumptions, we construct a hierarchy of moment-SOS relaxations for this problem to obtain convergent upper bounds of the optimal value by solving a sequence of semidefinite programs. To this end, we apply the Positivstellensatz for polynomial matrices and its dual matrix-valued moment theory to a conic reformulation of the problem. Most of the nice features of the moment-SOS hierarchy for the scalar polynomial optimization are generalized to the matrix case. In particular, the finite convergence of the hierarchy can be also certified if the flat extension condition holds. To extract global minimizers in this case, we develop a linear algebra approach to recover the representing matrix-valued measure for the corresponding truncated matrix-valued moment problem. As an application, we use this hierarchy to solve the problem of minimizing the smallest eigenvalue of a polynomial matrix subject to a polynomial matrix inequality. Finally, if SOS-convexity is replaced by convexity, we can still approximate the optimal value as closely as desired by solving a sequence of semidefinite programs, and certify global optimality in case that certain flat extension conditions hold true.

3.Minimal-time geodesics on a homogeneous spaces of the 2D Lie group

2304.12754

Authors:Victor Ayala, Adriano Da Silva, Maria Torreblanca

Abstract: Our main concern is to continue developing the theory of Linear Control Systems on homogeneous spaces of connected Lie groups. In this manuscript we solve a specific issue for this class of systems: time-optimal problem on a cylinder, a homogeneous space of the solvable Lie group of dimension two.

4.Faster than Fast: Accelerating the Griffin-Lim Algorithm

2304.12905

Authors:Rossen Nenov, Dang-Khoa Nguyen, Peter Balazs

Abstract: The phase retrieval problem is found in various areas of applications of engineering and applied physics. It is also a very active field of research in mathematics, signal processing and machine learning. In this paper, we present an accelerated version of the well known Fast Griffin-Lim algorithm (FGLA) for the phase retrieval problem in a general setting. It has increased the speed of convergence, and most importantly, the limit points of the generated sequence can reach a significantly smaller error than the ones generated by FGLA. We will give a motivation of the acceleration and compare it numerically to its predecessors and other algorithms typically used to solve similar problems.

5.Asymptotic Behaviors and Phase Transitions in Projected Stochastic Approximation: A Jump Diffusion Approach

2304.12953

Authors:Jiadong Liang, Yuze Han, Xiang Li, Zhihua Zhang

Abstract: In this paper we consider linearly constrained optimization problems and propose a loopless projection stochastic approximation (LPSA) algorithm. It performs the projection with probability $p_n$ at the $n$-th iteration to ensure feasibility. Considering a specific family of the probability $p_n$ and step size $\eta_n$, we analyze our algorithm from an asymptotic and continuous perspective. Using a novel jump diffusion approximation, we show that the trajectories connecting those properly rescaled last iterates weakly converge to the solution of specific stochastic differential equations (SDEs). By analyzing SDEs, we identify the asymptotic behaviors of LPSA for different choices of $(p_n, \eta_n)$. We find that the algorithm presents an intriguing asymptotic bias-variance trade-off and yields phase transition phenomena, according to the relative magnitude of $p_n$ w.r.t. $\eta_n$. This finding provides insights on selecting appropriate $\{(p_n, \eta_n)\}_{n \geq 1}$ to minimize the projection cost. Additionally, we propose the Debiased LPSA (DLPSA) as a practical application of our jump diffusion approximation result. DLPSA is shown to effectively reduce projection complexity compared to vanilla LPSA.
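
The idea of projecting only with probability $p_n$ can be sketched on a toy problem as follows. Everything specific here is an assumption made for illustration and is not taken from the paper: the quadratic objective, the constraint x[0] + x[1] = 0, the schedules eta_n = 1/n and p_n = min(1, n^(-1/4)), and the function names.

```python
import math
import random


def project_onto_line(x):
    """Euclidean projection onto the feasible set {x : x[0] + x[1] = 0}."""
    shift = (x[0] + x[1]) / 2.0
    return [x[0] - shift, x[1] - shift]


def lpsa_sketch(iters=5000, noise_std=0.1, seed=0):
    """Stochastic gradient steps with a randomly skipped projection."""
    rng = random.Random(seed)
    target = [2.0, 0.0]  # objective: 0.5 * ||x - target||^2
    x = [0.0, 0.0]
    for n in range(1, iters + 1):
        eta = 1.0 / n              # step size eta_n (assumed schedule)
        p = min(1.0, n ** -0.25)   # projection probability p_n (assumed)
        grad = [x[i] - target[i] + rng.gauss(0.0, noise_std) for i in range(2)]
        x = [x[i] - eta * grad[i] for i in range(2)]
        if rng.random() < p:       # project only occasionally
            x = project_onto_line(x)
    return x


x_last = lpsa_sketch()
# The constrained minimizer is the projection of target, i.e. (1, -1).
print(x_last, math.hypot(x_last[0] - 1.0, x_last[1] + 1.0))
```

Between projections the iterate drifts off the feasible set toward the unconstrained minimizer, which is the source of the bias that the choice of $p_n$ relative to $\eta_n$ controls.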

6.Delayed impulsive stabilisation of discrete-time systems: a periodic event-triggering algorithm

2304.12976

Authors:Kexue Zhang, Elena Braverman

Abstract: This paper studies the problem of event-triggered impulsive control for discrete-time systems. A novel periodic event-triggering scheme with two tunable parameters is presented to determine the moments of updating impulsive control signals which are called event times. Sufficient conditions are established to guarantee asymptotic stability of the resulting impulsive systems. It is worth mentioning that the event times are different from the impulse times, that is, the control signals are updated at each event time but the actuator performs the impulsive control tasks at a later time due to time delays. The effectiveness of our theoretical result with the proposed scheme is illustrated by three examples.

7.Deep Reinforcement Learning in Finite-Horizon to Explore the Most Probable Transition Pathway

2304.12994

Authors:Jin Guo, Ting Gao, Peng Zhang, Jinqiao Duan

Abstract: This investigation focuses on discovering the most probable transition pathway for stochastic dynamical systems employing reinforcement learning. We first utilize Onsager-Machlup theory to quantify rare events in stochastic dynamical systems, and then convert the most likely transition path issue into a finite-horizon optimal control problem, because, in many instances, the transition path cannot be determined explicitly by variation. We propose the terminal prediction method, integrate it with reinforcement learning, and develop our algorithm, Finite Horizon Deep Deterministic Policy Gradient (FH-DDPG), to deal with the finite-horizon optimal control issue. Next, we present the convergence analysis of the algorithm for the value function estimation in terms of the neural network approximation error and the sampling error when estimating the network. Finally, experiments are conducted for the transition problem under Gaussian noise to verify the effectiveness of the algorithm.

Mon, 24 Apr 2023digest

1.Optimal Investment-Consumption-Insurance with Partial Information and Correlation Between Assets Price and Factor Process

2304.11825

Authors:Woundjiagué Apollinaire, Rodwell Kufakunesu, Julius Esunge

Abstract: In this research, we present an analysis of the optimal investment, consumption, and life insurance acquisition problem for a wage earner with partial information. Our study considers the non-linear filter case where risky asset prices are correlated to the factor processes under constant relative risk aversion (CRRA) preferences. We introduce a more general framework with an incomplete market, random parameters adapted to the Brownian motion filtration, and a general factor process with a non-linear state estimation and a correlation between the state process (risky asset prices) and the factor process. To address the wage earner's problem, we formulate it as a stochastic control problem with partial information where the risky assets prices are correlated to the factor processes. Our framework is extensive since the non-linear filter applied to the linear case gives a more robust result than the Kalman filter. We obtain the non-linear filter through the Zakai equation and derive a system of the Hamilton-Jacobi-Bellman (HJB) equation and two backward stochastic differential equations (BSDE). We establish the existence and uniqueness of the solution, prove the verification theorem, and construct the optimal strategy.

2.Stochastic Approximation for Nonlinear Discrete Stochastic Control: Finite-Sample Bounds

2304.11854

Authors:Hoang Huy Nguyen, Siva Theja Maguluri

Abstract: We consider a nonlinear discrete stochastic control system, and our goal is to design a feedback control policy in order to lead the system to a prespecified state. We adopt a stochastic approximation viewpoint of this problem. It is known that by solving the corresponding continuous-time deterministic system, and using the resulting feedback control policy, one ensures almost sure convergence to the prespecified state in the discrete system. In this paper, we adopt such a control mechanism and provide its finite-sample convergence bounds whenever a Lyapunov function is known for the continuous system. In particular, we consider four cases based on whether the Lyapunov function for the continuous system gives exponential or sub-exponential rates and based on whether it is smooth or not. We provide the finite-time bounds in all cases. Our proof relies on constructing a Lyapunov function for the discrete system based on the given Lyapunov function for the continuous system. We do this by appropriately smoothing the given function using the Moreau envelope. We present numerical experiments corresponding to the various cases, which validate the rates we establish.

3.On the Viability and Invariance of Proper Sets under Continuity Inclusions in Wasserstein Spaces

2304.11945

Authors:Benoît Bonnet-Weill, Hélène Frankowska

Abstract: In this article, we derive necessary and sufficient conditions for the existence of solutions to state-constrained continuity inclusions in Wasserstein spaces whose right-hand sides may be discontinuous in time. These latter are based on fine investigations of the infinitesimal behaviour of the underlying reachable sets, through which we show that up to a negligible set of times, every admissible velocity of the inclusion can be approximately realised as the metric derivative of a solution of the dynamics, and vice versa. Building on these results, we are able to establish necessary and sufficient geometric conditions for the viability and invariance of stationary and time-dependent constraints, which involve a suitable notion of contingent cones in Wasserstein spaces, presented in ascending order of generality. We then close the article by exhibiting two prototypical examples of constraints sets appearing in applications for which one can compute relevant subfamilies of contingent directions.

4.On sparse solution of tensor complementarity problem

2304.11986

Authors:R. Deb, A. K. Das

Abstract: In this article we consider the sparse solutions of the tensor complementarity problem (TCP) which are the solutions of the smallest cardinality. We establish a connection between the least element of the feasible solution set of TCP and sparse solution for $Z$-tensor. We propose a $p$ norm regularized minimization model when $p\in (0,1)$ and show that it can approximate sparse solution applying the regularization of parameter. Keywords: Tensor complementarity problem, sparse solution, $l_p$ regularized minimization, $Z$-tensor.

5.A descent method for nonsmooth multiobjective optimization problems on Riemannian manifolds

2304.11990

Authors:Chunming Tang, Hao He, Jinbao Jian, Miantao Chao

Abstract: In this paper, a descent method for nonsmooth multiobjective optimization problems on complete Riemannian manifolds is proposed. The objective functions are only assumed to be locally Lipschitz continuous instead of convexity used in existing methods. A necessary condition for Pareto optimality in Euclidean space is generalized to the Riemannian setting. At every iteration, an acceptable descent direction is obtained by constructing a convex hull of some Riemannian $\varepsilon$-subgradients. And then a Riemannian Armijo-type line search is executed to produce the next iterate. The convergence result is established in the sense that a point satisfying the necessary condition for Pareto optimality can be generated by the algorithm in a finite number of iterations. Finally, some preliminary numerical results are reported, which show that the proposed method is efficient.

6.Designing Optimal Personalized Incentive for Traffic Routing using BIG Hype algorithm

2304.12004

Authors:Panagiotis D. Grontas, Carlo Cenedese, Marta Fochesato, Giuseppe Belgioioso, John Lygeros, Florian Dörfler

Abstract: We study the problem of optimally routing plug-in electric and conventional fuel vehicles on a city level. In our model, commuters selfishly aim to minimize a local cost that combines travel time, from a fixed origin to a desired destination, and the monetary cost of using city facilities, parking or service stations. The traffic authority can influence the commuters' preferred routing choice by means of personalized discounts on parking tickets and on the energy price at service stations. We formalize the problem of designing these monetary incentives optimally as a large-scale bilevel game, where constraints arise at both levels due to the finite capacities of city facilities and incentives budget. Then, we develop an efficient decentralized solution scheme with convergence guarantees based on BIG Hype, a recently-proposed hypergradient-based algorithm for hierarchical games. Finally, we validate our model via numerical simulations over the Anaheim network, and show that the proposed approach produces sensible results in terms of traffic decongestion and is able to solve, in minutes, problems with more than 48000 variables and 110000 constraints.

7.Risk in Stochastic and Robust Model Predictive Path-Following Control for Vehicular Motion Planning

2304.12063

Authors:Leon Tolksdorf, Arturo Tejada, Nathan van de Wouw, Christian Birkner

Abstract: In automated driving, risk describes potential harm to passengers of an autonomous vehicle (AV) and other road users. Recent studies suggest that human-like driving behavior emerges from embedding risk in AV motion planning algorithms. Additionally, providing evidence that risk is minimized during the AV operation is essential to vehicle safety certification. However, there has yet to be a consensus on how to define and operationalize risk in motion planning or how to bound or minimize it during operation. In this paper, we define a stochastic risk measure and introduce it as a constraint into both robust and stochastic nonlinear model predictive path-following controllers (RMPC and SMPC respectively). We compare the vehicle's behavior arising from employing SMPC and RMPC with respect to safety and path-following performance. Further, the implementation of an automated driving example is provided, showcasing the effects of different risk tolerances and uncertainty growths in predictions of other road users for both cases. We find that the RMPC is significantly more conservative than the SMPC, while also displaying greater following errors towards references. Further, the RMPC's behavior cannot be considered as human-like. Moreover, unlike SMPC, the RMPC cannot account for different risk tolerances. The RMPC generates undesired driving behavior for even moderate uncertainties, which are handled better by the SMPC.

8.A closed-loop design for scalable high-order consensus

2304.12064

Authors:Jonas Hansson, Emma Tegling

Abstract: This paper studies the problem of coordinating a group of $n$th-order integrator systems. As for the well-studied conventional consensus problem, we consider linear and distributed control with only local and relative measurements. We propose a closed-loop dynamic that we call serial consensus and prove it achieves $n$th order consensus regardless of model order and underlying network graph. This alleviates an important scalability limitation in conventional consensus dynamics of order $n\ge 2$, whereby they may lose stability if the underlying network grows. The distributed control law which achieves the desired closed loop dynamics is shown to be localized and obey the limitation to relative state measurements. Furthermore, through use of the small-gain theorem, the serial consensus system is shown to be robust to both model and feedback uncertainties. We illustrate the theoretical results through examples.

9.Regularity results and optimal velocity control of the convective nonlocal Cahn-Hilliard equation in 3D

2304.12074

Authors:Andrea Poiatti, Andrea Signori

Abstract: In this contribution, we study an optimal control problem for the celebrated nonlocal Cahn-Hilliard equation endowed with the singular Flory-Huggins potential in the three-dimensional setting. The control enters the governing state system in a nonlinear fashion in the form of a prescribed solenoidal, that is a divergence-free, vector field, whereas the cost functional to be minimized is of tracking-type. The novelties of the present paper are twofold: in addition to the control application, the intrinsic difficulties of the optimization problem forced us to first establish new regularity results on the nonlocal Cahn-Hilliard equation that were unknown even without the coupling with a velocity field and are therefore of independent interest. This happens to be shown using the recently proved separation property along with ad hoc Hölder regularities and a bootstrap method. For the control problem, the existence of an optimal strategy as well as first-order necessary conditions are then established.

10.Wasserstein Tube MPC with Exact Uncertainty Propagation

2304.12093

Authors:Liviu Aolaritei, Marta Fochesato, John Lygeros, Florian Dörfler

Abstract: We study model predictive control (MPC) problems for stochastic LTI systems, where the noise distribution is unknown, compactly supported, and only observable through a limited number of i.i.d. noise samples. Building upon recent results in the literature, which show that distributional uncertainty can be efficiently captured within a Wasserstein ambiguity set, and that such ambiguity sets propagate exactly through the system dynamics, we start by formulating a novel Wasserstein Tube MPC (WT-MPC) problem, with distributionally robust CVaR constraints. We then show that the WT-MPC problem: (1) is a direct generalization of the (deterministic) Robust Tube MPC (RT-MPC) to the stochastic setting; (2) through a scalar parameter, it interpolates between the data-driven formulation based on sample average approximation and the RT-MPC formulation, allowing us to optimally trade between safety and performance; (3) admits a tractable convex reformulation; and (4) is recursively feasible. We conclude the paper with a numerical comparison of WT-MPC and RT-MPC.

11.Review of ensemble gradients for robust optimisation

2304.12136

Authors:Patrick N. Raanes, Andreas S. Stordal, Rolf J. Lorentzen

Abstract: In robust optimisation problems the objective function consists of an average over (an ensemble of) uncertain parameters. Ensemble optimisation (EnOpt) implements steepest descent by estimating the gradient using linear regression on Monte-Carlo simulations of (an ensemble of) control parameters. Applying EnOpt for robust optimisation is costly unless the evaluations over the two ensembles are combined, i.e. 'paired'. Here, we provide a new and more rigorous perspective on the stochastic simplex approximate gradient (StoSAG) used in EnOpt, explaining how it addresses detrimental cross-correlations arising from pairing by only capturing the variability due to the control vector, and not the vector of uncertain parameters. A few minor variants are derived from a generalised derivation, as well as a new approach using decorrelation. These variants are tested on linear and non-linear toy gradient estimation problems, where they achieve highly similar accuracy, but require a very large ensemble size to outperform the non-robust approach when accounting for variance and not just bias. Other original contributions include a discussion of the particular robust control objectives for which EnOpt is suited, illustrations, a variance reduction perspective, and a discussion on the centring in covariance and gradient estimation.
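
The regression-based gradient estimate behind EnOpt can be sketched in one dimension as follows. This is a deliberately reduced illustration: it uses a single control ensemble and a deterministic quadratic objective, so it omits the ensemble of uncertain parameters, the pairing, and the StoSAG variants discussed in the paper. The objective, sample size, and function names are assumptions.

```python
import random


def objective(u):
    """Toy objective with exact derivative 2 * (u - 3)."""
    return (u - 3.0) ** 2


def ensemble_gradient(u0, n_samples=200, std=0.1, seed=0):
    """Estimate dJ/du at u0 as the least-squares slope over a control ensemble."""
    rng = random.Random(seed)
    controls = [u0 + rng.gauss(0.0, std) for _ in range(n_samples)]
    values = [objective(u) for u in controls]
    mean_u = sum(controls) / n_samples
    mean_j = sum(values) / n_samples
    # Slope of the linear regression of J on u: cov(u, J) / var(u).
    cov_uj = sum((u - mean_u) * (j - mean_j) for u, j in zip(controls, values))
    var_u = sum((u - mean_u) ** 2 for u in controls)
    return cov_uj / var_u


print(ensemble_gradient(1.0))  # close to the exact derivative -4
```

No derivative of the objective is ever evaluated; the slope is recovered purely from sampled function values, which is what makes the approach attractive for black-box simulators.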

12.Strengthening SONC Relaxations with Constraints Derived from Variable Bounds

2304.12145

Authors:Ksenia Bestuzheva, Helena Völker, Ambros Gleixner

Abstract: Nonnegativity certificates can be used to obtain tight dual bounds for polynomial optimization problems. Hierarchies of certificate-based relaxations ensure convergence to the global optimum, but higher levels of such hierarchies can become very computationally expensive, and the well-known sums of squares hierarchies scale poorly with the degree of the polynomials. This has motivated research into alternative certificates and approaches to global optimization. We consider sums of nonnegative circuit polynomials (SONC) certificates, which are well-suited for sparse problems since the computational cost depends on the number of terms in the polynomials and does not depend on the degrees of the polynomials. We propose a method that guarantees that given finite variable domains, a SONC relaxation will yield a finite dual bound. This method opens up a new approach to utilizing variable bounds in SONC-based methods, which is particularly crucial for integrating SONC relaxations into branch-and-bound algorithms. We report on computational experiments with incorporating SONC relaxations into the spatial branch-and-bound algorithm of the mixed-integer nonlinear programming framework SCIP. Applying our strengthening method increases the number of instances where the SONC relaxation of the root node yielded a finite dual bound from 9 to 330 out of 349 instances in the test set.

13.Fuglede-type arguments for isoperimetric problems and applications to stability among convex shapes

2304.12157

Authors:Raphaël Prunier

Abstract: This paper is concerned with stability of the ball for a class of isoperimetric problems under convexity constraint. Considering the problem of minimizing $P+\varepsilon R$ among convex subsets of $\mathbb{R}^N$ of fixed volume, where $P$ is the perimeter functional, $R$ is a perturbative term and $\varepsilon>0$ is a small parameter, stability of the ball for this perturbed isoperimetric problem means that the ball is the unique (local, up to translation) minimizer for any $\varepsilon$ sufficiently small. We investigate independently two specific cases where $\Omega\mapsto R(\Omega)$ is an energy arising from PDE theory, namely the capacity and the first Dirichlet eigenvalue of a domain $\Omega\subset\mathbb{R}^N$. While in both cases stability fails among all shapes, in the first case we prove (non-sharp) stability of the ball among convex shapes, by building an appropriate competitor for the capacity of a perturbation of the ball. In the second case we prove sharp stability of the ball among convex shapes by providing the optimal range of $\varepsilon$ such that stability holds, relying on the selection principle technique and a regularity theory under convexity constraint.

14.Approximation of Optimal Control Surfaces for the Bass Model with Stochastic Dynamics

2304.12265

Authors:Gabriel Nicolosi, Christopher Griffin

Abstract: The Bass diffusion equation is a well-known and established modeling approach for describing new product adoption in a competitive market. This model also describes diffusion phenomena in various contexts: infectious disease spread modeling and estimation, rumor spread on social networks, prediction of renewable energy technology markets, among others. Most of these models, however, consider a deterministic trajectory of the associated state variable (e.g., market-share). In reality, the diffusion process is subject to noise, and a stochastic component must be added to the state dynamics. The stochastic Bass model has also been studied in many areas, such as energy markets and marketing. Exploring the stochastic version of the Bass diffusion model, we propose in this work an approximation of (stochastic) optimal control surfaces for a continuous-time problem arising from a $2\times2$ skew symmetric evolutionary game, providing the stochastic counter-part of the Fourier-based optimal control approximation already existent in the literature.

15.Is a sophisticated agent a wise one?

2304.12273

Authors:Jianfeng Zhang

Abstract: For time inconsistent optimal control problems, a quite popular approach is the equilibrium approach, taken by the sophisticated agents. In this short note we construct a deterministic continuous time example where the unique equilibrium is dominated by another control. Therefore, in this situation it may not be wise to take the equilibrium strategy.

Fri, 21 Apr 2023digest

1.A scalable solution for the extended multi-channel facility location problem

2304.10799

Authors:Etika Agarwal, Karthik S. Gurumoorthy, Ankit Ajit Jain, Shantala Manchenahally

Abstract: We study the extended version of the non-uniform, capacitated facility location problem with multiple fulfilment channels between the facilities and clients, each with their own channel capacities and service cost. Though the problem has been extensively studied in the literature, all the prior works assume a single channel of fulfilment, and the existing methods based on linear programming, primal-dual relationships, local search heuristics etc. do not scale for a large supply chain system involving millions of decision variables. Using the concepts of sub-modularity and optimal transport theory, we present a scalable algorithm for determining the set of facilities to be opened under a cardinality constraint. By introducing various schemes such as: (i) iterative facility selection using incremental gain, (ii) approximation of the linear program using novel multi-stage Sinkhorn iterations, (iii) creation of facilities one for each fulfilment channel etc., we develop a fast but tight approximate solution, requiring $\mathcal{O}\left(\frac{3+k}{m}\ln\left(\frac{1}{\epsilon}\right)\right)$ instances of optimal transport problems to select $k$ facilities from $m$ options, each solvable in linear time. Our algorithm is implicitly endowed with all the theoretical guarantees enjoyed by submodular maximisation problems and the Sinkhorn distances. When compared against the state-of-the-art commercial MILP solvers, we obtain a 100-fold speedup in computation, while the difference in objective values lies within a narrow range of 3%.
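As a rough illustration of the Sinkhorn iterations mentioned in the abstract above, the following is a minimal pure-Python sketch of the standard entropic optimal transport scaling loop. It is not the multi-stage variant the authors describe; the function name `sinkhorn`, the regularization value `reg`, and the toy cost matrix are assumptions chosen only for this example.

```python
import math

def sinkhorn(a, b, cost, reg=0.5, iters=500):
    """Textbook Sinkhorn scaling for entropic optimal transport (illustrative only).

    a, b  : source and target marginals (lists summing to 1)
    cost  : cost[i][j] of moving mass from source i to target j
    reg   : entropic regularization strength (hypothetical default)
    """
    n, m = len(a), len(b)
    # Gibbs kernel K = exp(-cost / reg).
    K = [[math.exp(-cost[i][j] / reg) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(iters):
        # Alternately rescale rows and columns to match the marginals.
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    # Transport plan P = diag(u) K diag(v).
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]

# Toy instance: two sources, two targets.
plan = sinkhorn([0.5, 0.5], [0.25, 0.75], [[0.0, 1.0], [1.0, 0.0]])
```

Each row of `plan` should sum approximately to the corresponding entry of `a`, and each column to the corresponding entry of `b`.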

2.An extended Merton problem with relaxed benchmark tracking

2304.10802

Authors:Lijun Bo, Yijie Huang, Xiang Yu

Abstract: This paper studies a Merton's optimal portfolio and consumption problem in an extended formulation incorporating the tracking of a benchmark process described by a geometric Brownian motion. We consider a relaxed tracking formulation such that the wealth process compensated by a fictitious capital injection outperforms the external benchmark at all times. The fund manager aims to maximize the expected utility of consumption deducted by the cost of the capital injection, where the latter term can also be regarded as the expected largest shortfall with reference to the benchmark. By introducing an auxiliary state process with reflection, we formulate and tackle an equivalent stochastic control problem by means of the dual transform and probabilistic representation, where the dual PDE can be solved explicitly. On the strength of the closed-form results, we can derive and verify the feedback optimal control in the semi-analytical form for the primal control problem, allowing us to observe and discuss some new and interesting financial implications on portfolio and consumption decision making induced by the additional risk-taking in capital injection and the goal of tracking.

3.A neurodynamic approach for a class of pseudoconvex semivectorial bilevel optimization problem

2304.10898

Authors:Tran Ngoc Thang, Dao Minh Hoang, Nguyen Viet Dung

Abstract: The article proposes an exact approach to find the global solution of a nonconvex semivectorial bilevel optimization problem, where the objective functions at each level are pseudoconvex, and the constraints are quasiconvex. Due to its non-convexity, this problem is challenging, but it attracts more and more interest because of its practical applications. The algorithm is developed based on monotonic optimization combined with a recent neurodynamic approach, where the solution set of the lower-level problem is inner approximated by copolyblocks in outcome space. From that, the upper-level problem is solved using the branch-and-bound method. Finding the bounds is converted to pseudoconvex programming problems, which are solved using the neurodynamic method. The algorithm's convergence is proved, and computational experiments are implemented to demonstrate the accuracy of the proposed approach.

4.Hierarchical distributed scenario-based model predictive control of interconnected microgrids

2304.10901

Authors:T. Alissa Schenck, Christian A. Hans

Abstract: Microgrids are autonomous clusters of generators, storage units and loads. Special requirements arise in interconnected operation: control schemes that do not require individual microgrids to disclose data about their internal structure and operating objectives are preferred for privacy reasons. Moreover, a safe and economically meaningful operation shall be achieved in presence of uncertain load and weather-dependent availability of renewable infeed. In this paper, we propose a distributed model predictive control approach that satisfies these requirements. Specifically, we demonstrate that costs and safety of supply can be improved through a scenario-based stochastic control scheme. In a numerical case study, our approach is compared to a certainty equivalence scheme. The results illustrate the improved safety and reduced runtime costs as well as sufficiently fast convergence.

5.Near-Optimal Decentralized Momentum Method for Nonconvex-PL Minimax Problems

2304.10902

Authors:Feihu Huang, Songcan Chen

Abstract: Minimax optimization plays an important role in many machine learning tasks such as generative adversarial networks (GANs) and adversarial training. Although a wide variety of optimization methods have recently been proposed to solve minimax problems, most of them ignore the distributed setting where the data is distributed across multiple workers. Meanwhile, the existing decentralized minimax optimization methods rely on strict assumptions such as (strong) concavity and variational inequality conditions. In this paper, we therefore propose an efficient decentralized momentum-based gradient descent ascent (DM-GDA) method for distributed nonconvex-PL minimax optimization, which is nonconvex in the primal variable, nonconcave in the dual variable, and satisfies the Polyak-Lojasiewicz (PL) condition. In particular, our DM-GDA method simultaneously uses momentum-based techniques to update variables and estimate the stochastic gradients. Moreover, we provide a solid convergence analysis for our DM-GDA method, and prove that it obtains a near-optimal gradient complexity of $O(\epsilon^{-3})$ for finding an $\epsilon$-stationary solution of the nonconvex-PL stochastic minimax problems, which reaches the lower bound of nonconvex stochastic optimization. To the best of our knowledge, we are the first to study decentralized algorithms for nonconvex-PL stochastic minimax optimization over a network.

6.Distributed Optimization of Clique-wise Coupled Problems

2304.10904

Authors:Yuto Watanabe, Kazunori Sakurama

Abstract: This study addresses a distributed optimization with a novel class of coupling of variables, called clique-wise coupling. A clique is a node set of a complete subgraph of an undirected graph. This setup is an extension of pairwise coupled optimization problems (e.g., consensus optimization) and allows us to handle coupling of variables consisting of more than two agents systematically. To solve this problem, we propose a clique-based linearized ADMM algorithm, which is proved to be distributed. Additionally, we consider objective functions given as a sum of nonsmooth and smooth convex functions and present a more flexible algorithm based on the FLiP-ADMM algorithm. Moreover, we provide convergence theorems of these algorithms. Notably, all the algorithmic parameters and the derived condition in the theorems depend only on local information, which means that each agent can choose the parameters in a distributed manner. Finally, we apply the proposed methods to a consensus optimization problem and demonstrate their effectiveness via numerical experiments.

7.Learning-based control safeguarded by robust funnel MPC

2304.10910

Authors:Lukas Lanza, Dario Dennstädt, Thomas Berger, Karl Worthmann

Abstract: Recently, a two component MPC scheme was introduced, consisting of pure feedback control (funnel control) and model-based predictive control (funnel MPC). It achieves output tracking of a given reference signal with prescribed performance of the tracking error for a class of unknown nonlinear systems. Relying on the feedback controller's ability to compensate for tracking errors even in the presence of noise and uncertainties, this control structure is robust with respect to model-plant mismatches and bounded disturbances. In the present article, we extend this control structure by a learning component in order to adapt the underlying model to the system data and hence to improve the contribution of MPC. Since the combined control scheme robust funnel MPC is inherently robust with respect to model-plant mismatches and the evolution of the tracking error in the prescribed performance funnel is always guaranteed, the additional learning component is able to perform the learning task online without an initial model or offline training.

8.Riemannian formulation of Pontryagin's principle for robotic manipulators

2304.10959

Authors:François Dubois (LMSSC, LMO), Hedy César Ramírez-De-Ávila (TecNM), Juan Antonio Rojas-Quintero (CONACYT)

Abstract: In this work, we consider a mechanical system whose mass tensor implements a scalar product in a Riemannian manifold. This system is controlled with the help of forces and torques. A cost functional is minimized to achieve an optimal trajectory. In this contribution, this cost function is supposed to be an arbitrary regular function invariant under a change of coordinates. Optimal control evolution based on Pontryagin's principle induces a covariant second-order ordinary differential equation for an adjoint variable featuring the Riemann curvature tensor. This second order time evolution is derived in this contribution.

9.An Accelerated Proximal Alternating Direction Method of Multipliers for Optimal Decentralized Control of Uncertain Systems

2304.11037

Authors:Bo Yang, Xinyuan Zhao, Xudong Li, Defeng Sun

Abstract: To ensure the system stability of the $\mathcal{H}_{2}$-guaranteed cost optimal decentralized control problem (ODC), an approximate semidefinite programming (SDP) problem is formulated based on the sparsity of the gain matrix of the decentralized controller. To reduce data storage and improve computational efficiency, the SDP problem is vectorized into a conic programming (CP) problem using the Kronecker product. Then, a proximal alternating direction method of multipliers (PADMM) is proposed to solve the dual of the resulting CP. By linking the (generalized) PADMM with the (relaxed) proximal point algorithm, we are able to accelerate the proposed PADMM via the Halpern fixed-point iterative scheme. This results in a fast convergence rate for the Karush-Kuhn-Tucker (KKT) residual along the sequence generated by the accelerated algorithm. Numerical experiments further demonstrate that the accelerated PADMM outperforms both the well-known CVXOPT and SCS algorithms for solving the large-scale CP problems arising from $\mathcal{H}_{2}$-guaranteed cost ODC problems.
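To give a feel for the Halpern fixed-point scheme named in the abstract above, here is a small sketch of the generic Halpern iteration $x_{k+1} = \frac{1}{k+2}x_0 + \frac{k+1}{k+2}T(x_k)$ applied to a toy nonexpansive map. It does not implement PADMM or the authors' accelerated method; the map `rotate90` (a 90-degree rotation of the plane, whose only fixed point is the origin) and the function names are assumptions picked for illustration.

```python
import math

def halpern(T, x0, iters):
    """Generic Halpern iteration anchored at x0 (illustrative sketch)."""
    x = x0
    for k in range(iters):
        lam = 1.0 / (k + 2)  # anchoring weight, decays like 1/k
        tx = T(x)
        x = tuple(lam * a + (1.0 - lam) * b for a, b in zip(x0, tx))
    return x

def rotate90(p):
    """Toy nonexpansive map: rotate a point of the plane by 90 degrees."""
    x, y = p
    return (-y, x)

def fixed_point_residual(T, x):
    """Euclidean norm of x - T(x)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, T(x))))

x = halpern(rotate90, (1.0, 0.0), 1000)
res = fixed_point_residual(rotate90, x)
```

Plain repeated application of `rotate90` only cycles around the unit circle, whereas the anchored iterates shrink toward the fixed point, with a residual that decays on the order of $1/k$.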

10.Optimal control of a reaction-diffusion model related to the spread of COVID-19

2304.11114

Authors:Pierluigi Colli, Gianni Gilardi, Gabriela Marinoschi, Elisabetta Rocca

Abstract: This paper is concerned with the well-posedness and optimal control problem of a reaction-diffusion system for an epidemic Susceptible-Infected-Recovered-Susceptible (SIRS) mathematical model in which the dynamics develops in a spatially heterogeneous environment. Using as control variables the transmission rates $u_{i}$ and $u_{e}$ of contagion resulting from the contact with both asymptomatic and symptomatic persons, respectively, we optimize the number of exposed and infected individuals at a final time $T$ of the controlled evolution of the system. More precisely, we search for the optimal $u_{i}$ and $u_{e}$ such that the number of infected plus exposed does not exceed at the final time a threshold value $\Lambda$, fixed a priori. We prove here the existence of optimal controls in a proper functional framework and we derive the first-order necessary optimality conditions in terms of the adjoint variables.

11.Commutation relations and stability of switched systems: a personal history

2304.11155

Authors:Daniel Liberzon

Abstract: This expository article presents an overview of research, conducted mostly between the mid-1990s and late 2000s, that explores a link between commutation relations among a family of asymptotically stable vector fields and stability properties of the switched system that these vector fields generate. This topic is viewed through the lens of the author's own involvement with it, by interspersing explanations of technical developments with personal reminiscences and anecdotes.

Thu, 20 Apr 2023digest

1.A Riemannian Dimension-reduced Second Order Method with Application in Sensor Network Localization

2304.10092

Authors:Tianyun Tang, Kim-Chuan Toh, Nachuan Xiao, Yinyu Ye

Abstract: In this paper, we propose a cubic-regularized Riemannian optimization method (RDRSOM), which partially exploits the second order information and achieves the iteration complexity of $\mathcal{O}(1/\epsilon^{3/2})$. In order to reduce the per-iteration computational cost, we further propose a practical version of (RDRSOM), which is an extension of the well known Barzilai-Borwein method and achieves the iteration complexity of $\mathcal{O}(1/\epsilon^{3/2})$. We apply our method to solve a nonlinear formulation of the wireless sensor network localization problem whose feasible set is a Riemannian manifold that has not been considered in the literature before. Numerical experiments are conducted to verify the high efficiency of our algorithm compared to state-of-the-art Riemannian optimization methods and other nonlinear solvers.

2.An Adaptive Multi-Level Max-Plus Method for Deterministic Optimal Control Problems

2304.10342

Authors:Marianne Akian, Stéphane Gaubert, Shanqing Liu

Abstract: We introduce a new numerical method to approximate the solution of a finite horizon deterministic optimal control problem. We exploit two Hamilton-Jacobi-Bellman PDEs, arising by considering the dynamics in forward and backward time. This allows us to compute a neighborhood of the set of optimal trajectories, in order to reduce the search space. The solutions of both PDEs are successively approximated by max-plus linear combinations of appropriate basis functions, using a hierarchy of finer and finer grids. We show that the sequence of approximate value functions obtained in this way does converge to the viscosity solution of the HJB equation in a neighborhood of optimal trajectories. Then, under certain regularity assumptions, we show that the number of arithmetic operations needed to compute an approximate optimal solution of a $d$-dimensional problem, up to a precision $\varepsilon$, is bounded by $O(C^d (1/\varepsilon))$, for some constant $C>1$, whereas ordinary grid-based methods have a complexity in $O(1/\varepsilon^{ad})$ for some constant $a>0$.

3.Uncertainty over Uncertainty in Environmental Policy Adoption: Bayesian Learning of Unpredictable Socioeconomic Costs

2304.10344

Authors:Matteo Basei, Giorgio Ferrari, Neofytos Rodosthenous

Abstract: The socioeconomic impact of pollution naturally comes with uncertainty due to, e.g., current new technological developments in emissions' abatement or demographic changes. On top of that, the trend of the future costs of the environmental damage is unknown: Will global warming dominate or technological advancements prevail? The truth is that we do not know which scenario will be realised and the scientific debate is still open. This paper captures those two layers of uncertainty by developing a real-options-like model in which a decision maker aims at adopting a once-and-for-all costly reduction in the current emissions rate, when the stochastic dynamics of the socioeconomic costs of pollution are subject to Brownian shocks and the drift is an unobservable random variable. By keeping track of the actual evolution of the costs, the decision maker is able to learn the unknown drift and to form a posterior dynamic belief of its true value. The resulting decision maker's timing problem boils down to a truly two-dimensional optimal stopping problem which we address via probabilistic free-boundary methods and a state-space transformation. We show that the optimal timing for implementing the emissions reduction policy is the first time that the learning process has become "decisive" enough; that is, when it exceeds a time-dependent percentage. This is given in terms of an endogenously determined threshold uniquely solving a nonlinear integral equation, which we can solve numerically. We discuss the implications of the optimal policy and also perform comparative statics to understand the role of the relevant model's parameters in the optimal policy.

4.Secondary Controller Design for the Safety of Nonlinear Systems via Sum-of-Squares Programming

2304.10359

Authors:Yankai Lin, Michelle S. Chong, Carlos Murguia

Abstract: We consider the problem of ensuring the safety of nonlinear control systems under adversarial signals. Using Lyapunov based reachability analysis, we first give sufficient conditions to assess safety, i.e., to guarantee that the states of the control system, when starting from a given initial set, always remain in a prescribed safe set. We consider polynomial systems with semi-algebraic safe sets. Using the S-procedure for polynomial functions, safety conditions can be formulated as a Sum-Of-Squares (SOS) programme, which can be solved efficiently. When safety cannot be guaranteed, we provide tools via SOS to synthesize polynomial controllers that enforce safety of the closed loop system. The theoretical results are illustrated through numerical simulations.

5.The spectral radius of a square matrix can be approximated by its "weighted" spectral norm

2304.10421

Authors:Yongqiang Wang

Abstract: In distributed optimization or Nash-equilibrium seeking over directed graphs, it is crucial to find a matrix norm under which the disagreement of individual agents' states contracts. In existing results, the matrix norm is usually defined by approximating the spectral radius of the matrix, which is possible when the matrix is real and has zero row-sums. In this brief note, we show that this technique can be applied to general square complex matrices. More specifically, we prove that the spectral radius of any complex square matrix can be approximated by its "weighted" spectral norm to an arbitrary degree of accuracy.
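The idea of a "weighted" spectral norm can be illustrated on a tiny example. The sketch below is not the paper's proof and covers only a real upper-triangular $2\times 2$ matrix: it conjugates the matrix by $D=\mathrm{diag}(1,\delta)$ and computes the spectral norm of $D^{-1}AD$ in closed form. The helper names and the test matrix are assumptions made for this illustration.

```python
import math

def spectral_norm_2x2(a):
    """Largest singular value of a real 2x2 matrix given as nested lists."""
    (a11, a12), (a21, a22) = a
    # Entries of the symmetric matrix A^T A = [[p, q], [q, r]].
    p = a11 * a11 + a21 * a21
    q = a11 * a12 + a21 * a22
    r = a12 * a12 + a22 * a22
    # Largest eigenvalue of a symmetric 2x2 matrix, in closed form.
    lam_max = (p + r) / 2 + math.sqrt(((p - r) / 2) ** 2 + q * q)
    return math.sqrt(lam_max)

def weighted_norm(a, delta):
    """Spectral norm of D^{-1} A D with D = diag(1, delta), delta > 0."""
    (a11, a12), (a21, a22) = a
    scaled = [[a11, a12 * delta], [a21 / delta, a22]]
    return spectral_norm_2x2(scaled)

# Highly non-normal matrix: spectral radius 0.5, but a large plain spectral norm.
A = [[0.5, 10.0], [0.0, 0.5]]
plain = spectral_norm_2x2(A)
weighted = weighted_norm(A, 1e-3)
```

Here `plain` is larger than 10, while `weighted` is only slightly above the spectral radius 0.5; shrinking `delta` further brings it closer still.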

6.Projective Proximal Gradient Descent for A Class of Nonconvex Nonsmooth Optimization Problems: Fast Convergence Without Kurdyka-Lojasiewicz (KL) Property

2304.10499

Authors:Yingzhen Yang, Ping Li

Abstract: Nonconvex and nonsmooth optimization problems are important and challenging for statistics and machine learning. In this paper, we propose Projected Proximal Gradient Descent (PPGD) which solves a class of nonconvex and nonsmooth optimization problems, where the nonconvexity and nonsmoothness come from a nonsmooth regularization term which is nonconvex but piecewise convex. In contrast with existing convergence analysis of accelerated PGD methods for nonconvex and nonsmooth problems based on the Kurdyka-Łojasiewicz (KŁ) property, we provide a new theoretical analysis showing local fast convergence of PPGD. It is proved that PPGD achieves a fast convergence rate of $\mathcal{O}(1/k^2)$ when the iteration number $k \ge k_0$ for a finite $k_0$ on a class of nonconvex and nonsmooth problems under mild assumptions, which is locally Nesterov's optimal convergence rate of first-order methods on smooth and convex objective function with Lipschitz continuous gradient. Experimental results demonstrate the effectiveness of PPGD.

Wed, 19 Apr 2023digest

1.An Analysis Tool for Push-Sum Based Distributed Optimization

2304.09443

Authors:Yixuan Lin, Ji Liu

Abstract: The push-sum algorithm is probably the most important distributed averaging approach over directed graphs, which has been applied to various problems including distributed optimization. This paper establishes the explicit absolute probability sequence for the push-sum algorithm and, based on it, constructs quadratic Lyapunov functions for push-sum based distributed optimization algorithms. As illustrative examples, the proposed novel analysis tool can improve the convergence rates of the subgradient-push and stochastic gradient-push, two important algorithms for distributed convex optimization over unbalanced directed graphs. Specifically, the paper proves that the subgradient-push algorithm converges at a rate of $O(1/\sqrt{t})$ for general convex functions and the stochastic gradient-push algorithm converges at a rate of $O(1/t)$ for strongly convex functions, over time-varying unbalanced directed graphs. Both rates are respectively the same as the state-of-the-art rates of their single-agent counterparts and thus optimal, which closes the theoretical gap between the centralized and push-sum based (sub)gradient methods. The paper further proposes a heterogeneous push-sum based subgradient algorithm in which each agent can arbitrarily switch between subgradient-push and push-subgradient. The heterogeneous algorithm thus subsumes both subgradient-push and push-subgradient as special cases, and still converges to an optimal point at an optimal rate. The proposed tool can also be extended to analyze distributed weighted averaging.
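For readers unfamiliar with push-sum, the following minimal sketch shows the basic averaging mechanism on a small unbalanced directed graph. It covers only plain push-sum averaging, not the subgradient-push algorithms or the Lyapunov analysis of the paper; the graph, the initial values, and the function name `push_sum` are assumptions for this example.

```python
def push_sum(values, out_neighbors, iters=100):
    """Basic push-sum averaging over a fixed directed graph (illustrative).

    values        : initial value held by each node
    out_neighbors : out_neighbors[i] lists the nodes that i sends to
                    (each node includes itself, i.e. a self-loop)
    """
    n = len(values)
    x = list(values)   # running numerators
    w = [1.0] * n      # running weights that correct for graph imbalance
    for _ in range(iters):
        new_x = [0.0] * n
        new_w = [0.0] * n
        for i in range(n):
            # Node i splits its mass equally among its out-neighbors.
            share = 1.0 / len(out_neighbors[i])
            for j in out_neighbors[i]:
                new_x[j] += share * x[i]
                new_w[j] += share * w[i]
        x, w = new_x, new_w
    # The ratio x_i / w_i is each node's estimate of the average.
    return [xi / wi for xi, wi in zip(x, w)]

# Strongly connected but unbalanced digraph on three nodes.
graph = {0: [0, 1, 2], 1: [1, 2], 2: [2, 0]}
estimates = push_sum([1.0, 2.0, 6.0], graph)
```

Although the mixing weights are only column-stochastic, every ratio in `estimates` approaches the true average of the initial values, which is 3 here.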

2.Global Convergence of Algorithms Based on Unions of Nonexpansive Maps

2304.09537

Authors:Alexander J. Zaslavski

Abstract: In his recent research M. K. Tam (2018) considered a framework for the analysis of iterative algorithms which can be described in terms of a structured set-valued operator. At each point in the ambient space, the value of the operator can be expressed as a finite union of values of single-valued paracontracting operators. He showed that the associated fixed point iteration is locally convergent around strong fixed points. This result generalizes a theorem due to Bauschke and Noll (2014). In the present paper we generalize the result of Tam and show the global convergence of his algorithm for an arbitrary starting point. An analogous result is also proved for the Krasnosel'ski-Mann iterations.

3.Leveraging the two timescale regime to demonstrate convergence of neural networks

2304.09576

Authors:Pierre Marion, Raphaël Berthier

Abstract: We study the training dynamics of shallow neural networks, in a two-timescale regime in which the stepsizes for the inner layer are much smaller than those for the outer layer. In this regime, we prove convergence of the gradient flow to a global optimum of the non-convex optimization problem in a simple univariate setting. The number of neurons need not be asymptotically large for our result to hold, distinguishing our result from popular recent approaches such as the neural tangent kernel or mean-field regimes. Experimental illustration is provided, showing that the stochastic gradient descent behaves according to our description of the gradient flow and thus converges to a global optimum in the two-timescale regime, but can fail outside of this regime.

4.Linear convergence in time-varying generalized Nash equilibrium problems

2304.09593

Authors:Mattia Bianchi, Emilio Benenati, Sergio Grammatico

Abstract: We study generalized games with full row rank equality constraints and we provide a strikingly simple proof of strong monotonicity of the associated KKT operator. This allows us to show linear convergence to a variational equilibrium of the resulting primal-dual pseudo-gradient dynamics. Then, we propose a fully-distributed algorithm with linear convergence guarantee for aggregative games under partial-decision information. Based on these results, we establish stability properties for online GNE seeking in games with time-varying cost functions and constraints. Finally, we illustrate our findings numerically on an economic dispatch problem for peer-to-peer energy markets.

5.The alternating simultaneous Halpern-Lions-Wittmann-Bauschke algorithm for finding the best approximation pair for two disjoint intersections of convex sets

2304.09600

Authors:Yair Censor, Rafiq Mansour, Daniel Reem

Abstract: Given two nonempty and disjoint intersections of closed and convex subsets, we look for a best approximation pair relative to them, i.e., a pair of points, one in each intersection, attaining the minimum distance between the disjoint intersections. We propose an iterative process based on projections onto the subsets which generate the intersections. The process is inspired by the Halpern-Lions-Wittmann-Bauschke algorithm and the classical alternating process of Cheney and Goldstein, and its advantage is that there is no need to project onto the intersections themselves, a task which can be rather demanding. We prove that under certain conditions the two interlaced subsequences converge to a best approximation pair. These conditions hold, in particular, when the space is Euclidean and the subsets which generate the intersections are compact and strictly convex. Our result extends the one of Aharoni, Censor and Jiang ["Finding a best approximation pair of points for two polyhedra", Computational Optimization and Applications 71 (2018), 509-523] which considered the case of finite-dimensional polyhedra.

6.An efficient solver for multi-objective onshore wind farm siting and network integration

2304.09658

Authors:Jaap Pedersen, Jann Michael Weinand, Chloi Syranidou, Daniel Rehfeldt

Abstract: Existing planning approaches for onshore wind farm siting and network integration often do not meet minimum cost solutions or social and environmental considerations. In this paper, we develop an approach for the multi-objective optimization of turbine locations and their network connection using a Quota Steiner tree problem. Applying a novel transformation on a known directed cut formulation, reduction techniques, and heuristics, we design an exact solver that makes large problem instances solvable and outperforms generic MIP solvers. Although our case studies in selected regions of Germany show large trade-offs between the objective criteria of cost and landscape impact, small burdens on one criterion can significantly improve the other criteria. In addition, we demonstrate that contrary to many approaches for exclusive turbine siting, network integration must be simultaneously optimized in order to avoid excessive costs or landscape impacts in the course of a wind farm project. Our novel problem formulation and the developed solver can assist planners in decision making and help optimize wind farms in large regions in the future.

Tue, 18 Apr 2023digest

1.Constrained Assortment Optimization under the Cross-Nested Logit Model

2304.08790

Authors:Cuong Le, Tien Mai

Abstract: We study the assortment optimization problem under general linear constraints, where the customer choice behavior is captured by the Cross-Nested Logit model. In this problem, there is a set of products organized into multiple subsets (or nests), where each product can belong to more than one nest. The aim is to find an assortment to offer to customers so that the expected revenue is maximized. We show that, under the Cross-Nested Logit model, the assortment problem is NP-hard, even without any constraints. To tackle the assortment optimization problem, we develop a new discretization mechanism to approximate the problem by a linear fractional program with a performance guarantee of $\frac{1 - \epsilon}{1+\epsilon}$, for any accuracy level $\epsilon>0$. We then show that optimal solutions to the approximate problem can be obtained by solving mixed-integer linear programs. We further show that our discretization approach can also be applied to solve a joint assortment optimization and pricing problem, as well as an assortment problem under a mixture of Cross-Nested Logit models to account for multiple classes of customers. Our empirical results on a large number of randomly generated test instances demonstrate that, under a performance guarantee of 90%, the percentage gaps between the objective values obtained from our approximation methods and the optimal expected revenues are no larger than 1.2%.

2.Sensor Fault Detection and Isolation in Autonomous Nonlinear Systems Using Neural Network-Based Observers

2304.08837

Authors:John Cao, Muhammad Umar B. Niazi, Karl Henrik Johansson

Abstract: This paper presents a new observer-based approach to detect and isolate faulty sensors in industrial systems. Two types of sensor faults are considered: complete failure and sensor deterioration. The proposed method is applicable to general autonomous nonlinear systems without making any assumptions about its triangular and/or normal form, which is usually considered in the observer design literature. The key aspect of our approach is a learning-based design of the Luenberger observer, which involves using a neural network to approximate the injective map that transforms the nonlinear system into a stable linear system with output injection. This learning-based Luenberger observer accurately estimates the system's state, allowing for the detection of sensor faults through residual generation. The residual is computed as the norm of the difference between the system's measured output and the observer's predicted output vectors. Fault isolation is achieved by comparing each sensor's measurement with its corresponding predicted value. We demonstrate the effectiveness of our approach in capturing and isolating sensor faults while remaining robust in the presence of measurement noise and system uncertainty. We validate our method through numerical simulations of sensor faults in a network of Kuramoto oscillators.
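The residual test and per-sensor comparison described in the abstract above can be sketched as follows. This assumes the observer's predicted outputs are already available and uses a single hypothetical `threshold` for both detection and isolation; the learned Luenberger observer itself is not implemented, and the function name and numbers are illustrative assumptions.

```python
import math

def detect_and_isolate(measured, predicted, threshold):
    """Residual-based fault detection and per-sensor isolation (sketch).

    measured  : sensor readings y
    predicted : observer output estimates y_hat (assumed given)
    threshold : hypothetical tolerance, used for detection and isolation
    """
    diffs = [abs(y - y_hat) for y, y_hat in zip(measured, predicted)]
    # Detection: Euclidean norm of the output estimation error.
    residual = math.sqrt(sum(d * d for d in diffs))
    fault_detected = residual > threshold
    # Isolation: flag each sensor whose own deviation exceeds the threshold.
    faulty = [i for i, d in enumerate(diffs) if d > threshold] if fault_detected else []
    return residual, fault_detected, faulty

# Sensor 2 deviates strongly from its predicted value.
residual, detected, faulty = detect_and_isolate(
    [1.0, 2.0, 5.0], [1.02, 1.98, 3.0], threshold=0.5
)
```

In practice the threshold would have to be tuned to the noise level so that measurement noise alone does not trigger a detection.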

3.Local Error Bounds for Affine Variational Inequalities on Hilbert Spaces

2304.08884

Authors:Tuan Ngoc Hoang, Lim Yongdo, Yen Dong Nguyen

Abstract: This paper gives some results related to the research problem about infinite-dimensional affine variational inequalities raised by N.D. Yen and X. Yang [Affine variational inequalities on normed spaces, J. Optim. Theory Appl., 178 (2018), 36--55]. Namely, we obtain local error bounds for affine variational inequalities on Hilbert spaces. To do so, we revisit two fundamental properties of polyhedral mappings. Then, we prove a locally upper Lipschitzian property of the inverse of the residual mapping of the infinite-dimensional affine variational inequality under consideration. Finally, we derive the desired local error bounds from that locally upper Lipschitzian property.

4.SOStab: a Matlab Toolbox for Approximating Regions of Attraction of Nonlinear Systems

2304.08889

Authors:Stéphane Drobot, Matteo Tacchi, Colin N. Jones

Abstract: This paper presents a novel Matlab toolbox, aimed at facilitating the use of polynomial optimization for stability analysis of nonlinear systems. Indeed, in the past decade several decisive contributions made it possible to recast the difficult problem of computing stability regions of nonlinear systems, under the form of convex optimization problems that are tractable in modest dimensions. However, these techniques combine sophisticated frameworks such as algebraic geometry, measure theory and mathematical programming, and existing software still requires their user to be fluent in Sum-of-Squares and Moment programming, preventing these techniques from being used more widely in the control community. To address this issue, SOStab entirely automates the writing and solving of optimization problems, and directly outputs relevant data for the user, while requiring minimal input. In particular, no specific knowledge of optimization is needed to use it.

5.On the Separation of Estimation and Control in Risk-Sensitive Investment Problems under Incomplete Observation

2304.08910

Authors:Sébastien Lleo, Wolfgang J. Runggaldier

Abstract: A typical approach to tackle stochastic control problems with partial observation is to separate the control and estimation tasks. However, it is well known that this separation generally fails to deliver an actual optimal solution for risk-sensitive control problems. This paper investigates the separability of a general class of risk-sensitive investment management problems when a finite-dimensional filter exists. We show that the corresponding separated problem, where instead of the unobserved quantities, one considers their conditional filter distribution given the observations, is strictly equivalent to the original control problem. We widen the applicability of the so-called Modified Zakai Equation (MZE) for the study of the separated problem and prove that the MZE simplifies to a PDE in our approach. Furthermore, we derive criteria for separability. We do not solve the separated control problem but note that the existence of a finite-dimensional filter leads to a finite state space for the separated problem. Hence, the difficulty is equivalent to solving a complete observation risk-sensitive problem. Our results have implications for existing risk-sensitive investment management models with partial observations in that they establish their separability. Their implications for future research on new applications are mainly to provide conditions to ensure separability.

6.Linear-quadratic-singular stochastic differential games and applications

2304.09033

Authors:Jodi Dianetti

Abstract: We consider a class of non-cooperative N-player non-zero-sum stochastic differential games with singular controls, in which each player can affect a linear stochastic differential equation in order to minimize a cost functional which is quadratic in the state and linear in the control. We call these games linear-quadratic-singular stochastic differential games. Under natural assumptions, we show the existence of open-loop Nash equilibria, which are characterized through a linear system of forward-backward stochastic differential equations. The proof is based on an approximation via a sequence of games in which players are restricted to play Lipschitz continuous strategies. We then discuss an application of these results to a model of capacity expansion in oligopoly markets.

7.A bilevel approach for compensation and routing decisions in last-mile delivery

2304.09170

Authors:Martina Cerulli, Claudia Archetti, Elena Fernandez, Ivana Ljubic

Abstract: In last-mile delivery logistics, peer-to-peer logistic platforms play an important role in connecting senders, customers, and independent carriers to fulfill delivery requests. Since the carriers are not under the platform's control, the platform has to anticipate their reactions, while deciding how to allocate the delivery operations. Indeed, carriers' decisions largely affect the platform's revenue. In this paper, we model this problem using bilevel programming. At the upper level, the platform decides how to assign the orders to the carriers; at the lower level, each carrier solves a profitable tour problem to determine which offered requests to accept, based on her own profit maximization. Possibly, the platform can influence carriers' decisions by determining also the compensation paid for each accepted request. The two considered settings result in two different formulations: the bilevel profitable tour problem with fixed compensation margins and with margin decisions, respectively. For each of them, we propose single-level reformulations and alternative formulations where the lower-level routing variables are projected out. A branch-and-cut algorithm is proposed to solve the bilevel models, with a tailored warm-start heuristic used to speed up the solution process. Extensive computational tests are performed to compare the proposed formulations and analyze solution characteristics.

Mon, 17 Apr 2023digest

1.Accelerated Distributed Aggregative Optimization

2304.08051

Authors:Jiaxu Liu, Song Chen, Shengze Cai, Chao Xu

Abstract: In this paper, we investigate a distributed aggregative optimization problem in a network, where each agent has its own local cost function which depends not only on the local state variable but also on an aggregated function of state variables from all agents. To accelerate the optimization process, we combine heavy ball and Nesterov's accelerated methods with distributed aggregative gradient tracking, and propose two novel algorithms named DAGT-HB and DAGT-NES for solving the distributed aggregative optimization problem. We show that the DAGT-HB and DAGT-NES algorithms converge to an optimal solution at a global $\mathbf{R}$-linear convergence rate when the objective function is smooth and strongly convex, and when the parameters (e.g., step size and momentum coefficients) are selected within certain ranges. A numerical experiment on the optimal placement problem is given to verify the effectiveness and superiority of our proposed algorithms.
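
For readers unfamiliar with the heavy-ball momentum that this abstract builds on, the following is a minimal sketch of the classical, centralized heavy-ball iteration on a smooth, strongly convex quadratic. It is not DAGT-HB: there is no network, aggregation, or gradient tracking here, and the test function, step size, and momentum coefficient are illustrative assumptions.

```python
def heavy_ball(grad, x0, step, momentum, iters):
    """Classical heavy-ball iteration:
    x_{k+1} = x_k - step * grad(x_k) + momentum * (x_k - x_{k-1})."""
    x_prev = list(x0)
    x = list(x0)
    for _ in range(iters):
        g = grad(x)
        x_next = [xi - step * gi + momentum * (xi - xpi)
                  for xi, gi, xpi in zip(x, g, x_prev)]
        x_prev, x = x, x_next
    return x

def quad_grad(x):
    # Gradient of f(x) = 0.5 * (x[0]**2 + 10 * x[1]**2), minimized at the origin.
    return [1.0 * x[0], 10.0 * x[1]]

x_final = heavy_ball(quad_grad, [5.0, -3.0], step=0.1, momentum=0.5, iters=200)
```

With these particular parameters the iterates contract linearly toward the minimizer at the origin; other step sizes or momentum values outside the admissible range can make the iteration diverge, which is why the abstract restricts the parameters to certain ranges.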

2.On the benefit of overparameterisation in state reconstruction: An empirical study of the nonlinear case

2304.08066

Authors:Jonas F. Haderlein, Andre D. H. Peterson, Parvin Zarei Eskikand, Anthony N. Burkitt, Iven M. Y. Mareels, David B. Grayden

Abstract: The empirical success of machine learning models with many more parameters than measurements has generated an interest in the theory of overparameterisation, i.e., underdetermined models. This paradigm has recently been studied in domains such as deep learning, where one is interested in good (local) minima of complex, nonlinear loss functions. Optimisers, like gradient descent, perform well and consistently reach good solutions. Similarly, nonlinear optimisation problems are encountered in the field of system identification. Examples of such high-dimensional problems are optimisation tasks ensuing from the reconstruction of model states and parameters of an assumed known dynamical system from observed time series. In this work, we identify explicit parallels in the benefits of overparameterisation between what has been analysed in the deep learning context and system identification. We test multiple chaotic time series models, analysing the optimisation process for unknown model states and parameters in batch mode. We find that gradient descent reaches better solutions if we assume more parameters to be unknown. We hypothesise that, indeed, overparameterisation leads us towards better minima, and that more degrees of freedom in the optimisation are beneficial so long as the system is, in principle, observable.

3.A Criterion for ${\rm Q}$-tensors

2304.08119

Authors:Sonali Sharma, K. Palpandi

Abstract: A tensor ${\mathcal A}$ of order $m$ and dimension $n$ is called a ${\rm Q}$-tensor if the tensor complementarity problem has a solution for all ${\bf q} \in {\mathbb R}^{n}$. This means that for every vector ${\bf q}$, there exists a vector ${\bf u}$ such that ${\bf u} \geq {\bf 0},{\bf w} = {\mathcal A}{\bf u}^{m-1}+{\bf q} \geq {\bf 0},~\text{and}~ {\bf u}^{T}{\bf w} = 0$. In this paper, we prove that within the class of rank one symmetric tensors, the ${\rm Q}$-tensors are precisely the positive tensors. Additionally, for a symmetric ${\rm Q}$-tensor ${\mathcal A}$ with ${\rm rank}({\mathcal A})=2$, we show that ${\mathcal A}$ is an ${\rm R}_{0}$-tensor. The idea is inspired by the recent work of Parthasarathy et al. and Sivakumar et al. on ${\rm Q}$-matrices.
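
The complementarity conditions stated in this abstract can be checked numerically for a candidate vector. The sketch below assumes an order-3 tensor ($m = 3$) stored as nested lists; the helper names `tensor_apply` and `solves_tcp` and the tolerance are hypothetical. It only verifies a given ${\bf u}$ and does not decide whether a tensor is a ${\rm Q}$-tensor.

```python
def tensor_apply(A, u):
    """Compute (A u^{m-1})_i = sum_{j,k} A[i][j][k] * u[j] * u[k] for an order-3 tensor."""
    n = len(u)
    return [sum(A[i][j][k] * u[j] * u[k] for j in range(n) for k in range(n))
            for i in range(n)]

def solves_tcp(A, q, u, tol=1e-9):
    """Check u >= 0, w = A u^{m-1} + q >= 0 and u^T w = 0, up to a tolerance."""
    w = [wi + qi for wi, qi in zip(tensor_apply(A, u), q)]
    nonneg = all(ui >= -tol for ui in u) and all(wi >= -tol for wi in w)
    complementary = abs(sum(ui * wi for ui, wi in zip(u, w))) <= tol
    return nonneg and complementary

# Rank-one symmetric positive tensor a (x) a (x) a with a = (1, 1): every entry is 1,
# so (A u^2)_i = (u[0] + u[1])**2 for each i.
A = [[[1.0, 1.0], [1.0, 1.0]], [[1.0, 1.0], [1.0, 1.0]]]
q = [-4.0, -1.0]
is_solution = solves_tcp(A, q, [2.0, 0.0])      # w = (0, 3), u^T w = 0
is_not_solution = solves_tcp(A, q, [1.0, 0.0])  # w = (-3, 0) violates w >= 0
```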

4.Decentralized projected Riemannian gradient method for smooth optimization on compact submanifolds

2304.08241

Authors:Kangkang Deng, Jiang Hu

Abstract: We consider the problem of decentralized nonconvex optimization over a compact submanifold, where each local agent's objective function defined by the local dataset is smooth. Leveraging the powerful tool of proximal smoothness, we establish local linear convergence of the projected gradient descent method with unit step size for solving the consensus problem over the compact manifold. This serves as the basis for analyzing decentralized algorithms on manifolds. Then, we propose two decentralized methods, namely the decentralized projected Riemannian gradient descent (DPRGD) and the decentralized projected Riemannian gradient tracking (DPRGT) methods. We establish their convergence rates of $\mathcal{O}(1/\sqrt{K})$ and $\mathcal{O}(1/K)$, respectively, to reach a stationary point. To the best of our knowledge, DPRGT is the first decentralized algorithm to achieve exact convergence for solving decentralized optimization over a compact manifold. The key ingredients in the proof are the Lipschitz-type inequalities of the projection operator on the compact manifold and smooth functions on the manifold, which could be of independent interest. Finally, we demonstrate the effectiveness of our proposed methods compared to state-of-the-art ones through numerical experiments on eigenvalue problems and low-rank matrix completion.

5.Unrolled three-operator splitting for parameter-map learning in Low Dose X-ray CT reconstruction

2304.08350

Authors:Andreas Kofler, Fabian Altekrüger, Fatima Antarou Ba, Christoph Kolbitsch, Evangelos Papoutsellis, David Schote, Clemens Sirotenko, Felix Frederik Zimmermann, Kostas Papafitsoros

Abstract: We propose a method for fast and automatic estimation of spatially dependent regularization maps for total variation-based (TV) tomography reconstruction. The estimation is based on two distinct sub-networks, with the first sub-network estimating the regularization parameter-map from the input data and the second one unrolling $T$ iterations of the Primal-Dual Three-Operator Splitting (PD3O) algorithm. The latter approximately solves the corresponding TV-minimization problem incorporating the previously estimated regularization parameter-map. The overall network is then trained end-to-end in a supervised learning fashion using pairs of clean-corrupted data but crucially without the need for access to labels for the optimal regularization parameter-maps.

6.Beyond first-order methods for non-convex non-concave min-max optimization

2304.08389

Authors:Abhijeet Vyas, Brian Bullins

Abstract: We propose a study of structured non-convex non-concave min-max problems which goes beyond standard first-order approaches. Inspired by the tight understanding established in recent works [Adil et al., 2022; Lin and Jordan, 2022b], we develop a suite of higher-order methods which show the improvements attainable beyond the monotone and Minty condition settings. Specifically, we provide a new understanding of the use of discrete-time $p^{th}$-order methods for operator norm minimization in the min-max setting, establishing an $O(1/\epsilon^{2/p})$ rate to achieve $\epsilon$-approximate stationarity, under the weakened Minty variational inequality condition of Diakonikolas et al. [2021]. We further present a continuous-time analysis alongside rates which match those for the discrete-time setting, and our empirical results highlight the practical benefits of our approach over first-order methods.

Fri, 14 Apr 2023digest

1.Fixed non-stockout-probability policies for the single-item lost-sales model

2304.06936

Authors:Ton de Kok

Abstract: We consider the classical discrete-time lost-sales model under stationary continuous demand, linear holding and penalty costs, and positive constant lead time. To date the optimal policy structure is only known implicitly by solving numerically the Bellman equations. In this paper we derive the first optimality equation for the lost-sales model. We propose a fixed non-stockout-probability (FP3) policy, implying that each period the order size ensures that P3, the probability of no-stockout at the end of the period of arrival of this order, equals some target value. The FP3-policy can be computed efficiently and accurately from an exact recursive expression and two-moment fits to the emerging random variables. We use the lost-sales optimality equation to compute the optimal FP3-policy. Comparison against the optimal policy for discrete demand suggests that the fixed P3-policy is close-to-optimal. An extensive numerical experiment shows that the FP3-policy outperforms other policies proposed in the literature in 97% of all cases. Under the FP3-policy, the volatility of the replenishment process is much lower than the volatility of the demand process. This cv-reduction holds a promise for substantial cost reduction at upstream stages in the supply chain of the end-item under consideration, compared to the situation with backlogging.

2.Stochastic maximum principle for recursive optimal control problems with varying terminal time

2304.07026

Authors:Jiaqi Wang, Shuzhen Yang

Abstract: This paper introduces a new recursive stochastic optimal control problem driven by forward-backward stochastic differential equations (FBSDEs), where the terminal time varies according to the constraints of the state of the forward equation. This new optimal control problem can be used to describe investment portfolio problems with a varying investment period. Based on novel $\rho$-moving variational and adjoint equations, we establish the stochastic maximum principle for this optimal control problem, which includes the classical optimal control problem as a particular case. Furthermore, we propose an example to verify our main results.

3.Extremum Seeking Regulator for Nonlinear Systems with Unknown Control Directions and an Uncertain Exosystem

2304.07106

Authors:Shimin Wang, Martin Guay, Dabo Xu

Abstract: This paper proposes a solution to the practical robust output regulation problem for a class of nonlinear systems with unknown control directions and uncertain exosystem dynamics. The concurrence of the unknown control directions and uncertain parameters in both the system dynamics and the exosystem poses a significant challenge to solving this problem. Moreover, in light of the nonlinear internal model approach, this paper converts the robust, practical output regulation problem into a robust non-adaptive stabilization problem for the augmented system with integral Input-to-State Stable (iISS) inverse dynamics. By employing an extremum-seeking control approach, the construction of the control laws avoids the use of Nussbaum-type gain techniques to handle the practical robust output regulation problem subject to time-varying control directions. The stability of the non-adaptive output regulation design is proven via a Lie bracket averaging technique where uniform ultimate boundedness of the closed-loop signals is guaranteed. As a result, the estimation and tracking errors converge to zero exponentially, provided that the frequency of the dither signal goes to infinity. Finally, a numerical example with unknown coefficients is provided to illustrate the validity of the theoretical results.

4.On the local everywhere boundedness of the minima of a class of integral functionals of the Calculus of the Variations with q between 1 and 2

2304.07128

Authors:Tiziano Granucci

Abstract: In this paper we study the regularity and the boundedness of the minima of two classes of functionals of the calculus of variations

5.Using a one-dimensional finite-element approximation of Webster's horn equation to estimate individual ear canal acoustic transfer from input impedances

2304.07131

Authors:Nick Wulbusch, Reinhild Roden, Alexey Chernov, Matthias Blau

Abstract: In many applications, knowledge of the sound pressure transfer to the eardrum is important. The transfer is highly influenced by the shape of the ear canal and its acoustic properties, such as the acoustic impedance at the eardrum. Invasive procedures to measure the sound pressure at the eardrum are usually elaborate or costly. In this work, we propose a numerical method to estimate the transfer impedance at the eardrum given only input impedance measurements at the ear canal entrance by using one-dimensional first-order finite elements and the Nelder-Mead optimization algorithm. Estimates of the area function of the ear canal and the acoustic impedance at the eardrum are obtained. Results are validated through numerical simulations on ten different ear canal geometries and three different acoustic impedances at the eardrum using synthetically generated data from three-dimensional finite element simulations.

6.Human preference and asset performance systems design integration

2304.07168

Authors:Harold van Heukelum, Ruud Binnekamp, Rogier Wolfert

Abstract: Current systems design optimisation methodologies are one-sided as these ignore the dynamic interplay between people's preferences (demand) and engineering assets' physical performance (supply). Moreover, classical multi-objective optimisation methods contain fundamental (aggregation) modelling errors. Also, the classical multi-objective optimisation Pareto front will not offer a best-fit design point but rather a set of design performance alternatives. This leaves designers without a unique solution to their problems. Finally, current multi-objective optimisation processes are rather disconnected from design and management practices since these lack deep involvement of decision-makers for expressing their interests in one common preference domain. Therefore, a new open design systems methodology and a novel integrative optimisation method based on maximising the aggregated group preference are introduced in this paper. Their added value and use are demonstrated in two real-life infrastructure design exemplars, showing how to arrive at a true best fit for common-purpose design points.

7.Towards Learning and Verifying Maximal Neural Lyapunov Functions

2304.07215

Authors:Jun Liu, Yiming Meng, Maxwell Fitzsimmons, Ruikun Zhou

Abstract: The search for Lyapunov functions is a crucial task in the analysis of nonlinear systems. In this paper, we present a physics-informed neural network (PINN) approach to learning a Lyapunov function that is nearly maximal for a given stable set. A Lyapunov function is considered nearly maximal if its sub-level sets can be made arbitrarily close to the boundary of the domain of attraction. We use Zubov's equation to train a maximal Lyapunov function defined on the domain of attraction. Additionally, we propose conditions that can be readily verified by satisfiability modulo theories (SMT) solvers for both local and global stability. We provide theoretical guarantees on the existence of maximal Lyapunov functions and demonstrate the effectiveness of our computational approach through numerical examples.

8.Learning-Assisted Optimization for Transmission Switching

2304.07269

Authors:Salvador Pineda, Juan Miguel Morales, Asunción Jiménez-Cordero

Abstract: The design of new strategies that exploit methods from Machine Learning to facilitate the resolution of challenging and large-scale mathematical optimization problems has recently become an avenue of prolific and promising research. In this paper, we propose a novel learning procedure to assist in the solution of a well-known computationally difficult optimization problem in power systems: the Direct Current Optimal Transmission Switching (DC-OTS) problem. This model consists in finding the configuration of the power network that results in the cheapest dispatch of the power generating units. For this, the model includes a set of binary variables that determine the on/off status of the switchable transmission lines. Therefore, the DC-OTS problem takes the form of a mixed-integer program, which is NP-hard in general. Its solution has been approached by way of exact and heuristic methods. The former employ techniques from mixed-integer programming to solve the problem to certified global optimality, while the latter seek to identify good solutions quickly. While the heuristic methods tend to be comparatively much faster, they may suggest suboptimal or even infeasible network topologies. The approach proposed in this paper leverages known solutions to past instances of the DC-OTS problem to speed up the mixed-integer optimization of a new unseen model. Although it does not offer optimality guarantees, a series of numerical experiments run on a real-life power system dataset show that it features a very high success rate in identifying the optimal grid topology (especially when compared to alternative competing heuristics), while rendering remarkable speed-up factors.

Thu, 13 Apr 2023digest

1.Separable approximations of optimal value functions under a decaying sensitivity assumption

2304.06379

Authors:Mario Sperl, Luca Saluzzi, Lars Grüne, Dante Kalise

Abstract: A new approach for the construction of separable approximations of optimal value functions from interconnected optimal control problems is presented. The approach is based on assuming decaying sensitivities between subsystems, enabling a curse-of-dimensionality free approximation, for instance by deep neural networks.

2.Optimal Control of the Landau-de Gennes Model of Nematic Liquid Crystals

2304.06421

Authors:Thomas M. Surowiec, Shawn W. Walker

Abstract: We present an analysis and numerical study of an optimal control problem for the Landau-de Gennes (LdG) model of nematic liquid crystals (LCs), which are a crucial component in modern technology. They exhibit long range orientational order in their nematic phase, which is represented by a tensor-valued (spatial) order parameter $Q = Q(x)$. Equilibrium LC states correspond to $Q$ functions that (locally) minimize an LdG energy functional. Thus, we consider an $L^2$-gradient flow of the LdG energy that allows for finding local minimizers and leads to a semi-linear parabolic PDE, for which we develop an optimal control framework. We then derive several a priori estimates for the forward problem, including continuity in space-time, that allow us to prove existence of optimal boundary and external ``force'' controls and to derive optimality conditions through the use of an adjoint equation. Next, we present a simple finite element scheme for the LdG model and a straightforward optimization algorithm. We illustrate optimization of LC states through numerical experiments in two and three dimensions that seek to place LC defects (where $Q(x) = 0$) in desired locations, which is desirable in applications.

3.A Nonsmooth Augmented Lagrangian Method and its Application to Poisson Denoising and Sparse Control

2304.06434

Authors:Christian Kanzow, Fabius Krämer, Patrick Mehlitz, Gerd Wachsmuth, Frank Werner

Abstract: In this paper, fully nonsmooth optimization problems in Banach spaces with finitely many inequality constraints, an equality constraint within a Hilbert space framework, and an additional abstract constraint are considered. First, we suggest a (safeguarded) augmented Lagrangian method for the numerical solution of such problems and provide a derivative-free global convergence theory which applies in situations where the appearing subproblems can be solved to approximate global minimality. For example, the latter is possible in a fully convex setting. As we do not rely on any tool of generalized differentiation, the results are obtained under minimal continuity assumptions on the data functions. We then consider two prominent and difficult applications from image denoising and sparse optimal control where these findings can be applied in a beneficial way. These two applications are discussed and investigated in some detail. Due to the different nature of the two applications, their numerical solution by the (safeguarded) augmented Lagrangian approach requires problem-tailored techniques to compute approximate minima of the resulting subproblems. The corresponding methods are discussed, and numerical results visualize our theoretical findings.

4.Convergence rate of Tsallis entropic regularized optimal transport

2304.06616

Authors:Takeshi Suguro, Toshiaki Yachimura

Abstract: In this paper, we consider Tsallis entropic regularized optimal transport and discuss the convergence rate as the regularization parameter $\varepsilon$ goes to $0$. In particular, we establish the convergence rate of the Tsallis entropic regularized optimal transport using the quantization and shadow arguments developed by Eckstein--Nutz. We compare this to the convergence rate of the entropic regularized optimal transport with Kullback--Leibler (KL) divergence and show that KL is the fastest convergence rate in terms of Tsallis relative entropy.

5.Blamelessly Optimal Control For Polytopic Safety Sets

2304.06625

Authors:Natalia Pavlasek, Sarah H. Q. Li, Behçet Açıkmeşe, Meeko Oishi, Claus Danielson

Abstract: In many safety-critical optimal control problems, users may request multiple safety constraints that are jointly infeasible due to external factors such as subsystem failures, unexpected disturbances, or fuel limitations. In this manuscript, we introduce the concept of blameless optimality to characterize control actions that a) satisfy the highest prioritized and feasible safety constraints and b) remain optimal with respect to a mission objective. For a general optimal control problem with jointly infeasible safety constraints, we prove that a single optimization problem cannot find a blamelessly optimal controller. Instead, finding blamelessly optimal control actions requires sequentially solving at least two optimal control problems: one to determine the highest priority level of constraints that is feasible and another to determine the optimal control action with respect to these constraints. We apply our results to a helicopter emergency landing scenario in which violating at least one safety-induced landing constraint is unavoidable. Leveraging the concept of blameless optimality, we formulate blamelessly optimal controllers that can autonomously prioritize human safety over property integrity.

6.Inducing a probability distribution in Stochastic Multicriteria Acceptability Analysis

2304.06650

Authors:Sally Giuseppe Arcidiacono, Salvatore Corrente, Salvatore Greco

Abstract: In multiple criteria decision aiding, very often the alternatives are compared by means of a value function compatible with the preferences expressed by the Decision Maker. The problem is that, in general, there is a plurality of compatible value functions, and providing a final recommendation on the problem at hand considering only one of them could be considered arbitrary to some extent. For such a reason, Stochastic Multicriteria Acceptability Analysis gives information in statistical terms by taking into account a sample of models compatible with the provided preferences. These statistics are given assuming the existence of a probability distribution in the space of value functions being defined a priori. In this paper, we propose some methods aiming to build a probability distribution on the space of value functions considering the preference information given by the Decision Maker. To prove the goodness of our proposal we performed an extensive set of simulations. Moreover, a sensitivity analysis on the variables of our procedure has been done as well.

7.Sparse recovery of an electrical network based on algebraic variety fitting and graph sparsification

2304.06676

Authors:Álvaro Samperio

Abstract: The problem of recovering the topology and parameters of an electrical network from power and voltage data at all nodes is a problem of fitting both an algebraic variety and a graph which is often ill-posed. In case there are multiple electrical networks which fit the data up to a given tolerance, we seek a solution in which the graph and therefore the algebraic equations associated with the electrical network are sparse, i.e. with few edges and terms. From an applied point of view, frequently it is difficult for system operators to know the precise information of the network. On the other hand, improvements on measurement devices increasingly provide more data about voltage and power, so it is useful to use this amount of data to estimate the network. We propose an algorithm for recovering simultaneously a sparse topology and the cable parameters of any network, combining in an iterative procedure the resolution of algebraic fitting convex problems and techniques of spectral graph sparsification. The algorithm is tested on several electrical networks.

Wed, 12 Apr 2023digest

1.Dynamic Discretization Discovery for the Multi-Depot Vehicle Scheduling Problem with Trip Shifting

2304.05665

Authors:Rolf van Lieshout, Thomas van der Schaft

Abstract: The solution of the Multi-Depot Vehicle Scheduling Problem (MDVSP) can often be improved substantially by incorporating Trip Shifting (TS) as a model feature. By allowing departure times to deviate a few minutes from the original timetable, new combinations of trips may be carried out by the same vehicle, thus leading to more efficient scheduling. However, explicit modeling of each potential trip shift quickly causes the problem to get prohibitively large for current solvers, such that researchers and practitioners were obligated to resort to heuristic methods to solve large instances. In this paper, we develop a Dynamic Discretization Discovery algorithm that guarantees an optimal continuous-time solution to the MDVSP-TS without explicit consideration of all trip shifts. It does so by iteratively solving and refining the problem on a partially time-expanded network until the solution can be converted to a feasible vehicle schedule on the fully time-expanded network. Computational results demonstrate that this algorithm outperforms the explicit modeling approach by a wide margin and is able to solve the MDVSP-TS even when many departure time deviations are considered.

2.Optimal Motions of an Elastic Structure under Finite-Dimensional Distributed Control

2304.05765

Authors:Georgy Kostin, Alexander Gavrikov

Abstract: An optimal control problem for longitudinal motions of a thin elastic rod is considered. We suppose that a normal force, which changes piecewise constantly along the rod's length, is applied to the cross-section so that the positions of force jumps are equidistantly placed along the length. Additionally, external loads act at the rod ends. These distributed force and boundary loads are considered as control functions of the dynamic system. Given initial and terminal states at fixed time instants, the problem is to minimize the mean mechanical energy stored in the rod during its motion. We replace the classical wave equation with a variational problem solved via traveling waves defined on a special time-space mesh. For a uniform rod, the shortest admissible time horizon is estimated exactly, and the exact optimal control law is symbolically found in a recurrent way.

3.Parameter-free Maximum Likelihood Localization of a Network of Moving Agents from Ranges, Bearings and Velocity measurements

2304.05988

Authors:Filipa Valdeira, Cláudia Soares, João Gomes

Abstract: Localization is a fundamental enabling technology for many applications, like vehicular networks, IoT, and even medicine. While Global Navigation Satellite Systems solutions offer great performance, they are unavailable in scenarios like indoor or underwater environments, and, for large networks, the cost of instrumentation is prohibitive. We develop a localization algorithm from ranges and bearings, suitable for generic mobile networks of agents. Our algorithm is built on a tight convex relaxation of the Maximum Likelihood position estimator for a generic network. To serve positioning to mobile agents, a horizon-based version is developed accounting for velocity measurements at each agent. To solve the convex problem, a distributed gradient-based method is provided. This constitutes an advantage over other centralized approaches, which usually exhibit high latency for large networks and present a single point of failure. Additionally, the algorithm estimates all required parameters and effectively becomes parameter-free. Our solution to the dynamic network localization problem is theoretically well-founded and still easy to understand. We obtain a parameter-free, outlier-robust and trajectory-agnostic algorithm, with nearly constant positioning error regardless of the trajectories of agents and anchors, achieving better or comparable performance to state-of-the-art methods, as our simulations show. Furthermore, the method is distributed, convex and does not require any particular anchor configuration.

Tue, 11 Apr 2023

1.A structure exploiting SDP solver for robust controller synthesis

2304.05037

Authors:Dennis Gramlich, Tobias Holicki, Carsten W. Scherer, Christian Ebenbauer

Abstract: In this paper, we revisit structure exploiting SDP solvers dedicated to the solution of Kalman-Yakubovich-Popov semi-definite programs (KYP-SDPs). These SDPs inherit their name from the KYP Lemma and they play a crucial role in e.g. robustness analysis, robust state feedback synthesis, and robust estimator synthesis for uncertain dynamical systems. Off-the-shelf SDP solvers require $O(n^6)$ arithmetic operations per Newton step to solve this class of problems, where $n$ is the state dimension of the dynamical system under consideration. Specialized solvers reduce this complexity to $O(n^3)$. However, existing specialized solvers do not include semi-definite constraints on the Lyapunov matrix, which are necessary for controller synthesis. In this paper, we show how to include such constraints in structure exploiting KYP-SDP solvers.

2.Generative modeling for time series via Schrödinger bridge

2304.05093

Authors:Mohamed Hamdouche (LPSM), Pierre Henry-Labordere (LPSM), Huyên Pham (LPSM)

Abstract: We propose a novel generative model for time series based on the Schrödinger bridge (SB) approach. This consists in the entropic interpolation via optimal transport between a reference probability measure on path space and a target measure consistent with the joint data distribution of the time series. The solution is characterized by a stochastic differential equation on finite horizon with a path-dependent drift function, hence respecting the temporal dynamics of the time series distribution. We can estimate the drift function from data samples either by kernel regression methods or with LSTM neural networks, and the simulation of the SB diffusion yields new synthetic data samples of the time series. The performance of our generative model is evaluated through a series of numerical experiments. First, we test with a toy autoregressive model, a GARCH Model, and the example of fractional Brownian motion, and measure the accuracy of our algorithm with marginal and temporal dependencies metrics. Next, we use our SB generated synthetic samples for the application to deep hedging on real data sets. Finally, we illustrate the SB approach for generating sequences of images.

3.Robust Tube Model Predictive Control with Uncertainty Quantification for Discrete-Time Linear Systems

2304.05105

Authors:Yulong Gao, Shuhao Yan, Jian Zhou, Mark Cannon, Alessandro Abate, Karl H. Johansson

Abstract: This paper is concerned with model predictive control (MPC) of discrete-time linear systems subject to bounded additive disturbance and hard constraints on the state and input, where the true disturbance set is unknown. Unlike most existing work on robust MPC, we propose an MPC algorithm incorporating online uncertainty quantification that builds on prior knowledge of the disturbance, i.e., a known but conservative disturbance set. We approximate the true disturbance set at each time step with a parameterised set, which is referred to as a quantified disturbance set, using the scenario approach with additional disturbance realisations collected online. A key novelty of this paper is that the parameterisation of these quantified disturbance sets enjoys desirable properties such that the quantified disturbance set and its corresponding rigid tube bounding disturbance propagation can be efficiently updated online. We provide statistical gaps between the true and quantified disturbance sets, based on which probabilistic recursive feasibility of the MPC optimisation problems is discussed. Numerical simulations are provided to demonstrate the efficacy of our proposed algorithm and compare it with conventional robust MPC algorithms.

4.Sufficient Conditions for the Exact Relaxation of Complementarity Constraints for Storages in Multi-period ACOPF

2304.05175

Authors:Qi Wang, Wenchuan Wu, Chenhui Lin, Xueliang Li

Abstract: Storage-concerned Alternating Current Optimal Power Flow (ACOPF) with complementarity constraints is highly non-convex and intractable. In this letter, we first derive two types of relaxation conditions, which guarantee no simultaneous charging and discharging (SCD) in the relaxed multi-period ACOPF. Moreover, we prove that the regions on LMPs formed by the proposed two conditions both contain the other four typical ones. We also generalize the application premise of sufficient conditions from the positive electricity price requirements to the negative electricity price scenarios. The case studies verify the exactness and advantages of the proposed method.

5.Local Conditions for Global Convergence of Gradient Flows and Proximal Point Sequences in Metric Spaces

2304.05239

Authors:Lorenzo Dello Schiavo, Jan Maas, Francesco Pedrotti

Abstract: This paper deals with local criteria for the convergence to a global minimiser for gradient flow trajectories and their discretisations. To obtain quantitative estimates on the speed of convergence, we consider variations on the classical Kurdyka-Łojasiewicz inequality for a large class of parameter functions. Our assumptions are given in terms of the initial data, without any reference to an equilibrium point. The main results are convergence statements for gradient flow curves and proximal point sequences to a global minimiser, together with sharp quantitative estimates on the speed of convergence. These convergence results apply in the general setting of lower semicontinuous functionals on complete metric spaces, generalising recent results for smooth functionals on $\mathbb{R}^n$. While the non-smooth setting covers very general spaces, it is also useful for (non)-smooth functionals on $\mathbb{R}^n$.

6.A priori data-driven robustness guarantees on strategic deviations from generalised Nash equilibria

2304.05308

Authors:George Pantazis, Filiberto Fele, Kostas Margellos

Abstract: In this paper we focus on noncooperative games with uncertain constraints coupling the agents' decisions. We consider a setting where bounded deviations of agents' decisions from the equilibrium are possible, and uncertain constraints are inferred from data. Building upon recent advances in the so-called scenario approach, we propose a randomised algorithm that returns a nominal equilibrium such that a pre-specified bound on the probability of violation for yet unseen constraints is satisfied for an entire region of admissible deviations surrounding it, thus supporting neighbourhoods of equilibria with probabilistic feasibility certificates. For the case in which the game admits a potential function, whose minimum coincides with the social welfare optimum of the population, the proposed algorithmic scheme opens the road to achieving a trade-off between the guaranteed feasibility levels of the region surrounding the nominal equilibrium and its system-level efficiency. Detailed numerical simulations corroborate our theoretical results.

7.Data-driven Distributionally Robust Optimization over Time

2304.05377

Authors:Kevin-Martin Aigner, Andreas Bärmann, Kristin Braun, Frauke Liers, Sebastian Pokutta, Oskar Schneider, Kartikey Sharma, Sebastian Tschuppik

Abstract: Stochastic Optimization (SO) is a classical approach for optimization under uncertainty that typically requires knowledge about the probability distribution of uncertain parameters. As the latter is often unknown, Distributionally Robust Optimization (DRO) provides a strong alternative that determines the best guaranteed solution over a set of distributions (ambiguity set). In this work, we present an approach for DRO over time that uses online learning and scenario observations arriving as a data stream to learn more about the uncertainty. Our robust solutions adapt over time and reduce the cost of protection with shrinking ambiguity. For various kinds of ambiguity sets, the robust solutions converge to the SO solution. Our algorithm achieves the optimization and learning goals without solving the DRO problem exactly at any step. We also provide a regret bound for the quality of the online strategy which converges at a rate of $\mathcal{O}(\log T / \sqrt{T})$, where $T$ is the number of iterations. Furthermore, we illustrate the effectiveness of our procedure by numerical experiments on mixed-integer optimization instances from popular benchmark libraries and give practical examples stemming from telecommunications and routing. Our algorithm is able to solve the DRO over time problem significantly faster than standard reformulations.

Mon, 10 Apr 2023

1.Closing Duality Gaps of SDPs through Perturbation

2304.04433

Authors:Takashi Tsuchiya, Bruno F. Lourenço, Masakazu Muramatsu, Takayuki Okuno

Abstract: Let $({\bf P},{\bf D})$ be a primal-dual pair of SDPs with a nonzero finite duality gap. Under such circumstances, ${\bf P}$ and ${\bf D}$ are weakly feasible and if we perturb the problem data to recover strong feasibility, the (common) optimal value function $v$ as a function of the perturbation is not well-defined at zero (unperturbed data) since there are "two different optimal values" $v({\bf P})$ and $v({\bf D})$, where $v({\bf P})$ and $v({\bf D})$ are the optimal values of ${\bf P}$ and ${\bf D}$ respectively. Thus, continuity of $v$ is lost at zero though $v$ is continuous elsewhere. Nevertheless, we show that a limiting version ${v_a}$ of $v$ is a well-defined monotone decreasing continuous bijective function connecting $v({\bf P})$ and $v({\bf D})$ with domain $[0, \pi/2]$ under the assumption that both ${\bf P}$ and ${\bf D}$ have singularity degree one. The domain $[0,\pi/2]$ corresponds to directions of perturbation defined in a certain manner. Thus, ${v_a}$ "completely fills" the nonzero duality gap under a mild regularity condition. Our result is tight in that there exists an instance with singularity degree two for which ${v_a}$ is not continuous.

2.Fourier-Gegenbauer Pseudospectral Method for Solving Periodic Fractional Optimal Control Problems

2304.04454

Authors:Kareem T. Elgindy

Abstract: This paper introduces a new accurate model for periodic fractional optimal control problems (PFOCPs) using Riemann-Liouville (RL) and Caputo fractional derivatives (FDs) with sliding fixed memory lengths. The paper also provides a novel numerical method for solving PFOCPs using Fourier and Gegenbauer pseudospectral methods. By employing Fourier collocation at equally spaced nodes and Fourier and Gegenbauer quadratures, the method transforms the PFOCP into a simple constrained nonlinear programming problem (NLP) that can be treated easily using standard NLP solvers. We propose a new transformation that largely simplifies the problem of calculating the periodic FDs of periodic functions to the problem of evaluating the integral of the first derivatives of their trigonometric Lagrange interpolating polynomials, which can be treated accurately and efficiently using Gegenbauer quadratures. We introduce the notion of the $\alpha$th-order fractional integration matrix with index $L$ based on Fourier and Gegenbauer pseudospectral approximations, which proves to be very effective in computing periodic FDs. We also provide a rigorous a priori error analysis to predict the quality of the Fourier-Gegenbauer-based approximations to FDs. The numerical results of the benchmark PFOCP demonstrate the performance of the proposed pseudospectral method.

3.Inexact Online Proximal Mirror Descent for time-varying composite optimization

2304.04710

Authors:Woocheol Choi, Myeong-Su Lee, Seok-Bae Yun

Abstract: In this paper, we consider online proximal mirror descent for solving time-varying composite optimization problems. For various applications, the algorithm naturally involves errors in the gradient and proximal operator. We obtain sharp estimates on the dynamic regret of the algorithm when the regular part of the cost is convex and smooth. If the Bregman distance is given by the Euclidean distance, our result also improves the previous work in two ways: (i) We establish a sharper regret bound compared to the previous work in the sense that our estimate does not involve the $O(T)$ term appearing in that work. (ii) We also obtain the result when the domain is the whole space $\mathbb{R}^n$, whereas the previous work was obtained only for bounded domains. We also provide numerical tests for problems involving errors in the gradient and proximal operator.
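
As an illustration only (not the authors' implementation), the Euclidean special case mentioned in the abstract can be sketched as an online proximal-gradient loop: at each round a possibly perturbed gradient of the smooth part is followed by a proximal step on the nonsmooth part. The quadratic costs, the $\ell_1$ regularizer, the Gaussian model of the gradient error, and all function names below are assumptions made for this sketch; the proximal step is computed exactly here.

```python
import numpy as np

def soft_threshold(v, tau):
    """Exact proximal operator of tau * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def online_prox_gradient(grad_fns, x0, step, lam, grad_noise=0.0, seed=0):
    """One (possibly inexact) proximal-gradient step per round.

    grad_fns[t](x) returns the gradient of the smooth part f_t at x.
    grad_noise scales a Gaussian perturbation that mimics gradient error
    (an assumed error model, chosen only for illustration).
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    iterates = [x.copy()]
    for grad in grad_fns:
        g = grad(x) + grad_noise * rng.standard_normal(x.shape)
        x = soft_threshold(x - step * g, step * lam)
        iterates.append(x.copy())
    return iterates

# Hypothetical time-varying cost: f_t(x) = 0.5 * ||x - c_t||^2 + lam * ||x||_1,
# where the target c_t drifts slowly in its first coordinate.
targets = [np.array([1.0 + 0.01 * t, 0.0]) for t in range(50)]
grad_fns = [lambda x, c=c: x - c for c in targets]
iterates = online_prox_gradient(grad_fns, x0=np.zeros(2), step=0.5, lam=0.1)
```

In this toy setting the per-round minimizer is the soft-thresholded target, so the last iterate should trail it by a small lag caused by the drift; setting `grad_noise` above zero lets one observe how gradient errors affect the tracking.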

4.First-order methods for Stochastic Variational Inequality problems with Function Constraints

2304.04778

Authors:Digvijay Boob, Qi Deng

Abstract: The monotone Variational Inequality (VI) is an important problem in machine learning. In numerous instances, the VI problems are accompanied by function constraints which can possibly be data-driven, making the projection operator challenging to compute. In this paper, we present novel first-order methods for function constrained VI (FCVI) problems under various settings, including smooth or nonsmooth problems with a stochastic operator and/or stochastic constraints. First, we introduce the OpConEx method and its stochastic variants, which employ extrapolation of the operator and constraint evaluations to update the variables and the Lagrangian multipliers. These methods achieve optimal operator or sample complexities when the FCVI problem is either (i) deterministic nonsmooth, or (ii) stochastic, including smooth or nonsmooth stochastic constraints. Notably, our algorithms are simple single-loop procedures and do not require the knowledge of Lagrange multipliers to attain these complexities. Second, to obtain the optimal operator complexity for smooth deterministic problems, we present a novel single-loop Adaptive Lagrangian Extrapolation (AdLagEx) method that can adaptively search for and explicitly bound the Lagrange multipliers. Furthermore, we show that all of our algorithms can be easily extended to saddle point problems with coupled function constraints, hence achieving similar complexity results for the aforementioned cases. To the best of our knowledge, many of these complexities are obtained for the first time in the literature.

5.Dynamically adaptive networks for integrating optimal pressure management and self-cleaning controls

2304.04727

Authors:Bradley Jenks, Aly-Joy Ulusoy, Filippo Pecci, Ivan Stoianov

Abstract: This paper investigates the problem of integrating optimal pressure management and self-cleaning controls in dynamically adaptive water distribution networks. We review existing single-objective valve placement and control problems for minimizing average zone pressure (AZP) and maximizing self-cleaning capacity (SCC). Since AZP and SCC are conflicting objectives, we formulate a bi-objective design-for-control problem where locations and operational settings of pressure control and automatic flushing valves are jointly optimized. We approximate Pareto fronts using the weighted sum scalarization method, which uses a previously developed convex heuristic to solve the sequence of parametrized single-objective problems. The resulting Pareto fronts suggest that significant improvements in SCC can be achieved for minimal trade-offs in AZP performance. Moreover, we demonstrate that a hierarchical design strategy is capable of yielding good quality solutions to both objectives. This hierarchical design considers pressure control valves first placed for the primary AZP objective, followed by automatic flushing valves placed to augment SCC conditions. In addition, we investigate an adaptive control scheme for dynamically transitioning between AZP and SCC controls. We demonstrate these control challenges on case networks with both interconnected and branched topology.
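
The weighted sum scalarization named in the abstract can be sketched on a toy bi-objective problem. Everything below is an assumption made for illustration: the two quadratic objectives stand in for AZP and SCC, and a brute-force search over a candidate grid replaces the convex heuristic the authors use to solve each scalarized problem.

```python
import numpy as np

def weighted_sum_front(f1, f2, candidates, weights):
    """For each weight w, pick the candidate minimizing w*f1 + (1-w)*f2.

    Returns a list of (x, f1(x), f2(x)) tuples, one per weight.
    """
    v1 = np.array([f1(x) for x in candidates])
    v2 = np.array([f2(x) for x in candidates])
    front = []
    for w in weights:
        i = int(np.argmin(w * v1 + (1.0 - w) * v2))
        front.append((float(candidates[i]), float(v1[i]), float(v2[i])))
    return front

# Hypothetical conflicting objectives on a scalar decision variable.
f1 = lambda x: x ** 2            # stand-in for the first objective
f2 = lambda x: (x - 2.0) ** 2    # stand-in for the second objective
candidates = np.linspace(0.0, 2.0, 201)
weights = np.linspace(0.0, 1.0, 11)
front = weighted_sum_front(f1, f2, candidates, weights)
```

Sweeping the weight from 0 to 1 moves the selected point from the minimizer of the second objective to the minimizer of the first, which traces an approximation of the Pareto front. For nonconvex fronts a weighted sum can miss points, so this sketch should be read only as the basic idea.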

6.Gradient and Hessian of functions with non-independent variables

2304.05835

Authors:Matieyendou Lamboni

Abstract: Mathematical models are sometimes given as functions of independent input variables and equations or inequations connecting the input variables. A probabilistic characterization of such models results in treating them as functions with non-independent variables. Using the distribution function or copula of such variables that comply with such equations or inequations, we derive two types of partial derivatives of functions with non-independent variables (i.e., actual and dependent derivatives) and argue in favor of the latter. The dependent partial derivatives of functions with non-independent variables rely on the dependent Jacobian matrix of dependent variables, which is also used to define a tensor metric. The differential geometric framework allows for deriving the gradient, Hessian and Taylor-type expansion of functions with non-independent variables.

7.Iterative Singular Tube Hard Thresholding Algorithms for Tensor Completion

2304.04860

Authors:Rachel Grotheer, Shuang Li, Anna Ma, Deanna Needell, Jing Qin

Abstract: Due to the explosive growth of large-scale data sets, tensors have been a vital tool to analyze and process high-dimensional data. Different from the matrix case, tensor decomposition has been defined in various formats, which can be further used to define the best low-rank approximation of a tensor to significantly reduce the dimensionality for signal compression and recovery. In this paper, we consider the low-rank tensor completion problem. We propose a novel class of iterative singular tube hard thresholding algorithms for tensor completion based on the low-tubal-rank tensor approximation, including basic, accelerated deterministic and stochastic versions. Convergence guarantees are provided along with the special case when the measurements are linear. Numerical experiments on tensor compressive sensing and color image inpainting are conducted to demonstrate convergence and computational efficiency in practice.
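
A minimal sketch of what a basic hard thresholding iteration for low-tubal-rank completion might look like, assuming the standard FFT-based t-SVD: each step takes a gradient step on the observed entries and then keeps only the `r` leading singular tubes. The function names, step size, and iteration count are hypothetical, and the paper's accelerated and stochastic variants are not reproduced here.

```python
import numpy as np

def tubal_hard_threshold(X, r):
    """Keep the r leading singular tubes of X (n1 x n2 x n3) via the FFT-based t-SVD."""
    Xf = np.fft.fft(X, axis=2)                 # frontal slices in the Fourier domain
    slices = np.moveaxis(Xf, 2, 0)             # shape (n3, n1, n2) for a batched SVD
    U, s, Vh = np.linalg.svd(slices, full_matrices=False)
    s[:, r:] = 0.0                             # drop all but the r largest values per slice
    low = (U * s[:, None, :]) @ Vh             # rebuild each truncated slice
    return np.real(np.fft.ifft(np.moveaxis(low, 0, 2), axis=2))

def hard_thresholding_completion(T_obs, mask, r, step=1.0, iters=100):
    """Gradient step on the observed entries followed by tubal-rank truncation."""
    X = np.zeros_like(T_obs, dtype=float)
    for _ in range(iters):
        X = tubal_hard_threshold(X + step * mask * (T_obs - X), r)
    return X

# Build a hypothetical tensor of tubal rank 2 as a t-product, computed slice-wise in the Fourier domain.
rng = np.random.default_rng(0)
A = rng.standard_normal((6, 2, 4))
B = rng.standard_normal((2, 5, 4))
Cf = np.einsum("irk,rjk->ijk", np.fft.fft(A, axis=2), np.fft.fft(B, axis=2))
C = np.real(np.fft.ifft(Cf, axis=2))
```

Applying `tubal_hard_threshold(C, 2)` to this tensor should return it unchanged up to rounding, since it already has tubal rank 2. With a partial `mask`, whether the iteration recovers `C` depends on the sampling rate and step size, which this sketch does not analyze.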

8.Finite Element Error Analysis and Solution Stability of affine optimal control problems

2304.04882

Authors:Nicolai Jork

Abstract: We consider affine optimal control problems subject to semilinear elliptic PDEs. The results are two-fold; first, we continue the analysis of solution stability of control problems under perturbations appearing jointly in the objective functional and the PDE. For this, we consider a coercivity-type property that is common in the field of optimal control. The second result is concerned with the obtainment of error estimates for the numerical approximation for a finite element and a variational discretization scheme. The error estimates for the optimal controls and states are obtained under several conditions of different strengths that appeared recently in the context of solution stability. This includes an improvement of error estimates for the optimal controls and states under a Hölder-type growth condition.

8 papers

1.Chemotherapy planning and multi-appointment scheduling: formulations, heuristics and bounds

2.Bilinear control of semilinear elliptic PDEs: Convergence of a semismooth Newton method

3.Online Mixed Discrete and Continuous Optimization: Algorithms, Regret Analysis and Applications

4.Tulipa Energy Model: Mathematical Formulation

5.Optimal inexactness schedules for Tunable Oracle based Methods

6.Learning to Warm-Start Fixed-Point Optimization Algorithms

7.Mean-field games of speedy information access with observation costs

8.Acceleration by Stepsize Hedging I: Multi-Step Descent and the Silver Stepsize Schedule

8 papers

1.Maximum Principle for Mean Field Type Control Problems with General Volatility Functions

2.Nonlinear network identifiability: The static case

3.Barzilai-Borwein Descent Methods for Multiobjective Optimization Problems with Variable Trade-off Metrics

4.On the Intelligent Proportional Controller Applied to Linear Systems

5.Dynamical convergence analysis for nonconvex linearized proximal ADMM algorithms

6.Absorbing Markov Decision Processes

7.Complexity analysis of regularization methods for implicitly constrained least squares

8.Optimal adaptive control with separable drift uncertainty

4 papers

1.Relating Electric Vehicle Charging to Speed Scaling with Job-Specific Speed Limits

2.Inexact Decentralized Dual Gradient Tracking for Constraint-Coupled Optimization

3.Stochastic Bridges over Ensemble of Linear Systems

4.Symmetric Stair Preconditioning of Linear Systems for Parallel Trajectory Optimization

11 papers

1.Optimization Method Based On Optimal Control

2.Simba: A Scalable Bilevel Preconditioned Gradient Method for Fast Evasion of Flat Areas and Saddle Points

3.Computing Wasserstein Barycenter via operator splitting: the method of averaged marginals

4.Dynamic Pricing in an Energy Community Providing Capacity Limitation Services

5.Convergence analysis of the semismooth Newton method for sparse control problems governed by semilinear elliptic equations

6.Turnpike and dissipativity in generalized discrete-time stochastic linear-quadratic optimal control

7.Algorithms for DC Programming via Polyhedral Approximations of Convex Functions

8.Energy-optimal Timetable Design for Sustainable Metro Railway Networks

9.Safe Adaptive Control of Hyperbolic PDE-ODE Cascades

10.A distributionally robust index tracking model with the CVaR penalty: tractable reformulation

11.An exact algorithm for linear optimization problem subject to max-product fuzzy relational inequalities with fuzzy constraints

2 papers

1.Optimal strategies for mosquitoes replacement techniques: influence of the carrying capacity on spatial releases

2.A hybrid physics-informed neural network based multiscale solver as a partial differential equation constrained optimization problem

10 papers

1.Local properties and augmented Lagrangians in fully nonconvex composite optimization

2.An optimal control approach for the treatment of hepatitis C patients

3.PROMISE: Preconditioned Stochastic Optimization Methods by Incorporating Scalable Curvature Estimates

4.A Prescriptive Trilevel Equilibrium Model for Optimal Emissions Pricing and Sustainable Energy Systems Development

5.Backward error analysis and the qualitative behaviour of stochastic optimization algorithms: Application to stochastic coordinate descent

6.Finite dimensional backstepping controller design

7.Lifting functionals defined on maps to measure-valued maps via optimal transport

8.An Efficient Semi-Real-Time Algorithm for Path Planning in the Hamilton-Jacobi Formulation

9.First and zeroth-order implementations of the regularized Newton method with lazy approximated Hessians

10.Crack propagation in anisotropic brittle materials: from a phase-field model to a shape optimization approach

7 papers

1.Model Predictive Control using MATLAB

2.Convergence Analysis of the Best Response Algorithm for Time-Varying Games

3.Urban Logistics in Amsterdam: A Modal Shift from Roadways to Waterway

4.Enhancing PGA Tour Performance: Leveraging Shotlink™ Data for Optimization and Prediction

5.Directional Tykhonov well-posedness for optimization problems and variational inequalities

6.Integral Quadratic Constraints with Infinite-Dimensional Channels

7.Online Distributed Learning over Random Networks

7 papers

1.Optimal Stopping of BSDEs with Constrained Jumps and Related Zero-Sum Games

2.Interior point methods in optimal control problems of affine systems: Convergence results and solving algorithms

3.Investigating Sparse Reconfigurable Intelligent Surfaces (SRIS) via Maximum Power Transfer Efficiency Method Based on Convex Relaxation

4.On solving a rank regularized minimization problem via equivalent factorized column-sparse regularized models

5.An Efficient Framework for Global Non-Convex Polynomial Optimization over the Hypercube

6.Moreau Envelope ADMM for Decentralized Weakly Convex Optimization

7.A Divide and Conquer Approximation Algorithm for Partitioning Rectangles

6 papers

1.Variational Analysis of Kurdyka-Lojasiewicz Property by Way of Outer Limiting Subgradients

2.A Note on Linear Quadratic Regulator and Kalman Filter

3.Design of Coherent Passive Quantum Equalizers Using Robust Control Theory

4.Riemannian Optimistic Algorithms

5.Quasioptimal alternating projections and their use in low-rank approximation of matrices and tensors

6.The Bus Rapid Transit Investment Problem

8 papers

1.A Geometric Algorithm for Maximizing the Distance over an Intersection of Balls to a Given Point

2.Frequency-domain criterion on the stabilizability for infinite-dimensional linear control systems

3.The Agricultural Spraying Vehicle Routing Problem With Splittable Edge Demands

4.Limited memory gradient methods for unconstrained optimization

5.Uniform Turnpike Property and Singular Limits

6.Energy Space Newton Differentiability for Solution Maps of Unilateral and Bilateral Obstacle Problems